Alexander Shekhovtsov presents Learning of Stochastic Binary Networks (Current Work)

On 2019-12-17 11:00 at G205, Karlovo náměstí 13, Praha 2

Neural networks with binary activations and binary weights have been shown
to achieve recognition rates close to those of common full-precision
networks while being significantly more efficient to compute. There are
several heuristic approaches to training binary NNs. The main difficulty is
that in a purely binary network one cannot build a locally linear
approximation based on the gradient. In this work we look at stochastic
binary NNs that add a small amount of noise before each activation. At test
time, we obtain an ensemble of binary networks by drawing several noise
samples. At training time, the gradient of the expected output (an
infinite ensemble) is well defined and can be approximated. In this
ongoing work we discuss several learning formulations for stochastic binary
networks and their properties. I will present a new stochastic
approximation method that leads to a low-variance and low-bias estimate,
experimentally verified on small examples. Applying it to large-scale
convolutional networks is possible with a modified convolution that is only
twice as computationally expensive (at training time) as the standard
one. The actual implementation, however (and with it the experiments with
CNNs), is still in development. Discussion and feedback are very
welcome.
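
As a rough illustration of the test-time ensemble described above, the following minimal sketch draws several noise samples for a small fully connected stochastic binary network and averages the resulting binary outputs. It is not the speaker's implementation: the logistic noise distribution, the sign-based weight binarization, and the names stochastic_binary_forward and ensemble_output are all assumptions made for this example.

```python
import numpy as np


def stochastic_binary_forward(x, weights, rng, noise_scale=1.0):
    """One noise sample: a forward pass through stochastic binary layers.

    Assumptions (not from the abstract): weights are binarized by sign,
    and the injected noise is logistic with the given scale.
    """
    h = x
    for W in weights:
        W_bin = np.where(W >= 0, 1.0, -1.0)       # binary weights in {-1, +1}
        pre = h @ W_bin                           # pre-activation
        noise = rng.logistic(scale=noise_scale, size=pre.shape)
        h = np.where(pre + noise >= 0, 1.0, -1.0)  # binary activation
    return h


def ensemble_output(x, weights, n_samples=100, seed=0):
    """Average the outputs of n_samples independently sampled binary networks."""
    rng = np.random.default_rng(seed)
    outs = [stochastic_binary_forward(x, weights, rng) for _ in range(n_samples)]
    return np.mean(outs, axis=0)


if __name__ == "__main__":
    x = np.array([[1.0, -1.0, 1.0]])
    weights = [np.array([[0.5], [-0.3], [0.2]])]
    print(ensemble_output(x, weights, n_samples=1000))
```

Under the logistic-noise assumption, a single unit with pre-activation a has expected output tanh(a/2), so the ensemble average approaches this smooth function of a as the number of samples grows. This is the sense in which the expected output, unlike a single binary network, has a well-defined gradient.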