# Alexander Shekhovtsov presents Learning of Stochastic Binary Networks (Current Work)

On 2019-12-17 11:00
at G205, Karlovo náměstí 13, Praha 2

Neural networks with binary activations and binary weights have been shown

to achieve recognition rates close to that of common full precision

networks while being significantly more efficient to compute. There are

several heuristic approaches to train binary NNs. The main difficulty is

that in a pure binary network one cannot develop a locally linear

approximation based on the gradient. In this work we look at Stochastic

binary NNs that add a small noise in front of all activations. At the test

time, we obtain an ensemble of binary networks, by drawing several noise

samples. At the training time, the gradient of the expected output (an

infinite ensemble) is well defined and can be approximated. In this current

work we discuss several learning formulations with stochastic binary

networks and their properties. I will present a new stochastic

approximation method that leads to a low variance and low bias estimate,

experimentally verified on small examples. Applying it to large scale

convolutional networks is possible with a modified convolution that is only

twice more computationally expensive (at training time) than the standard

one. The real implementation however (and respectively experiments with

CNNs) is currently in development. The discussion and feedback are very

welcome.

to achieve recognition rates close to that of common full precision

networks while being significantly more efficient to compute. There are

several heuristic approaches to train binary NNs. The main difficulty is

that in a pure binary network one cannot develop a locally linear

approximation based on the gradient. In this work we look at Stochastic

binary NNs that add a small noise in front of all activations. At the test

time, we obtain an ensemble of binary networks, by drawing several noise

samples. At the training time, the gradient of the expected output (an

infinite ensemble) is well defined and can be approximated. In this current

work we discuss several learning formulations with stochastic binary

networks and their properties. I will present a new stochastic

approximation method that leads to a low variance and low bias estimate,

experimentally verified on small examples. Applying it to large scale

convolutional networks is possible with a modified convolution that is only

twice more computationally expensive (at training time) than the standard

one. The real implementation however (and respectively experiments with

CNNs) is currently in development. The discussion and feedback are very

welcome.