One of the observations we make is that the transferred representations support only a few modes of separation and much of their dimensionality is underutilized. To alleviate this, we suggest to learn, in the source domain, multiple orthogonal classifiers. We prove that this leads to a reduced rank representation, which, however, supports more discriminative directions. For example, we obtain the 2nd best result on LFW for a single network. This is achieved using a training set that is a few orders of magnitude smaller than that of the leading literature network, and using a very compact representation of only 51 dimensions.
Given time, I will describe recent achievements in other computer vision domains: optical flow, action recognition, image annotation, and more.