SEMISOL

Semi-supervised Structured Output Learning from partially labeled data

Marie Curie Re-integration grant
PERG04-GA-2008-239455


Investigator: Vojtech Franc
Scientist in charge: Vaclav Hlavac
Hosting organization: Czech Technical University in Prague
Project duration: 36 months (from June 1, 2009 to May 31, 2012)

Summary

The goal of this project was to develop methods for semi-supervised learning of structured output classifiers from partially annotated examples and to apply these methods in computer vision.

We developed a probabilistic interpretation of Support Vector Machines (SVMs). SVMs are the most frequently used discriminative methods for learning classifiers from fully annotated examples. We showed how the SVM can be viewed as a maximum likelihood estimate of a class of probabilistic models. The resulting probabilistic model explains existing SVM-based heuristics for dealing with incomplete data and allows principled extensions of SVMs to semi-supervised learning. The model covers only the two-class variant of SVMs, but we believe it is a necessary first step towards understanding the more complex structured output SVMs (SO-SVMs).

We developed a statistical formulation of the problem of learning structured output classifiers from partially annotated examples. We showed that learning from partially annotated examples can be translated into an instance of the empirical risk minimization problem, which can be solved approximately in a manner similar to SO-SVMs. Unlike supervised SO-SVMs, semi-supervised learning from partially labeled examples leads to a harder, non-convex optimization problem. We proposed a convex relaxation of the semi-supervised problem which uses additional prior knowledge about the problem. Experiments on real-life computer vision applications show that learning from partially annotated examples can achieve results comparable to fully supervised learning while significantly reducing the demand for manually annotated data. However, finding a more precise tractable convex approximation of the semi-supervised structured output learning (SOL) problem remains open.
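
To illustrate why partial annotation makes the optimization non-convex, the following minimal Python sketch assumes a toy multi-class setting in which a partial annotation is simply a set of candidate labels. The function name partial_hinge_loss, the 0/1 cost and the toy data are illustrative assumptions, not the formulation developed in the project. The surrogate is a difference of two convex functions of the weights, so it is convex only when the candidate set contains a single label.

```python
import numpy as np


def partial_hinge_loss(W, x, candidates):
    """Margin-rescaling surrogate for one example with a candidate label set.

    W: (n_classes, n_features) weight matrix (hypothetical linear model).
    x: (n_features,) input vector.
    candidates: set of labels consistent with the partial annotation.
    """
    scores = W @ x
    in_set = np.zeros(scores.shape[0], dtype=bool)
    in_set[list(candidates)] = True
    # Convex part: loss-augmented maximum over all labels,
    # with cost 1 for labels outside the candidate set and 0 inside it.
    augmented = np.max(scores + (~in_set).astype(float))
    # Concave part: minus the best score among the consistent labels.
    best_consistent = np.max(scores[in_set])
    return augmented - best_consistent


# Toy data (assumed for illustration only).
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
x = np.array([0.5, 2.0])

full = partial_hinge_loss(W, x, {0})        # fully annotated: label 0
partial = partial_hinge_loss(W, x, {0, 1})  # partially annotated: label 0 or 1
print(full, partial)
```

With a single candidate label the function reduces to the usual multi-class hinge loss. Enlarging the candidate set can only lower the loss, because more predictions count as consistent with the annotation.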

We developed two novel optimization strategies for supervised SOL, which serve as necessary subroutines of approximate algorithms for semi-supervised SOL. The proposed SOL solvers provide a certificate of optimality and were shown to outperform current state-of-the-art algorithms in terms of computational time.
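
As an illustration of what a certificate of optimality means, the sketch below stops a solver once the gap between a primal objective and a dual lower bound falls below a tolerance. It does not reproduce the project's SOL solvers. It assumes a plain two-class linear SVM without a bias term and uses standard dual coordinate ascent as a stand-in. The function name svm_with_certificate, the parameters C, eps and max_epochs, and the toy data are all hypothetical.

```python
import numpy as np


def svm_with_certificate(X, y, C=1.0, eps=1e-3, max_epochs=1000):
    """Dual coordinate ascent for a linear SVM, stopped by the duality gap.

    Primal: 0.5*||w||^2 + C * sum_i max(0, 1 - y_i * <w, x_i>)
    Dual:   sum_i alpha_i - 0.5*||sum_i alpha_i y_i x_i||^2,  0 <= alpha_i <= C
    """
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    sq_norms = np.einsum("ij,ij->i", X, X)  # diagonal of the Gram matrix
    primal = dual = 0.0
    gap = np.inf
    for _ in range(max_epochs):
        for i in range(n):
            if sq_norms[i] == 0.0:
                continue
            # Gradient of the negated dual with respect to alpha_i.
            grad = y[i] * (w @ X[i]) - 1.0
            new_alpha = np.clip(alpha[i] - grad / sq_norms[i], 0.0, C)
            w += (new_alpha - alpha[i]) * y[i] * X[i]
            alpha[i] = new_alpha
        primal = 0.5 * (w @ w) + C * np.maximum(0.0, 1.0 - y * (X @ w)).sum()
        dual = alpha.sum() - 0.5 * (w @ w)
        # By weak duality, dual <= optimum <= primal, so the gap bounds
        # how far the current w can be from the optimal objective value.
        gap = primal - dual
        if gap <= eps:
            break
    return w, primal, dual, gap


# Toy separable data (assumed for illustration only).
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, primal, dual, gap = svm_with_certificate(X, y)
print(w, gap)
```

The returned gap is the certificate: the objective value at w exceeds the optimum by at most gap, whichever method produced w.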

The theory developed in this project was applied to several computer vision problems, including detection of facial landmarks, gender estimation from face images and image segmentation. Two open-source libraries were developed which provide algorithms for facial landmark detection and gender estimation from face images. The methods developed in the project have also become part of the Shogun machine learning toolbox.

List of publications

Software


Last update 17-Jul-2012