Support Vector Machines (SVM)
In this task we will construct a classifier that
maximizes the distance between the decision boundary (a hyperplane) and the data
points. This classifier is called the Support Vector Machine (SVM). Today we will
implement the linear version of the SVM.
Preparation (at home and during the lecture)
- Study the formulation of the
optimization problem solved by the SVM (see [2]:
equations (1) and (2) for the linearly separable case, (4) and (5) for the
non-separable case). What kind of optimization problem is it? Compare it with the
perceptron learning task. The standard formulation is recalled after this list for convenience.
- Study the geometric
meaning of the variables in both the primal and the dual task.
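For reference, the textbook soft-margin formulation and its dual, written here in common notation (the equation numbering in [2] may differ):

    \min_{w,\,b,\,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i
    \qquad \text{s.t.}\quad y_i(w^\top x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0,

    \max_{\alpha}\ \sum_{i=1}^{n}\alpha_i
      - \tfrac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i \alpha_j y_i y_j\, x_i^\top x_j
    \qquad \text{s.t.}\quad \sum_{i=1}^{n}\alpha_i y_i = 0,\quad 0 \le \alpha_i \le C.

The linearly separable case is obtained in the limit C -> infinity (all \xi_i = 0), which removes the upper bound on \alpha_i in the dual.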
In the exercise:
- Implement the SVM algorithm for
linearly separable and non-separable data. Use the dual formulation. For
solving the dual task use the quadratic programming solver function gsmo
(a sketch of the whole procedure follows the notes below).
Notes on using gsmo:
- The function is part
of the STPR toolbox, but it must be compiled from gsmo_mex.c
into a MATLAB binary module (MEX), which may
differ between systems. A compiled module can be downloaded from here. Attention: in MATLAB this function
should be called as gsmo (help is available by
typing `help gsmo'), not as gsmo_mex.
- It is useful to set `options.verb'
(see the help), so that you can watch the progress of the optimization.
- It is also useful to bound the maximal
number of iterations (parameter `options.tmax').
The reason is that when the primal QP task is not
feasible (no parameters satisfy the constraints), the dual problem
is unbounded (see [2]) and the algorithm would diverge indefinitely.
- Check the
output parameter `stat' to detect whether the optimization finished
successfully. If it did not, figure out why.
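A minimal sketch of the dual training procedure follows. It assumes that gsmo solves min_x 0.5*x'*H*x + f'*x subject to a'*x = b and LB <= x <= UB, and that it is called as shown below; the exact argument order may differ on your system, so verify it with `help gsmo'.

    % Linear SVM via the dual task -- a sketch, not the reference solution.
    % Assumed solver interface (verify with `help gsmo'):
    %   [alpha,fval,stat] = gsmo(H,f,a,b,LB,UB,options)
    X = data.X;                      % [2 x n] training samples
    y = data.y(:);                   % [n x 1] labels, already converted to {+1,-1}
    n = numel(y);
    C = 1;                           % trade-off constant (use inf for the separable task)

    H  = (y*y') .* (X'*X);           % Hessian of the dual objective
    f  = -ones(n,1);                 % gsmo minimizes, so negate sum(alpha)
    LB = zeros(n,1);
    UB = C*ones(n,1);

    options.verb = 1;                % watch the optimization progress
    options.tmax = 1e5;              % safeguard against an unbounded dual

    [alpha, fval, stat] = gsmo(H, f, y, 0, LB, UB, options);
    % inspect stat here to verify that the solver converged

    I = find(alpha > 1e-6);          % support vector indices
    model.W = X * (alpha .* y);      % primal weights recovered from the dual
    % bias from the support vectors on the margin (0 < alpha(i) < C)
    on = I(alpha(I) < C - 1e-6);
    model.b = mean(y(on) - (model.W' * X(:,on))');

The support vectors are exactly the points with nonzero alpha; only they contribute to model.W.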
- Show on the separable data
how the value of the constant C influences the position
of the separating hyperplane. For creating the training set use createdata
or load data_33rpz_cv08-separable.mat.
Remember to convert the labels y from {1,2} to {+1,-1}
(a short sketch of the conversion follows this item)!
For visualization use the functions ppatterns and pline, e.g.:
ppatterns(data); pline(model); hold on; plot(data.X(1,I), data.X(2,I), 'ks', 'MarkerSize', 12);
for the training samples data.X, the labels data.y,
and the classifier given by model.W, model.b and the support vector indices I.
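The label conversion can be done in one line (assuming the file provides the struct data with fields X and y, as referenced above):

    load('data_33rpz_cv08-separable.mat');   % provides the struct `data' (assumed)
    data.y(data.y == 2) = -1;                % relabel class 2 as -1; class 1 stays +1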
(Figures: the learned separating hyperplane and the support vectors for C = Inf and for C = 1.)
For checking your solution, the numerical results on data_33rpz_cv08-separable:
- model.W = [-24.3726; 23.9039]; model.b = -22.1428; % C=inf
- model.W = [ -2.3024; 2.1573]; model.b = -1.6498; % C=1
- Train a classifier for
character recognition using the training data data_33rpz_cv08.mat
with two features:
x = (sum of pixel values in the left half of the image) - (sum of pixel values in the right half of the image)
y = (sum of pixel values in the upper half of the image) - (sum of pixel values in the lower half of the image)
Normalize the features to the interval [-1, 1] before learning the SVM
(a sketch of the feature computation follows this item).
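A sketch of the feature computation and normalization; the variable imgs and its [h x w x n] layout are assumptions, so adapt the indexing to the actual contents of the .mat file:

    % Two halved-image features, assuming images in a [h x w x n] array `imgs'
    % (the variable name and layout are assumptions -- inspect the .mat file).
    [h, w, n] = size(imgs);
    left  = squeeze(sum(sum(imgs(:, 1:floor(w/2), :), 1), 2));
    right = squeeze(sum(sum(imgs(:, floor(w/2)+1:end, :), 1), 2));
    upper = squeeze(sum(sum(imgs(1:floor(h/2), :, :), 1), 2));
    lower = squeeze(sum(sum(imgs(floor(h/2)+1:end, :, :), 1), 2));

    F = [ (left - right)'; (upper - lower)' ];   % [2 x n] feature matrix

    % rescale each feature linearly onto [-1, 1]
    mn = min(F, [], 2);  mx = max(F, [], 2);
    F  = bsxfun(@rdivide, bsxfun(@minus, F, (mx+mn)/2), (mx-mn)/2);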
Recommended literature
[1] Christopher J. C. Burges: A Tutorial on Support Vector Machines for Pattern Recognition.
[2] Text of the exercise from the previous course.
[3] Lecture slides.
[4] Quadratic programming.