The Perceptron Algorithm
The task today is to implement the perceptron algorithm for learning of a
linear classifier. Once implemented, the algorithm will also be used for
learning of a non-linear extension of the classifier, for the case of a
quadratic discriminant function.
Task formulation
The Perceptron algorithm for finding of a linear discriminant function is
described in [1], you can also consult
lecture slides [4].
The extension to quadratic
discriminant functions is described in [2].
The task:
- Implement the perceptron algorithm. Test its functionality on synthetic
two-dimensional linearly separable data. For data creation use the createdata
script.
Output:
- Show the decision boundary of the linear classifier you have found (a
line), and the classified data. Use functions
pline
and ppatterns.
- Plot evolution of the classification error as the learning proceeds.
The error should generally decrease, but it does not have to be
monotonous.
Question:
What can you tell about the solution? Is it optimal? In what sense?
- Test the algorithm for data, which cannot be linearly separated (e.g. the
XOR problem).
Question:
What's happening in the non-separable case?
- Extend the perceptron algorithm for the case of quadratic discriminant
function. Demonstrate the functionality on a simple example.
Hint: For visualisation use pboundary
function. This function takes a single parameter (model),
which is expected to be a struct. It's member .fun
refers to your classification function (e.g.. 'classif_quadrat_perc'),
which is defined as
classif
= classif_quadrat_perc(test_data, model)
The function is called automatically by the visualisation function, and
it's role is to classify the supplied test_data
(in our case 2D data) to two classes (i.e. the classif
output is either 1 or 2). Parameters of the classifiers should be stored in
the model
parameter, which is the same one as the one passed to the pboundary
function. In our case, the model
will contain parameters of the current state of the quadratic classifier.
Bonus task:
The task is not compulsory.
Implement the Kozinec's algorithm for learning of a linear discriminant
function. For a set of points {xj, j
∈ J}, the Kozinec's algorithm finds vector α, which satisfies a
set of linear inequations
<α,xj> >
0,
j ∈ J
I.e. the problem is the same as in the Perceptron. The Kozinec's algorithm
creates a sequence of vectors α1,
α2, ..., αt, αt+1,
... in the following way
The algorithm: Vector
α1 is any vector from convex hull of set {xj, j ∈ J},
e.g. one of the vectors xj, j ∈ J.
If vector αt has already been computed, αt+1
is obtained using the following rules:
- A misclassified vector xj, j ∈ J
is sought, which satisfies
<α,xj> ≤ 0
- If such xj
does not exist, the solution to the task has been found, and αt+1
equals to αt.
- If such xj
does exist, it will be denoted as xt.
Vector αt+1
then lies on a line passing through αt
and xt
. From all the points on the line, αt+1 is the closest one
to the origin of the coordinate system. In other words
αt+1
= (1 - k) αt
+ k xt
where
k = argmink
| (1 - k) αt
+ k xt |.
Output:
Identical to th eoutput to the task 1 in the non-bonus part.
Hint: The Kozinec algorithm is described in Chapter 5 of [3]
References:
[1] Chapter 5.5 of Duda, Hart, Stork: Pattern
classification (algorithm 4, page 230)
[2] Nonlinear perceptron (short support text for labs)
[3] Michail I. Schlesinger, Vaclav Hlavac. Ten Lectures on Statistical and Structural Pattern Recognition. Kluwer 2002
[4] Jiri Matas: Learning and Linear Classifiers slides
Created
by Jan Šochman, last updated 18.7.2011