The Perceptron Algorithm

Today's task is to implement the perceptron algorithm for learning a linear classifier. Once implemented, the algorithm will also be used to learn a non-linear extension of the classifier, namely one with a quadratic discriminant function.

Task formulation

The perceptron algorithm for finding a linear discriminant function is described in [1]; you can also consult the lecture slides [4]. The extension to quadratic discriminant functions is described in [2].

The task:

  1. Implement the perceptron algorithm (a minimal sketch is given below). Test its functionality on synthetic two-dimensional linearly separable data. Use the createdata script to create the data.
    Output: Answers to the following questions: What can you tell about the solution? Is it optimal? In what sense?

  2. Test the algorithm on data which cannot be linearly separated (e.g. the XOR problem).
    Question: What happens in the non-separable case?
  3. Extend the perceptron algorithm to the case of a quadratic discriminant function. Demonstrate its functionality on a simple example.
    Hint: For visualisation use the pboundary function. It takes a single parameter (model), which is expected to be a struct. Its member .fun refers to your classification function (e.g. 'classif_quadrat_perc'), which is defined as
classif = classif_quadrat_perc(test_data, model)

The function is called automatically by the visualisation routine, and its role is to classify the supplied test_data (in our case 2D points) into two classes (i.e. the classif output is either 1 or 2). The parameters of the classifier should be stored in the model struct, the same one that is passed to pboundary. In our case, the model will contain the parameters of the current state of the quadratic classifier.
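
To make task 1 concrete, the following is a minimal sketch of the perceptron learning loop in MATLAB. It is an illustration only: the function name perceptron and the data layout (points in the columns of X, labels 1 or 2 in y) are assumptions of this sketch, not part of the supplied lab code.

function [w, b] = perceptron(X, y)
% PERCEPTRON  Minimal sketch of the perceptron algorithm.
%   X ... d-by-N matrix of training points (one point per column)
%   y ... 1-by-N vector of class labels from {1, 2}
% Returns the weight vector w and bias b of a separating hyperplane,
% i.e. sign(w'*x + b) decides the class of a point x.
% Note: the loop never terminates on non-separable data (cf. task 2),
% so a bound on the number of iterations may be added in practice.

  t = 3 - 2*y;                        % map labels {1,2} -> {+1,-1}
  Z = [X; ones(1, size(X, 2))];       % homogeneous coordinates
  Z = Z .* repmat(t, size(Z, 1), 1);  % multiply each column by its label

  a = zeros(size(Z, 1), 1);           % initial weight vector
  while true
    idx = find(a' * Z <= 0, 1);       % first violated inequality, if any
    if isempty(idx)
      break;                          % all points classified correctly
    end
    a = a + Z(:, idx);                % perceptron update rule
  end

  w = a(1:end-1);
  b = a(end);
end

The homogeneous-coordinate trick (appending a constant 1 to each point and negating the class-2 points) reduces the two-class problem to finding a vector a with ⟨a, z_j⟩ > 0 for all j, the same formulation as in the bonus task below.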
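
For task 3, one possible approach is to lift the 2D points into a quadratic feature space and run the unchanged linear perceptron there. The classification function could then look as follows. Only the .fun field and the call signature are prescribed above; the model fields .w and .b are assumptions of this sketch.

function classif = classif_quadrat_perc(test_data, model)
% CLASSIF_QUADRAT_PERC  Sketch: quadratic discriminant via feature lifting.
%   test_data ... 2-by-N matrix of points to classify
%   model     ... struct with assumed fields .w (5-by-1) and .b (scalar),
%                 the perceptron solution found in the lifted space
% Returns a 1-by-N vector of class labels (1 or 2).

  x1 = test_data(1, :);
  x2 = test_data(2, :);
  % Quadratic lifting: (x1, x2) -> (x1, x2, x1^2, x1*x2, x2^2)
  Z = [x1; x2; x1.^2; x1 .* x2; x2.^2];
  classif = 1 + (model.w' * Z + model.b <= 0);  % 1 on positive side, else 2
end

Training would then consist of applying the same lifting to the training data, running the linear perceptron on the lifted points, and storing the result, e.g. model.fun = 'classif_quadrat_perc'; model.w = w; model.b = b; before calling pboundary(model).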

Bonus task:

The task is not compulsory. Implement Kozinec's algorithm for learning a linear discriminant function. For a set of points {x_j, j ∈ J}, Kozinec's algorithm finds a vector α which satisfies the set of linear inequalities
⟨α, x_j⟩ > 0,   j ∈ J

I.e. the problem is the same as in the perceptron task. Kozinec's algorithm creates a sequence of vectors α_1, α_2, ..., α_t, α_{t+1}, ... in the following way.

The algorithm: Vector α_1 is any vector from the convex hull of the set {x_j, j ∈ J}, e.g. one of the vectors x_j, j ∈ J. If vector α_t has already been computed, α_{t+1} is obtained using the following rules:

  1. A misclassified vector x_j, j ∈ J is sought, i.e. one satisfying
    ⟨α_t, x_j⟩ ≤ 0
  2. If no such x_j exists, the solution to the task has been found, and α_{t+1} equals α_t.
  3. If such an x_j does exist, it is denoted x_t. Vector α_{t+1} then lies on the line passing through α_t and x_t. Of all the points on this line, α_{t+1} is the one closest to the origin of the coordinate system. In other words
    α_{t+1} = (1 − k*) α_t + k* x_t
    where
    k* = argmin_k |(1 − k) α_t + k x_t|.
Output: Identical to the output of task 1 in the non-bonus part.
Hint: Kozinec's algorithm is described in Chapter 5 of [3]. A sketch of a possible implementation follows.
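
The following is a possible MATLAB sketch of Kozinec's algorithm. It assumes the points have already been transformed as in the perceptron sketch above (class-2 points negated, homogeneous coordinate appended), so that the goal is ⟨a, z_j⟩ > 0 for all j; the function name kozinec is an assumption of this sketch.

function a = kozinec(Z)
% KOZINEC  Sketch of Kozinec's algorithm.
%   Z ... d-by-N matrix of (transformed) points, one per column
% Returns a vector a from the convex hull of the columns of Z
% satisfying a'*Z > 0. Does not terminate on non-separable data.

  a = Z(:, 1);                        % alpha_1: any point of the set
  while true
    idx = find(a' * Z <= 0, 1);       % look for a violated inequality
    if isempty(idx)
      break;                          % all inequalities hold: done
    end
    x = Z(:, idx);                    % the misclassified point x_t
    % alpha_{t+1} is the point on the line through a and x closest to
    % the origin; k* = argmin_k |(1-k)*a + k*x| has the closed form:
    k = (a' * (a - x)) / ((a - x)' * (a - x));
    a = (1 - k) * a + k * x;          % alpha_{t+1}
  end
end

Note that k always falls in [0, 1] when ⟨a, x⟩ ≤ 0, so α_{t+1} stays in the convex hull of the points.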

References:

[1] R. O. Duda, P. E. Hart, D. G. Stork: Pattern Classification, Chapter 5.5 (algorithm 4, page 230)
[2] Nonlinear perceptron (short support text for labs)
[3] Michail I. Schlesinger, Vaclav Hlavac: Ten Lectures on Statistical and Structural Pattern Recognition. Kluwer Academic Publishers, 2002
[4] Jiri Matas: Learning and Linear Classifiers (lecture slides)

Created by Jan Šochman, last updated 18.7.2011