Generalized Anderson's Task

Introduction

The Generalized Anderson's Task (GAT) belongs among the non-Bayesian methods of statistical decision making with non-random interventions. The task is defined as follows. The classified object is described by an observation vector $x\in{\cal X}\subseteq{\mathbb{R}}^n$ and a binary hidden state $y\in\{1,2\}$. The class-conditional distributions $P(x\vert y)$, $y\in\{1,2\}$, are known to be multivariate Gaussian distributions. The parameters $(\mu_1,\Sigma_1)$ and $(\mu_2,\Sigma_2)$ of these class distributions are unknown. However, it is known that the parameters $(\mu_1,\Sigma_1)$ belong to a certain finite set of parameters $\{(\mu^i,\Sigma^i)\colon i\in{\cal I}_1\}$. Similarly, $(\mu_2,\Sigma_2)$ belongs to a finite set $\{(\mu^i,\Sigma^i)\colon i\in{\cal I}_2\}$. Let $q\colon {\cal X}\rightarrow\{1,2\}$ be a binary linear classifier with discriminant function $f(x) = \langle w \cdot x\rangle + b$ defined as follows

\begin{displaymath}
q(x)=\left \{ \begin{array}{rcl}
1 & \mbox{for} & f(x) = \langle w\cdot x\rangle + b \geq 0 \:,\\
2 & \mbox{for} & f(x) = \langle w\cdot x\rangle + b < 0 \:.\\
\end{array}
\right .
\end{displaymath} (6)
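The rule (6) is straightforward to implement. As a minimal sketch in Python (the STPRtool itself is a Matlab toolbox; the function name `linclass` below is purely illustrative):

```python
import numpy as np

def linclass(x, w, b):
    """Linear rule (6): class 1 if <w, x> + b >= 0, otherwise class 2."""
    return 1 if np.dot(w, x) + b >= 0 else 2

# toy usage with arbitrarily chosen parameters
w = np.array([1.0, -1.0])
b = 0.5
print(linclass(np.array([2.0, 1.0]), w, b))  # <w,x>+b = 1.5 >= 0 -> 1
print(linclass(np.array([0.0, 1.0]), w, b))  # <w,x>+b = -0.5 < 0 -> 2
```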

The probability of misclassification is defined as
\begin{displaymath}
Err(w,b) = \max_{i\in{\cal I}_1\cup{\cal I}_2} \varepsilon (w,b,\mu^i,\Sigma^i) \:,
\end{displaymath} (7)

where $\varepsilon (w,b,\mu^i,\Sigma^i)$ is the probability that a Gaussian random vector $x$ with mean vector $\mu^i$ and covariance matrix $\Sigma^i$ satisfies $q(x)=1$ when $i\in{\cal I}_2$, or $q(x)=2$ when $i\in{\cal I}_1$. In other words, it is the probability that the vector $x$ will be misclassified by the linear rule $q$. The GAT is to find the linear classifier (6) for which $Err(w,b)$ is minimal
\begin{displaymath}
(w^*,b^*) = \mathop{\rm argmin}_{w,b} Err(w,b) = \mathop{\rm argmin}_{w,b} \max_{i\in{\cal I}_1\cup{\cal I}_2} \varepsilon (w,b,\mu^i,\Sigma^i) \:.
\end{displaymath} (8)

The criterion (7) is convex but not differentiable. Efficient algorithms to solve the task (8) are implemented in the STPRtool.
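For a single Gaussian component, $\varepsilon$ has a closed form: the projection $\langle w\cdot x\rangle + b$ of $x\sim N(\mu,\Sigma)$ is a scalar Gaussian with mean $\langle w\cdot\mu\rangle + b$ and variance $w^{\top}\Sigma w$, so $\varepsilon$ is a value of the standard normal CDF. The following Python sketch (a numpy analogue of what the Matlab toolbox computes internally; the names `gauss_err` and `gat_err` are ours) evaluates the criterion (7):

```python
import math
import numpy as np

def gauss_err(w, b, mu, Sigma, y):
    """eps(w,b,mu,Sigma): misclassification probability of a class-y component.

    The scalar projection <w,x>+b of x ~ N(mu, Sigma) has mean <w,mu>+b and
    variance w' Sigma w, so the error probability is a normal CDF value.
    """
    m = np.dot(w, mu) + b
    s = math.sqrt(np.dot(w, Sigma @ w))
    t = m / s
    # class 1 errs when the projection is negative, class 2 when non-negative
    return 0.5 * (1.0 + math.erf((-t if y == 1 else t) / math.sqrt(2.0)))

def gat_err(w, b, components):
    """Criterion (7): worst-case error over components (mu, Sigma, y)."""
    return max(gauss_err(w, b, mu, S, y) for mu, S, y in components)
```

For example, two unit-covariance components with means $(\pm 1, 0)$ separated by the rule $w=(1,0)$, $b=0$ both contribute $\Phi(-1)\approx 0.159$, which is then the value of (7).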

A key assumption is knowledge of the Gaussians $\{(\mu^i,\Sigma^i)\colon i\in{\cal I}_1\cup{\cal I}_2\}$ which describe the class-conditional distributions. Usually the Gaussians are not given explicitly; instead, a set of training examples ${\cal T}_{XV}=\{(x_1,v_1),\ldots,(x_l,v_l)\}$ is presented. The label $v_i$ identifies the Gaussian component with parameters $(\mu^{v_i},\Sigma^{v_i})$ from which the example $x_i$ was generated. The component labels can be divided into two classes: $y=1$ if $v\in{\cal I}_1$ and $y=2$ if $v\in{\cal I}_2$. If the training set ${\cal T}_{XV}$ were randomly sampled from an underlying distribution $P(x,v)$, then the class-conditional distributions

\begin{displaymath}
P(x\vert y) = \sum\limits_{v\in{\cal I}_y} P(x\vert v)P(v\vert y)\:, \qquad y\in\{1,2\}\:,
\end{displaymath} (9)

could be estimated directly (e.g., using the ML principle). The a priori probabilities $P(y=1)$ and $P(y=2)$ could likewise be computed as relative frequencies. However, it is specific to the GAT that $v$ cannot be assumed to be a realization of a random variable! Therefore, only the distributions

\begin{displaymath}
P(x\vert v)\:, v\in {\cal I}_1 \cup {\cal I}_2 \:,
\end{displaymath}

can be properly estimated from the training data ${\cal T}_{XV}$.
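Since each $P(x\vert v)$ is Gaussian, its ML estimates are simply the sample mean and the ($1/l$-normalized) sample covariance of the examples carrying label $v$. A plain-numpy sketch of this per-component estimation (the STPRtool provides mlcgmm for this in Matlab; the helper name `ml_gauss_estimates` is illustrative):

```python
import numpy as np

def ml_gauss_estimates(X, v):
    """ML estimates of P(x|v) for each component label.

    X is an (l, n) array of examples, v an (l,) array of integer component
    labels; returns {label: (mean, covariance)}.
    """
    models = {}
    for label in np.unique(v):
        Xi = X[v == label]
        mu = Xi.mean(axis=0)
        # the ML covariance uses the 1/l normalization, hence bias=True
        Sigma = np.cov(Xi, rowvar=False, bias=True)
        models[int(label)] = (mu, Sigma)
    return models
```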

To make the estimation easier, we assume that the input examples are projected to a lower dimension. To this end, a linear projection

\begin{displaymath}
z_i = \langle w\cdot x_i \rangle \:,
\end{displaymath} (10)

is used, where the projection vector $w$ is found by the Principal Component Analysis or the Fisher Discriminant Analysis, both of which are implemented in the STPRtool. Applying the linear projection (10) to the input data ${\cal T}_{XV}$ yields the one-dimensional training examples ${\cal T}_{ZV} = \{(z_1,v_1),\ldots,(z_l,v_l)\}$.
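For two classes, the textbook Fisher direction is $w \propto S_w^{-1}(\mu_1-\mu_2)$, where $S_w$ is the within-class scatter matrix. A Python sketch of this direction and of the projection (10) (numpy analogues of the toolbox's Matlab fld and linproj; the names below are ours):

```python
import numpy as np

def fld_direction(X1, X2):
    """Two-class Fisher direction w ~ S_w^{-1} (mu1 - mu2), unit-normalized."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    # within-class scatter: sum of per-class (1/l-normalized) covariances * l
    Sw = np.cov(X1, rowvar=False, bias=True) * len(X1) \
       + np.cov(X2, rowvar=False, bias=True) * len(X2)
    w = np.linalg.solve(Sw, mu1 - mu2)
    return w / np.linalg.norm(w)

def project(X, w):
    """Projection (10): z_i = <w, x_i> for every row x_i of X."""
    return X @ w
```

On well-separated toy data the projected values of the two classes fall into disjoint intervals, which is exactly what makes the subsequent one-dimensional estimation easier.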

Task assignment

Useful functions

anders_exp1 Example on estimation of density model for Anderson's task.
anders_exp2 Example on solving Generalized Anderson's task.
mlcgmm Maximum Likelihood estimation of Gaussian mixture model.
pdfgmm Evaluates Gaussian Mixture Model.
pca Principal Component Analysis.
fld Fisher Linear Discriminant.
linproj Linear data projection.
bayescls Bayesian classifier with reject option.



Vojtech Franc
2004-08-31