Generalized Anderson's Task

Introduction

The Generalized Anderson's Task (GAT) belongs among the non-Bayesian methods of statistical decision making with non-random interventions. The task is defined as follows. The classified object is described by an observation vector $x\in{\cal X}\subseteq{\mathbb{R}}^n$ and a binary hidden state $y\in\{1,2\}$. The class-conditional distributions $P(x\vert y)$, $y\in\{1,2\}$, are known to be multivariate Gaussian distributions. The parameters $(\mu_1,\Sigma_1)$ and $(\mu_2,\Sigma_2)$ of these class distributions are unknown. However, it is known that the parameters $(\mu_1,\Sigma_1)$ belong to a certain finite set of parameters $\{(\mu^i,\Sigma^i)\colon i\in{\cal I}_1\}$. Similarly, $(\mu_2,\Sigma_2)$ belongs to a finite set $\{(\mu^i,\Sigma^i)\colon i\in{\cal I}_2\}$. Let $q\colon {\cal X}\rightarrow\{1,2\}$ be a binary linear classifier with discriminant function $f(x) = \langle w \cdot x\rangle + b$ defined as follows

\begin{displaymath}
q(x)=\left \{ \begin{array}{rcl}
1 & \mbox{for} & f(x) = \langle w\cdot x\rangle + b \geq 0 \:,\\
2 & \mbox{for} & f(x) = \langle w\cdot x\rangle + b < 0 \:.\\
\end{array}
\right .
\end{displaymath} (6)
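The rule (6) is straightforward to implement. As a minimal sketch in Python (the STPRtool itself is a Matlab toolbox; the function name `linclass` below is purely illustrative):

```python
import numpy as np

def linclass(x, w, b):
    """Linear rule (6): class 1 if <w, x> + b >= 0, otherwise class 2."""
    return 1 if np.dot(w, x) + b >= 0 else 2

# toy usage with arbitrarily chosen parameters
w = np.array([1.0, -1.0])
b = 0.5
print(linclass(np.array([2.0, 1.0]), w, b))  # <w,x>+b = 1.5 >= 0 -> 1
print(linclass(np.array([0.0, 1.0]), w, b))  # <w,x>+b = -0.5 < 0 -> 2
```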

The probability of misclassification is defined as
\begin{displaymath}
Err(w,b) = \max_{i\in{\cal I}_1\cup{\cal I}_2} \varepsilon (w,b,\mu^i,\Sigma^i) \:,
\end{displaymath} (7)

where $\varepsilon (w,b,\mu^i,\Sigma^i)$ is the probability that a Gaussian random vector $x$ with mean vector $\mu^i$ and covariance matrix $\Sigma^i$ satisfies $q(x)=1$ when $i\in{\cal I}_2$, or $q(x)=2$ when $i\in{\cal I}_1$. In other words, it is the probability that the vector $x$ will be misclassified by the linear rule $q$. The GAT is to find the linear classifier (6) for which $Err(w,b)$ is minimal
\begin{displaymath}
(w^*,b^*) = \mathop{\rm argmin}_{w,b} Err(w,b) = \mathop{\rm argmin}_{w,b} \max_{i\in{\cal I}_1\cup{\cal I}_2} \varepsilon (w,b,\mu^i,\Sigma^i) \:.
\end{displaymath} (8)

The criterion (7) is convex but not differentiable. Efficient algorithms to solve the task (8) are implemented in the STPRtool.
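For a single Gaussian component, $\varepsilon$ has a closed form: the projection $\langle w\cdot x\rangle + b$ of $x\sim N(\mu,\Sigma)$ is a scalar Gaussian with mean $\langle w\cdot\mu\rangle + b$ and variance $w^{\top}\Sigma w$, so $\varepsilon$ is a value of the standard normal CDF. The following Python sketch (a numpy analogue of what the Matlab toolbox computes internally; the names `gauss_err` and `gat_err` are ours) evaluates the criterion (7):

```python
import math
import numpy as np

def gauss_err(w, b, mu, Sigma, y):
    """eps(w,b,mu,Sigma): misclassification probability of a class-y component.

    The scalar projection <w,x>+b of x ~ N(mu, Sigma) has mean <w,mu>+b and
    variance w' Sigma w, so the error probability is a normal CDF value.
    """
    m = np.dot(w, mu) + b
    s = math.sqrt(np.dot(w, Sigma @ w))
    t = m / s
    # class 1 errs when the projection is negative, class 2 when non-negative
    return 0.5 * (1.0 + math.erf((-t if y == 1 else t) / math.sqrt(2.0)))

def gat_err(w, b, components):
    """Criterion (7): worst-case error over components (mu, Sigma, y)."""
    return max(gauss_err(w, b, mu, S, y) for mu, S, y in components)
```

For example, two unit-covariance components with means $(\pm 1, 0)$ separated by the rule $w=(1,0)$, $b=0$ both contribute $\Phi(-1)\approx 0.159$, which is then the value of (7).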

A key assumption is knowledge of the Gaussians $\{(\mu^i,\Sigma^i)\colon i\in{\cal I}_1\cup{\cal I}_2\}$ which describe the class-conditional distributions. Usually the Gaussians are not given explicitly; instead, a set of training examples ${\cal T}_{XV}=\{(x_1,v_1),\ldots,(x_l,v_l)\}$ is presented. The label $v_i$ identifies the Gaussian component with parameters $(\mu^{v_i},\Sigma^{v_i})$ from which the example $x_i$ was generated. The component labels can be divided into two classes: $y=1$ if $v\in{\cal I}_1$ and $y=2$ if $v\in{\cal I}_2$. If the training set ${\cal T}_{XV}$ were randomly sampled from an underlying distribution $P(x,v)$, then the class-conditional distributions

\begin{displaymath}
P(x\vert y) = \sum\limits_{v\in{\cal I}_y} P(x\vert v)P(v\vert y)\:, \qquad y\in\{1,2\}\:,
\end{displaymath} (9)

could be estimated directly (e.g., using the ML principle). The a priori probabilities $P(y=1)$ and $P(y=2)$ could likewise be computed as relative frequencies. However, it is specific to the GAT that $v$ cannot be assumed to be a realization of a random variable! Therefore, only the distributions

\begin{displaymath}
P(x\vert v)\:, v\in {\cal I}_1 \cup {\cal I}_2 \:,
\end{displaymath}

can be properly estimated from the training data ${\cal T}_{XV}$.
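Since each $P(x\vert v)$ is Gaussian, its ML estimates are simply the sample mean and the ($1/l$-normalized) sample covariance of the examples carrying label $v$. A plain-numpy sketch of this per-component estimation (the STPRtool provides mlcgmm for this in Matlab; the helper name `ml_gauss_estimates` is illustrative):

```python
import numpy as np

def ml_gauss_estimates(X, v):
    """ML estimates of P(x|v) for each component label.

    X is an (l, n) array of examples, v an (l,) array of integer component
    labels; returns {label: (mean, covariance)}.
    """
    models = {}
    for label in np.unique(v):
        Xi = X[v == label]
        mu = Xi.mean(axis=0)
        # the ML covariance uses the 1/l normalization, hence bias=True
        Sigma = np.cov(Xi, rowvar=False, bias=True)
        models[int(label)] = (mu, Sigma)
    return models
```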

To make the estimation easier, we assume that the input examples are projected to a lower dimension. To this end, a linear projection

\begin{displaymath}
z_i = \langle w\cdot x_i \rangle \:,
\end{displaymath} (10)

is used, where the projection vector $w$ is found by the Principal Component Analysis or the Fisher Discriminant Analysis, both of which are implemented in the STPRtool. Applying the linear projection (10) to the input data ${\cal T}_{XV}$ yields the one-dimensional training examples ${\cal T}_{ZV} = \{(z_1,v_1),\ldots,(z_l,v_l)\}$.
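For two classes, the textbook Fisher direction is $w \propto S_w^{-1}(\mu_1-\mu_2)$, where $S_w$ is the within-class scatter matrix. A Python sketch of this direction and of the projection (10) (numpy analogues of the toolbox's Matlab fld and linproj; the names below are ours):

```python
import numpy as np

def fld_direction(X1, X2):
    """Two-class Fisher direction w ~ S_w^{-1} (mu1 - mu2), unit-normalized."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    # within-class scatter: sum of per-class (1/l-normalized) covariances * l
    Sw = np.cov(X1, rowvar=False, bias=True) * len(X1) \
       + np.cov(X2, rowvar=False, bias=True) * len(X2)
    w = np.linalg.solve(Sw, mu1 - mu2)
    return w / np.linalg.norm(w)

def project(X, w):
    """Projection (10): z_i = <w, x_i> for every row x_i of X."""
    return X @ w
```

On well-separated toy data the projected values of the two classes fall into disjoint intervals, which is exactly what makes the subsequent one-dimensional estimation easier.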

Task assignment

Useful functions

anders_exp1 Example on estimation of density model for Anderson's task.
anders_exp2 Example on solving Generalized Anderson's task.
mlcgmm Maximum Likelihood estimation of Gaussian mixture model.
pdfgmm Evaluates Gaussian Mixture Model.
pca Principal Component Analysis.
fld Fisher Linear Discriminant.
linproj Linear data projection.
bayescls Bayesian classifier with reject option.



Vojtech Franc
2004-08-31