Dynamic texture recognition using optic flow features and temporal periodicity

Dmitry Chetverikov (SZTAKI Budapest, Hungary)

This talk is devoted to the problem of dynamic texture (DT) classification using optic flow features. We provide a principled analysis of local image distortions and their relation to optic flow, then propose a framework for quantitative temporal periodicity analysis of DTs. Adapting the SVD-based algorithm for signal period estimation by Kanjilal et al. (1999), we measure the degree of periodicity of natural dynamic textures. Then we present the results of a comprehensive DT classification study that compares the performance of different flow features for a normal flow algorithm and four different complete flow algorithms. The effectiveness of two flow confidence measures is also studied. Finally, we demonstrate that adding the optic-flow-based temporal periodicity features improves the classification rates.
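
As a rough illustration of SVD-based period estimation, the sketch below cuts a 1-D signal into consecutive rows of a candidate period length and checks how close the resulting matrix is to rank one. The energy-share score, the function names and the toy signal are assumptions made for this example; they are not taken from the talk or from Kanjilal et al.

```python
import numpy as np

def periodicity_score(signal, period):
    """Share of signal energy captured by the best rank-1 fit when the
    signal is cut into consecutive rows of length `period`."""
    x = np.asarray(signal, dtype=float)
    rows = len(x) // period
    if rows < 2:
        raise ValueError("need at least two full periods")
    a = x[: rows * period].reshape(rows, period)
    s = np.linalg.svd(a, compute_uv=False)  # singular values, descending
    total = float(np.sum(s ** 2))
    return float(s[0] ** 2) / total if total > 0 else 0.0

def estimate_period(signal, min_period, max_period):
    """Return the candidate period with the highest score, plus all scores."""
    scores = {n: periodicity_score(signal, n)
              for n in range(min_period, max_period + 1)}
    best = max(scores, key=scores.get)
    return best, scores

# Toy signal (assumed for illustration): a 6-sample pattern repeated 10 times.
pattern = [1, 4, 2, 0, 3, 5]
signal = pattern * 10
best, scores = estimate_period(signal, 2, 8)
```

For this toy signal the score peaks at a period of 6, where all rows are identical and the matrix has rank one. Integer multiples of the true period score equally well, so in practice the candidate range has to be bounded or the smallest well-scoring period preferred.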


Spectral prediction models for color reproduction

Prof. Roger Hersch (EPFL Lausanne, Switzerland)

The present talk aims at explaining the physical phenomena governing the interaction of light, paper and ink halftones and at presenting classical spectral reflectance prediction models. Physical phenomena comprise surface reflections and refractions at the air-paper interface, the propagation of light within the paper, internal reflections at the paper-air interface, ink spreading and trapping. We review classical reflectance prediction models such as the spectral Neugebauer model, the Yule-Nielsen modified spectral Neugebauer model, and the multiple reflection Clapper-Yule model. We then discuss the phenomena of dot gain and ink spreading and show how to take them into account. Finally, we give an overview of recent progress in the field and point to yet unsolved problems.
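
To make the Yule-Nielsen modified spectral Neugebauer model concrete, here is a minimal sketch assuming independently printed ink layers (Demichel area coverages) and the standard form R(λ) = (Σ a_i R_i(λ)^(1/n))^n. The function names and the two-band reflectance values are made up for illustration only.

```python
from itertools import product

def demichel_coverages(ink_coverages):
    """Fractional area of each Neugebauer primary, assuming the ink layers
    are laid out independently. Keys are tuples of 0/1 flags, one per ink."""
    areas = {}
    for combo in product((0, 1), repeat=len(ink_coverages)):
        a = 1.0
        for on, c in zip(combo, ink_coverages):
            a *= c if on else (1.0 - c)
        areas[combo] = a
    return areas

def ynsn_reflectance(ink_coverages, primaries, n=1.0):
    """Predicted reflectance per wavelength band.
    primaries: dict mapping each 0/1 combo to its measured reflectances.
    n: Yule-Nielsen factor; n = 1 gives the plain spectral Neugebauer model."""
    areas = demichel_coverages(ink_coverages)
    n_bands = len(next(iter(primaries.values())))
    spectrum = []
    for k in range(n_bands):
        acc = sum(a * primaries[combo][k] ** (1.0 / n)
                  for combo, a in areas.items())
        spectrum.append(acc ** n)
    return spectrum

# Hypothetical single-ink halftone at 50% coverage, two wavelength bands.
primaries = {(0,): [0.9, 0.8],   # bare paper
             (1,): [0.1, 0.6]}   # solid ink
r_neugebauer = ynsn_reflectance([0.5], primaries, n=1.0)
r_yule_nielsen = ynsn_reflectance([0.5], primaries, n=2.0)
```

With n = 2 the first band drops from 0.5 to 0.4: the halftone is predicted to be darker than the area-weighted average, which is how the Yule-Nielsen factor accounts empirically for light scattering in the paper.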


Local object description using Gabor features

Joni-Kristian Kamarainen (U. of Lappeenranta, Finland)

Recently, local object descriptors have been one of the most active research areas in computer vision. Their importance stems partly from the fact that structural variance in the visual appearance of real objects and object categories can be efficiently modeled using a connected structure, a spatial constellation, of local descriptors. Representing local visual information is also an intrinsic property of features constructed from Gabor filter responses. That was actually the original motivation of the first computer vision publication on Gabor features in 1978, which presented the Gabor filter as a "general image processing operator". This talk will cover the fundamentals of feature extraction using Gabor filters, fast implementations, important properties of Gabor features, and finally demonstrate their use as local object descriptors.
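
A small sketch of Gabor feature extraction is given below. It uses a common textbook parameterisation of the real Gabor kernel (Gaussian envelope times cosine carrier); the parameter names, the kernel size and the striped test patch are assumptions for illustration, not the formulation used in the talk.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Real part of a 2-D Gabor filter sampled on a size x size grid."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a centre pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier oscillates along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def filter_response(patch, kernel):
    """Inner product of an image patch with a kernel of the same size."""
    return sum(p * k
               for prow, krow in zip(patch, kernel)
               for p, k in zip(prow, krow))

# Bank of four orientations applied to a patch of vertical stripes
# whose intensity varies along x with a wavelength of 4 pixels.
thetas = [k * math.pi / 4 for k in range(4)]
patch = [[math.cos(2 * math.pi * x / 4) for x in range(-4, 5)] for _ in range(9)]
responses = [filter_response(patch, gabor_kernel(9, 4.0, t, 2.0)) for t in thetas]
best_orientation = max(range(4), key=lambda k: abs(responses[k]))
```

The vector of responses over orientations (and, in a fuller bank, over several wavelengths) at one image location is the kind of local feature the abstract refers to. Here the filter with theta = 0, whose carrier matches the stripes, responds most strongly.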


Tree-reweighted message passing: advances in discrete energy minimization for computer vision

Vladimir Kolmogorov (University College London, UK)

Algorithms for MAP estimation in MRFs (or, alternatively, for minimizing energy functions of discrete variables) are now routinely used in computer vision. Minimization techniques based on graph cuts are probably the most popular ones; they often outperform other algorithms and give state-of-the-art results for many problems.
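
For readers unfamiliar with the setting, the sketch below spells out the kind of energy function meant here: a sum of unary costs per node and pairwise costs per edge, minimised over discrete labelings. The three-node chain, the Potts pairwise cost and all names are illustrative assumptions. The exhaustive search is only feasible for tiny problems, which is why algorithms such as graph cuts and message passing are needed.

```python
from itertools import product

def energy(labels, unary, pairwise, edges):
    """E(x) = sum_i unary[i][x_i] + sum_(i,j) pairwise(x_i, x_j)."""
    e = sum(unary[i][labels[i]] for i in range(len(labels)))
    e += sum(pairwise(labels[i], labels[j]) for i, j in edges)
    return e

def brute_force_map(unary, pairwise, edges, n_labels):
    """Exact minimiser by enumeration; cost grows as n_labels ** n_nodes."""
    best = min(product(range(n_labels), repeat=len(unary)),
               key=lambda labels: energy(labels, unary, pairwise, edges))
    return list(best), energy(best, unary, pairwise, edges)

def potts(a, b, weight=1.0):
    """Pairwise cost that penalises neighbouring nodes with different labels."""
    return 0.0 if a == b else weight

# Hypothetical chain of three nodes with two labels each.
unary = [[0.0, 2.0], [1.5, 1.0], [2.0, 0.0]]
edges = [(0, 1), (1, 2)]
map_labels, map_energy = brute_force_map(unary, potts, edges, 2)
```

On this toy chain the minimum-energy labeling is [0, 1, 1] with energy 2.0: one label change is paid for because it lets the first and last nodes keep their cheapest labels.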

In this talk I will concentrate on an alternative minimization algorithm, tree-reweighted max-product message passing (TRW), introduced recently by Wainwright et al. I will present some new developments which show that TRW is a serious competitor to graph cuts.

The algorithm of Wainwright et al. gives strong results in practice, but unfortunately it is not guaranteed to converge. I will present a new algorithm called TRW-S which is related to TRW but has guaranteed convergence properties. Experimental results demonstrate that on a real stereo problem TRW-S significantly outperforms the algorithm of Wainwright et al. and obtains lower energy than graph cuts and max-product belief propagation.


Solving Markov Random Fields using Second Order Cone Programming Relaxations

M. Pawan Kumar (Oxford Brookes University, UK)

In this talk, I will present a generic method for solving Markov random fields (MRFs) by formulating the problem of MAP estimation as 0-1 quadratic programming (QP). Although solving MRFs is NP-hard in general, we propose a second-order cone programming relaxation scheme which solves a closely related (convex) approximation. In terms of computational efficiency, our method significantly outperforms the semidefinite relaxations previously used whilst providing equally (or even more) accurate results.

Unlike popular inference schemes such as Belief Propagation and Graph Cuts, convergence is guaranteed within a small number of iterations. Furthermore, we also present a method for greatly reducing the runtime and increasing the accuracy of our approach for a large and useful class of MRFs. We compare our approach with the state-of-the-art methods for subgraph matching and object recognition and demonstrate significant improvements.

Joint work with Philip Torr and Andrew Zisserman.


Object Detection in Crowded Scenes

Bastian Leibe (ETH Zurich, Switzerland)

The detection of object classes in real-world images is a challenging problem which is further complicated by the effects of overlaps and partial occlusions. We present a novel algorithm which addresses this problem by considering object categorization and top-down segmentation as two interleaved processes that closely collaborate towards a common goal. As we will show, the close coupling between those two processes allows our method to accumulate additional evidence about object hypotheses and resolve ambiguities caused by overlaps and partial visibility.

The core part of our approach is a flexible formulation for object shape that can combine the information observed on different training examples in a probabilistic extension of the Generalized Hough Transform. The resulting approach can detect categorical objects in novel images and automatically infer a top-down segmentation from the recognition result. The segmentation is then used in turn to improve recognition by allowing the system to focus on object pixels and discard misleading influences from the background. Moreover, the information about where in the image a hypothesis draws its support is used in an MDL-based verification stage to resolve ambiguities between overlapping hypotheses and factor out the effects of partial occlusion.
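
The voting idea behind a Generalized Hough Transform can be sketched as follows: each local feature is matched to a codebook entry, which stores where the object centre was seen relative to that entry during training, and casts weighted votes for candidate centres. The codebook contents, feature tuples and function names are hypothetical, and the probabilistic formulation described in the abstract is considerably richer than this toy accumulator.

```python
from collections import defaultdict

def hough_votes(features, codebook):
    """features: list of (x, y, codeword).
    codebook: codeword -> list of ((dx, dy), weight), the offsets from a
    feature to the object centre observed on training examples.
    Returns centre -> list of (feature index, weight) that voted for it."""
    accumulator = defaultdict(list)
    for idx, (x, y, word) in enumerate(features):
        for (dx, dy), weight in codebook.get(word, []):
            accumulator[(x + dx, y + dy)].append((idx, weight))
    return accumulator

def best_hypothesis(accumulator):
    """Centre with the highest total vote, its score, and its supporters."""
    centre = max(accumulator, key=lambda c: sum(w for _, w in accumulator[c]))
    votes = accumulator[centre]
    score = sum(w for _, w in votes)
    supporters = sorted({idx for idx, _ in votes})
    return centre, score, supporters

# Hypothetical codebook for a side view of a car.
codebook = {
    "wheel": [((10, -5), 0.5), ((-10, -5), 0.5)],  # front or rear wheel
    "roof": [((0, 8), 1.0)],
}
features = [(20, 30, "wheel"), (40, 30, "wheel"), (30, 17, "roof")]
centre, score, supporters = best_hypothesis(hough_votes(features, codebook))
```

Keeping track of which features supported the winning centre is what makes a top-down step possible: the supporting features indicate which image regions belong to the hypothesised object.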

As an application, we address the problem of detecting objects such as cars, motorbikes, and pedestrians in real-world street scenes. Qualitative and quantitative results on several challenging data sets confirm that our method is able to reliably detect objects in crowded scenes, even when they overlap and partially occlude each other. In addition, the flexible nature of our approach allows it to operate on very small training sets.


Efficient MRF Deformation Model for Image Matching

Alexander Shekhovtsov (CMP Prague, Czech Republic)

We propose a novel MRF-based model for image matching. Given two images, the task is to estimate a mapping from one image to another maximizing the matching quality. We consider mappings defined by a discrete deformation field constrained to preserve 2-dimensional "continuity".

We solve the corresponding optimization problem with the TRW-S (sequential tree-reweighted message passing) algorithm [Wainwright2003, Kolmogorov2005]. Our model design allows for a considerably wider class of smooth transformations and yields a compact representation of the optimization task. For this model, the TRW-S algorithm performed well in our experiments.

We also propose a concise derivation of the TRW-S algorithm as a sequential maximization of a lower bound on the energy function.
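
For reference, the lower bound that tree-reweighted methods work with is usually written as below. The notation is an assumption of this note, following the standard TRW formulation rather than the talk itself: the parameter vector θ of the energy is split into a convex combination of tree-structured parameter vectors θ^T with weights ρ_T, and the energy is linear in its parameters.

```latex
\min_{x} E(x \mid \theta)
  \;=\; \min_{x} \sum_{T} \rho_T \, E(x \mid \theta^{T})
  \;\ge\; \sum_{T} \rho_T \, \min_{x} E(x \mid \theta^{T}),
\qquad
\theta = \sum_{T} \rho_T \, \theta^{T}, \quad
\rho_T \ge 0, \quad \sum_{T} \rho_T = 1 .
```

Each term on the right is a minimisation over a tree and can be computed exactly by dynamic programming. The bound is then tightened by changing the decomposition {θ^T} while keeping its weighted sum equal to θ.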

Joint work with Ivan Kovtun and Vaclav Hlavac.


Multi-Target Tracking in Visual Images -- Finding, Following and Identifying Football Players During a Game

Josephine Sullivan (KTH, Sweden)

Successful multi-target tracking requires solving two problems - localising the targets and labelling their identity. An isolated target's identity can be unambiguously preserved from one frame to the next. However, for long sequences of many moving targets, like a football game, grouping scenarios will occur in which identity labellings cannot be maintained reliably by using continuity of motion or appearance.

In this talk I'll describe how a graph structure can be built and used to describe the interactions that occur. Given this platform, I'll present two separate strategies for labelling the identity of the nodes of this graph. Results will be shown from an international football match captured by a multi-camera rig that produces a wide-screen video giving a full, stationary view of the pitch.