Kinematic Tracking and Annotation

Deva Ramanan
ramanan@robots.ox.ac.uk
http://www.cs.berkeley.edu/~ramanan/

We describe a tracker that can track multiple moving people in long sequences without manual initialization. Moving people are modeled with the assumption that, while configuration can vary substantially from frame to frame, appearance does not. This leads to an algorithm that first builds a model of each individual's body appearance by clustering candidate body segments, and then uses this model to find all individuals in each frame. We show results on real footage that suggest we can count people and track arms and legs fairly accurately.

We can extend our people tracker into a "generalized" tracker: one that builds models of objects while simultaneously tracking them. We show results on real footage of animals.

We use the tracker to create a system that can automatically annotate a video sequence. The system works by (1) tracking people in 2D and then, using an annotated motion capture dataset, (2) synthesizing an annotated 3D motion sequence that matches the 2D tracks. The motion capture data is manually annotated using a class structure that describes everyday motions and allows annotations to be composed (one may jump while running, for example).
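To make the two-stage idea above concrete, here is a minimal sketch on synthetic data. It is an illustration under stated assumptions, not the implementation described here: `detect_candidate_segments` and `appearance_feature` are hypothetical stand-ins for the real body-segment detector and appearance descriptor, and mean shift is just one plausible way to cluster segments when the number of people is unknown.

```python
import numpy as np
from sklearn.cluster import MeanShift

def detect_candidate_segments(frame):
    # Hypothetical detector: here it simply crops the two regions where
    # our synthetic "people" live; the real system would find candidate
    # torso/limb segments bottom-up in each frame.
    return [frame[5:25, 5:15], frame[30:50, 40:50]]

def appearance_feature(patch):
    # Hypothetical descriptor: a normalized intensity histogram.
    hist, _ = np.histogram(patch, bins=16, range=(0.0, 1.0), density=True)
    return hist

# Synthetic sequence: two "people" whose appearance is stable over time
# but distinct from each other (one dark, one bright).
rng = np.random.default_rng(0)
frames = []
for _ in range(20):
    f = rng.uniform(0.4, 0.6, size=(60, 60))
    f[5:25, 5:15] = 0.1 + 0.02 * rng.standard_normal((20, 10))
    f[30:50, 40:50] = 0.9 + 0.02 * rng.standard_normal((20, 10))
    frames.append(f)

# Stage 1: pool candidate segments across the whole sequence and cluster
# them by appearance. Each cluster is one individual's appearance model,
# so the number of clusters is also an estimate of the number of people.
X = np.array([appearance_feature(p)
              for f in frames for p in detect_candidate_segments(f)])
models = MeanShift(bandwidth=3.0).fit(X).cluster_centers_  # toy bandwidth
print("estimated number of people:", len(models))

# Stage 2: in every frame, label each candidate segment with the nearest
# learned appearance model.
for t, f in enumerate(frames[:3]):
    ids = [int(np.linalg.norm(models - appearance_feature(p), axis=1).argmin())
           for p in detect_candidate_segments(f)]
    print(f"frame {t}: segment -> person ids {ids}")
```

Mean shift is used in this sketch because it does not fix the number of clusters in advance, which matches the claim that the tracker can count people; any clustering method with that property would serve equally well here.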
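The annotation step can be sketched in the same spirit. The sketch below makes simplifying assumptions that go beyond the text: an orthographic camera (projection just drops depth), per-frame nearest-neighbour matching in place of synthesizing a full 3D motion sequence, and made-up joint data. The `frozenset` labels illustrate the composable class structure, in which a single motion capture frame may carry both "run" and "jump".

```python
import numpy as np

# Annotated motion capture: each entry is (3D joint positions, label set).
# Labels compose, so a frame can be annotated as both "run" and "jump".
mocap = [
    (np.array([[0., 0., 1.], [0., 1., 1.], [0., 2., 1.]]), frozenset({"walk"})),
    (np.array([[0., 0., 1.], [1., 1., 1.], [0., 2., 1.]]), frozenset({"run"})),
    (np.array([[0., 1., 1.], [1., 2., 1.], [0., 3., 1.]]), frozenset({"run", "jump"})),
]

def project(joints3d):
    # Assumed orthographic camera: drop the depth coordinate.
    return joints3d[:, :2]

def annotate(track2d):
    # For each tracked 2D pose, pick the mocap frame whose projection
    # lies closest and inherit its (composable) annotation set.
    labels = []
    for pose2d in track2d:
        dists = [np.linalg.norm(project(joints) - pose2d) for joints, _ in mocap]
        labels.append(mocap[int(np.argmin(dists))][1])
    return labels

# A toy 2D track that moves from a walking pose to a running jump.
track = [np.array([[0., 0.], [0., 1.], [0., 2.]]),
         np.array([[0., 1.], [1., 2.], [0., 3.]])]
print([sorted(ls) for ls in annotate(track)])   # [['walk'], ['jump', 'run']]
```

The real system synthesizes a whole annotated 3D motion sequence that matches the 2D tracks rather than matching pose-by-pose as above; the label-transfer idea, however, is the same.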