XE33PVR::Lab4 - Video Repainting

Task specification

[Figure: video repaint demonstration]
It is now possible to change the visual content of video sequences automatically. One commercial application of this technology is the automatic replacement of billboards. In this application, a billboard advertising a certain brand is detected and seamlessly removed or replaced by another. This may be done for legal reasons (e.g. tobacco or alcohol advertising is allowed in the country where a sporting event takes place but not where it is broadcast), because of prior commercial agreements, or to better target the audience. An example of this last reason is shown in the following image, in which the Gaz de France logo (top) is replaced by an EMB logo (bottom). This particular example was taken from G. Medioni et al., "Real-Time Billboard Substitution in a Video Stream," Proceedings of the 10th Tyrrhenian International Workshop on Digital Communications.

The main goal of the final assignment is to detect a logo in a video sequence and smoothly replace it with another. This is very similar to what you have done in the repainting lab. The most challenging part here is detecting the object and computing the necessary homography mapping automatically. Ideally, you should detect all occurrences of the selected logo and replace them unobtrusively with another one. An uninformed observer should not be able to tell that the sequence has been faked.

Your general work-flow may consist of the following steps:

Define an object of interest, find a suitable frame and crop out the relevant part.
Perform image matching between the cropped sample and all images in the video sequence.
For each matched object, robustly compute the homography mapping for the repainting.
Repaint all occurrences of the logo in the sequence with a new one.
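
The last step of the work-flow can be sketched as follows. This is a minimal illustration in plain Python, not the reference solution: the function names `invert_3x3` and `repaint` are hypothetical, images are assumed to be nested lists of pixel values, and `H` is assumed to map logo coordinates (x, y) to frame coordinates. It uses inverse mapping with nearest-neighbour sampling, so every frame pixel is visited once and no holes appear in the pasted logo.

```python
import math

def invert_3x3(H):
    """Invert a 3x3 matrix given as nested lists (adjugate formula)."""
    (a, b, c), (d, e, f), (g, h, i) = H
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("homography is not invertible")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]

def repaint(frame, logo, H):
    """Return a copy of `frame` with `logo` pasted in.

    Assumption: H maps logo (x, y) to frame (x, y); images are lists of rows.
    """
    Hinv = invert_3x3(H)
    out = [row[:] for row in frame]          # leave the input frame untouched
    logo_h, logo_w = len(logo), len(logo[0])
    for y in range(len(frame)):
        for x in range(len(frame[0])):
            # Map the frame pixel back into logo coordinates.
            w = Hinv[2][0] * x + Hinv[2][1] * y + Hinv[2][2]
            if abs(w) < 1e-12:
                continue
            u = (Hinv[0][0] * x + Hinv[0][1] * y + Hinv[0][2]) / w
            v = (Hinv[1][0] * x + Hinv[1][1] * y + Hinv[1][2]) / w
            ui, vi = math.floor(u + 0.5), math.floor(v + 0.5)  # nearest pixel
            if 0 <= ui < logo_w and 0 <= vi < logo_h:
                out[y][x] = logo[vi][ui]
    return out
```

For a natural-looking result, nearest-neighbour sampling would probably have to be replaced by interpolation and some blending at the logo border.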

This final assignment is more complex than the previous ones. We recommend that you start working on it straight away and use the Wednesday lab hours to resolve any questions that come up.

Video Sequences

Several image sequences are provided in which you must detect a predefined object under various conditions (different illumination, indoor/outdoor, different cameras, etc.). Feel free to capture or download your own sequence. Frame 43 from sequence seq007.zip (top) shows an outdoor scene where the object of interest (a beer sign) was taken from a database. Frame 595 from the indoor sequence newspaper.zip (bottom) shows an object of interest (a newspaper headline) cropped from a different frame.

Available sequences (please note that each is several tens of MB):

Indoor	Outdoor
`cartoon.zip` `newspaper.zip` `ng.zip`	`seq007.zip`

More sequences may be added in the future.

Practical Issues

Detection

Note that the two beer logos are not exactly the same: they differ in size and orientation. Yet a human has no problem identifying them as the same object. A wide area of computer vision deals with robustly detecting and matching objects and with measuring their similarity. As you will have to detect the same object in a large number of images, we suggest that you use a SIFT detector. The scale-invariant feature transform (SIFT) is an algorithm for extracting distinctive features from images. An implementation for non-commercial use, together with the articles describing its theoretical background, is available at the author's web site (http://www.cs.ubc.ca/~lowe/keypoints/). It has a simple interface to Matlab. Here is an example of the common features found in the grayscale versions of the image pairs shown above.
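
Once descriptors have been extracted from the cropped sample and from a frame, they have to be paired. The following is a minimal sketch in plain Python of nearest-neighbour matching with the distance-ratio test described in Lowe's articles. The function name `match_descriptors` and the representation of descriptors as lists of numbers are assumptions made for illustration; this is not the interface of the SIFT implementation linked above. The threshold 0.8 follows the value suggested by Lowe and should be treated as a tunable parameter.

```python
import math

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_a[i] is matched to desc_b[j].

    A match is kept only if the nearest neighbour in desc_b is clearly
    closer than the second nearest one (distance ratio below `ratio`).
    """
    matches = []
    if len(desc_b) < 2:
        return matches  # the ratio test needs two candidates
    for i, da in enumerate(desc_a):
        # Euclidean distances from da to every descriptor in desc_b, sorted.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches
```

The exhaustive search is quadratic in the number of descriptors, which may be too slow for long sequences. Even with the ratio test, some of the returned pairs will be wrong.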

Homography computation

You can see from the examples above that many, but not all, of the matches are correct. Some corresponding pairs are clearly mismatches (outliers). Part of the task is to robustly select which matches are consistent with a predefined geometric model (inliers) and which are not (outliers). One of the most widely used tools for this is the RANSAC algorithm. The RANSAC algorithm was explained in the ransac.pdf lecture and homographies in the homography.pdf lecture.
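
To make the idea concrete, here is a minimal sketch of RANSAC for a homography in plain Python. It is an illustration under stated assumptions, not the method from the lectures: all function names are hypothetical, the parameters `iters`, `tol` (in pixels) and `seed` are illustrative defaults, and the homography is normalised so that its last element is 1, which excludes the rare case where that element is zero. Each iteration fits a homography to four random matches and counts how many matches it explains.

```python
import random

def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            return None  # singular system, e.g. collinear sample points
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def homography_from_4(src, dst):
    """3x3 homography (nested lists, H[2][2] = 1) mapping 4 src points to dst."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve_linear(A, b)
    return None if h is None else [h[0:3], h[3:6], h[6:8] + [1.0]]

def apply_h(H, pt):
    """Map a 2D point through H; returns None for points sent to infinity."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        return None
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def ransac_homography(src, dst, iters=500, tol=3.0, seed=0):
    """Return (H, inlier_indices) for matches src[i] <-> dst[i]."""
    if len(src) < 4:
        return None, []
    rng = random.Random(seed)
    best_H, best_inliers = None, []
    for _ in range(iters):
        idx = rng.sample(range(len(src)), 4)      # minimal sample
        H = homography_from_4([src[i] for i in idx], [dst[i] for i in idx])
        if H is None:
            continue
        inliers = []
        for i, (p, q) in enumerate(zip(src, dst)):
            m = apply_h(H, p)
            if m and (m[0] - q[0]) ** 2 + (m[1] - q[1]) ** 2 <= tol ** 2:
                inliers.append(i)
        if len(inliers) > len(best_inliers):      # keep the best model so far
            best_H, best_inliers = H, inliers
    return best_H, best_inliers
```

This sketch returns the homography of the best minimal sample. Re-estimating the homography from all inliers afterwards would likely give a more stable result, and a small inlier count can be used to decide that the logo is not present in a frame.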

External Code

You may use functions, or code in general, that are available and free for education/research purposes. The responsibility, however, is still yours. You should fully understand how the code works and be able to explain why you have used it. No excuses for malfunctioning external code will be accepted. Moreover, we cannot promise any support in debugging external code.

For the matching part of the task, you may consider using some of the support code from the advanced digital image processing course.

Expected output (submission)

Report

A short report describing your solution to the problem. It should contain an analysis of the problem, the proposed solution, a discussion of the results, and references to the sources you used. References include both literature and external code. The report must also contain a user guide explaining how to run your software. Accepted formats (only!): PDF, html. You may consider using the Matlab function publish to automate the generation of the output.

Code

All code must be properly commented. External code must be clearly separated from your own work.

Results

Prepare repainted video sequences from your results. A slightly outdated guide on making videos from images is available.

Grading, up to 25 points in total.

I will evaluate your work mainly based on the following criteria:

Functionality, performance, robustness, etc.: Simply put, your code should work on as many sequences as possible without manual tuning of parameters.
Clarity and correctness: Clarity of the code, comments, parameter parsing, proper decomposition into subtasks, ...
Report quality: Anyone should be able to reproduce your results and evaluate your work just by reading your report.
Results quality: The faked sequences should look as natural as possible, ideally with no shaking, jittering, etc.

[XE33PVR labs | Responsible: Tomáš Svoboda ]
Last modified at 16:45, 20 November 2007 CET.