Speaker Notes
Press s to open speaker notes. The presentation alone is not intended to be self-contained.
Rolling Shutter Camera Synchronization with Sub-millisecond Accuracy
Matěj Šmíd, Jiří Matas
Center for Machine Perception
Czech Technical University in Prague
smidm@cmp.felk.cvut.cz
1 flap of wings takes 5 ms.
def:
independent videos
associate time-point in one video to time-points in other videos
method requires: rolling shutter and a global lighting change
Honeybee in slow motion. Our accuracy is 1/5 to 1/10 of one wing flap.
slides with speaker notes available
Multiple Cameras
are common
synchronization is required
are the only solution for some applications
Video effect based on interpolation between multiple synchronized cameras.
motivation for multi-camera
3D reconstruction
human pose estimation
tracking
occlusion handling
multi-directional camera rigs
a hardware solution for synchronization is not always feasible
Tracking Application
our motivation: multi-target tracking
Ice Hockey Dataset
synchronization is essential
IP cameras without hardware trigger
independently acquired videos
dataset:
ice hockey sequence
4 cameras
5 minutes
The Core Idea
synchronize using lighting changes affecting the whole scene
A flash during a rolling shutter sensor exposure.
image overlap not required (!)
horizontal bright stripe covering approximately 1/3 of an image
professional photographers' flashes
common in indoor sports and cultural events (concerts)
easy to produce in an experimental setting (even outdoors)
Detected Events
Lighting events detected in 4 video sequences.
detection:
line-wise median over image rows
difference between consecutive images
frames with lighting changes detected (detection sketch below)
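A minimal sketch of this detection step, assuming OpenCV-readable videos; the function name and threshold are illustrative assumptions, not the reference implementation.

import cv2
import numpy as np

def detect_lighting_events(filename, threshold=10.0):
    """Return (frame index, peak row) pairs where a sudden lighting change occurs."""
    cap = cv2.VideoCapture(filename)
    events = []
    prev_profile = None
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        profile = np.median(gray, axis=1)       # line-wise median: one value per image row
        if prev_profile is not None:
            diff = profile - prev_profile       # difference between consecutive images
            if np.abs(diff).max() > threshold:  # strong global lighting change
                events.append((frame_idx, int(np.abs(diff).argmax())))
        prev_profile = profile
        frame_idx += 1
    cap.release()
    return events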
Synchronization by Mapping Frames
choose reference camera
synchronize by adding \(n\) whole frames
assumption: same fps
\(f_\mathrm{ref} = f + n\)
\(f_\mathrm{ref}\) reference camera frame nr.
\(f\) second camera frame nr.
synchronization transformation (arrows)
explain figure
common in synchronization literature (!)
different fps (fix by scaling with the fps ratio; formula below)
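A hedged sketch of that fps-ratio fix (the exact form in the paper may differ): \(f_\mathrm{ref} = \frac{\mathrm{fps}_\mathrm{ref}}{\mathrm{fps}} f + n\). For example, mapping a 12.5 fps camera to a 25 fps reference doubles the frame number before adding the integer offset \(n\).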
Frame Drops
common phenomenon
often ignored
cause: high load, lost packets
synchronization error cumulative
example error: 25 fps, 3 dropped frames (worked out below):
ice hockey player travelled 1 m
puck more than 4 m
TODO: frame drops statistics in my videos
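A rough worked example, assuming typical speeds of about 8 m/s for a skating player and 35 m/s for a shot puck (assumed values, not measurements from the dataset): \(3\ \mathrm{frames} / 25\,\mathrm{fps} = 120\,\mathrm{ms}\), so \(8\,\mathrm{m/s} \cdot 0.12\,\mathrm{s} \approx 1\,\mathrm{m}\) for the player and \(35\,\mathrm{m/s} \cdot 0.12\,\mathrm{s} \approx 4.2\,\mathrm{m}\) for the puck.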
Video Timing
$ ffprobe -select_streams v \
    -show_entries frame=best_effort_timestamp_time \
    video.mp4
frame,0.000000
frame,0.040000
frame,0.080000
frame,0.120000
frame,0.160000
frame,0.200000
frame,0.240000
frame,0.280000
frame,0.320000
frame,0.360000
frame,0.400000
frame,0.440000
use frame timestamps instead of indices
available:
video containers
streaming protocols
why is it ignored?
all synchronization literature operates on frame indices
ffprobe is part of the ffmpeg package
do not rely on opencv video timing queries (timestamp extraction sketch below)
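A minimal sketch for reading per-frame timestamps from Python by calling ffprobe; the csv output flag is an assumption about how the command shown above was invoked.

import subprocess

def frame_timestamps(filename):
    """Return per-frame timestamps (seconds) stored in the video container."""
    out = subprocess.check_output([
        'ffprobe', '-select_streams', 'v:0',
        '-show_entries', 'frame=best_effort_timestamp_time',
        '-of', 'csv=p=0', filename], text=True)
    return [float(line) for line in out.splitlines() if line and line != 'N/A']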
Timestamp based Sync
fps-invariant and resistant to frame drops
\(\color{red}{t}_\mathrm{ref} = \color{red}{t} + \Delta \color{red}{t}\)
\(t\) frame timestamp
\(\Delta t\) time shift
new parts of synchronization transformation are in red
different fps (25, 12.5)
dropped frame in red camera at 120 ms
Rolling Shutter
frame rows not exposed simultaneously
small delay between consecutive rows
disadvantage: image distortion
advantage: high-speed optical line sensor
we can exploit rolling shutter properties for accurate synchronization
used by almost all image sensors sold today
sampling rate: \(\mathrm{fps} \cdot H\), where \(H\) is the vertical resolution
theoretical sampling period of tens of microseconds per row (worked example below)
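For illustration (values assumed, not taken from the paper): a 25 fps camera with \(H = 1080\) rows samples \(25 \cdot 1080 = 27\,000\) rows per second, i.e. one row roughly every \(37\,\mu\mathrm{s}\).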
Distortion
Objects at the top and at the bottom of the image are captured at different time instants.
camera filming from a moving train
common when the camera or objects move horizontally
Lighting Change
e.g. flash, room light switch
profile: line-wise median, difference between consecutive images
easy to produce
light change analysis:
reduction of 2D image into 1D: mean / median over rows
difference between consecutive frames
shape formed by the physical properties of the light source
fit an exponential decay (sketch below)
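A minimal sketch of fitting an exponential decay to a flash profile; the synthetic profile and parameter values are assumptions for illustration only.

import numpy as np
from scipy.optimize import curve_fit

def exp_decay(row, amplitude, decay):
    """Exponential light fall-off after the flash onset row."""
    return amplitude * np.exp(-row / decay)

# synthetic stand-in for a flash profile: difference of line-wise medians
# between the flash frame and the previous frame (1080 image rows)
rows = np.arange(1080.0)
profile_diff = np.where(rows >= 400, 80.0 * np.exp(-(rows - 400) / 150.0), 0.0)
profile_diff += np.random.normal(0.0, 1.0, rows.size)

# locate the onset at the profile peak, then fit the decay of the tail
onset_row = int(profile_diff.argmax())
tail = profile_diff[onset_row:]
(amplitude, decay), _ = curve_fit(exp_decay, np.arange(tail.size), tail,
                                  p0=(tail.max(), 100.0))
# onset_row localizes the lighting change to a single image row (sub-frame accuracy)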
Going Sub-frame
we can associate the time of a row in an arbitrary camera with the time of a row in the reference camera
\(t'_\mathrm{ref} = t + \color{red}{r T_\mathrm{row}} + \Delta t\)
\(r\) row number
\(T_\mathrm{row}\) time per row
we can associate time of a row in arbitrary camera to a time of a row in reference camera
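With the same illustrative numbers as before (\(T_\mathrm{row} \approx 37\,\mu\mathrm{s}\) for 25 fps and 1080 rows), an event detected at row \(r = 540\) adds \(540 \cdot 37\,\mu\mathrm{s} \approx 20\,\mathrm{ms}\) to the frame timestamp, i.e. half of the 40 ms frame period; this is exactly the sub-frame information that frame-index mappings discard.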
Clock Skew
image sensor clock generators are slightly inaccurate; at millisecond accuracy the drift matters
compensation needed (\(\beta\))
\(t''_\mathrm{ref} = \color{red}{\beta} t + r T_\mathrm{row} + \Delta t\)
\(\beta\) relative clock skew to the reference camera
Sync Parameters
lighting event: \((t, r)\) frame time and row pair
pair of corresponding events in 2 cameras = 1 equation
system of linear equations
least squares solution
extract row and frame time pairs for all detected events
manually or automatically match the events
not all events need to be matched
construct system of linear equations
solve to get time shifts, clock skews and sub-frame (row time) constants (least-squares sketch below)
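A minimal sketch of the linear system for one camera against the reference, using made-up event values; it assumes the reference row period is known (e.g. from fps and vertical resolution), which may differ from the paper's exact formulation.

import numpy as np

# hypothetical matched lighting events: (frame timestamp [s], row) pairs
events_ref = [(0.120, 310), (5.480, 122), (61.200, 845), (118.640, 50)]
events_cam = [(0.155, 290), (5.512, 101), (61.230, 830), (118.668, 33)]

# assumed known row period of the reference camera, e.g. 1 / (fps * H)
T_row_ref = 1.0 / (25 * 1080)

# each matched pair gives one equation: beta*t + T_row*r + dt = t_ref + T_row_ref*r_ref
A = np.array([[t, r, 1.0] for t, r in events_cam])
b = np.array([t_ref + T_row_ref * r_ref for t_ref, r_ref in events_ref])

(beta, T_row, dt), *_ = np.linalg.lstsq(A, b, rcond=None)
print('beta=%.6f  T_row=%.1f us  dt=%.1f ms' % (beta, T_row * 1e6, dt * 1e3))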
Results
each colour represents moving objects from one camera
merged using a ground plane homography, so points on the ground plane are aligned
Results
residual errors < 1 ms
ice hockey dataset, 5 minutes, 4 cameras
Open Source
import flashvideosynchronization

# illustrative setup (not part of the original snippet): camera ids and video files
cameras = [1, 2, 3, 4]
filenames = {cam: 'camera%d.mp4' % cam for cam in cameras}

sync = flashvideosynchronization.FlashVideoSynchronization()
sync.detect_flash_events(filenames)

# manually match one lighting event per camera (event indices are dataset specific)
matching_events = {1: 3, 3: 2, 2: 8, 4: 2}
offsets = {
    cam: sync.events[cam][matching_events[cam]]['time']
    for cam in cameras}

# synchronize cameras: find parameters of transformations
# that map camera time to reference camera time
sync.synchronize(cameras, offsets, base_cam=1)

# get sub-frame synchronized time for camera 1, frame 10 and row 100
# (timestamps holds per-frame timestamps for each camera, e.g. from ffprobe)
print(sync.get_time(cam=1, frame_time=timestamps[1][10], row=100))
https://github.com/smidm/flashvideosynchronization
python module
easy to use
Summary
sub-millisecond accuracy for rolling shutter cameras
requirement: global detectable lighting changes
open source
http://cmp.felk.cvut.cz/~smidm/flash_synchronization