The talk will cover the original (2014) version of our scene text
localization and recognition pipeline, as summarized in our TPAMI 2015 paper.
The method's real-time performance is achieved by posing the character detection
and segmentation problem as an efficient sequential selection from the set of
Extremal Regions. The ER detector is robust against blur, low contrast and
illumination, color and texture variation. In the first stage, the probability
of each ER being a character is estimated using features calculated by a novel
algorithm in constant time, and only ERs with locally maximal probability are
selected for the second stage, where the classification accuracy is improved
using computationally more expensive features.
A highly efficient clustering algorithm then groups ERs into text lines, and an
OCR classifier trained on synthetic fonts is used to label character
regions. The most probable character sequence is selected in the last stage,
when the context of each character is known.
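
The first-stage selection rule can be illustrated with a minimal sketch. It
assumes the character probabilities of a chain of nested ERs (one per
threshold) are already available as a list. The function name
select_local_maxima, the p_min cutoff and the sample values are hypothetical
and are not taken from the pipeline itself.

```python
from typing import List, Sequence


def select_local_maxima(probs: Sequence[float], p_min: float = 0.5) -> List[int]:
    """Return indices of strict local maxima of probs that are at least p_min.

    probs[i] is assumed to be the estimated probability that the i-th ER
    in a chain of nested regions is a character. p_min is an illustrative
    cutoff, not a value from the original method.
    """
    selected = []
    last = len(probs) - 1
    for i, p in enumerate(probs):
        # Treat the ends of the chain as having no neighbour on that side.
        left = probs[i - 1] if i > 0 else float("-inf")
        right = probs[i + 1] if i < last else float("-inf")
        # Strict comparison: plateaus of equal probability are skipped.
        if p > left and p > right and p >= p_min:
            selected.append(i)
    return selected


# Hypothetical probabilities along one chain of nested ERs.
chain = [0.1, 0.4, 0.9, 0.6, 0.2, 0.7, 0.3, 0.35]
candidates = select_local_maxima(chain, p_min=0.5)
```

Only the regions at the returned indices would be passed on to the more
expensive second-stage classifier, which is what keeps the first stage cheap.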