(by J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, Y. Wei, ICCV 2017)
Presented by Tomáš Jeníček.
Convolutional neural networks (CNNs) are inherently limited in modeling geometric
transformations due to the fixed geometric structures in their building modules.
In this work, we introduce two new modules to enhance the transformation
modeling capacity of CNNs, namely, deformable convolution and deformable RoI
pooling. Both are based on the idea of augmenting the spatial sampling locations
in the modules with additional offsets and learning the offsets from target
tasks, without additional supervision. The new modules can readily replace their
plain counterparts in existing CNNs and can be easily trained end-to-end by
standard back-propagation, giving rise to deformable convolutional networks.
Extensive experiments validate the effectiveness of our approach on
sophisticated vision tasks of object detection and semantic segmentation. The
code will be released.
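
To make the idea of "augmenting the spatial sampling locations with additional offsets" concrete, here is a minimal pure-Python sketch of a single-channel deformable convolution. It is an illustration under stated assumptions, not the authors' implementation: the function names `bilinear` and `deformable_conv2d` are hypothetical, stride is 1, padding is zero, and the offsets are passed in as an argument, whereas in the paper they are predicted by an extra convolutional layer and learned by back-propagation. Fractional sampling positions are read with bilinear interpolation.

```python
import math

def bilinear(img, y, x):
    """Sample img (a list of rows) at fractional (y, x); zero outside the image."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four integer neighbours of (y, x) with bilinear weights.
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * img[yy][xx]
    return value

def deformable_conv2d(img, kernel, offsets):
    """Single-channel deformable convolution (stride 1, zero padding).

    offsets[i][j][k] is a (dy, dx) pair added to the regular grid position
    of kernel tap k at output location (i, j). Here the offsets are an
    input (assumption); in the paper a conv layer predicts them.
    """
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    dy, dx = offsets[i][j][a * kw + b]
                    # Regular grid position plus the per-tap offset.
                    y = i + a - kh // 2 + dy
                    x = j + b - kw // 2 + dx
                    acc += kernel[a][b] * bilinear(img, y, x)
            out[i][j] = acc
    return out

# Usage: with all offsets set to zero the layer reduces to a plain convolution.
img = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
kernel = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
zero_offsets = [[[(0.0, 0.0)] * 9 for _ in range(3)] for _ in range(3)]
plain = deformable_conv2d(img, kernel, zero_offsets)
```

With zero offsets the sampling grid is the usual fixed one, which is why the deformable module can stand in for its plain counterpart; non-zero offsets move each tap independently, which is what lets the receptive field adapt to object shape.
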
See the Reading Groups page.