Deep Models of Visual Aesthetics for Image Retrieval and In-Painting

John Collomosse (University of Surrey, UK)

Abstract:

This talk will explore the disentanglement of visual structure and aesthetics using convolutional neural networks (CNNs), and the application of this capability to visual search and content-aware image completion (in-painting). We first describe how an annotated dataset derived from the creative portfolio website Behance.Net can be used to learn a deep representation for style [2,3]. We then show how structure and style can be teased apart to allow independent specification of these as visual search criteria [2]. For example, a query comprising a sketch of a dog and a handful of watercolor images could return artwork of dogs in that watercolor style, uniquely enabling fine-grained control of search at an aesthetic level. We then show how image completion algorithms can leverage both this search framework and style model to enhance the performance of in-painting [1]. The talk covers work recently presented at ICCV 2017 and to appear at CVPR 2018.

[1] "Disentangling Structure and Aesthetics for Content-aware Image
Completion". A. Gilbert, J. Collomosse, H. Jin and B. Price. CVPR 2018

[2] "Sketching with Style: Visual Search with Sketches and Aesthetic
Context". J. Collomosse, T. Bui, M. Wilber, C. Fang and H. Jin. ICCV 2017

[3] "BAM! The Behance Artistic Media Dataset for Recognition Beyond
Photography". M. Wilber, C. Fang, H. Jin, A. Hertzmann, J. Collomosse
and S. Belongie. ICCV 2017

Bio:

Dr John Collomosse is a Reader (Assoc. Prof.) in the Centre for Vision, Speech and Signal Processing (CVSSP) at the University of Surrey, and a Visiting Professor at Adobe Research within the Creative Intelligence Lab (CIL). John joined CVSSP in 2009, following 4 years lecturing at the University of Bath, where he also completed his PhD in Computer Vision and Graphics (2004). John has spent periods of time at IBM UK Labs, Vodafone R&D Munich, and HP Labs Bristol. John's research is cross-disciplinary, spanning Computer Vision, Computer Graphics and Artificial Intelligence. It focuses on ways to visually search and manipulate large, unstructured image and video collections, and to present them in aesthetic and comprehensible ways. Recent projects spanning Vision and Graphics include: sketch-based search of images/video; plagiarism detection in the arts; visual search of dance; structuring and presenting large visual media collections using artistic rendering; developing character animation from 3D multi-view capture data. John holds ~80 refereed publications, including oral presentations at ICCV and BMVC, and journal papers in IJCV, IEEE TVCG and TMM. He was general chair for NPAR 2010-11 (at SIGGRAPH), BMVC 2012, and CVMP 2014-15, and is an AE for C&G and Eurographics CGF.