Learning Transformation Invariant Representations with Weak Supervision

Benjamin Coors, Alexandru Condurache, Alfred Mertins, Andreas Geiger

Abstract

Deep convolutional neural networks are the current state-of-the-art solution to many computer vision tasks. However, their ability to handle large global and local image transformations is limited. Consequently, extensive data augmentation is often utilized to incorporate prior knowledge about desired invariances to geometric transformations such as rotations or scale changes. In this work, we combine data augmentation with an unsupervised loss which enforces similarity between the predictions of augmented copies of an input sample. Our loss acts as an effective regularizer which facilitates the learning of transformation invariant representations. We investigate the effectiveness of the proposed similarity loss on rotated MNIST and the German Traffic Sign Recognition Benchmark (GTSRB) in the context of different classification models including ladder networks. Our experiments demonstrate improvements with respect to the standard data augmentation approach for supervised and semi-supervised learning tasks, in particular in the presence of little annotated data. In addition, we analyze the performance of the proposed approach with respect to its hyperparameters, including the strength of the regularization as well as the layer where representation similarity is enforced.
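The core idea admits a short sketch. The following PyTorch-style snippet is a minimal illustration, not the authors' code: the function name training_step, the rotation-based augmentation, and the choice of mean-squared error between softmax outputs are assumptions. It combines a supervised cross-entropy term with an unsupervised similarity term that penalizes disagreement between predictions for two independently augmented copies of the same inputs; the abstract also mentions enforcing similarity at intermediate layers, which this sketch omits for brevity.

    import torch
    import torch.nn.functional as F
    import torchvision.transforms as T

    # Hypothetical augmentation: random rotations, as on rotated MNIST.
    augment = T.RandomRotation(degrees=180)

    def training_step(model, x_labeled, y, x_unlabeled, lambda_sim=1.0):
        # Supervised term: standard cross-entropy on augmented labeled data.
        sup_loss = F.cross_entropy(model(augment(x_labeled)), y)

        # Unsupervised similarity term: predictions for two independently
        # augmented copies of the same (possibly unlabeled) inputs should
        # agree, encouraging transformation invariant representations.
        p1 = F.softmax(model(augment(x_unlabeled)), dim=1)
        p2 = F.softmax(model(augment(x_unlabeled)), dim=1)
        sim_loss = F.mse_loss(p1, p2)

        # lambda_sim controls the strength of the regularization.
        return sup_loss + lambda_sim * sim_loss

In a fully supervised setting, x_unlabeled would simply be the labeled batch itself; in the semi-supervised setting, the similarity term can additionally draw on unannotated samples, consistent with the abstract's observation that the gains are largest when little annotated data is available.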
Original language: English
Pages: 64-72
Number of pages: 9
Publication status: Published - 01.01.2018
Event: 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Funchal, Madeira, Portugal
Duration: 27.01.2018 - 29.01.2018

Conference

Conference: 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
Abbreviated title: VISIGRAPP 2018
Country/Territory: Portugal
City: Funchal, Madeira
Period: 27.01.18 - 29.01.18
