Learning Transformation Invariant Representations with Weak Supervision

Benjamin Coors, Alexandru Condurache, Alfred Mertins, Andreas Geiger

Abstract

Deep convolutional neural networks are the current state-of-the-art solution to many computer vision tasks. However, their ability to handle large global and local image transformations is limited. Consequently, extensive data augmentation is often utilized to incorporate prior knowledge about desired invariances to geometric transformations such as rotations or scale changes. In this work, we combine data augmentation with an unsupervised loss which enforces similarity between the predictions of augmented copies of an input sample. Our loss acts as an effective regularizer which facilitates the learning of transformation invariant representations. We investigate the effectiveness of the proposed similarity loss on rotated MNIST and the German Traffic Sign Recognition Benchmark (GTSRB) in the context of different classification models including ladder networks. Our experiments demonstrate improvements with respect to the standard data augmentation approach for supervised and semi-supervised learning tasks, in particular in the presence of little annotated data. In addition, we analyze the performance of the proposed approach with respect to its hyperparameters, including the strength of the regularization as well as the layer where representation similarity is enforced.
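To illustrate the idea described in the abstract, the following is a minimal NumPy sketch of such a prediction-similarity loss, not the authors' implementation: two augmented copies of the same inputs are passed through the model, and the mean squared distance between their predicted class distributions is added, weighted by a hypothetical hyperparameter `lam`, to the supervised loss.

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the class axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def similarity_loss(logits_a, logits_b):
    """Mean squared distance between the class distributions predicted
    for two augmented copies of the same batch of inputs.
    Zero when both copies receive identical predictions."""
    p_a = softmax(logits_a)
    p_b = softmax(logits_b)
    return float(np.mean((p_a - p_b) ** 2))

def total_loss(supervised_loss, logits_a, logits_b, lam=1.0):
    # hypothetical combined objective: supervised term plus the
    # lam-weighted similarity regularizer (lam controls its strength)
    return supervised_loss + lam * similarity_loss(logits_a, logits_b)
```

In this sketch the similarity term is enforced on the output layer; the paper also studies enforcing it at intermediate layers, which would correspond to comparing hidden activations instead of the softmax outputs.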
Original language: English
Pages: 64-72
Number of pages: 9
DOIs
Publication status: Published - 01.01.2018
Event: 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Funchal, Madeira, Portugal
Duration: 27.01.2018 - 29.01.2018

Conference

Conference: 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
Short title: VISIGRAPP 2018
Country/Territory: Portugal
City: Funchal, Madeira
Period: 27.01.2018 - 29.01.2018
