Abstract
In this work, we apply a fully differentiable Recurrent Model of Visual Attention to unconstrained real-world images. We propose a deep recurrent attention model and show that it can successfully learn to jointly localize and classify objects. We evaluate our model on multi-digit images generated from MNIST data, Google Street View images, and a fine-grained recognition dataset of 200 bird species, and show that its performance is comparable or superior to that of alternative models.
Original language | English |
---|---|
Title | 2016 IEEE Symposium Series on Computational Intelligence (SSCI) |
Publisher | IEEE |
Publication date | 9 February 2017 |
Article number | 7850113 |
ISBN (print) | 978-1-5090-4241-8 |
ISBN (electronic) | 978-1-5090-4240-1 |
DOIs | |
Publication status | Published - 9 February 2017 |
Event | 2016 IEEE Symposium Series on Computational Intelligence - Athens, Greece. Duration: 6 December 2016 → 9 December 2016. Conference number: 126460 |