Abstract
In this work, we apply a fully differentiable Recurrent Model of Visual Attention to unconstrained real-world images. We propose a deep recurrent attention model and show that it can successfully learn to jointly localize and classify objects. We evaluate our model on multi-digit images generated from MNIST data, Google Street View images, and a fine-grained recognition dataset of 200 bird species, and show that its performance is comparable or superior to that of alternative models.
Original language | English |
---|---|
Title of host publication | 2016 IEEE Symposium Series on Computational Intelligence (SSCI) |
Publisher | IEEE |
Publication date | 09.02.2017 |
Article number | 7850113 |
ISBN (Print) | 978-1-5090-4241-8 |
ISBN (Electronic) | 978-1-5090-4240-1 |
Publication status | Published - 09.02.2017 |
Event | 2016 IEEE Symposium Series on Computational Intelligence, Athens, Greece. Duration: 06.12.2016 → 09.12.2016. Conference number: 126460 |