Abstract
Multi-Layer Perceptrons (MLPs) have been in use for decades. For a long time it seemed that MLPs had reached their limits, but recent advances caught our attention. Ciresan et al. [1] show that, given a proper training set, deep MLPs can outperform all other state-of-the-art machine learning algorithms on the MNIST dataset of handwritten digits. However, a vast number of training samples is needed, which have to be generated artificially with special transformations. The drawback is that appropriate transformations might be known for handwritten digits but not in general. Work by Hinton et al. and Bengio et al. [2, 3] suggests that unsupervised pre-training helps to find deep MLPs that generalize better from the training set, thus avoiding the need to generate extra training data. We are interested in how this better generalization is achieved and therefore tested alternative, similar architectures. Here we present some observations from our first tests on the MNIST dataset. There is still a lot of room for improvement to reach the error rates of Ciresan's approach.
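To illustrate the kind of artificial training-set expansion referred to above, the sketch below applies random rotations and a Simard-style elastic distortion to digit images. This is not the transformation pipeline of [1]; the function name `distort_digit` and all parameter values (`alpha`, `sigma`, `max_angle`) are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates, rotate

def distort_digit(image, rng, alpha=8.0, sigma=4.0, max_angle=15.0):
    """Return a randomly deformed copy of a 28x28 digit image.

    Combines a small random rotation with an elastic distortion:
    a random displacement field is smoothed with a Gaussian kernel
    and used to resample the image.
    """
    # Random rotation, keeping the 28x28 shape.
    image = rotate(image, angle=rng.uniform(-max_angle, max_angle),
                   reshape=False, order=1)

    # Smoothed random displacement field for the elastic distortion.
    dx = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha

    y, x = np.meshgrid(np.arange(image.shape[0]),
                       np.arange(image.shape[1]), indexing="ij")
    coords = np.array([y + dy, x + dx])
    return map_coordinates(image, coords, order=1, mode="reflect")

# Example: expand a small batch of digits into several deformed variants each.
rng = np.random.default_rng(0)
digits = rng.random((10, 28, 28))   # stand-in for real MNIST images
augmented = np.stack([distort_digit(d, rng) for d in digits for _ in range(5)])
print(augmented.shape)              # (50, 28, 28)
```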
| Original language | English |
|---|---|
| Title of host publication | Workshop New Challenges in Neural Computation 2012 |
| Editors | Barbara Hammer, Thomas Villmann |
| Number of pages | 3 |
| Volume | 3 |
| Publication date | 21.08.2012 |
| Pages | 113-115 |
| Publication status | Published - 21.08.2012 |