Experience in Training (Deep) Multi-Layer Perceptrons to Classify Digits

Jens Hocke, Thomas Martinetz


Multi-Layer Perceptrons (MLPs) have been in use for decades. For a long time it seemed that MLPs had reached their limits, but recent advances caught our attention. Ciresan et al. [1] show that, given a proper training set, deep MLPs can outperform all other state-of-the-art machine learning algorithms on the MNIST dataset of handwritten digits. However, a vast number of training samples is needed, which has to be generated artificially with special transformations. The drawback is that appropriate transformations may be known for handwritten digits, but not in general. Work by Hinton et al. and Bengio et al. [2, 3] suggests that unsupervised pre-training helps to find deep MLPs that generalize better from the training set, thus avoiding the need to generate extra training data. We are interested in how this better generalization is achieved, and therefore tested alternative, similar architectures. Here we present some observations made in our first tests on the MNIST dataset. There is still a lot of room for improvement before reaching the error rates of Ciresan's approach.
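The artificial expansion of the training set mentioned above can be illustrated with a minimal sketch: each digit image is replicated under small random rotations and shifts, so that the network sees many distorted variants of every sample. The function below is a hypothetical illustration of this idea (nearest-neighbour resampling, not the exact elastic/affine distortions used by Ciresan et al.).

```python
import numpy as np

def augment_digit(img, max_shift=2, max_angle=15.0, rng=None):
    """Return a copy of `img` under a small random rotation and shift.

    A toy stand-in for the 'special transformations' used to inflate
    a digit training set; uses nearest-neighbour inverse mapping.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape
    angle = np.deg2rad(rng.uniform(-max_angle, max_angle))
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse map: for every output pixel, locate its source pixel.
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    sy = np.rint(cos_a * (ys - cy - dy) + sin_a * (xs - cx - dx) + cy).astype(int)
    sx = np.rint(-sin_a * (ys - cy - dy) + cos_a * (xs - cx - dx) + cx).astype(int)
    out = np.zeros_like(img)
    inside = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out[inside] = img[sy[inside], sx[inside]]
    return out

# Usage: expand a toy 28x28 "digit" into several distorted copies.
digit = np.zeros((28, 28))
digit[8:20, 12:16] = 1.0
rng = np.random.default_rng(0)
augmented = [augment_digit(digit, rng=rng) for _ in range(10)]
```

In practice, many such distorted copies per original sample are generated each epoch, which is what makes this approach so data-hungry and digit-specific.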
Original language: English
Title of host publication: Workshop New Challenges in Neural Computation 2012
Editors: Barbara Hammer, Thomas Villmann
Number of pages: 3
Publication date: 21.08.2012
Publication status: Published - 21.08.2012
