TY - JOUR
T1 - Evolutionary training and abstraction yields algorithmic generalization of neural computers
AU - Tanneberg, Daniel
AU - Rueckert, Elmar
AU - Peters, Jan
N1 - Funding Information:
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement nos. 713010 (GOAL-Robots) and 640554 (SKILLS4ROBOTS), and from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under no. 430054590. This research was supported by NVIDIA. We want to thank K. O’Regan for inspiring discussions on defining algorithmic solutions.
Publisher Copyright:
© 2020, The Author(s), under exclusive licence to Springer Nature Limited.
PY - 2020/11/16
Y1 - 2020/11/16
N2 - A key feature of intelligent behaviour is the ability to learn abstract strategies that scale and transfer to unfamiliar problems. An abstract strategy solves every sample from a problem class, no matter its representation or complexity, much like algorithms in computer science. Neural networks are powerful models for processing sensory data, discovering hidden patterns and learning complex functions, but they struggle to learn such iterative, sequential or hierarchical algorithmic strategies. Extending neural networks with external memories has increased their capacity to learn such strategies, but they remain sensitive to data variations, struggle to learn scalable and transferable solutions, and require massive amounts of training data. We present the neural Harvard computer, a memory-augmented network-based architecture that employs abstraction by decoupling algorithmic operations from data manipulations, realized by splitting the information flow across separate modules. This abstraction mechanism, combined with evolutionary training, enables the learning of robust and scalable algorithmic solutions. On a diverse set of 11 algorithms of varying complexity, we show that the neural Harvard computer reliably learns algorithmic solutions with strong generalization and abstraction: it generalizes and scales perfectly to arbitrary task configurations and complexities far beyond those seen during training, and is independent of the data representation and the task domain.
UR - http://www.scopus.com/inward/record.url?scp=85096069972&partnerID=8YFLogxK
DO - 10.1038/s42256-020-00255-1
M3 - Journal article
AN - SCOPUS:85096069972
SN - 2522-5839
VL - 2
SP - 753
EP - 763
JO - Nature Machine Intelligence
JF - Nature Machine Intelligence
ER -