Re: [Comp-neuro] Deep Learning Overview - Draft

May 29, 2014

Dear computational neuroscientists,

thanks a lot for the numerous helpful comments!
This was great fun; I learned a lot.

A revised version (now with 850+ references) is here: http://arxiv.org/abs/1404.7828
PDF: http://www.idsia.ch/~juergen/DeepLearning28May2014.pdf
LaTeX source: http://www.idsia.ch/~juergen/DeepLearning28May2014.tex
The complete BibTeX file is also public: http://www.idsia.ch/~juergen/bib.bib

Please do NOT reply to the entire list (only to juergen@idsia.ch).

Juergen Schmidhuber
http://www.idsia.ch/~juergen/deeplearning.html

---Revised Table of Contents:---

1 Introduction to Deep Learning (DL) in Neural Networks (NNs) 

2 Event-Oriented Notation for Activation Spreading in Feedforward NNs (FNNs) and Recurrent NNs (RNNs)

3 Depth of Credit Assignment Paths (CAPs) and of Problems 

4 Recurring Themes of Deep Learning 
4.1 Dynamic Programming for Supervised / Reinforcement Learning (SL / RL) 
4.2 Unsupervised Learning (UL) Facilitating SL and RL 
4.3 Learning Hierarchical Representations Through Deep SL, UL, RL 
4.4 Occam’s Razor: Compression and Minimum Description Length (MDL) 
4.5 Fast Graphics Processing Units (GPUs) for DL in NNs 

5 Supervised NNs, Some Helped by Unsupervised NNs 
5.1 Early NNs Since the 1940s (and the 1800s)
5.2 Around 1960: Visual Cortex Provides Inspiration for DL 
5.3 1965: Deep Networks Based on the Group Method of Data Handling (GMDH)
5.4 1979: Convolution + Weight Replication + Subsampling (Neocognitron) 
5.5 1960-1981 and Beyond: Development of Backpropagation (BP) for NNs
5.5.1 BP for Weight-Sharing Feedforward NNs (FNNs) and Recurrent NNs (RNNs) 
5.6 Late 1980s-2000: Numerous Improvements of NNs 
5.6.1 Ideas for Dealing with Long Time Lags and Deep CAPs 
5.6.2 Better BP Through Advanced Gradient Descent 
5.6.3 Searching For Simple, Low-Complexity, Problem-Solving NNs 
5.6.4 Potential Benefits of UL for SL
5.7 1987: UL Through Autoencoder (AE) Hierarchies
5.8 1989: BP for Convolutional NNs (CNNs)
5.9 1991: Fundamental Deep Learning Problem of Gradient Descent 
5.10 1991: UL-Based History Compression Through a Deep Hierarchy of RNNs 
5.11 1992: Max-Pooling (MP): Towards MPCNNs 
5.12 1994: Early Contest-Winning NNs
5.13 1995: Supervised Recurrent Very Deep Learner (LSTM RNN) 
5.14 2003: More Contest-Winning/Record-Setting NNs
5.15 2006/7: UL For Deep Belief Networks (DBNs) / AE Stacks Fine-Tuned by BP
5.16 2006/7: Improved CNNs / GPU-CNNs / BP-Trained MPCNNs / LSTM Stacks 
5.17 2009: First Official Competitions Won by RNNs, and with MPCNNs 
5.18 2010: Plain Backprop (+Distortions) on GPU Yields Excellent Results 
5.19 2011: MPCNNs on GPU Achieve Superhuman Vision Performance 
5.20 2011: Hessian-Free Optimization for RNNs 
5.21 2012: First Contests Won on ImageNet & Object Detection & Segmentation
5.22 2013-: More Contests and Benchmark Records 
5.23 Currently Successful Supervised Techniques: LSTM RNNs / GPU-MPCNNs 
5.24 Recent Tricks for Improving SL Deep NNs (Compare Sec. 5.6.2, 5.6.3)
5.25 Consequences for Neuroscience 
5.26 DL with Spiking Neurons? 

6 DL in FNNs and RNNs for Reinforcement Learning (RL) 
6.1 RL Through NN World Models Yields RNNs With Deep CAPs 
6.2 Deep FNNs for Traditional RL and Markov Decision Processes (MDPs)
6.3 Deep RL RNNs for Partially Observable MDPs (POMDPs) 
6.4 RL Facilitated by Deep UL in FNNs and RNNs 
6.5 Deep Hierarchical RL (HRL) and Subgoal Learning with FNNs and RNNs 
6.6 Deep RL by Direct NN Search / Policy Gradients / Evolution
6.7 Deep RL by Indirect Policy Search / Compressed NN Search
6.8 Universal RL

On May 15, 2014, at 3:36 PM, Juergen Schmidhuber <juergen@idsia.ch> wrote:
...
Dear computational neuroscientists,
here is the preliminary draft of an invited Deep Learning overview:
http://www.idsia.ch/~juergen/DeepLearning15May2014.pdf
It mostly consists of references (about 800 entries so far). Important citations are still missing though. As a machine learning researcher, I am obsessed with credit assignment. In case you know of references to add or correct, please send them with brief explanations to juergen@idsia.ch (NOT TO THE ENTIRE LIST!), preferably together with URL links to PDFs for verification. Please also do not hesitate to send me additional corrections / improvements / suggestions / Deep Learning success stories with feedforward and recurrent neural networks. I'll post a revised version later. Thanks a lot!
Abstract. In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
Juergen Schmidhuber
http://www.idsia.ch/~juergen/
http://www.idsia.ch/~juergen/whatsnew.html