ReadingOnDeepNetworks (r24 - 30 Sep 2009)
Preliminary stuff that can be useful, depending on your background
Another bibliography:
About neural networks
See the page on introductory neural network material:
Also see the page on recurrent neural networks:
About distributed representations
About learning distributed representations for words
- Yoshua Bengio, Rejean Ducharme, Pascal Vincent, and Christian Jauvin. A Neural Probabilistic Language Model, Journal of Machine Learning Research, 3(Feb):1137-1155, 2003.
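As a toy illustration of the distributed-representation idea in this paper (a sketch with made-up sizes and variable names, not the authors' code): each word gets a learned real-valued vector, and the next word is predicted with a softmax over the concatenated vectors of the context words.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, embed_dim, context = 50, 8, 3

# One learned vector per word: the "distributed representation".
C = rng.normal(scale=0.1, size=(vocab_size, embed_dim))
# Softmax output layer over the concatenated context vectors.
W = rng.normal(scale=0.1, size=(context * embed_dim, vocab_size))

def next_word_probs(context_ids):
    """P(next word | context) from the concatenated word vectors."""
    x = C[context_ids].reshape(-1)          # look up and concatenate
    logits = x @ W
    e = np.exp(logits - logits.max())       # stable softmax
    return e / e.sum()

p = next_word_probs([3, 17, 42])            # a distribution over the vocabulary
```

In the full model both C and W (plus a hidden layer, omitted here) are trained jointly by maximizing the likelihood of observed next words.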
About auto-encoders
- Hinton, G.E. and Zemel, R.S. Minimizing description length in an unsupervised neural network (technical report?)
- R. Hadsell, S. Chopra, Y. LeCun, "Dimensionality Reduction by Learning an Invariant Mapping," in Proc. of Computer Vision and Pattern Recognition Conference (CVPR 2006), 2006.
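For readers new to auto-encoders, here is a minimal sketch (not from any of the papers above; sizes and names are made up): a linear auto-encoder trained by gradient descent to reconstruct 5-D data that actually lies on a 2-D subspace.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in 5-D that really live on a 2-D subspace.
Z = rng.normal(size=(200, 2))
X = Z @ rng.normal(size=(2, 5))

# A linear auto-encoder: encode to 2 units, decode back to 5.
W_enc = rng.normal(scale=0.1, size=(5, 2))
W_dec = rng.normal(scale=0.1, size=(2, 5))

lr = 0.05
for _ in range(1000):
    H = X @ W_enc                      # code
    X_hat = H @ W_dec                  # reconstruction
    err = X_hat - X
    # Gradients of the mean squared reconstruction error.
    grad_dec = H.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

mse = float(np.mean((X @ W_enc @ W_dec - X) ** 2))
```

With linear units this recovers the same subspace as PCA; the papers above replace the units, the cost, or both to get more interesting representations.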
Learning about relations between symbols
- Paccanaro, A. and Hinton, G.E. Learning Distributed Representations by Mapping Concepts and Relations into a Linear Space. ICML-2000, Proceedings of the Seventeenth International Conference on Machine Learning, P. Langley (Ed.), 711-718, Stanford University, Morgan Kaufmann Publishers, San Francisco, 2000.
- Paccanaro, A. and Hinton, G.E. Extracting Distributed Representations of Concepts and Relations from Positive and Negative Propositions. Proceedings of the International Joint Conference on Neural Networks (IJCNN 2000), 2000.
- Memisevic, R. and Hinton, G. E. Multiple Relational Embedding. Advances in Neural Information Processing Systems, 17, MIT Press, Cambridge, MA, 2005.
About Monte-Carlo methods
- MathWorld entries on Monte-Carlo methods: and on Monte-Carlo integration:
- Nando de Freitas's seminar on Monte-Carlo methods
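The basic Monte-Carlo integration idea behind these references can be sketched in a few lines (a toy example, not from the cited material): approximate an integral over [0, 1] by the average of the integrand at uniformly drawn points.

```python
import numpy as np

def mc_integrate(f, n_samples=100_000, seed=0):
    """Estimate the integral of f over [0, 1] by averaging f at uniform samples."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=n_samples)
    return float(f(x).mean())

# Example: the integral of x^2 over [0, 1] is exactly 1/3.
estimate = mc_integrate(lambda x: x**2)
```

The error of the estimate shrinks as 1/sqrt(n_samples), independently of dimension, which is why these methods matter for the high-dimensional sums that appear in Boltzmann machine learning.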
About graphical models
- Graphical models: Probabilistic inference. M. I. Jordan and Y. Weiss. In M. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks, 2nd edition. Cambridge, MA: MIT Press, 2002.
- Graphical models. M. I. Jordan. Statistical Science (Special Issue on Bayesian Statistics), 19, 140-155, 2004.
- A Comparison of Algorithms for Inference and Learning in Probabilistic Graphical Models. Brendan J. Frey and Nebojsa Jojic. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 9, September 2005.
- Hinton, G. E., Dayan, P., To, A. and Neal, R. M. The Helmholtz machine through time. In F. Fogelman-Soulie and R. Gallinari (Eds.), ICANN-95, 483-490.
About Boltzmann machines and related energy-based models
One of the founding papers:
- "Learning and Relearning in Boltzmann Machines" by G. E. Hinton and T. J. Sejnowski
- The spiking Boltzmann machines paper, with ideas for dealing with temporal data:
- Yann LeCun and Fu Jie Huang, "Loss Functions for Discriminative Training of Energy-Based Models," in Proc. of the 10-th International Workshop on Artificial Intelligence and Statistics (AIStats'05) , 2005.
About Products of Experts, Restricted Boltzmann Machines and Contrastive Divergence
- Hinton, G.E. Products of Experts. Proceedings of the Ninth International Conference on Artificial Neural Networks (ICANN 99), Vol. 1, pages 1-6, 1999.
- Hinton, G.E. Training Products of Experts by Minimizing Contrastive Divergence. Technical Report GCNU TR 2000-004, 2000.
- Carreira-Perpiñán, M. Á. and Hinton, G. E. On Contrastive Divergence Learning. In: Artificial Intelligence and Statistics, 2005, Barbados.
- Yee-Whye Teh and Geoffrey Hinton. Rate-coded Restricted Boltzmann Machines for Face Recognition. Advances in Neural Information Processing Systems 13, MIT Press, Cambridge, MA, 2001.
- Hinton, G. E. and Teh, Y. W. Discovering Multiple Constraints that are Frequently Approximately Satisfied. Proceedings of Uncertainty in Artificial Intelligence (UAI-2001), pp 227-234.
- Welling, M. and Hinton, G. E. A New Learning Algorithm for Mean Field Boltzmann Machines. International Joint Conference on Neural Networks, Madrid, 2002.
- Hinton, G. E. (2002) Training Products of Experts by Minimizing Contrastive Divergence. Neural Computation, 14, pp 1771-1800.
- Hinton, G. E., Welling, M. and Mnih, A. Wormholes Improve Contrastive Divergence. Advances in Neural Information Processing Systems, 16, MIT Press, Cambridge, MA, 2004.
- Welling, M., Rosen-Zvi, M. and Hinton, G. E. (2005) Exponential Family Harmoniums with an Application to Information Retrieval. Advances in Neural Information Processing Systems, 17, MIT Press, Cambridge, MA.
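A minimal sketch of the CD-1 update for a binary RBM, in the spirit of Hinton's contrastive divergence papers above (sizes and variable names are made up, and many practical refinements are omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b_vis, b_hid, v0, lr=0.1):
    """One CD-1 step for a binary RBM on a batch of visible vectors v0."""
    # Positive phase: hidden probabilities and a sample, given the data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one step of alternating Gibbs sampling.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_hid)
    # Approximate gradient: <v h>_data - <v h>_reconstruction.
    n = len(v0)
    W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / n
    b_vis += lr * (v0 - v1).mean(axis=0)
    b_hid += lr * (p_h0 - p_h1).mean(axis=0)

# Toy usage: 6 visible units, 3 hidden units, a batch of 4 binary vectors.
W = rng.normal(scale=0.01, size=(6, 3))
b_vis = np.zeros(6)
b_hid = np.zeros(3)
data = (rng.random((4, 6)) < 0.5).astype(float)
cd1_update(W, b_vis, b_hid, data)
```

Running the Gibbs chain for only one step (rather than to equilibrium) is exactly the approximation that contrastive divergence makes, and the theoretical papers in this section study when and why it works.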
About deep belief networks as such
- Bengio, Y. (2009) Learning Deep Architectures for AI. To appear in Foundations and Trends in Machine Learning.
This is a review paper on deep architectures and deep belief networks in general.
Early versions: the wake-sleep algorithm and the Helmholtz machine
- Hinton, G. E., Dayan, P., Frey, B. J. and Neal, R. The wake-sleep algorithm for unsupervised neural networks. Science, 268, 1158-1161, 1995.
- Frey, B. J., Hinton, G. E. and Dayan, P. Does the wake-sleep algorithm learn good density estimators? Advances in Neural Information Processing Systems 8. MIT Press, Cambridge, MA.
- Hinton, G. E. and Ghahramani, Z. Generative models for discovering sparse distributed representations. Philosophical Transactions of the Royal Society of London, B, 352:1177-1190, 1997.
The first Deep Nets papers
- Hinton, G. E., Osindero, S. and Teh, Y. A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554, 2006.
Yet another version of the deep belief net paper, easier to read:
- Hinton, G. E., Osindero, S., Welling, M. and Teh, Y. Unsupervised Discovery of Non-linear Structure using Contrastive Backpropagation. Cognitive Science, 30(4), 2006.
On the use of DBNs for training deep auto-encoders:
- Hinton, G. E. and Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science, Vol. 313, no. 5786, pp. 504-507, 28 July 2006.
Papers at NIPS'2006:
- Greedy Layer-Wise Training of Deep Networks. Yoshua Bengio, Pascal Lamblin, Dan Popovici and Hugo Larochelle. Advances in Neural Information Processing Systems 19, B. Scholkopf, J. Platt and T. Hoffman (Eds.), MIT Press, Cambridge, MA, 2007.
- Modeling Human Motion Using Binary Latent Variables. Graham Taylor, Geoff Hinton and Sam Roweis. Advances in Neural Information Processing Systems 19, B. Scholkopf, J. Platt and T. Hoffman (Eds.), MIT Press, Cambridge, MA, 2007.
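The greedy layer-wise strategy studied in the Bengio et al. NIPS'2006 paper can be sketched as follows (a toy version using one-hidden-layer auto-encoders in place of RBMs; all sizes and names are made up): train the first layer to reconstruct the input, then train the next layer on the codes the first one produces, and so on.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder_layer(X, n_hidden, lr=0.05, n_steps=500):
    """Fit one sigmoid auto-encoder layer by gradient descent; return encoder weights."""
    n_in = X.shape[1]
    W = rng.normal(scale=0.1, size=(n_in, n_hidden))
    V = rng.normal(scale=0.1, size=(n_hidden, n_in))
    for _ in range(n_steps):
        H = 1.0 / (1.0 + np.exp(-(X @ W)))       # hidden code
        X_hat = H @ V                            # linear reconstruction
        err = (X_hat - X) / len(X)
        dV = H.T @ err
        dH = err @ V.T * H * (1 - H)             # backprop through the sigmoid
        dW = X.T @ dH
        W -= lr * dW
        V -= lr * dV
    return W

# Greedy layer-wise pre-training: each layer is trained on the codes
# produced by the layers below it, one layer at a time.
X = rng.random((100, 8))
layers = []
inputs = X
for n_hidden in (6, 4):
    W = train_autoencoder_layer(inputs, n_hidden)
    layers.append(W)
    inputs = 1.0 / (1.0 + np.exp(-(inputs @ W)))  # propagate up

# `inputs` now holds the 4-dimensional top-level representation of X.
```

In the paper this unsupervised stack initializes a deep network that is then fine-tuned with supervised backpropagation.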
The recent Deep Nets papers
- Yoshua Bengio, Learning deep architectures for AI, in: Foundations and Trends in Machine Learning, volume to appear, 2009.
- Joseph Turian, James Bergstra and Yoshua Bengio, Quadratic Features and Deep Architectures for Chunking, in: North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL HLT), 2009.
- Yoshua Bengio, Jerome Louradour, Ronan Collobert and Jason Weston, Curriculum Learning, Technical Report 1330, Université de Montréal, 2009.
- Yoshua Bengio and Olivier Delalleau, Justifying and Generalizing Contrastive Divergence, in: Neural Computation, volume 21, number 6, pages 1601-1621, 2009.
- Dumitru Erhan, Pierre-Antoine Manzagol, Yoshua Bengio, Samy Bengio and Pascal Vincent, The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training, in: Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS 2009), Clearwater (Florida), USA, 2009.
- Dumitru Erhan, Yoshua Bengio, Aaron Courville and Pascal Vincent, Visualizing Higher-Layer Features of a Deep Network, Technical Report 1341, Université de Montréal, 2009.
- Hugo Larochelle, Dumitru Erhan and Pascal Vincent, Deep Learning using Robust Interdependent Codes, in: Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS 2009), Clearwater (Florida), USA, 2009.
- Hugo Larochelle, Yoshua Bengio, Jerome Louradour and Pascal Lamblin, Exploring Strategies for Training Deep Neural Networks, in: The Journal of Machine Learning Research, volume 10, pages 1-40, 2009.
- Guillaume Desjardins and Yoshua Bengio, Empirical Evaluation of Convolutional RBMs for Vision, Technical Report 1327, Université de Montréal, 2008.
- Hugo Larochelle and Yoshua Bengio, Classification using Discriminative Restricted Boltzmann Machines, in: International Conference on Machine Learning proceedings, 2008.
- Nicolas Le Roux and Yoshua Bengio, Representational Power of Restricted Boltzmann Machines and Deep Belief Networks, in: Neural Computation, volume 20, number 6, pages 1631-1649, 2008.
- Pascal Vincent, Hugo Larochelle, Yoshua Bengio and Pierre-Antoine Manzagol, Extracting and Composing Robust Features with Denoising Autoencoders, in: Proceedings of the 25th International Conference on Machine Learning (ICML 2008), pages 1096-1103, 2008.
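The denoising criterion of this last paper can be sketched as follows (a toy version; sizes, noise level and names are made up): corrupt the input with masking noise, but train the auto-encoder to reconstruct the uncorrupted input.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(X, p=0.3):
    """Masking noise: set each entry to zero with probability p."""
    return X * (rng.random(X.shape) >= p)

# Denoising criterion: reconstruct the CLEAN input from the corrupted one.
X = (rng.random((200, 10)) < 0.5).astype(float)
W = rng.normal(scale=0.1, size=(10, 5))
V = rng.normal(scale=0.1, size=(5, 10))
lr = 0.1
for _ in range(300):
    X_tilde = corrupt(X)                          # corrupted input
    H = 1.0 / (1.0 + np.exp(-(X_tilde @ W)))      # code of the corrupted input
    X_hat = H @ V                                 # reconstruction
    err = (X_hat - X) / len(X)                    # compared against the CLEAN X
    V -= lr * (H.T @ err)
    W -= lr * (X_tilde.T @ (err @ V.T * H * (1 - H)))

mse = float(np.mean((X_hat - X) ** 2))
```

Because the code must fill in the masked entries, the hidden units are pushed toward features that capture dependencies between inputs rather than the identity mapping.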