D. Ackley and M. Littman, Interactions Between Learning and Evolution, in Artificial Life II, SFI Studies in the Sciences of Complexity, pp.487-509, 1991.

A. G. Barto, R. S. Sutton, and C. W. Anderson, Neuronlike adaptive elements that can solve difficult learning control problems, IEEE Trans. Syst. Man Cybern., vol.13, pp.835-846, 1983.

P. Dayan and T. Sejnowski, TD(λ) converges with probability 1, Machine Learning, pp.295-301, 1994.

M. Dorigo and M. Colombetti, Précis of Robot Shaping: An Experiment in Behavior Engineering, Adaptive Behavior, vol.5, issue.3-4, 1998.
DOI : 10.1177/105971239700500308

L. Kaelbling, M. Littman, and A. Moore, Reinforcement Learning: A Survey, Journal of Artificial Intelligence Research, vol.4, pp.237-285, 1996.

R. A. McCallum, Instance-based State Identification for Reinforcement Learning, Advances in Neural Information Processing Systems, 1995.

L. Lin, Reinforcement Learning for Robots Using Neural Networks, pp.93-103, 1993.

S. Mahadevan and J. Connell, Automatic programming of behavior-based robots using reinforcement learning, Artificial Intelligence, vol.55, issue.2-3, pp.311-365, 1992.
DOI : 10.1016/0004-3702(92)90058-6

J. M. Santos and C. Touzet, Dynamic Update of the Reinforcement Function During Learning, Connection Science, vol.11, issue.3-4, 1999.
DOI : 10.1080/095400999116250

J. W. Sheppard and S. L. Salzberg, A Teaching Strategy for Memory-Based Control, Lazy Learning, pp.343-370, 1997.
DOI : 10.1007/978-94-017-2053-3_13
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.51.8621

C. Touzet, Neural Reinforcement Learning for Behaviour Synthesis, Robotics and Autonomous Systems, Special issue on Learning Robot: the New Wave, issue.3-4, 1997.

C. Touzet, Programming robots with associative memories, IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339), 1999.
DOI : 10.1109/IJCNN.1999.832705

C. J. C. H. Watkins, Learning from Delayed Rewards, Ph.D. thesis, King's College, Cambridge, 1989.