J. A. Anderson, An introduction to neural networks, 1995.

M. Asada, S. Noda, S. Tawaratsumida, and K. Hosoda, Purposive Behavior Acquisition for a Real Robot by Vision-based Reinforcement Learning, joint special issue (H. Hexmoor and M. Mataric, eds.) on Learning in Autonomous Robots, Machine Learning and Autonomous Robots, pp.3-4, 1998.

D. Ackley and M. Littman, Interactions Between Learning and Evolution, Artificial Life II, SFI Studies in the Sciences of Complexity, pp.487-509, 1991.

A. G. Barto and P. Anandan, Pattern-recognizing stochastic learning automata, IEEE Transactions on Systems, Man, and Cybernetics, vol.15, issue.3, pp.360-375, 1985.
DOI : 10.1109/TSMC.1985.6313371

J. Baxter, Theoretical Models of Learning to Learn, 1998.
DOI : 10.1007/978-1-4615-5529-2_4

V. Braitenberg, Vehicles: Experiments in Synthetic Psychology, 1984.

R. Brooks, Intelligence without reason, IJCAI'91, 1991.

Y. U. Cao, A. Fukunaga, and A. Kahng, Cooperative Mobile Robotics: Antecedents and Directions, Autonomous Robots, vol.4, issue.1, pp.7-27, 1997.
DOI : 10.1023/A:1008855018923

M. Colombetti, M. Dorigo, and G. Borghi, Behavior analysis and training-a methodology for behavior engineering, IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), vol.26, issue.3, pp.365-380, 1996.
DOI : 10.1109/3477.499789

R. Crites and A. Barto, Improving elevator performance using reinforcement learning, Advances in Neural Information Processing Systems: Proceedings of the 1995 Conference, pp.1017-1023, 1996.

T. Darrell, Reinforcement Learning of Active Recognition Behaviors, Interval Research Technical Report 1997-045 (http://www.interval.com/papers/1997-045). Portions of this paper previously appeared in Advances in Neural Information Processing Systems 8 (NIPS '95), pp.858-864, 1998.

P. Dayan and T. Sejnowski, TD(λ) converges with probability 1, Machine Learning, 1994.

M. Dorigo, Introduction to the Special Issue on Learning Autonomous Robots, IEEE Trans. on Systems, Man and Cybernetics -part B, vol.26, issue.3, pp.361-364, 1996.

M. Dorigo and H. Bersini, A comparison of Q-learning and classifier systems, Proc. of the 3rd Int. Conf. on Simulation of Adaptive Behavior (SAB'94), 1994.

M. Dorigo and M. Colombetti, Training Agents to Perform Sequential Behavior, Adaptive Behavior, vol.2, issue.3, pp.247-276, 1994.

M. Dorigo and M. Colombetti, Precis of Robot Shaping: An Experiment in Behavior Engineering, Adaptive Behavior, vol.5, issue.3-4, 1998.
DOI : 10.1177/105971239700500308

J. Heemskerk and N. Sharkey, Learning Subsumptions for an Autonomous Robot, IEE Seminar on Self-Learning Robots, 1996.

H. Hexmoor, L. Meeden, and R. Murphy, Is Robot Learning a New Subfield?, 1997.

L. Kaelbling, M. Littman, and A. Moore, Reinforcement Learning: A Survey, Journal of Artificial Intelligence Research, vol.4, pp.237-285, 1996.

Z. Kalmar, C. Szepesvari, and A. Lorincz, Experiments with a Real Robot, joint special issue (H. Hexmoor and M. Mataric, eds.) on Learning in Autonomous Robots, Machine Learning and Autonomous Robots, pp.3-4, 1998.

T. Kohonen, Self-Organisation and Associative Memory, 1984.

R. M. Kretchmar and C. W. Anderson, Comparison of CMACs and radial basis functions for local function approximators in reinforcement learning, Proceedings of International Conference on Neural Networks (ICNN'97), 1997.
DOI : 10.1109/ICNN.1997.616132

L. Lin, Reinforcement Learning for Robots Using Neural Networks, -CS-93-103, 1993.

L. J. Lin, Self-improving reactive agents based on reinforcement learning, planning and teaching, Machine Learning, vol.8, pp.293-321, 1992.

M. L. Littman, Memoryless policies: theoretical limitations and practical results, From Animals to Animats 3: Proc. of the Third Int. Conf. on Simulation of Adaptive Behavior, 1996.

R. A. McCallum, Instance-based State Identification for Reinforcement Learning, Advances In Neural Information Processing Systems, p.77, 1995.

R. Maclin and J. W. Shavlik, Creating Advice-Taking Reinforcement Learners, Learning to Learn, 1998.

S. Mahadevan and J. Connell, Automatic programming of behavior-based robots using reinforcement learning, Artificial Intelligence, vol.55, issue.2-3, pp.311-365, 1991.
DOI : 10.1016/0004-3702(92)90058-6

S. Mahadevan, Rapid Concept Learning for Mobile Robots, joint special issue (H. Hexmoor and M. Mataric, eds.) on Learning in Autonomous Robots, Machine Learning and Autonomous Robots, pp.3-4, 1998.

M. J. Mataric, Reinforcement Learning in the Multi-Robot Domain, Autonomous Robots, vol.4, pp.73-83, 1997.
DOI : 10.1007/978-1-4757-6451-2_4

F. Michaud and M. J. Mataric, Learning from History for Behavior-Based Mobile Robots in Non-Stationary Conditions, joint special issue (H. Hexmoor and M. Mataric, eds.), Machine Learning and Autonomous Robots, 1998.

J. del R. Millán, Rapid, safe, and incremental learning of navigation strategies, IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), vol.26, issue.3, pp.408-420, 1996.
DOI : 10.1109/3477.499792

F. Mondada, E. Franzi, and P. Ienne, Mobile robot miniaturisation: A tool for investigation in control algorithms, Third International Symposium on Experimental Robotics, 1993.
DOI : 10.1007/BFb0027617

A. W. Moore and C. G. Atkeson, Prioritized sweeping: Reinforcement learning with less data and less time, Machine Learning, 1993.
DOI : 10.1007/BF00993104

L. E. Parker, Cooperative motion control for multi-target observation, Proceedings of the 1997 IEEE/RSJ International Conference on Intelligent Robot and Systems. Innovative Robotics for Real-World Applications. IROS '97, p.78, 1997.
DOI : 10.1109/IROS.1997.656570

J. Peng and R. J. Williams, Efficient learning and planning within the Dyna framework, IEEE International Conference on Neural Networks, pp.437-454, 1993.
DOI : 10.1109/ICNN.1993.298551

D. Rumelhart, G. Hinton, and R. Williams, Learning Internal Representations by Error Propagation, Parallel Distributed Processing, vol.1, pp.318-362, 1986.
DOI : 10.1016/B978-1-4832-1446-7.50035-2

J. M. Santos and C. Touzet, Exploration tuned reinforcement function, Neurocomputing, vol.28, issue.1-3, 1999.
DOI : 10.1016/S0925-2312(98)00117-9

N. Sharkey, Learning from Innate Behaviors: A Quantitative Evaluation of Neural Network Controllers, joint special issue (H. Hexmoor and M. Mataric, eds.) on Learning in Autonomous Robots, Machine Learning and Autonomous Robots, pp.3-4, 1998.

J. W. Sheppard and S. L. Salzberg, A Teaching Strategy for Memory-Based Control, Lazy Learning, pp.343-370, 1997.
DOI : 10.1007/978-94-017-2053-3_13

J. Schmidhuber and J. Zhao, Multi-Agent Learning with the Success-Story Algorithm, Distributed Artificial Intelligence Meets Machine Learning: Learning in Multi-Agent Environments, Lecture Notes in Artificial Intelligence, vol.1221, pp.82-93, 1997.

S. P. Singh, T. Jaakkola, and M. Jordan, Learning Without State-Estimation in Partially Observable Markovian Decision Processes, Proc. of the Eleventh Int. Machine Learning Conf, 1994.
DOI : 10.1016/B978-1-55860-335-6.50042-8

R. S. Sutton, Reinforcement Learning Architectures for Animats, Proc. of the First Int. Conf. on Simulation of Adaptive Behavior, From Animals to Animats, pp.288-296, 1991.

G. Tesauro, Temporal difference learning and TD-Gammon, Communications of the ACM, vol.38, issue.3, pp.58-68, 1995.
DOI : 10.1145/203330.203343

S. Thrun, Efficient Exploration In Reinforcement Learning, 1992.

S. Thrun, Exploration and model building in mobile robot domains, IEEE International Conference on Neural Networks, 1993.
DOI : 10.1109/ICNN.1993.298552

S. Thrun and L. Pratt, Introduction and Overview, 1998.

C. Touzet, Neural Reinforcement Learning for Behaviour Synthesis, Special issue on Learning Robot: the New Wave, Robotics and Autonomous Systems, pp.251-281, 1997.

C. Touzet, S. Sehad, and N. Giambiasi, Improving Reinforcement Learning of Obstacle Avoidance Behavior with Forbidden Sequences of Actions, International Conference on Robotics and Manufacturing, pp.14-16, 1995.

C. J. Watkins, Learning from Delayed Rewards, King's College, 1989.

C. Watkins and P. Dayan, Q-Learning, Machine Learning, pp.279-292, 1992.

S. Whitehead, J. Karlsson, and J. Tenenberg, Learning Multiple Goal Behavior via Task Decomposition and Dynamic Policy Merging, Robot Learning, 1993.
DOI : 10.1007/978-1-4615-3184-5_3

S. D. Whitehead, A Complexity Analysis of Cooperative Mechanisms in Reinforcement Learning, Proc. of AAAI, pp.607-613, 1991.