Human-Robot Collaboration
Active Inverse Reinforcement Learning
Inverse reinforcement learning addresses the general problem of
recovering a reward function from samples of a policy provided
by an expert/demonstrator. In this paper, we introduce active
learning for inverse reinforcement learning. We propose an
algorithm that allows the agent to query the demonstrator for
samples at specific states, instead of relying only on samples
provided at arbitrary states. The purpose of our algorithm is
to estimate the reward function with accuracy comparable to other
methods from the literature while reducing the number of policy
samples required from the expert. We also discuss the use of
our algorithm in higher-dimensional problems, using both Monte
Carlo and gradient methods. We present illustrative results of
our algorithm in several simulated examples of varying
complexity.
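To make the query-selection idea concrete, here is a minimal Python sketch, assuming a Monte Carlo posterior represented by sampled reward functions over a small discrete MDP; the names and the entropy criterion shown are an illustration, not the paper's code:

```python
import numpy as np

def optimal_policy(reward, P, gamma=0.95, iters=200):
    """Greedy policy via value iteration for one sampled reward.
    reward: (S,) state rewards; P: (A, S, S) transition matrices."""
    V = np.zeros(P.shape[1])
    for _ in range(iters):
        Q = reward[None, :] + gamma * (P @ V)   # (A, S) action values
        V = Q.max(axis=0)
    return Q.argmax(axis=0)                     # best action per state

def query_state(reward_samples, P):
    """Pick the state where the policies induced by the posterior
    reward samples disagree the most (highest action entropy)."""
    A, S, _ = P.shape
    counts = np.zeros((S, A))
    for r in reward_samples:                    # Monte Carlo posterior
        counts[np.arange(S), optimal_policy(r, P)] += 1
    probs = counts / counts.sum(axis=1, keepdims=True)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return int(entropy.argmax())                # ask the expert here
```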
Active Learning for Reward Estimation in Inverse Reinforcement Learning,
Manuel Lopes, Francisco Melo and Luis Montesano. European Conference on
Machine Learning (ECML/PKDD), Bled, Slovenia, 2009. (pdf)
Simultaneous Acquisition of Task and Feedback Models
We present a system that learns task representations from
ambiguous feedback. We consider an inverse reinforcement
learner that receives feedback from a teacher with an unknown
and noisy protocol. The system needs to estimate simultaneously
what the task is (i.e., how to find a compact representation of
the task goal) and how the teacher is providing the feedback.
We further explore the problem of ambiguous protocols by
considering that the words used by the teacher have an unknown
relation to the actions and meanings expected by the robot.
This allows the system to start with a set of known signs and
learn the meaning of new ones. We present computational results
showing that it is possible to learn the task under noisy
and ambiguous feedback. Using an active learning approach, the
system is able to reduce the length of the training period.
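One way to picture the joint estimation problem is as Bayesian inference over pairs of hypotheses. The sketch below scores every (task, sign-meaning) pair by the likelihood of the observed feedback, assuming finite hypothesis sets, binary correct/incorrect meanings and a fixed noise rate; it is a simplified illustration, not the protocol of the papers below:

```python
import numpy as np
from itertools import product

def joint_posterior(signals, actions, tasks, meanings, noise=0.1):
    """Posterior over (task, sign-meaning map) pairs from ambiguous feedback.
    tasks:    callables mapping an action to 'correct' / 'incorrect'
    meanings: callables mapping a teacher sign to 'correct' / 'incorrect'"""
    hypotheses = list(product(range(len(tasks)), range(len(meanings))))
    log_p = np.zeros(len(hypotheses))
    for h, (t, m) in enumerate(hypotheses):
        for sign, act in zip(signals, actions):
            intended = tasks[t](act)        # label the teacher meant
            decoded = meanings[m](sign)     # label this protocol assigns
            # noisy teacher: wrong sign with probability `noise`
            log_p[h] += np.log(1 - noise if decoded == intended else noise)
    p = np.exp(log_p - log_p.max())         # normalize in a stable way
    return hypotheses, p / p.sum()
```

In this picture, active learning amounts to choosing the next action whose feedback best separates the remaining hypotheses, which is how the training period can be shortened.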
Robot Learning Simultaneously a Task and How to Interpret Human
Instructions, Jonathan Grizou, Manuel Lopes and Pierre-Yves Oudeyer.
ICDL-EpiRob, Osaka, Japan, 2013. (pdf)
Simultaneous Acquisition of Task and Feedback Models, Manuel Lopes,
Thomas Cederborg and Pierre-Yves Oudeyer. IEEE International
Conference on Development and Learning (ICDL), Germany, 2011. (pdf)
Robot Self-Initiative and Personalization by Learning through Repeated Interactions
We have developed a robotic system that interacts with the user
and, through repeated interactions, adapts to the user so that
it becomes semi-autonomous and acts proactively. In this work we
show how to design a system that meets a user's preferences, how
robot proactivity can be learned, and how to provide an integrated
system using verbal instructions. All these behaviors are
implemented on a real platform and evaluated in terms of user
acceptability and efficiency of interaction.
A demonstration of a sequence of actions (video).
Robot Self-Initiative and Personalization by Learning through
Repeated Interactions, Martin Mason and Manuel Lopes. 6th ACM/IEEE
International Conference on Human-Robot Interaction (HRI'11),
Lausanne, Switzerland, 2011. (pdf)
Task Inference and Imitation
In this demo the robot is shown a demonstration of a task. It
knows how to grasp and how to tap, and, by knowing the object
affordances, it is able to recognize these actions from their
effects on objects. After observing the demonstration and
segmenting it into its components, the robot needs to infer the
goal/intention behind it. Inverse reinforcement learning
algorithms can be used to infer the reward function that best
explains the observed behavior. Interestingly, different tasks
(e.g., considering subgoals or not) are inferred depending on
the robot's knowledge about the world and on contextual
restrictions.
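A minimal sketch of this reward-inference step, assuming a Boltzmann (softmax) model of the demonstrator on a small discrete MDP; candidate reward functions (with or without subgoals) would be ranked by this score. This is a generic Bayesian-IRL-style likelihood, not the exact algorithm of the cited papers:

```python
import numpy as np

def demo_log_likelihood(reward, demo, P, gamma=0.95, beta=5.0, iters=200):
    """How well a candidate reward explains a demonstration, under a
    Boltzmann expert model. demo: list of (state, action) pairs."""
    V = np.zeros(P.shape[1])
    for _ in range(iters):                  # soft value iteration
        Q = beta * (reward[None, :] + gamma * (P @ V))
        m = Q.max(axis=0)
        V = (m + np.log(np.exp(Q - m).sum(axis=0))) / beta
    log_pi = Q - (beta * V)[None, :]        # log softmax policy (A, S)
    return sum(log_pi[a, s] for s, a in demo)

# The inferred goal/subgoal structure is whichever candidate reward
# maximizes this score, e.g.:
# best = max(candidate_rewards, key=lambda r: demo_log_likelihood(r, demo, P))
```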
A demonstration of a sequence of actions (video 1) is
imitated by Baltazar (video 2).
Relevant publications: Lopes et al. AdBeh09 (pdf)
Lopes et al. IROS07 (pdf)
Affordance-Based Imitation
It is possible to imitate (parts of) a demonstration without
having to completely track and recognize all of its components.
Interestingly, to copy actions applied to objects it is not
necessary to follow the hand of the demonstrator. If the
affordances of objects are known or learned, we can build a
statistical model of the action-object-effect relation. To
recognize a given action, or to plan the action that achieves a
given effect, statistical inference can then be performed in
this model.
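A toy illustration of such a model: a conditional probability table P(effect | action, object) in which recognition and planning are both a single inference step (all numbers are invented for the sketch):

```python
import numpy as np

actions = ["grasp", "tap"]
objects = ["ball", "box"]
effects = ["moves", "stays"]
# P(effect | action, object); axes: action x object x effect
P_eff = np.array([
    [[0.2, 0.8], [0.1, 0.9]],   # grasp on: ball, box
    [[0.9, 0.1], [0.4, 0.6]],   # tap on:   ball, box
])

def recognize_action(obj, eff):
    """Bayes rule with a uniform prior over actions:
    P(action | object, effect) is proportional to
    P(effect | action, object)."""
    lik = P_eff[:, objects.index(obj), effects.index(eff)]
    return actions[int(lik.argmax())]

def plan_action(obj, desired_eff):
    """Choose the action most likely to produce the desired effect."""
    lik = P_eff[:, objects.index(obj), effects.index(desired_eff)]
    return actions[int(lik.argmax())]

print(recognize_action("ball", "moves"))   # -> tap
print(plan_action("box", "stays"))         # -> grasp
```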
This experiment (video)
shows how to copy the effects of actions just by using the
information provided by the affordances.
Affordance-based imitation learning in robots, Manuel Lopes,
Francisco S. Melo and Luis Montesano. IEEE/RSJ International Conference
on Intelligent Robots and Systems (IROS), San Diego, USA, 2007. (pdf)
Behavioral Switching in Animals and Children
Learning by imitation is an easy way to program robots to
perform tasks. To enable this capability it is necessary to have
knowledge about the world's properties (e.g., object affordances,
world dynamics), to infer the intention of the demonstrator (at
least the motor intention), and to map actions and effects
between different body morphologies. It is interesting to note
that the assumptions we make about these processes change the
resulting imitating behavior. This behavioral switching has been
deeply studied in psychology and neurophysiology. Our paper
discusses these aspects and provides a computational method that
models behavioral-switching experiments from psychology and is
also applied to robot learning.
A Computational Model of Social-Learning Mechanisms, Manuel
Lopes, Francisco S. Melo, Ben Kenward and José Santos-Victor.
Adaptive Behavior, 17(6):467-483, 2009.
Reflexive Imitation Based on Perceptual-Motor Maps
This experiment shows Baltazar mimicking human gestures
(movements). While observing a gesture performed by a human
(video 1), the robot processes the image (video 2), subtracting
the background in order to estimate the body, shoulder and hand
positions. Finally, using the visuo-motor map, the system is
able to mimic the gesture (video 3).
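As a minimal sketch, a visuo-motor map can be as simple as an affine regression from image coordinates of the hand to joint angles, fitted on the robot's own motor-babbling data (the linear form and the function names are illustrative simplifications):

```python
import numpy as np

def learn_visuomotor_map(hand_pix, joints):
    """Fit an affine map from image-space hand positions to joint angles,
    using motor-babbling data (the robot moves and watches its own hand).
    hand_pix: (N, 2) observed hand positions; joints: (N, J) joint angles."""
    X = np.hstack([hand_pix, np.ones((len(hand_pix), 1))])  # affine features
    W, *_ = np.linalg.lstsq(X, joints, rcond=None)          # (3, J) weights
    return lambda p: np.append(p, 1.0) @ W                  # pixels -> joints

# Mimicry: feed each tracked point of the human hand trajectory through
# the map to obtain the joint angles that reproduce the gesture:
# vm_map = learn_visuomotor_map(babble_pix, babble_joints)
# mimic = [vm_map(p) for p in observed_hand_trajectory]
```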
Relevant publications:
Lopes et al. SMC(B)05 (pdf)
Lopes et al. ICRA03 (pdf)