Preprint version; final version available at http://ieeexplore.ieee.org
IEEE International Conference on Robotics and Automation (2010), pp. 2889-2894
DOI: 10.1109/ROBOT.2010.5509429

Towards One Shot Learning by Imitation for Humanoid Robots


Yan Wu and Yiannis Demiris, Senior Member, IEEE

Abstract— Teaching a robot new knowledge is a repetitive and tedious process. In order to accelerate the process, we propose a novel template-based approach for robot arm movement imitation. This algorithm selects a previously observed path demonstrated by a human and generates a path in a novel situation, based on pairwise mapping of invariant feature locations present in both the demonstrated and the new scenes, using a combination of minimum-distortion and minimum-energy strategies. This One-Shot Learning algorithm is capable of not only mapping simple point-to-point paths but also adapting to more complex tasks such as those involving forced waypoints. Compared to traditional methodologies, our work requires neither extensive training for generalisation nor expensive run-time computation for accuracy. The algorithm has been statistically validated using cross-validation of grasping experiments, and tested in a practical implementation on the iCub humanoid robot playing the tic-tac-toe game.

Index Terms - movement imitation, path planning, grasping, learning by imitation, tic-tac-toe

I. INTRODUCTION

Since the mid 1980s, programming by demonstration (PbD) has emerged as a promising research topic in robotics due to its relative merits over traditional methodologies [1]. An increased interest in learning algorithms that equip robots to learn by imitating the actions of both humans and other robotic agents has resulted in many research directions within the area. For instance, a hierarchical model [2] was proposed in the domain of human-robot interaction (HRI) for attention and perception, while other research focused on skill-teaching methodologies ([3], [4]) using a range of techniques. Many of these research directions can be encapsulated by the five "W"s of imitation, namely who, where, when, what and how. Very often, it takes numerous trials of learning for a robot to generalise each of these abstract aspects of an action.

In an attempt to reduce the number of trials involved in learning, we propose an algorithm to address the "how-to" question in imitation. We introduce a novel computational model for learning path planning by imitation which makes use of a fundamental idea in plan adaptation, the presence of invariant feature points in both the demonstration and a given situation, to generate a motion path for the new scenario. The following sections of this paper present work related to our approach, followed by a detailed description of the methodology. Experimental results validating the statistical fitness of the algorithm using data from human demonstrations are then presented, before discussing an experiment implementing the algorithm for a real-life tic-tac-toe game on a humanoid robot.

II. RELATED WORK

For a given observed trajectory, in order to reproduce it in an unseen situation, simple copying does not always work well. Furthermore, in a new situation, additional constraints might be present, such as forced waypoints that require plan adaptation along the imitated route. Thus, a robot must be able to adapt in addition to generalise. Research in the area has focused on approaches that either produce a new path based on a subset of competent tasks to accommodate additional constraints [5], or generate a unique yet exact corresponding imitation of the previously demonstrated trajectories ([6], [7]).

Most imitation algorithms require several demonstrations of a single task to work successfully ([5], [8]). This tedious and time-consuming process is not favourable to human demonstrators, especially in emergency situations. Speeding up the learning process and reducing the fatigue of giving demonstrations is therefore crucial to future robotic advancement. One-Shot Learning, a popular niche area in machine learning, might help to address this issue; however, only limited literature [9] has been found that does so. Some of these works focus on explanation-based methods [10], which are difficult to implement on physical robots.

In [11], the authors argue that paradigms generalising observed behaviours into a set of intrinsic complex model parameters limit the ability of user interaction after demonstration. However, many state-of-the-art paradigms in path imitation favour such models. For instance, in algorithms that use Gaussian Mixture Models [1], the generalised parameters are the weights with associated Gaussian parameters. These algorithms deter direct user interaction/intervention, as the abstract meaning of these parameters is not easily manipulable by users. For example, when a demonstrator accidentally performs a wrong movement, instead of amending the model parameters, the user has to either redo all demonstrations or perform many more correct ones.

To address the above issues and the "how-to" question, a path imitation problem can be perceived as a plan adaptation in which the demonstrated path, together with a set of environmental features, is projected into a warping space. Based on the preservation of the spatial relationship between the imitated and the original features in that space, an imitated path can be generated. Such a way of morphing a scene in the current

Yan Wu and Yiannis Demiris are with the Department of Electrical & Electronic Engineering, Imperial College London, United Kingdom {yan.wu08, y.demiris}@imperial.ac.uk

context into a new scene exists in the fields of both image processing and motion plan adaptation, in which the energy involved in morphing is minimised ([12], [13]).

III. PROBLEM FORMULATION

In this work, we consider the general case of path planning in a 3-D environment inferred from the demonstration of a similar task. We assume that all required input features are observable from vision, in our case a pair of stereo cameras. Path planning for a planar task can thus be considered a special case of this work which requires only one camera.

The ultimate aim of the algorithm is to produce a desirable path for a given scenario. Generation of the path should be an inference from a past demonstrated case. Furthermore, not only should the algorithm be able to generate such a path with a great level of stability, but more importantly the path should bear some resemblance to the path produced by a human under similar circumstances.

For a given demonstration viewed from each of the pair of stereo cameras, we describe the motion path as a set of p discretised spatial feature points m_l : (x_l, y_l), l \in \{1...p\}, in the time-series images. We also assume that the target destination in the scene can be fully described by a set F of n point-like features, where each is described by (a_i, A_i), i \in \{1...n\}. While a_i represents the Cartesian coordinates of the feature point, A_i encapsulates additional information that helps to match invariant points of similar objects/targets, such as SIFT [14] features and textural features. We refer to such a given demonstration as a template and a new situation as the task.

In a new situation described similarly by a set F' : (a'_j, A'_j), j \in \{1...n'\}, we assume that there exists a confidence function for correspondence, f_c(A_i, A'_j), where

    f_c(A_i, A'_j) = \begin{cases} 0 & \text{if } A_i \text{ and } A'_j \text{ are uncorrelated} \\ 1 & \text{if } A_i \text{ matches } A'_j \end{cases}    (1)

Depending on the context, we can employ different feature-mapping algorithms to match the As and A's. We can thus identify a maximum of k pairs of coordinates in the image space of both the task, a'_j, and one of the learned templates, a_i, where f_c(A_i, A'_j) = 1, k \le n, k \le n'. These k pairs of coordinates should also include the pair of starting positions in the task and the template.

In a general path-planning situation, we might be given more features in both the task and the templates, such as objects at far sight and textural features of the background. However, there are cases in which the inclusion of such features generates excessive output distortion, which is undesirable. Thus, we should not impose the matching constraints for such features, in order to preserve the spatial relationship between the cardinal features and generate a route for the task, m' : (x, y).

In a more complex situation, when the agent is required to pass through waypoints or to avoid en-route obstacles, these extra feature points can also be described by an additional set F'_x : (b_j, B_j). If F'_x exists in a task, these additional feature points should be considered only when the planned path m' contradicts b.

IV. METHODOLOGY

In this section, we describe in detail our novel approach for path imitation. Briefly, we first generate a distortional mapping of each spatial point present in the template path into a set of possible locations, based on a minimal distortional-energy mapping between the k pairs of coordinates extracted from feature sets F and F' as well as the pair of starting positions. Based on the time-series information of the template path, the task path is then created from the cloud of possible waypoints using a minimum-energy strategy.

A. Feature Distortion Warping

We define the k Cartesian coordinates a_i from the template as the invariant control points (ICPs) P, and the corresponding ICPs a'_j in the task as P'. If we define each mapping from P to P' as a function f, minimising the distortion of features in space is equivalent to minimising the following energy function [15]:

    E = \sum_{w=1}^{k} \| P'_w - P_w \| + \lambda E_f    (2)

where

    E_f = \iint_{R^2} ( f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2 ) \, dx \, dy    (3)

The introduction of the regularisation parameter \lambda in (2) trades off the exact matching of points against smoothness, which is particularly useful in the presence of noise. According to [15], the mapping function f can be defined as:

    f(x, y) = \alpha_0 + \alpha_x x + \alpha_y y + \sum_{i=1}^{k} \omega_i \, \phi( \| (x_i, y_i) - (x, y) \| )    (4)

where

    \phi(r) = r^2 \log(r)    (5)

(5) is a 2nd-order polyharmonic spline commonly known as a Thin Plate Spline. In order to ensure that E_f exists, the 2nd derivatives of f(x, y) must be square integrable, i.e. the following three conditions have to be met:

    \sum_{i=1}^{k} \omega_i = 0    (6)

    \sum_{i=1}^{k} \omega_i x_i = \sum_{i=1}^{k} \omega_i y_i = 0    (7)

By letting \Phi_{ij} = \phi( \| (x_i, y_i) - (x_j, y_j) \| ) and \upsilon_i = f(x_i, y_i), based on (4) - (7), we can form a linear system as follows:

    \begin{bmatrix} \Phi & L \\ L^T & 0 \end{bmatrix} \begin{bmatrix} \omega \\ \alpha \end{bmatrix} = \begin{bmatrix} \upsilon \\ 0 \end{bmatrix}    (8)

where \omega is a column vector of the \omega_i, \alpha = [ \alpha_0 \ \alpha_x \ \alpha_y ]^T, and the ith row of L is L_i = [ 1 \ x_i \ y_i ].
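As a concrete illustration of Section IV-A, the sketch below fits the thin-plate spline of (4)-(5) by solving the linear system (8), then evaluates it at query points. This is our own minimal numpy reconstruction under assumed names (`tps_warp` and its signature are not from the paper); adding \lambda to the diagonal of \Phi is one standard way of realising the regularisation in (2).

```python
import numpy as np

def tps_warp(src_pts, dst_pts, query_pts, lam=0.0):
    """Fit the TPS f so that f(src_pts) ~ dst_pts via the linear
    system (8), then map query_pts through it (eq. 4).
    lam > 0 relaxes exact interpolation (regularisation of eq. 2)."""
    src_pts = np.asarray(src_pts, float)
    dst_pts = np.asarray(dst_pts, float)
    query_pts = np.asarray(query_pts, float)
    k = len(src_pts)

    def phi(r):
        # phi(r) = r^2 log(r), with phi(0) = 0 (eq. 5)
        out = np.zeros_like(r)
        nz = r > 0
        out[nz] = r[nz] ** 2 * np.log(r[nz])
        return out

    # Phi_ij = phi(||(x_i, y_i) - (x_j, y_j)||)
    Phi = phi(np.linalg.norm(src_pts[:, None] - src_pts[None, :], axis=2))
    Phi += lam * np.eye(k)                      # regularised variant
    L = np.hstack([np.ones((k, 1)), src_pts])   # row i: [1 x_i y_i]
    A = np.zeros((k + 3, k + 3))
    A[:k, :k], A[:k, k:], A[k:, :k] = Phi, L, L.T
    rhs = np.zeros((k + 3, 2))
    rhs[:k] = dst_pts                           # upsilon_i = f(x_i, y_i)
    sol = np.linalg.solve(A, rhs)               # [omega; alpha] of eq. (8)
    omega, alpha = sol[:k], sol[k:]
    # Evaluate f at the query points (eq. 4)
    U = phi(np.linalg.norm(query_pts[:, None] - src_pts[None, :], axis=2))
    return np.hstack([np.ones((len(query_pts), 1)), query_pts]) @ alpha + U @ omega
```

When the matched control points differ only by an affine transform (e.g. a pure translation), the side conditions (6)-(7) force \omega = 0 and the warp reduces to that affine map, which makes the behaviour easy to sanity-check.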

In [16], it has been shown that the square matrix in (8) is non-singular. Thus, we can define the upper-left k \times k sub-matrix of the inverse of this square matrix as M'_k. It can be shown that E_f \propto \upsilon^T M'_k \upsilon = \omega^T M \omega. The optimal solution of \omega and \alpha with minimum bending energy can thus be obtained either by the analytical method or by the approximation methods described in [17] and [18], depending on whether accuracy or computational cost is more important.

For each m_l in the template, there exist q \ge 0 mapped coordinates as possible candidates for m'_l in the task, defined by (4). As these coordinates are often at sub-pixel level, the minimum-energy enforcement is relaxed and thresholded to accommodate neighbouring coordinates.

B. Minimum-Energy Route Plan

Given the time-series point clouds of m', the goal-directed movement is simply represented by stepping through the variable l and connecting the best points from each m'_l to form the trajectory. We use the translational energy as the cost function to derive the task path. The optimisation criterion C_E is thus to minimise this cost function, which is proportional to the sum of squared changes in position:

    C_E = \sum_{i=2}^{p} \| m'_i - m'_{i-1} \|^2    (9)

Instead of evaluating the full mesh of discrete energies through the steps, if we define SD_{ij} = \| m'_i - m'_j \|^2, U as a possible waypoint in a given time-stage l, and C_{E_l}(U) as the minimum energy from the starting point to the waypoint U, we can simplify the computation into a dynamic programming problem [19]:

    C_{E_l}(U) = \min_{\text{waypoints } V \text{ in } l-1} \{ SD_{UV} + C_{E_{l-1}}(V) \}    (10)

C. Iterative Plan Adjustment

Recall that we have defined F'_x as the set denoting extra features in a more complex situation. After the task path is generated, we check whether any m'_l contradicts a b_j in F'_x. In the case of forced waypoints, suppose we define the contradiction evaluation Ct as

    Ct(b_j) = \begin{cases} 1 & \text{if } b_j \not\subseteq m' \\ 0 & \text{otherwise} \end{cases}    (11)

The algorithm should iterate the following steps until all Ct(b_j) = 0:
1) Compute Ct(b_j) for all j
2) For any Ct(b_j) = 1, locate the point m'_l on the task path that is nearest to b_j. Find the corresponding point m_l in the template. Put m_l into set F and b_j into F'
3) Re-perform the distortion mapping algorithm described above to find a new task path

V. EXPERIMENTS

Our planning algorithm was implemented and validated on the iCub (Fig. 1a), a humanoid robot developed by the RobotCub Consortium (www.RobotCub.org). Two different sets of experiments were conducted to test the statistical fitness (Experiment A) and the practical application (Experiment B) of the algorithm respectively. In both cases, we used the SIFT matching algorithm as the confidence function f_c defined in (1).

Fig. 1: The experiment set-up for testing the path planning algorithm. The iCub in (a) is developed by the RobotCub Consortium. It has a total of 53 Degrees of Freedom, 32 of which are distributed on the arms. (b) and (c) are an instance of a human subject with markers captured by the left and right cameras of the iCub respectively. (d) shows the locations of markers placed on the left arm of the human subjects in Experiment A.

A. Experimental Setup

The pair of on-board stereo cameras was used to obtain the demonstrated information. In the experiments reported below, the iCub captured the demonstrations at a frame rate of 20 Hz and a frame resolution of 320x240 pixels (example shown in Fig. 1b & 1c). Markers were placed on the human subjects (e.g. Fig. 1d) to track the points of interest.

B. Experiment A

A human demonstrator was instructed to perform a grasp-oriented task while the iCub observed the actions. Extending our previous work [20], we conducted 5 experimental tasks, each consisting of 20 trials performed by different subjects. This resulted in 100 experiments for cross-validation of the model against human demonstrations.

Fig. 2: The sketches of the 5 conducted Experiments. All subjects were requested to use their inferior arm (in all cases, the left arm). The hand positions in the diagrams indicate the starting points of the experiments. The subjects were also requested to approach the object with their fore-arms orthogonal to the blue strips indicated in the diagrams. The black patch indicated in (e) denotes the waypoint area the subjects have to navigate their arms through. The hypothesised paths are denoted by black slashes in the diagrams.

The 5 experiments were designed to benchmark the robustness of the proposed model in various ways, which are described below and illustrated in Fig. 2.
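The minimum-energy route plan of Section IV-B reduces, via the recursion (10), to a shortest-path computation over the candidate clouds. Below is an illustrative numpy sketch under our own naming (the paper provides no code); each element of `clouds` holds the candidate positions for one time-stage l.

```python
import numpy as np

def min_energy_path(clouds):
    """Select one candidate per time-stage from `clouds` (a list of
    (q_l, d) arrays), minimising the summed squared step lengths (9)
    via the dynamic-programming recursion (10)."""
    cost = np.zeros(len(clouds[0]))   # C_E at stage 0 (start candidates)
    back = []                         # back-pointers for path recovery
    for prev, cur in zip(clouds, clouds[1:]):
        # sd[u, v] = ||cur_u - prev_v||^2, the SD term of (10)
        sd = ((cur[:, None, :] - prev[None, :, :]) ** 2).sum(axis=2)
        total = sd + cost[None, :]    # SD_UV + C_E,l-1(V)
        back.append(total.argmin(axis=1))
        cost = total.min(axis=1)      # C_E,l(U) for every candidate U
    # Trace the optimal candidate indices back from the best endpoint
    idx = int(cost.argmin())
    path = [idx]
    for bp in reversed(back):
        idx = int(bp[idx])
        path.append(idx)
    path.reverse()
    return [clouds[l][i] for l, i in enumerate(path)]
```

Each stage costs O(q_l q_{l-1}) pairwise evaluations instead of enumerating all candidate combinations, which is the point of casting (9) as the recursion (10).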

1) Experiment 1, shown in Fig. 2a, is the most general case of grasping, hypothesised to be most useful for mapping in complex situations.
2) Experiment 2, shown in Fig. 2b, is intended to test the algorithm with only a pure angular rotation of the entire scene.
3) Experiment 3, shown in Fig. 2c, is designed to test the ability of the algorithm to generalise to a 3-D situation.
4) Experiment 4, shown in Fig. 2d, is to test the general performance of the warping algorithm.
5) Experiment 5, shown in Fig. 2e, is to test the robustness of the algorithm in a more complex situation, in this case with a waypoint.

C. Experiment B

The task of this experiment is to make the iCub play the tic-tac-toe game by imitating the basic movement of the game. During the training phase shown in Fig. 3a, the iCub was given one single demonstration of how to place a mark in the grid space. In the demonstration, the human subject was instructed to constrain his arm movement to a plane. The iCub was then given a new grid of a different size at a completely new location, as shown in Fig. 3b, to play the game with a human until the game was finished. The iCub's arm position was randomly parked some distance above and to the left of the playing board at the end of each move. This parking position was used as the initial position for the next move. All these measures test the robustness of the algorithm in generalising to a 3-D environment. In this experiment, we assumed that the pen was always in the hand of the iCub and that the invariant features were the four corners of the cell on the grid and the starting position of the arms.

Fig. 3: Environmental setup for Experiment B. (a) shows an instance of the demonstrator drawing a circle in a cell of the A4-sized grid. The iCub was expected to play the game in scene (b), where the grid is 20% smaller, rotated and 20cm above that in (a).

D. Implementation of the Algorithm

As both cameras on the iCub have a certain degree of fish-eye distortion (Fig. 1b & 1c), we undistorted the images with a set of calibration parameters discussed in [21] before processing the captured frames. The marker positions were extracted using an efficient colour segmentation technique proposed in [22]. As the image resolution is limited and sub-pixel data cannot be recorded, uncertainty in the extraction process in the presence of noise is unavoidable. The least possible amount of uncertainty is thus 0.5 pixel, given full accuracy in extracting the markers. Hence, Gaussian Radial Basis Function (RBF) smoothing with a smoothing parameter of 0.5 pixels was applied to the extracted path.

The algorithm was then set up with intrinsic parameter \lambda = 0, as we believe the RBF smoothing should already have removed the noise present in the experiments; any attempt to relax \lambda might then distort the optimal path. We also thresholded the distortional energy to accommodate the inclusion of pixel locations up to 2 pixels away from the mapped sub-pixel location.

With the set of predefined intrinsic parameters of the stereo cameras, the two independently generated paths can be integrated to form the final 3-D action path. In Experiment A, the generated paths were then used to cross-validate the results produced by the other experimental trials. In Experiment B, during the iCub's turn in the game, it generated a corresponding path to place a mark in the intended grid cell. This path was then passed to the inverse kinematics module of the iCub for execution.

E. Statistical Performance Evaluation

Apart from visual inspection, we introduce two performance metrics, namely Mean Squared Difference and Correlation Coefficient, to evaluate the performance of the algorithm quantitatively.

1) Mean Squared Difference (MSD): We use the MSD to estimate the squared difference between the calculated path and the demonstrated path, gauging how close the generated paths are to the human-demonstrated ones.

    MSD = \frac{1}{N} \sum_{i=1}^{N} \| m'_i - m_i \|^2    (12)

2) Correlation Coefficient (R^2): Assuming that the proposed algorithm is an estimator of the path a human would generate under similar circumstances, R^2 indicates how well our proposed algorithm can predict paths produced by humans.

    R^2 = \frac{ \sum_{i=1}^{N} (m_i - \bar{m}) \cdot (m'_i - \bar{m}') }{ \sqrt{ ( \sum_{i=1}^{N} (m_i - \bar{m})^2 ) ( \sum_{i=1}^{N} (m'_i - \bar{m}')^2 ) } }    (13)

where \bar{m} denotes the arithmetic mean of the m_i.

In (12) and (13), both performance metrics require the input vectors to be of the same length. However, in practice we cannot ensure that all demonstrations are completed in the same duration. Thus, we employed Cubic Spline Interpolation to lengthen the path with fewer waypoints to match that of the longer one.

VI. RESULTS AND DISCUSSIONS

A. Experiment A

For each trial used as an input to the algorithm, we generated 100 paths based on the constraint-mapping into all trials, including the input scenario itself. This produced a 100x100 matrix of paths for cross-validation.
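The evaluation pipeline of Section V-E can be sketched as follows. This is our own illustrative code, not the authors'; for brevity the resampling uses linear interpolation where the paper uses Cubic Spline Interpolation (scipy.interpolate.CubicSpline would match the paper more closely), while `msd` and `r2` follow (12) and (13) directly.

```python
import numpy as np

def resample(path, n):
    """Resample a (p, d) path to n points over a uniform parameterisation.
    Linear interpolation stands in for the paper's cubic splines."""
    path = np.asarray(path, float)
    t_old = np.linspace(0.0, 1.0, len(path))
    t_new = np.linspace(0.0, 1.0, n)
    return np.column_stack([np.interp(t_new, t_old, path[:, k])
                            for k in range(path.shape[1])])

def msd(m, m_prime):
    """Mean Squared Difference between equal-length paths, eq. (12)."""
    diff = np.asarray(m_prime, float) - np.asarray(m, float)
    return float(np.mean(np.sum(diff ** 2, axis=1)))

def r2(m, m_prime):
    """Correlation coefficient between equal-length paths, eq. (13)."""
    m, m_prime = np.asarray(m, float), np.asarray(m_prime, float)
    dm = m - m.mean(axis=0)
    dp = m_prime - m_prime.mean(axis=0)
    num = np.sum(dm * dp)
    den = np.sqrt(np.sum(dm ** 2) * np.sum(dp ** 2))
    return float(num / den)
```

A generated path would first be resampled to the length of the demonstrated one, then scored with both metrics; identical paths give MSD = 0 and R^2 = 1.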

TABLE I shows the performance metrics for the 100 self-mapping cases, i.e. cases where the input and output constraints are the same. Assuming a given demonstration is the optimal path, the output from the algorithm should maximally preserve the input path. We can see that in all cases the confidence indicator is greater than 99%, with very low MSD, which suggests good preservation of paths.

TABLE I: Performance of the self-mapping cases, grouped by experiment.

    Indicators   Exp 1   Exp 2   Exp 3   Exp 4   Exp 5
    MSD          25.9    14.4    35.3    31.4    30.6
    R^2          0.995   0.998   0.992   0.993   0.993

In TABLEs II and III, we group the performance indicators according to input/output experiment and take the mean of these indicators. As shown in TABLE III, 88% of the cases have R^2 >= 0.7, which suggests that the paths generated by the algorithm are close enough to what humans demonstrated. Simple hypothesis testing indicated little statistical evidence of a performance difference for R^2 >= 0.83. From TABLE II, we can see that mapping a simpler path, e.g. a straight line or one without forced waypoints, to any scenario results in performance closer to that of a human. Such an observation is somewhat expected, given the lack of complete invariant information in the complex case. From these results, we believe that this One-Shot Learning algorithm can reduce the cost and burden of repeated demonstrations while maintaining high accuracy.

TABLE II: The averaged Mean Squared Difference for mapping from one experiment to another. Columns indicate input while rows indicate output.

           Exp 1   Exp 2   Exp 3   Exp 4   Exp 5
    Exp 1  184     312     1158    7165    8244
    Exp 2  120     70      1526    3041    5105
    Exp 3  781     2785    536     891     2848
    Exp 4  1512    645     1216    193     332
    Exp 5  1119    865     1391    464     288

TABLE III: The averaged Correlation Coefficients for mapping from one experiment to another. Columns indicate input while rows indicate output.

           Exp 1   Exp 2   Exp 3   Exp 4   Exp 5
    Exp 1  0.960   0.949   0.819   0.453   0.402
    Exp 2  0.993   0.984   0.932   0.751   0.710
    Exp 3  0.864   0.482   0.896   0.793   0.752
    Exp 4  0.773   0.827   0.868   0.967   0.941
    Exp 5  0.835   0.817   0.874   0.890   0.962

We can also see that the algorithm is capable of accurately mapping paths from Experiments 1-4 to Experiment 5, with the constraint of an additional waypoint. This implies that the Iterative Plan Adjustment (IPA) algorithm works sensibly in such situations. Thus, we believe that with a similar strategy, obstacle avoidance can also be executed with IPA by carefully moving the path point to the neighbour of the obstacle with the lowest bending energy. As such, this algorithm can also be treated as an incremental learning paradigm in path planning to accommodate additional constraints, which is in line with human learning patterns [23]. This is probably why mapping from complex cases (Exp 5) back to the simplest form (Exp 1) does not yield good results: the task space lacks precise constraints corresponding to those in the template space.

There are inevitably cases where the algorithm does not reproduce a good path, most likely due to the quality of the demonstrated path. An additional layer of algorithm could be added to generalise repeating templates to address this problem. It is also interesting to note that Exp 3 yields a high R^2 with a high MSD, which implies it might not perform better than the others. On the other hand, as the templates are stored as a set of path waypoints and invariant features, users have the flexibility to understand the underlying contexts and make necessary adjustments, such as removing templates with high MSDs, without retraining the model.

B. Experiment B

Fig. 4a captures the camera view of the demonstrated path, while Fig. 4b shows the generated paths in the iCub's coordinate system for execution. We can see from Fig. 4a that the human hand started from the far left and drew a circle on the right before leaving the board vertically in space. This has been well preserved by the algorithm in its attempt to perform the same task elsewhere in space, as shown in Fig. 4b. Although there are some minor imperfections in shape preservation, the algorithm nevertheless demonstrated a great level of stability with the board shifted to a new location and the arm freed from planar movement. These imperfections can be reduced by introducing more invariant feature points.

At every move, one of these generated paths was passed to the iCub's inverse kinematics module for execution. The sequence of drawings performed by the iCub is shown in Fig. 5. All the symbols marked by the iCub during the game were fairly accurately placed in the correct cells. Compared to the circles generated by the algorithm shown in Fig. 4b, the drawn circles appear much less smooth. This is likely due to the path generated by the iCub's inverse kinematics module, which is not the shortest distance between two points. However, this does not affect the discrimination between the two different symbols in the game.

VII. CONCLUSIONS

We have presented a One-Shot Learning algorithm for robot path imitation. This algorithm has been implemented and statistically validated using cross-validation results from the grasp-oriented paths demonstrated by human subjects. It has also been implemented to equip a humanoid robot with the capability to play the tic-tac-toe game without being constrained to the same location. The experimental results show that this generic algorithm is capable of reproducing satisfactory paths by imitating simple tasks. However, the experiments were conducted under assumptions, such as sufficient invariant feature points being given for mapping and the destination features being static.

Computationally, this algorithm with the implementation of the approximation method is an O(n^3) problem. This

algorithm is much less expensive, while preserving good accuracy, compared to imitation algorithms that focus more on accuracy using fluid dynamics principles [24]. We plan to extend our research to segmenting the observed templates into smaller sub-templates that can be reused and recycled in a new situation by combining specific sub-templates in the right order.

Fig. 4: Paths imitated from one single demonstration. (a) shows the demonstrated path seen from the left camera of the iCub (x/pixel vs y/pixel). (b) shows the generated paths for marking different cells in the iCub's coordinate system (y/m vs -x/m).

Fig. 5: iCub playing tic-tac-toe. The iCub started the game by marking the centre cell (a) with the path learned from demonstration. Subsequently, it marked the top-left (b) and top-right (c) cells, and won by marking the top-centre (d) cell. (e) shows the resulting grid after the game.

VIII. ACKNOWLEDGMENT

The authors acknowledge the comments and support from members of the BioART Lab. The authors would also like to thank the participants in the experiments.

REFERENCES

[1] A. Billard, S. Calinon, R. Dillmann, and S. Schaal, "Robot programming by demonstration," Handbook of Robotics, pp. 1371–1394.
[2] Y. Demiris and G. Simmons, "Perceiving the unusual: Temporal properties of hierarchical motor representations for action perception," Neural Networks, vol. 19, no. 3, pp. 272–284, 2006.
[3] J. Peters, S. Vijayakumar, and S. Schaal, "Reinforcement learning for humanoid robotics," in Proceedings of the Third IEEE-RAS International Conference on Humanoid Robots, pp. 1–20, 2003.
[4] Y. Yoshikawa, K. Shinozawa, H. Ishiguro, N. Hagita, and T. Miyamoto, "Responsive robot gaze to interaction partner," in Proceedings of Robotics: Science and Systems, 2006.
[5] R. Dillmann, "Teaching and learning of robot tasks via observation of human performance," Robotics and Autonomous Systems, vol. 47, no. 2-3, pp. 109–116, 2004.
[6] J. Demiris and G. Hayes, "Imitation as a dual-route process featuring predictive and learning components: a biologically-plausible computational model," Imitation in Animals and Artifacts, pp. 327–361, 2002.
[7] A. Ude, C. Atkeson, and M. Riley, "Programming full-body movements for humanoid robots by observation," Robotics and Autonomous Systems, vol. 47, no. 2-3, pp. 93–108, 2004.
[8] M. Nicolescu and M. Mataric, "Natural methods for robot task learning: Instructive demonstrations, generalization and practice," in Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 241–248, ACM, NY, USA, 2003.
[9] M. Lagarde, P. Andry, and P. Gaussier, "The role of internal oscillators for the one-shot learning of complex temporal sequences," Lecture Notes in Computer Science, vol. 4668, p. 934, 2007.
[10] T. Mitchell, R. Keller, and S. Kedar-Cabelli, "Explanation-based generalization: A unifying view," Machine Learning, pp. 47–80, 1986.
[11] H. Friedrich, R. Dillmann, and O. Rogalla, "Interactive robot programming based on human demonstration and advice," Lecture Notes in Computer Science, pp. 96–119, 1999.
[12] H. Chui and A. Rangarajan, "A new point matching algorithm for non-rigid registration," Computer Vision and Image Understanding, vol. 89, no. 2-3, pp. 114–141, 2003.
[13] B. Takács and Y. Demiris, "Multi-robot plan adaptation by constrained minimal distortion feature mapping," in IEEE International Conference on Robotics and Automation, ICRA '09, May 2009.
[14] D. Lowe, "Object recognition from local scale-invariant features," in Int'l Conference on Computer Vision, vol. 2, pp. 1150–1157, 1999.
[15] F. Bookstein, "Principal warps: Thin-plate splines and the decomposition of deformations," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 6, pp. 567–585, 1989.
[16] M. Powell, "A thin plate spline method for mapping curves into curves in two dimensions," Computational Techniques and Applications (CTAC '95), 1995.
[17] J. Zhu and M. Lyu, "Progressive finite newton approach to real-time nonrigid surface detection," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR '07, pp. 1–8, 2007.
[18] J. Pilet, V. Lepetit, and P. Fua, "Real-time nonrigid surface detection," in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 1, 2005.
[19] R. Bellman, "Some problems in the theory of dynamic programming," Econometrica: Journal of the Econometric Society, pp. 37–48, 1954.
[20] Y. Wu and Y. Demiris, "Efficient template-based path imitation by invariant feature mapping," in IEEE International Conference on Robotics and Biomimetics, ROBIO 2009, pp. 913–918, Dec 2009.
[21] J. Heikkila and O. Silven, "A four-step camera calibration procedure with implicit image correction," in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p. 1106, 1997.
[22] J. Bruce, T. Balch, and M. Veloso, "Fast and inexpensive color image segmentation for interactive robots," in IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2000, vol. 3, 2000.
[23] R. Mayer and R. Moreno, "Nine ways to reduce cognitive load in multimedia learning," Educational Psychologist, vol. 38, no. 1, pp. 43–52.
[24] H. Mayer, I. Nagy, A. Knoll, E. Braun, R. Lange, and R. Bauernschmitt, "Adaptive control for human-robot skill transfer: Trajectory planning based on fluid dynamics," in IEEE International Conference on Robotics and Automation, pp. 1800–1807, 2007.
