Abstract
Purpose
The purpose of this paper is to propose a new algorithm based on programming by demonstration and exception strategies to solve assembly tasks such as peg-in-hole.
Design/methodology/approach
Data describing the demonstrated tasks are obtained by kinesthetic guiding. The demonstrated trajectories are transferred to new robot workspaces using three-dimensional (3D) vision. Noise introduced by vision when transferring the task to a new configuration could cause the execution to fail, but such problems are resolved through exception strategies.
Findings
This paper demonstrates that the proposed approach, combined with exception strategies, outperforms traditional approaches for robot-based assembly. Experimental evaluation was carried out on the Cranfield benchmark, which constitutes a standardized assembly task in robotics. A statistical evaluation based on experiments carried out on two different robotic platforms is also presented.
Practical implications
The developed framework can have an important impact on robot assembly processes, which are among the most important applications of industrial robots. Our future plans involve the implementation of our framework in a commercially available robot controller.
Originality/value
This paper proposes a new approach to robot assembly based on the Learning by Demonstration (LbD) paradigm. The proposed framework enables quick programming of new assembly tasks without the need for a detailed analysis of the geometric and dynamic characteristics of the workpieces involved in the assembly task. The algorithm provides effective disturbance rejection, improved stability and increased overall performance. The proposed exception strategies increase the success rate of the algorithm when the task is transferred to new areas of the workspace, where it is necessary to deal with vision noise and altered dynamic characteristics of the task.
Citation
Abu-Dakka, F.J., Nemec, B., Kramberger, A., Buch, A.G., Krüger, N. and Ude, A. (2014), "Solving peg-in-hole tasks by human demonstration and exception strategies", Industrial Robot, Vol. 41 No. 6, pp. 575-584. https://doi.org/10.1108/IR-07-2014-0363
Publisher
Emerald Group Publishing Limited
Copyright © 2014, Emerald Group Publishing Limited
1. Introduction
Automatic assembly has been a challenging area of research over the past decades. Generally, it requires extremely tight positional accuracy and good knowledge of the parts' geometrical characteristics. The inaccuracy of the positioning devices used during part mating adds to the complexity of the process. One approach to facilitate assembly and avoid the generation of large impact forces between the parts is to employ compliance and online adaptation of the desired trajectories (Nemec et al., 2013a). Human operators perform compliant movements easily by using force feedback and tactile information. Contacts and forces that occur during assembly are sensed and used to comply with the environment and adapt the assembly procedures. This information may also be used in subsequent operations to speed up the assembly process (Nemec et al., 2013b). There has been a significant amount of work on providing such an ability for industrial robots using force feedback and modifying the position or velocity of the tool-tip. The process, however, often results in task-specific solutions that are application-oriented and, thus, unsatisfactory as general assembly solutions.
One of the standard operations needed for automatic assembly is peg insertion, also referred to as the peg-in-hole (PiH) task. There are many approaches to solving the PiH problem, most of them based on force feedback control (Yun, 2008; Hirana et al., 2002; Giordano et al., 2008; Newman et al., 1999).
In this paper, the tasks for evaluation were taken from the Cranfield benchmark (Figure 1) (Collins et al., 1985), which constitutes a standardized assembly task in robotics. It requires the insertion of different pegs (round and square) into the corresponding holes of the base plate and the placement of a separator onto the two previously inserted pegs.
Visual servoing and reactive task controllers can also play an important role (Hamner et al., 2010). Because the appropriate strategy depends on the geometry of the manipulated workpieces, in most traditional approaches an engineer has to hardcode a different strategy for every new workpiece geometry (Bruyninckx et al., 1995; Xiao, 1997).
In this paper, an approach for learning force-based skills has been developed. This approach consists of four stages:
1. Kinesthetic guiding is first used to perform the desired task in a specific part of the robot workspace. Besides positions, velocities and accelerations, we also acquire the associated force and torque profiles that led to the successful task execution. Note that unlike standard programming by demonstration (PbD) (Argall et al., 2009), we also make use of the force/torque profiles that occur during the task execution to learn the appropriate behavior. There are only a few other works that exploit forces and torques acquired during human demonstration. For example, Kormushev et al. (2011) and Rozo et al. (2013) designed systems where a haptic interface is utilized to demonstrate the desired movements and the associated forces and torques.
2. If necessary, the acquired force/torque profiles are refined via reinforcement learning (Kalakrishnan et al., 2011), and the optimal stiffness parameters can be learned.
3. Once the new workpiece pose is acquired by vision, the demonstrated skill is transferred to the new configuration, and the transferred movement is adapted to the trained forces and torques.
4. Exception strategies are applied when the peg fails to enter the hole due to errors in the estimation of the new workpiece pose by the three-dimensional (3D) vision system.
The main focus of this paper is on Item 4, i.e. the application of exception strategies to resolve situations where the robot fails to accomplish the task outright. We also explain how Item 3 is solved. We focus on situations where the PiH task needs to be executed in a different part of the workspace than during training. In our system, a new workpiece pose is estimated by vision, which introduces noise; simply transforming the previously trained Cartesian space trajectory and replaying it in the new configuration therefore does not lead to a successful execution of the task. Such an approach fails due to various factors, including vision noise and uncertainties in the grasping posture of the peg. Moreover, Cartesian space tracking errors can occur due to a different joint space configuration of the robot arm.
In our approach, we therefore first try to improve the transformed control policy by matching it with the previously trained performance in the force/torque space using force feedback control. The main idea is to match the recorded forces and torques with the currently measured ones by iteratively modifying the transformed Cartesian space trajectory. Through subsequent repetitions of the task execution, our learning procedure transfers the force feedback corrections into the encoded trajectory, thus improving the execution of the task and bringing it closer to the originally trained performance. Our experimental results show that with the proposed approach the robot can effectively learn and transfer assembly operations to new situations (Nemec et al., 2013a). To further improve the effectiveness of the algorithm, we introduce exception strategies, which can be utilized to solve the problem of missing the hole due to vision errors, thereby increasing the success rate of the algorithm. When the vision system acquires a new pose of the baseplate of the Cranfield benchmark (Figure 1), the algorithm adapts the Cartesian space trajectory to this new pose. However, due to improper calibration and vision errors, the estimated initial pose is often shifted and rotated with respect to the true initial (hole) pose. We solve this problem by integrating exception strategies into the learning framework.
The paper consists of six sections. In Section 2, we briefly present the data acquisition procedure used for acquiring trajectories and forces for an assembly operation. In Section 3, we formulate the problem and present the dynamic movement primitives (DMP) framework for encoding discrete trajectories. Moreover, an approach to iteratively adapt the learned trajectory to improve the task performance is reviewed. Section 4 presents the main result of this paper, i.e. the exception strategies developed to improve the success rate of the PiH operation. Results of the experimental evaluation are given in Section 5, followed by final remarks, a summary of the main contributions and conclusions in Section 6.
2. Programming by demonstration for force-based tasks
In this section, we briefly describe the basic procedure of learning PiH operations by human demonstration. The human demonstrations are performed by kinesthetic guiding using the Kuka LWR arm in gravity compensation mode (Figure 2). The human guides the robot's tool center point along the desired trajectory so that, for example, the peg insertion task is successfully executed. We measure the Cartesian space trajectory by proprioception during the execution. In the Kuka LWR arm, the force sensors are located in the robot joints; consequently, the forces exerted by the human operator during the demonstration affect the measured forces and torques. Therefore, to obtain the net forces and torques at the robot tool center point, we repeat the measured trajectory with the robot, record the resulting joint torques and transform them into the corresponding tool center point forces and torques. We take care that the workspace configuration does not change during this process, and since our robot can accurately track the joint space trajectories, the measured forces and torques provide a good reference for later adaptation.
With this system, from each successful demonstration of the PiH task, the algorithm acquires the Cartesian space tool center point positions and orientations (represented as quaternions), velocities and accelerations:

$$\{\, \mathbf{p}_i,\; \mathbf{q}_i,\; \dot{\mathbf{p}}_i,\; \boldsymbol{\omega}_i,\; \ddot{\mathbf{p}}_i,\; \dot{\boldsymbol{\omega}}_i \,\} \qquad (1)$$

and the associated forces and torques:

$$\{\, \mathbf{F}_i,\; \mathbf{M}_i \,\} \qquad (2)$$

all acquired at times $t_i$, $i = 0, \ldots, T$. Positions and orientations are represented as a frame $Q = \{\mathbf{p}, \mathbf{q}\}$, where $\mathbf{p} \in \mathbb{R}^3$ and $\mathbf{q} = v + \mathbf{u}$ is a unit quaternion with scalar part $v \in \mathbb{R}$ and vector part $\mathbf{u} \in \mathbb{R}^3$.
3. Problem formulation
In this section, the technologies used for the trajectory adaptation of an assembly process are introduced. The robot trajectories are represented with the DMP framework (Ijspeert et al., 2013). The main feature of this framework for our purposes is that it enables slowing down or speeding up the trajectories. In Section 3.1, this framework is presented. In Section 3.2, we explain how the demonstrated trajectories are modified to successfully execute a task in different parts of the robot workspace. In Section 3.3, we use Iterative Learning Control to learn and adapt the trajectories. This adaptation is performed in several iterations, where the robot repeats the same task. Finally, a scheme for the overall procedure is introduced (Section 3.4).
3.1 DMPs for Cartesian space
The demonstrated Cartesian space trajectories that result in a successful execution of the PiH task are encoded by DMPs. A DMP for a single degree of freedom (df) trajectory $y$ is defined by the following set of nonlinear differential equations (Ijspeert et al., 2013):

$$\tau \dot{z} = \alpha_z \left( \beta_z (g - y) - z \right) + f(x) \qquad (3)$$
$$\tau \dot{y} = z \qquad (4)$$
$$\tau \dot{x} = -\alpha_x x \qquad (5)$$

where $x$ is the phase variable, $z$ is an auxiliary variable and $\tau$ is a temporal scaling constant. Parameters $\alpha_z$ and $\beta_z$ define the behavior of the second-order system described by equations (3) and (4). The evolution of the phase variable $x$ is defined by equation (5). With the choice of the time constant $\tau > 0$, $\alpha_z = 4\beta_z$ and $\alpha_x > 0$, the convergence of the underlying dynamic system to a unique attractor point at $y = g$, $z = 0$ is ensured (Ijspeert et al., 2013). $f(x)$ is defined as a linear combination of $N$ nonlinear radial basis functions, which enables the robot to follow any smooth trajectory from the initial position $y_0$ to the final configuration $g$ (goal):

$$f(x) = \frac{\sum_{i=1}^{N} w_i \Psi_i(x)}{\sum_{i=1}^{N} \Psi_i(x)}\, x \qquad (6)$$
$$\Psi_i(x) = \exp\left( -h_i (x - c_i)^2 \right) \qquad (7)$$

where $\Psi_i$ are fixed basis functions and $w_i$ are adjustable weights. $c_i$ are the centers of the Gaussians distributed along the phase of the trajectory and $h_i$ are their widths. For a given $N$ and setting the time constant $\tau = t_T$ (the duration of the demonstrated trajectory), we can define $c_i = \exp(-\alpha_x\, i / N)$, $h_i = 1 / (c_{i+1} - c_i)^2$, $h_N = h_{N-1}$, $i = 0, \ldots, N$. For each Cartesian df, the weights $w_i$ are estimated from the measured data of equation (1) using regression (Ude et al., 2010), while $g$ is the last recorded configuration on the trajectory. Thus, the resulting DMPs encode the desired peg insertion trajectory.
In the basic DMP equations (3)-(4), each df of the position/orientation is encoded as a single DMP with a common phase variable $x$. Note that we encode each quaternion component as a single DMP; therefore, the resulting quaternion has to be normalized after each integration step. An exact solution is proposed in Pastor et al.'s study (2011), but because we confirmed experimentally that the differences between the two approaches are negligible, we use equations (3)-(4) also for quaternion integration. Such an approach simplifies the implementation.
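To make the encoding concrete, the following is a minimal single-df sketch of equations (3)-(7), assuming Euler integration, at least two basis functions and illustrative gain values; it is not the implementation used in the experiments (which was written in Matlab and C++).

```python
import numpy as np

def dmp_rollout(w, y0, g, tau, dt=0.002, alpha_z=48.0, alpha_x=2.0):
    """Integrate a single-df DMP, equations (3)-(5), with Euler steps.
    w: N adjustable weights (N >= 2); alpha_z = 4*beta_z gives critical damping."""
    beta_z = alpha_z / 4.0
    N = len(w)
    c = np.exp(-alpha_x * np.arange(N) / N)   # Gaussian centers along the phase
    h = np.empty(N)
    h[:-1] = 1.0 / (c[1:] - c[:-1]) ** 2      # widths h_i
    h[-1] = h[-2]                             # h_N = h_{N-1}
    y, z, x = y0, 0.0, 1.0
    traj = [y]
    while x > np.exp(-alpha_x):               # run until t = tau, where x = exp(-alpha_x)
        psi = np.exp(-h * (x - c) ** 2)       # basis functions, equation (7)
        f = x * np.dot(w, psi) / (psi.sum() + 1e-10)        # forcing term, equation (6)
        dz = (alpha_z * (beta_z * (g - y) - z) + f) / tau   # equation (3)
        dy = z / tau                          # equation (4)
        dx = -alpha_x * x / tau               # equation (5)
        y, z, x = y + dy * dt, z + dz * dt, x + dx * dt
        traj.append(y)
    return np.array(traj)
```

For a Cartesian trajectory, one such DMP is run per position and quaternion component, all sharing a common phase $x$, with the quaternion normalized after each integration step.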
3.2 Error feedback and DMP phase stopping
When the robot executes the demonstrated trajectory, the resulting forces and torques differ from the ones that were measured during human demonstration. This happens due to small displacements arising from inaccurate pose estimation, reduced accuracy of the robot tracking control and uncertainties in the placement of the peg in the gripper. This can worsen or even prevent the successful execution of the PiH task. To adapt to a new situation, we propose to modify the demonstrated trajectory according to the stiffness control law (Villani and Schutter, 2008):

$$\mathbf{p}_r(x) = \mathbf{p}'_{\mathrm{DMP}}(x) + \mathbf{K}_{s1}\, \mathbf{e}_p(x) \qquad (8)$$
$$\mathbf{q}_r(x) = \mathbf{X}\left( \mathbf{K}_{s2}\, \mathbf{e}_q(x) \right) * \mathbf{q}'_{\mathrm{DMP}}(x) \qquad (9)$$
where $\mathbf{p}_r(x)$ is the position vector fed to the robot controller and $(\mathbf{p}'_{\mathrm{DMP}}, \mathbf{q}'_{\mathrm{DMP}})$ is the displaced demonstrated trajectory computed by integrating the learned DMPs $(\mathbf{p}_{\mathrm{DMP}}, \mathbf{q}_{\mathrm{DMP}})$ and applying the workpiece displacement $(\Delta \mathbf{t}_w, \Delta \mathbf{q}_w)$:

$$\mathbf{p}'_{\mathrm{DMP}}(x) = \Delta \mathbf{q}_w * \mathbf{p}_{\mathrm{DMP}}(x) * \overline{\Delta \mathbf{q}}_w + \Delta \mathbf{t}_w \qquad (10)$$
$$\mathbf{q}'_{\mathrm{DMP}}(x) = \Delta \mathbf{q}_w * \mathbf{q}_{\mathrm{DMP}}(x) \qquad (11)$$

where the position in equation (10) is treated as a quaternion with zero scalar part.
$\mathbf{K}_{s1}$ and $\mathbf{K}_{s2}$ are $3 \times 3$ diagonal matrices and $\mathbf{q}_r(x)$ is the orientation quaternion fed to the robot controller. $\mathbf{q}_{\mathrm{DMP}}(x)$ is the reference quaternion obtained by integrating the learned DMP, and $\overline{\Delta \mathbf{q}}_w$ denotes the conjugate of quaternion $\Delta \mathbf{q}_w$. The quaternion product "$*$", defined as $\mathbf{q}_1 * \mathbf{q}_2 = (v_1, \mathbf{u}_1) * (v_2, \mathbf{u}_2) = (v_1 v_2 - \mathbf{u}_1^{\mathrm{T}} \mathbf{u}_2,\; v_1 \mathbf{u}_2 + v_2 \mathbf{u}_1 + \mathbf{u}_1 \times \mathbf{u}_2)$, has been used in equation (9). The feedback error terms, i.e. $\mathbf{K}_{s1} \mathbf{e}_p(x)$ and $\mathbf{X}(\mathbf{K}_{s2} \mathbf{e}_q(x))$, respectively, provide force/torque feedback control. The errors $\mathbf{e}_p(x)$ and $\mathbf{e}_q(x)$ are defined as:

$$\mathbf{e}_p(x) = \mathbf{q}(x) * \left( \mathbf{F}_d(x) - \mathbf{F} \right) * \overline{\mathbf{q}}(x) \qquad (12)$$
$$\mathbf{e}_q(x) = \mathbf{q}(x) * \left( \mathbf{M}_d(x) - \mathbf{M} \right) * \overline{\mathbf{q}}(x) \qquad (13)$$
where $\mathbf{F}_d(x)$, $\mathbf{M}_d(x)$ are the desired reference forces and torques at phase $x$ as acquired from human demonstration, $\mathbf{F}_d = [F_{d,X}, F_{d,Y}, F_{d,Z}]^{\mathrm{T}}$ and $\mathbf{M}_d = [M_{d,X}, M_{d,Y}, M_{d,Z}]^{\mathrm{T}}$. Linear systems of equations similar to equation (17) need to be solved to estimate $\mathbf{F}_d$ and $\mathbf{M}_d$ from the measured force/torque data of equation (2). Subscripts $(\cdot)_{X,Y,Z}$ denote the components of $\mathbf{F}_d$ in the directions of the Cartesian axes or the rotation components of $\mathbf{M}_d$ around the Cartesian axes, respectively. Please note that the uppercase letters $X, Y, Z$ denote Cartesian axes, while the lowercase letters $x, y, z$ denote DMP variables.
$\mathbf{F}$ and $\mathbf{M}$ are the current measured forces and torques, $\mathbf{q}(x)$ is the quaternion specifying the current orientation of the robot's tool, and $\overline{\mathbf{q}}(x)$ denotes the conjugate of quaternion $\mathbf{q}(x)$. $\mathbf{X}$ denotes the transformation which maps an angular velocity $\boldsymbol{\omega}$ to a unit quaternion describing the resulting rotation within the sampling time $\Delta t$:

$$\mathbf{X}(\boldsymbol{\omega}) = \left( \cos\frac{\|\boldsymbol{\omega}\| \Delta t}{2},\; \sin\frac{\|\boldsymbol{\omega}\| \Delta t}{2}\, \frac{\boldsymbol{\omega}}{\|\boldsymbol{\omega}\|} \right) \qquad (14)$$
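The quaternion operations of equations (9)-(14) can be sketched as follows; the function names are ours, inputs are assumed to be NumPy arrays of length 4 (quaternions, scalar part first) or 3 (vectors), and the X map follows the reconstructed equation (14).

```python
import numpy as np

def quat_mul(q1, q2):
    """Quaternion product q1 * q2 with q = (v, u), as defined below equation (9)."""
    v1, u1 = q1[0], q1[1:]
    v2, u2 = q2[0], q2[1:]
    return np.concatenate(([v1 * v2 - np.dot(u1, u2)],
                           v1 * u2 + v2 * u1 + np.cross(u1, u2)))

def quat_conj(q):
    """Conjugate quaternion (v, -u)."""
    return np.concatenate(([q[0]], -q[1:]))

def X(omega, dt):
    """Map an angular velocity to the unit quaternion of the rotation
    accumulated over one sampling interval dt, equation (14)."""
    norm = np.linalg.norm(omega)
    if norm < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])   # no rotation
    half = 0.5 * norm * dt
    return np.concatenate(([np.cos(half)], np.sin(half) * omega / norm))
```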
The proposed controller simultaneously tracks the desired positions and orientations as well as the desired forces and torques. Force/torque adaptation requires low gains (matrices $\mathbf{K}_s$) for stable and robust operation. Thus, force adaptation is usually slow. To effectively minimize the force/torque error, we propose slowing down the trajectory execution using DMP slowdown feedback. For DMP phase stopping (Schaal et al., 2007), the original phase equation (5) is replaced with:

$$\tau \dot{x} = \frac{-\alpha_x x}{1 + \alpha_{px} e} \qquad (15)$$

where $e$ is the combined force/torque error,

$$e = \| \mathbf{F}_d - \mathbf{F} \| + \| \mathbf{M}_d - \mathbf{M} \| \qquad (16)$$
Note that in the case of large forces or torques, the error $e$ becomes large, which, in turn, makes the phase change $\dot{x}$ small. Thus, the phase evolution is stopped until the robot reduces the force/torque error (Nemec et al., 2013a).
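A minimal sketch of the phase update with stopping, following the reconstructed equation (15); the gain name alpha_px and its value are our assumptions.

```python
def phase_step(x, e, dt, tau, alpha_x=2.0, alpha_px=10.0):
    """One Euler step of equation (15): a large force/torque error e
    slows (effectively stops) the phase evolution."""
    dx = -alpha_x * x / (tau * (1.0 + alpha_px * e))
    return x + dx * dt
```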
During the execution of the demonstrated trajectory, the resulting position and orientation offsets are captured as functions of the phase variable $x$, which ensures that the sampling is independent of the trajectory duration (Nemec et al., 2013a). Thus, the trajectory is sampled exactly the same number of times as during the human demonstration, even when the phase is slowed down during the execution according to equation (15).
3.3 Trajectory adaptation
The goal of learning is to iteratively modify, in sequential steps, the positional and orientational parts of the demonstrated trajectory so that the robot-generated trajectory results in the same forces and torques as during the human demonstration. For this purpose, we apply the Iterative Learning Control (ILC) framework, where the control signal from the previous iteration cycle is reused in the next iteration (Bristow et al., 2006). For that, we simply calculate the new, updated trajectory using the control law [equations (8) and (9)], save the entire trajectory and calculate new DMP weights $w_{k,p}$ for the positional part $\mathbf{p}'_{\mathrm{DMP}}(x)$ and $w_{k,q}$ for the rotational part $\mathbf{q}'_{\mathrm{DMP}}(x)$, respectively. Subscript $(\cdot)_k$ refers to the $k$-th component of the position vector and quaternion, respectively.
The optimal weights $w_{k,p}$ are then computed by solving the following linear system of equations:

$$\sum_{i=1}^{N} w_{k,i}\, \frac{\Psi_i(x_j)\, x_j}{\sum_{l=1}^{N} \Psi_l(x_j)} = f_k(x_j), \qquad j = 0, \ldots, T \qquad (17)$$

where $\Psi_i$ are the Gaussian kernel functions from equation (7) and the target values $f_k(x_j)$ are the forcing terms recovered through equations (3)-(4) from the signals $p_k$ sampled from equation (8). Similarly, we also compute the weights $w_{k,q}$. The reference position vectors $\mathbf{p}'_{\mathrm{DMP}}(x)$ and quaternions $\mathbf{q}'_{\mathrm{DMP}}(x)$ are then computed by integrating equations (3)-(5). Note that the quaternion part of the trajectory $\mathbf{q}'_{\mathrm{DMP}}$ has to be normalized before being used in equation (9). Attention should also be paid when sampling the signals from equations (8) and (9). They should not be sampled at each control cycle, because the phase variable is modified according to equation (15). Instead, they have to be sampled exactly at $x_i = \exp(-\alpha_x t_i)$, $i = 0, \ldots, T$.
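As an illustration, the weight refit of equation (17) can be solved in a least-squares sense as sketched below; the function name, the numerical differentiation and the regressor construction are our assumptions, kept consistent with equations (3)-(6).

```python
import numpy as np

def refit_dmp_weights(t, p, g, tau, c, h, alpha_z=48.0, alpha_x=2.0):
    """ILC update step: re-encode an adapted single-df trajectory p(t_i),
    sampled at the original phase values, as new DMP weights by solving
    the linear system of equation (17) in a least-squares sense."""
    beta_z = alpha_z / 4.0
    pd = np.gradient(p, t)                    # velocity estimate
    pdd = np.gradient(pd, t)                  # acceleration estimate
    x = np.exp(-alpha_x * t / tau)            # phase at the original samples, eq. (5)
    # forcing term that reproduces the sampled trajectory, from eqs. (3)-(4)
    f = tau ** 2 * pdd - alpha_z * (beta_z * (g - p) - tau * pd)
    Psi = np.exp(-h[None, :] * (x[:, None] - c[None, :]) ** 2)   # eq. (7)
    A = Psi * (x / (Psi.sum(axis=1) + 1e-10))[:, None]           # rows of eq. (17)
    w, *_ = np.linalg.lstsq(A, f, rcond=None)
    return w
```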
3.4 Control scheme
Figure 3 shows the basic implementation scheme for the proposed learning control. In this scheme, $\{\mathbf{p}_i, \mathbf{q}_i\}$ [equation (1)] denotes the original demonstrated trajectory, which is encoded by DMPs. $(\mathbf{p}'_{\mathrm{DMP}}, \mathbf{q}'_{\mathrm{DMP}})$ refers to the output signal obtained by integrating equations (3)-(5) and applying the workpiece displacement $(\Delta \mathbf{t}_w, \Delta \mathbf{q}_w)$ as estimated by vision, which occurs in block $T_1$ [equations (10) and (11)]. Note that this input is calculated and used only in the first learning cycle. $\mathbf{F}_d$ [equation (2)] denotes the forces and torques captured during the execution of the demonstrated trajectory. Due to the noise induced by the errors in the vision system, robot tracking errors and imperfect grasping, the measured forces/torques $\mathbf{F}$, $\mathbf{M}$ differ from the desired demonstrated forces $\mathbf{F}_d$ [equation (2)]. Our goal is to minimize this error. In block $T_2$, the error of equations (12) and (13) is transformed into robot base coordinates. Using the admittance control of equations (8) and (9), the new reference trajectory is computed. Motor commands $(\mathbf{p}_r, \mathbf{q}_r)$ are given as the aggregation of the DMP-generated trajectory and the force feedback of equations (8) and (9). In the following learning cycles, the reference trajectory from the previous learning cycle is used to calculate $(\mathbf{p}'_{\mathrm{DMP}}, \mathbf{q}'_{\mathrm{DMP}})$ using equations (17) and (3)-(7), instead of the reference trajectory as calculated in the first cycle. The whole procedure is repeated until the measured forces match the desired forces.
4. Exception strategies
The proposed approach is robust to small to moderate deviations of the workpiece positions and dimensions, but it can fail at large deviations. We investigated human behavior in such situations and found that humans mostly interrupt the pre-learned actions and perform additional actions, called exception strategies. Exception strategies are generally triggered together with phase stopping at high force/torque errors. The offset learned by the exception strategies is added to the offset learned with the iterative learning framework. Possible jumps in the trajectory are automatically filtered out by applying the DMP framework to the offset encoding. Exception strategies can be either deterministic or stochastic. When sufficient a priori knowledge of the process/environment is available, deterministic strategies are generally faster than stochastic ones. A stochastic exception strategy, which performs random motions in small increments, does not require any prior knowledge and is more robust to sensor noise. As an example, we will describe the stochastic exception strategy, which is used to compensate for the error in the initial pose of the workpiece. These errors are mostly due to the vision subsystem and are normally more evident at the extremes of the workspace. In other words, due to calibration errors and distortion, the pose estimation is less accurate at the boundaries.
The main uncertainties in our system are related to vision. We use a Kinect sensor for acquiring 3D scene data, and we apply an existing system (Buch et al., 2013) for estimating the poses of the known objects based on point cloud data. For the objects shown in Figure 1, we have the corresponding CAD models, and these are resampled to point clouds for use by the pose estimation algorithm.
The primary source of uncertainties is the internal calibration of the Kinect sensor. It has been shown recently (Herrera et al., 2012) that the depth error for this sensor is up to 5 mm at a distance of 1 m.
The second source of errors lies in the external calibration between the Kinect and the robot. We performed the external hand-eye calibration by a manual process where the robot is placed in several random configurations, followed by a measurement of the position of a known fixation point at the tool center point in both the robot base frame (using forward kinematics) and in the camera sensor frame (using a graphical interface which allows the user to mark the fixation point in the point cloud). From the corresponding position pairs, the relative transformation between the frames can be estimated by the singular value decomposition of the point covariance matrix (Kabsch, 1976). Although this method provides an optimal estimate, the result is still greatly affected by biases and noise in the measurements.
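A compact sketch of this Kabsch-style estimate, assuming the corresponding points are stored as (N, 3) NumPy arrays; the function name and array layout are ours.

```python
import numpy as np

def rigid_transform(P_robot, P_camera):
    """Estimate rotation R and translation t such that
    P_robot ≈ R @ p + t for corresponding points p in P_camera,
    via SVD of the point covariance matrix (Kabsch, 1976)."""
    cr, cc = P_robot.mean(axis=0), P_camera.mean(axis=0)
    H = (P_camera - cc).T @ (P_robot - cr)      # 3x3 point covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # correct for a possible reflection so that det(R) = +1
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cr - R @ cc
    return R, t
```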
Finally, the pose estimation system introduces errors. Although pose refinement (Besl and McKay, 1992) is applied to each pose estimate, the accuracy of the end result is still affected by the sensor noise. Even for an optimal alignment, the pose estimation can thus still produce a translation error in the range of the sensor noise, i.e. up to 5 mm at a distance of 1 m from the sensor.
To handle all these uncertainties in a principled manner, and to increase the success rate of the algorithm, we introduced an exception strategy into the algorithm. This exception strategy comprises two phases:
Verification phase: This phase is divided into the following steps:
Approaching: Move the robot tool down along the global Z coordinate using implicit force control to detect the surface of the baseplate and overcome any pose estimation errors in the Z coordinate.
Verification: Due to the limited precision of the vision system, we never know whether the robot is at the required starting position above the hole, which is required for a successful accomplishment of the PiH operation. Therefore, the first step should be the verification of the required starting pose. Because we cannot rely on vision, we use force sensing instead. Verification is executed in the following steps, where we assume that the workpiece lies in the X-Y plane (Figure 4):
establish contact with the environment using the hybrid force-torque control law [equations (19) and (20)] by prescribing the desired force in the Z direction;
perform a linear motion in the global X coordinate starting from the current point with displacements of d_x and −d_x;
determine the minimum of the Z coordinate and move to that point;
perform a linear motion in the global Y coordinate with displacements of d_y and −d_y; and
determine the minimum of the Z coordinate. If a minimum exists, we have hit the hole, and the resulting pose p_p corresponds to the pose of maximal penetration of the peg into the hole.
We might not find a minimum in the Z coordinate during this verification procedure; note also that the maximal penetration point p_p does not correspond to the center of the hole. If the verification procedure does not succeed in detecting the maximal penetration point p_p, we proceed with the following hole-search exception strategy.
Searching phase (Figure 5): A stochastic search algorithm is used to detect the hole. This algorithm makes small random increments ε in the X and Y coordinates. The values of ε are drawn randomly from within a circle of a specific diameter d; in the experiments, this diameter was 1.5 cm:

$$\mathbf{p}_{j+1} = \mathbf{p}_j + [\varepsilon_X,\; \varepsilon_Y,\; 0]^{\mathrm{T}}, \qquad \left\| [\varepsilon_X,\; \varepsilon_Y] \right\| \leq d/2 \qquad (18)$$
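A minimal sampler for such increments, drawing from a disc of 1.5 cm diameter; the area-uniform sqrt trick is our choice, since the paper only specifies the circle.

```python
import numpy as np

def random_increment(diameter=0.015):
    """Random (eps_X, eps_Y) increment inside a circle of the given
    diameter, as in equation (18); units are meters."""
    r = 0.5 * diameter * np.sqrt(np.random.rand())   # sqrt -> uniform over the disc area
    phi = 2.0 * np.pi * np.random.rand()
    return r * np.cos(phi), r * np.sin(phi)
```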
In the searching phase, linear DMPs are used to encode the trajectories between these random positions. Linear DMPs imply that $f(x) = 0$ in equation (3). During the execution of these trajectories, force feedback is used to update the learned trajectory so as to maintain a constant contact force with the surface of the baseplate. The end-effector velocity is calculated by:

$$\mathbf{v}_r = \mathbf{S}_v \mathbf{v}_v + \mathbf{K}_F \mathbf{S}_F \left( \mathbf{F}_m - \mathbf{F}_d \right) \qquad (19)$$
where $\mathbf{v}_r$ stands for the resolved velocity vector, $\mathbf{S}_v$ for the velocity selection matrix, $\mathbf{v}_v$ for the desired velocity vector, $\mathbf{K}_F$ for the force gain matrix, $\mathbf{S}_F$ for the force selection matrix, $\mathbf{F}_m$ for the measured force, and $\mathbf{F}_d$ denotes the force offset which determines the behavior of the robot when it is not in contact with the environment. To get the desired positions, we use:

$$\mathbf{y}(t) = \mathbf{y}_r + \int_0^t \mathbf{v}_r(s)\, ds \qquad (20)$$
where $\mathbf{y}_r$ is the desired initial position/orientation and $\mathbf{y}$ is the actual position/orientation. Using this approach, we can modify the trajectory of the learned movement online by updating the selected dfs. The matrices $\mathbf{S}_v$ and $\mathbf{S}_F$ are often diagonal and are used to select the dfs that are updated using force feedback. This leads to what is known as hybrid control. In a hybrid control scheme, the manipulator is required to follow the environment surface at a given speed while maintaining a desired contact force $\mathbf{F}_d$. Such control schemes can be written with a proper selection of the vector $\boldsymbol{\sigma}$, for example, $\boldsymbol{\sigma} = [1, 1, 1, 0, 0, 0]^{\mathrm{T}}$, which denotes that the translational dfs of a 6-df manipulator are trajectory-controlled, while the rotations are force-controlled. Therefore, equation (19) becomes:

$$\mathbf{v}_r = \mathrm{diag}(\boldsymbol{\sigma})\, \mathbf{v}_v + \mathbf{K}_F \left( \mathbf{I} - \mathrm{diag}(\boldsymbol{\sigma}) \right) \left( \mathbf{F}_m - \mathbf{F}_d \right) \qquad (21)$$
Note that the velocity selection matrix $\mathbf{S}_v$ and the force selection matrix $\mathbf{S}_F$ are complementary: $\mathbf{S}_F = \mathbf{I} - \mathbf{S}_v$. Technically, this means that a single df cannot be simultaneously force- and trajectory-controlled.
The force gain matrix $\mathbf{K}_F$ from equation (19), multiplied by the measured force (or the measured force minus the desired force), gives the end-effector velocity. Physically, the force gain matrix can be compared to the stiffness matrix (usually denoted by $\mathbf{K}$), which describes how much the environment or the robot deforms/moves when a force is applied to it. The stiffness and the compliance matrix (usually denoted by $\mathbf{C}$) are related by $\mathbf{C} = \mathbf{K}^{-1}$.
For the case of wiping a flat surface, where only the height (the Z direction) needs to be updated, the equations simplify considerably. Using $\mathbf{S}_v = \mathrm{diag}(1, 1, 0, 1, 1, 1)$, $\mathbf{K}_F = \mathrm{diag}(0, 0, k_F, 0, 0, 0)$ and $\mathbf{S}_F = \mathrm{diag}(0, 0, 1, 0, 0, 0)$, the desired end-effector height $z$ at each discrete time step $\Delta t$ becomes:

$$z(t + \Delta t) = z(t) + k_F \left( F_Z - F_d \right) \Delta t \qquad (22)$$
$$z(0) = z_0 \qquad (23)$$
Here, $z_0$ is the starting height, $k_F$ is the force gain [kg/s], $F_Z$ is the measured force in the Z direction and $F_d$ is the desired force with which the robot needs to press on the object. Such a formulation of the movement ensures a constant movement in the $-Z$ direction or constant contact when an object is encountered.
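The hybrid law of equations (19)-(23) can be sketched as follows; the selection matrices mirror the wiping example, while the function names and the k_F and dt magnitudes are illustrative assumptions.

```python
import numpy as np

def hybrid_velocity(v_v, F_m, F_d, S_v, S_F, K_F):
    """Resolved end-effector velocity, equation (19): selected dfs track
    the desired velocity, the complementary dfs regulate contact force."""
    return S_v @ v_v + K_F @ S_F @ (F_m - F_d)

# Flat-surface wiping specialization: only the Z translation is force-controlled.
S_v = np.diag([1.0, 1.0, 0.0, 1.0, 1.0, 1.0])
S_F = np.eye(6) - S_v                          # complementary selection matrices
K_F = np.diag([0.0, 0.0, 0.002, 0.0, 0.0, 0.0])   # force gain k_F [kg/s] on Z only

def wipe_z_step(z, F_Z, F_d, k_F=0.002, dt=0.002):
    """Discrete height update of equation (22): with no contact (F_Z = 0)
    the tool moves steadily in -Z until the contact force builds up to F_d."""
    return z + k_F * (F_Z - F_d) * dt
```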
5. Experimental evaluation
The task was to evaluate the success rate of the insertion of the square and round pegs into the corresponding holes of the Cranfield benchmark with and without using exception strategies. Trajectories were obtained as described in Section 2. We executed the trajectory for different randomly selected positions of the base plate. The measured trajectory data were translated and rotated into the new configurations using the pose of the base plate as estimated by vision. In this scenario, the uncertainties come from vision. Therefore, the resulting forces during the execution of the demonstrated trajectory may exceed the expected ones.
The learning procedure was implemented in Matlab, which communicated with the Kuka LWR controller using the Fast Research Interface (Schreiber et al., 2010). The Kuka LWR is capable of measuring joint torques and can switch to Cartesian compliance control. The experiment comprised eight groups, each containing 50 trial executions of the PiH algorithm (Tables I and II). In each trial, the base plate was randomly placed at a different location within a 0.5 × 0.2 m area of the table inside the robot's workspace. The poses of the base plate were estimated by vision. From the tables, it can be clearly seen that the exception strategy significantly improved the success ratio, especially in the case of the square and tight round pegs. Executions involving the exception strategy took on average 25 seconds longer than the standard ones. The insertion time is always greater than or equal to the duration of the demonstrated trajectory. In the case of the shaft, the demonstrated trajectory time was 5 seconds, while the average insertion time in the experiments was 6.7 seconds. After adaptation, this time decreases toward the demonstrated one.
Figure 6 graphically shows the improvement of the success ratio of the PiH insertion for the different objects of the Cranfield benchmark.
Figure 7 shows the offset between the actual pose of the hole in the baseplate (indicated by a red circle) and the pose estimated by vision. The segments represent the Euclidean distance between the pose of the robot tip and the pose of the hole. As can be seen from the figure, the offsets are more evident at the extremes of the workspace.
Further experiments were executed on the Kuka LWR robot arm to test the performance of the algorithm when phase stopping is disabled. In these experiments, the shaft and round peg were used with the same setup as in the first and second groups of Table I. It can be clearly observed that applying phase stopping improves the reliability of our approach: the success rate without phase stopping was 66 per cent, whereas a 100 per cent success rate was achieved by enabling phase stopping and integrating the hole-search exception strategy into the system.
Additional experiments were conducted on the Universal Robots UR5, which uses a high-gain, non-compliant controller. The learning procedure was implemented in C++ and linked to the robot using ROS (Quigley et al., 2009). From Tables III and IV, it can be clearly seen that the exception strategy significantly improved the success ratio, especially in the case of the square and tight round pegs.
Figure 8 graphically shows the improvement of the success ratio of the PiH insertion for the different objects of the Cranfield benchmark. From the figure and tables, we can see that the success ratio obtained with the Kuka robot is slightly higher than the one obtained with the UR5. This is due to several factors: the Kuka gripper provides a fixed and therefore more stable grasp than the SCHUNK SDH hand. Moreover, the UR5 is a stiff robot, which implies that in some cases the peg got stuck and, because of the contact forces, moved inside the hand before the controller had enough time to apply corrections.
6. Conclusion
In this paper, we show how to integrate force-based programming by demonstration and exception strategies to adapt tasks in contact with the environment. We start with a human demonstration of the task because humans can effectively perform tasks that involve force feedback. Thus, their performance can be taken as a baseline for robot performance. The effectiveness of the approach was demonstrated on tasks that cannot be performed without force feedback, such as peg insertion. Robust adaptation and exception strategies are essential for such tasks because even small errors can cause the robot execution to fail. The robot first adapts its movement in such a way that the adapted movement generates the same forces and torques as during the demonstration. The core of our proposed strategy is not to develop a different strategy for every shape of object that needs to be inserted. Instead, we developed a general strategy that relies on the demonstrated policy, which includes positions/orientations and forces/torques. To overcome the abovementioned uncertainties, exception strategies were developed and integrated into the algorithm. The developed exception strategy is stochastic and performs random motions in small increments. This strategy does not require any prior knowledge and is robust to force sensor noise.
The experimental evaluation of the proposed approach includes the use of a 3D vision system and workpieces of different geometries and different numbers of holes. Our experimental results show that the developed system can effectively learn complex manipulation tasks that involve contact with the environment. New manipulation behaviors can quickly be learned by human demonstration and later adapted to different configurations of the workspace. The flexibility of the approach is demonstrated by training a variety of assembly tasks involving different workpiece geometries and different numbers of holes. Figures 6 and 8 show that the success ratio of the PiH operation has been significantly improved by introducing the exception strategies. The proposed strategy is robust and can be used in industrial assembly applications. The tests that we conducted show the hardware independence of the proposed strategy. In this way, we demonstrated the robustness and reliability of the proposed method.
The majority of existing algorithms try to solve the PiH problem without learning. Note that in the first learning cycle, our algorithm is similar to most classical strategies, with the exception that we apply the phase stopping strategy, which substantially raises the success rate. After that, our algorithm improves over the learning cycles, which is evident from the results. Therefore, it can be concluded that the existing algorithms are at most as good as our algorithm in its first cycle.
In our future work, we will investigate the possibility of accelerating the demonstrated assembly policy, which is a challenging task: accelerated movements also change the interaction forces/torques, which have to be adjusted accordingly. Moreover, a thorough investigation of the influence of the various vision-related uncertainties on the resulting PiH success rate could be carried out.
About the authors
Fares J. Abu-Dakka received a BSc in Mechanical Engineering from Birzeit University, Palestine, in 2003, and an MSc and PhD in manipulator motion planning from the Polytechnic University of Valencia (UPV), Spain, in 2011. He was awarded an FPI Fellowship from the UPV and a Research Fellowship Grant from the Biomechanical Institute of Valencia. His research activities include robot motion planning, robot dynamics, biomechanical research, robot control and learning. Fares J. Abu-Dakka is the corresponding author and can be contacted at: [email protected]
Bojan Nemec is Senior Research Associate at Humanoid and Cognitive Robotics Lab, Department of Automatics, Biocybernetics and Robotics, Jožef Stefan Institute. He received BSc, MSc and PhD from the University of Ljubljana. He spent his sabbatical leave at the Institute for Real-Time Computer Systems and Robotics, University of Karlsruhe. His research interests include robot control, robot learning, sensor-guided control, service robots and biomechanical measurements in sport.
Aljaž Kramberger received his master's degree in Mechatronics from the Faculty of Mechanical Engineering at the University of Maribor, Slovenia, in 2013. He is currently a PhD student at the Faculty of Electrical Engineering, University of Ljubljana. He works as a researcher at the Department of Automation, Biocybernetics and Robotics at the Jožef Stefan Institute, Ljubljana. His research interests include force-based robot interaction with the environment and humanoid robotics.
Anders Glent Buch received his MSc from the Maersk Mc-Kinney Moller Institute, University of Southern Denmark, in 2010. He is currently pursuing his PhD on efficient methods for 3D shape-based matching. During the course of this, he has been a visiting researcher at Tampere University of Technology, Finland. He has published in notable conferences within the fields of computer vision and robotics. His research activities include computer vision, 3D perception, computational geometry and robot control.
Norbert Krüger is a Professor at the Mærsk McKinney Møller Institute, University of Southern Denmark. He holds an MSc from the Ruhr-Universität Bochum, Germany, and a PhD from the University of Bielefeld. He leads the Cognitive Vision Lab, which focuses on computer vision and cognitive systems, in particular the learning of object representations in the context of grasping.
Aleš Ude received a degree in Applied Mathematics from the University of Ljubljana, Slovenia, and a PhD from the Faculty of Informatics, University of Karlsruhe, Germany. He was awarded the Science and Technology Agency fellowship for postdoctoral studies in the ERATO Kawato Dynamic Brain Project, Japan. He is currently the head of the Humanoid and Cognitive Robotics Lab at the Jožef Stefan Institute, Ljubljana. His research interests include autonomous robot learning, imitation learning, humanoid robot vision and humanoid robotics in general.
References
Argall, B.D., Chernova, S., Veloso, M. and Browning, B. (2009), "A survey of robot learning from demonstration", Robotics and Autonomous Systems, Vol. 57 No. 5, pp. 469-483.
Besl, P.J. and McKay, N.D. (1992), "A method for registration of 3-D shapes", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 14 No. 2, pp. 239-256.
Bristow, D.A., Tharayil, M. and Alleyne, A.G. (2006), "A survey of iterative learning control", IEEE Control Systems Magazine, Vol. 26 No. 3, pp. 96-114.
Bruyninckx, H., Dutre, S. and De Schutter, J. (1995), "Peg-on-hole: a model based solution to peg and hole alignment", IEEE International Conference on Robotics and Automation, Nagoya, pp. 1919-1924.
Buch, A.G., Kraft, D., Kamarainen, J.-K., Petersen, H.G. and Krüger, N. (2013), "Pose estimation using local structure-specific shape and appearance context", IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe.
Collins, K., Palmer, A.J. and Rathmill, K. (1985), "The development of a European benchmark for the comparison of assembly robot programming systems", in Rathmill, K., MacConaill, P., O'Leary, S. and Browne, J. (Eds), Robot Technology and Applications, Springer-Verlag, New York, NY.
Giordano, P.R., Stemmer, A., Arbter, K. and Albu-Schäffer, A. (2008), "Robotic assembly of complex planar parts: an experimental evaluation", IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, pp. 3775-3782.
Hamner, B., Koterba, S.K., Shi, J., Simmons, R. and Singh, S. (2010), "An autonomous mobile manipulator for assembly tasks", Autonomous Robots, Vol. 28 No. 1, pp. 131-149.
Herrera, C.D., Kannala, J. and Heikkilä, J. (2012), "Joint depth and color camera calibration with distortion correction", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34 No. 10, pp. 2058-2064.
Hirana, K., Suzuki, T. and Okuma, S. (2002), "Optimal motion planning for assembly skill based on mixed logical dynamical system", 7th International Workshop on Advanced Motion Control, Maribor, pp. 359-364.
Ijspeert, A.J., Nakanishi, J., Hoffmann, H., Pastor, P. and Schaal, S. (2013), "Dynamical movement primitives: learning attractor models for motor behaviors", Neural Computation, Vol. 25 No. 2, pp. 328-373.
Kabsch, W. (1976), "A solution for the best rotation to relate two sets of vectors", Acta Crystallographica Section A, Vol. 32 No. 5, pp. 922-923.
Kalakrishnan, M., Righetti, L., Pastor, P. and Schaal, S. (2011), "Learning force control policies for compliant manipulation", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), San Francisco, CA, pp. 4639-4644.
Kormushev, P., Calinon, S. and Caldwell, D.G. (2011), "Imitation learning of positional and force skills demonstrated via kinesthetic teaching and haptic input", Advanced Robotics, Vol. 25 No. 5, pp. 581-603.
Nemec, B., Abu-Dakka, F., Rytz, J.A., Savarimuthu, T.R., Ridge, B., Krüger, N., Petersen, H.G., Jouffroy, J. and Ude, A. (2013a), "Transfer of assembly operations to new workpiece poses by adaptation to the desired force profile", 16th International Conference on Advanced Robotics (ICAR), Montevideo.
Nemec, B., Gams, A. and Ude, A. (2013b), "Velocity adaptation for self-improvement of skills learned from user demonstrations", 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids), Atlanta, GA, pp. 423-428.
Newman, W., Branicky, M., Podgurski, H., Chhatpar, S., Huang, L., Swaminathan, J. and Zhang, H. (1999), "Force-responsive robotic assembly of transmission components", IEEE International Conference on Robotics and Automation, Detroit, MI, pp. 2096-2102.
Pastor, P., Righetti, L., Kalakrishnan, M. and Schaal, S. (2011), "Online movement adaptation based on previous sensor experiences", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), San Francisco, CA, pp. 365-371.
Quigley, M., Gerkey, B., Conley, K., Faust, J., Foote, T., Leibs, J., Berger, E., Wheeler, R. and Ng, A. (2009), "ROS: an open-source robot operating system", ICRA Workshop on Open Source Software, Kobe.
Rozo, L., Jiménez, P. and Torras, C. (2013), "A robot learning from demonstration framework to perform force-based manipulation tasks", Intelligent Service Robotics, Vol. 6 No. 1, pp. 33-51.
Schaal, S., Mohajerian, P. and Ijspeert, A. (2007), "Dynamics systems vs optimal control - a unifying view", Progress in Brain Research, Vol. 165 No. 1.
Schreiber, G., Stemmer, A. and Bischoff, R. (2010), "The fast research interface for the KUKA lightweight robot", ICRA Workshop on Innovative Robot Control Architectures for Demanding (Research) Applications - How to Modify and Enhance Commercial Controllers, Anchorage, AK.
Ude, A., Gams, A., Asfour, T. and Morimoto, J. (2010), "Task-specific generalization of discrete and periodic dynamic movement primitives", IEEE Transactions on Robotics, Vol. 26 No. 5, pp. 800-815.
Villani, L. and De Schutter, J. (2008), "Force control", in Siciliano, B. and Khatib, O. (Eds), Springer Handbook of Robotics, Springer-Verlag, Berlin, Heidelberg.
Xiao, J. (1997), "Goal-contact relaxation graphs for contact-based fine motion planning", IEEE International Symposium on Assembly and Task Planning, Marina del Rey, CA, pp. 25-30.
Yun, S.-K. (2008), "Compliant manipulation for peg-in-hole: is passive compliance a key to learn contact motion?", IEEE International Conference on Robotics and Automation, Pasadena, CA, pp. 1647-1652.
Acknowledgements
The research leading to these results has received funding from the European Community’s Seventh Framework Programme FP7/2007-2013 (Specific Programme Cooperation, Theme 3, Information and Communication Technologies) under grant agreement no. 269959, IntellAct, and no. 600578, ACAT.