TY - CHAP
T1 - Mobile robot motion control from demonstration and corrective feedback
AU - Argall, Brenna D.
AU - Browning, Brett
AU - Veloso, Manuela M.
PY - 2010/1/19
Y1 - 2010/1/19
N2 - Robust motion control algorithms are fundamental to the successful, autonomous operation of mobile robots. Motion control is known to be a difficult problem and is often dictated by a policy, or state-action mapping. In this chapter, we present an approach for the refinement of mobile robot motion control policies that incorporates corrective feedback from a human teacher. The target application domain of this work is the low-level motion control of a mobile robot. Within such domains, the rapid sampling rate and continuous action space of policies are both key challenges to providing policy corrections. To address these challenges, we contribute advice-operators as a corrective feedback form suitable for providing continuous-valued corrections, and Focused Feedback for Mobile Robot Policies (F3MRP) as a framework suitable for providing feedback on policies sampled at a high frequency. Under our approach, policies refined through teacher feedback are initially derived using Learning from Demonstration (LfD) techniques, which generalize a policy from example task executions by a teacher. We apply our techniques within the Advice-Operator Policy Improvement (A-OPI) algorithm, validated on a Segway RMP robot within a motion control domain. A-OPI refines LfD policies by correcting policy performance via advice-operators and F3MRP. Within our validation domain, policy performance is found to improve with corrective teacher feedback, and moreover to be similar or superior to that of policies provided with more teacher demonstrations.
UR - http://www.scopus.com/inward/record.url?scp=74049165048&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=74049165048&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-05181-4_18
DO - 10.1007/978-3-642-05181-4_18
M3 - Conference contribution
AN - SCOPUS:74049165048
SN - 9783642051807
T3 - Studies in Computational Intelligence
SP - 431
EP - 450
BT - From Motor Learning to Interaction Learning in Robots
A2 - Sigaud, Olivier
A2 - Peters, Jan
ER -