Enrique Munoz de Cote, E. O. Garcia, E. Morales
In machine learning, learning a task is expensive (many training samples are needed) and it is therefore of general interest to be able to reuse knowledge across tasks. This is the case in aerial robotics applications, where an autonomous aerial robot cannot interact with the environment hazard free. Prototype generation is a well known technique commonly used in supervised learning to help reduce the number of samples needed to learn a task. However, little is known about how such techniques can be used in a reinforcement learning task. In this work we propose an algorithm that, in order to learn a new (target) task, first generates new samples—prototypes—based on samples acquired previously in a known (source) task. The proposed approach uses Gaussian processes to learn a continuous multidimensional transition function, rendering the method capable of reasoning directly in continuous (states and actions) domains. We base the prototype generation on a careful selection of a subset of samples from the source task (based on known filtering techniques) and transforming such samples using the (little) knowledge acquired in the target task. Our experimental evidence gathered in known reinforcement learning benchmark tasks, as well as a challenging quadcopter to helicopter transfer task, suggests that prototype generation is feasible and, furthermore, that the filtering technique used is not as important as a correct transformation model.