Learning to Control with π × sel

PiXSel is a framework for learning to control unknown systems in a single episode.
Click a link or GIF to launch an interactive demo in a new tab. Please keep the tab open until the simulation is fully loaded, which can take up to 2 minutes depending on server load.

Dimitar Ho
Computing and Mathematical Sciences Department, Caltech
dho@caltech.edu

Learning to fly on-the-fly

We are given a quadrotor with no instructions on how to fly it, not even a stable hovering control policy. The quadrotor is thrown 10 meters into the air, and the learning algorithm must quickly learn how to stabilize it mid-air and fly it to the target position (yellow box) before it crashes back down to the ground.

To run PiXSel live on a random problem instance, click here. (soon)

Transfer learning by design via consistent model chasing

A PiXSel learning algorithm is given models and control policies for the inverted pendulum task, but is then asked to learn the mountain car task instead. The algorithm succeeds by interpreting the mountain car as a consistent pendulum model, i.e., one that matches the behavior it observes, which lets it transfer its pendulum expertise to the new task. The simulation is based on the inverted pendulum and mountain car tasks from the OpenAI Gym and showcases the learning process.
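
The idea behind this demo can be summarized as consistent model chasing: the learner keeps a set of candidate models, each paired with a control policy, applies the policy of a model that is consistent with the data observed so far, and switches whenever that model is falsified. The Python sketch below illustrates this selection loop on a toy scalar system; the candidate set, gains, noise level, and tolerance are illustrative assumptions and not part of PiXSel itself.

```python
import numpy as np

# Toy sketch of consistent model chasing on a scalar linear system
#   x_{t+1} = a*x_t + b*u_t + noise,   with (a, b) unknown to the learner.
# The candidates, gains, noise level, and tolerance are illustrative
# assumptions, not the actual PiXSel implementation.

rng = np.random.default_rng(0)

# Hypothetical candidate models, each paired with a gain u = -k*x
# designed as if that candidate were the true system.
candidates = [
    {"a": 1.2, "b": 0.5, "k": 1.0},
    {"a": 0.8, "b": 1.0, "k": 0.5},
    {"a": 1.5, "b": 1.0, "k": 1.2},   # the true system in this example
]
true_a, true_b = 1.5, 1.0
noise, tol = 0.01, 0.1

def consistent(model, history):
    """A model is consistent if it explains every observed transition."""
    return all(
        abs(x_next - (model["a"] * x + model["b"] * u)) <= tol
        for x, u, x_next in history
    )

x, history, current = 5.0, [], candidates[0]
for t in range(30):
    # Chase: if the current model contradicts the data, switch to any
    # candidate that is still consistent with everything observed so far.
    if not consistent(current, history):
        current = next(m for m in candidates if consistent(m, history))
    u = -current["k"] * x                      # play that model's policy
    x_next = true_a * x + true_b * u + noise * rng.standard_normal()
    history.append((x, u, x_next))
    x = x_next
    print(f"t={t:2d}  x={x:+.3f}  model=(a={current['a']}, b={current['b']})")
```

In this toy run the learner starts with a wrong candidate, falsifies it after a single observed transition, and then settles on the candidate that explains the data, whose policy stabilizes the system. This is only meant to convey the flavor of selecting among consistent models during a single episode.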

Efficient exploration under safety constraints

PiXSel is tested on a challenging instance of the cart-pole problem that requires learning to swing up the pole while keeping the cart away from the "unsafe" red region. The task is difficult because exploration, necessary for learning, conflicts with safety constraints: attempting a swing-up requires large motions that bring the system close to the unsafe region. The simulation shows that PiXSel effectively balances exploration and safety while efficiently learning the task.

To run PiXSel live on a random problem instance, click here. (soon)