Learning to Act from Actionless Video through Dense Correspondences
Po-Chen Ko, Jiayuan Mao, Yilun Du, Shao-Hua Sun, Joshua B. Tenenbaum
We present a framework for learning robot policies from images that requires no action annotations: the policy is trained solely on RGB videos and is effective for table-top manipulation tasks and navigation. Our framework also enables rapid modeling, with training completing in a day on as few as 4 GPUs.