Porichis, Antonios and Inglezou, Myrto and Kegkeroglou, Nikolaos and Mohan, Vishwanathan and Chatzakos, Panagiotis (2024) Imitation learning from a single demonstration leveraging vector quantization for robotic harvesting. Robotics, 13 (7). p. 98. DOI https://doi.org/10.3390/robotics13070098
Abstract
The ability of robots to tackle complex non-repetitive tasks will be key in bringing a new level of automation to agricultural applications that still involve labor-intensive, menial, and physically demanding activities due to high cognitive requirements. Harvesting is one such example, as it requires a combination of motions which can generally be broken down into a visual servoing phase and a manipulation phase, with the latter often being straightforward to pre-program. In this work, we focus on the task of fresh mushroom harvesting, which is still conducted manually by human pickers due to its high complexity. A key challenge is to enable harvesting with low-cost hardware and mechanical systems, such as soft grippers, which present additional challenges compared to their rigid counterparts. We devise an Imitation Learning model pipeline utilizing Vector Quantization to learn quantized embeddings directly from visual inputs. We test this approach in a realistic environment designed based on recordings of human experts harvesting real mushrooms. Our models can control a Cartesian robot with a soft, pneumatically actuated gripper to successfully replicate the mushroom outrooting sequence. We achieve 100% success in picking mushrooms among distractors with less than 20 min of data collection comprising a single expert demonstration and auxiliary, non-expert trajectories. The entire model pipeline requires less than 40 min of training on a single A4000 GPU and approx. 20 ms for inference on a standard laptop GPU.
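For readers unfamiliar with the term, the core step of vector quantization can be sketched as a nearest-neighbor lookup: a continuous embedding is replaced by the closest entry in a finite codebook. The snippet below is only a generic illustration and is not taken from the paper; the codebook values, the two-dimensional embedding, and the function names `squared_distance` and `quantize` are all hypothetical, and the way the authors learn the codebook from visual inputs is not described in this record.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    embedding: Sequence[float], codebook: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Return the index and value of the codebook entry closest to `embedding`."""
    index = min(
        range(len(codebook)),
        key=lambda i: squared_distance(embedding, codebook[i]),
    )
    return index, list(codebook[index])


# Hypothetical 4-entry codebook and a 2-D embedding (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
embedding = [0.9, 0.2]

index, quantized = quantize(embedding, codebook)
print(index, quantized)  # prints: 1 [1.0, 0.0]
```

In a learned model the codebook entries would be trained jointly with the encoder rather than fixed by hand as they are here.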
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | imitation learning; learning by demonstration; vector quantization; mushroom harvesting; visual servoing |
| Divisions: | Faculty of Science and Health; Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
| SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
| Depositing User: | Unnamed user with email elements@essex.ac.uk |
| Date Deposited: | 12 Aug 2024 10:32 |
| Last Modified: | 30 Oct 2024 21:38 |
| URI: | http://repository.essex.ac.uk/id/eprint/38787 |
Available files
Filename: robotics-13-00098-v2.pdf
Licence: Creative Commons: Attribution 4.0