CHEN, TAO (2023) Object Pose Estimation and Tracking with Deep Learning for Robot Manipulation. Doctoral thesis, University of Essex.
Abstract
Perceiving the 6D pose of an object is a long-standing problem. It plays a crucial role in several areas of robotics, such as object manipulation, grasping, unmanned aerial vehicles and autonomous vehicles. Although researchers have proposed various algorithms to address this problem, such as template matching and point pair features, some of these existing pose estimation algorithms have limitations, and in this thesis we utilise deep learning techniques to overcome them. Specifically, we investigate two different tasks in perceiving the orientation and translation of an object in 3D space: pose estimation from single images, and pose tracking from video sequences. For pose estimation from a single image, we introduce a novel channel-spatial attention network, which learns representative features from RGB-D images. Although some supervised Convolutional Neural Network (CNN) frameworks have been used for estimating object pose, they simply fuse the image features and geometry features together, which results in weak representations of the fused data because these multimodal data lie in different feature spaces. To address this, our channel-spatial attention network uses a dedicated CNN that learns the most important embedding from each data format and converts them into a feature space of the same dimensionality. To extend the work on pose estimation from a single image, this thesis also investigates the pose tracking task. We propose a novel tracking framework that achieves stable, real-time tracking. This framework is based on the correspondence between two consecutive frames, in which temporal-spatial information is utilised. In summary, the goal of this thesis is to extend the research direction of pose estimation in 3D space, especially by introducing advanced deep learning techniques to this area.
This thesis shows that our deep learning based methods have advantages over some existing object pose estimation methods in dealing with occlusions and cluttered backgrounds.
Item Type: | Thesis (Doctoral) |
---|---|
Subjects: | T Technology > TA Engineering (General). Civil engineering (General) |
Divisions: | Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
Depositing User: | Tao Chen |
Date Deposited: | 10 May 2023 15:58 |
Last Modified: | 10 May 2023 15:58 |
URI: | http://repository.essex.ac.uk/id/eprint/35598 |
Available files
Filename: University_of_Essex_PhD_THESIS__Tao.pdf