Feng, Tuo (2021) Deep Learning for Depth, Ego-Motion, Optical Flow Estimation, and Semantic Segmentation. PhD thesis, University of Essex.
Abstract
Visual Simultaneous Localization and Mapping (SLAM) is crucial for robot perception. Visual odometry (VO), an essential component of SLAM, estimates the depth map of a scene and the ego-motion of a camera in unknown environments. Most previous work in this area uses geometry-based approaches. Recently, deep learning methods have opened a new door for this area. At present, most research under deep learning frameworks focuses on improving the accuracy of the estimates and reducing the dependence on large amounts of labelled training data. This thesis explores deep learning techniques for several estimation tasks under the VO framework: depth, ego-motion, optical flow, and semantic segmentation. Firstly, a stacked generative adversarial network is proposed to estimate depth and ego-motion. It consists of a stack of GAN layers, of which the lowest layer estimates the depth and ego-motion while the higher layers estimate the spatial features. It also captures temporal dynamics through a recurrent representation across the layers. Secondly, focusing on the design of the internal network structure, a novel recurrent spatial-temporal network (RSTNet) is proposed to estimate depth, ego-motion, optical flow, and dynamic objects. This network can extract and retain more spatial and temporal features. Dynamic objects are detected by using the optical flow difference between the full flow and the rigid flow. Finally, a semantic segmentation network is proposed, producing semantic segmentation results together with depth and ego-motion estimation results. All of the proposed contributions are tested and evaluated on open public datasets, and comparisons with other methods are provided. The results show that the proposed networks outperform state-of-the-art methods for depth, ego-motion, and dynamic object estimation.
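The idea of detecting dynamic objects from the difference between full flow and rigid flow can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `rigid_flow` and `dynamic_mask`, the pinhole camera parameters, and the fixed pixel threshold are all assumptions made for illustration. Rigid flow is taken here to be the flow induced by camera motion alone, computed from depth and ego-motion; pixels whose full flow deviates from it are flagged as possibly dynamic.

```python
import math

def rigid_flow(u, v, depth, fx, fy, cx, cy, R, t):
    """Flow at pixel (u, v) caused by camera motion alone (static scene assumed)."""
    # Back-project the pixel to a 3D point in the first camera frame.
    X = (u - cx) / fx * depth
    Y = (v - cy) / fy * depth
    Z = depth
    # Express the point in the second camera frame: P' = R P + t.
    Xn = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + t[0]
    Yn = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + t[1]
    Zn = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + t[2]
    # Re-project into the second image and subtract the original pixel position.
    u2 = fx * Xn / Zn + cx
    v2 = fy * Yn / Zn + cy
    return (u2 - u, v2 - v)

def dynamic_mask(full_flow, rigid, threshold=1.0):
    """Flag pixels whose full flow differs from the rigid flow by more than
    `threshold` pixels (the threshold rule is an illustrative assumption)."""
    return [
        [math.hypot(f[0] - r[0], f[1] - r[1]) > threshold
         for f, r in zip(full_row, rigid_row)]
        for full_row, rigid_row in zip(full_flow, rigid)
    ]

# Toy 2x2 example: constant depth, pure sideways translation (hypothetical values).
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (-0.1, 0.0, 0.0)
depth = [[10.0, 10.0], [10.0, 10.0]]
rigid = [
    [rigid_flow(u, v, depth[v][u], 100.0, 100.0, 0.5, 0.5, R, t) for u in range(2)]
    for v in range(2)
]
# Full flow matches the rigid flow everywhere except one independently moving pixel.
full = [row[:] for row in rigid]
full[0][1] = (3.0, 0.0)
mask = dynamic_mask(full, rigid, threshold=1.0)
```

In this toy setup only the pixel at row 0, column 1 is flagged, because its full flow is not explained by the camera motion. In a learned pipeline the depth, pose, and full flow would come from network predictions rather than hand-set values.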
Item Type: | Thesis (PhD) |
---|---|
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; T Technology > TK Electrical engineering. Electronics Nuclear engineering |
Divisions: | Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
Depositing User: | Tuo Feng |
Date Deposited: | 01 Dec 2021 16:33 |
Last Modified: | 01 Dec 2021 16:33 |
URI: | http://repository.essex.ac.uk/id/eprint/31706 |
Available files
Filename: University_of_Essex_PhD_THESIS_Tuo.pdf