Research Repository

OctreeNet: A Novel Sparse 3-D Convolutional Neural Network for Real-Time 3-D Outdoor Scene Analysis

Wang, Fei and Zhuang, Yan and Gu, Hong and Hu, Huosheng (2020) 'OctreeNet: A Novel Sparse 3-D Convolutional Neural Network for Real-Time 3-D Outdoor Scene Analysis.' IEEE Transactions on Automation Science and Engineering, 17 (2). 735 - 747. ISSN 1545-5955

IEEE-TASE-V17-N2-2020-735-747.pdf - Accepted Version


Abstract

Convolutional neural networks (CNNs) for 3-D data analysis require a large amount of memory and fast computation power, making real-time applications difficult. This article proposes OctreeNet, a novel sparse 3-D CNN, to analyze the sparse 3-D laser scanning data gathered from outdoor environments. It uses a collection of shallow octrees for 3-D scene representation to reduce the memory footprint of 3-D CNNs, and performs point cloud classification on every single octree. Furthermore, the smallest non-trivial and non-overlapped kernel (SNNK) implements convolution directly on the octree structure, reducing dense 3-D convolutions to matrix operations at sparse locations. The proposed network uses a depth-first search algorithm for real-time predictions, and a conditional random field model is utilized to learn global semantic relationships and refine the point cloud classification results. Two public data sets (Semantic3D.net and Oakland) are selected to test classification performance in outdoor scenes with different spatial sparsity. The experiments and benchmark results show that the proposed approach can be effectively used in real-time 3-D laser data analysis.

Note to Practitioners: This article was motivated by the limitations of existing deep learning technologies for analyzing 3-D laser scanning data. This technology enables robots to infer what their surroundings are, which is closely linked to semantic mapping and navigation tasks. Previous deep neural networks have seldom been used in robotic systems since they require a large amount of memory and fast computation power to apply dense 3-D operations. This article presents a sparse 3-D CNN for real-time point cloud classification that exploits the sparsity of 3-D data; the framework requires no GPUs. The practicality of the proposed method is verified on data sets gathered from different platforms and sensors. The proposed network can be adopted for other classification tasks with laser sensors.
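The core idea behind the memory savings described in the abstract — storing only the occupied cells of a shallow octree so that convolutions can later be restricted to sparse locations — can be sketched as follows. This is an illustrative toy, not the authors' implementation; the function name, parameters, and depth are assumptions for the example.

```python
from collections import defaultdict

def octree_cells(points, origin, size, depth):
    """Map 3-D points to the occupied leaf cells of a shallow octree.

    Each leaf cell is identified by its integer (x, y, z) index at the
    given depth. Only occupied cells are stored, so memory scales with
    the number of laser returns rather than with the full dense volume.
    """
    n = 2 ** depth                 # number of cells per axis at this depth
    cell_size = size / n
    cells = defaultdict(list)      # sparse map: cell index -> points inside
    for p in points:
        idx = tuple(
            min(int((p[a] - origin[a]) / cell_size), n - 1) for a in range(3)
        )
        cells[idx].append(p)
    return cells

# A tiny, mostly empty "scene": a dense depth-3 grid would hold 8^3 = 512
# cells, but only 3 of them are occupied by these points.
points = [(0.1, 0.1, 0.1), (0.15, 0.12, 0.08), (0.9, 0.9, 0.9)]
cells = octree_cells(points, origin=(0.0, 0.0, 0.0), size=1.0, depth=3)
print(len(cells))  # → 3
```

A sparse 3-D convolution in this setting would then visit only the keys of `cells` (and their occupied neighbors), which is how dense 3-D convolutions reduce to matrix operations at sparse locations.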

Item Type: Article
Divisions: Faculty of Science and Health > Computer Science and Electronic Engineering, School of
Depositing User: Elements
Date Deposited: 10 Jun 2020 13:27
Last Modified: 10 Jun 2020 13:27
URI: http://repository.essex.ac.uk/id/eprint/27643
