From the perspective of the complete framework, some of the limitations listed in Section 4.2 have been addressed, and presumably better solutions have been proposed in Sections 3.2, 3.3, and 3.4, respectively. Future research directions and potential solutions to the remaining limitations, which have not yet been addressed, are discussed below.

Two-step learning. By treating the fitting of the training data as a classification task and the fitting of the unlabeled data as a generation task, the GAN training style (Goodfellow et al., 2014) could be adopted to simplify and speed up the A-XCRF training process.

Even though the process of finding the state of equilibrium in GANs is relatively unstable, this approach is still more stable than splitting the training process into separate training pipelines, as is done in the A-XCRF pipeline. It is therefore worth exploring how effective the GAN training style is at improving the stability and reducing the complexity of the A-XCRF proposal.
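The alternating update scheme of GAN training (Goodfellow et al., 2014) can be illustrated with a minimal sketch. This is a toy example and not the A-XCRF model: the one-dimensional Gaussian data, the affine generator, the logistic discriminator, the non-saturating generator loss, and the function name `train_toy_gan` are all assumptions chosen for illustration. The point is only that both players are updated inside a single loop rather than in separate pipelines.

```python
import numpy as np


def sigmoid(t):
    # Clip the logit to avoid overflow in exp for extreme values.
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30.0, 30.0)))


def train_toy_gan(steps=500, batch=64, lr=0.05, seed=0):
    """Alternate discriminator and generator updates in one loop (toy 1D GAN)."""
    rng = np.random.default_rng(seed)
    a, b = 1.0, 0.0  # generator G(z) = a * z + b (assumed form)
    w, c = 0.1, 0.0  # discriminator D(x) = sigmoid(w * x + c) (assumed form)
    for _ in range(steps):
        # Discriminator step: minimise -log D(x_real) - log(1 - D(G(z))).
        x_real = rng.normal(3.0, 0.5, size=batch)  # assumed "real" data
        z = rng.normal(size=batch)
        x_fake = a * z + b
        s_real = sigmoid(w * x_real + c)
        s_fake = sigmoid(w * x_fake + c)
        grad_w = np.mean((s_real - 1.0) * x_real) + np.mean(s_fake * x_fake)
        grad_c = np.mean(s_real - 1.0) + np.mean(s_fake)
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: minimise the non-saturating loss -log D(G(z)).
        z = rng.normal(size=batch)
        s_fake = sigmoid(w * (a * z + b) + c)
        grad_a = np.mean((s_fake - 1.0) * w * z)
        grad_b = np.mean((s_fake - 1.0) * w)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b, w, c
```

Calling `train_toy_gan()` returns the final generator and discriminator parameters. In an A-XCRF setting the two steps would presumably be replaced by the classification update on labeled data and the generation update on unlabeled data, which remains to be designed.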

Shape representations. Several learning-based techniques have been proposed for scene completion and point cloud generation (Dai et al., 2018; Groueix et al., 2018). These techniques, individually or combined, could alleviate the problem of incomplete object representations, easing both the generation of point cloud labels and the automatic detection of point cloud objects.

Frame rate and tracking. The proposed deterministic approach relies on a consistent point cloud representation to detect and track objects by location. This precondition does not hold for low-frequency LiDAR scans, where a heuristic or learning-based approach could provide a better tracking solution. A learning-based approach, such as the T-NET model (Qi et al., 2017b), might perform this task better by generating consistent and trackable features for an object in different positions. Given a suitable objective function and a fast feature-matching algorithm (similar to the Scale Invariant Feature Transform (SIFT) algorithm by Brown and Lowe (2002)), a fast, accurate, and reliable object tracking technique for low-frequency LiDAR scans could be deployed.
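The feature-matching step can be sketched as follows. This is a minimal illustration and not the method of this work: it assumes that each object in a scan has already been summarised by a fixed-length descriptor (for example, from a learned model), and it uses the nearest-neighbour distance-ratio test commonly paired with SIFT-style descriptors. The function name `match_descriptors` and the threshold `ratio=0.8` are hypothetical choices.

```python
import numpy as np


def match_descriptors(desc_prev, desc_curr, ratio=0.8):
    """Match objects between two scans by descriptor similarity.

    desc_prev: (n_prev, d) array of descriptors from the previous scan.
    desc_curr: (n_curr, d) array from the current scan, with n_curr >= 2.
    Returns a list of (index_prev, index_curr) pairs that pass the ratio test.
    """
    # Pairwise Euclidean distances, shape (n_prev, n_curr).
    diff = desc_prev[:, None, :] - desc_curr[None, :, :]
    dist = np.linalg.norm(diff, axis=2)

    # For each previous descriptor, sort the current ones by distance.
    order = np.argsort(dist, axis=1)
    matches = []
    for i in range(dist.shape[0]):
        best, second = order[i, 0], order[i, 1]
        # Accept only if the best match is clearly closer than the runner-up.
        if dist[i, best] < ratio * dist[i, second]:
            matches.append((i, int(best)))
    return matches
```

Because the match depends on descriptor similarity rather than on location, an object can in principle be re-identified even after it has moved far between two low-frequency scans. The brute-force distance matrix is quadratic in the number of objects; a spatial index would be a natural replacement for larger scenes.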


4.4. Outlook

3D scene understanding capabilities, specifically for object detection and semantic segmentation tasks, have improved significantly over the last couple of years. Meanwhile, 3D laser scanners are becoming more affordable and their use is increasing rapidly. One fast-growing application area of these scanners is autonomous vehicles. Machine vision combined with 3D laser scanners will most likely play a major role in providing full autonomy in remote sensing, autonomous vehicles, and even virtual reality. In remote sensing, machine vision can provide automatic generation of high-resolution semantic (land cover and land use) maps, building extraction, tree identification, crop yield prediction, and more. More strikingly, in robotics and autonomous driving, several smart autonomy companies have started deploying 3D scene understanding capabilities in their pre-market autonomous products.

Another field of application is virtual and augmented reality, where semantic segmentation and object detection techniques are used to transform our real and virtual worlds for a better and brighter future.


Paper A:

Land Cover Segmentation of Airborne LiDAR