2.2 Artificial Intelligence and Machine Learning

2.2.5 Transfer Learning

Training deep learning models requires large amounts of training data and computational resources.

Fortunately, many computer vision tasks can still be solved with deep learning through transfer learning. When a network is trained from scratch, its weights are initialized to random numbers. There are different theories on which probability distribution gives the best starting point [33], but in any case the initial weights will be far from the values reached after many iterations of training.

If a network is pre-trained on a big, general dataset, however, the resulting weights likely form a fairly good starting point for other tasks as well. In terms of image classification and segmentation, it makes sense to use pre-trained weights, since the detection of line segments, corners etc. is similar for all types of images. Thus, weights for early layers can be transferred unchanged to new tasks, whereas deeper layers detecting more complex features can be retrained, or fine-tuned, on a new target-specific dataset. Using a model pre-trained on one type of task to solve a different task by fine-tuning its layers is called transfer learning.
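The freezing-and-fine-tuning idea can be illustrated with a minimal NumPy sketch, where a toy two-layer network stands in for a real pre-trained model (all sizes, data and learning-rate values here are illustrative, not taken from this thesis):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" first-layer weights, standing in for early layers that
# detect generic features such as edges and corners. These stay frozen.
W1 = rng.standard_normal((8, 4)) * 0.5

# New task-specific head, randomly initialized and fine-tuned.
W2 = rng.standard_normal((4, 1)) * 0.5

def features(X):
    # Frozen feature extractor: linear layer followed by ReLU.
    return np.maximum(X @ W1, 0.0)

# Toy target task: a fixed (hypothetical) mapping from features to outputs.
X = rng.standard_normal((64, 8))
true_head = rng.standard_normal((4, 1))
y = features(X) @ true_head

# Fine-tune only W2 with plain gradient descent on mean squared error;
# W1 is never updated, mimicking frozen early layers.
for _ in range(200):
    H = features(X)                 # activations of the frozen layers
    err = H @ W2 - y
    grad = 2 * H.T @ err / len(X)
    W2 -= 0.1 * grad

print(float(np.mean((features(X) @ W2 - y) ** 2)))  # final training loss
```

In a real framework the same effect is achieved by disabling gradient updates for the early layers (e.g. `requires_grad = False` in PyTorch) rather than by hand-written updates.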

Transfer learning will be used in this thesis to solve the problem of corrosion damage segmentation. This will be necessary as the dataset to be used is rather small by today’s standards. In addition, designing a new network architecture with proper hyperparameters is challenging. Success is much more likely when using a network architecture proven to work well on different tasks.

ImageNet [26] is the most commonly used dataset for transfer learning for image classification models. It can also be used for image segmentation to more easily detect relevant high-level features. Datasets specifically designed for image segmentation, however, are usually a better option as these contain both images and complete segmentation masks.

Pascal VOC 2012 [12], Cityscapes [8] and ADE20K [61] are popular datasets for semantic segmentation, whereas Microsoft COCO [37] is the de facto standard for instance segmentation.

Chapter 3

Image Segmentation of Construction Damages

Image classification networks have been very popular over the last decade. Image segmentation, however, is a much more challenging task and is still in the early phase of being adapted to a wide range of application areas. The number of published papers on image segmentation increased in the years following the release of Mask R-CNN [22] in 2017, which is state of the art for instance segmentation. Image segmentation of construction damages, however, is still lacking.

This chapter starts by defining use-cases of image segmentation for industrial inspections in Section 3.1. This is followed by Section 3.2, which discusses requirements for an image segmentation framework to be used for industrial inspections. Finally, a number of different image segmentation algorithms are reviewed in Section 3.3. Methods based on both traditional computer vision and machine learning are presented.

3.1 Use-Cases of Construction Damage Segmentation for Industrial Inspections

For construction damages in images, there are numerous potential applications of image segmentation. Its real usefulness emerges when automatic inspection systems are used to collect vast amounts of image data. The best such solution is likely camera-enabled UAVs, which are already in the early stages of being adopted for industrial inspections [27, 2, 43].

UAVs have the great benefits of being more efficient than human inspectors, accessing areas that are otherwise difficult to reach and allowing for more frequent inspections – all at lower cost. Combined with intelligent damage detection software, this can eliminate the need for most human inspectors while also being more accurate and objective.

One straightforward use-case of UAVs with image segmentation software is remote controlling the UAV to take images of a construction. Images containing a damage can then be saved along with sensor data recording where and at what altitude the images were taken. This allows an inspector to study the damages remotely and initiate maintenance where needed.

Furthermore, image segmentation can aid a UAV in becoming autonomous, i.e. removing the need for an operator to control the UAV. For this to work optimally, the system should be combined with 3D models of the construction and SLAM to help keep track of which parts of the construction are covered and to avoid duplicate reports of the same damage.

Pure image segmentation has no concept of actual sizes, only the size of segmented objects relative to the image frame. Combined with 3D models and SLAM, however, the real size of damages can be computed. Furthermore, the total area of damaged surface can be tracked. This is particularly useful as it is an often-used metric to initiate maintenance.
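As a minimal illustration of this pixel-to-real-area conversion, a simplified pinhole-camera model can be used when the distance to the surface is known from SLAM or a 3D model (the focal length and distance values below are hypothetical, and the surface is assumed to be viewed roughly head-on):

```python
def damage_area_m2(pixel_area: float, distance_m: float, focal_px: float) -> float:
    """Convert a segmented damage area in pixels to square metres.

    Simplified pinhole model: one pixel covers (distance / focal length)
    metres on the surface, so an area scales with the square of that ratio.
    """
    metres_per_pixel = distance_m / focal_px
    return pixel_area * metres_per_pixel ** 2

# Example: 12 000 damaged pixels seen from 5 m with a 3000 px focal length.
area = damage_area_m2(12_000, 5.0, 3000.0)
print(round(area, 4))  # ~0.0333 m^2
```

A production system would additionally correct for lens distortion and the viewing angle of the surface, which this sketch ignores.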

If the total damaged area is monitored for all constructions of relevance, maintenance can be automatically and efficiently scheduled.

With autonomous UAVs able to recognize damages, we can efficiently monitor damages over time, e.g. by taking weekly images of all known damages. Time series predictions can then potentially be used to forecast how a damage will continue to develop. If this is successful, it can help estimate future expenses related to maintenance and prevent dangerous damages from developing.
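As a sketch of the forecasting idea, a simple linear trend can be fitted to weekly area measurements and extrapolated (the measurements below are hypothetical, and real damage growth is unlikely to be this linear, so this is purely illustrative):

```python
import numpy as np

# Hypothetical weekly measurements of one damage's area in cm^2.
weeks = np.arange(8)
area = np.array([1.0, 1.2, 1.5, 1.7, 2.1, 2.3, 2.6, 2.9])

# Fit a linear trend by least squares and extrapolate four weeks ahead.
slope, intercept = np.polyfit(weeks, area, 1)
forecast = slope * np.arange(8, 12) + intercept

print(slope, forecast)
```

More realistic forecasts would use growth models or dedicated time-series methods, but even a fitted slope gives a first estimate of when a damage will cross a maintenance threshold.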

To summarize, the benefits of image segmentation of construction damages are vast. There are, however, many challenging requirements that must be met for the image segmentation software to be useful in the above use-cases. These are discussed next.

3.2 Requirements of Image Segmentation for Industrial