Organization - Computational Shape Understanding for 3D Reconstruction and Modeling

The remainder of this dissertation is organized as follows:

Chapter 2, 3D Urban Modeling Revisited. This chapter provides an extensive overview of state-of-the-art techniques proposed to enable fast and accurate 3D reconstruction of urban spaces. We also introduce the key steps of a traditional image-based reconstruction pipeline, underlining the specific challenges encountered at each step. Finally, we provide a brief introduction to the notion of symmetry and how it can be utilized to overcome these challenges.

Chapter 3, Factored Acquisition of Buildings. This chapter covers an image-based 3D reconstruction framework for piecewise-planar buildings containing symmetric parts. Given a set of input images, together with the corresponding camera parameters, this framework utilizes geometric priors in the form of line and plane features to capture local spatial coherence in the data. We exploit large-scale symmetries among the elements of the building, e.g., window frames, to provide structural priors that explore non-local coherence. Our reconstructions provide a factored representation where the individual building elements are nicely encoded. We provide evaluations performed on challenging data sets, both synthetic and real.

Chapter 4, Symmetry and Structure-from-Motion. Repeated structures are ubiquitous in buildings, leading to ambiguity in establishing correspondences across sets of unordered images. This chapter presents a coupled approach to resolve such ambiguities by explicit detection of the repeating structures. We show that this approach simultaneously computes accurate camera parameters corresponding to each image and recovers the repetition pattern both in 2D and 3D. We evaluate the robustness of the proposed scheme on a variety of examples and provide comparisons with other structure-from-motion methods. We also show that the recovered repetition patterns enable a range of novel image editing operations that maintain consistency across the images.

Chapter 5, Understanding Structured Variations. Many architectural data sets not only contain elements that are exact replicas of the same geometry, but also consist of elements that exhibit variations of a base geometry, e.g., windows with a similar arch but varying height. In this chapter, we investigate the problem of understanding such variations in the context of multi-view stereo reconstructions of ornate historic buildings. Utilizing a database of template models, each equipped with a deformation model, we detect patterns across element instances of a building by matching them to templates and extracting similarities in the resulting deformation modes.

Chapter 6, Designing Functional Models. In this chapter, we introduce an automated system that takes a motion sequence of a humanoid character and generates the design of a mechanical figure that mimics the input motion. The generated designs consist of parts that can be easily fabricated or obtained. A central goal of our approach is to observe patterns typically occurring in human motions to provide a high-level understanding of the target motions. This understanding is useful both to guide the design process and to identify the motion types that are better suited to our system.

Chapter 7, Conclusion and Future Directions. We provide a summary of the dissertation with an emphasis on the take-home messages and suggest future research directions.

2 3D Urban Modeling Revisited

With advances in acquisition technology, we see an increasing interest in digitizing objects and scenes. 3D acquisition devices, such as hand-held scanners or LIDAR scanners, make it possible to capture physical objects ranging from specific human body parts to large cities. The wide applicability of the collected data in various fields (e.g., industrial design, prototyping, prosthetics, and the entertainment industry) has triggered the development of advanced reconstruction and processing algorithms.

Urban reconstruction is one such field that is still under active research due to its broad application domain (see Figure 2.1). Digital mapping and navigation systems, such as Google Earth and Microsoft Bing Maps, require 2D or 3D building models. 3D reconstructions of urban areas provide useful interaction metaphors for urban planning and design. Several movies and games rely on 3D digital cities. Virtual urban worlds are also useful for applications including emergency planning and virtual tourist tours. The variety of these potential applications has attracted a significant amount of attention from researchers. In this chapter, we provide an overview of the various methods proposed for fast and accurate reconstruction of 3D buildings, with a particular focus on image-based modeling techniques. We further describe the main steps of a traditional image-based reconstruction pipeline, underline the specific challenges, and outline how we plan to overcome these challenges. Please note that some of the methods we review are general-purpose and thus are applicable to other problem domains as well.

2.1 Overview of Approaches

Figure 2.1: Urban reconstruction has a wide application domain, including urban design, emergency planning, and the entertainment industry. (Panels: modelling air pollution, New Zealand, ©Nextspace; offices owned by Brookfield Office Properties, Toronto, ©Cube Cities; urban planning, ©City Engine; city building computer game, ©CitiesXL.)

The modeling of urban spaces has been performed using a variety of different strategies (see Figure 2.2). Procedural modeling is one of these approaches, and it has mainly been utilized for fast generation of complex urban structures from a set of parameters and rules. In one of the early efforts, Wonka et al. [114] use split grammars and an attribute matching system to synthesize buildings with a large variety of different styles. Müller et al. [75] also use the idea of splitting to analyze single images of facades. They combine auto-correlation-based analysis of rectified images with shape grammars to generate rules and extract their parameters. In a more recent effort, Kelly and Wonka [50] demonstrate an interactive approach for generating buildings from architectural footprints using procedural extrusions. Even though procedural rules provide an efficient way to create high-quality detailed models, they are not useful for direct model acquisition.
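To make the idea of splitting concrete, the following is a minimal sketch of a split rule applied recursively to a rectangular facade. It is not the system of Wonka et al. [114]: there is no attribute matching, and the `Shape` class, the `RULES` table, and the labels `facade`, `floor`, and `tile` are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """Axis-aligned rectangle on a facade, labelled with a grammar symbol."""
    label: str
    x: float
    y: float
    width: float
    height: float


def split(shape, axis, count, label):
    """Split `shape` into `count` equal parts along 'x' or 'y'."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if axis not in ("x", "y"):
        raise ValueError("axis must be 'x' or 'y'")
    parts = []
    for i in range(count):
        if axis == "x":
            w = shape.width / count
            parts.append(Shape(label, shape.x + i * w, shape.y, w, shape.height))
        else:
            h = shape.height / count
            parts.append(Shape(label, shape.x, shape.y + i * h, shape.width, h))
    return parts


# Hypothetical rule set: symbol -> (split axis, number of parts, child symbol).
RULES = {
    "facade": ("y", 3, "floor"),  # a facade splits vertically into floors
    "floor": ("x", 4, "tile"),    # each floor splits horizontally into tiles
}


def derive(shape, rules):
    """Apply rules recursively until only terminal shapes (no rule) remain."""
    rule = rules.get(shape.label)
    if rule is None:
        return [shape]
    axis, count, child_label = rule
    terminals = []
    for part in split(shape, axis, count, child_label):
        terminals.extend(derive(part, rules))
    return terminals


tiles = derive(Shape("facade", 0.0, 0.0, 12.0, 9.0), RULES)
```

Changing only the entries of the rule table yields a different facade layout, which illustrates why a small set of parameters and rules can generate a large variety of structures.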

To overcome this limitation, inverse procedural modeling has recently become a new and growing area. Here, the focus is to discover parameterized grammar rules and the corresponding parameters that can generate a given target example. In an early effort, Aliaga et al. [3] present a system to extract a repertoire of grammars from a set of images with user guidance and use this information to quickly generate modifications of the architectural structures. Bokeloh et al. [10] explore partial symmetries in a given model to generate rules that can be used to describe similar models. Given a large set of rules, Talton et al. [105] present an approach to select the rules and the parameter settings that can generate output resembling a high-level description of the desired productions. A common challenge for these methods is that the expressive power of procedural modeling makes the inverse problem extremely difficult. Thus, obtaining suitable generative procedures to capture target shapes remains an ambitious problem.

Figure 2.2: The modeling of urban spaces has been performed using a variety of different approaches. (Images courtesy of corresponding authors.)

Direct acquisition of urban spaces, on the other hand, is becoming popular due to the variety of input data sources. A substantial portion of the proposed 3D reconstruction methods use LIDAR scans. These scans are collected by illuminating a target region with a laser and analyzing the reflected light to measure distance. This process creates 3D point clouds with high accuracy; however, challenges in the practical acquisition process often lead to incomplete coverage. Therefore, several automatic and interactive methods have been proposed for post-processing such data.
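As a rough illustration of how a laser measurement becomes a 3D point, the sketch below assumes a pulsed time-of-flight scanner, where the distance is half the round-trip travel time multiplied by the speed of light. The text above does not specify the measurement principle, and phase-based scanners work differently. The function names and the azimuth and elevation convention are assumptions made for this example.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact by definition of the metre


def range_from_round_trip(round_trip_s: float) -> float:
    """Distance to the reflecting surface for a pulsed time-of-flight reading."""
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse travels to the surface and back, hence the division by two.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def to_point(range_m: float, azimuth_rad: float, elevation_rad: float):
    """Convert a range and beam direction to a 3D point in the scanner frame.

    Assumed convention: azimuth is measured in the x-y plane from the x axis,
    elevation is measured upwards from the x-y plane.
    """
    horizontal = range_m * math.cos(elevation_rad)
    return (
        horizontal * math.cos(azimuth_rad),
        horizontal * math.sin(azimuth_rad),
        range_m * math.sin(elevation_rad),
    )


# A 200 ns round trip corresponds to a surface roughly 30 m away.
sample_point = to_point(range_from_round_trip(200e-9), math.radians(90.0), 0.0)
```

Repeating this conversion for every beam direction yields the point cloud. Surfaces that no beam reaches, for example because of occlusion, simply produce no points, which is one source of the incomplete coverage mentioned above.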

Zhou and Neumann [126] present an automatic algorithm for creating building models from LIDAR scans. This algorithm first performs vegetation detection and then estimates the principal directions of the building roof patches. This information is used to fit parametric models to the data. In a similar effort, Pu and Vosselman [89] present an automatic method for reconstruction of building facade models from terrestrial laser scanning data by fitting polygonal models. The method first detects planar features in the input point clouds corresponding to wall, door, or window regions. A concave or convex polygonal model is then generated from these features.
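Detecting planar features in a point cloud can be sketched with RANSAC plane fitting, a generic technique that is used here purely for illustration and is not claimed to be the procedure of Pu and Vosselman [89]. The threshold, the iteration count, and the synthetic wall data are assumptions.

```python
import random


def plane_from_points(p, q, r):
    """Return (unit normal n, offset d) with n . x = d, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # The cross product of two edge vectors gives the plane normal.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = n[0] * p[0] + n[1] * p[1] + n[2] * p[2]
    return n, d


def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Find the plane supported by the most points within `threshold`."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate (collinear) sample, try again
        n, d = plane
        inliers = [
            pt for pt in points
            if abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] - d) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic data: a 5 x 5 grid of wall points on the plane x = 2, plus outliers.
wall = [(2.0, float(y), float(z)) for y in range(5) for z in range(5)]
outliers = [(0.0, 0.0, 0.0), (5.0, 1.0, 3.0), (-1.0, 2.0, 2.0),
            (7.0, 4.0, 1.0), (3.5, 3.0, 0.0)]
plane, inliers = ransac_plane(wall + outliers)
```

In a facade setting, the detected plane would be removed from the cloud and the search repeated to find further planar regions, such as recessed windows or doors.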

In a recent effort, Vanegas and colleagues [108] propose an approach for extracting volume descriptions from 3D point clouds based on the Manhattan World assumption, i.e., the presence of three mutually orthogonal directions in the scene.
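The Manhattan World assumption can be illustrated by assigning each estimated surface normal to the closest of three mutually orthogonal directions. The sketch below assumes that these directions are already known and aligned with the coordinate axes. It does not reproduce the method of Vanegas and colleagues [108], and the 10 degree tolerance is an arbitrary choice.

```python
import math

# Assumed dominant directions: unit vectors along the coordinate axes.
MANHATTAN_AXES = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def snap_to_axis(normal, axes=MANHATTAN_AXES):
    """Return (axis index, deviation in degrees) for the closest axis."""
    length = math.sqrt(sum(c * c for c in normal))
    if length == 0.0:
        raise ValueError("normal must be non-zero")
    # Use |cos| so that a normal and its negation map to the same axis.
    cosines = [
        abs(sum(n * a for n, a in zip(normal, axis))) / length for axis in axes
    ]
    index = max(range(len(axes)), key=lambda i: cosines[i])
    # Clamp to 1.0 to guard acos against rounding slightly above one.
    deviation = math.degrees(math.acos(min(1.0, cosines[index])))
    return index, deviation


def is_manhattan(normals, max_deviation_deg=10.0):
    """True if every normal lies within the tolerance of some axis."""
    return all(snap_to_axis(n)[1] <= max_deviation_deg for n in normals)
```

For example, `snap_to_axis((0.0, -2.0, 0.0))` assigns the normal to the second axis with zero deviation, whereas a 45 degree slanted surface such as `(1.0, 1.0, 0.0)` fails the `is_manhattan` check.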


In addition to fully automatic methods, several algorithms that utilize user interaction have been proposed for processing scan data. Nan et al. [77] propose an interactive method where the users roughly define simple building blocks, called smart boxes, over the 3D point samples. The algorithm then snaps these boxes to the data by accounting for both data fitting and inter-box similarity. In a similar data consolidation framework, Zheng et al. [123] exploit large-scale repetitions to denoise the input data and complete missing parts. These repetitions are detected with the help of the user.

A multitude of urban reconstruction methods combine LIDAR scans with images. Liu and Stamos [61] present a system that registers 2D images with 3D point clouds. By matching linear features in the images and the point clouds, the method aims to robustly register the camera parameters of the images to the 3D data. More recently, Li et al. [58] present an interactive system for combining information from images and LIDAR scans. They create a layered representation of input buildings, which enables more robust detection of repeating structures that are used to enhance the 3D data.

Although scanning devices are frequently used by land surveying offices, they are still not available for mass markets due to economic and practical reasons. On the other hand, advances in camera technology and the simplicity of the acquisition process have recently made images one of the obvious input sources. It is estimated that tens of billions of photos are taken each year, many of which depict urban sites [76]. As a result, image-based modeling has recently become one of the most popular 3D acquisition techniques. In the next section, we provide an overview of the different approaches that use images as input.
