Application #3: Semantic Scene Understanding

Academic year: 2022

(1)

• Modeling by example, revisited

Application #1: 3D Modeling

1

[Sung et al. 2017]

Deep neural network predicts the next best part to add and its position to enable non-expert

users to create novel shapes.

(2)

• Joint multi-modal understanding

Application #2: Image Understanding

2

[Zhang et al. 2017]

understanding 3D shapes can benefit image understanding

(3)

• Semantic 3D reconstruction

Application #3: Semantic Scene Understanding

[Song et al. 2017] 3

(4)

4

Motivating Applications: Semantic Scene Understanding

[Kelly et al. 2017, Kelly and Guerrero et al. 2018]

Application #4: 3D Asset Creation

(5)

• Number of voxels grows as $O(n^3)$, versus $O(n^2)$ for the occupied surface

What’s Different in 3D?

5
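The cubic-versus-quadratic growth can be checked numerically. The following is a minimal sketch, assuming a sphere of radius 0.4 inside a unit cube as the test shape and a "center within half a voxel of the surface" occupancy rule; both choices and the name `voxel_counts` are illustrative assumptions, not part of the slides.

```python
import numpy as np

def voxel_counts(n, radius=0.4):
    """Return (total voxels, voxels near a sphere surface) for an n^3 grid."""
    # Voxel centers of a unit cube sampled at resolution n.
    centers = (np.arange(n) + 0.5) / n
    x, y, z = np.meshgrid(centers, centers, centers, indexing="ij")
    dist = np.sqrt((x - 0.5) ** 2 + (y - 0.5) ** 2 + (z - 0.5) ** 2)
    # Assumed occupancy rule: the center lies within half a voxel of the surface.
    occupied = np.abs(dist - radius) <= 0.5 / n
    return n ** 3, int(occupied.sum())

for n in (16, 32, 64):
    total, surface = voxel_counts(n)
    print(n, total, surface, surface / total)
```

Doubling the resolution multiplies the total voxel count by 8 but the surface voxel count only by roughly 4, so the occupied fraction shrinks as the grid gets finer.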

(6)

Data Representation: Many Possibilities!

6

points voxels cells patches

(7)

1. Representation


2. Neighborhood information

• who are the neighbouring elements

• how are the elements ordered


3. Extrinsic versus intrinsic representation


4. Simplicity versus memory/runtime tradeoff

Challenges

7

(8)

• Image-based


• Volumetric


• Surface-based


• Point-based

Representation for 3D

8

(9)

Image-based


• Volumetric


• Surface-based


• Point-based

Representation for 3D

9

(10)

Image-based

• Volumetric

• Point-based

• Surface-based

Representation for 3D: Multi-view CNN

10

regular image analysis networks

[Kalogerakis et al. 2015]

(11)

Multi-view CNN

3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation

11

(12)

Multi-view CNN

Integrating View Information

12
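One common way to integrate view information is to pool per-view features element-wise. The sketch below assumes max pooling across views and uses random placeholder descriptors in place of the output of an image network; `view_pool` and `view_features` are hypothetical names.

```python
import numpy as np

def view_pool(view_features):
    """Element-wise max over the view axis: (n_views, d) -> (d,)."""
    return np.max(view_features, axis=0)

rng = np.random.default_rng(0)
# Placeholder data: 12 rendered views, one 8-D descriptor per view.
view_features = rng.normal(size=(12, 8))
shape_descriptor = view_pool(view_features)

# The pooled descriptor does not depend on the order of the views.
shuffled = view_features[rng.permutation(12)]
assert np.allclose(shape_descriptor, view_pool(shuffled))
```

Because the pooling is symmetric, the resulting shape descriptor is independent of how the views are ordered.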

(13)

Image-based

Representation for 3D: Local Multi-view CNN

13

Segmentation

Correspondence Feature matching

Predicting semantic functions

[Huang et al. 2018]

localized renderings for point-wise features

(14)

Tangent Convolutions

14

[Tatarchenko et al. 2018]

loses information due to occlusion

project to local patches


(contrast with PCPNet construction)

(15)

Signal Interpolation

Use nearest-neighbor or Gaussian-mixture-based interpolation.

The signal is now denser.

Dealing with Sparse Points

15
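The nearest-neighbor variant can be sketched as follows, assuming the sparse points have already been projected to 2D coordinates on a tangent plane. The function name `nn_interpolate`, the square grid, and the `extent` parameter are assumptions made for illustration.

```python
import numpy as np

def nn_interpolate(points_2d, values, grid_size, extent=1.0):
    """Fill a grid_size x grid_size image with the value of the nearest projected point."""
    axis = np.linspace(-extent, extent, grid_size)
    gu, gv = np.meshgrid(axis, axis, indexing="ij")
    pixels = np.stack([gu.ravel(), gv.ravel()], axis=1)                # (G*G, 2)
    # Squared distance from every pixel to every projected point.
    d2 = ((pixels[:, None, :] - points_2d[None, :, :]) ** 2).sum(-1)   # (G*G, N)
    nearest = d2.argmin(axis=1)
    return values[nearest].reshape(grid_size, grid_size)

# Two projected points carrying the signal values 1 and 2.
points_2d = np.array([[-0.5, 0.0], [0.5, 0.0]])
values = np.array([1.0, 2.0])
image = nn_interpolate(points_2d, values, grid_size=4)
```

Every pixel receives a value, so the sparse signal becomes a dense image that a regular 2D convolution can consume. The brute-force distance matrix is only suitable for small inputs.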

(17)

Tangent Convolutions

Improved Performance

17

(18)

Image-based

PROS: directly use image networks, good performance

CONS: rendering is slow and memory-heavy, not very geometric

• Volumetric


• Point-based


• Surface-based

Representation for 3D

18

(19)

• Image-based


Volumetric


• Surface-based


• Point-based

Representation for 3D

19

(20)

Volumetric

3D CNNs : Direct Approach

20

[Xiao et al. 2014]

(21)

*) VOXNET: A 3D CONVOLUTIONAL NEURAL NETWORK FOR REAL-TIME OBJECT RECOGNITION [MATURANA ET AL. 2015]

VoxNet [Maturana et al. 2015]

21

Binary occupancy, density grid, etc.

rotational invariance
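A binary occupancy grid can be built from a point cloud by binning points into voxels. This is a minimal sketch, not the VoxNet implementation: the per-voxel point count used here is only a simple stand-in for a density grid, and `voxelize` and its normalization are assumptions.

```python
import numpy as np

def voxelize(points, resolution=32):
    """Return (binary occupancy grid, per-voxel point counts) for an (N, 3) point cloud."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    scale = (hi - lo).max()
    if scale == 0:
        scale = 1.0  # degenerate cloud: all points coincide
    # Uniformly scale into [0, 1], then map to integer voxel indices.
    idx = np.floor((points - lo) / scale * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    counts = np.zeros((resolution,) * 3, dtype=np.int64)
    # np.add.at accumulates correctly when several points fall in one voxel.
    np.add.at(counts, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return counts > 0, counts

points = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]])
occupancy, counts = voxelize(points, resolution=2)
```

The resulting dense grid can be fed to a 3D CNN directly, at the memory cost discussed earlier.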

(22)

Visualization of First-Layer Filters

22

(23)

Volumetric

Representation for 3D: Volumetric Deformation

[Yumer and Mitra 2016] 23

(24)

Efficient Volumetric Data Structures

24

[Wang et al. 2017]

(25)

O-CNN: Data Structure and CNN Operations

25

shuffled keys (encode position in space)

labels (parent label → child indices)

downsampling example

(“where there is an octant, there is CNN computation”)

faster neighbor access
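One standard way for a single integer key to encode a position in space is bit interleaving (a Morton code). The sketch below assumes that scheme with an x, y, z bit order; the exact layout used by O-CNN may differ, and the function names are hypothetical.

```python
def shuffled_key(x, y, z, depth):
    """Interleave the bits of integer octant coordinates (x, y, z) into one key."""
    key = 0
    for level in range(depth - 1, -1, -1):  # most significant bit first
        triple = (((x >> level) & 1) << 2) | (((y >> level) & 1) << 1) | ((z >> level) & 1)
        key = (key << 3) | triple
    return key

def parent_key(key):
    """Dropping the last bit triple gives the key of the parent octant."""
    return key >> 3

def children_keys(key):
    """The eight children share the parent's key as a prefix."""
    return [(key << 3) | i for i in range(8)]

key = shuffled_key(2, 3, 1, depth=2)
print(key, parent_key(key), children_keys(parent_key(key)))
```

With such keys, sorting octants by key places the eight siblings of a parent next to each other, which is what makes downsampling and neighbor lookups cheap in a flat array.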

(26)

Efficient Volumetric Data Structures

[Hane et al. 2018] 26

only generate non-empty voxels

Wang et al. 2017

Encoder Decoder/generator

(27)

Efficient Volumetric Data Structures

[Hane et al. 2018] 27

(28)

O-CNN: EVALUATION

Lower Memory Footprint

28

(29)

*) ADAPTIVE O-CNN [WANG ET AL. 2018]

Adaptive O-CNN

29

image to planar patch-based shapes

[Wang et al. 2018]

(30)

First-order Patches

30

O-CNN versus Adaptive O-CNN

(31)

*) FPNN: FIELD PROBING NEURAL NETWORKS FOR 3D DATA [LI ET AL. 2016]

Field Probing Neural Networks for 3D Data

31

[Li et al. 2016]

(32)

Spatial Probes

32

(33)

Method Details

33

(34)

• Image-based


• Volumetric

PROS: adaptations of image networks

CONS: special layers for hierarchical data structures, still too coarse


• Surface-based


• Point-based

Representation for 3D

34

(35)

• Image-based


• Volumetric


Surface-based


• Point-based

Representation for 3D

35

(36)

• Many different ways to parameterize a surface:

Local/Global Parameterizations

36

Geometry Image [Sinha et al. 2016]

Metric Alignment (GWCNN) [Ezuz et al. 2017]

(37)

*) DEEP LEARNING 3D SHAPE SURFACES USING GEOMETRY IMAGES [SINHA ET AL. 2016]

Shape Surfaces using Geometry Images

37

(38)

*) GEODESIC CONVOLUTIONAL NEURAL NETWORKS ON RIEMANNIAN MANIFOLDS [MASCI ET AL. 2018 (UPDATED VERSION)]

Using Geodesic Patches: GCNN

38

$$(f \star a)(x) := \sum_{\theta, r} a(\theta + \Delta\theta, r)\,(D(x) f)(r, \theta)$$

[Masci et al. 2015]
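The geodesic convolution above can be sketched on a discretized polar patch. This sketch assumes the patch values $(D(x)f)(r, \theta)$ are already sampled into an array indexed as `[theta, r]`, and that $\Delta\theta$ is an integer number of angular bins; taking the maximum over all rotations is shown as one way to remove the dependence on the arbitrary angular origin.

```python
import numpy as np

def geodesic_conv(patch, filt, shift=0):
    """Sum over (theta, r) of a(theta + shift, r) * patch(theta, r)."""
    # np.roll with a negative shift gives rotated[i] = filt[(i + shift) % n_theta].
    rotated = np.roll(filt, -shift, axis=0)
    return float(np.sum(rotated * patch))

def all_rotations(patch, filt):
    """Responses for every angular shift of the filter."""
    return np.array([geodesic_conv(patch, filt, s) for s in range(filt.shape[0])])

n_theta, n_r = 8, 3
filt = np.zeros((n_theta, n_r))
filt[2, 0] = 1.0
patch = np.zeros((n_theta, n_r))
patch[0, 0] = 1.0

responses = all_rotations(patch, filt)
rotated_patch = np.roll(patch, 3, axis=0)  # same patch, different angular origin
assert responses.max() == all_rotations(rotated_patch, filt).max()
```

The individual responses change when the patch's angular origin changes, but their maximum does not.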

(39)

*) GEODESIC CONVOLUTIONAL NEURAL NETWORKS ON RIEMANNIAN MANIFOLDS [MASCI ET AL. 2018 (UPDATED VERSION)]

GCNN Architecture

39

(40)

• Parameterize in spectral domain

Handling Rotational Ambiguity

40

(41)

map 3D surface to 2D domain

Parameterization for Surface Analysis

41

[Maron et al. 2017]

(43)

• Map 3D surface to 2D domain


One such mapping: flat torus (seamless ⇒ translation-invariant)

Many mappings exist: sample a few and average the results

Which functions to map? XYZ coordinates, normals, curvature, …

Parameterization for Surface Analysis

43

[Maron et al. 2017]
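The link between a seamless toric domain and translation invariance can be illustrated with cyclic padding: on a torus, shifting the image and then filtering gives the same result as filtering and then shifting. This is a minimal sketch on a random image with an odd-sized kernel; `toric_conv2d` is a hypothetical helper, not code from the cited work.

```python
import numpy as np

def toric_conv2d(image, kernel):
    """2D cross-correlation with wrap-around (toric) boundary handling."""
    kh, kw = kernel.shape  # assumed odd in both dimensions
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="wrap")
    out = np.zeros(image.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(16, 16))    # stand-in for one mapped function, e.g. a coordinate channel
kernel = rng.normal(size=(3, 3))

shift_then_filter = toric_conv2d(np.roll(image, (5, 7), axis=(0, 1)), kernel)
filter_then_shift = np.roll(toric_conv2d(image, kernel), (5, 7), axis=(0, 1))
assert np.allclose(shift_then_filter, filter_then_shift)
```

With zero padding instead of wrap-around, the two results would differ near the image boundary.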

(44)

• Tested on mesh segmentation

Parameterization for Surface Analysis

44

[Maron et al. 2017]

(45)

• Condition decoded points on 2D patches

Texture Transfer (Parameterization + Alignment)

45

[Wang et al. 2016]

(46)

• Condition decoded points on 2D patches

AtlasNet for Surface Generation

[Groueix et al. 2018] 46

(47)

• Condition decoded points on 2D patches

AtlasNet for Surface Generation

47

Latent representation can be inferred from images or point clouds

[Groueix et al. 2018]

(48)

• Condition decoded points on 2D patches

AtlasNet for Surface Generation

48

Latent representation can be inferred from images or point clouds

A quad mesh is generated by mapping a regular grid in the 2D domain to 3D points

[Groueix et al. 2018]
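The grid-to-mesh idea can be sketched as follows. This is an illustration under strong assumptions: the decoder is a tiny two-layer MLP with random, untrained weights, and the latent code is random rather than inferred from an image or point cloud. All names (`make_grid`, `decode_patch`, the layer sizes) are hypothetical.

```python
import numpy as np

def make_grid(n):
    """Regular n x n grid of UV coordinates in the unit square, plus quad connectivity."""
    u, v = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n), indexing="ij")
    uv = np.stack([u.ravel(), v.ravel()], axis=1)              # (n*n, 2)
    idx = np.arange(n * n).reshape(n, n)
    quads = np.stack(
        [idx[:-1, :-1], idx[1:, :-1], idx[1:, 1:], idx[:-1, 1:]], axis=-1
    ).reshape(-1, 4)                                           # ((n-1)^2, 4)
    return uv, quads

def decode_patch(uv, latent, weights):
    """Hypothetical decoder: each 2D point, conditioned on the latent code, maps to a 3D point."""
    w1, b1, w2, b2 = weights
    inputs = np.concatenate([uv, np.tile(latent, (uv.shape[0], 1))], axis=1)
    hidden = np.maximum(inputs @ w1 + b1, 0.0)                 # ReLU
    return np.tanh(hidden @ w2 + b2)

rng = np.random.default_rng(0)
latent_dim, hidden_dim = 16, 32
weights = (rng.normal(size=(2 + latent_dim, hidden_dim)), np.zeros(hidden_dim),
           rng.normal(size=(hidden_dim, 3)), np.zeros(3))
latent = rng.normal(size=latent_dim)

uv, quads = make_grid(10)
vertices = decode_patch(uv, latent, weights)   # one 3D vertex per grid point
```

The mesh connectivity comes entirely from the 2D grid, and each vertex keeps its `uv` coordinate, so the same array can serve as texture coordinates.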

(49)

• Condition decoded points on 2D patches

AtlasNet for Surface Generation

49

Latent representation can be inferred from images or point clouds

Texture coordinates come for free!

(50)

• Image-based


• Volumetric


• Surface-based

PROS: parameterize + image networks (intrinsic representation)

CONS: suffers from parameterisation artefacts (local versus global distortion), requires a good-quality mesh


• Point-based

Representation for 3D

50

(51)

• Image-based


• Volumetric


• Surface-based


Point-based

Representation for 3D

51

(52)

Common representation: native representation


Easy to obtain from meshes, depth scans, laser scans

Representation for 3D: Point-based

52

(53)

Common representation


Easy to obtain from meshes, depth scans, laser scans


Unstructured (e.g., any permutation of the points gives the same shape!)

In Original Representation

[Qi et al. 2017] 53

(54)

• Permutation-invariant functions

PointNet for Point Cloud Analysis

54

[Qi et al. 2017]

(55)

• Permutation-invariant functions

Use MLPs (h) and max-pooling (g) as simple symmetric functions

PointNet for Point Cloud Analysis

55

[Qi et al. 2017]
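The symmetric-function idea can be sketched in a few lines: apply the same MLP h to every point, then aggregate with max-pooling g. The layer sizes and random weights below are placeholders, not the published architecture.

```python
import numpy as np

def h(points, w1, w2):
    """Shared per-point MLP: the same weights are applied to every point."""
    hidden = np.maximum(points @ w1, 0.0)
    return np.maximum(hidden @ w2, 0.0)

def g(features):
    """Max-pooling over the point axis, a symmetric function."""
    return features.max(axis=0)

def global_feature(points, w1, w2):
    return g(h(points, w1, w2))

rng = np.random.default_rng(0)
w1 = rng.normal(size=(3, 16))     # placeholder weights
w2 = rng.normal(size=(16, 32))
points = rng.normal(size=(100, 3))

feature = global_feature(points, w1, w2)
permuted = points[rng.permutation(len(points))]
assert np.allclose(feature, global_feature(permuted, w1, w2))
```

Because h acts on each point independently and g ignores order, reordering the input points leaves the global feature unchanged.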

(56)

PointNet Architecture

[Qi et al. 2017] 56

(57)

PointNet for Point Cloud Analysis

57

(58)

PointNet++

[Qi et al. 2018] 58

(59)

• Multi-scale version

PCPNet for Local Point Cloud Analysis

[Guerrero et al. 2018] 59

(60)

PCPNet Architecture

60

(61)

• Often the generated output needs to be compared to some true shape

PointNet for Point Cloud Synthesis

[Su et al. 2017] 61

Earth Mover's Distance as loss function
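For two point sets of equal size, the Earth Mover's Distance can be computed as the cost of the best one-to-one matching. The sketch below solves that matching exactly with SciPy's `linear_sum_assignment`; averaging over points is a normalization choice made here. It is meant to illustrate the definition, not to serve as a training-ready loss.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def earth_movers_distance(a, b):
    """Mean distance under the optimal one-to-one matching of two (N, 3) point sets."""
    cost = cdist(a, b)                          # (N, N) pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost)    # optimal bijection between the sets
    return float(cost[rows, cols].mean())

rng = np.random.default_rng(0)
generated = rng.normal(size=(64, 3))
target_same = generated[rng.permutation(64)]            # same shape, different point order
target_shifted = generated + np.array([0.0, 0.0, 1.0])  # same shape, translated by one unit

print(earth_movers_distance(generated, target_same))
print(earth_movers_distance(generated, target_shifted))
```

The distance is zero for a reordered copy of the same set, which matches the unordered nature of point clouds.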
