• No results found

Heterogeneous Spline Surface Intersections

N/A
N/A
Protected

Academic year: 2022

Share "Heterogeneous Spline Surface Intersections"

Copied!
22
0
0

Laster.... (Se fulltekst nå)

Fulltekst

(1)

Heterogeneous Spline Surface Intersections

Sverre Briseid (sbr@sintef.no) Trond Hagen (trr@sintef.no)

Geometric and Physical Modeling 2009, San Francisco

(2)

ICT 2

Outline

 Heterogeneous architectures

 OpenMP & CUDA

 Spline surface intersection/self-intersection

 Multi-core approach

 Results

 Conclusions

(3)

Heterogeneous architectures

 More than one type of architecture in a system

 Multi-core CPU

 Specialized accelerated cores

 Different programming models

 Sequential algorithms a bottleneck

 Split problem into independent task, run in parallel

 Utilize the strengths of the different architectures

(4)

ICT 4

Multi-core CPU & OpenMP

int i, m=10, N=1000

double A[N], B[N], C[N];

#pragma omp parallel for for (i=0; i<N; i++) {

A[i] = B[i] + m*C[i];

}

 2-4 cores in modern desktop computers

 Requires parallel algorithms

 OpenMP

API for shared memory parallel programming

C/C++

Compiler pragmas

Easy syntax

(5)

GPU & CUDA

 GPU (Graphics Processing Unit)

All modern computers has one

Massively parallel – Up to 500 cores

Computational power: Up to 2 teraflops

32-bit precision at full speed – 64-bit precision at half speed

 CUDA

API for using NVIDIA graphics cards

GPU computing for the masses

Syntax based on C/C++

Computational kernels

(6)

ICT 6

Spline surface

 Parametric

 Controlled by a regular

polygon mesh

(7)

Self-intersection - Singularities

(8)

ICT 8

Transversal and tangential

intersections

(9)

Multi-core approach

Zoom in on problematic areas using parallel resources, analyze

Let the CPU trace out the intersection curves in a sequential manner

Overlap-test

Massive uniform subdivision down to Bezier level. Level n => 2^n Bezier segments in each parameter direction

Create axis aligned bounding boxes

Box-box overlap-test

(10)

ICT 10

 Intersection-analysis

 Subdivide normal surface to the same level as the surface

 Check if sub-patches contain the origin

 Create direction cones for the bezier normal patches

 Check if cone span is less than pi => no self-intersection

 Check all pairs of normal cones whether they overlap =>

possibly a tangential intersection, given that bounding boxes overlap

(11)

The intersection-test modules

1.Spline surface refinement

Localize the possible intersections 2.Bounding box generation

Axis aligned boxes containing the Bezier subpatches 3.Box-box overlap-test

See if two Bezier subpatches may overlap 4.Normal surface refinement

Refine to the same level as the spline surface 5.Degeneracy-test

Check if bounding boxes of refined normal surface contain the origin 6.Normal cone generation

Compute the span of the normals for each Bezier subpatch 7.Cone-cone overlap-test

Check if we may have a tangential intersection

(12)

ICT 12

Speedups for subdivision levels 5-8 Input: Cubic Bezier surface &

corresponding quintic normal surface

0 1 2 3 4 5 6 7 8

0 cores 1 core 2 cores 3 cores 4 cores GPU

Level 5 Level 6 Level 7 Level 8

(13)

Kernel speedups – subdivision level 8

Kernel 1 – Surface refinement

(14)

ICT 14

Kernel speedups – subdivision level 8

Kernel 2 – Bounding box generation

(15)

Kernel speedups – subdivision level 8

Kernel 3 – Box-box overlap test

(16)

ICT 16

Kernel speedups – subdivision level 8

Kernel 4 – Normal surface refinement

(17)

Kernel speedups – subdivision level 8

Kernel 5 – Normal surface degeneracy test

(18)

ICT 18

Kernel speedups – subdivision level 8

Kernel 6 – Normal cone generation

(19)

Kernel speedups – subdivision level 8

Kernel 7 – Cone-cone overlap test

(20)

ICT 20

Pipeline – Heterogeneous parallelization

Surface

Surface refinement

Bounding box generation

Normal surface

Normal surface refinement

Degeneracy test Normal cone generation

Cone-cone overlap test

1

Box-box overlap test

2

3

4

5 6

7

(21)

Conclusions

 Heterogeneous intersections a good idea?

It does seems like it

 What about the algorithmic approach?

Well suited for difficult cases

Scales well on the CPU for most of the kernels

Good speedup on the GPU

Parallel pipeline allows load balancing between CPU & GPU

 Is the algorithm futureproof?

Future processors will get even more parallel

Faster CPU-GPU inter-communication reduces overhead

Heterogeneous algorithms will get even more important

(22)

ICT 22

Thank you for your attention!

Questions?

Referanser

RELATERTE DOKUMENTER

There had been an innovative report prepared by Lord Dawson in 1920 for the Minister of Health’s Consultative Council on Medical and Allied Services, in which he used his

However, since each new vertex of a normal mesh lies in a direction normal to the local surface defined using the coarse scale vertices, the wavelet coefficients of a majority of

We constrain the surface normals at each image location to fall on an irradiance cone whose axis is the light source direction and whose apex angle is determined by the measured

Initial viscosity results are shown in Table 6 and Figure 6, and the relation between initial Marsh Cone and viscosity values is shown in Figure 7.. Figure 7 Viscosity after 15

The speedup factors in Figure 11 show that the multi-core CPU is a good candidate for running module 2 (bounding box construction), module 5 (degeneracy test) and module 7 (cone

Examine if tip (rotor) speed of a vertical-shaft impactor (VSI) and crusher type for the last crushing stage (VSI or cone crusher) affect grading, shape, surface texture and

Figure  4  shows f s (corrected for temperature zero shifts for the cone types mentioned above) and f t vs depth. For all cones except cone 6 the pore pressure correction is

NGI [13] documents the flow cone feasibility study, in which constant head tests with the flow cone tool were simulated using finite elements (FE). The axisymmetric FE model