Tutorial: Tensor Approximation in Visualization and Computer Graphics
Tensor Decomposition Models
Renato Pajarola, Susanne K. Suter, and Roland Ruiters
Data Reduction and Approximation
• A fundamental concept of data reduction is to remove redundant and irrelevant information while preserving the relevant features
‣ e.g. through frequency analysis by projection onto pre-defined bases, or extraction of data-intrinsic principal components
– identify spatio-temporal and frequency redundancies
‣ maintain strongest and most significant signal components
• Data reduction is linked to the concepts and techniques of data compression, noise reduction, as well as feature extraction and recognition
8
Data Approximation using SVD
• The Singular Value Decomposition (SVD) is the standard tool for matrices, i.e., 2D input datasets
‣ see also principal component analysis (PCA)
[Figure: matrix SVD A = U Σ Vᵀ of an M×N matrix A (M rows, N columns): U is M×N (left singular vectors, column space), Σ is N×N diagonal (singular values), Vᵀ is N×N (right singular vectors, row space); a second diagram shows the truncated form Ã = U Σ Vᵀ keeping only the first R singular values.]
9
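As a minimal illustration (not part of the original slides), the full SVD and its factor shapes in NumPy:

```python
import numpy as np

# Hypothetical M x N data matrix standing in for a 2D input dataset.
M, N = 6, 4
A = np.random.rand(M, N)

# Thin SVD: A = U @ diag(s) @ Vt, with singular values sorted s[0] >= s[1] >= ... >= 0.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

print(U.shape, s.shape, Vt.shape)            # (6, 4) (4,) (4, 4)
print(np.allclose(A, U @ np.diag(s) @ Vt))   # True: exact reconstruction
```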
Low-rank Approximation
10
• Exploit the ordered singular values: s1 ≥ s2 ≥ … ≥ sN
• Select the first R singular values (rank reduction)
‣ use only the bases (singular vectors) of the corresponding subspace
[Figure: truncated SVD Ã = U Σ Vᵀ, keeping the first R columns of U and V and the leading R×R block of Σ.]
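A short NumPy sketch of the rank reduction (illustrative sizes; by the Eckart-Young theorem the Frobenius error equals the energy of the discarded singular values):

```python
import numpy as np

M, N, R = 6, 4, 2                       # keep only the R largest singular values
A = np.random.rand(M, N)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-R approximation: first R singular vectors/values only.
A_r = U[:, :R] @ np.diag(s[:R]) @ Vt[:R, :]

err = np.linalg.norm(A - A_r)           # Frobenius norm of the residual
print(err, np.sqrt(np.sum(s[R:] ** 2))) # both values agree
```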
Matrix SVD Properties
11
• Matrix SVD
‣ rank reducibility
‣ orthonormal row/column basis matrices
[Figure: A = U Σ Vᵀ with orthonormal U (M×N) and Vᵀ (N×N) and diagonal Σ (N×N).]
What is a Tensor?
• Data sets are often multidimensional arrays (tensors)
‣ images, image collections, video, volume data etc.
12
[Figure: tensors of increasing order: a 0-order tensor (scalar a), a 1st-order tensor (vector a of length I1, i1 = 1, …, I1), a 2nd-order tensor (matrix A of size I1×I2, i2 = 1, …, I2), and a 3rd-order tensor (volume A of size I1×I2×I3, i3 = 1, …, I3).]
Fibers and Slices
13
• Individual elements of a vector a are given by ai1, of a matrix A by ai1,i2, and of a tensor A by ai1,i2,i3
• The generalization of rows, columns (and tubes) is a fiber in a particular mode
• Two-dimensional sections of a tensor are called slices
‣ frontal, horizontal and lateral slices for a 3rd-order tensor A
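A small NumPy sketch (with a made-up 3 × 4 × 5 tensor) showing fibers and slices by fixing indices; the slice naming follows the common convention for 3rd-order tensors:

```python
import numpy as np

# Hypothetical 3rd-order tensor of size I1 x I2 x I3 = 3 x 4 x 5.
A = np.arange(3 * 4 * 5).reshape(3, 4, 5)

# Fibers: fix all indices but one.
col_fiber  = A[:, 0, 0]   # mode-1 fiber ("column"), length I1
row_fiber  = A[0, :, 0]   # mode-2 fiber ("row"),    length I2
tube_fiber = A[0, 0, :]   # mode-3 fiber ("tube"),   length I3

# Slices: fix exactly one index.
horizontal = A[0, :, :]   # horizontal slice, I2 x I3
lateral    = A[:, 0, :]   # lateral slice,    I1 x I3
frontal    = A[:, :, 0]   # frontal slice,    I1 x I2
```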
Unfolding and Ranks
14
• Operations with tensors are often performed as matrix operations using unfolded tensor representations
‣ different tensor unfolding strategies are possible
• Forward cyclic unfolding A(n) of a 3rd-order tensor A (or 3D volume)
• The n-rank of a tensor is typically defined on an unfolding
‣ n-rank Rn = rankn(A) = rank(A(n))
‣ multilinear rank-(R1, R2, …, RN) of A
[Figure: forward cyclic unfoldings A(1), A(2), A(3) of an I1×I2×I3 tensor A, each arranging the mode-n fibers as the columns of a matrix.]
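A NumPy sketch of a mode-n unfolding and the resulting n-ranks. The mode-n fibers become the columns of A(n); the exact column ordering (e.g. forward cyclic as on this slide) is a convention, and the sketch below simply uses NumPy's default ordering:

```python
import numpy as np

def unfold(A, n):
    """Mode-n unfolding: bring mode n to the front, then flatten the remaining modes.

    The mode-n fibers of A become the columns of the result; the column ordering
    here is NumPy's default and may differ from the forward cyclic convention.
    """
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

A = np.random.rand(3, 4, 5)

# n-rank: the matrix rank of the mode-n unfolding.
n_ranks = [np.linalg.matrix_rank(unfold(A, n)) for n in range(A.ndim)]
print(n_ranks)   # multilinear rank (R1, R2, R3), here at most (3, 4, 5)
```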
Rank-one Tensor
15
• An N-mode tensor A ∈ ℝI1×…×IN that can be expressed as the outer product of N vectors
‣ Kruskal tensor
• Useful to understand the principles of rank-reduced tensor reconstruction
‣ linear combination of rank-one tensors
A = b(1) ∘ b(2) ∘ ··· ∘ b(N)
[Figure: a 3rd-order rank-one tensor A of size I1×I2×I3 formed as the outer product b(1) ∘ b(2) ∘ b(3).]
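A rank-one 3rd-order tensor built as an outer product in NumPy (illustrative vector lengths); every mode-n unfolding of such a tensor has matrix rank 1:

```python
import numpy as np

# Three hypothetical factor vectors of lengths I1, I2, I3.
b1, b2, b3 = np.random.rand(3), np.random.rand(4), np.random.rand(5)

# Rank-one tensor A = b1 o b2 o b3 (outer product), size 3 x 4 x 5.
A = np.einsum('i,j,k->ijk', b1, b2, b3)

# Every mode-n unfolding of a rank-one tensor has matrix rank 1.
print([np.linalg.matrix_rank(np.moveaxis(A, n, 0).reshape(A.shape[n], -1))
       for n in range(3)])   # [1, 1, 1]
```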
Tensor Decomposition Models
16
Tucker
• Three-mode factor analysis (3MFA/Tucker3) [Tucker, 1964+1966]
• Higher-order SVD (HOSVD) [De Lathauwer et al., 2000a]
[Figure: Tucker decomposition of an I1×I2×I3 tensor A into a core tensor B of size R1×R2×R3 and basis matrices U(1) (I1×R1), U(2) (I2×R2), U(3) (I3×R3).]
CP
• PARAFAC (parallel factors) [Harshman, 1970]
• CANDECOMP (CAND) (canonical decomposition) [Carroll & Chang, 1970]
[Figure: CP decomposition of A into a superdiagonal R×R×R core of coefficients and factor matrices U(1) (I1×R), U(2) (I2×R), U(3) (I3×R).]
Tucker Model
17
• A higher-order tensor A ∈ ℝI1×…×IN is represented as the product of a core tensor B ∈ ℝR1×…×RN and N factor matrices U(n) ∈ ℝIn×Rn
‣ using n-mode products ×n
A = B ×1 U(1) ×2 U(2) ×3 ··· ×N U(N) + e
[Figure: A ≈ B ×1 U(1) ×2 U(2) ×3 U(3) plus an error tensor e, with core B of size R1×R2×R3 and factor matrices U(n) of size In×Rn.]
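A self-contained NumPy sketch (hypothetical sizes and random factors) of reconstructing a 3rd-order tensor from a Tucker core and factor matrices via successive n-mode products:

```python
import numpy as np

def mode_n_product(B, U, n):
    """n-mode product B x_n U: multiply U onto mode n of tensor B."""
    Bn = np.moveaxis(B, n, 0)                        # bring mode n to the front
    out = np.tensordot(U, Bn, axes=(1, 0))           # contract over mode n
    return np.moveaxis(out, 0, n)                    # move the new mode back

I, R = (6, 7, 8), (2, 3, 4)                          # hypothetical sizes and ranks
B  = np.random.rand(*R)                              # core tensor
Us = [np.random.rand(I[n], R[n]) for n in range(3)]  # factor matrices U(n)

A = B
for n, U in enumerate(Us):
    A = mode_n_product(A, U, n)                      # A = B x1 U(1) x2 U(2) x3 U(3)
print(A.shape)                                       # (6, 7, 8)
```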
CANDECOMP-PARAFAC Model
18
• Canonical decomposition or parallel factor analysis model (CP)
• A higher-order tensor A is factorized into a sum of rank-one tensors
‣ normalized column vectors ur(n) define the factor matrices U(n) ∈ ℝIn×R, weighted by the factors λr
A = Σr=1..R λr · ur(1) ∘ ur(2) ∘ ··· ∘ ur(N) + e
[Figure: A ≈ superdiagonal core diag(λ1, …, λR) combined with factor matrices U(1), U(2), U(3), plus an error tensor e.]
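The corresponding CP reconstruction can be written as a single contraction over the weights and factor matrices (again with made-up sizes):

```python
import numpy as np

I, R = (6, 7, 8), 3                                  # hypothetical sizes and CP rank
lam = np.random.rand(R)                              # weights lambda_r
Us  = [np.random.rand(In, R) for In in I]            # factor matrices U(n), columns u_r(n)

# A = sum_r lambda_r * u_r(1) o u_r(2) o u_r(3), written as one contraction.
A = np.einsum('r,ir,jr,kr->ijk', lam, Us[0], Us[1], Us[2])
print(A.shape)                                       # (6, 7, 8)
```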
Linear Combination of Rank-one Tensors
19
• The CP model is defined as a linear combination of rank-one tensors
A = Σr=1..R λr · ur(1) ∘ ur(2) ∘ ··· ∘ ur(N) + e
[Figure: A ≈ λ1 · u1(1) ∘ u1(2) ∘ u1(3) + λ2 · … + λR · uR(1) ∘ uR(2) ∘ uR(3) + e.]
• The Tucker model can also be interpreted as a linear combination of rank-one tensors
A = Σr1=1..R1 Σr2=1..R2 ··· ΣrN=1..RN br1r2…rN · ur1(1) ∘ ur2(2) ∘ ··· ∘ urN(N) + e
[Figure: A ≈ br1r2r3 · ur1(1) ∘ ur2(2) ∘ ur3(3) + … + bR1R2R3 · uR1(1) ∘ uR2(2) ∘ uR3(3) + e.]
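The rank-one view of the Tucker model maps directly onto a single contraction: the einsum below is exactly the sum over all core entries br1r2r3 of rank-one tensors, and it yields the same result as the n-mode product form on the previous slides (sizes are hypothetical):

```python
import numpy as np

R, I = (2, 3, 4), (5, 6, 7)                           # hypothetical core and tensor sizes
B  = np.random.rand(*R)                               # Tucker core tensor
Us = [np.random.rand(I[n], R[n]) for n in range(3)]   # factor matrices U(n)

# Sum over all core entries b_{r1 r2 r3} of rank-one tensors u_{r1}(1) o u_{r2}(2) o u_{r3}(3);
# this is identical to the n-mode product reconstruction B x1 U(1) x2 U(2) x3 U(3).
A = np.einsum('abc,ia,jb,kc->ijk', B, Us[0], Us[1], Us[2])
print(A.shape)                                        # (5, 6, 7)
```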
CP a Special Case of Tucker
20
[Figure: the CP model drawn as a Tucker model with a superdiagonal R×R×R core (left) next to the general Tucker model with a dense R1×R2×R3 core B (right); in both cases Ã is approximated via the factor matrices U(1), U(2), U(3).]
Generalizations
21
• Any special form of the core tensor and corresponding factor matrices can be used
‣ e.g. blocks along the diagonal
[Figure: block-diagonal core with blocks B1, …, BP combined with partitioned factor matrices U(1)1 … U(1)P, U(2)1 … U(2)P, U(3)1 … U(3)P, approximating A up to an error tensor e.]
Reduced Rank Approximation
22
• Full reconstruction using a Tucker or CP model may require excessively many coefficients and wide factor matrices
‣ large rank values R (CP), or R1, R2, …, RN (Tucker)
• The quality of the approximation increases with the rank, i.e., with the number of column vectors of the factor matrices
‣ the best possible fit of these basis matrices is discussed later
[Figure: rank-reduced approximations Ã ≈ B ×1 U(1) ×2 U(2) ×3 U(3) (Tucker, core of size R1×R2×R3) and Ã ≈ superdiagonal core diag(λ1, …, λR) with factor matrices U(n) (CP, rank R).]
Rank-R Approximation
23
• Approximation of a tensor as a linear combination of rank-one tensors using a limited number R of terms
‣ CP model of limited rank R
Ã = Σr=1..R λr · ur(1) ∘ ur(2) ∘ ··· ∘ ur(N)
[Figure: rank-R approximation Ã expressed with a superdiagonal core and factor matrices U(1), U(2), U(3).]
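Such a rank-R approximation is usually computed with alternating least squares (ALS, introduced on the Best Rank Approximation slide below). A minimal CP-ALS sketch for a 3rd-order tensor in NumPy, assuming random initialization and a fixed number of sweeps; the normalization of columns into weights λr is omitted for brevity:

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Khatri-Rao product (row ordering matches the C-order unfolding below)."""
    R = B.shape[1]
    return np.einsum('jr,kr->jkr', B, C).reshape(-1, R)

def cp_als(X, R, n_iter=50):
    """Rank-R CP approximation of a 3rd-order tensor via alternating least squares."""
    I, J, K = X.shape
    A = np.random.rand(I, R)
    B = np.random.rand(J, R)
    C = np.random.rand(K, R)
    X1 = X.reshape(I, -1)                        # mode-1 unfolding
    X2 = np.moveaxis(X, 1, 0).reshape(J, -1)     # mode-2 unfolding
    X3 = np.moveaxis(X, 2, 0).reshape(K, -1)     # mode-3 unfolding
    for _ in range(n_iter):
        # Each factor matrix is the least-squares solution given the other two.
        A = X1 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)  # rank-R reconstruction
    return A, B, C, X_hat

X = np.random.rand(8, 9, 10)
A, B, C, X_hat = cp_als(X, R=3)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))   # relative Frobenius error
```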
Rank-(R1, R2, …, RN) Approximation
24
• Decomposition into a tensor with reduced, lower multilinear rank-(R1, R2, …, RN)
‣ rankn(Ã) = Rn ≤ rankn(A) = rank(A(n))
• n-mode products of factor matrices and core tensor in a given reduced rank space
‣ Tucker model with limited ranks Ri
Ã = B ×1 U(1) ×2 U(2) ×3 ··· ×N U(N)
[Figure: Ã ≈ B ×1 U(1) ×2 U(2) ×3 U(3) with a reduced core B of size R1×R2×R3.]
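One concrete way to obtain such a rank-(R1, R2, R3) approximation is the truncated HOSVD, sketched below in NumPy with illustrative sizes (the best possible fit is the topic of the next slide):

```python
import numpy as np

def unfold(A, n):
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

def mode_n_product(B, U, n):
    return np.moveaxis(np.tensordot(U, np.moveaxis(B, n, 0), axes=(1, 0)), 0, n)

A = np.random.rand(10, 12, 14)
ranks = (5, 6, 7)                       # target multilinear rank (R1, R2, R3)

# Truncated HOSVD: the leading Rn left singular vectors of each unfolding A(n).
Us = [np.linalg.svd(unfold(A, n), full_matrices=False)[0][:, :Rn]
      for n, Rn in enumerate(ranks)]

# Core B = A x1 U(1)^T x2 U(2)^T x3 U(3)^T, then reconstruct the approximation.
B = A
for n, U in enumerate(Us):
    B = mode_n_product(B, U.T, n)
A_hat = B
for n, U in enumerate(Us):
    A_hat = mode_n_product(A_hat, U, n)

print(B.shape, np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```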
Best Rank Approximation
25
• Rank-reduced approximation that minimizes a least-squares cost
‣ Ã = argminÃ ||A − Ã||²F
• Alternating least squares (ALS): an iterative algorithm that converges to a minimum of the approximation error in the Frobenius norm ||…||F
‣ rotation of components in the basis matrices
Ã = B ×1 U(1) ×2 U(2) ×3 U(3)
[Figure: Ã ≈ B ×1 U(1) ×2 U(2) ×3 U(3) with core B of size R1×R2×R3; typical high-quality data reduction: Rk ≤ Ik / 2.]
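A minimal NumPy sketch of this ALS scheme for the Tucker case (higher-order orthogonal iteration, HOOI), assuming HOSVD initialization and a fixed number of sweeps instead of a convergence test:

```python
import numpy as np

def unfold(A, n):
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

def mode_n_product(B, U, n):
    return np.moveaxis(np.tensordot(U, np.moveaxis(B, n, 0), axes=(1, 0)), 0, n)

def hooi(A, ranks, n_iter=20):
    """Rank-(R1,...,RN) Tucker approximation via alternating least squares (HOOI)."""
    N = A.ndim
    # Initialize factor matrices with the truncated HOSVD bases.
    Us = [np.linalg.svd(unfold(A, n), full_matrices=False)[0][:, :ranks[n]]
          for n in range(N)]
    for _ in range(n_iter):
        for n in range(N):
            # Project A onto all other factor subspaces, then update U(n)
            # as the leading left singular vectors of the mode-n unfolding.
            Y = A
            for m in range(N):
                if m != n:
                    Y = mode_n_product(Y, Us[m].T, m)
            Us[n] = np.linalg.svd(unfold(Y, n), full_matrices=False)[0][:, :ranks[n]]
    # Core and reconstruction.
    B = A
    for n in range(N):
        B = mode_n_product(B, Us[n].T, n)
    A_hat = B
    for n in range(N):
        A_hat = mode_n_product(A_hat, Us[n], n)
    return B, Us, A_hat

A = np.random.rand(10, 12, 14)
B, Us, A_hat = hooi(A, (5, 6, 7))
print(np.linalg.norm(A - A_hat) / np.linalg.norm(A))   # Frobenius-norm fit
```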
Generalization of the Matrix SVD
26
[Figure: the matrix SVD A = U Σ Vᵀ generalizes to higher orders as the CP model (superdiagonal weights λ1, …, λR combined with factor matrices U(1), U(2), U(3), plus an error tensor e) and as the Tucker model (core B combined with U(1), U(2), U(3), plus e).]