Tutorial: Tensor Approximation in Visualization and Computer Graphics
Tensor Decomposition Models
Renato Pajarola, Susanne K. Suter, and Roland Ruiters
Data Reduction and Approximation
• A fundamental concept of data reduction is to remove redundant and irrelevant information while preserving the relevant features
‣ e.g. through frequency analysis by projection onto pre-defined bases, or extraction of data-intrinsic principal components
– identify spatio-temporal and frequency redundancies
‣ maintain strongest and most significant signal components
• Data reduction is linked to the concepts and techniques of data compression, noise reduction, as well as feature extraction and recognition
8
Data Approximation using SVD
• The Singular Value Decomposition (SVD) is the standard tool for matrices, i.e., 2D input datasets
‣ see also principal component analysis (PCA)
[Figure: matrix SVD A = U Σ Vᵀ of an M×N matrix A (M rows, N columns): U is M×N (left singular vectors, column space), Σ is N×N diagonal (singular values), Vᵀ is N×N (right singular vectors, row space); a second diagram shows the truncated form Ã = U Σ Vᵀ keeping only the first R singular values.]
9
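As a minimal illustration (not part of the original slides), the full SVD and its factor shapes in NumPy:

```python
import numpy as np

# Hypothetical M x N data matrix standing in for a 2D input dataset.
M, N = 6, 4
A = np.random.rand(M, N)

# Thin SVD: A = U @ diag(s) @ Vt, with singular values sorted s[0] >= s[1] >= ... >= 0.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

print(U.shape, s.shape, Vt.shape)            # (6, 4) (4,) (4, 4)
print(np.allclose(A, U @ np.diag(s) @ Vt))   # True: exact reconstruction
```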
Low-rank Approximation
10
• Exploit the ordered singular values: s1 ≥ s2 ≥ … ≥ sN
• Select the first R singular values (rank reduction)
‣ use only the bases (singular vectors) of the corresponding subspace
[Figure: truncated SVD Ã = U Σ Vᵀ, keeping the first R columns of U and V and the leading R×R block of Σ.]
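A short NumPy sketch of the rank reduction (illustrative sizes; by the Eckart-Young theorem the Frobenius error equals the energy of the discarded singular values):

```python
import numpy as np

M, N, R = 6, 4, 2                       # keep only the R largest singular values
A = np.random.rand(M, N)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-R approximation: first R singular vectors/values only.
A_r = U[:, :R] @ np.diag(s[:R]) @ Vt[:R, :]

err = np.linalg.norm(A - A_r)           # Frobenius norm of the residual
print(err, np.sqrt(np.sum(s[R:] ** 2))) # both values agree
```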
Matrix SVD Properties
11
• Matrix SVD
‣ rank reducibility
‣ orthonormal row/column basis matrices
[Figure: A = U Σ Vᵀ with orthonormal U (M×N) and Vᵀ (N×N) and diagonal Σ (N×N).]
What is a Tensor?
• Data sets are often multidimensional arrays (tensors)
‣ images, image collections, video, volume data etc.
12
[Figure: tensors of increasing order: a 0-order tensor (scalar a), a 1st-order tensor (vector a of length I1, i1 = 1, …, I1), a 2nd-order tensor (matrix A of size I1×I2, i2 = 1, …, I2), and a 3rd-order tensor (volume A of size I1×I2×I3, i3 = 1, …, I3).]
Fibers and Slices
13
• Individual elements of a vector a are given by ai1, of a matrix A by ai1,i2, and of a tensor A by ai1,i2,i3
• The generalization of rows, columns (and tubes) is a fiber in a particular mode
• Two-dimensional sections of a tensor are called slices
‣ frontal, horizontal and lateral slices for a 3rd-order tensor A
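A small NumPy sketch (with a made-up 3 × 4 × 5 tensor) showing fibers and slices by fixing indices; the slice naming follows the common convention for 3rd-order tensors:

```python
import numpy as np

# Hypothetical 3rd-order tensor of size I1 x I2 x I3 = 3 x 4 x 5.
A = np.arange(3 * 4 * 5).reshape(3, 4, 5)

# Fibers: fix all indices but one.
col_fiber  = A[:, 0, 0]   # mode-1 fiber ("column"), length I1
row_fiber  = A[0, :, 0]   # mode-2 fiber ("row"),    length I2
tube_fiber = A[0, 0, :]   # mode-3 fiber ("tube"),   length I3

# Slices: fix exactly one index.
horizontal = A[0, :, :]   # horizontal slice, I2 x I3
lateral    = A[:, 0, :]   # lateral slice,    I1 x I3
frontal    = A[:, :, 0]   # frontal slice,    I1 x I2
```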
Unfolding and Ranks
14
• Operations with tensors are often performed as matrix operations using unfolded tensor representations
‣ different tensor unfolding strategies are possible
• Forward cyclic unfolding A(n) of a 3rd-order tensor A (or 3D volume)
• The n-rank of a tensor is typically defined on an unfolding
‣ n-rank Rn = rankn(A) = rank(A(n))
‣ multilinear rank-(R1, R2, …, RN) of A
[Figure: forward cyclic unfoldings A(1), A(2), A(3) of an I1×I2×I3 tensor A, each arranging the mode-n fibers as the columns of a matrix.]
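A NumPy sketch of a mode-n unfolding and the resulting n-ranks. The mode-n fibers become the columns of A(n); the exact column ordering (e.g. forward cyclic as on this slide) is a convention, and the sketch below simply uses NumPy's default ordering:

```python
import numpy as np

def unfold(A, n):
    """Mode-n unfolding: bring mode n to the front, then flatten the remaining modes.

    The mode-n fibers of A become the columns of the result; the column ordering
    here is NumPy's default and may differ from the forward cyclic convention.
    """
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

A = np.random.rand(3, 4, 5)

# n-rank: the matrix rank of the mode-n unfolding.
n_ranks = [np.linalg.matrix_rank(unfold(A, n)) for n in range(A.ndim)]
print(n_ranks)   # multilinear rank (R1, R2, R3), here at most (3, 4, 5)
```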
Rank-one Tensor
15
• An N-mode tensor A ∈ ℝI1×…×IN that can be expressed as the outer product of N vectors
‣ Kruskal tensor
• Useful to understand the principles of rank-reduced tensor reconstruction
‣ linear combination of rank-one tensors
A = b(1) ∘ b(2) ∘ ··· ∘ b(N)
[Figure: a 3rd-order rank-one tensor A of size I1×I2×I3 formed as the outer product b(1) ∘ b(2) ∘ b(3).]
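A rank-one 3rd-order tensor built as an outer product in NumPy (illustrative vector lengths); every mode-n unfolding of such a tensor has matrix rank 1:

```python
import numpy as np

# Three hypothetical factor vectors of lengths I1, I2, I3.
b1, b2, b3 = np.random.rand(3), np.random.rand(4), np.random.rand(5)

# Rank-one tensor A = b1 o b2 o b3 (outer product), size 3 x 4 x 5.
A = np.einsum('i,j,k->ijk', b1, b2, b3)

# Every mode-n unfolding of a rank-one tensor has matrix rank 1.
print([np.linalg.matrix_rank(np.moveaxis(A, n, 0).reshape(A.shape[n], -1))
       for n in range(3)])   # [1, 1, 1]
```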
Tensor Decomposition Models
16
Tucker
• Three-mode factor analysis (3MFA/Tucker3) [Tucker, 1964+1966]
• Higher-order SVD (HOSVD) [De Lathauwer et al., 2000a]
[Figure: Tucker decomposition of an I1×I2×I3 tensor A into a core tensor B of size R1×R2×R3 and basis matrices U(1) (I1×R1), U(2) (I2×R2), U(3) (I3×R3).]
CP
• PARAFAC (parallel factors) [Harshman, 1970]
• CANDECOMP (CAND) (canonical decomposition) [Carroll & Chang, 1970]
[Figure: CP decomposition of A into a superdiagonal R×R×R core of coefficients and factor matrices U(1) (I1×R), U(2) (I2×R), U(3) (I3×R).]
Tucker Model
17
• A higher-order tensor A ∈ ℝI1×…×IN is represented as the product of a core tensor B ∈ ℝR1×…×RN and N factor matrices U(n) ∈ ℝIn×Rn
‣ using n-mode products ×n
A = B ×1 U(1) ×2 U(2) ×3 ··· ×N U(N) + e
[Figure: A ≈ B ×1 U(1) ×2 U(2) ×3 U(3) plus an error tensor e, with core B of size R1×R2×R3 and factor matrices U(n) of size In×Rn.]
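A self-contained NumPy sketch (hypothetical sizes and random factors) of reconstructing a 3rd-order tensor from a Tucker core and factor matrices via successive n-mode products:

```python
import numpy as np

def mode_n_product(B, U, n):
    """n-mode product B x_n U: multiply U onto mode n of tensor B."""
    Bn = np.moveaxis(B, n, 0)                        # bring mode n to the front
    out = np.tensordot(U, Bn, axes=(1, 0))           # contract over mode n
    return np.moveaxis(out, 0, n)                    # move the new mode back

I, R = (6, 7, 8), (2, 3, 4)                          # hypothetical sizes and ranks
B  = np.random.rand(*R)                              # core tensor
Us = [np.random.rand(I[n], R[n]) for n in range(3)]  # factor matrices U(n)

A = B
for n, U in enumerate(Us):
    A = mode_n_product(A, U, n)                      # A = B x1 U(1) x2 U(2) x3 U(3)
print(A.shape)                                       # (6, 7, 8)
```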
CANDECOMP-PARAFAC Model
18
• Canonical decomposition or parallel factor analysis model (CP)
• A higher-order tensor A is factorized into a sum of rank-one tensors
‣ normalized column vectors ur(n) define the factor matrices U(n) ∈ ℝIn×R, weighted by the factors λr
A = Σr=1..R λr · ur(1) ∘ ur(2) ∘ ··· ∘ ur(N) + e
[Figure: A ≈ superdiagonal core diag(λ1, …, λR) combined with factor matrices U(1), U(2), U(3), plus an error tensor e.]
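The corresponding CP reconstruction can be written as a single contraction over the weights and factor matrices (again with made-up sizes):

```python
import numpy as np

I, R = (6, 7, 8), 3                                  # hypothetical sizes and CP rank
lam = np.random.rand(R)                              # weights lambda_r
Us  = [np.random.rand(In, R) for In in I]            # factor matrices U(n), columns u_r(n)

# A = sum_r lambda_r * u_r(1) o u_r(2) o u_r(3), written as one contraction.
A = np.einsum('r,ir,jr,kr->ijk', lam, Us[0], Us[1], Us[2])
print(A.shape)                                       # (6, 7, 8)
```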
Linear Combination of Rank-one Tensors
19
• The CP model is defined as a linear combination of rank-one tensors
A = Σr=1..R λr · ur(1) ∘ ur(2) ∘ ··· ∘ ur(N) + e
[Figure: A ≈ λ1 · u1(1) ∘ u1(2) ∘ u1(3) + λ2 · … + λR · uR(1) ∘ uR(2) ∘ uR(3) + e.]
• The Tucker model can also be interpreted as a linear combination of rank-one tensors
A = Σr1=1..R1 Σr2=1..R2 ··· ΣrN=1..RN br1r2…rN · ur1(1) ∘ ur2(2) ∘ ··· ∘ urN(N) + e
[Figure: A ≈ br1r2r3 · ur1(1) ∘ ur2(2) ∘ ur3(3) + … + bR1R2R3 · uR1(1) ∘ uR2(2) ∘ uR3(3) + e.]
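The rank-one view of the Tucker model maps directly onto a single contraction: the einsum below is exactly the sum over all core entries br1r2r3 of rank-one tensors, and it yields the same result as the n-mode product form on the previous slides (sizes are hypothetical):

```python
import numpy as np

R, I = (2, 3, 4), (5, 6, 7)                           # hypothetical core and tensor sizes
B  = np.random.rand(*R)                               # Tucker core tensor
Us = [np.random.rand(I[n], R[n]) for n in range(3)]   # factor matrices U(n)

# Sum over all core entries b_{r1 r2 r3} of rank-one tensors u_{r1}(1) o u_{r2}(2) o u_{r3}(3);
# this is identical to the n-mode product reconstruction B x1 U(1) x2 U(2) x3 U(3).
A = np.einsum('abc,ia,jb,kc->ijk', B, Us[0], Us[1], Us[2])
print(A.shape)                                        # (5, 6, 7)
```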
CP a Special Case of Tucker
20
[Figure: the CP model drawn as a Tucker model with a superdiagonal R×R×R core (left) next to the general Tucker model with a dense R1×R2×R3 core B (right); in both cases Ã is approximated via the factor matrices U(1), U(2), U(3).]
Generalizations
21
• Any special form of the core tensor and corresponding factor matrices can be used
‣ e.g. blocks along the diagonal
[Figure: block-diagonal core with blocks B1, …, BP combined with partitioned factor matrices U(1)1 … U(1)P, U(2)1 … U(2)P, U(3)1 … U(3)P, approximating A up to an error tensor e.]
Reduced Rank Approximation
22
• Full reconstruction using a Tucker or CP model may require excessively many coefficients and wide factor matrices
‣ large rank values R (CP), or R1, R2, …, RN (Tucker)
• The quality of the approximation increases with the rank, i.e., with the number of column vectors of the factor matrices
‣ the best possible fit of these basis matrices is discussed later
[Figure: rank-reduced approximations Ã ≈ B ×1 U(1) ×2 U(2) ×3 U(3) (Tucker, core of size R1×R2×R3) and Ã ≈ superdiagonal core diag(λ1, …, λR) with factor matrices U(n) (CP, rank R).]
Rank-R Approximation
23
• Approximation of a tensor as a linear combination of rank-one tensors using a limited number R of terms
‣ CP model of limited rank R
Ã = Σr=1..R λr · ur(1) ∘ ur(2) ∘ ··· ∘ ur(N)
[Figure: rank-R approximation Ã expressed with a superdiagonal core and factor matrices U(1), U(2), U(3).]
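Such a rank-R approximation is usually computed with alternating least squares (ALS, introduced on the Best Rank Approximation slide below). A minimal CP-ALS sketch for a 3rd-order tensor in NumPy, assuming random initialization and a fixed number of sweeps; the normalization of columns into weights λr is omitted for brevity:

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Khatri-Rao product (row ordering matches the C-order unfolding below)."""
    R = B.shape[1]
    return np.einsum('jr,kr->jkr', B, C).reshape(-1, R)

def cp_als(X, R, n_iter=50):
    """Rank-R CP approximation of a 3rd-order tensor via alternating least squares."""
    I, J, K = X.shape
    A = np.random.rand(I, R)
    B = np.random.rand(J, R)
    C = np.random.rand(K, R)
    X1 = X.reshape(I, -1)                        # mode-1 unfolding
    X2 = np.moveaxis(X, 1, 0).reshape(J, -1)     # mode-2 unfolding
    X3 = np.moveaxis(X, 2, 0).reshape(K, -1)     # mode-3 unfolding
    for _ in range(n_iter):
        # Each factor matrix is the least-squares solution given the other two.
        A = X1 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)  # rank-R reconstruction
    return A, B, C, X_hat

X = np.random.rand(8, 9, 10)
A, B, C, X_hat = cp_als(X, R=3)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))   # relative Frobenius error
```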
Rank-(R1, R2, …, RN) Approximation
24
• Decomposition into a tensor with reduced, lower multilinear rank-(R1, R2, …, RN)
‣ rankn(Ã) = Rn ≤ rankn(A) = rank(A(n))
• n-mode products of factor matrices and core tensor in a given reduced rank space
‣ Tucker model with limited ranks Ri
Ã = B ×1 U(1) ×2 U(2) ×3 ··· ×N U(N)
[Figure: Ã ≈ B ×1 U(1) ×2 U(2) ×3 U(3) with a reduced core B of size R1×R2×R3.]
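One concrete way to obtain such a rank-(R1, R2, R3) approximation is the truncated HOSVD, sketched below in NumPy with illustrative sizes (the best possible fit is the topic of the next slide):

```python
import numpy as np

def unfold(A, n):
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

def mode_n_product(B, U, n):
    return np.moveaxis(np.tensordot(U, np.moveaxis(B, n, 0), axes=(1, 0)), 0, n)

A = np.random.rand(10, 12, 14)
ranks = (5, 6, 7)                       # target multilinear rank (R1, R2, R3)

# Truncated HOSVD: the leading Rn left singular vectors of each unfolding A(n).
Us = [np.linalg.svd(unfold(A, n), full_matrices=False)[0][:, :Rn]
      for n, Rn in enumerate(ranks)]

# Core B = A x1 U(1)^T x2 U(2)^T x3 U(3)^T, then reconstruct the approximation.
B = A
for n, U in enumerate(Us):
    B = mode_n_product(B, U.T, n)
A_hat = B
for n, U in enumerate(Us):
    A_hat = mode_n_product(A_hat, U, n)

print(B.shape, np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```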
Best Rank Approximation
25
• Rank-reduced approximation that minimizes a least-squares cost
‣ Ã = argminÃ ||A − Ã||²F
• Alternating least squares (ALS): an iterative algorithm that converges to a minimum of the approximation error in the Frobenius norm ||…||F
‣ rotation of components in the basis matrices
Ã = B ×1 U(1) ×2 U(2) ×3 U(3)
[Figure: Ã ≈ B ×1 U(1) ×2 U(2) ×3 U(3) with core B of size R1×R2×R3; typical high-quality data reduction: Rk ≤ Ik / 2.]
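A minimal NumPy sketch of this ALS scheme for the Tucker case (higher-order orthogonal iteration, HOOI), assuming HOSVD initialization and a fixed number of sweeps instead of a convergence test:

```python
import numpy as np

def unfold(A, n):
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

def mode_n_product(B, U, n):
    return np.moveaxis(np.tensordot(U, np.moveaxis(B, n, 0), axes=(1, 0)), 0, n)

def hooi(A, ranks, n_iter=20):
    """Rank-(R1,...,RN) Tucker approximation via alternating least squares (HOOI)."""
    N = A.ndim
    # Initialize factor matrices with the truncated HOSVD bases.
    Us = [np.linalg.svd(unfold(A, n), full_matrices=False)[0][:, :ranks[n]]
          for n in range(N)]
    for _ in range(n_iter):
        for n in range(N):
            # Project A onto all other factor subspaces, then update U(n)
            # as the leading left singular vectors of the mode-n unfolding.
            Y = A
            for m in range(N):
                if m != n:
                    Y = mode_n_product(Y, Us[m].T, m)
            Us[n] = np.linalg.svd(unfold(Y, n), full_matrices=False)[0][:, :ranks[n]]
    # Core and reconstruction.
    B = A
    for n in range(N):
        B = mode_n_product(B, Us[n].T, n)
    A_hat = B
    for n in range(N):
        A_hat = mode_n_product(A_hat, Us[n], n)
    return B, Us, A_hat

A = np.random.rand(10, 12, 14)
B, Us, A_hat = hooi(A, (5, 6, 7))
print(np.linalg.norm(A - A_hat) / np.linalg.norm(A))   # Frobenius-norm fit
```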
Generalization of the Matrix SVD
26
[Figure: the matrix SVD A = U Σ Vᵀ generalizes to higher orders as the CP model (superdiagonal weights λ1, …, λR combined with factor matrices U(1), U(2), U(3), plus an error tensor e) and as the Tucker model (core B combined with U(1), U(2), U(3), plus e).]