Tutorial on Information Theory in Visualization
Information Theory and Visualization
Min Chen
University of Oxford
[Figure: a schematic of a theoretic framework, with panels labelled Facts, Theory, and Wisdom and a field of W and T marks indicating individual facts and theory elements.]
The information-theoretic framework models visualization as a communication system. The visualization pipeline runs: Source → Filtering → Visual Mapping → Rendering → Displaying → Optical Transmission → Viewing → Perception → Cognition → Destination, transforming raw data D into information I, geometry & labels G, an image V, an optical signal S, a received signal S', a perceived image V', perceived information I', and finally knowledge K, with noise N entering at several stages.

This parallels Shannon's model of a communication system: Source → Encoder (Transmitter) → Channel → Decoder (Receiver) → Destination, where a message M is encoded as a signal S, transmitted over a noisy channel (N) as S', and decoded back into a message M'. In the visualization pipeline, the corresponding groupings are the vis-encoder, the vis-channel, and the vis-decoder.

M. Chen and H. Jänicke, An information-theoretic framework for visualization, IEEE Transactions on Visualization and Computer Graphics, 2010.
In simplified form: Source → vis-encoder → vis-channel → vis-decoder → Destination, where raw data D is encoded into an image V, transmitted (with noise N) and viewed as an image V', and decoded into knowledge K at the destination. As in Shannon's framework, two questions follow: how compact can the encoding be (compactness), and how can errors be detected and corrected (error detection and error correction)?
Claude E. Shannon (1916-2001). For an alphabet X = {x_1, x_2, ..., x_m}, where letter x_i occurs with probability p(x_i), the Shannon entropy is

H(X) = -\sum_{i=1}^{m} p(x_i) \log_2 p(x_i)
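To make the definition concrete, here is a minimal Python sketch (an illustration added in editing, not part of the slides):

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p(x) log2 p(x), in bits.
    Letters with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))                 # 1.0 bit: a fair coin
print(entropy([0.25] * 4))                 # 2.0 bits: a uniform 4-letter alphabet
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits: a skewed alphabet carries less
```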
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen
Communication System. A genome is a sequence of messages over the alphabet Z = {G, A, C, T}, and its entropy H quantifies the information it carries. (Min Chen, Cost-Benefit Analysis of Data Intelligence, https://vimeo.com/145258513, 2015.)

Alphabets and messages vary enormously in size: English has a 45-letter word, "pneumonoultramicroscopicsilicovolcanoconiosis", while a chemical name of 189,819 letters has been put forward as the longest "word". Is that still a word, or a formula?
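As a rough, editor-added illustration (assuming each letter were independent and uniformly distributed, which real text is not), the maximum information content of such strings is n × log2(alphabet size):

```python
import math

def max_entropy_bits(n_letters, alphabet_size):
    """Upper bound on information content: n independent, uniform letters."""
    return n_letters * math.log2(alphabet_size)

print(max_entropy_bits(45, 26))       # ~211.5 bits for a 45-letter English word
print(max_entropy_bits(189_819, 26))  # ~8.9e5 bits for a 189,819-letter "word"
print(max_entropy_bits(100, 4))       # 200 bits for 100 letters over {G, A, C, T}
```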
Consider a chain of two processes, X → Process 1 → Y → Process 2 → Z, whose joint distribution factorizes as p(x, y, z) = p(x) p(y|x) p(z|y) (a Markov chain). The data processing inequality then gives

I(X; Y) ≥ I(X; Z)

i.e., information about the input can only be lost, never gained, along the chain. For a single process X → Y, the mutual information is bounded by I(X; Y) ≤ min(H(X), H(Y)).
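A small numerical sketch (editor's illustration; the alphabets and conditional distributions below are invented) shows the inequality at work:

```python
import numpy as np

def mutual_information(joint):
    """I(A;B) in bits, from a 2-D joint probability table p(a, b)."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])).sum())

# Hypothetical Markov chain X -> Y -> Z with small alphabets.
p_x = np.array([0.5, 0.3, 0.2])                  # p(x)
p_y_given_x = np.array([[0.8, 0.2],              # p(y|x): rows sum to 1
                        [0.3, 0.7],
                        [0.5, 0.5]])
p_z_given_y = np.array([[0.9, 0.1],              # p(z|y)
                        [0.2, 0.8]])

p_xy = p_x[:, None] * p_y_given_x                # p(x, y)
p_xz = p_xy @ p_z_given_y                        # p(x, z) = sum_y p(x, y) p(z|y)

print(mutual_information(p_xy))  # I(X;Y)
print(mutual_information(p_xz))  # I(X;Z) -- never larger than I(X;Y)
```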
Example: a decision pipeline over r time series (e.g., prices sampled over one hour), where each alphabet Zi along the pipeline has a maximal entropy Hmax (these figures are checked in the sketch below):

Z1 Raw Data (1 hour at 5-second resolution): r time series × 720 data points, 2^32 valid values per point; Hmax = 23040r bits.
Z2 Aggregated Data (1-minute resolution): r time series × 60 data points, 2^32 valid values; Hmax = 1920r bits.
Z3 Time Series Plots: r time series × 60 data points, 128 valid values; Hmax = 420r bits.
Z4 Feature Recognition: r time series × 10 features, 8 valid values; Hmax = 30r bits.
Z5 Correlation Indices: r(r-1)/2 data points, 2^30.7 valid values; Hmax ≈ 15r(r-1) bits.
Z6 Graph Visualization: r(r-1)/2 connections, 5 valid values; Hmax ≈ 1.16r(r-1) bits.
Z7 Decision: r decisions, 3 valid values each (e.g., buy, sell, hold); Hmax ≈ 1.58r bits.

The processes linking these alphabets are a mixture of machine processes (M) and human processes (H).
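The maximal-entropy figures follow from Hmax = (number of data points) × log2(number of valid values); the short script below is an editorial check, not part of the slides:

```python
import math

def h_max(points, values):
    """Maximal entropy in bits: every data point independent and uniform."""
    return points * math.log2(values)

print(h_max(720, 2**32))   # Z1: 23040 bits per time series
print(h_max(60, 2**32))    # Z2: 1920 bits per time series
print(h_max(60, 128))      # Z3: 420 bits per time series
print(h_max(10, 8))        # Z4: 30 bits per time series
print(h_max(1, 2**30.7))   # Z5: ~30.7 bits per index, ~15 r(r-1) bits in total
print(h_max(1, 5))         # Z6: ~2.32 bits per connection, ~1.16 r(r-1) in total
print(h_max(1, 3))         # Z7: ~1.58 bits per decision
```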
In general, a data intelligence workflow is a chain: Data → Process 1 → Process 2 → ... → Process L-1 → Process L → Decision, with an alphabet at each interface: Z1, Z2, Z3, ..., ZL-1, ZL, ZL+1. Each process transforms one alphabet into the next, and a mutual information I can be measured across each step.

For big data, the entropy of the input alphabet, H(Z1), is typically very large, while the entropy of the decision alphabet, H(ZL+1), is very small. The mutual information I(Z1; ZL+1) measures how much of the original information actually reaches the decision.
As before, for a Markov chain X → Process 1 → Y → Process 2 → Z with p(x, y, z) = p(x) p(y|x) p(z|y), the data processing inequality gives I(X; Y) ≥ I(X; Z). Real workflows, however, are rarely pure Markov chains: Process 1 and Process 2 may receive user interactions U1 and U2, and the destination may bring domain knowledge about X to bear. In such cases the factorization no longer holds, and I(X; Z) need not be bounded by I(X; Y).

M. Chen and H. Jänicke, An information-theoretic framework for visualization, IEEE Transactions on Visualization and Computer Graphics, 2010.
Return to the big data pipeline: Big Data → Process 1 → Process 2 → ... → Process L-1 → Process L → Decision, with alphabets Z1, ..., ZL+1 and input entropy H(Z1). Alongside the data there is an alphabet X of human knowledge, with entropy H(X); the mutual information I(Z1; X) measures the information shared between the data and that knowledge.
Each x ∈ X is a piece of soft knowledge. Looking across all possible decisions under different conditions, a decision may be (a) totally data-driven, (b) totally instinct-driven, (c) data-informed, or (d) due to unknown or uncontrollable factors.
Now focus on a single step, Process s, within the chain Data → Process 1 → ... → Process s → ... → Process L → Decision (alphabets Z1, Z2, ..., Zs, Zs+1, ..., ZL, ZL+1). Each process is characterized by two opposing quantities: Alphabet Compression, the reduction achieved in mapping one alphabet Z to the next, and Potential Distortion, the information distortion that this mapping may introduce when one reasons back from the output to the input.

M. Chen and A. Golan, What may visualization processes optimize?, IEEE Transactions on Visualization and Computer Graphics, 2015.
Looking inside Process s: the forward mapping takes the input alphabet Z = {z1, z2, ..., zn} to the output alphabet V = {v1, v2, ..., vm}; the backward mapping goes from V back to a reconstructed alphabet Z' = {z'1, z'2, ..., z'n}. The divergence between Z' and Z can be measured by the Kullback-Leibler divergence DKL (Solomon Kullback, 1907-1994; Richard Leibler, 1914-2003).
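As an editorial sketch (not from the slides), the divergence can be computed directly from the two distributions; the distributions below are made up for illustration:

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum p(z) log2(p(z) / q(z)), in bits.
    Assumes q(z) > 0 wherever p(z) > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# True distribution over Z versus a distribution reconstructed from the
# visual output V via a lossy backward mapping (both hypothetical).
p_true = [0.5, 0.25, 0.125, 0.125]
p_reconstructed = [0.4, 0.3, 0.15, 0.15]

print(kl_divergence(p_true, p_reconstructed))  # potential distortion, in bits
```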
Cost-Benefit Ratio. For each Process s in the chain Data → Process 1 → ... → Process s → ... → Process L → Decision, the benefit obtained by the process can be weighed against the cost of carrying it out.

M. Chen and A. Golan, What may visualization processes optimize?, IEEE Transactions on Visualization and Computer Graphics, 2015.
Within the chain, each Process s is characterized by its Alphabet Compression, Potential Distortion, and Cost, and so is Process s+1. The composite process formed by s and s+1 together likewise has its own Alphabet Compression, Potential Distortion, and Cost, so the analysis composes along the pipeline.
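The slides do not spell the formula out at this point; one way to write the trade-off, following the framing of the cited paper (the notation below is this editor's paraphrase and should be checked against the original), is:

```latex
\frac{\text{Benefit}}{\text{Cost}}
  = \frac{\text{Alphabet Compression} - \text{Potential Distortion}}{\text{Cost}}
  = \frac{H(Z_s) - H(Z_{s+1}) - D_{\mathrm{KL}}(Z'_s \parallel Z_s)}{\text{Cost}}
```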
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen
Three measures follow from the framework, where H(X) is the entropy of the data alphabet, V(G) is the visualization capacity (the visualization space entropy), and D is the display space capacity:

Visual Mapping Ratio: VMR = V(G) / H(X)
Information Loss Ratio: ILR = max((H(X) - V(G)) / H(X), 0)
Display Space Utilization: DSU = V(G) / D

M. Chen and H. Jänicke, An information-theoretic framework for visualization, IEEE Transactions on Visualization and Computer Graphics, 2010.
In this example, V(G) = H(Z). [Figure: a time series rendered within display space D at two plot resolutions, using a minimum of 256 pixels and a minimum of 64 pixels.]
[Figure: the same data shown as a byte map, bytes 1-64 with bit positions 7..0, using a minimum of 8 or 64 pixels per byte.] For 64 values, each uniformly distributed over 256 levels, the entropy is

H(Z) = \sum_{t=1}^{64} \sum_{i=0}^{255} \frac{1}{256} \log_2 256 = 512 \text{ bits}
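A one-line check of this figure (editorial illustration):

```python
import math
print(64 * math.log2(256))  # 512.0 bits for 64 independent, uniform 8-bit values
```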
[Figure: (a) evenly distributed p; the same time series plotted at full resolution (minimal 256 pixels) and at reduced resolution (minimal 64 pixels).]

Full-resolution plot: H(X) = 512 bits, V(G) = 512 bits, D = 16382 bits, VMR = 1, ILR = 0, DSU = 0.03.
Reduced-resolution plot: H(X) = 512 bits, V(G) = 384 bits, D = 4096 bits, VMR = 0.75, ILR = 0.25, DSU = 0.09.
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen
[Figure: (a) evenly distributed p, plotted at minimal 256 and minimal 64 pixels; reducing the resolution incurs an information loss of 25%.]

M. Chen and H. Jänicke, An information-theoretic framework for visualization, IEEE Transactions on Visualization and Computer Graphics, 2010.

[Figure: (a) evenly distributed p vs. (b) unevenly distributed p; for the uneven distribution the same reduction incurs an information loss of 25.8%.]
Consider the mapping from the data alphabet Z (values 0-256) to the display alphabet Z' (0-64 pixels). Divide the value range into four regions A, B, C, D with probabilities p_A = 1/2, p_B = 1/4, p_C = 1/8, p_D = 1/8, shown against a linear value axis; information loss: 25.0%.
[Figure: three treatments compared: (a) evenly distributed p, (b) unevenly distributed p, and (c) 4 regional mappings, with information losses of 25.0%, 25.8%, and 22.6% respectively.]
[Figure: four treatments compared: (a) evenly distributed p, (b) unevenly distributed p, (c) 4 regional mappings, and (d) a logarithmic plot (value axis labelled 2^0, 2^2, 2^4, 2^6, 2^8), with information losses of 25.0%, 25.8%, 22.6%, and 0% respectively.]
The four regions A, B, C, D carry probabilities p_1 = 1/2, p_2 = 1/2^2, p_3 = 1/2^3, p_4 = 1/2^3. In general, with k regions one can use the dyadic scheme p_1 = 1/2, p_2 = 1/2^2, ..., p_{k-1} = 1/2^{k-1}, p_k = 1/2^{k-1}, so that the probabilities sum to 1.
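The effect of the mapping on information loss can be reproduced numerically. The sketch below is an editorial illustration: it assumes values are uniform within each 64-value range and measures loss as (H(Z) - H(Z')) / H(Z), where Z' is the pixel each value maps to:

```python
import math
from collections import defaultdict

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def loss_fraction(p_z, mapping):
    """Fraction of H(Z) lost when data values are merged by `mapping` (value -> pixel)."""
    p_zprime = defaultdict(float)
    for value, p in p_z.items():
        p_zprime[mapping(value)] += p
    h_z = entropy(p_z.values())
    return (h_z - entropy(p_zprime.values())) / h_z

# Data values 0..255 in four 64-value ranges A, B, C, D with total probabilities
# 1/2, 1/4, 1/8, 1/8, uniform within each range (an assumption of this sketch).
range_prob = [0.5, 0.25, 0.125, 0.125]
p_z = {v: range_prob[v // 64] / 64 for v in range(256)}

# Linear mapping: every 4 consecutive values share one of 64 pixels.
print(loss_fraction(p_z, lambda v: v // 4))   # ~0.258 (25.8%)

# Regional mapping: A gets 32 pixels, B 16, C 8, D 8 (2/4/8/8 values per pixel).
def regional(v):
    r = v // 64
    return [0, 32, 48, 56][r] + (v % 64) // [2, 4, 8, 8][r]

print(loss_fraction(p_z, regional))           # ~0.226 (22.6%)
```

These fractions match the 25.8% and 22.6% figures quoted above for the linear and regional mappings.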
The same quantities, H(X), Hmax(X), V(G), and D, can be estimated for other visual designs, such as treemaps (http://en.wikipedia.org/wiki/Treemapping) and the charts catalogued in the visualization zoo (http://hci.stanford.edu/jheer/files/zoo/).
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen
Visual multiplexing: several pieces of information can share the same display space, as when maps are overlaid (http://learnpracticalgis.com/how-to-overlay-maps/). The model extends the pipeline to vis-encoder → vis-link (consisting of many vis-channels) → vis-decoder. At the encoding end, information about X at a location p is multiplexed (MUX) onto visual channels c1, c2, c3, ..., ck over the spatial domain D and the temporal domain T, alongside other signals and noise; at the decoding end the viewer demultiplexes (DEMUX) these channels.
- Location p can be associated with X in the source data or determined by a spatial mapping.
- X can be a data record or a set of partially encoded visual attributes.
- Perceived information may include estimated values and relationships with data conveyed by other signals.

M. Chen, S. Walton, K. Berger, J. Thiyagalingam, B. Duffy, H. Fang, C. Holloway, and A. E. Trefethen, Visual multiplexing, Computer Graphics Forum, 2014.
[Figure: an action-based video visualization with annotated values (50, 60, 70) and a text label.] Three spaces are involved: the Data Space with entropy H, the Visualization Space with capacity V(G) (the visualization space entropy), and the Display Space with capacity D. Typically V(G)/D << 1: the visualization occupies only a small fraction of what the display could carry.

R. P. Botchen, S. Bachthaler, F. Schick, M. Chen, G. Mori, D. Weiskopf and T. Ertl, Action-based multi-field video visualization, IEEE Transactions on Visualization and Computer Graphics, 2008.
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen
Four Levels of Visualization
M. Chen and A. Golan, What may visualization processes optimize?, IEEE Transactions on Visualization and Computer Graphics, 2015
Observational Visualization
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen