Tutorial on Information Theory in Visualization
Information Theory and Visualization
Min Chen
University of Oxford
[Figure: a schematic of a theoretic framework, with panels labelled Facts, Theory, and Wisdom and a field of W and T marks indicating individual facts and theory elements.]
The information-theoretic framework models visualization as a communication system. The visualization pipeline runs: Source → Filtering → Visual Mapping → Rendering → Displaying → Optical Transmission → Viewing → Perception → Cognition → Destination, transforming raw data D into information I, geometry & labels G, an image V, an optical signal S, a received signal S', a perceived image V', perceived information I', and finally knowledge K, with noise N entering at several stages.

This parallels Shannon's model of a communication system: Source → Encoder (Transmitter) → Channel → Decoder (Receiver) → Destination, where a message M is encoded as a signal S, transmitted over a noisy channel (N) as S', and decoded back into a message M'. In the visualization pipeline, the corresponding groupings are the vis-encoder, the vis-channel, and the vis-decoder.

M. Chen and H. Jänicke, An information-theoretic framework for visualization, IEEE Transactions on Visualization and Computer Graphics, 2010.
In simplified form: Source → vis-encoder → vis-channel → vis-decoder → Destination, where raw data D is encoded into an image V, transmitted (with noise N) and viewed as an image V', and decoded into knowledge K at the destination. As in Shannon's framework, two questions follow: how compact can the encoding be (compactness), and how can errors be detected and corrected (error detection and error correction)?
Claude E. Shannon (1916-2001). For an alphabet X = {x_1, x_2, ..., x_m}, where letter x_i occurs with probability p(x_i), the Shannon entropy is

H(X) = -\sum_{i=1}^{m} p(x_i) \log_2 p(x_i)
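To make the definition concrete, here is a minimal Python sketch (an illustration added in editing, not part of the slides):

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p(x) log2 p(x), in bits.
    Letters with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))                 # 1.0 bit: a fair coin
print(entropy([0.25] * 4))                 # 2.0 bits: a uniform 4-letter alphabet
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits: a skewed alphabet carries less
```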
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen
Communication System. A genome is a sequence of messages over the alphabet Z = {G, A, C, T}, and its entropy H quantifies the information it carries. (Min Chen, Cost-Benefit Analysis of Data Intelligence, https://vimeo.com/145258513, 2015.)

Alphabets and messages vary enormously in size: English has a 45-letter word, "pneumonoultramicroscopicsilicovolcanoconiosis", while a chemical name of 189,819 letters has been put forward as the longest "word". Is that still a word, or a formula?
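As a rough, editor-added illustration (assuming each letter were independent and uniformly distributed, which real text is not), the maximum information content of such strings is n × log2(alphabet size):

```python
import math

def max_entropy_bits(n_letters, alphabet_size):
    """Upper bound on information content: n independent, uniform letters."""
    return n_letters * math.log2(alphabet_size)

print(max_entropy_bits(45, 26))       # ~211.5 bits for a 45-letter English word
print(max_entropy_bits(189_819, 26))  # ~8.9e5 bits for a 189,819-letter "word"
print(max_entropy_bits(100, 4))       # 200 bits for 100 letters over {G, A, C, T}
```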
Consider a chain of two processes, X → Process 1 → Y → Process 2 → Z, whose joint distribution factorizes as p(x, y, z) = p(x) p(y|x) p(z|y) (a Markov chain). The data processing inequality then gives

I(X; Y) ≥ I(X; Z)

i.e., information about the input can only be lost, never gained, along the chain. For a single process X → Y, the mutual information is bounded by I(X; Y) ≤ min(H(X), H(Y)).
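A small numerical sketch (editor's illustration; the alphabets and conditional distributions below are invented) shows the inequality at work:

```python
import numpy as np

def mutual_information(joint):
    """I(A;B) in bits, from a 2-D joint probability table p(a, b)."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])).sum())

# Hypothetical Markov chain X -> Y -> Z with small alphabets.
p_x = np.array([0.5, 0.3, 0.2])                  # p(x)
p_y_given_x = np.array([[0.8, 0.2],              # p(y|x): rows sum to 1
                        [0.3, 0.7],
                        [0.5, 0.5]])
p_z_given_y = np.array([[0.9, 0.1],              # p(z|y)
                        [0.2, 0.8]])

p_xy = p_x[:, None] * p_y_given_x                # p(x, y)
p_xz = p_xy @ p_z_given_y                        # p(x, z) = sum_y p(x, y) p(z|y)

print(mutual_information(p_xy))  # I(X;Y)
print(mutual_information(p_xz))  # I(X;Z) -- never larger than I(X;Y)
```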
Example: a decision pipeline over r time series (e.g., prices sampled over one hour), where each alphabet Zi along the pipeline has a maximal entropy Hmax (these figures are checked in the sketch below):

Z1 Raw Data (1 hour at 5-second resolution): r time series × 720 data points, 2^32 valid values per point; Hmax = 23040r bits.
Z2 Aggregated Data (1-minute resolution): r time series × 60 data points, 2^32 valid values; Hmax = 1920r bits.
Z3 Time Series Plots: r time series × 60 data points, 128 valid values; Hmax = 420r bits.
Z4 Feature Recognition: r time series × 10 features, 8 valid values; Hmax = 30r bits.
Z5 Correlation Indices: r(r-1)/2 data points, 2^30.7 valid values; Hmax ≈ 15r(r-1) bits.
Z6 Graph Visualization: r(r-1)/2 connections, 5 valid values; Hmax ≈ 1.16r(r-1) bits.
Z7 Decision: r decisions, 3 valid values each (e.g., buy, sell, hold); Hmax ≈ 1.58r bits.

The processes linking these alphabets are a mixture of machine processes (M) and human processes (H).
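The maximal-entropy figures follow from Hmax = (number of data points) × log2(number of valid values); the short script below is an editorial check, not part of the slides:

```python
import math

def h_max(points, values):
    """Maximal entropy in bits: every data point independent and uniform."""
    return points * math.log2(values)

print(h_max(720, 2**32))   # Z1: 23040 bits per time series
print(h_max(60, 2**32))    # Z2: 1920 bits per time series
print(h_max(60, 128))      # Z3: 420 bits per time series
print(h_max(10, 8))        # Z4: 30 bits per time series
print(h_max(1, 2**30.7))   # Z5: ~30.7 bits per index, ~15 r(r-1) bits in total
print(h_max(1, 5))         # Z6: ~2.32 bits per connection, ~1.16 r(r-1) in total
print(h_max(1, 3))         # Z7: ~1.58 bits per decision
```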
In general, a data intelligence workflow is a chain: Data → Process 1 → Process 2 → ... → Process L-1 → Process L → Decision, with an alphabet at each interface: Z1, Z2, Z3, ..., ZL-1, ZL, ZL+1. Each process transforms one alphabet into the next, and a mutual information I can be measured across each step.

For big data, the entropy of the input alphabet, H(Z1), is typically very large, while the entropy of the decision alphabet, H(ZL+1), is very small. The mutual information I(Z1; ZL+1) measures how much of the original information actually reaches the decision.
As before, for a Markov chain X → Process 1 → Y → Process 2 → Z with p(x, y, z) = p(x) p(y|x) p(z|y), the data processing inequality gives I(X; Y) ≥ I(X; Z). Real workflows, however, are rarely pure Markov chains: Process 1 and Process 2 may receive user interactions U1 and U2, and the destination may bring domain knowledge about X to bear. In such cases the factorization no longer holds, and I(X; Z) need not be bounded by I(X; Y).

M. Chen and H. Jänicke, An information-theoretic framework for visualization, IEEE Transactions on Visualization and Computer Graphics, 2010.
Return to the big data pipeline: Big Data → Process 1 → Process 2 → ... → Process L-1 → Process L → Decision, with alphabets Z1, ..., ZL+1 and input entropy H(Z1). Alongside the data there is an alphabet X of human knowledge, with entropy H(X); the mutual information I(Z1; X) measures the information shared between the data and that knowledge.
Each x ∈ X is a piece of soft knowledge. Looking across all possible decisions under different conditions, a decision may be (a) totally data-driven, (b) totally instinct-driven, (c) data-informed, or (d) due to unknown or uncontrollable factors.
Now focus on a single step, Process s, within the chain Data → Process 1 → ... → Process s → ... → Process L → Decision (alphabets Z1, Z2, ..., Zs, Zs+1, ..., ZL, ZL+1). Each process is characterized by two opposing quantities: Alphabet Compression, the reduction achieved in mapping one alphabet Z to the next, and Potential Distortion, the information distortion that this mapping may introduce when one reasons back from the output to the input.

M. Chen and A. Golan, What may visualization processes optimize?, IEEE Transactions on Visualization and Computer Graphics, 2015.
Looking inside Process s: the forward mapping takes the input alphabet Z = {z1, z2, ..., zn} to the output alphabet V = {v1, v2, ..., vm}; the backward mapping goes from V back to a reconstructed alphabet Z' = {z'1, z'2, ..., z'n}. The divergence between Z' and Z can be measured by the Kullback-Leibler divergence DKL (Solomon Kullback, 1907-1994; Richard Leibler, 1914-2003).
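As an editorial sketch (not from the slides), the divergence can be computed directly from the two distributions; the distributions below are made up for illustration:

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum p(z) log2(p(z) / q(z)), in bits.
    Assumes q(z) > 0 wherever p(z) > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# True distribution over Z versus a distribution reconstructed from the
# visual output V via a lossy backward mapping (both hypothetical).
p_true = [0.5, 0.25, 0.125, 0.125]
p_reconstructed = [0.4, 0.3, 0.15, 0.15]

print(kl_divergence(p_true, p_reconstructed))  # potential distortion, in bits
```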
Cost-Benefit Ratio. For each Process s in the chain Data → Process 1 → ... → Process s → ... → Process L → Decision, the benefit obtained by the process can be weighed against the cost of carrying it out.

M. Chen and A. Golan, What may visualization processes optimize?, IEEE Transactions on Visualization and Computer Graphics, 2015.
Within the chain, each Process s is characterized by its Alphabet Compression, Potential Distortion, and Cost, and so is Process s+1. The composite process formed by s and s+1 together likewise has its own Alphabet Compression, Potential Distortion, and Cost, so the analysis composes along the pipeline.
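The slides do not spell the formula out at this point; one way to write the trade-off, following the framing of the cited paper (the notation below is this editor's paraphrase and should be checked against the original), is:

```latex
\frac{\text{Benefit}}{\text{Cost}}
  = \frac{\text{Alphabet Compression} - \text{Potential Distortion}}{\text{Cost}}
  = \frac{H(Z_s) - H(Z_{s+1}) - D_{\mathrm{KL}}(Z'_s \parallel Z_s)}{\text{Cost}}
```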
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen
Three measures follow from the framework, where H(X) is the entropy of the data alphabet, V(G) is the visualization capacity (the visualization space entropy), and D is the display space capacity:

Visual Mapping Ratio: VMR = V(G) / H(X)
Information Loss Ratio: ILR = max((H(X) - V(G)) / H(X), 0)
Display Space Utilization: DSU = V(G) / D

M. Chen and H. Jänicke, An information-theoretic framework for visualization, IEEE Transactions on Visualization and Computer Graphics, 2010.
In this example, V(G) = H(Z). [Figure: a time series rendered within display space D at two plot resolutions, using a minimum of 256 pixels and a minimum of 64 pixels.]
[Figure: the same data shown as a byte map, bytes 1-64 with bit positions 7..0, using a minimum of 8 or 64 pixels per byte.] For 64 values, each uniformly distributed over 256 levels, the entropy is

H(Z) = \sum_{t=1}^{64} \sum_{i=0}^{255} \frac{1}{256} \log_2 256 = 512 \text{ bits}
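A one-line check of this figure (editorial illustration):

```python
import math
print(64 * math.log2(256))  # 512.0 bits for 64 independent, uniform 8-bit values
```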
[Figure: (a) evenly distributed p; the same time series plotted at full resolution (minimal 256 pixels) and at reduced resolution (minimal 64 pixels).]

Full-resolution plot: H(X) = 512 bits, V(G) = 512 bits, D = 16382 bits, VMR = 1, ILR = 0, DSU = 0.03.
Reduced-resolution plot: H(X) = 512 bits, V(G) = 384 bits, D = 4096 bits, VMR = 0.75, ILR = 0.25, DSU = 0.09.
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen
[Figure: (a) evenly distributed p, plotted at minimal 256 and minimal 64 pixels; reducing the resolution incurs an information loss of 25%.]

M. Chen and H. Jänicke, An information-theoretic framework for visualization, IEEE Transactions on Visualization and Computer Graphics, 2010.

[Figure: (a) evenly distributed p vs. (b) unevenly distributed p; for the uneven distribution the same reduction incurs an information loss of 25.8%.]
Consider the mapping from the data alphabet Z (values 0-256) to the display alphabet Z' (0-64 pixels). Divide the value range into four regions A, B, C, D with probabilities p_A = 1/2, p_B = 1/4, p_C = 1/8, p_D = 1/8, shown against a linear value axis; information loss: 25.0%.
[Figure: three treatments compared: (a) evenly distributed p, (b) unevenly distributed p, and (c) 4 regional mappings, with information losses of 25.0%, 25.8%, and 22.6% respectively.]
[Figure: four treatments compared: (a) evenly distributed p, (b) unevenly distributed p, (c) 4 regional mappings, and (d) a logarithmic plot (value axis labelled 2^0, 2^2, 2^4, 2^6, 2^8), with information losses of 25.0%, 25.8%, 22.6%, and 0% respectively.]
The four regions A, B, C, D carry probabilities p_1 = 1/2, p_2 = 1/2^2, p_3 = 1/2^3, p_4 = 1/2^3. In general, with k regions one can use the dyadic scheme p_1 = 1/2, p_2 = 1/2^2, ..., p_{k-1} = 1/2^{k-1}, p_k = 1/2^{k-1}, so that the probabilities sum to 1.
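The effect of the mapping on information loss can be reproduced numerically. The sketch below is an editorial illustration: it assumes values are uniform within each 64-value range and measures loss as (H(Z) - H(Z')) / H(Z), where Z' is the pixel each value maps to:

```python
import math
from collections import defaultdict

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def loss_fraction(p_z, mapping):
    """Fraction of H(Z) lost when data values are merged by `mapping` (value -> pixel)."""
    p_zprime = defaultdict(float)
    for value, p in p_z.items():
        p_zprime[mapping(value)] += p
    h_z = entropy(p_z.values())
    return (h_z - entropy(p_zprime.values())) / h_z

# Data values 0..255 in four 64-value ranges A, B, C, D with total probabilities
# 1/2, 1/4, 1/8, 1/8, uniform within each range (an assumption of this sketch).
range_prob = [0.5, 0.25, 0.125, 0.125]
p_z = {v: range_prob[v // 64] / 64 for v in range(256)}

# Linear mapping: every 4 consecutive values share one of 64 pixels.
print(loss_fraction(p_z, lambda v: v // 4))   # ~0.258 (25.8%)

# Regional mapping: A gets 32 pixels, B 16, C 8, D 8 (2/4/8/8 values per pixel).
def regional(v):
    r = v // 64
    return [0, 32, 48, 56][r] + (v % 64) // [2, 4, 8, 8][r]

print(loss_fraction(p_z, regional))           # ~0.226 (22.6%)
```

These fractions match the 25.8% and 22.6% figures quoted above for the linear and regional mappings.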
The same quantities, H(X), Hmax(X), V(G), and D, can be estimated for other visual designs, such as treemaps (http://en.wikipedia.org/wiki/Treemapping) and the charts catalogued in the visualization zoo (http://hci.stanford.edu/jheer/files/zoo/).
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen
Visual multiplexing: several pieces of information can share the same display space, as when maps are overlaid (http://learnpracticalgis.com/how-to-overlay-maps/). The model extends the pipeline to vis-encoder → vis-link (consisting of many vis-channels) → vis-decoder. At the encoding end, information about X at a location p is multiplexed (MUX) onto visual channels c1, c2, c3, ..., ck over the spatial domain D and the temporal domain T, alongside other signals and noise; at the decoding end the viewer demultiplexes (DEMUX) these channels.
- Location p can be associated with X in the source data or determined by a spatial mapping.
- X can be a data record or a set of partially encoded visual attributes.
- Perceived information may include estimated values and relationships with data conveyed by other signals.

M. Chen, S. Walton, K. Berger, J. Thiyagalingam, B. Duffy, H. Fang, C. Holloway, and A. E. Trefethen, Visual multiplexing, Computer Graphics Forum, 2014.
[Figure: an action-based video visualization with annotated values (50, 60, 70) and a text label.] Three spaces are involved: the Data Space with entropy H, the Visualization Space with capacity V(G) (the visualization space entropy), and the Display Space with capacity D. Typically V(G)/D << 1: the visualization occupies only a small fraction of what the display could carry.

R. P. Botchen, S. Bachthaler, F. Schick, M. Chen, G. Mori, D. Weiskopf and T. Ertl, Action-based multi-field video visualization, IEEE Transactions on Visualization and Computer Graphics, 2008.
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen
Four Levels of Visualization
M. Chen and A. Golan, What may visualization processes optimize?, IEEE Transactions on Visualization and Computer Graphics, 2015
Observational Visualization
Information Theory and Visualization
1. Data Intelligence: a big picture
2. Visualization: a small picture
3. Measurement, Explanation, and Prediction
4. Example: Visual Multiplexing
5. Example: Error Detection and Correction
6. Example: Process Optimization
7. Summary
Min Chen