Analysis of mobile eye-tracking studies
Eye-Tracking Quickstart
Why do we move our eyes?
The eyes only perceive a small part of the world in high acuity: 1.3° in the foveola to 5° in the fovea.
The eyes are permanently on the move to scan our environment.
Where? When?
Fixation, Point-of-Regard, Scanpath (Alfred L. Yarbus, 1967)
What do we process during fixations?
Fixation
• Express Fixations < 100 ms
• identification of known stimuli (e.g. brands, signs)
• Image Processing 100 – 300 ms
• processing of emotions and image-based information
• Reading > 300 ms
• analysis and understanding of texts and complex structures
How long? What?
Additional relevant eye movements
• Smooth Pursuit
• following moving targets
• Spatial Perception
• Accommodation
• adapting the lens to different depths of focus
• Vergence Movements
• bringing the point of regard onto corresponding retinal areas in both eyes
How do we measure eye movements?
Buswell, 1935
Taken from Joos et al. 2005
Purkinje eye tracker, source unknown
Detecting eye orientation
• most common method today based on video camera in the infrared domain
• eye is illuminated using infrared LED
• detecting the pupil using methods of computer vision
• for stabilization, the reflection of the LED on the lens is also detected
Measurements provide the orientation of the eye and the size of the pupil (see the pupil-detection sketch below).
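To make the pupil-detection step concrete, here is a minimal sketch in Python with OpenCV (my illustration; the slides do not name a library). It assumes a grayscale infrared eye image in which the pupil is the darkest blob; production trackers additionally track the corneal reflection (glint) for stabilization, as noted above.

import cv2
import numpy as np

def detect_pupil(ir_frame: np.ndarray):
    """Estimate pupil center and size in a grayscale infrared eye image.

    Minimal sketch: under IR illumination the pupil appears as the
    darkest blob. Real systems add glint tracking and model fitting.
    """
    blurred = cv2.GaussianBlur(ir_frame, (7, 7), 0)
    # Dark-pupil thresholding: pupil pixels are near-black under IR.
    _, mask = cv2.threshold(blurred, 40, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    pupil = max(contours, key=cv2.contourArea)   # largest dark blob
    if len(pupil) < 5:                           # fitEllipse needs >= 5 points
        return None
    (cx, cy), (w, h), angle = cv2.fitEllipse(pupil)
    return (cx, cy), (w + h) / 4.0               # center and mean radius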
Mapping eye orientation to the computer screen
• Goal: get fixation information in terms of screen coordinates ⇒ plane of analysis (2D)
• Requirement: head/eye position relative to eye tracker + eye tracker position relative to screen
• two solutions
• fix the head using a chinrest or bitebar
• use computer vision again to track the eye position
• mapping from eye position and orientation to screen (analysis plane) often done using an explicit calibration (see the calibration sketch below)
Modern remote eye-tracking system attached to a laptop.
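The explicit calibration can be illustrated with a least-squares fit. This is a sketch under the assumption of a common second-order polynomial mapping from measured eye features to screen coordinates; commercial trackers use their own, typically more elaborate, models.

import numpy as np

def _design(eye_xy):
    # Constant, linear, and quadratic terms of the measured eye features.
    x, y = eye_xy[:, 0], eye_xy[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

def fit_calibration(eye_xy, screen_xy):
    """Fit a 2nd-order polynomial mapping eye features -> screen coords.

    eye_xy: (N, 2) measured eye features (e.g. pupil-glint vectors)
    screen_xy: (N, 2) known screen positions of the calibration targets
    Returns coefficients C such that screen ~= design(eye) @ C.
    """
    C, *_ = np.linalg.lstsq(_design(eye_xy), screen_xy, rcond=None)
    return C

def map_to_screen(eye_xy, C):
    """Map new eye measurements to screen coordinates."""
    return _design(eye_xy) @ C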
How do we analyse eye movements?
Scanpath Analysis
How to create a scanpath:
• map orientation and position of the eye to coordinates of the target stimuli to get the fixated point (short: “fixation”)
• e.g. desktop pixels
• map fixation duration to radius, draw a circle around the fixated point
• connect subsequent fixations by straight lines
• in addition, fixations are sometimes numbered (see the sketch below)
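A minimal sketch of this recipe in Python with matplotlib, using hypothetical fixation data (my illustration, not the tooling of the original studies):

import matplotlib.pyplot as plt

# Hypothetical fixation data: (x_px, y_px, duration_ms)
fixations = [(120, 80, 250), (340, 90, 480), (300, 260, 150), (110, 240, 620)]

xs = [f[0] for f in fixations]
ys = [f[1] for f in fixations]
sizes = [f[2] for f in fixations]                  # duration -> marker area

plt.plot(xs, ys, "-", color="gray", zorder=1)      # saccades as straight lines
plt.scatter(xs, ys, s=sizes, alpha=0.5, zorder=2)  # fixations as circles
for i, (x, y, _) in enumerate(fixations, start=1):
    plt.annotate(str(i), (x, y))                   # number the fixations
plt.gca().invert_yaxis()   # screen coordinates: origin at the top left
plt.title("Scanpath")
plt.show()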
Scanpath Analysis
What do we learn?
• closer investigation of a single individual
• visualization of the viewing process
• Important indices
• number of fixations, duration of fixations, saccade amplitudes
• re-fixations: going back to previously fixated areas
• sub-path patterns
• Example research topics
• text understanding
• predicting next fixation target, e.g. syllables in reading
Region Analysis
How to create a region analysis:
• define regions on the target stimuli and label them
• rectangles in the simplest case, but polygons are also possible
• aggregate fixations within each region and create per-region statistics
• e.g. min/max duration, median duration, number of fixations, total duration
• connect regions with directed arrows according to the frequency of their transitions (see the sketch below)
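A sketch of the per-region aggregation and transition counting, with hypothetical rectangles and fixation data; the region names and coordinates are illustrative assumptions:

from collections import Counter, defaultdict

# Hypothetical input: fixations as (x, y, duration_ms),
# regions as name -> (x_min, y_min, x_max, y_max) rectangles.
fixations = [(120, 80, 250), (340, 90, 480), (300, 260, 150)]
regions = {"headline": (0, 0, 400, 150), "image": (0, 150, 400, 400)}

def region_of(x, y):
    for name, (x0, y0, x1, y1) in regions.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None   # fixation outside all labelled regions

durations = defaultdict(list)
transitions = Counter()
previous = None
for x, y, dur in fixations:
    r = region_of(x, y)
    if r is not None:
        durations[r].append(dur)
        if previous is not None and previous != r:
            transitions[(previous, r)] += 1   # directed region transition
        previous = r

stats = {r: {"n": len(d), "total_ms": sum(d), "max_ms": max(d)}
         for r, d in durations.items()}
print(stats, dict(transitions))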
Region Analysis
What do we learn?
• investigation of groups
• coarse visualization of the viewing process
• Important indices
• number of fixations, duration of fixations
• transitions, transition probability
• Example research topics
• interaction between text and images
Analysis of Attention Maps / Heatmaps
How to create an attention map:
• map orientation and position of the eye to coordinates of target stimuli to get fixated point
• map fixation duration to attention level
• spread around the fixated point according to the area of high acuity
• typically modelled as a Gaussian distribution
• map the accumulated attention level to a color
• e.g. heat color ramp ⇒ Heatmaps (see the sketch below)
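A sketch of the accumulation step; the Gaussian width standing in for the area of high acuity is an assumed value that in practice depends on viewing distance and screen resolution:

import numpy as np

def attention_map(fixations, width, height, sigma_px=40.0):
    """Accumulate duration-weighted 2D Gaussians into an attention map.

    fixations: iterable of (x_px, y_px, duration_ms)
    sigma_px: spread approximating the area of high acuity (assumed value)
    """
    ys, xs = np.mgrid[0:height, 0:width]
    amap = np.zeros((height, width))
    for x, y, dur in fixations:
        d2 = (xs - x) ** 2 + (ys - y) ** 2
        amap += dur * np.exp(-d2 / (2.0 * sigma_px**2))
    peak = amap.max()
    # Normalize and hand off to a heat color ramp for display.
    return amap / peak if peak > 0 else amap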
Analysis of Attention Maps / Heatmaps
What do we learn?
• investigation of groups
• taking into account area of high acuity
• Important indices
• duration of fixations
• areas of low/high attention level
• Example research topics
• saliency mapping (e.g. comparing with computer vision)
• quantitative analysis of designs
What do these approaches have in common?
• In all approaches, the human establishes the link between “pixels of attention” and the attended content
• Implicitly for scanpaths and attention maps, based on spatial co-occurrence
• Explicitly for region-based analysis
• Process
• Gaze Position & Orientation (2.5D)
⇒ Screen Coordinates (2D)
⇒ Content (2D)
What do we hope to get from following eye movements?
Speech processing – Visual World Paradigm
• Idea:
• Based on the eye movements of a listener during a verbal instruction, one can draw conclusions about the processing of different language structures in the brain (e.g. preferences, sequences, etc.)
• Method of empirical research in psycholinguistics: the Visual World Paradigm
(Tanenhaus et al. (1995). Integration of Visual and Linguistic Information in Spoken Language Comprehension. Science, 268, 1632–1634)
Weiß, P., Pfeiffer, T., Eikmeyer, H.-J., & Rickheit, G. (2006). Processing Instructions. In G. Rickheit & I. Wachsmuth (Eds.), Situated Communication (pp. 31–76). Berlin: Mouton de Gruyter.
Detecting perceptual biases
Assessing cognitive processes - Detecting Search Strategies
Gaze Analysis
Pfeiffer, J., Pfeiffer, T., & Meißner, M. (in press). Towards Attentive In-Store Recommender Systems: Detecting Exploratory vs. Goal-oriented Decisions. Proceedings of the SIGDSS 2013 Pre-ICIS Workshop – Reshaping Society through Analytics, Collaboration, and Decision Support: Role of BI and Social Media.
Assessing level of expertise
One of the observed persons is the expert, the other a trainee.
Which video shows the recordings of the expert?
Cooperation with Prof. Dr. Jörg Thomaschewski, HS Emden/Leer
HCI: Interaction between Speech, Gestures, Gaze and Environment
• Using motion capturing and eye tracking, we measured gaze and gestures during the communication of references
• Result: the best approximation of the pointing direction was achieved by taking the gaze direction of the dominant eye into account (Pfeiffer, 2011)
Implications for the addressee
Pfeiffer, T. (2011). Understanding Multimodal Deixis with Gaze and Gesture in Conversational Interfaces (Berichte aus der Informatik). Aachen, Germany: Shaker Verlag.
Grounding with the Eyes: Joint Attention
• If interaction partners deliberately direct their attention towards a target, this is called Joint Attention.
• In establishing Joint Attention, it is important in which sequence the gaze alternates between the target and the interlocutor ⇒ “Communication Protocol”
Pfeiffer-Leßmann, N., Pfeiffer, T., & Wachsmuth, I. (2013). A model of joint attention for humans and machines. ECEM 2013, JEMR Vol. 6, pp. 152–152.
Desktop-based 2D Systems
Advantages
• Strong assumption of comparable perspectives between participants
• Strong assumption about temporal synchronization of perceived stimulus onsets (because they are always within the field of view)
• Effortless identification of gaze targets
• Convenient tools for analyzing gaze data (Scanpath, Heatmap, AOI/ROI)
Desktop-based 2D Systems
Disadvantages
• Restricted application domain
• restricted field of view
• restricted presentations of 3D stimuli
• restricted interaction with other modalities (walking, sports, etc.)
• only simple interactive situations
• almost no social interactions
• no real-life situations
• In almost all cases, the target domain needs to be modelled in the computer to be subject to analysis
Current Trend: From Stationary Eye Tracking to Mobile Systems
Leaving the Laboratory, Embracing the Real World
Studying Real Interactions
Studies on human-human interactions in close interaction spaces.
Measuring Mobile Eye Tracking Data
• Basic idea similar to desktop
• Mapping eye orientation to video plane for analysis
• General eye position and arrangement of camera and plane known by design
• Hard part is detecting the pupil in different lighting conditions and environments
Why is mobile eye tracking then so difficult?
• Main problems:
• Content on the analysis plane is not known
• dynamic environments
• moving head ⇒ moving camera ⇒ moving content
• Location of content on the analysis plane depends on time, position and orientation of the wearer’s head ⇒ highly individual data
• Fixation data cannot be aggregated simply by location
• What is a fixation in a mobile setting anyway?
• Standard methods of analysis are not directly applicable
• They rely on the assumption of static content on the plane of analysis that does not change over time and/or between participants
• Regions of interest go in the right direction, but they are also normally presented visually in static locations
Standard Solution: Manual Annotation
• Manual Annotation of Gaze Videos
• going through the recordings
• frame-by-frame or
• fixation-by-fixation
• labelling each fixation according to underlying content
• Some approaches to speed up this process exist
• e.g. SemantiCode
• Result: comparable to region analysis, good for statistics, but no precise location on stimuli
Problems with Manual Annotation
• Direct problems
• very time consuming, often 15x original recording time
• cost/benefit ratio renders many studies infeasible
• differences in interpretation between annotators
• mitigated by measuring inter-rater agreement: several annotators annotate (selected) sequences
• Indirect problems
• because of the effort, re-analysis is unlikely, and thus post-hoc changes to the annotation manual rarely happen
• reduces scientific quality
• errors in the recordings are often only detected during analysis
• collecting more data is often problematic when the distance in time is too large; additional quality control right after recording again increases the workload
Huge Problem: Increased Numbers of Participants
Options to get out of the misery
• Do not be interested in content
• activity detection by raw gaze data analysis
• drowsiness detection
• detection of cognitive load
Options to get out of the misery
• Identify the content in the plane of analysis (scene camera video) automatically
Harmening, K., & Pfeiffer, T. (2013). Location-based online identification of objects in the centre of visual attention using eye tracking. Proceedings of the First International Workshop on Solutions for Automatic Gaze-Data Analysis (SAGA 2013), Center of Excellence Cognitive Interaction Technology, 38–40.
Options to get out of the misery
• Do away completely with the weak 2D world!
• Standard Intermediate Approach
• Gaze Position & Orientation (2.5D)
⇒ Screen Coordinates (2D)
⇒ Content (2D)
• Direct Approach
• Gaze Position & Orientation (3D)
⇒ Content (3D)
3D Gaze Analysis in Virtual Reality
Parts of the problem already solved
Content
• is already represented in 3D
Gaze
• has to be mapped to the 3D world
Typical Virtual Reality Installation: the CAVE
• 3D Stereo projections surrounding the user
• Head position and orientation is tracked anyway to compute the required perspective for the rendering process
VR Example: Joint Attention – Cognitive Modelling in the Agent Max
Pfeiffer-Leßmann, N., Pfeiffer, T., & Wachsmuth, I. (2012). An operational model of joint attention – timing of gaze patterns in interactions between humans and a virtual human. Proceedings of the 34th Annual Conference of the Cognitive Science Society (pp. 851–856).
Combining Motion Capturing and Eye Tracking
• Scene Camera
• Video-based Eye Tracking
• Binocular
• Infrared LED
• Cable-bound
Construction of the 3D User Model
Diagram: the user model chains transformations. The tracked head position & orientation, combined with the eye distance, gives the positions of the left and right eye; the measured eye orientation gives each eye’s gaze direction (see the sketch below).
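A sketch of this chain of transformations using homogeneous 4x4 matrices. The conventions (eyes offset along the head’s x-axis, gaze directions given in head coordinates) are assumptions for illustration, not the original implementation:

import numpy as np

def gaze_rays(head_pose, eye_distance, gaze_dir_left, gaze_dir_right):
    """Construct world-space gaze rays for both eyes.

    head_pose: 4x4 homogeneous matrix (head position & orientation)
    eye_distance: interpupillary distance in meters (user-specific)
    gaze_dir_*: unit gaze directions in head coordinates from the tracker
    Returns ((origin_l, dir_l), (origin_r, dir_r)) in world coordinates.
    """
    R = head_pose[:3, :3]                      # head orientation
    rays = []
    for offset, gaze_dir in ((-eye_distance / 2, gaze_dir_left),
                             (+eye_distance / 2, gaze_dir_right)):
        eye_local = np.array([offset, 0.0, 0.0, 1.0])   # eye in head frame
        origin = (head_pose @ eye_local)[:3]            # eye in world frame
        direction = R @ np.asarray(gaze_dir)            # rotate gaze to world
        rays.append((origin, direction / np.linalg.norm(direction)))
    return tuple(rays)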
Accuracy and Precision
(Pfeiffer, 2008)
Model-based Determination of 3D Point-of-Regard
3D Point-of-Regard
• Basic approach
• requires only monocular eye tracking
• position is determined by intersecting the gaze ray with object models (see the sketch below)
3D Point-of-Regard? The intersection presumes that the gaze actually rests on a modelled surface.
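A sketch of the gaze-ray intersection against a simple object model, here an axis-aligned box tested with the slab method; real scenes would use meshes and an acceleration structure:

import numpy as np

def intersect_ray_aabb(origin, direction, box_min, box_max):
    """Return the 3D point where a gaze ray first hits an axis-aligned box,
    or None. Slab method; assumes no exactly-zero direction component.
    """
    origin = np.asarray(origin, float)
    direction = np.asarray(direction, float)
    t1 = (np.asarray(box_min, float) - origin) / direction
    t2 = (np.asarray(box_max, float) - origin) / direction
    t_near = np.maximum(np.minimum(t1, t2), 0.0).max()   # entry distance
    t_far = np.maximum(t1, t2).min()                     # exit distance
    if t_near > t_far:
        return None                       # ray misses (or box is behind)
    return origin + t_near * direction    # 3D point-of-regard candidate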
Taking Vergence Movements into Account
3D Point-of-Regard! Gaze depth is determined by analyzing vergence movements: the point-of-regard is estimated where the lines of sight of both eyes converge (see the sketch below).
3D Point-of-Regard? For distant targets the vergence angle becomes very small, so the depth estimate grows increasingly uncertain.
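A sketch of vergence-based depth estimation under the usual assumption that the two gaze rays rarely intersect exactly, so the point-of-regard is taken as the midpoint of the shortest segment between them:

import numpy as np

def vergence_por(o_l, d_l, o_r, d_r):
    """Midpoint of the shortest segment between the two gaze rays.

    o_*: eye positions (ray origins), d_*: unit gaze directions.
    Solves for parameters s, t minimizing |(o_l + s*d_l) - (o_r + t*d_r)|.
    """
    o_l, d_l = np.asarray(o_l, float), np.asarray(d_l, float)
    o_r, d_r = np.asarray(o_r, float), np.asarray(d_r, float)
    w = o_l - o_r
    a, b, c = d_l @ d_l, d_l @ d_r, d_r @ d_r
    d, e = d_l @ w, d_r @ w
    denom = a * c - b * b              # ~0 when the rays are near-parallel
    if abs(denom) < 1e-9:
        return None                    # parallel gaze: depth undefined
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p_l = o_l + s * d_l                # closest point on the left gaze ray
    p_r = o_r + t * d_r                # closest point on the right gaze ray
    return (p_l + p_r) / 2.0           # estimated 3D point-of-regard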
Machine Learning Approach
• Comparison: intersection of the line of sight vs. a Parameterized Self-Organizing Map (PSOM, machine learning)
(Pfeiffer, Latoschik and Wachsmuth, 2010)
Figure: 3D point-of-regard by intersection vs. 3D point-of-regard based on machine learning (PSOM)
Visualization: 3D Scan Path (Single Person)
• Fixations as spheres
• Size represents duration
• Saccades represented as links
Visualization: 3D Scan Path (Multiple Persons)
Data
• 3D points-of-regard determined using PSOM
• 10 persons
• Visualization not suitable for many parallel 3D scan paths
Model-of-Interest based Visualization
(Stellmach, Nacke and Dachselt, 2010)
Data
• 3D point-of-regard determined by model intersection
• Recorded on desktop, monocular eye tracking
Visualization
• Color-coding duration of attention or number of fixations per object
• Analogous to 2D Heatmaps
• red: most-often fixated areas
• blue: rarely fixated areas
• uncolored: not fixated areas
Surface-based Visualization
(Stellmach, Nacke and Dachselt, 2010)
Data
• 3D point-of-regard determined by model intersection
• Recorded on desktop, monocular eye tracking
Visualization
• Color-coding duration of attention or number of fixations per object
3D Attention Volumes
Data
• 3D point-of-regard based on PSOM
Visualization
• Volume rendering of attention
• Models of the objects of interest are not necessarily required
Pfeiffer, T. (2011). Understanding Multimodal Deixis with Gaze and Gesture in Conversational Interfaces (Berichte aus der Informatik). Aachen, Germany: Shaker Verlag.
3D Attention Volumes on Real Objects
Transition to 3D Gaze Analysis in Real Life
Parts of the problem already solved
Content
• Idea:
• only model relevant aspects of the world, so-called proxy objects
Gaze
• has to be mapped to the 3D world
Getting Head Position & Orientation
Egocentric Camera Pose Estimation using the Scene Camera (see the sketch after this list)
• inexpensive
• requires computational power
• might be intrusive to design (markers)
Camera Pose Estimation using Outside-in Tracking
• high precision
• expensive (20,000 and up)
• restricted area
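For the egocentric, marker-based variant, a sketch using the classic cv2.aruco API from opencv-contrib-python (versions before 4.7); the intrinsics, marker size and dictionary below are assumed values, and this is not the actual EyeSee3D implementation:

import cv2
import numpy as np

# Assumed camera intrinsics from a one-time scene-camera calibration.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
dist = np.zeros(5)          # assumed: negligible lens distortion
MARKER_LENGTH = 0.10        # printed marker side length in meters (assumed)

aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

def camera_pose(frame):
    """Estimate the scene-camera pose relative to the first detected marker."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    corners, ids, _ = cv2.aruco.detectMarkers(gray, aruco_dict)
    if ids is None:
        return None
    rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
        corners, MARKER_LENGTH, K, dist)
    R, _ = cv2.Rodrigues(rvecs[0])      # rotation: marker -> camera
    # Invert to express the camera pose in the marker's coordinate frame.
    return R.T, (-R.T @ tvecs[0].reshape(3))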
Camera Pose Estimation
Figure: pose estimation yields the 3D position & orientation of the scene camera.
3D AOI: Annotated Proxy Geometry
Figure: annotated proxy geometries (Window, Door, Chimney), placed relative to the tracked 3D position & orientation.
Determining Fixation Target
Figure: the gaze ray, anchored at the tracked 3D position & orientation, is intersected with the 3D areas of interest (Window, Door, Chimney) to determine the fixation target.
First: Set Up the Coordinate Frame
Second: Place the Target Objects
Third: Enter 3D Proxy Geometries into the Model
Annotate proxy objects (used to identify the fixation target; a parsing sketch follows below):
<MillimeterField DEF='field1' id='1'>
  <ObservableObject DEF='MyObject' name='AOI' position='0 0 0' size='1 1 1'/>
</MillimeterField>
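For illustration, a minimal loader for such an annotation (my sketch, not EyeSee3D’s actual parser), turning each ObservableObject into an axis-aligned box that can feed a gaze-ray intersection test like the one sketched earlier:

import xml.etree.ElementTree as ET
import numpy as np

def load_proxy_objects(xml_text):
    """Parse ObservableObject annotations into named axis-aligned boxes."""
    root = ET.fromstring(xml_text)
    boxes = {}
    for obj in root.iter("ObservableObject"):
        center = np.array(obj.get("position").split(), dtype=float)
        size = np.array(obj.get("size").split(), dtype=float)
        boxes[obj.get("name")] = (center - size / 2, center + size / 2)
    return boxes

xml_text = """<MillimeterField DEF='field1' id='1'>
  <ObservableObject DEF='MyObject' name='AOI' position='0 0 0' size='1 1 1'/>
</MillimeterField>"""
print(load_proxy_objects(xml_text))   # {'AOI': (min corner, max corner)}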
Third – Advanced Users: Annotate Complex Objects
Alternatively, complex objects can be annotated:
• 3D scans using Microsoft Kinect or Intel RealSense
• 3D modelling, e.g. in Blender
Figure: complex proxy geometries labelled Window, Door and Chimney.
Fourth: Run the Experiment
Alternative 1
• Use the standard procedure of the eye-tracking system to record the video and a sample file with gaze data
• Use EyeSee3D to analyse the video and sample data offline
Alternative 2
• Use EyeSee3D in online mode to get results in real time
Fifth: Collect the Results
• CSV file output:
• Time: absolute time of day
• Framenumber: number of frame from scene camera
• Fixated Object ID: as specified in the model annotation
• Fixated Position: in 3D coordinates
• Observer Position: in 3D coordinates
• Observer Matrix: 4x4 Matrix with Position/Orientation of observer
• Merge these with the event/sample files from the eye tracker (see the sketch below)
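A sketch of that merge step with pandas; the file names are hypothetical, the column names follow the list above, and real sample-file formats vary by vendor:

import pandas as pd

# EyeSee3D output and eye-tracker samples, both keyed by frame number
# (hypothetical file names; adjust to your recording session).
eyesee = pd.read_csv("eyesee3d_output.csv")    # Framenumber, FixatedObjectID, ...
samples = pd.read_csv("tracker_samples.csv")   # Framenumber, PupilSize, ...

# Align per video frame; 'left' keeps every EyeSee3D row even when the
# tracker has no sample for that frame.
merged = eyesee.merge(samples, on="Framenumber", how="left")
merged.to_csv("merged_gaze_data.csv", index=False)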
Efficient Analysis of Mobile Eye-Tracking Studies
Pfeiffer, T., & Renner, P. (2014). EyeSee3D: A Low-Cost Approach for Analysing Mobile 3D Eye Tracking Data Using Augmented Reality Technology. Proceedings of the Symposium on Eye Tracking Research and Applications, 195–202.
Eye-Hand Coordination
Towards Visualizations for 3D Eye Tracking
(Stellmach, Nacke and Dachselt, 2010)
Maurus et al. (2014). Realistic Heatmap Visualization for Interactive Analysis of 3D Gaze Data. ETRA 2014.
Recent work: Towards more realistic 3D attention mapping
• Problems with existing approaches
• based on intersections, not real 3D gaze position [maurus, stellmach]
• centered on objects (no cross object scattering) [stellmach]
• no check for occlusions [stellmach]
• visualization based on vertex coloring [stellmach]
• no support for moving objects [maurus, stellmach]
• Application side
• require dedicated viewer [maurus]
• post-processing process [maurus, stellmach]
• sub-optimal rendering quality [maurus, stellmach]
Recent work
Pfeiffer, T., & Memili, C. (2015). GPU-accelerated Attention Map Generation for Dynamic 3D Scenes. IEEE VR 2015.
3D Attention Mapping on 3D Objects
Our approach
• Realistic 3D Point-of-Regard Modelling
• Shadow mapping for every 3D fixation
• Binocular eye tracking for depth estimation
• 3D Gaussian to represent spread of attention around 3D POR
• Per-object representation of attention in Attention Texture
• Provides adjustable level of detail (texture size, texture UV mapping)
• Allows for moving/transforming objects
• Global maximum collection in Max-Attention-Texture
• Speed-up normalization by reducing read/write cycles
• Splitting attention aggregation from heatmap generation
• Attention is aggregated on per-object level
• Heatmap textures are generated on-the-fly using a shader
• Heatmap textures can be exported for high-quality renderings (a simplified sketch of the accumulation step follows below)
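A much-simplified CPU sketch of the accumulation idea in Python/numpy (the slides describe a GPU shader implementation): attention is splatted as a 3D Gaussian around the point-of-regard into a per-object texture whose texels carry precomputed world-space surface positions. All names and the sigma value are assumptions:

import numpy as np

def splat_attention(att_tex, texel_positions, por, duration, sigma=0.05):
    """Accumulate one 3D fixation into an object's attention texture.

    att_tex: (H, W) per-object attention texture, accumulated in place
    texel_positions: (H, W, 3) world-space surface point of each texel
    por: 3D point-of-regard; duration: fixation duration (weight)
    sigma: spread of the 3D Gaussian around the POR (assumed, in meters)
    """
    d2 = np.sum((texel_positions - np.asarray(por)) ** 2, axis=-1)
    att_tex += duration * np.exp(-d2 / (2.0 * sigma**2))

# Normalization against the global maximum (the Max-Attention-Texture idea):
# divide each object's texture by the maximum over all objects before
# mapping the values to a heat color ramp in the renderer.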
Performance, measured on a Quadro K5000 (173 GB/s, 256-bit memory interface).
Mapping Attention in Complex 3D Scenarios