Visual Analysis of Text Annotations for Stance Classification with ALVA



The automatic detection and classification of stance taking in text data using natural language processing and machine learning methods create an opportunity to gain insight about writers’ feelings and attitudes towards their own and other people's utterances. However, this task presents multiple challenges related to the training data collection as well as the actual classifier training. In order to facilitate the process of training a stance classifier, we propose a visual analytics approach called ALVA for text data annotation and visualization. Our approach supports the annotation process management and supplies annotators with a clean user interface for labeling utterances with several stance categories. The analysts are provided with a visualization of stance annotations which facilitates the analysis of categories used by the annotators. ALVA is already being used by our domain experts in linguistics and computational linguistics in order to improve the understanding of stance phenomena and to build a stance classifier for applications such as social media monitoring.

Kostiantyn Kucher*

Linnaeus University, Sweden

*Contact: [email protected]

Screenshot of the web-based annotation interface of ALVA. Annotators are presented with a single utterance at a time. They can label it with one or several stance categories, label it as neutral (no stance), or label it as irrelevant (e.g., if the text contains only URLs or numbers).

Visualization of about 8,000 text annotations in our system ALVA with the CatCombos representation. Each annotation, represented by a colored dot, can be labeled with up to ten stance categories in our concrete use case. Annotations are grouped together into rectangular blocks by the combination of categories which occur in the data set. Thus, the block groups form layers: the top layer contains 15 annotation blocks labeled with four categories simultaneously, and the bottom layer, with a single block, contains only neutral annotations. Color-coded rectangles in the block headers represent the corresponding sets of categories. Here, all annotations related to the category Concession and Contrariness are highlighted in blue.

http://cs.lnu.se/isovis/

Carita Paradis

Lund University, Sweden

Magnus Sahlgren

Gavagai AB, Sweden

Andreas Kerren

Linnaeus University, Sweden

The overview of the main aspects of ALVA.

EuroVis ‘16, 6-10 June 2016, Groningen, The Netherlands

Overview diagram components: social media texts collected with our previous tool uVSAT; MongoDB; annotation interface; annotation management; visualization of annotations.

Our current data set comprises about 8,000 annotations of utterances in English (in most cases, individual sentences) collected from social media on political topics such as the US election. The annotations were performed by several annotators during multiple annotation rounds.

As displayed in the Entity-Relationship diagram, a single annotation corresponds to a combination of annotator, annotation round value, and actual utterance.
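The relationship described above can be sketched as a small data model. This is a minimal illustration and not ALVA's actual schema: the class name `Annotation`, its field names, and the helper `index_annotations` are assumptions made for this sketch. Only the composite key (annotator, annotation round, utterance) and the neutral/irrelevant distinction come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record (not ALVA's actual schema)."""
    annotator: str                        # who labeled the utterance
    round_no: int                         # annotation round value
    utterance_id: str                     # the utterance being labeled
    categories: frozenset = frozenset()   # stance categories; empty if none
    irrelevant: bool = False              # e.g. text with only URLs or numbers

    @property
    def key(self):
        # One annotation per (annotator, round, utterance) combination.
        return (self.annotator, self.round_no, self.utterance_id)

    @property
    def is_neutral(self):
        # Neutral means "no stance", as opposed to irrelevant text.
        return not self.categories and not self.irrelevant


def index_annotations(annotations):
    """Index annotations by their composite key, rejecting duplicates."""
    index = {}
    for ann in annotations:
        if ann.key in index:
            raise ValueError(f"duplicate annotation for {ann.key}")
        index[ann.key] = ann
    return index
```

Two annotators labeling the same utterance in the same round produce two distinct records, which is what makes a later comparison of annotations for one utterance possible.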

The analysts (researchers in linguistics and computational linguistics) are interested in the following questions corresponding to visualization tasks:

•  Are there many annotations marked as neutral and irrelevant?

•  What is the distribution of individual stance categories in the data?

•  Are there many annotations labeled with multiple categories?

•  Which stance categories tend to co-occur in annotations?

•  Is it possible to compare annotations made for the same utterances?
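Each of the questions above corresponds to a simple aggregate over the annotation data. The following is a rough sketch of those aggregates, not the computation ALVA performs: the function name `summarize`, the dictionary keys, and the placeholder category names `"A"` and `"B"` are assumptions for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def summarize(annotations):
    """Compute the aggregates behind the analysts' questions.

    Each annotation is a dict with the hypothetical keys
    'utterance_id', 'categories' (a set) and optionally 'irrelevant'.
    """
    neutral = irrelevant = multi_label = 0
    per_category = Counter()       # distribution of individual categories
    co_occurrence = Counter()      # counts of unordered category pairs
    by_utterance = defaultdict(list)

    for ann in annotations:
        cats = sorted(ann["categories"])
        if ann.get("irrelevant", False):
            irrelevant += 1
        elif not cats:
            neutral += 1
        if len(cats) > 1:
            multi_label += 1
        per_category.update(cats)
        # Sorting first makes each pair appear in one canonical order.
        co_occurrence.update(combinations(cats, 2))
        by_utterance[ann["utterance_id"]].append(ann)

    # Utterances annotated more than once can be compared across annotators.
    comparable = {u: anns for u, anns in by_utterance.items() if len(anns) > 1}
    return {
        "neutral": neutral,
        "irrelevant": irrelevant,
        "multi_label": multi_label,
        "per_category": per_category,
        "co_occurrence": co_occurrence,
        "comparable": comparable,
    }


sample = [
    {"utterance_id": "u1", "categories": {"A", "B"}},
    {"utterance_id": "u1", "categories": {"A"}},
    {"utterance_id": "u2", "categories": set()},
    {"utterance_id": "u3", "categories": set(), "irrelevant": True},
]
summary = summarize(sample)
```

These counts give numbers only; the visualization is what lets the analysts see them in context and drill into individual annotations.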

We have designed a representation called CatCombos for our visualization which is based on the ideas of semantic substrates and set visualizations. It focuses on the groups of annotations rather than individual annotations to provide overview. By combining it with dynamic queries, details on demand, and highlighting links between annotations made for the same utterance, the analysts can use ALVA for exploratory visual analysis of the annotated data.
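The grouping behind CatCombos, as described in the figure caption, can be approximated as follows: annotations that share the same combination of categories form a block, and blocks with the same number of categories form a layer. This is a minimal sketch of that grouping only, under assumed names (`build_layers`, `highlight`, and the placeholder categories `"A"` and `"B"`); it does not reproduce ALVA's layout or rendering, and how irrelevant annotations are placed is not covered here.

```python
from collections import defaultdict


def build_layers(annotations):
    """Group (annotation_id, categories) pairs into blocks and layers.

    Returns a list of layers ordered from the largest category
    combinations (top) down to the neutral block (bottom). Each layer
    maps a frozenset of categories to the annotation ids in that block.
    """
    blocks = defaultdict(list)
    for ann_id, cats in annotations:
        blocks[frozenset(cats)].append(ann_id)

    layers = defaultdict(dict)
    for combo, ids in blocks.items():
        layers[len(combo)][combo] = ids

    return [layers[size] for size in sorted(layers, reverse=True)]


def highlight(annotations, category):
    """Dynamic-query sketch: ids of annotations carrying one category."""
    return {ann_id for ann_id, cats in annotations if category in cats}


data = [
    (1, {"A", "B"}),
    (2, {"A"}),
    (3, set()),                              # neutral annotation
    (4, {"B", "A"}),                         # same combination as id 1
    (5, {"Concession and Contrariness"}),
]
layers = build_layers(data)
selected = highlight(data, "A")
```

Keying blocks by a `frozenset` makes the combination order-independent, so annotations 1 and 4 land in the same block.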

Overview diagram roles: analyst, annotator.
