Mining Arguments from Online Debating Systems

6th MLDM.it Workshop @ AI*IA 2017. Andrea Pazienza, Stefano Ferilli. 14th November – Bari, Italy

Transcript of Mining Arguments from Online Debating Systems

Page 1: Mining Arguments from Online Debating Systems

MINING ARGUMENTS FROM ONLINE DEBATING SYSTEMS

6TH MLDM.IT WORKSHOP @ AI*IA 2017

Andrea Pazienza, Stefano Ferilli

14th November 2017 – Bari, Italy

Page 2: Mining Arguments from Online Debating Systems

Overview

1. Introduction to Argumentation

2. Mining Argumentation Graphs from Online Debating Systems

3. Application to a Reddit Thread

4. Conclusions and Future Works

Page 3: Mining Arguments from Online Debating Systems

INTRODUCTION TO ARGUMENTATION

Page 4: Mining Arguments from Online Debating Systems

Introduction to Argumentation

Argument Mining: analyzing argumentation structures in the discourse

Abstract Argumentation: a framework for practical and uncertain reasoning, able to cope with partial and inconsistent knowledge

Page 5: Mining Arguments from Online Debating Systems

Argument Mining

Where are we?
# Computational linguistics: statistical or rule-based modeling of natural language
# NLP: interactions between computers and human languages
# Data mining: automatic extraction of data (often numerical)
# Text mining: automatic extraction of data from natural language texts
# Argument mining: automatic extraction of arguments from natural language texts

Why do it?
# Big Data problem: the ever-increasing amount of data on the web makes manual analysis of this content increasingly infeasible.

How to process Big Data?
# By mining information

Page 6: Mining Arguments from Online Debating Systems

Argument Mining

Existing approaches in NLP primarily focus on the micro-level (monological) rather than the macro-level (dialogical) perspective

What can we mine?
# Sentiment Analysis: mining attitudes towards something (positive, neutral, negative)
# Opinion Mining: mining opinions about something
# Graph-based dialogical model

Page 7: Mining Arguments from Online Debating Systems

(Macro-level) Argument Mining

Argument Mining
# Segmenting texts into argumentative units
# Identifying relations between units
# Analyzing Polarity and classifying Stance
  ◦ Sentiment Analysis: e.g. number of people liking the new Mercedes vs. number of people not liking the new Mercedes
  ◦ Opinion Mining: e.g. people thinking the new Mercedes is too expensive, people thinking the new Mercedes is reliable
# Mining arguments for and against people’s opinions
  ◦ e.g. not only the information that people like the new Mercedes but also why: people like the new Mercedes because they think it is reliable

Page 8: Mining Arguments from Online Debating Systems

Abstract Argumentation

Argumentation Framework (AF)# encapsulates arguments as nodes in a digraph

# connects them through a relationship of attack

# defines a calculus of opposition for determining what is acceptable

# allows a range of different semantics
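One way to make the "calculus of opposition" concrete is the grounded semantics, one of the classical extension-based semantics for an AF. The sketch below computes it as the least fixed point of the characteristic function; the function name and the toy attack relation are illustrative assumptions and are not taken from the slide's figure.

```python
def grounded_extension(arguments, attacks):
    """Return the grounded extension of the AF (arguments, attacks).

    `attacks` is a set of (attacker, target) pairs over `arguments`.
    """
    # Index the attackers of every argument.
    attackers = {a: set() for a in arguments}
    for attacker, target in attacks:
        attackers[target].add(attacker)

    accepted = set()
    while True:
        # An argument is defended by `accepted` when each of its attackers
        # is itself attacked by at least one accepted argument.
        defended = {
            a for a in arguments
            if all(attackers[b] & accepted for b in attackers[a])
        }
        if defended == accepted:
            return accepted
        accepted = defended


# Hypothetical chain: a attacks b, b attacks c.
print(grounded_extension({"a", "b", "c"}, {("a", "b"), ("b", "c")}))
```

In the chain, `a` is unattacked and therefore accepted, and it defends `c` against `b`. Other semantics (preferred, stable, and so on) would select sets of arguments by different criteria over the same digraph.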

[Figure: example argumentation framework, a digraph over the arguments a, b, c, d, e, f, g, h]

Generalizations of Argumentation Frameworks
# Bipolar: add support relation
# Weighted: add weights on attacks
# Values, Preferences
# etc.

Extension-based vs Ranking-based Semantics
# Extension-based semantics do not fully exploit the weight of relations
# Ranking-based semantics rank arguments from the most to the least acceptable

Page 9: Mining Arguments from Online Debating Systems

MINING ARGUMENTATION GRAPHS FROM ONLINE DEBATING SYSTEMS

Page 10: Mining Arguments from Online Debating Systems

Online Debating Systems (ODS)

Classical Thread Discussion
# A tree (i.e., hierarchical) structure consisting of
  ◦ Root node: the discussion topic, i.e. the major claim, shared by a user, followed by
  ◦ Comments from other users
  ◦ Each of these comments has as children the comments in response to it.

Step 1: Identification of arguments
# Each piece of content (the root post or a comment) is an abstract argument
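Step 1 can be sketched as a traversal of the thread tree that turns every post or comment into an abstract argument and records which comment replies to which. The `Comment` class, the function name and the `a0, a1, ...` identifier scheme are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    text: str
    replies: list = field(default_factory=list)


def thread_to_arguments(root):
    """Flatten a thread tree into arguments and (parent, child) reply pairs."""
    arguments = {}    # argument id -> text of the post or comment
    reply_pairs = []  # (parent id, child id): consecutive comments
    stack = [(root, None)]
    while stack:
        comment, parent_id = stack.pop()
        arg_id = f"a{len(arguments)}"  # hypothetical naming: a0, a1, ...
        arguments[arg_id] = comment.text
        if parent_id is not None:
            reply_pairs.append((parent_id, arg_id))
        # Push replies in reverse so they are numbered in their original order.
        for reply in reversed(comment.replies):
            stack.append((reply, arg_id))
    return arguments, reply_pairs


thread = Comment("major claim", [
    Comment("first comment", [Comment("reply to first comment")]),
    Comment("second comment"),
])
arguments, reply_pairs = thread_to_arguments(thread)
print(reply_pairs)
```

The reply pairs are the candidate edges of the argumentation graph; deciding whether each one is an attack or a support is left to the relation-identification step.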

Page 11: Mining Arguments from Online Debating Systems

Problem Approach

Step 2: Identification of relations between arguments

# Sentiment and Tone Analysis
  ◦ systematically identify, extract and quantify Polarity
  ◦ 5 Polarity classes: very negative, negative, neutral, positive, very positive
  ◦ StanfordCoreNLP provides the sentiment tool
# Text Similarity
  ◦ check if two arguments are addressing the same topic
  ◦ Semantic Similarity via word embeddings
  ◦ GloVe algorithm for obtaining vector representations for words

Given two abstract arguments α, β ∈ A, representing two consecutive comments of the online debate:

w(⟨α, β⟩) = similarity(⟨α, β⟩) · sentiment(β)    (1)

# similarity : A × A → [0, 1]
# sentiment : A → [−1, 1]
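Equation (1) can be illustrated with a minimal sketch under loud assumptions: similarity is taken as the cosine of averaged word vectors, clipped to [0, 1], and the five polarity classes are mapped linearly onto [−1, 1]. Neither choice is confirmed by the slides. The toy embedding table stands in for GloVe vectors, and the polarity label is passed in directly instead of being produced by a sentiment tool.

```python
import math

# Assumed linear mapping of the five polarity classes onto [-1, 1].
POLARITY_SCORE = {
    "very negative": -1.0,
    "negative": -0.5,
    "neutral": 0.0,
    "positive": 0.5,
    "very positive": 1.0,
}


def mean_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def similarity(tokens_a, tokens_b, embeddings):
    """Cosine similarity of the mean vectors, clipped to [0, 1]."""
    va = mean_vector(tokens_a, embeddings)
    vb = mean_vector(tokens_b, embeddings)
    if va is None or vb is None:
        return 0.0
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    if norm == 0.0:
        return 0.0
    return max(0.0, min(1.0, dot / norm))


def edge_weight(tokens_a, tokens_b, polarity_b, embeddings):
    """w(<a, b>) = similarity(<a, b>) * sentiment(b), as in Equation (1)."""
    return similarity(tokens_a, tokens_b, embeddings) * POLARITY_SCORE[polarity_b]


# Toy two-dimensional vectors, purely illustrative.
toy = {"good": [1.0, 0.0], "great": [1.0, 0.0], "bad": [0.0, 1.0]}
print(edge_weight(["good"], ["great"], "negative", toy))
```

Because similarity is non-negative, the sign of the weight comes entirely from the sentiment of the replying comment β, while similarity scales how strongly the reply bears on α.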

Page 12: Mining Arguments from Online Debating Systems

Bipolar Weighted Argumentation Framework

Bipolar Weighted Argumentation Framework (BWAF)

# attack relations with a negative weight in the interval [−1, 0[
# support relations with a positive weight in the interval ]0, 1]
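The two intervals translate directly into a sign test on the edge weight. A minimal sketch, with hypothetical function names and toy edges, that splits weighted edges into attacks and supports and rejects a weight of zero or any weight outside [−1, 1]:

```python
def relation_kind(weight):
    """Classify a BWAF edge by the interval its weight falls in."""
    if -1.0 <= weight < 0.0:
        return "attack"
    if 0.0 < weight <= 1.0:
        return "support"
    raise ValueError(f"weight {weight} is neither in [-1, 0[ nor in ]0, 1]")


def split_relations(weighted_edges):
    """Split {(source, target): weight} into attack and support relations."""
    attacks, supports = {}, {}
    for edge, weight in weighted_edges.items():
        if relation_kind(weight) == "attack":
            attacks[edge] = weight
        else:
            supports[edge] = weight
    return attacks, supports


# Hypothetical edges, not read off the slide's figure.
attacks, supports = split_relations({("b", "a"): 0.7, ("e", "a"): -0.7})
print(attacks, supports)
```

How a pair of comments whose weight evaluates to exactly zero should be handled (for instance, dropped from the graph) is not stated in the slides; the sketch simply raises an error.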

[Figure: example BWAF over the arguments a to h; edges are labeled with weights ranging from -0.7 to 0.9]

BWAF Ranking-based Semantics by means of Strength Propagation

Page 13: Mining Arguments from Online Debating Systems

APPLICATION TO A REDDIT THREAD

Page 14: Mining Arguments from Online Debating Systems

Application to a Reddit Thread

We consider a Reddit discussion of an episode of Black Mirror, a popular TV series

[Figure: BWAF mined from the Reddit thread; node labels range from a0 to a85 and edges are labeled with weights between -0.5 and 0.5]

# Sentiment Polarity + Text Similarity to build the BWAF
# Argument acceptability via sp-ranking Semantics
# The construction procedure may introduce some noise, but it is simple and computationally fast, and the instantiated argumentation model remains quite reliable.

Page 15: Mining Arguments from Online Debating Systems

Application to a Reddit Thread

arg  sp      arg  sp      arg  sp      arg  sp      arg  sp      arg  sp
a4   1.45    a80  1.0     a59  1.0     a38  1.0     a13  1.0     a32  0.8176
a43  1.43    a78  1.0     a57  1.0     a36  1.0     a8   1.0     a34  0.76
a22  1.32    a76  1.0     a55  1.0     a35  1.0     a5   1.0     a11  0.7077
a10  1.108   a73  1.0     a53  1.0     a33  1.0     a58  0.9641  a39  0.68
a19  1.0855  a72  1.0     a51  1.0     a29  1.0     a28  0.9595  a56  0.67
a50  1.08    a71  1.0     a49  1.0     a26  1.0     a18  0.9589  a37  0.66
a48  1.05    a69  1.0     a47  1.0     a24  1.0     a74  0.954   a75  0.58
a9   1.0425  a66  1.0     a45  1.0     a23  1.0     a60  0.9488  a68  0.56
a17  1.0197  a65  1.0     a44  1.0     a21  1.0     a0   0.9225  a20  0.55
a27  1.0035  a64  1.0     a42  1.0     a16  1.0     a31  0.9047  a12  0.54
a85  1.0     a63  1.0     a41  1.0     a15  1.0     a54  0.8492
a84  1.0     a62  1.0     a40  1.0     a14  1.0     a61  0.84

The (collective) strength propagation along all paths ending at a node takes advantage of the weights of the relations and, in particular, of the weighted support relations
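Once sp values are available, turning them into a ranking is a matter of sorting and grouping ties. The sketch below does only that final step, on a handful of values copied from the table above; it does not compute the strength propagation itself, and the function name is an assumption.

```python
from itertools import groupby


def rank_by_strength(sp):
    """Group arguments into ranks, strongest first; equal scores share a rank."""
    ordered = sorted(sp.items(), key=lambda item: -item[1])
    return [
        sorted(arg for arg, _ in group)
        for _, group in groupby(ordered, key=lambda item: item[1])
    ]


# A few (argument, sp) pairs taken from the table.
sp = {"a4": 1.45, "a43": 1.43, "a85": 1.0, "a84": 1.0, "a12": 0.54}
print(rank_by_strength(sp))
```

Grouping ties matters for this thread because most arguments share the value 1.0 and are therefore equally acceptable under the ranking.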

Page 16: Mining Arguments from Online Debating Systems

CONCLUSIONS AND FUTURE WORKS

Page 17: Mining Arguments from Online Debating Systems

Conclusions and Future Works

# Argument Mining to extract arguments and the relations between them, building a graph-based dialogical model
# Considering the similarity between the comments, the sentiment associated with them and their hierarchical structure, we extract an AF that models an online debate by identifying weighted attacks and supports depending on their strength
# To improve the quality of the argument graph construction, further argument mining techniques may be exploited, even though this may drastically increase the computational cost.

Page 18: Mining Arguments from Online Debating Systems

Perspectives

Opinion mining:
# understanding what people think about something vs. understanding why

Going beyond critical thinking, i.e., a set of rational, deductive arguments:
# How to influence a real audience?
# What is the role of emotions?
# How to analyze human reasoning processes?

Big data:
# social network posts, forums, blogs, product reviews, user comments on newspaper articles, etc.

Deep learning:
# fast and efficient machine learning algorithms for large and unsupervised corpora
# e.g., word embeddings: automatically learned feature spaces encoding high-level, rich linguistic similarity between terms