Comparing and Combining Sentiment Analysis Methods


24 Oct 2013


Pollyanna Gonçalves (UFMG, Brazil)
Matheus Araújo (UFMG, Brazil)
Fabrício Benevenuto (UFMG, Brazil)
Meeyoung Cha (KAIST, Korea)






Sentiment Analysis on Social Networks

- Key component of a new wave of applications that explore social network data
- Summary of public opinion about:
  - politics, products, services (e.g. a new car, a movie), etc.
- Monitor social network data (in real time)
- Commonly framed as polarity analysis (positive or negative)



Sentiment Analysis Methods

- Which method to use?
  - Several methods have been proposed for different contexts
  - Several of them are popular
  - Validations rely on examples, comparisons with baselines, and limited datasets
- There is no proper comparison among methods
  - Advantages? Disadvantages? Limitations?


This talk

- Compare 8 popular sentiment analysis methods
  - Focus on the task of detecting polarity: positive vs. negative
- Combine methods
- Deploy the methods in a system: www.ifeel.dcc.ufmg.br

- Methods & Methodology
- Comparing & Combining
- iFeel System & Conclusions




Emoticons

- Extracted from instant messaging services
  - Skype, MSN, Yahoo Messenger, etc.
- Grouped as positive and negative
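A minimal sketch of how an emoticon-based classifier of this kind could work. The two emoticon sets below are hypothetical placeholders (the slides do not list the actual emoticons), and the tie-breaking rule is an assumption made for illustration.

```python
# Hypothetical emoticon sets; the real lists are not given in the slides.
POSITIVE_EMOTICONS = {":)", ":-)", ":D", ";)", "<3"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'(", "D:"}


def emoticon_polarity(message):
    """Return 'positive', 'negative', or None when no emoticon decides."""
    tokens = message.split()
    positive = sum(token in POSITIVE_EMOTICONS for token in tokens)
    negative = sum(token in NEGATIVE_EMOTICONS for token in tokens)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    # No emoticon found, or a tie: the method gives no answer (assumption).
    return None


print(emoticon_polarity("Feeling too happy today :)"))
```

Messages without any emoticon get no label at all, which is one way to see why such a method can be accurate on the messages it labels while covering few of them.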


Linguistic Inquiry and Word Count (LIWC)

- Lexical method (paid software)
  - Allows optimizing the lexical dictionary; we used the default
- Measures various emotional, cognitive, and structural components
  - We only consider sentiment-relevant categories such as positivity and negativity


SentiWordNet

- Lexical approach based on the WordNet dictionary
  - Groups words into sets of synonyms
- Detects positivity, negativity, and neutrality of texts


PANAS-t

- Lexical method adapted from a psychometric scale
- Consists of a dictionary of adjectives associated with sentiments
  - Positive: joviality, assurance, serenity, and surprise
  - Negative: fear, sadness, guilt, hostility, shyness, and fatigue
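The category-to-polarity grouping above can be sketched as a two-level lookup. The category names come from the slide; the adjective lexicon is a hypothetical stand-in for the real PANAS-t dictionary, and the simple counting rule is an assumption, not the scoring used by the method's authors.

```python
# Grouping of PANAS-t categories into polarities, as listed on the slide.
CATEGORY_POLARITY = {
    "joviality": "positive", "assurance": "positive",
    "serenity": "positive", "surprise": "positive",
    "fear": "negative", "sadness": "negative", "guilt": "negative",
    "hostility": "negative", "shyness": "negative", "fatigue": "negative",
}

# Hypothetical mini-lexicon (adjective -> category), for illustration only.
ADJECTIVE_CATEGORY = {
    "happy": "joviality", "confident": "assurance", "calm": "serenity",
    "afraid": "fear", "sad": "sadness", "tired": "fatigue",
}


def panas_polarity(message):
    """Count matched adjectives per polarity; None if nothing decides."""
    counts = {"positive": 0, "negative": 0}
    for word in message.lower().split():
        category = ADJECTIVE_CATEGORY.get(word.strip(".,!?"))
        if category is not None:
            counts[CATEGORY_POLARITY[category]] += 1
    if counts["positive"] == counts["negative"]:
        return None  # no match or a tie (assumed behaviour)
    return max(counts, key=counts.get)
```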


Happiness Index

- Uses a well-known lexical dictionary, Affective Norms for English Words (ANEW)
- Produces a happiness score on a scale from 1 (extremely unhappy) to 9 (extremely happy)
- We consider [1..5) as negative and [5..9] as positive
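The thresholding described above can be written down directly. In this sketch the interval boundaries come from the slide, while the valence table and the plain averaging of word scores are assumptions made for illustration (the values are not real ANEW entries).

```python
# Hypothetical valence scores on the 1..9 scale; not actual ANEW values.
HYPOTHETICAL_VALENCE = {"happy": 8.0, "love": 8.5, "sad": 2.0, "war": 2.0}


def happiness_index(message):
    """Average the valence of known words; None if no word is known."""
    scores = [HYPOTHETICAL_VALENCE[word]
              for word in message.lower().split()
              if word in HYPOTHETICAL_VALENCE]
    return sum(scores) / len(scores) if scores else None


def happiness_polarity(score):
    """Map a score to polarity: [1, 5) is negative, [5, 9] is positive."""
    if not 1 <= score <= 9:
        raise ValueError("score must lie in [1, 9]")
    return "negative" if score < 5 else "positive"
```

Note that a score of exactly 5 falls on the positive side, since the negative interval is half-open.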


SentiStrength

- Combines 9 supervised machine learning methods
- Estimates the strength of positive and negative sentiment in a text
- We used the trained model provided by the authors


SAIL/AIL Sentiment Analyzer (SASA)

- Machine learning method based on a trained Naïve Bayes model
- Trained model implemented as a Python library
- Classifies tweets in JSON format as positive, negative, neutral, or unsure


SenticNet

- Extracts cognitive and affective information using natural language processing techniques
- Uses the affective categorization model Hourglass of Emotions
- Provides an approach that classifies messages as positive or negative


Methodology

- Comparison of coverage and prediction performance across different datasets
- Dataset 1: human labeled
  - About 12,000 messages labeled with Amazon Mechanical Turk
  - Twitter, MySpace, YouTube, and Digg comments; BBC and Runners World forums
- Dataset 2: unlabeled
  - Complete snapshot of Twitter (collected in 2009), ~2 billion tweets
  - Extracted tweets about tragedies, disasters, movie releases, and political events
- Focus on English messages




What is the coverage of each method?

Coverage vs. Prediction Performance

- Emoticons: best prediction and worst coverage
- SentiStrength: second in prediction and third in coverage
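To make the two axes concrete, here is a small sketch of how coverage and prediction performance could be computed for one method. It assumes that coverage is the fraction of messages for which the method returns any polarity, that a missing answer is represented by None, and that the F-measure is computed only over the messages the method labeled; the slides do not spell out these details.

```python
def coverage(predictions):
    """Fraction of messages that received a polarity (None = no answer)."""
    if not predictions:
        return 0.0
    return sum(p is not None for p in predictions) / len(predictions)


def f_measure(labels, predictions, target="positive"):
    """F-measure for one class, ignoring messages the method left unlabeled."""
    pairs = [(y, p) for y, p in zip(labels, predictions) if p is not None]
    tp = sum(1 for y, p in pairs if p == target and y == target)
    fp = sum(1 for y, p in pairs if p == target and y != target)
    fn = sum(1 for y, p in pairs if p != target and y == target)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration.
labels = ["positive", "negative", "positive", "negative"]
predictions = ["positive", None, "negative", "negative"]
print(coverage(predictions), f_measure(labels, predictions))
```

On this toy data the method answers 3 of 4 messages (coverage 0.75) and reaches an F-measure of 2/3 on the positive class.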

Prediction Performance across datasets

Method           Twitter  MySpace  YouTube  BBC    Digg   Runners World
PANAS-t          0.643    0.958    0.737    0.396  0.476  0.698
Emoticons        0.929    0.952    0.948    0.359  0.939  0.947
SASA             0.750    0.710    0.754    0.346  0.502  0.744
SenticNet        0.757    0.884    0.810    0.251  0.424  0.826
SentiWordNet     0.721    0.837    0.789    0.384  0.456  0.780
SentiStrength    0.843    0.915    0.894    0.532  0.632  0.778
Happiness Index  0.774    0.925    0.821    0.246  0.393  0.832
LIWC             0.690    0.862    0.731    0.377  0.585  0.895

- Strong variations across datasets


Worst performance for datasets containing formal text
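The variation across datasets can be checked with simple arithmetic on the prediction-performance table. The sketch below copies the table values and computes the mean score per dataset and the spread (maximum minus minimum) per method; the helper names are chosen here for illustration.

```python
DATASETS = ["Twitter", "MySpace", "YouTube", "BBC", "Digg", "Runners World"]

# Values copied from the prediction-performance table, one row per method.
SCORES = {
    "PANAS-t":         [0.643, 0.958, 0.737, 0.396, 0.476, 0.698],
    "Emoticons":       [0.929, 0.952, 0.948, 0.359, 0.939, 0.947],
    "SASA":            [0.750, 0.710, 0.754, 0.346, 0.502, 0.744],
    "SenticNet":       [0.757, 0.884, 0.810, 0.251, 0.424, 0.826],
    "SentiWordNet":    [0.721, 0.837, 0.789, 0.384, 0.456, 0.780],
    "SentiStrength":   [0.843, 0.915, 0.894, 0.532, 0.632, 0.778],
    "Happiness Index": [0.774, 0.925, 0.821, 0.246, 0.393, 0.832],
    "LIWC":            [0.690, 0.862, 0.731, 0.377, 0.585, 0.895],
}


def mean_per_dataset(scores):
    """Average score of all methods on each dataset (one table column)."""
    return {name: sum(row[i] for row in scores.values()) / len(scores)
            for i, name in enumerate(DATASETS)}


def spread_per_method(scores):
    """Gap between a method's best and worst dataset (one table row)."""
    return {method: max(row) - min(row) for method, row in scores.items()}


means = mean_per_dataset(SCORES)
for name in sorted(means, key=means.get):
    print(f"{name:14s} {means[name]:.3f}")
```

On these numbers the BBC column has the lowest mean, followed by Digg, while the other four datasets all average above 0.75.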

Polarity Analysis

- Methods tend to detect more positive sentiments
  - Detected only positive sentiments!
  - The share of positive messages classified as positive is usually greater than the share of negative messages classified as negative
- Even disasters were classified predominantly as positive
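The "positive as positive" versus "negative as negative" comparison amounts to a per-class accuracy. A minimal sketch, using invented toy labels and predictions rather than data from the study:

```python
def per_class_accuracy(labels, predictions):
    """Share of each true class that the method labeled correctly."""
    result = {}
    for polarity in ("positive", "negative"):
        predicted = [p for y, p in zip(labels, predictions) if y == polarity]
        result[polarity] = (
            sum(p == polarity for p in predicted) / len(predicted)
            if predicted else None
        )
    return result


# Toy example: one negative message is mistaken for a positive one.
labels = ["positive", "positive", "negative", "negative"]
predictions = ["positive", "positive", "positive", "negative"]
print(per_class_accuracy(labels, predictions))
```

A method biased towards positivity shows up here as a higher value for the positive class than for the negative class.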


Combined Method

- Combines 7 of the 8 methods analyzed
  - Emoticons, SentiStrength, Happiness Index, SenticNet, SentiWordNet, PANAS-t, SASA
  - Removed LIWC (paid method)
- Weights are distributed according to the rank of prediction performance:
  - Higher weight for the method with the highest F-measure
  - Emoticons received weight 7 and PANAS-t weight 1
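One possible reading of the rank-based weighting is a weighted vote, sketched below. Only the two endpoints are taken from the slide (Emoticons with weight 7, PANAS-t with weight 1); the order of the five methods in between is a placeholder, and the voting and tie rules are assumptions rather than the authors' exact procedure.

```python
# Ranking from best to worst prediction performance. Only the first and last
# positions are stated on the slide; the middle order is a placeholder.
RANKING = ["Emoticons", "SentiStrength", "Happiness Index", "SenticNet",
           "SentiWordNet", "SASA", "PANAS-t"]


def rank_weights(ranking):
    """Give the best-ranked method weight n and the worst-ranked weight 1."""
    n = len(ranking)
    return {method: n - position for position, method in enumerate(ranking)}


def combined_polarity(votes, weights):
    """Weighted vote over the methods that returned a polarity."""
    totals = {"positive": 0, "negative": 0}
    for method, polarity in votes.items():
        if polarity in totals:  # skip None and other non-polar outputs
            totals[polarity] += weights[method]
    if totals["positive"] == totals["negative"]:
        return None  # nobody voted, or a tie (assumed behaviour)
    return max(totals, key=totals.get)


weights = rank_weights(RANKING)
votes = {"Emoticons": None, "SentiStrength": "positive",
         "SASA": "negative", "PANAS-t": "negative"}
print(combined_polarity(votes, weights))
```

Because the vote produces an answer whenever at least one method does (ties aside), a combination of this kind covers every message that any single method covers.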

Combined Method

- Best coverage and second in prediction performance
- 4 methods combined are sufficient






iFeel (Beta version)

www.ifeel.dcc.ufmg.br

- Example for: "Feeling too happy today :)"
- Deploys all methods except LIWC
- Allows evaluating an entire file
- Allows changing the parameters of the methods


Conclusions

- We compare 8 popular sentiment analysis methods for detecting polarity
  - No method had the best results in all analyses
  - Prediction performance varies largely with the dataset
  - Most methods are biased towards positivity
- We propose a combined method
  - Achieves high coverage and high prediction performance
- iFeel: methods deployed and easily available
- Future work: compare other methods such as POMS and EMOLEX



Questions?


www.dcc.ufmg.br/~fabricio

www.ifeel.dcc.ufmg.br

fabricio@dcc.ufmg.br

Thank you!