lect2-sentiwordnet

scarfpocket, Artificial Intelligence and Robotics

24 Oct 2013


Sentiment Analysis

An Overview of Concepts and
Selected Techniques

Terms


Sentiment


A thought, view, or attitude, especially one
based mainly on emotion instead of reason


Sentiment Analysis



opinion mining


use of natural language processing (NLP) and
computational techniques to automate the
extraction or classification of sentiment from
typically unstructured text

Motivation



Consumer information


Product reviews


Marketing


Consumer attitudes


Trends


Politics


Politicians want to know voters’ views


Voters want to know politicians’ stances and who else
supports them


Social


Find like-minded individuals or communities



Webpage

Problem


How to interpret features for sentiment
detection?


Bag of words (IR)


Annotated lexicons (WordNet, SentiWordNet)


Syntactic patterns



Which features to use?


Words (unigrams)


Phrases/n-grams


Sentences
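
As a rough illustration of the word and n-gram features listed above, the following sketch builds a bag of unigrams and a bag of bigrams from one sentence. The tokenizer, the function names, and the sample review are assumptions made for this example and do not come from the slides.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,!?;:\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def ngrams(tokens, n):
    """Return all n-grams (as tuples) of a token list, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(text, n=1):
    """Count n-gram occurrences; word order beyond the n-gram is discarded."""
    return Counter(ngrams(tokenize(text), n))


# Hypothetical review sentence, used only for illustration.
review = "The room was not good, not good at all."
unigrams = bag_of_ngrams(review, n=1)
bigrams = bag_of_ngrams(review, n=2)

print(unigrams[("good",)])       # 2
print(bigrams[("not", "good")])  # 2
```

Note that the unigram bag sees "good" twice and loses the negation, while the bigram bag keeps "not good" together; this is one reason to consider phrases as well as single words.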


Challenges



Must consider other features due to…


Subtlety of sentiment expression


irony


Domain/context dependence


words/phrases can mean different things in different
contexts and domains


Approaches


Machine learning


Naïve Bayes


Maximum Entropy Classifier


SVM


Markov Blanket Classifier



Unsupervised methods


Use lexicons


Assume pairwise
independent features
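
To make the independence assumption concrete, here is a minimal Naïve Bayes sketch in plain Python: each word contributes its own log-probability to the class score, independently of the other words. The class name, the add-one smoothing, and the four training sentences are assumptions made for this example, not part of the original slides.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.class_counts = Counter()            # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()

    def fit(self, docs, labels):
        for doc, label in zip(docs, labels):
            self.class_counts[label] += 1
            for word in doc.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_counts.items():
            # Log prior of the class.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in doc.lower().split():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                # Independence assumption: add each word's log-likelihood
                # separately (with add-one smoothing).
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration.
train_docs = [
    "great hotel lovely staff",
    "excellent room great view",
    "terrible service dirty room",
    "awful noisy terrible stay",
]
train_labels = ["pos", "pos", "neg", "neg"]

clf = NaiveBayes()
clf.fit(train_docs, train_labels)
print(clf.predict("great staff"))          # pos
print(clf.predict("dirty terrible room"))  # neg
```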


Three levels of meaning

1. Lexical Semantics

The meanings of individual words

2. Sentential / Compositional / Formal Semantics

How those meanings combine to make meanings for individual sentences

3. Discourse or Pragmatics

How those meanings combine with each other and with other facts about various kinds of context to make meanings for a text or discourse

(+ Dialog or Conversational Semantics)

WordNet [1][2]


The research efforts of the Department of Linguistics
and Psychology at Princeton University toward a better
understanding of the English language and its semantics
resulted in WordNet.



WordNet is available as a database, searchable via a
web interface or via a variety of software APIs,
providing a comprehensive database of over 150,000
unique terms organised into more than 117,000
different meanings (WORDNET, 2006).



WordNet also grew with extensions of its structure
applied to a number of other languages (WORDNET,
2009).

WordNet


A hierarchically organized lexical database


Online thesaurus + aspects of a dictionary


Versions for other languages are under
development

Category      Unique Forms

Noun          117,097

Verb          11,488

Adjective     22,141

Adverb        4,601

How is “sense” defined in WordNet?


The set of near-synonyms for a WordNet sense is
called a synset (synonym set); it’s their version
of a sense or a concept



Example: chump as a noun to mean ‘a person
who is gullible and easy to take advantage of’




Each of the words in the synset shares this same gloss



Thus, for WordNet, the meaning of this sense of chump
is the synset itself, i.e. its list of near-synonyms.
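
The idea that a sense is a set of near-synonyms sharing one gloss can be sketched as a small data structure. This is not the WordNet API: the `Synset` class and `senses_of` helper are hypothetical, and the lemma list is recalled from WordNet's entry for this sense of chump rather than taken from the slides, so treat it as illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """A WordNet-style sense: near-synonyms that share a single gloss."""
    pos: str
    lemmas: tuple
    gloss: str


# Lemma list recalled from WordNet (assumption, not from the slides).
CHUMP = Synset(
    pos="noun",
    lemmas=("chump", "fool", "gull", "mark", "patsy",
            "fall guy", "sucker", "soft touch", "mug"),
    gloss="a person who is gullible and easy to take advantage of",
)


def senses_of(word, synsets):
    """Return every synset that lists the word among its lemmas."""
    return [s for s in synsets if word in s.lemmas]


toy_wordnet = [CHUMP]

# Looking up any member of the synset returns the same shared gloss.
for word in ("chump", "patsy"):
    print(word, "->", senses_of(word, toy_wordnet)[0].gloss)
```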

SentiWordNet [3]


Based on WordNet “synsets”


http://wordnet.princeton.edu/



SentiWordNet is a lexical resource for sentiment analysis made
up of synsets from WordNet, a thesaurus-like resource; each synset
is allocated a positive, a negative, and an objective sentiment
score.



These scores are automatically generated using a semi-supervised method



Each synset in the WordNet database is assigned scores from 0 to
1 in SentiWordNet, which indicate its polarity.



Strongly opinionated terms are assigned higher scores, whereas
less biased or less subjective terms carry low scores.



The values in the 3 dimensions (positive P, negative N, objective O) sum to 1.


Ex: P=0.75, N=0, O=0.25
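
Because the three scores sum to 1, the objective score follows from the other two as O = 1 - P - N. The sketch below checks this on the example above and shows one simple (assumed) way to collapse a word's senses into a single polarity value by averaging P - N. The function names and the per-sense scores for the placeholder word are invented for illustration and are not real SentiWordNet entries.

```python
def objective_score(pos, neg):
    """Objective score implied by the constraint P + N + O = 1."""
    if not (0.0 <= pos <= 1.0 and 0.0 <= neg <= 1.0) or pos + neg > 1.0:
        raise ValueError("P and N must lie in [0, 1] and sum to at most 1")
    return 1.0 - pos - neg


def word_polarity(sense_scores):
    """Average P - N over a word's senses (one possible aggregation)."""
    return sum(p - n for p, n in sense_scores) / len(sense_scores)


# The example from the slide: P=0.75, N=0 gives O=0.25.
print(objective_score(0.75, 0.0))  # 0.25

# Hypothetical (P, N) pairs for two senses of one word.
toy_senses = [(0.75, 0.0), (0.5, 0.25)]
print(word_polarity(toy_senses))   # 0.5
```

Averaging over senses ignores which sense is actually used in context; a sense-disambiguation step would be needed to do better.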



Demo

Explore the sentiment lexicons discussed
here:


http://sentiment.christopherpotts.net/lexicon/




Our Demo:


http://www.tripadvisor.com/Hotel_Review-g187147-d290407-Reviews-Paris_France_Hotel-Paris_Ile_de_France.html



Tutorial page:
http://sentiment.christopherpotts.net/lexicons.html#building

1. Polarity classification or semantic orientation
determination of sentiment-expressing phrases

a positive sentiment, a negative sentiment

2. Intensity or strength determination of
sentiment-expressing phrases

the word excellent is a strong positive word whereas
the word good is a weak positive word


3. Product feature extraction

for example battery life, image quality and resolution
in a camera domain, and seating comfort, maximum
speed, wheels and steering in a car domain.


4. Opinion and sentiment-expressing phrase extraction

for example extremely comfortable, not smooth, quite
heavy, good and bad
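
Tasks 1 and 2 above (polarity and intensity of a phrase) can be sketched with a small hand-made lexicon. All scores, the intensifier weights, and the negation rule here are assumptions chosen for illustration; they are not SentiWordNet values.

```python
# Toy prior polarities: sign = polarity, magnitude = intensity (assumed values).
LEXICON = {
    "excellent": 1.0,   # strong positive
    "good": 0.5,        # weak positive
    "comfortable": 0.5,
    "smooth": 0.5,
    "bad": -0.5,
    "heavy": -0.5,
}
INTENSIFIERS = {"extremely": 2.0, "quite": 1.5}
NEGATORS = {"not"}


def phrase_score(phrase):
    """Sum lexicon scores, letting modifiers scale the next sentiment word."""
    score = 0.0
    multiplier = 1.0
    for token in phrase.lower().split():
        if token in NEGATORS:
            multiplier *= -1.0                 # flip polarity
        elif token in INTENSIFIERS:
            multiplier *= INTENSIFIERS[token]  # strengthen
        elif token in LEXICON:
            score += multiplier * LEXICON[token]
            multiplier = 1.0                   # modifiers apply once
    return score


def polarity(phrase):
    """Map the numeric score to a polarity class."""
    score = phrase_score(phrase)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for phrase in ("extremely comfortable", "not smooth", "quite heavy"):
    print(phrase, phrase_score(phrase), polarity(phrase))
```

In this sketch a modifier stays pending until the next lexicon word, even across unknown tokens; a real system would restrict its scope, for example using syntactic patterns.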

References

1. http://www.answers.com/sentiment, 9/22/08

2. B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” in Proc. Conf. on Empirical Methods in Natural Language Processing (EMNLP), pp. 79-86, 2002.

3. A. Esuli and F. Sebastiani, “SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining,” in Proc. of LREC 2006, 5th Conf. on Language Resources and Evaluation, 2006.

4. E. Zhang and Y. Zhang, “UCSC on TREC 2006 Blog Opinion Mining,” TREC 2006 Blog Track, Opinion Retrieval Task.

5. A. Devitt and K. Ahmad, “Sentiment Polarity Identification in Financial News: A Cohesion-based Approach,” ACL 2007.

6. B. Pang and L. Lee, “A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts,” in Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, pp. 271-278, July 21-26, 2004.