An overview of concepts of Sentiment Analysis

lravikumarvsp, 15 slides, Oct 08, 2024


Sentiment Analysis
An Overview of Concepts and
Selected Techniques

Terms
Sentiment
A thought, view, or attitude, especially one
based mainly on emotion instead of reason
Sentiment Analysis
 aka opinion mining
use of natural language processing (NLP) and
computational techniques to automate the
extraction or classification of sentiment from
typically unstructured text

Motivation
Consumer information
Product reviews
Marketing
Consumer attitudes
Trends
Politics
Politicians want to know voters’ views
Voters want to know politicians’ stances and who else
supports them
Social
Find like-minded individuals or communities

Problem
Which features to use?
Words (unigrams)
Phrases/n-grams
Sentences
How to interpret features for sentiment
detection?
Bag of words (IR)
Annotated lexicons (WordNet, SentiWordNet)
Syntactic patterns
Paragraph structure
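
As a rough illustration of the feature choices above, the sketch below extracts unigram and bigram features and counts them as a bag of words. The tokenizer, the function names, and the sample review are assumptions made for this example, not part of any cited system.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_words(text, n=1):
    """Count n-gram occurrences, ignoring where they appear in the text."""
    return Counter(ngrams(tokenize(text), n))

# Hypothetical review text, used only to show the two feature types.
review = "Not a good movie, not good at all"
unigrams = bag_of_words(review, n=1)
bigrams = bag_of_words(review, n=2)
```

In this sample the unigram "good" looks positive on its own, while the bigram ("not", "good") keeps the negation. This is one reason phrases are considered as features alongside single words.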

Challenges
Harder than topical classification, for
which bag-of-words features perform well
Must consider other features due to…
Subtlety of sentiment expression
irony
expression of sentiment using neutral words
Domain/context dependence
words/phrases can mean different things in different
contexts and domains
Effect of syntax on semantics

Approaches
Machine learning
Naïve Bayes
Maximum Entropy Classifier
SVM
Markov Blanket Classifier
Accounts for conditional feature dependencies
Allowed reduction of discriminating features from
thousands of words to about 20 (movie review
domain)
Unsupervised methods
Use lexicons
Assume pairwise
independent features
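
To make the machine learning approach concrete, here is a minimal sketch of a multinomial Naïve Bayes polarity classifier over unigram counts with add-one smoothing. The class name, the four training snippets, and the labels are invented for illustration. The independence assumption shows up as a plain sum of per-word log probabilities.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes over unigram counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for w in tokens:
                if w in self.vocab:  # skip words never seen in training
                    # Words are treated as independent given the class.
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data (hypothetical, far too small for real use).
train_docs = [
    "a great and moving film".split(),
    "wonderful acting great story".split(),
    "a dull and boring film".split(),
    "terrible acting boring story".split(),
]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)
```

Calling `model.predict("great story".split())` scores both classes and returns the label with the higher log probability.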

LingPipe Polarity Classifier
First eliminate objective sentences, then
use remaining sentences to classify
document polarity (reduce noise)

LingPipe Polarity Classifier
Uses unigram features extracted from
movie review data
Assumes that adjacent sentences are
likely to have similar subjective-objective
(SO) polarity
Uses a min-cut algorithm to efficiently
extract subjective sentences
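
The min-cut idea can be sketched as follows, in the spirit of Pang and Lee (2004) but not as LingPipe's actual code. Each sentence is linked to a "subjective" source and an "objective" sink with weights from a per-sentence subjectivity estimate, and adjacent sentences are tied together by an association weight. The probabilities, the association weight, and all function names below are assumptions for illustration.

```python
from collections import defaultdict, deque

def min_cut_source_side(capacity, source, sink):
    """Edmonds-Karp max flow; returns the nodes on the source side of a min cut."""
    residual = defaultdict(lambda: defaultdict(float))
    for u, edges in capacity.items():
        for v, c in edges.items():
            residual[u][v] += c
            residual[v][u] += 0.0  # make sure the reverse edge exists

    def bfs_parents():
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        return parents

    while True:
        parents = bfs_parents()
        if sink not in parents:
            return set(parents)  # nodes still reachable from the source
        # Walk back from the sink to collect the augmenting path.
        path, v = [], sink
        while parents[v] is not None:
            path.append((parents[v], v))
            v = parents[v]
        flow = min(residual[a][b] for a, b in path)
        for a, b in path:
            residual[a][b] -= flow
            residual[b][a] += flow

def subjective_sentences(subj_prob, assoc):
    """Return indices of sentences kept as subjective.

    subj_prob[i] is an assumed estimate that sentence i is subjective;
    assoc is the penalty for giving adjacent sentences different labels.
    """
    capacity = defaultdict(dict)
    for i, p in enumerate(subj_prob):
        capacity["s"][i] = p        # cut if sentence i is labeled objective
        capacity[i]["t"] = 1.0 - p  # cut if sentence i is labeled subjective
    for i in range(len(subj_prob) - 1):
        capacity[i][i + 1] = assoc
        capacity[i + 1][i] = assoc
    side = min_cut_source_side(capacity, "s", "t")
    return sorted(i for i in side if i != "s")

probs = [0.9, 0.45, 0.8, 0.1]  # hypothetical per-sentence estimates
independent = subjective_sentences(probs, assoc=0.0)
smoothed = subjective_sentences(probs, assoc=0.2)
```

With no association weight, each sentence is labeled on its own score. With the weight set to 0.2, the borderline second sentence is pulled toward its subjective neighbors, which is the adjacency assumption at work.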

LingPipe Polarity Classifier
Graph for classifying three items.

LingPipe Polarity Classifier
Accurate as baseline but uses only 22% of
content in test data (average)
Metrics suggest properties of movie
review structure

SentiWordNet
Based on WordNet “synsets”
http://wordnet.princeton.edu/
Ternary classifier
Positive, negative, and neutral scores for each
synset
Provides means of gauging sentiment for
a text

SentiWordNet: Construction
Created training sets of synsets, L_p and L_n
Start with small number of synsets with fundamentally
positive or negative semantics, e.g., “nice” and “nasty”
Use WordNet relations, e.g., direct antonymy, similarity,
derived-from, to expand L_p and L_n over K iterations
L_o (objective) is set of synsets not in L_p or L_n
Trained classifiers on training set
Rocchio and SVM
Use four values of K to create eight classifiers with
different precision/recall characteristics
As K increases, precision (P) decreases and recall (R) increases
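
The seed expansion step can be sketched roughly as below. Similarity relations add a term to the same set, and antonymy adds it to the opposite set. The relation tables are toy stand-ins for WordNet and use plain words rather than synsets; every name and entry is hypothetical.

```python
def expand_seeds(pos_seeds, neg_seeds, similar, antonyms, k):
    """Grow the positive and negative sets for k iterations over lexical relations."""
    pos, neg = set(pos_seeds), set(neg_seeds)
    for _ in range(k):
        # Similar terms keep the polarity; antonyms flip it.
        new_pos = {t for s in pos for t in similar.get(s, ())}
        new_pos |= {t for s in neg for t in antonyms.get(s, ())}
        new_neg = {t for s in neg for t in similar.get(s, ())}
        new_neg |= {t for s in pos for t in antonyms.get(s, ())}
        pos |= new_pos
        neg |= new_neg
    return pos, neg

# Toy relation tables standing in for WordNet (hypothetical entries).
similar = {"nice": ["pleasant"], "pleasant": ["agreeable"], "nasty": ["awful"]}
antonyms = {"nice": ["nasty"], "pleasant": ["unpleasant"], "nasty": ["nice"]}
all_terms = {"nice", "pleasant", "agreeable", "nasty", "awful", "unpleasant", "table"}

pos1, neg1 = expand_seeds({"nice"}, {"nasty"}, similar, antonyms, k=1)
pos2, neg2 = expand_seeds({"nice"}, {"nasty"}, similar, antonyms, k=2)
objective2 = all_terms - pos2 - neg2  # everything not reached is treated as objective
```

Larger values of k reach more terms, each through a longer chain of relations from the seeds. That is consistent with the precision and recall trade-off noted above.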

SentiWordNet: Results
24.6% synsets with Objective<1.0
Many terms are classified with some degree of
subjectivity
10.45% with Objective<=0.5
0.56% with Objective<=0.125
Only a few terms are classified as definitively
subjective
Difficult (if not impossible) to accurately
assess performance

SentiWordNet: How to use it
Use score to select features (+/-)
e.g. Zhang and Zhang (2006) used words in
corpus with subjectivity score of 0.5 or greater
Combine pos/neg/objective scores to
calculate document-level score
e.g. Devitt and Ahmad (2007) combined
polarity scores with a WordNet-based graph
representation of documents to create
predictive metrics
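
Both uses can be sketched with a toy lexicon. The scores below are made up, not taken from SentiWordNet, and words stand in for synsets. Treating subjectivity as positive plus negative score, and averaging positive minus negative over matched words, are simplifying assumptions rather than the methods of the cited papers.

```python
# Hypothetical (positive, negative) scores; objective = 1 - positive - negative.
TOY_LEXICON = {
    "excellent": (0.875, 0.0),
    "good": (0.625, 0.0),
    "slow": (0.0, 0.25),
    "awful": (0.0, 0.875),
    "plot": (0.0, 0.0),
}

def objective_score(word):
    """Objective score implied by the positive and negative scores."""
    pos, neg = TOY_LEXICON[word]
    return 1.0 - pos - neg

def subjective_features(tokens, threshold=0.5):
    """Keep tokens whose subjectivity (positive + negative) meets the threshold."""
    return [w for w in tokens
            if w in TOY_LEXICON and sum(TOY_LEXICON[w]) >= threshold]

def document_score(tokens):
    """Average (positive - negative) over tokens found in the lexicon."""
    scored = [TOY_LEXICON[w] for w in tokens if w in TOY_LEXICON]
    if not scored:
        return 0.0
    return sum(p - n for p, n in scored) / len(scored)

tokens = "the plot was slow but the acting was excellent".split()
features = subjective_features(tokens)
score = document_score(tokens)
```

A positive `score` suggests an overall positive document and a negative one the reverse. Words missing from the lexicon are simply ignored in this sketch.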

References
1. http://www.answers.com/sentiment, 9/22/08
2. B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment
classification using machine learning techniques,” in Proc. Conf.
on Empirical Methods in Natural Language Processing (EMNLP),
pp. 79–86, 2002.
3. A. Esuli and F. Sebastiani, “SentiWordNet: A Publicly Available Lexical
Resource for Opinion Mining,” in Proc. of LREC 2006 - 5th Conf.
on Language Resources and Evaluation, 2006.
4. E. Zhang and Y. Zhang, “UCSC on TREC 2006 Blog Opinion Mining,”
TREC 2006 Blog Track, Opinion Retrieval Task.
5. A. Devitt and K. Ahmad, “Sentiment Polarity Identification in Financial
News: A Cohesion-based Approach,” ACL 2007.
6. B. Pang and L. Lee, “A sentimental education: Sentiment
analysis using subjectivity summarization based on minimum
cuts,” in Proceedings of the 42nd Annual Meeting on Association for
Computational Linguistics, p. 271-es, July 21-26, 2004.