Micro-blogging has become a very popular communication tool among Internet users. Real-time web services such as Twitter allow users to express their opinions and interests in the form of short text messages. Many companies are looking into using these data streams to improve their marketing campaigns, refine advertising and better meet customer needs. In this study, we focus on using Twitter to extract product reputation trends. A business could thus gauge the effectiveness of a recent marketing campaign by aggregating user opinions on Twitter regarding its product. In this paper, we introduce an approach for automatically classifying the sentiment of Twitter messages toward a product or brand, using emoticons and improved pre-processing steps in order to achieve high accuracy.
Micro-blogging has become a very popular communication tool among Internet users. Millions of messages appear daily on popular websites that provide micro-blogging services, such as Twitter, Tumblr and Facebook. Authors of those messages write about their lives, share opinions on a variety of topics and discuss current issues. Such data can be efficiently used for marketing or social studies [
Through these opinions, we can extract information about a product that we are interested in and quantify its reputation. Knowing the reputation is very important for market analysts, because they can enhance the public’s view of a product by analyzing the extracted reputation. In the past, market analysts conducted manual surveys to find the reputation of a product. However, manual surveys are not only costly but also labor-intensive.
The purpose of our study is to extract opinions from micro-blogs automatically and to summarize the extracted opinions to provide the reputation of a product in which we are interested. In most previous research, text polarity was extracted based on the assumption that most sentiment messages contain positive or negative words such as “good” and “bad”. However, the structure of Twitter messages is unique: messages can be no longer than 140 characters, which pushes users away from long sentences and toward emoticons, abbreviations, acronyms and other forms of informal language. Most social network users use informal language to shorten their messages, as it takes less time to type. Considering that, in the present research we assume that using emoticons, emotion identifiers, acronyms and similar items as sentiment classification features will help us achieve high accuracy in the sentiment extraction task.
In this paper, we propose a method to extract sentiment automatically from tweets, which are Twitter users’ status messages. Many companies want to analyze their customer satisfaction, so we apply our method to a two-way classification of tweets into “negative” and “others” (positive and objective tweets). We assume that “negative” tweets can be more informative, so that a merchandising department can use them to gather critical feedback about problems in newly released products.
Multiple papers have been published on sentiment analysis. Many of them have also explored using Twitter as their primary source of data.
Earlier work on sentiment analysis uses traditional text classification methods on conventional text forms such as movie reviews. In [
In our work, we pay particular attention to pre-processing, the most important step before training the classifier. Emoticons, which can give us a lot of information about text sentiment, are usually ignored or stripped as noisy labels. We believe that by using emoticons in text sentiment classification we can improve the accuracy of our classifier.
Our approach is to use a Naïve Bayes machine learning classifier for sentiment classification. First, we present how we collect data for the training and test sets. Then, we propose an effective and efficient way of pre-processing tweets. Finally, we present the results of the experiment.
In this work for tweets collection, Twitter API [
In this study, to collect data for each class (“negative” and “others”, where the “others” class consists of “positive” + “neutral” tweets), positive (“:)”) and negative (“:(”) emoticons were used. As for neutral/objective tweets, spam or commercial tweets about a product or service were considered objective. We also assume that most positive tweets toward a product or service contain positive expression words, such as “good”, “great” and “amazing”, whereas words such as “bad” and “awful” describe negative feelings. Thus, we extended our training set with tweets that contain feeling-descriptive words [
Twitter users are much more likely to have grammatical/spelling errors, colloquialisms, and slang incorporated into their output, due to the 140 character limit that is imposed on users. As a result, regular expression matching of common errors and substituting with standard language is necessary.
In this study we introduce new resources for pre-processing Twitter data:
1) We replaced all emoticons with their sentiment polarity by looking them up in the emoticon dictionary [
2) Non-informative Twitter usernames, URL links and hashtags were stripped from the tweets.
3) We built an acronym dictionary to replace acronyms such as OMG (“Oh My God”), LOL (“Laughing Out Loud”) and ILU (“I Love You”) with their expanded forms.
4) Stop words list [
5) Emotion identifiers such as wow, awww, xxx (“many kisses”) or kkkkk (giggling), and laughter such as hahaha, hehehe, jajaja and ahahaha, were also replaced with their sentiment polarity.
6) All tweets were lowercased.
7) All digits and unnecessary punctuation were removed.
8) Repeated letters, as in yeeeees, yahooooo and looooove, were also removed.
9) We ignored all non-ASCII characters.
10) All duplicate tweets and retweets were removed.
11) Names of all businesses/companies were removed according to the top brands on Twitter [
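Several of the steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the dictionaries `EMOTICONS`, `ACRONYMS` and `STOP_WORDS` are tiny hypothetical stand-ins for the real resources, the `[happy]`/`[sad]` tag names are assumed, and steps 5, 10 and 11 (emotion identifiers, duplicate removal, brand names) are omitted.

```python
import re

# Hypothetical miniature resources; the dictionaries used in the study are larger.
EMOTICONS = {":)": "[happy]", ":-)": "[happy]", ":D": "[happy]",
             ":(": "[sad]", ":-(": "[sad]", ":'(": "[sad]"}
ACRONYMS = {"omg": "oh my god", "lol": "laughing out loud", "ilu": "i love you"}
STOP_WORDS = {"a", "an", "the", "i", "its", "but", "is", "to", "of"}


def preprocess(tweet: str) -> str:
    # Step 1: replace emoticons with a polarity tag (longest icons first,
    # and before punctuation is stripped).
    for icon in sorted(EMOTICONS, key=len, reverse=True):
        tweet = tweet.replace(icon, f" {EMOTICONS[icon]} ")
    # Step 2: strip URLs, usernames and hashtags.
    tweet = re.sub(r"https?://\S+|www\.\S+|[@#]\w+", " ", tweet)
    # Step 9: drop non-ASCII characters.
    tweet = tweet.encode("ascii", "ignore").decode("ascii")
    # Step 6: lowercase.
    tweet = tweet.lower()
    # Step 7: remove digits and punctuation, keeping the brackets of the tags.
    tweet = re.sub(r"[^a-z\[\]\s]", " ", tweet)
    # Step 8: collapse runs of three or more identical letters ("looooove" -> "love").
    tweet = re.sub(r"([a-z])\1{2,}", r"\1", tweet)
    # Step 3: expand acronyms, then re-split because expansions contain spaces.
    tokens = " ".join(ACRONYMS.get(t, t) for t in tweet.split()).split()
    # Step 4: remove stop words.
    return " ".join(t for t in tokens if t not in STOP_WORDS)


print(preprocess("@bob OMG I looooove the new design :) http://t.co/x #win"))
```

The order matters in this sketch: emoticons are tagged first because the later punctuation filter would otherwise destroy them.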
The most important step in this research is the selection of classifier for the text classification task. According to the paper [
The Naïve Bayes method for classification is often used in text classification due to its speed and simplicity. It makes the assumption that words are generated independently of word position. The Naive Bayesian classifier is a probabilistic model which is used for our purposes to estimate the probability that a tweet belongs to a specific class (positive, negative, or neutral). For a given set of classes, it estimates the probability of a class
The parameters
In this work, the Twitter API was used for tweet collection. The API has a parameter that specifies which language to retrieve tweets in. We always set this parameter to English. Thus, our classification will only work on tweets in English, because our training data is in English only. Throughout the course of this project, about five million tweets were collected automatically to be used as training data.
Table 1. Example of emoticons to be replaced using emoticon dictionary [11].
Icon | Meaning |
---|---|
:-) :) :o) :] :3 :c) :> =] 8) =) :} :^) :っ) | Smiley or happy face |
:-D :D 8-D 8D x-D xD X-D XD =-D =D =-3 =3 B^D | Laughing, big grin, laugh with glasses |
:-)) | Very happy or double chin |
>:[ :-( :( :-c :c :-< :っC :< :-[ :[ :{ | Frown, sad |
:-\|\| :@ >:( | Angry |
:'-( :'( | Crying |
In this study, to collect data for each class (“negative” and “others”, where the “others” class consists of positive + neutral/objective tweets), positive “:)” and negative “:(” emoticons were used. There are multiple emoticons that can express positive and negative emotions. In the Twitter API, the query “:)” returns tweets that contain positive emotions and the query “:(” returns tweets with negative emotions. For the neutral training data set, we queried the API with “http//” and “#hashtag”, because according to our own research almost all neutral/spam messages contain a URL link and hashtags.
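The labelling rule implied by these queries can be sketched as a small Python function. This is only an illustration: the icon lists are short hypothetical samples, and discarding tweets with mixed or no signals (returning `None`) is an assumption made here, not something the study states.

```python
from typing import Optional

# Hypothetical sample icons; the study queried the API with ":)" and ":(".
POSITIVE_ICONS = (":)", ":-)", ":D", "=)")
NEGATIVE_ICONS = (":(", ":-(", ":'(")


def noisy_label(tweet: str) -> Optional[str]:
    """Assign a training label from emoticons, URLs and hashtags."""
    has_pos = any(icon in tweet for icon in POSITIVE_ICONS)
    has_neg = any(icon in tweet for icon in NEGATIVE_ICONS)
    if has_neg and not has_pos:
        return "negative"
    if has_pos and not has_neg:
        return "others"  # positive tweets belong to the "others" class
    if not has_pos and not has_neg and ("http://" in tweet or "#" in tweet):
        return "others"  # neutral/objective tweets also belong to "others"
    return None  # assumption: mixed or unlabelled tweets are skipped


print(noisy_label("battery died again :("))
```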
Tweets in our training set are from the time period from October to December, 2012. After the pre-processing step, we take the first 300,000 positive/neutral tweets (neutral tweets with neutral or spam content) and 300,000 tweets with negative content, for a total of 600,000 training tweets. On the basis of the extracted training data, we generate our sentiment classifier. We applied the Naïve Bayes algorithm to the classifier.
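For readers unfamiliar with the technique, the following is a minimal sketch of a multinomial Naïve Bayes classifier over unigram counts. It assumes add-one (Laplace) smoothing and whitespace tokenization, neither of which is specified in the text, and the class name `NaiveBayes` and the four training tweets are purely illustrative.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative sketch)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, doc):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in doc.split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Toy, already pre-processed training tweets (hypothetical).
train = [("love new design [happy]", "others"),
         ("great phone [happy]", "others"),
         ("hate short battery life [sad]", "negative"),
         ("awful screen [sad]", "negative")]
model = NaiveBayes().fit([t for t, _ in train], [l for _, l in train])
print(model.predict("hate battery [sad]"))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, which matters little for 140-character tweets but is the conventional choice.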
The challenging part of this research is that users sometimes express mixed sentiments in tweets toward a product or service. For example, “Love iphone’s new design, but hate its short battery life :(”.
The Naïve Bayes classifier is useful for such cases, since it estimates the probability of occurrence of each word in a tweet. Thus, so as not to distort the initial meaning of a tweet, we do not remove slang and other informal language forms as in previous research. For instance, the above-mentioned tweet looks as follows after all necessary pre-processing steps: “love new design hate short battery life [sad]”.
The test data was also collected automatically using the Twitter Search API. The entire test set was manually marked as “others” or “negative”. Not all the test data has emoticons. We used the following process to collect test data.
We searched the Twitter API with specific queries. These queries are arbitrarily chosen from different domains. For example, these queries consist of consumer products, services, and people. The query terms we used are listed in
We looked at the result set for a query. If we saw a result that contains a sentiment, we marked it as “others” (positive/neutral) or “negative”. Thus, this test set is selected independently of the presence of emoticons.
Our experiment was conducted by gathering a large number of tweets using the Twitter Stream API (from October to December, 2012), to be used as training and testing data. For the training set, data were collected by querying the Twitter API for two types of emoticons:
Smiley emoticon
Frowny/Sad emoticon
Also, emoticon corpus from the work [
Table 2. Query terms for the test data.
Product/Service | Number of tweets |
---|---|
Air Asia | 196 (negative: 47, others: 149) |
Windows 8 | 168 (negative: 26, others: 142) |
PSY | 123 (negative: 12, others: 111) |
Galaxy S III | 168 (negative: 18, others: 150) |
iPhone 5 | 210 (negative: 27, others: 183) |
WiiU | 146 (negative: 17, others: 129) |
For the neutral dataset, objective tweets with no sentiment or tweets with spam content were considered neutral. The collected dataset was used to extract features, which were used to train our sentiment classifier. The product reputation was estimated by analysing the classifier’s output for a given product name. For test data, tweets mentioning a service, mobile phones, a video game console, an OS and popular music were used (Table 2). As in the paper [
To improve our classifier’s results, we decided to build and use our own dictionary of negation phrases with their sentiment meaning. A further step was therefore added to the pre-processing: a dictionary lookup that replaces a negation word such as “not” together with the adjective that follows it by a word of the resulting polarity. For example, “not bad” becomes “good” and “not annoyed” becomes “pleased”.
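A negation dictionary of this kind can be applied with a single regular-expression pass, as in the sketch below. Only the “not bad” and “not annoyed” entries come from the text; the third entry, the names `NEGATION_PHRASES` and `apply_negations`, and the assumption that the input is already lowercased are illustrative.

```python
import re

# "not bad" and "not annoyed" are taken from the text; "not good" is a hypothetical entry.
NEGATION_PHRASES = {"not bad": "good", "not annoyed": "pleased", "not good": "bad"}

# One alternation over all phrases, anchored on word boundaries.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in NEGATION_PHRASES) + r")\b"
)


def apply_negations(text: str) -> str:
    """Replace each known negation phrase with its polarity-equivalent word."""
    return _PATTERN.sub(lambda m: NEGATION_PHRASES[m.group(1)], text)


print(apply_negations("battery is not bad at all"))
```

If “not” is in the stop word list, this lookup has to run before stop word removal; otherwise the phrases can no longer be matched.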
Table 3. Classifier accuracy and F-score for the two-way classification task.
Product/Service | Accuracy | F1 measure, others (positive and objective) | F1 measure, negative |
---|---|---|---|
Air Asia | 81.3% | 87.5% | 62.3% |
Windows 8 | 80.5% | 88.2% | 42.1% |
PSY | 85.0% | 91.0% | 53.7% |
Galaxy S III | 71.8% | 81.6% | 40.0% |
iPhone 5 | 81.0% | 88.7% | 40.0% |
WiiU | 84.0% | 90.7% | 40.0% |
Table 4. The results of using the negation dictionary.
Product/Service | Accuracy | F1 measure, others (positive and objective) | F1 measure, negative |
---|---|---|---|
Air Asia | 82.1% | 88.0% | 64.8% |
Windows 8 | 83.3% | 89.8% | 53.3% |
PSY | 87.0% | 92.3% | 58.0% |
Galaxy S III | 72.6% | 82.2% | 41.0% |
iPhone 5 | 72.8% | 82.2% | 42.4% |
WiiU | 87.7% | 92.8% | 57.1% |
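The accuracy and per-class F1 measure reported in the tables follow the standard definitions, which can be sketched as below. The function names and the five-tweet example are illustrative, not data from the study. Because the “negative” class is small in every test set (for example, 26 of 168 tweets for Windows 8), accuracy can be high while the F1 measure for “negative” stays low.

```python
def accuracy(gold, predicted):
    """Fraction of tweets whose predicted label equals the manual label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def f1_for_class(gold, predicted, target):
    """Harmonic mean of precision and recall for one class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == target and p == target for g, p in pairs)
    fp = sum(g != target and p == target for g, p in pairs)
    fn = sum(g == target and p != target for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five test tweets.
gold = ["negative", "negative", "others", "others", "others"]
pred = ["negative", "others", "others", "others", "negative"]
print(accuracy(gold, pred), f1_for_class(gold, pred, "negative"))
```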
F1-score (“negative”) for unigrams and bigrams classification features
F1-score (“others”) for unigrams and bigrams classification features
F1-score results (“negative”) using three methods
Sentiment classification toward a product is challenging. Consider a tweet mentioning the iPhone 5: “I have to admit I’m a little jealous of robbies iphone 5 :-(”. In general, it is a negative tweet, but from the point of view of Apple Inc. it is a positive tweet, which indicates that their product is in high demand.
Micro-blogging has become one of the major types of communication. Recent research has identified it as online word-of-mouth branding. The large amount of information contained in micro-blogging websites makes them an attractive source of data for opinion mining and sentiment analysis.
This study investigates how product reputation can be automatically extracted from the popular Twitter micro-blogging service. We have proposed an approach based on opinion sentiment classification. We used the collected corpus to train our sentiment classifier. Our classifier should be able to determine positive, negative and neutral sentiment from tweets and estimate the reputation of a given product for a certain period of time.
F1-score results (“others”) using three methods
As future work, we plan to detect fake Twitter accounts during data collection, to prevent fake product/service reputation, and to improve our approach to obtain higher accuracy in reputation estimation.