A Hierarchical Emotion Classification Technique for Thai Reviews

Abstract

Emotion classification is an interesting problem in affective computing that can be applied in various tasks, such as speech synthesis, image processing and text processing. The amount of textual data on the Internet is increasing, especially customer reviews that express opinions and emotions about products. These reviews are important feedback for companies. Emotion classification aims to identify an emotion label for each review. This research investigated three approaches for emotion classification of opinions in the Thai language, written in unstructured format, free form or informal style. Different sets of features were studied in detail and analyzed. The experimental results showed that a hierarchical approach, where the subjectivity of the review is determined first, then the polarity of opinion is identified and finally the emotional label is calculated, yielded the highest performance, with precision, recall and F-measure at 0.691, 0.743 and 0.709, respectively.

Keywords

1 Introduction

With the growth of Internet communication, emotion classification on text-based resources, e.g. blogs, online newspapers and social networks, is becoming a more interesting and challenging task. Emotion classification can be applied in research on subjects such as speech synthesis [1], image processing [2-4] and especially text processing [5-8]. Many people regularly buy products from websites or mobile applications. These websites or applications want to have product feedback after customers have used their products. Customers' reviews are usually written in unstructured format, free form or informal style. The content of a review usually expresses opinions and emotions about the quality and quantity of the product that the reviewer ordered and used. These emotions can affect the profit and image of a company. If reviewers express positive opinions or positive emotions toward a product on a website, there is a tendency to attract more customers to it. Due to the large number of reviews, it is difficult for readers to identify emotions manually. An automatic emotion classification method is needed to solve this problem.

Received October 25th, 2018, Revised December 26th, 2018, Accepted for publication December 26th, 2018. Copyright © 2018 Published by ITB Journal Publisher, ISSN: 2337-5787, DOI: 10.5614/itbj.ict.res.appl.2018.12.3.6

Most emotion classification methods are proposed for English and Western languages. However, these cannot be directly applied to the Thai language. Thai texts are written in continuous form, without spaces, punctuation marks, full stops ('.'), commas (',') or semicolons (';') to identify boundaries of words and sentences [9]. Haruechaiyasak, et al. [10] proposed the S-Sense framework for three Thai social media sources to identify intention and sentiment labels from text. They used three lexicon resources, i.e. LEXiTRON (electronic Thai-English dictionary), the Thai Twitter corpus and clue words, and applied the Multinomial Naive Bayes algorithm as the classification model. Chirawichitchai [6] proposed emotion classification on social networks in the Thai language by using a corpus-based approach. This research used Boolean, term frequency and Tf-Idf weights as feature sets and applied the Support Vector Machine, Naïve Bayes, Decision Tree and K-nearest Neighbor algorithms to detect six emotions. The experimental results showed that Boolean weighting with Support Vector Machine yielded the best performance. Lastly, Chumwatana [11] proposed sentiment classification on social media and websites. This method extracts emotional words from text and assigns each word a sentiment score ('+1' for a positive word, '0' for a neutral word and '-1' for a negative word). The sentiment score of an opinion is calculated by summing the word scores together. Previous research on Thai opinion classification has suggested that an effective feature set can be constructed from corpus-based and lexicon approaches. This research proposes a hierarchical framework to identify emotions (i.e. objective, anger, disgust, fear, sadness, happiness and surprise) for actual customer reviews written in the Thai language.

The hierarchical classification framework consists of three levels: the opinion level, the sentiment level and the emotion level. The opinion level separates customers' reviews into two types, i.e. objective and subjective reviews [12]. An objective (neutral) review expresses factual information or no opinion. A subjective review expresses a reviewer's opinion, which can be classified as a positive or a negative opinion. The second level is the sentiment level, which categorizes a subjective opinion as a positive or a negative sentiment [13]. The emotion level then assigns an emotion label to an opinion.

Based on Ekman [14], six basic emotions can be used to describe facial expressions in all human traditions. These emotions are: anger, disgust, fear, sadness, happiness and surprise. Breazeal in 2003 [15] proposed polarity in arousal-valence (A-V) graph space based on Ekman's emotions, where the x-axis represents valence by mapping a scale of pleasant versus unpleasant or positive versus negative sentiment, while the y-axis represents arousal by mapping a scale of being relaxed versus aroused. Positive emotions are happiness and surprise, while negative emotions are anger, disgust, fear, and sadness. The emotion classification used in this research was organized accordingly.

2 Proposed Emotion Classification Techniques

The proposed method consists of three main processes: (1) text preprocessing, (2) feature extraction, and (3) emotion classification. Text preprocessing provides necessary information and normalization of the words that occur in the reviews. Feature extraction generates a set of features by using corpora and lexicons. Then, the emotion classification applies a classification algorithm to the extracted features to identify the emotion labels. Three approaches for emotion classification were studied, as shown in Figure 1.

Figure 1 Three approaches of emotion classification framework.

The first approach uses a classifier directly to identify seven labels of opinion analysis, i.e. objective opinion and Ekman's six human emotions. The second approach is a two-level structure of opinion filtering and emotion classification. Opinion filtering determines whether a review contains a subjective opinion of the reviewer. Emotion classification then classifies the emotion of the opinion identified in the previous step into six emotion labels, i.e. anger, disgust, fear, sadness, happiness and surprise. The last approach contains three levels of opinion filtering, sentiment filtering and positive and negative emotion classification. The difference with the second approach is that after opinion filtering instead of classifying emotion directly, each opinion is separated first into positive and negative opinions and then each positive opinion is classified into a positive emotion and each negative opinion is classified into a negative emotion.
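The routing logic of the third approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `classify_hierarchical` and the four toy classifiers are hypothetical stand-ins for trained models, and the keyword rules are invented purely to make the example runnable.

```python
from typing import Callable, Sequence

Classifier = Callable[[Sequence[str]], str]

def classify_hierarchical(tokens: Sequence[str],
                          opinion_clf: Classifier,
                          sentiment_clf: Classifier,
                          positive_clf: Classifier,
                          negative_clf: Classifier) -> str:
    """Route one preprocessed review through the three levels."""
    # Level 1: opinion filtering (objective vs. subjective).
    if opinion_clf(tokens) == "objective":
        return "objective"
    # Level 2: sentiment filtering (positive vs. negative).
    if sentiment_clf(tokens) == "positive":
        # Level 3-1: happiness or surprise.
        return positive_clf(tokens)
    # Level 3-2: anger, disgust, fear or sadness.
    return negative_clf(tokens)

# Toy stand-ins for trained models (hypothetical rules, illustration only).
def toy_opinion(tokens):
    return "subjective" if {"good", "bad"} & set(tokens) else "objective"

def toy_sentiment(tokens):
    return "positive" if "good" in tokens else "negative"

def toy_positive(tokens):
    return "happiness"

def toy_negative(tokens):
    return "anger"

label = classify_hierarchical(["very", "good"], toy_opinion, toy_sentiment,
                              toy_positive, toy_negative)
```

Because each level only sees the reviews passed on by the previous one, an error at an upper level propagates to the final emotion label.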

2.1 Text Preprocessing

Preprocessing includes word segmentation, part-of-speech (POS) tagging, word replacement and stopword elimination. Examples of the preprocessing steps are shown in Table 3.

2.1.1 Word Segmentation

Unlike English, the Thai language has distinctive syntactical and semantic characteristics. The language has no specific symbols (e.g. '.', '?', ';') to identify the end of a sentence or clause. Furthermore, there is no space between words. Accordingly, identifying the boundaries of each word is a nontrivial problem for Thai. This research specifically focused on customer reviews usually written in unstructured format, free form or informal style. Word segmentation was performed by KuCut [16], which is based on global and local unsupervised learning to segment unknown words.

2.1.2 Part-Of-Speech Tagging

Part-of-speech (POS) is considered an important element at the morphology level to represent the role of token words such as 'verb', 'noun' and 'conjunction'. POS has essential information at the word level for identifying opinion categories. Hence, the Jitar tagging tool [17] was applied to assign a POS label to each word. Jitar is based on a trigram hidden Markov model (HMM) and the Naist corpus [18]. The Naist corpus consists of 60,511,974 words that were collected from Thai magazines and has 49 part-of-speech tag sets in 17 groups.

2.1.3 Word Replacement

Word replacement reduces typographical errors and words with repeated characters. Typographical errors are caused by mistyping; for example, 'แอลกอฮอล์' (alcohol) can be mistyped as 'แอลกอฮ', 'แอลกอฮอ' or 'แอลกอฮอลล์'. Words with repeated characters are caused by a reviewer repeating characters on the keyboard to express a strong opinion. For example, in 'ดีมากกกกกก' (very good) the character 'ก' is repeated 5 times. The Thai language has the symbol 'ๆ' to signify the repetition of the previous word. Therefore, 'ดีมากกกกกก' becomes 'ดีมากๆๆๆๆๆ'. Word replacement was implemented by regular expression rules. In addition, this research defined five new POS tags for punctuation that indicates opinion labels. The new POS tag set is shown in Table 1.

| Naist tag set | New tag set |
|---|---|
| ?/punc | ?/Qmark |
| (/punc or )/punc | (/Qparent |
| ๆ/punc | ๆ/Qrepeat |
| !/punc | !/Qexclamation |
| ‘/punc or ’/punc or “/punc or ”/punc | "/Qquote |

Table 1 New POS tag set.
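The repeated-character rule described above can be sketched with a single regular expression. The paper does not list its actual rules, so the pattern below and the threshold of three or more identical consecutive characters are assumptions chosen for illustration; the function name `replace_repeats` is hypothetical.

```python
import re

# Assumption: a run counts as "repeated" when the same character occurs
# three or more times in a row (one character plus at least two copies).
REPEAT = re.compile(r"(.)\1{2,}")

def replace_repeats(text: str) -> str:
    """Keep the first character of each run and turn every extra copy into 'ๆ'."""
    return REPEAT.sub(lambda m: m.group(1) + "ๆ" * (len(m.group(0)) - 1), text)

normalized = replace_repeats("ดีมาก" + "ก" * 5)
```

With this rule, the five extra copies of 'ก' in the example from the text are each replaced by 'ๆ', while words without such runs are left unchanged.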

2.1.4 Stopword Elimination

Some extremely common words in text have little value for identifying types of human emotions. These words are called stopwords and consist of a set of common words (e.g. a, the, for, at), punctuation marks (e.g. (, ], ?, '), numbers (e.g. 1, 2, 3) and symbols (e.g. %, $, @). These were eliminated.

The remaining words are the main words that the reviewer uses to express his or her opinion. This research used information from POS tagging to ignore words that do not express opinions or emotions of reviewers. Three types of tokens were filtered out: words carrying one of eleven POS tags that express no opinion; blanks; and English words, which are typically brand names, product ingredients or reviewers' names. The list of eleven POS tags is shown in Table 2.

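| Group | No | POS tag | Description | Example |
|---|---|---|---|---|
| Particle | 1 | aff | Affirmative | ค่ะ, ครับ |
| Particle | 2 | part | Particle | นะ, นั่นเอง |
| Classifier | 3 | cl | Classifier | ชิ้น, กล่อง |
| Prefix | 4 | pref1 | Prefix1 | การ, ความ |
| Prefix | 5 | pref2 | Prefix2 | ผู้, นัก |
| Prefix | 6 | pref3 | Prefix3 | ชาว |
| Symbol | 7 | sym | Symbol | ฯลฯ, % |
| Noun | 8 | ntit | Title noun | นาย, นางสาว |
| Noun | 9 | nlab | Label noun | 2, ก |
| Noun | 10 | nnum | Cardinal number | หมื่น, 1000 |
| Punctuation | 11 | punc | Punctuation | ., -, _ |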

Table 2 Eleven POS tags for stopword elimination.
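The POS-based filter can be sketched as below. The tag names come from Table 2; everything else (the function name `remove_stopwords`, the use of an ASCII-letter test to detect English words, and the treatment of '_' as a blank) is an assumption made for illustration rather than a description of the original implementation.

```python
import re

# The eleven POS tags of Table 2.
STOP_TAGS = {"aff", "part", "cl", "pref1", "pref2", "pref3",
             "sym", "ntit", "nlab", "nnum", "punc"}
# Assumption: a token containing any ASCII letter is treated as English.
ENGLISH = re.compile(r"[A-Za-z]")

def remove_stopwords(tagged):
    """tagged: list of (word, pos_tag) pairs; returns the pairs that are kept."""
    kept = []
    for word, tag in tagged:
        if tag in STOP_TAGS:
            continue  # tag carries no opinion
        if not word.strip() or word == "_":
            continue  # blank token
        if ENGLISH.search(word):
            continue  # English word (brand, ingredient or name)
        kept.append((word, tag))
    return kept

sample = [("ivory", "npn"), ("_", "punc"), ("ไม่", "neg"), ("เห็น", "vt"),
          ("6", "nnum"), ("ขวด", "cl"), ("อ่ะ", "aff")]
filtered = remove_stopwords(sample)
```

The new tags of Table 1 (e.g. Qquote) are not in the stop set, so tokens carrying them pass through the filter.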

Process: Input
Example: "ivory_caps"_ไม่เห็นได้ผลเลยใช้มาจะ_6_ขวดแล้วอ่ะ ["ivory caps", there are not any results, although I have used it for 6 bottles]
Remark: underscore (_) represents a space

Process: Word Segmentation
Example: "|ivory|_|caps|"|ไม่|เห็น|ได้|ผล|เลย|ใช้|มา|จะ|_|6|_|ขวด|แล้ว|อ่ะ|
Remark: vertical bar (|) represents a segmented sign

Process: POS Tagging
Example: "/punc|ivory/npn|_/punc|caps/npn|"/punc|_/punc|ไม่/neg|เห็น/vt|ได้/vpost|ผล/ncn|เลย/part|_/punc|ใช้/vt|มา/vpost|จะ/prev|_/punc|6/nnum|_/punc|ขวด/cl|แล้ว/vpost|อ่ะ/aff|

Process: Word Replacement
Example: "/Qquote|ivory/npn|_/punc|caps/npn|"/Qquote|_/punc|ไม่/neg|เห็น/vt|ได้/vpost|ผล/ncn|เลย/part|_/punc|ใช้/vt|มา/vpost|จะ/prev|_/punc|6/nnum|_/punc|ขวด/cl|แล้ว/vpost|อ่ะ/aff|
Remark: replace "/punc and "/punc with "/Qquote

Process: Stopword Elimination
Example: "/Qquote|"/Qquote|ไม่/neg|เห็น/vt|ได้/vpost|ผล/ncn|เลย/part|ใช้/vt|มา/vpost|จะ/prev|แล้ว/vpost|
Remark: remove 3 tokens (6/nnum|, ขวด/cl|, อ่ะ/aff|) and remove 5 spaces and 2 English words

Table 3 Examples of each text preprocessing step.

2.2 Feature Extraction

Feature extraction constructs a vector representation for each review. This research used five corpus-based and lexicon feature subsets.

2.2.1 Term Weighting

The term frequency (tf) and inverse document frequency (idf) weighting technique [19] was used, where \(f_{ij}\) is the raw frequency count of term \(t_i\) in document \(d_j\) and \(tf_{ij}\) is the normalized term frequency of \(t_i\) in \(d_j\). The formula for \(tf_{ij}\) is shown in Eq. (1), where the maximum is computed over all terms that appear in document \(d_j\) and |V| is the vocabulary size. \(idf_i\) is the inverse document frequency of term \(t_i\), where N is the total number of documents and \(df_i\) is the number of documents in which term \(t_i\) appears. The formula for \(idf_i\) is shown in Eq. (2). The weight \(Tf\text{-}Idf_{ij}\) can be calculated with Eq. (3):

\[tf_{ij} = \frac{f_{ij}}{\max\{f_{1j}, f_{2j}, \dots, f_{|V|j}\}} \tag{1}\]

\[idf_i = \log \frac{N}{df_i} \tag{2}\]

\[Tf\text{-}Idf_{ij} = tf_{ij} \times idf_i \tag{3}\]

The Tf-Idf weight of a unigram word (TUW) is calculated for each term while that of a bigram word (TBW) is calculated for each word pair.
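Eqs. (1) to (3) can be computed directly from token counts, as in the following sketch. The function name `tf_idf` is hypothetical, and the natural logarithm is an assumption, since the base of the logarithm in Eq. (2) is not stated. For TBW, the same function would be applied to lists of bigrams instead of single words.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of term lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # df_i: number of documents in which term t_i appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)                      # f_ij
        max_f = max(counts.values(), default=1)    # max over terms in d_j
        weights.append({
            term: (f / max_f) * math.log(n_docs / df[term])  # Eqs. (1)-(3)
            for term, f in counts.items()
        })
    return weights

toy_docs = [["a", "a", "b"], ["b", "c"]]
toy_weights = tf_idf(toy_docs)
```

In the toy input, the term "b" occurs in every document, so its idf and therefore its weight is zero, whereas "a" keeps a positive weight in the first document.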

2.2.2 Part-of-Speech Weighting

The part-of-speech (POS) weight feature is the Tf-Idf weight for a POS tag, calculated from the training data. There are weights for both unigram part-of-speech (TUP) and bigram part-of-speech (TBP).

2.2.3 Thai Sentiment Lexicon

The last subset of features is a Thai sentiment lexicon [20] that creates two attributes for identifying positive and negative words from customer reviews. These attributes were derived from a Thai sentiment lexicon that is available online and consists of 1,031 words, where 321 words constitute the positive lexicon (PL) and 710 words constitute the negative lexicon (NL). The lexicon includes nouns, verbs and adjectives. This feature is represented as two integer attributes: positive_score and negative_score. The value of each attribute is calculated according to the pseudocode in Figure 2.


Figure 2 Pseudocode for calculating lexicon score.
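Since the pseudocode of Figure 2 is not reproduced in this text, the following is only one plausible reading of the two integer attributes: each token found in the positive lexicon increments `positive_score` and each token found in the negative lexicon increments `negative_score`. The counting rule, the function name `lexicon_scores` and the two-word toy lexicons are assumptions, not the authors' procedure.

```python
def lexicon_scores(tokens, positive_lexicon, negative_lexicon):
    """Count lexicon hits in a tokenized review (assumed scoring rule)."""
    positive_score = sum(1 for t in tokens if t in positive_lexicon)
    negative_score = sum(1 for t in tokens if t in negative_lexicon)
    return positive_score, negative_score

# Toy lexicons for illustration; the real PL and NL hold 321 and 710 words.
toy_pl = {"ดี"}
toy_nl = {"แย่"}
scores = lexicon_scores(["ดี", "มาก", "แย่", "แย่"], toy_pl, toy_nl)
```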

2.3 Opinion Classification

Three classification algorithms were used in combination with the proposed method to identify emotion labels, i.e. Decision Tree, Multinomial Naïve Bayes and Support Vector Machine.

2.3.1 Decision Tree

A decision tree consists of internal nodes that represent attribute tests and leaf nodes that contain output classes. Information gain is computed as the decrease in entropy after a data set is split on an attribute and the attribute with the highest information gain is selected for the current split. Information gain and entropy can be calculated according to Eqs. (4) and (5), respectively [21].

\[Gain(S, A) = Entropy(S) - \sum_{i} \frac{|S_i|}{|S|} \cdot Entropy(S_i) \tag{4}\]

\[Entropy(S) = \sum_{i=1}^{c} -p_i \log_2 p_i \tag{5}\]

where S is the training data set, A is the attribute being tested, \(S_i\) is the subset of S for which A takes its i-th value, \(p_i\) is the proportion of instances in S belonging to class i, and c is the total number of classes.
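Eqs. (4) and (5) can be illustrated with a few lines of code. This is a generic sketch of entropy and information gain, not the decision tree learner used in the experiments; the function names and the dictionary-based row format are assumptions.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Eq. (5): entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Eq. (4): decrease in entropy after splitting on one attribute."""
    subsets = defaultdict(list)
    for row, label in zip(rows, labels):
        subsets[row[attribute]].append(label)   # S_i for each attribute value
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

toy_rows = [{"has_good": 1}, {"has_good": 1}, {"has_good": 0}, {"has_good": 0}]
toy_labels = ["positive", "positive", "negative", "negative"]
gain = information_gain(toy_rows, toy_labels, "has_good")
```

In this toy split the attribute separates the two classes perfectly, so the gain equals the full entropy of the label set (1 bit).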

2.3.2 Multinomial Naive Bayes

Multinomial Naive Bayes [22] is a popular classification technique in the context of text analytics. It calculates the conditional probability of observing features \(x_1\) through \(x_n\) given some class \(c\), written \(P(x_1, x_2, \dots, x_n | c)\), as shown in Eq. (6).

\[P(x_1, x_2, \dots, x_n | c) \tag{6}\]

With the independence assumption, Multinomial Naive Bayes can be expressed as Eq. (7). Its application to text classification considers the positions of words in a document as shown in Eq. (8).

\[C_{NB} = \operatorname{argmax}_{c_j \in C} P(c_j) \prod_{x \in X} P(x | c_j) \tag{7}\]

\[C_{NB} = \operatorname{argmax}_{c_j \in C} P(c_j) \prod_{i \in positions} P(x_i | c_j) \tag{8}\]
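The decision rule of Eq. (8) can be sketched as follows. Two details are assumptions that the text does not specify: add-one (Laplace) smoothing of the word probabilities, and the use of log probabilities to avoid numerical underflow. The function names `train_mnb` and `predict_mnb` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_mnb(docs, labels):
    """Collect class counts, per-class word counts and the vocabulary."""
    prior = Counter(labels)
    counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        counts[label].update(doc)
    vocab = {term for doc in docs for term in doc}
    return prior, counts, vocab

def predict_mnb(model, doc):
    """Return argmax_c log P(c) + sum_i log P(x_i | c), as in Eq. (8)."""
    prior, counts, vocab = model
    total = sum(prior.values())
    best_label, best_score = None, -math.inf
    for label in prior:
        denom = sum(counts[label].values()) + len(vocab)  # add-one smoothing
        score = math.log(prior[label] / total)
        for term in doc:
            if term in vocab:  # words unseen in training are skipped
                score += math.log((counts[label][term] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

toy_model = train_mnb([["good", "good"], ["bad"]], ["happiness", "anger"])
```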

2.3.3 Support Vector Machine

Support Vector Machine (SVM) [19,23] is a supervised learning method relying on a linear separation of input data with high dimensions. SVM represents the training data with different categories as points in a vector space and uses a margin to define the distance between the separating hyperplane and the training data that are closest to this hyperplane. A kernel function K(x,y) represents our desired notion of similarity between data x and y, allowing the learning of a non-linear model. The polynomial kernel, shown in Eq. (9), is a commonly used function and was applied in this research.

\[K(x,y) = (\langle x, y \rangle + 1)^d \tag{9}\]
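Eq. (9) translates into a one-line function. The default degree d = 2 below is an assumption for illustration only; the degree used in the experiments is not stated.

```python
def polynomial_kernel(x, y, d=2):
    """Eq. (9): (<x, y> + 1) ** d for two equal-length feature vectors."""
    return (sum(a * b for a, b in zip(x, y)) + 1) ** d

k = polynomial_kernel([1, 2], [3, 4], d=2)  # (1*3 + 2*4 + 1) ** 2
```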

3 Data Set

There is no standard test collection for free-text reviews in the Thai language specifically for opinion classification. The data set used in this research consisted of customer reviews of cosmetics collected from three popular beauty websites, i.e. www.lazada.co.th, www.kony.com and www.vanilla.in.th with a total of 2,770 reviews. Each review was annotated by five readers who were familiar with the subject matter and the label with the majority votes was selected as the result. Accordingly, each annotation had three levels: the opinion label, consisting of two types (objective and subjective); the sentiment label, consisting of two types (positive and negative); and the emotion label, consisting of six labels (anger, disgust, fear, sadness, happiness and surprise). Table 4 shows the characteristics of the data set.

Table 4 Characteristics of the data set.

| Label | Number of reviews | Example | Remark |
|---|---|---|---|
| Objective | 138 | สูตรของ Bio-Oil เป็นการผสานกันของสารสกัดจากพืชและวิตามินที่อยู่ในรูปของน้ำมัน โดยมีสารประกอบ PurCellin Oil ซึ่งทำให้สูตรของ Bio-Oil มีความบางเบาดูดซึ้มสู่ผิวง่าย [The Bio-Oil formulation is a combination of plant extracts and vitamins, suspended in an oil base. It contains PurCellin Oil, which makes it light and not greasy.] | |
| Subjective | 2,632 | เนื้อครีมมันมาก วันไหนอากาศร้อนตกบ่ายหน้ามันเยิ้มเลย [It's very oily. It turns greasy when the weather is hot.] | |
| Positive | 994 | หลอดนึงใช้ได้หลายเดือนคุ้มค่ามาก [A tube lasts for many months. Worth the money!] | Positive and negative labels are identified from subjective labels |
| Negative | 1,638 | เนื้อครีมมันมาก วันไหนอากาศร้อนตกบ่ายหน้ามันเยิ้มเลย [It's very oily. It turns greasy when the weather is hot.] | Positive and negative labels are identified from subjective labels |
| Anger | 679 | แย่มาก ของใกล้หมดอายุ ไม่แจ้งให้ทราบ ขวดใหญ่มาก แล้วจัดโปร 1 แถม 1 ไปอีก ใครจะไปใช้ทัน [It is very bad. Product was almost expired, but they did not tell the customer. The bottle is big with buy 1 get 1. Who can use it all?] | Anger, disgust, fear and sadness are identified from negative labels |
| Disgust | 287 | เนื้อครีมมันมาก วันไหนอากาศร้อนตกบ่ายหน้ามันเยิ้มเลย [It's very creamy. When the weather is hot, it turns greasy.] | Anger, disgust, fear and sadness are identified from negative labels |
| Fear | 183 | เราใช้แล้วแพ้อ่ะ แสบตามากๆๆ มันร้อนบอกไม่ถก [I am allergic to the product. My eyes are very irritated.] | Anger, disgust, fear and sadness are identified from negative labels |
| Sadness | 559 | ตัวนี้ชื้อมาแอบหวังเล็กๆ ว่าจะดี แต่กลับเฉยๆ ใช้ไปได้ครึ่งขวดก็เปลี่ยนไปลองยี่ห้ออื่นแล้วค่ะ ไม่เห็นผลใดๆ เลย [I hoped that this would be a good product. But after half of the bottle, I changed to another brand. No good.] | Anger, disgust, fear and sadness are identified from negative labels |
| Happiness | 489 | เนื้อครีมนุ่มมากผิวสัมผัสดี ู้ซึมไว เวลาทาแล้วทำให้หน้าดูนุ่มๆ ขึ้น ผิวสุขภาพดีขึ้น ไม่แพ้ ไม่มีกลิ่นแรงด้วย [The cream is very soft and absorbed into the skin quickly. It makes my face look soft and my skin healthy. No strong smell.] | Happiness and surprise are identified from positive labels |
| Surprise | 435 | หลอดนึงใช้ได้หลายเดือนคุ้มค่ามาก [A tube lasts for many months.] | Happiness and surprise are identified from positive labels |

4 Experimental Evaluations

The feature extraction process generates five feature subsets for emotion classification. They are: Tf-Idf of unigram words (TUW), Tf-Idf of bigram words (TBW), Tf-Idf of unigram POS (TUP), Tf-Idf of bigram POS (TBP) and Thai sentiment lexicon (TL). This research used Decision Tree, Multinomial Naïve Bayes and Support Vector Machine as classifiers. Eighty percent of the data were used to construct the model and the remaining twenty percent were used as test data. The first experiment studied the performance of Approach 1 with six patterns of feature sets and three classification algorithms. The results of the first approach are shown in Table 5. We can see that the TBW feature had the best performance, especially with Multinomial Naïve Bayes. The highest precision, recall and F-measure values of this approach were 0.689, 0.652 and 0.657, respectively.
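For reference, the three measures reported in the tables can be computed per label as in the sketch below, using the standard definitions of precision, recall and F-measure. How the per-label values were averaged into a single figure is not stated in the text, so no averaging step is shown; the function name `precision_recall_f` is hypothetical.

```python
def precision_recall_f(gold, predicted, label):
    """Standard per-label precision, recall and F-measure."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

toy_gold = ["anger", "anger", "fear", "fear"]
toy_pred = ["anger", "fear", "fear", "fear"]
p, r, f = precision_recall_f(toy_gold, toy_pred, "fear")
```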

Table 5 Effectiveness of the first approach.

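Effectiveness of Approach 1: Emotion Classification (7 labels)

| Classifier | Feature set | Precision | Recall | F-measure |
|---|---|---|---|---|
| Decision Tree | TUW | 0.505 | 0.492 | 0.493 |
| | TUW+TL | 0.491 | 0.483 | 0.483 |
| | TUW+TL+TUP | 0.499 | 0.482 | 0.486 |
| | TBW | 0.505 | 0.494 | 0.494 |
| | TBW+TL | 0.482 | 0.468 | 0.469 |
| | TBW+TL+TBP | 0.492 | 0.473 | 0.477 |
| Multinomial Naïve Bayes | TUW | 0.648 | 0.619 | 0.625 |
| | TUW+TL | 0.614 | 0.570 | 0.570 |
| | TUW+TL+TUP | 0.623 | 0.590 | 0.591 |
| | TBW | 0.689 | 0.652 | 0.657 |
| | TBW+TL | 0.632 | 0.125 | 0.134 |
| | TBW+TL+TBP | 0.668 | 0.190 | 0.227 |
| Support Vector Machine | TUW | 0.605 | 0.595 | 0.596 |
| | TUW+TL | 0.609 | 0.600 | 0.600 |
| | TUW+TL+TUP | 0.611 | 0.604 | 0.604 |
| | TBW | 0.640 | 0.633 | 0.632 |
| | TBW+TL | 0.649 | 0.640 | 0.639 |
| | TBW+TL+TBP | 0.627 | 0.619 | 0.619 |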

The results of the second approach (two-level hierarchy of opinion filtering and emotion classification) are shown in Table 6. They show that the TBW+TL+TBP feature set with Support Vector Machine had the best performance in filtering opinions, with 0.985, 0.986 and 0.984 for precision, recall and F-measure, respectively. Next, the emotion of each opinion was classified and the results show that Support Vector Machine with the TBW feature performed best with 0.688, 0.684 and 0.681 for precision, recall and F-measure, respectively.

Effectiveness of Approach 2: Opinion Filtering and Emotion Classification. Level 1 is opinion filtering (2 labels); Level 2 is emotion classification (6 labels).

| Classifier | Feature set | Level 1 Precision | Level 1 Recall | Level 1 F-measure | Level 2 Precision | Level 2 Recall | Level 2 F-measure |
|---|---|---|---|---|---|---|---|
| Decision Tree | TUW | 0.963 | 0.960 | 0.960 | 0.515 | 0.513 | 0.511 |
| | TUW+TL | 0.966 | 0.962 | 0.962 | 0.488 | 0.490 | 0.488 |
| | TUW+TL+TUP | 0.963 | 0.962 | 0.962 | 0.504 | 0.498 | 0.499 |
| | TBW | 0.962 | 0.957 | 0.959 | 0.496 | 0.487 | 0.486 |
| | TBW+TL | 0.965 | 0.960 | 0.962 | 0.501 | 0.494 | 0.492 |
| | TBW+TL+TBP | 0.959 | 0.957 | 0.958 | 0.501 | 0.494 | 0.488 |
| Multinomial Naïve Bayes | TUW | 0.979 | 0.969 | 0.972 | 0.652 | 0.624 | 0.615 |
| | TUW+TL | 0.979 | 0.968 | 0.972 | 0.659 | 0.635 | 0.629 |
| | TUW+TL+TUP | 0.982 | 0.972 | 0.975 | 0.651 | 0.635 | 0.629 |
| | TBW | 0.959 | 0.697 | 0.783 | 0.693 | 0.426 | 0.377 |
| | TBW+TL | 0.960 | 0.734 | 0.810 | 0.682 | 0.449 | 0.394 |
| | TBW+TL+TBP | 0.962 | 0.820 | 0.869 | 0.648 | 0.452 | 0.419 |
| Support Vector Machine | TUW | 0.979 | 0.980 | 0.978 | 0.646 | 0.643 | 0.643 |
| | TUW+TL | 0.978 | 0.979 | 0.978 | 0.640 | 0.635 | 0.636 |
| | TUW+TL+TUP | 0.982 | 0.982 | 0.982 | 0.635 | 0.631 | 0.630 |
| | TBW | 0.983 | 0.983 | 0.967 | 0.688 | 0.684 | 0.681 |
| | TBW+TL | 0.983 | 0.983 | 0.981 | 0.678 | 0.673 | 0.670 |
| | TBW+TL+TBP | 0.985 | 0.986 | 0.984 | 0.668 | 0.665 | 0.662 |

Table 6 Effectiveness of the second approach.

The third approach is a three-level hierarchy, where an opinion is first filtered, then the polarity of the opinion is identified and finally the emotion of an opinion with positive or negative polarity is classified accordingly. Its effectiveness in opinion filtering and sentiment classification is shown in Table 7 and the results of positive and negative emotion classification are shown in Table 8. The results of opinion filtering using the third approach were the same as those of the second approach. Table 7 shows that the best sentiment-filtering configuration was TBW+TL+TBP with Support Vector Machine. The precision, recall and F-measure of this configuration were 0.947, 0.947 and 0.946, respectively. The positive and negative opinions were sent to the positive and the negative emotion classifiers, respectively. The results of positive emotion classification show that the TUW+TL+TUP feature set with Multinomial Naïve Bayes had the best performance with 0.768, 0.765 and 0.764 for precision, recall and F-measure. Lastly, the results of negative emotion classification in Table 8 show that the TBW+TL feature set with Support Vector Machine had the best performance with 0.719, 0.709 and 0.705 for the three measures, respectively.

Table 7 Effectiveness of opinion filtering and sentiment filtering in the third approach.

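Effectiveness of Approach 3: Opinion Filtering, Sentiment Filtering and Emotion Classification. Level 1 is opinion filtering (2 labels); Level 2 is sentiment filtering (2 labels).

| Classifier | Feature set | Level 1 Precision | Level 1 Recall | Level 1 F-measure | Level 2 Precision | Level 2 Recall | Level 2 F-measure |
|---|---|---|---|---|---|---|---|
| Decision Tree | TUW | 0.963 | 0.960 | 0.960 | 0.812 | 0.802 | 0.804 |
| | TUW+TL | 0.966 | 0.962 | 0.962 | 0.823 | 0.806 | 0.808 |
| | TUW+TL+TUP | 0.963 | 0.962 | 0.962 | 0.770 | 0.764 | 0.766 |
| | TBW | 0.962 | 0.957 | 0.959 | 0.807 | 0.791 | 0.793 |
| | TBW+TL | 0.965 | 0.960 | 0.962 | 0.805 | 0.787 | 0.789 |
| | TBW+TL+TBP | 0.959 | 0.957 | 0.958 | 0.808 | 0.795 | 0.797 |
| Multinomial Naïve Bayes | TUW | 0.979 | 0.969 | 0.972 | 0.878 | 0.878 | 0.878 |
| | TUW+TL | 0.979 | 0.968 | 0.972 | 0.865 | 0.863 | 0.864 |
| | TUW+TL+TUP | 0.982 | 0.972 | 0.975 | 0.864 | 0.863 | 0.864 |
| | TBW | 0.959 | 0.697 | 0.783 | 0.902 | 0.897 | 0.895 |
| | TBW+TL | 0.960 | 0.734 | 0.810 | 0.906 | 0.905 | 0.904 |
| | TBW+TL+TBP | 0.962 | 0.820 | 0.869 | 0.868 | 0.867 | 0.865 |
| Support Vector Machine | TUW | 0.979 | 0.980 | 0.978 | 0.878 | 0.875 | 0.875 |
| | TUW+TL | 0.978 | 0.979 | 0.978 | 0.885 | 0.882 | 0.883 |
| | TUW+TL+TUP | 0.982 | 0.982 | 0.982 | 0.881 | 0.878 | 0.879 |
| | TBW | 0.983 | 0.983 | 0.967 | 0.932 | 0.932 | 0.932 |
| | TBW+TL | 0.983 | 0.983 | 0.981 | 0.939 | 0.939 | 0.939 |
| | TBW+TL+TBP | 0.985 | 0.986 | 0.984 | 0.947 | 0.947 | 0.946 |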

Table 8 Effectiveness of positive and negative classification in the third approach.

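Effectiveness of Approach 3: Opinion Filtering, Sentiment Filtering and Emotion Classification. Level 3-1 is positive emotion classification (2 labels); Level 3-2 is negative emotion classification (4 labels).

| Classifier | Feature set | Level 3-1 Precision | Level 3-1 Recall | Level 3-1 F-measure | Level 3-2 Precision | Level 3-2 Recall | Level 3-2 F-measure |
|---|---|---|---|---|---|---|---|
| Decision Tree | TUW | 0.675 | 0.673 | 0.672 | 0.624 | 0.618 | 0.617 |
| | TUW+TL | 0.613 | 0.612 | 0.610 | 0.627 | 0.618 | 0.618 |
| | TUW+TL+TUP | 0.584 | 0.582 | 0.575 | 0.631 | 0.618 | 0.615 |
| | TBW | 0.635 | 0.633 | 0.630 | 0.625 | 0.612 | 0.608 |
| | TBW+TL | 0.637 | 0.633 | 0.628 | 0.636 | 0.612 | 0.608 |
| | TBW+TL+TBP | 0.655 | 0.653 | 0.651 | 0.625 | 0.618 | 0.619 |
| Multinomial Naïve Bayes | TUW | 0.757 | 0.745 | 0.741 | 0.723 | 0.703 | 0.699 |
| | TUW+TL | 0.765 | 0.755 | 0.752 | 0.707 | 0.691 | 0.685 |
| | TUW+TL+TUP | 0.768 | 0.765 | 0.764 | 0.718 | 0.697 | 0.688 |
| | TBW | 0.733 | 0.602 | 0.531 | 0.680 | 0.558 | 0.516 |
| | TBW+TL | 0.733 | 0.602 | 0.531 | 0.671 | 0.570 | 0.534 |
| | TBW+TL+TBP | 0.689 | 0.684 | 0.680 | 0.658 | 0.564 | 0.524 |
| Support Vector Machine | TUW | 0.695 | 0.694 | 0.693 | 0.701 | 0.703 | 0.700 |
| | TUW+TL | 0.695 | 0.694 | 0.693 | 0.702 | 0.703 | 0.700 |
| | TUW+TL+TUP | 0.717 | 0.714 | 0.713 | 0.694 | 0.697 | 0.693 |
| | TBW | 0.664 | 0.663 | 0.662 | 0.712 | 0.703 | 0.698 |
| | TBW+TL | 0.654 | 0.653 | 0.652 | 0.719 | 0.709 | 0.705 |
| | TBW+TL+TBP | 0.663 | 0.663 | 0.663 | 0.711 | 0.703 | 0.699 |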

Table 9 compares the performance between the two-level hierarchical classification (Approach 2) and the three-level hierarchical classification (Approach 3) using the feature sets and algorithms that yielded the highest accuracy according to Tables 6, 7 and 8. The results show that the third approach achieved the best performance with accuracy at 69.60%. According to the precision and F-measure measurement, Approach 3 achieved better performance in classifying sentiments and emotions than Approach 2. For some emotions, the recall of Approach 2 was higher than that of Approach 3.

Table 10 shows a comparison between Approach 1 and Approach 3. For all the negative emotions (anger, disgust, fear, sadness) and one positive emotion (happiness), Approach 3 achieved higher precision than Approach 1. For objective, sadness and happiness, Approach 1 achieved higher recall than Approach 3.

Overall, the Tf-Idf of bigram word feature is the most effective feature subset to be used for filtering opinions, determining polarity and classifying negative emotions. Lexicon resources such as a Thai sentiment lexicon and a POS tag set at the morphology level can improve accuracy for the opinion filtering in Approaches 2 and 3.

Support Vector Machine achieves high performance in identifying contrasting opinions such as objective versus subjective opinions, and positive versus negative sentiments. Multinomial Naïve Bayes achieves high performance in identifying closely related emotions, such as happiness versus surprise in positive emotion classification.

There are four main reasons for incorrect classification. Firstly, Thai word segmentation and POS tagging are inaccurate due to the complexity of the Thai language. Secondly, there are ambiguities in the Thai sentiment lexicon: (1) The positive and negative scores do not always conform with the annotated label; for example, a review expressing anger can receive a positive score. (2) Reviewers can be sarcastic, for example 'ดีจริงจริ๊งเลย แป้งยี่ห้อนี้' [this face powder is good], where the review actually has a negative sentiment. (3) The Thai sentiment lexicon only has two labels (positive and negative), which is not sufficient to classify emotions. (4) Bigrams cannot detect patterns of word pairs that are more than two words apart, especially those involving negative words. For example, the positive review [ไม่|เหนียว|เหนอะ|หนะ|] generates three bigrams: 'ไม่เหนียว', 'เหนียวเหนอะ' and 'เหนอะหนะ'.

The word 'ไม่' [not] is a negative word; 'ไม่เหนียว' expresses a positive sentiment, but 'เหนียวเหนอะ' and 'เหนอะหนะ' express a negative sentiment. The resulting sentiment of the entire opinion is thus negative, since the weight of the negative bigrams is higher than that of the positive one. Thus, bigrams are not effective in handling this type of problem.
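The bigram example above can be reproduced with a short helper. The function name `bigrams` is hypothetical; adjacent tokens are concatenated without a separator, matching the way the bigrams are written in the text.

```python
def bigrams(tokens):
    """Concatenate each pair of adjacent tokens into one bigram feature."""
    return [tokens[i] + tokens[i + 1] for i in range(len(tokens) - 1)]

review = ["ไม่", "เหนียว", "เหนอะ", "หนะ"]
pairs = bigrams(review)  # three bigrams for four tokens
```

Only the first bigram contains the negation 'ไม่', so the remaining two carry the negative wording without the word that reverses it.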

Table 9 Effectiveness of hierarchical emotion classification.

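| Label | Approach 2 Precision | Approach 2 Recall | Approach 2 F-measure | Approach 3 Precision | Approach 3 Recall | Approach 3 F-measure |
|---|---|---|---|---|---|---|
| Objective | 0.769 | 1.000 | 0.870 | 1.000 | 0.769 | 0.870 |
| Anger | 0.844 | 0.692 | 0.761 | 0.844 | 0.667 | 0.745 |
| Disgust | 0.724 | 0.750 | 0.737 | 0.808 | 0.724 | 0.764 |
| Fear | 0.500 | 0.769 | 0.606 | 0.526 | 0.769 | 0.625 |
| Sadness | 0.652 | 0.732 | 0.690 | 0.547 | 0.630 | 0.586 |
| Happiness | 0.654 | 0.607 | 0.630 | 0.700 | 0.673 | 0.686 |
| Surprise | 0.592 | 0.617 | 0.604 | 0.646 | 0.738 | 0.689 |
| Average | 0.676 | 0.738 | 0.699 | 0.691 | 0.743 | 0.709 |

Accuracy: 68.50% for Approach 2 (two-level hierarchy classification) and 69.60% for Approach 3 (three-level hierarchy classification).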

Table 10 Effectiveness of Approach 3 versus Approach 1.

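| Level | Label | Approach 3 Precision | Approach 3 Recall | Approach 3 F-measure | Approach 1 Precision | Approach 1 Recall | Approach 1 F-measure |
|---|---|---|---|---|---|---|---|
| Level 1 Opinion label | Objective | 1.000 | 0.769 | 0.870 | 1.000 | 0.786 | 0.880 |
| | Subjective | 0.989 | 1.000 | 0.994 | | | |
| Level 2 Sentiment label (based on result of level 1) | Positive | 0.939 | 0.975 | 0.957 | | | |
| | Negative | 0.959 | 0.904 | 0.931 | | | |
| Level 3 Emotion label (based on result of level 2) | Anger | 0.844 | 0.667 | 0.745 | 0.769 | 0.694 | 0.730 |
| | Disgust | 0.808 | 0.724 | 0.764 | 0.800 | 0.556 | 0.656 |
| | Fear | 0.526 | 0.769 | 0.625 | 0.500 | 0.538 | 0.519 |
| | Sadness | 0.547 | 0.630 | 0.586 | 0.530 | 0.714 | 0.609 |
| | Happiness | 0.700 | 0.673 | 0.686 | 0.633 | 0.717 | 0.673 |
| | Surprise | 0.646 | 0.738 | 0.689 | 0.656 | 0.583 | 0.618 |
| | Average | 0.691 | 0.743 | 0.709 | 0.688 | 0.667 | 0.670 |

Accuracy: 69.60% for Approach 3 (three-level hierarchy classification) and 66.67% for Approach 1 (emotion classification).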

5 Conclusion

In this paper, a framework for hierarchical classification of fine-grained emotions of cosmetic reviews written in the Thai language was presented. The proposed framework begins by extracting important words that express the opinion of the reviewer, representing each review with a set of features consisting of characteristics of unigram and bigram words and part-of-speech tags, and a Thai sentiment lexicon.

Three approaches of emotion classification were proposed and studied in detail, i.e. direct emotional classification of review texts; opinion filtering and emotional classification of opinions; and a hierarchical approach of opinion filtering, opinion polarity identification and emotion classification. At each step, Decision Tree, Support Vector Machine and Multinomial Naïve Bayes were tested as classifiers.

A set of experiments was conducted to evaluate the effectiveness of the different approaches and configurations on a collection of actual informal free-text reviews acquired from the Internet. The results showed that the proposed hierarchical approach had the best performance with precision, recall and F-measure at 0.691, 0.743 and 0.709, respectively. In addition, the Tf-Idf of bigram words was found to be the most effective set of features to tackle this problem.

References

  1. Cowie, R., Douglas-Cowie E., Savvidou, S., McMahon E., Sawey, M. & Schroder M., Emotion Recognition in Human-computer Interaction, IEEE Signal Processing Magazine, 18(1), pp. 32-80, Jan. 2001.
  2. Donato, G., Bartlett, M.S., Hager, J.C., Ekman, P. & Sejnowski, T.J., Classifying Facial Actions, IEEE Trans Pattern Anal Mach Intell, 21(10), pp. 974-989, Oct. 1999. DOI: 10.1109/34.799905
  3. Cohen, I., Sebe, N., Garg, A., Chen, L.S. & Huang, T.S., Facial Expression Recognition from Video Sequences: Temporal and Static Modeling, Computer Vision and Image Understanding, 91(1-2), pp. 160-187, Jul. 2003. DOI: 10.1016/s1077-3142(03)00081-x
  4. Mehta, D., Siddiqui, M.F.H. & Javaid, A.Y., Facial Emotion Recognition: A Survey and Real-world User Experiences in Mixed Reality, Sensors, 18(2), pp. 416, Feb. 2018. DOI: 10.3390/s18020416
  5. Strapparava, C. & Mihalcea, R., Learning to Identify Emotions in Text, in Proceedings of the 2008 ACM Symposium on Applied Computing, New York, NY, USA, pp. 1556-1560, 2008. DOI: 10.1145/1363686.1364052
  6. Chirawichitchai, N., Emotion Classification of Thai Text Based Using Term Weighting and Machine Learning Techniques, presented at the 11th International Joint Conference on Computer Science and Software Engineering (JCSSE), pp. 91-96, 2014. DOI: 10.1109/jcsse.2014.6841848
  7. Inrak, P. & Sinthupinyo, S., Applying Latent Semantic Analysis to Classify Emotions in Thai Text, presented at the 2010 2nd International Conference on Computer Engineering and Technology, 6, pp. V6-450- 454, 2010. DOI: 10.1109/iccet.2010.5486137
  8. Burget, R. & Karasek, J., Recognition of Emotions in Czech Newspaper Headlines, 20(1), pp. 39-47, 2011.
  9. Sriphaew, K., Takamura, H. & Okumura, M., Sentiment Analysis for Thai Natural Language Processing, in Proceedings of the 2nd Thailand-Japan International Academic Conference TJIA, pp. 123-124, 2009.
  10. Haruechaiyasak, C., Kongthon, A., Palingoon, P. & Trakultaweekoon, K., S-Sense: A Sentiment Analysis Framework for Social Media Sensing, in Proceedings of the IJCNLP 2013 Workshop on Natural Language Processing for Social Media (SocialNLP), Nagoya, Japan, pp. 6-13, 2013.
  11. Chumwatana, T., Using Sentiment Analysis Technique for Analyzing Thai Customer Satisfaction from Social Media, presented at the Proceedings of the 5th International Conference on Computing and Informatics, Turkey, pp. 659-664, 2015.
  12. Liu, B., Sentiment Analysis and Opinion Mining, Morgan & Claypool Publishers, 2012.
  13. Lee, D., Jeong, O-R. & Lee, S., Opinion Mining of Customer Feedback Data on the Web, p. 230, 2008.
  14. Ekman, P., An Argument for Basic Emotions, Cognition and Emotion, 6(3-4), pp. 169-200, May 1992.
  15. Breazeal, C., Emotion and Sociable Humanoid Robots, International Journal of Human-Computer Studies, 59(1-2), pp. 119-155, Jul. 2003. DOI: 10.1016/s1071-5819(03)00018-1
  16. Sudprasert, S. & Kawtrakul, A., Thai Word Segmentation Based on Global and Local Unsupervised Learning, in Proceedings of the 7th National Computer Science and Engineering Conference (NCSEC
  17. Kok, D. de, Jitar HMM Part of Speech Tagger, https://github.com/danieldk/jitar, 2018. (9 July 2018)
  18. Asanee, K., Supapas, K., Thitima, J. & Chanvit, J., A Lexibase Model for Writing Production Assistant System, in Proceeding of the Second Symposium on Natural Language Processing, pp. 226-236, 1995.
  19. Liu, B., Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, 2nd ed. Berlin Heidelberg: Springer-Verlag, 2011.
  20. Phatthiyaphaibun, W., lexicon-thai/sentiment at master · PyThaiNLP/lexicon-thai · GitHub, 2016. [Online]. Available: https://github.com/PyThaiNLP/lexicon-thai/tree/master/sentiment. (9 July 2018).
  21. Salzberg, S.L., C4.5: Programs for Machine Learning by J. Ross Quinlan, Morgan Kaufmann Publishers, Inc., 1993.
  22. McCallum, A. & Nigam, K., A Comparison of Event Models for Naive Bayes Text Classification, in AAAI-98 Workshop on Learning for Text Categorization, pp. 41-48, 1998.
  23. Yekkehkhany, B., Safari, A., Homayouni, S. & Hasanlou, M., A Comparison Study of Different Kernel Functions for SVM-based Classification of Multi-temporal Polarimetry SAR Data, in ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XL-2/W3, pp. 281-285, 2014. DOI: 10.5194/isprsarchives-xl-2-w3-281-2014