Identifying Negative Deception Using Opinion Mining Techniques in Twitter

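  • Publisher: Ajou University
  • Advisor: Prof. Kyung-Ah Sohn
  • Year of publication: 2015
  • Degree conferred: August 2015
  • Degree: Master's
  • Department and major: Graduate School, Department of Computer Engineering
  • URI: http://www.dcollection.net/handler/ajou/000000020234
  • Language of text: English
  • Copyright: Ajou University theses are protected by copyright.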

Abstract

A huge number of opinions are now posted and tweeted on the Web. Such opinions are an important source of information for customers and companies, and many researchers report that users rely on online opinions when making purchase decisions. Unfortunately, because of the commercial interests involved, a growing number of deceptive opinions are written to mislead consumers, either by promoting a low-quality product (positive deceptive opinions) or by criticizing a potentially better-quality product (negative deceptive opinions). This thesis focuses on detecting negative deceptive opinions among negative tweets about specific brands. We applied lexical, personal profile, and personal behavioral features to detect negative deceptive opinions using supervised machine learning classifiers: Support Vector Machine (SVM), Naïve Bayes, Decision Tree, Maximum Entropy, Bagging, and Random Forest. We tested our method on user opinions about different Samsung products and related issues, collected from five official Twitter accounts. One challenge in evaluating such a system is the lack of a large-scale labeled dataset. To address this, we constructed our own dataset by recruiting multiple people to label the collected tweets, with each label assigned by majority vote. The results indicate that the proposed system achieves 100% accuracy with Maximum Entropy and 98% with Naïve Bayes on the first dataset, which consists of tweets labeled unanimously by all annotators, and 94% and 91%, respectively, on the full labeled dataset. This is a promising approach for detecting deceptive opinions. The approach can also help identify defamers by analyzing each user's profile information, commenting behavior, and writing style.
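
The labeling procedure described in the abstract (majority vote across annotators, plus a stricter subset of unanimously labeled tweets) can be illustrated with a minimal sketch. This is not the thesis's code: the function name `majority_label`, the label strings `"deceptive"` and `"truthful"`, the sample votes, and the choice to drop tied tweets are all assumptions made for illustration.

```python
from collections import Counter


def majority_label(votes):
    """Return (label, is_unanimous) for one tweet's annotator votes.

    label is None when the top two labels tie (assumed handling; the
    abstract does not say how ties were resolved).
    """
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common()
    top_label, top_count = ranked[0]
    # A tie for first place means there is no majority label.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None, False
    return top_label, top_count == len(votes)


# Hypothetical annotations: tweet id -> one vote per annotator.
annotations = {
    "tweet_1": ["deceptive", "deceptive", "deceptive"],
    "tweet_2": ["deceptive", "truthful", "deceptive"],
    "tweet_3": ["truthful", "truthful", "truthful"],
}

full_dataset = {}       # every tweet with a majority label
unanimous_dataset = {}  # only tweets on which all annotators agreed

for tweet_id, votes in annotations.items():
    label, is_unanimous = majority_label(votes)
    if label is None:
        continue  # assumed: tied tweets are left out
    full_dataset[tweet_id] = label
    if is_unanimous:
        unanimous_dataset[tweet_id] = label
```

Under these assumptions, `unanimous_dataset` corresponds to the first dataset mentioned in the abstract and `full_dataset` to the full labeled dataset.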

Table of Contents

1. Introduction 1
1.1 Importance, aims and outcomes 2
1.2 Organization of the thesis 3
1.3 Summary of the proposed work 3
2. Background and Related Works 4
2.1 Deceptive opinion (opinion spam) 4
2.1.1 Positive deceptive opinion 4
2.1.2 Negative deceptive opinion 5
2.2 Related works 5
3. Data Collection and Preprocessing 7
3.1 Twitter crawling 7
3.2 Data pre-processing 7
4. Negative Deception Detection 9
4.1 Approach and system architecture 9
4.2 General sentiment classification module 10
4.2.1 Users social network graph construction 11
4.2.1.1 General social graph 12
4.2.1.2 Social graph with time stamp 13
4.2.1.3 Social graph with sentimental score 15
4.3 Negative deception detection module 18
4.3.1 Feature generation 18
4.3.1.1 Personal profile and behavioral features 18
4.3.1.2 Lexical features 19
5. Experiments and Results 22
5.1 Experiment 1 23
5.2 Experiment 2 24
5.3 Experiment 3 25
5.4 Experiment 4 27
5.5 Experiment 5 28
6. Conclusion and Future Work 31
REFERENCE 32