Incorporating Background Checks with Sentiment Analysis to Identify Violence Risky Chinese Microblogs
Abstract
1. Introduction
- A framework for step-by-step evaluation of the violence risk of Chinese microblogs is proposed.
- Sentiment analysis is combined with background checks to detect violent activity.
- Our method is topic-independent and easy to apply.
2. Related Work
3. Our Approach
3.1. Calculation of Violence Risk Score
3.1.1. Extraction of Subjective Microblogs
3.1.2. Extraction of Sentiment Words
3.1.3. Semantic Rules
- (1) Transition relationship. The polarity reverses in the latter clause, and the polarity of the whole sentence follows the latter clause.
- (2) Progressive relationship. The strength of the whole sentence is enhanced.
- (3) Concession relationship. The polarity reverses in the latter clause, and the polarity of the whole sentence follows the former clause.
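The three rules above can be sketched as a small function that combines the scores of two clauses. This is a minimal illustration, not the authors' implementation: the names `Relation` and `combine_clauses`, the score range [-1, 1], the `boost` factor, and the use of the mean for the progressive case are all assumptions made for the example.

```python
from enum import Enum


class Relation(Enum):
    TRANSITION = "transition"
    PROGRESSIVE = "progressive"
    CONCESSION = "concession"


def combine_clauses(former: float, latter: float, relation: Relation,
                    boost: float = 1.5) -> float:
    """Combine two clause scores (assumed to lie in [-1, 1]) into one sentence score.

    `boost` is a hypothetical enhancement factor; the source does not give one.
    """
    if relation is Relation.TRANSITION:
        # Rule (1): the whole sentence follows the latter clause.
        return latter
    if relation is Relation.CONCESSION:
        # Rule (3): the whole sentence follows the former clause.
        return former
    # Rule (2): strengthen the sentence. Here (an assumption) the mean of the
    # two clause scores is scaled by `boost` and clipped back to [-1, 1].
    enhanced = (former + latter) / 2 * boost
    return max(-1.0, min(1.0, enhanced))
```

For example, a positive first clause followed by a negative clause after a transition word yields the negative score of the latter clause, while the same pair under a concession keeps the positive score of the former.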
3.1.4. Contribution of Emoticons
3.1.5. Final Score
3.1.6. Activity Risk
3.2. Background Checks of Key Users
3.2.1. Sentiment of Historical Microblogs
3.2.2. Opinion of the Key User’s Circle of Friends
4. Experiment and Results
4.1. Datasets and Criteria
4.2. Preprocessing
4.3. Parameter Setting
4.4. Calculation of Sentiment Polarity
4.5. Calculation of Activity Risk
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Degree | Example Words | Number | Strength |
---|---|---|---|
Most | 不得了 (extremely), 绝对 (absolutely) | 69 | 2 |
Over | 出头 (a little over), 过度 (excessively) | 30 | 1.7 |
Very | 不过 (moderately), 颇为 (mildly) | 42 | 1.5 |
More | 大不了 (at the worst), 更加 (all the more) | 37 | 1.3 |
-ish | 少许 (a little), 未免 (a bit too) | 29 | 0.8 |
Insufficiently | 相对 (relatively), 丝毫 (in the least) | 12 | 0.5 |
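One way the strengths in the degree-adverb table could be applied is as a multiplier on the score of the sentiment word that the adverb modifies. The sketch below assumes that multiplicative scheme, which is common in lexicon-based methods but is not confirmed by the text available here; `DEGREE_STRENGTH` holds only the example words from the table, and `scale_by_degree` is a hypothetical helper name.

```python
from typing import Optional

# Example words and strengths copied from the degree-adverb table above.
DEGREE_STRENGTH = {
    "不得了": 2.0, "绝对": 2.0,    # Most
    "出头": 1.7, "过度": 1.7,      # Over
    "不过": 1.5, "颇为": 1.5,      # Very
    "大不了": 1.3, "更加": 1.3,    # More
    "少许": 0.8, "未免": 0.8,      # -ish
    "相对": 0.5, "丝毫": 0.5,      # Insufficiently
}


def scale_by_degree(word_score: float, adverb: Optional[str] = None) -> float:
    """Scale a sentiment word's score by the strength of its degree adverb.

    Adverbs missing from the lexicon (or no adverb at all) leave the score
    unchanged. Multiplication is an assumed combination rule.
    """
    return word_score * DEGREE_STRENGTH.get(adverb, 1.0)
```

Under this scheme a score can leave the range [-1, 1] after scaling, so any later normalization step would need to account for that.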
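Positive | Negative | Strength |
---|---|---|
(emoticon images not preserved) | (emoticon images not preserved) | ±1 |
(emoticon images not preserved) | (emoticon images not preserved) | ±0.8 |
(emoticon images not preserved) | (emoticon images not preserved) | ±0.6 |
(emoticon images not preserved) | (emoticon images not preserved) | ±0.4 |
(emoticon images not preserved) | (emoticon images not preserved) | ±0.2 |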
Setting | T = 0.3 | T = 0.4 | T = 0.5 |
---|---|---|---|
N = 10 | 64.3 | 65.7 | 64.8 |
N = 20 | 65.7 | 67.2 | 66.5 |
N = 30 | 65.9 | 68.5 | 67.1 |
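Choosing a parameter setting from the grid above amounts to picking the cell with the largest value. The sketch below copies the nine values from the table and selects the best (N, T) pair; the names `RESULTS` and `best_setting` are illustrative, and treating a higher value as better is an assumption, since the metric is not named here.

```python
# (N, T) -> value reported in the parameter table above.
RESULTS = {
    (10, 0.3): 64.3, (10, 0.4): 65.7, (10, 0.5): 64.8,
    (20, 0.3): 65.7, (20, 0.4): 67.2, (20, 0.5): 66.5,
    (30, 0.3): 65.9, (30, 0.4): 68.5, (30, 0.5): 67.1,
}

# Pick the setting with the highest reported value.
best_setting = max(RESULTS, key=RESULTS.get)
best_value = RESULTS[best_setting]
```

With the values as listed, the selection is N = 30 and T = 0.4, the cell holding 68.5.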
Method | Positive | | | Negative | | |
---|---|---|---|---|---|---|
dic | 62.5 | 69.4 | 65.7 | 65.6 | 58.3 | 61.7 |
sim | 62.7 | 68.7 | 65.6 | 65.4 | 59.2 | 62.2 |
dic + sim | 68.7 | 76.6 | 72.4 | 73.5 | 65.1 | 69.1 |
dic + sim + emo | 69.7 | 77.7 | 73.5 | 74.8 | 66.2 | 70.2 |
SVM | 68.5 | 76.3 | 72.2 | 73 | 64.1 | 68.2 |
CNN | 67.9 | 75.4 | 71.8 | 71.5 | 63.2 | 67.6 |
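The sub-headers of the three columns under each polarity did not survive in the table above. In most rows the third value is close to the harmonic mean of the first two, which would be consistent with a precision, recall, F1 layout; this reading is an inference from the numbers, not something stated in the table. The CNN row deviates from that relation by a few tenths. A quick check, with `f1` as an illustrative helper:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    return 2 * precision * recall / (precision + recall)


# First row ("dic"), assuming the columns are precision, recall, F1.
dic_positive = f1(62.5, 69.4)   # table reports 65.7
dic_negative = f1(65.6, 58.3)   # table reports 61.7
```

Both computed values agree with the reported third column to within 0.1.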
Microblog | Violence Risk Score | Threat Level |
---|---|---|
发个炸弹,把国航的航班都炸飞,什么态度总让我老公误机哼 (Send a bomb and blow up all of Air China’s flights; what an attitude, always making my husband miss his flight, hmph) | −1.0 | High |
飞行时间好久啊,我想在飞机上抽烟! (The flight is so long, I want to smoke on the plane!) | −0.5 | Middle |
买个机票,航空公司服务人员态度好差,好想冲到机场讨个说法! (Buying a ticket, and the airline service staff have such a bad attitude; I really want to rush to the airport and demand an explanation!) | −0.12 | Low |
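The table above maps a score to one of three threat levels, but the cut points between levels are not given. The sketch below shows the shape such a mapping could take; the thresholds `high_cut` and `middle_cut` are hypothetical values chosen only so that the three example rows land in their listed levels, and `threat_level` is an illustrative name.

```python
def threat_level(score: float, high_cut: float = -0.75,
                 middle_cut: float = -0.3) -> str:
    """Map a violence risk score to a threat level.

    More negative scores mean higher risk. Both cut points are hypothetical
    placeholders, not values taken from the paper.
    """
    if score <= high_cut:
        return "High"
    if score <= middle_cut:
        return "Middle"
    return "Low"
```

Applied to the three examples, the scores −1.0, −0.5, and −0.12 map to High, Middle, and Low respectively under these assumed cut points.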
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Jia, Y.-F.; Li, S.; Wu, R. Incorporating Background Checks with Sentiment Analysis to Identify Violence Risky Chinese Microblogs. Future Internet 2019, 11, 200. https://doi.org/10.3390/fi11090200