Development of concept and methodology for multi-level monitoring of the state of interethnic relations with the data from social media
Project Head: Olessia Koltsova
The goal of the project "Development of concept and methodology for multi-level monitoring of the state of interethnic relations with the data from social media" was to provide government officials, non-governmental organizations, and researchers with a well-tested methodology and a convenient tool for monitoring how Internet users discuss problems of ethnicity and nationality. The end result of the project should comprise methodological recommendations and software tools for analyzing text data mined from social networks. Our approach to monitoring is based on topic modeling, a class of algorithms that analyze co-occurrences of words in texts and use them to find latent topics in text corpora far too large for a human to read. The output of such an algorithm includes the probability of each topic in each document and of each word in each topic, which gives the user a concise representation of the topical structure of the collection and identifies the texts that best represent specific topics. This approach is augmented with automated sentiment analysis and with visualization of how ethnicity-related topics are distributed across the regions and time periods covered by the collection. Thus, an analyst gets a graphic representation of where, when, with what polarity (positive or negative), and in what specific context certain ethnic groups are discussed on the Web.
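To make the description above more concrete, here is a minimal Python sketch of how the output of a topic model might be read. It is not the project's software: the matrices `phi` and `theta`, the document metadata, the numeric sentiment scores, and all function names are hypothetical, and the values are invented for illustration only.

```python
from collections import defaultdict

# Hypothetical toy output of a topic model (all values invented).
# phi[t][w]   : probability of word w in topic t
# theta[d][t] : probability of topic t in document d
phi = {
    "topic_0": {"migration": 0.40, "border": 0.30, "work": 0.20, "music": 0.10},
    "topic_1": {"music": 0.50, "festival": 0.30, "work": 0.15, "border": 0.05},
}
theta = {
    "doc_a": {"topic_0": 0.9, "topic_1": 0.1},
    "doc_b": {"topic_0": 0.2, "topic_1": 0.8},
    "doc_c": {"topic_0": 0.6, "topic_1": 0.4},
}
# Hypothetical metadata per document: (region, month, sentiment in [-1, 1]).
meta = {
    "doc_a": ("Region X", "2015-01", -0.5),
    "doc_b": ("Region Y", "2015-01", 0.5),
    "doc_c": ("Region X", "2015-02", 0.0),
}


def top_words(topic, n=3):
    """Return the n most probable words of a topic."""
    words = phi[topic]
    return sorted(words, key=words.get, reverse=True)[:n]


def representative_docs(topic, n=2):
    """Return the n documents in which the topic has the highest probability."""
    return sorted(theta, key=lambda d: theta[d][topic], reverse=True)[:n]


def topic_profile(topic):
    """Sum the topic's weight and average its sentiment per (region, month).

    Sentiment is averaged with the topic probabilities as weights, so documents
    that barely touch the topic contribute little to its polarity.
    """
    weight = defaultdict(float)
    sentiment = defaultdict(float)
    for doc, (region, month, score) in meta.items():
        w = theta[doc][topic]
        weight[(region, month)] += w
        sentiment[(region, month)] += w * score
    return {k: (weight[k], sentiment[k] / weight[k]) for k in weight if weight[k] > 0}


if __name__ == "__main__":
    print(top_words("topic_0"))
    print(representative_docs("topic_0"))
    for key, (w, s) in sorted(topic_profile("topic_0").items()):
        print(key, round(w, 2), round(s, 2))
```

The per-region, per-month table returned by `topic_profile` is the kind of aggregate that a visualization of "where, when, and with what polarity" could be drawn from; how the project's own tools compute it is not specified here.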
Topic modelling for qualitative studies (companion page for the corresponding paper)
TopicMiner (software adjusted for the purposes of monitoring ethnicity)
1. Apishev M., Koltsov S., Koltsova O., Nikolenko S.I., Vorontsov K. Mining Ethnic Content Online with Additively Regularized Topic Models // Computación y Sistemas, vol. 20, no. 3, 2016, pp. 387–403.
2. Nikolenko S.I. Topic Quality Metrics Based on Distributed Word Representations // Proc. 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), 2016, pp. 1029–1032 (ACM DL).
3. Apishev M. Parallel Non-blocking Deterministic Algorithm for Online Topic Modeling // Accepted to Proc. 5th International Conference on Analysis of Images, Social Networks, and Texts (AIST 2016), 2016.
4. Bodrunova S. Who’s bad? Attitudes to re-settlers from post-Soviet South versus other nations in the Russian blogosphere // International Journal of Communication (under review).
/The full collection will be made available once the papers based on it have been published/
/The full dictionaries will be made available once the papers based on them have been published/
List of ethnonyms by group (TXT, 2 KB)