Sergei Koltcov on computational social science and topic modeling
The purpose of computational social science is to design better societies. That is what Alex Pentland, the founder of the MIT Media Lab, said in his keynote speech at the most representative conference on computational social science, organized by Aalto University in Helsinki in June 2015. “We were extremely lucky to have been accepted to this conference,” said Sergei Koltcov, deputy director of LINIS.
Olessia Koltsova, director of LINIS; Sergei Koltcov, deputy director of LINIS; and Ee-Peng Lim, director of the Living Analytics Research Centre, Singapore.
To design better societies, or even their separate subsystems, computational social science still has a long way to go, Sergei Koltcov thinks. It still has to learn to apply sophisticated mathematical models to real-life social tasks and to real social data, which are often incomplete and noisy and make it difficult to define ground truth. One such problem that Sergei investigates is related to LDA (latent Dirichlet allocation), a popular algorithm for automated extraction of topics from large text collections. In principle, it can be a powerful tool for social scientists, giving them a quick way to grasp the topical structure of millions of blogs, newspaper articles, medical records or other meaningful texts. But there is one catch: LDA can give you substantially different results every time you run it on the same data. That is, you cannot say anything reliable about the topical composition of the texts.
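To make the instability concrete, here is a minimal sketch in Python of how one might measure it. It is not the stabilization method developed at LINIS. It only assumes a very small collapsed Gibbs sampler for LDA, runs it on the same texts with different random seeds, and reports how much the resulting topics overlap. The toy corpus, the hyperparameters (`alpha`, `beta`, `n_iter`, `top_n`) and the function names `lda_gibbs` and `stability` are all invented for illustration.

```python
import random
from itertools import permutations

# Toy corpus, invented for illustration only.
DOCS = [
    "election vote party government policy".split(),
    "party election campaign vote minister".split(),
    "doctor patient hospital treatment clinic".split(),
    "patient treatment doctor medicine hospital".split(),
    "match team goal player league".split(),
    "team player league coach goal".split(),
]


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, top_n=4, seed=0):
    """Tiny collapsed Gibbs sampler for LDA; returns the top words of each topic."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_words = len(vocab)
    index = {w: i for i, w in enumerate(vocab)}
    doc_topic = [[0] * n_topics for _ in docs]            # tokens per (document, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assign = []                                            # topic of every token
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(n_topics)  # random initialisation: the source of run-to-run variation
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assign.append(row)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, wi = assign[d][i], index[w]
                # Remove the token, then resample its topic from the conditional distribution.
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1
    topics = []
    for k in range(n_topics):
        ranked = sorted(range(n_words), key=lambda wi: (-topic_word[k][wi], vocab[wi]))
        topics.append(frozenset(vocab[wi] for wi in ranked[:top_n]))
    return topics


def jaccard(a, b):
    """Share of words two topics have in common."""
    return len(a & b) / len(a | b)


def stability(run_a, run_b):
    """Best average Jaccard overlap over all one-to-one matchings of topics."""
    k = len(run_a)
    return max(
        sum(jaccard(run_a[i], run_b[p[i]]) for i in range(k)) / k
        for p in permutations(range(k))
    )


if __name__ == "__main__":
    runs = [lda_gibbs(DOCS, n_topics=3, seed=s) for s in range(5)]
    for s in range(1, 5):
        print(f"seed 0 vs seed {s}: stability = {stability(runs[0], runs[s]):.2f}")
```

A score of 1.0 means two runs produced the same top words for every topic; lower scores mean the topics drifted between runs. Matching topics by brute force over all permutations is only practical for a handful of topics, which is enough for a demonstration of this kind.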
This problem was overlooked for some time because the quality of LDA was evaluated with formal metrics that are usually applied by mathematicians (and these metrics scored LDA high). Social scientists could only observe the algorithm’s flaws without being able to offer solutions. The situation began to change when physicists entered the game, joining forces with both mathematicians and social scientists. “That’s what computational social science is about,” says Sergei. “It is about interdisciplinary collaboration and synergetic results.” His method for stabilizing the algorithm, developed at the Laboratory for Internet Studies, was well received by the audience. “We have got useful feedback from colleagues from the UK, Finland, Qatar and Singapore. We hope that some of these contacts will grow into collaboration.”