Jan Zizka: Mining multilingual online hotel reviews. Lab seminar at LINIS
On November 26, 2012, the Internet Studies Lab held a workshop featuring guest expert Jan Zizka, a professor at Mendel University in Brno, a member of the programme committee of the CoNeCo (Computer Networks & Communications) conference, and an honorary member of the Academy & Industry Research Collaboration Center.
In the first part of the meeting, Jan spoke about his study of hotel reviews left by booking.com visitors. A large collection of reviews in different languages (from English to Japanese!) was analyzed for keywords that reflect positive and negative aspects of a hotel’s operations. These words were selected by building a decision tree that minimizes entropy: at each step, the tree splits the reviews on the word that best separates positive reviews from negative ones.
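The idea behind entropy-based keyword selection can be illustrated with a minimal Python sketch. It is not the code used in the study: the four reviews are invented toy data, and the names `entropy`, `information_gain` and `rank_keywords` are hypothetical helpers. The sketch ranks words by how much splitting the reviews on a word's presence reduces label entropy, which is the criterion a decision tree would use to choose its first split.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(reviews, word):
    """Entropy reduction obtained by splitting reviews on the presence of `word`."""
    labels = [label for _, label in reviews]
    with_word = [label for text, label in reviews if word in text.lower().split()]
    without_word = [label for text, label in reviews if word not in text.lower().split()]
    total = len(reviews)
    # Weighted average entropy of the two groups after the split.
    remainder = (
        (len(with_word) / total) * entropy(with_word)
        + (len(without_word) / total) * entropy(without_word)
    )
    return entropy(labels) - remainder


def rank_keywords(reviews):
    """Return (gain, word) pairs, highest gain first, ties broken alphabetically."""
    vocabulary = sorted({w for text, _ in reviews for w in text.lower().split()})
    scored = [(information_gain(reviews, w), w) for w in vocabulary]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Invented toy data for illustration only; not taken from the study.
REVIEWS = [
    ("clean room friendly staff", "positive"),
    ("friendly staff great breakfast", "positive"),
    ("dirty room noisy street", "negative"),
    ("noisy room rude staff", "negative"),
]

if __name__ == "__main__":
    for gain, word in rank_keywords(REVIEWS):
        print(f"{word}: {gain:.3f}")
```

On this toy data, "friendly" and "noisy" each separate the two classes perfectly and therefore receive the highest gain, while a word such as "room", which appears in both positive and negative reviews, ranks much lower. A full decision tree would repeat the same selection recursively on each resulting subset.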
The second part of the meeting was dedicated to text mining, that is, methods of processing text corpora and extracting implicit information from them. Sergey Koltsov, Senior Research Fellow at LINIS, presented the Lab’s results in this area. Jan Zizka told the audience about a study on building a thesaurus of positive and negative restaurant reviews in English and Czech using the gCluto software. He also described the problems he faced: false positives and false negatives produced by the system, which stem from the fact that positive and negative evaluations can be expressed with different, often ambiguous, words.
A lively discussion took place throughout the workshop. The participants had many questions for each other, as well as comments and advice for further research.