Authors:
Kostiantyn Kucher 1; Nellie Engström 1; Wilma Axelsson 1; Berkant Savas 1,2; and Andreas Kerren 1,3
Affiliations:
1 Department of Science and Technology, Linköping University, Norrköping, Sweden
2 iMatrics AB, Linköping, Sweden
3 Department of Computer Science and Media Technology, Linnaeus University, Växjö, Sweden
Keyword(s):
Information Visualization, Text Visualization, Natural Language Processing, News, Editorial Media, Swedish Language, Journalism.
Abstract:
The amount of available text data has increased rapidly in recent years, making it difficult for many users to find relevant information. To address this, natural language processing (NLP) and text visualization methods have been developed; however, they typically focus on English texts only, and support for low-resource languages is limited. The aim of this design study was to implement a visualization prototype for exploring a large number of Swedish news articles (made available by industrial collaborators), including the temporal and relational aspects of the data. Sketches of three visual representations were designed and evaluated through user tests involving both our collaborators and end users (journalists). Next, an NLP pipeline was designed to support dynamic and hierarchical topic modeling. The final part of the study resulted in an interactive visualization prototype that uses a variation of area charts to represent topic evolution. The prototype was evaluated through an internal case study and user tests with two groups of participants with backgrounds in journalism and NLP. The evaluation results reveal the participants' preference for the representation focusing on top topics rather than the topic hierarchy. Suggestions for future work relevant to Swedish text data visualization are also provided.