Linguistic Inquiry and Word Count
May 18, 2011
Quantitative analysis of the number of edits over time to Wikipedia pages related to traumatic events can reveal interesting patterns that may be associated with collective memory processes. Indeed, it seems that people tend to contribute more to these articles and talk pages during anniversaries, suggesting the presence of commemorative activities.
But if you want to study the formation of collective memory about traumatic events on Wikipedia, you cannot ignore the actual content of articles and talk pages. Content analysis can be done by hand by going through the edits to a few pages, but if you want to analyze, say, 100 articles related to traumatic events (together with their talk pages), you may want an automated tool for computerized text analysis.
One of these tools is the Linguistic Inquiry and Word Count (LIWC; Pennebaker, Francis, & Booth, 2001), which searches a text for words belonging to more than 70 categories (for instance, linguistic categories such as pronouns, articles and tenses, or psychological categories such as social, affective and cognitive processes).
LIWC makes it possible to explore the patterns of language used in a text, which can help identify the psychological processes of its author. For instance, Cohn and colleagues (2004) used LIWC to analyze blog posts from the two months before and after the 11 September 2001 attacks. They found signs of pronounced psychological changes in the language used by bloggers, mainly an increase in words associated with negative emotions, cognitive processing, social engagement and psychological distancing during the first days following the attacks.
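To make the idea of dictionary-based word counting concrete, here is a minimal Python sketch. It is only an illustration: the category names and word lists below are made up for the example and are not the actual LIWC dictionary, and the function names are hypothetical. It assumes that a trailing "*" in a dictionary entry means "match any word starting with this stem", and that each category is reported as a percentage of all words in the text.

```python
import re
from collections import Counter

# Toy dictionary: categories and word lists are illustrative assumptions,
# NOT the real LIWC dictionary. A trailing "*" marks a prefix (stem) match.
TOY_DICTIONARY = {
    "negemo": ["sad", "fear*", "angry", "terribl*"],
    "posemo": ["happy", "hope*", "good"],
    "pronoun": ["i", "we", "you", "they"],
}


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def matches(word, pattern):
    """Return True if the word matches an exact entry or a stem entry."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


def category_percentages(text, dictionary=TOY_DICTIONARY):
    """Return, for each category, the percentage of words that fall in it."""
    words = tokenize(text)
    counts = Counter()
    for word in words:
        for category, patterns in dictionary.items():
            if any(matches(word, p) for p in patterns):
                counts[category] += 1
    total = len(words)
    return {c: (100.0 * counts[c] / total if total else 0.0) for c in dictionary}


if __name__ == "__main__":
    sample = "We were sad and fearful, but we hoped for good news."
    for category, share in category_percentages(sample).items():
        print(f"{category}: {share:.1f}%")
```

A word can belong to more than one category in this sketch, so the percentages do not have to add up to 100.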
Our aim is to adapt this methodology and extend it to a larger sample: Wikipedia pages about traumatic events, where collective memory is built over time by thousands of editors. To this end we have developed PyWC, a software tool similar to LIWC, and released it as open source. We are currently analyzing whether there are statistically significant differences between pages about traumatic events and other Wikipedia pages in the use of language associated with various psychological categories, such as words related to positive and negative emotions. We are also interested in how these indices evolve over time in pages related to traumatic events, our hypothesis being that this is another way to detect the gradual transformation from communicative to cultural memory.
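As a rough illustration of what testing for such a difference could look like, the sketch below compares the per-page scores of one category between two groups of pages using a permutation test. This is an assumption made for the example, not necessarily the test used in our analysis, and the function name, parameters and scores are hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical negative-emotion percentages per page (made-up numbers).
    traumatic_pages = [5.0, 5.2, 4.9, 5.1, 5.3, 5.0]
    other_pages = [1.0, 1.2, 0.9, 1.1, 1.3, 1.0]
    print(permutation_test(traumatic_pages, other_pages, n_permutations=2000))
```

A permutation test is used here only because it needs no assumptions about how the scores are distributed and runs with the standard library alone.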