Quotation Lutzky, Ursula, Kehoe, Andrew. 2016. "Your blog is (the) shit". A corpus linguistic approach to the identification of swearing in computer mediated communication. International Journal of Corpus Linguistics, 21 (2), 165-191.




The study of swearing has increased in the last decade, diversifying to include a wider range of data and methods of analysis. Nevertheless, certain types of data and specifically large corpora of computer mediated communication (CMC) have not been studied extensively. In this paper, we fill a gap in research by studying the use of swearwords in blog data, and illustrate ways of identifying swearing in a large corpus by taking context into account. This approach, based on the examination of shared and unique collocates of known expletives, facilitates the distinction of attestations of swearing from non-swearing in the case of polysemous lexemes, and the analysis of overlaps in usage and meaning of swearwords. This work therefore goes beyond basic sentiment analysis and offers new insights into the use of collocation for refining profanity filters, providing innovative perspectives on issues of growing importance as online interaction becomes more widespread.



Publication's profile

Status of publication Published
Affiliation WU
Type of publication Journal article
Journal International Journal of Corpus Linguistics
Citation Index SSCI
Language English
Title "Your blog is (the) shit". A corpus linguistic approach to the identification of swearing in computer mediated communication
Volume 21
Number 2
Year 2016
Page from 165
Page to 191
Reviewed? Y
URL https://benjamins.com/#catalog/journals/ijcl.21.2.02lut/details
DOI http://dx.doi.org/10.1075/ijcl.21.2.02lut


Lutzky, Ursula
Kehoe, Andrew (Birmingham City University, United Kingdom)
Institute for English Business Communication IN
Research areas (ÖSTAT Classification 'Statistik Austria')
6604 Applied linguistics
6611 Linguistics
6633 Computational linguistics
6643 Synchronic linguistics