Whose tweet? Authorship analysis of micro-blogs and other short form messages

MacLeod, Nicola and Grant, Tim (2012) Whose tweet? Authorship analysis of micro-blogs and other short form messages. In: IAFL10 - The International Association of Forensic Linguists Tenth Biennial Conference, 11th - 14th July 2011, Birmingham, UK.

Full text not available from this repository.

Approaches to authorship attribution have traditionally been constrained by the size of the message to which they can be successfully applied, making them unsuitable for analysing shorter messages such as SMS text messages, micro-blogs (e.g. Twitter) or instant messaging. A large pool of potential authors for a set of texts (as in, for example, an online context) has also proved problematic for traditional descriptive methods, which have mostly been applied successfully where there is a small, closed set of possible authors.

This paper reports the findings of a project which aimed to develop and automate techniques from forensic linguistics that have been successfully applied to the analysis of short message content in criminal cases. Using data drawn from UK-focused online groups within Twitter, the research extends the applicability of Grant’s (2007; 2010) stylistic and statistical techniques for the analysis of authorship of short texts into the online environment. Initial identification of distinctive textual features commonly found within short messages allows for the development of a taxonomy which can then be used when calculating the ‘distance’ between messages containing instances of these feature types. The end result is an automated process with a high level of success in assigning tweets to the correct author. The research has the potential to extend the scope of reliable and valid authorship analysis into hitherto unexplored contexts. Given the relative anonymity of the internet and the availability of cloaking technology, linguistic research of this nature represents a crucial contribution to the investigative toolkit.
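The abstract does not name the feature types in the taxonomy or the distance measure used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a small, hypothetical set of regex-defined features (the names in `FEATURES` are invented for illustration), codes each message by which features are present, uses Jaccard distance as a stand-in measure, and assigns a disputed message to the candidate whose known messages are closest on average.

```python
import re
from statistics import mean

# Hypothetical feature taxonomy (an assumption, not taken from the paper):
# each feature type is detected by a simple regular expression.
FEATURES = {
    "letter_for_word": re.compile(r"\b[urcb]\b", re.I),
    "digit_homophone": re.compile(r"\b(?:2|4|gr8|l8r|2day|2moro)\b", re.I),
    "missing_apostrophe": re.compile(r"\b(?:dont|cant|wont|im|ive|thats)\b", re.I),
    "repeated_punctuation": re.compile(r"[!?]{2,}"),
    "ellipsis": re.compile(r"\.{2,}"),
}


def extract_features(message: str) -> frozenset:
    """Return the set of feature types that occur at least once in the message."""
    return frozenset(name for name, rx in FEATURES.items() if rx.search(message))


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty feature sets are treated as identical."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def attribute(message: str, known: dict) -> str:
    """Pick the candidate whose known messages have the lowest mean distance."""
    query = extract_features(message)
    scores = {
        author: mean(jaccard_distance(query, extract_features(m)) for m in msgs)
        for author, msgs in known.items()
    }
    return min(scores, key=scores.get)


# Invented example messages for two hypothetical candidate authors.
known_messages = {
    "author_a": ["c u 2moro!!", "r u ok??"],
    "author_b": ["I dont know...", "thats fine... im in"],
}
print(attribute("u r l8r!!", known_messages))
```

Presence/absence coding suits very short texts, where most features occur at most once per message. A realistic system would need a far larger taxonomy and an evaluation on held-out messages before any claim about accuracy could be made.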

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: authorship analysis, stylistic methods, statistical methods, online messaging
Subjects: Q100 Linguistics
Department: Faculties > Arts, Design and Social Sciences > Humanities
Depositing User: Paul Burns
Date Deposited: 13 May 2019 09:40
Last Modified: 10 Oct 2019 19:02
URI: http://nrl.northumbria.ac.uk/id/eprint/39280
