This paper examines what social media mining methods and tools are on offer, how they should best be used, and what sorts of ethical issues are raised by their use.
The ongoing production of staggeringly huge volumes of digital data is a ubiquitous part of life in the early twenty-first century. A large proportion of this data is text. This development has serious implications for almost all scholarly endeavour. It is now possible for researchers from a wide range of disciplines to use text mining techniques and software tools in their daily practice. In our own field of political communication, the prospect of cheap access to what, how, and to whom very large numbers of citizens communicate in social media environments provides opportunities that are often too good to miss as we seek to understand how and why citizens think and feel the way they do about policies, political organizations, and political events. But what are the methods and tools on offer, how should they best be used, and what sorts of ethical issues are raised by their use?
In this chapter we proceed as follows. First, we provide a basic definition of text mining. Second, we provide examples of how text mining has been used recently in a diverse range of analytical contexts, from business to media to politics. Third, we discuss the challenges of conducting text mining in online social media environments, focusing on issues such as the problem of gaining access to social media data, research ethics, and the integrity of the data corpuses that are available from social media companies. Fourth, we present a basic but comprehensive survey of the text mining tools that are currently available. Finally, we present two brief case studies of the application of text mining in the authors’ field of political communication: a research project that analysed political discussion on the popular social media service Twitter during the British general election of 2010, and a study of the early-2010 ‘Bullygate’ crisis in British politics. We conclude with some observations about the proper place of text mining in social science research. Our overall argument is that text mining is at its most useful when it brings together quantitative and qualitative modes of enquiry. The technology can be powerful but it is often a blunt instrument. Human intervention is always necessary during the research process in order to refine the analysis. Indeed, rather than assuming that text mining software and big datasets will do the work, social science researchers would be wise to begin any project from the assumption that they will need to combine text mining tools with more traditional approaches to the study of social phenomena.
Authored by Lawrence Ampofo, Simon Collister, Ben O’Loughlin, and Andrew Chadwick.