
It has been my experience that faculty, students, and other researchers have no shortage of compelling research questions that require Twitter data. However, many face an immediate barrier in understanding the options for acquiring that data. The purpose of this blog post is to describe the options for getting Twitter data for academic research in the hopes of lowering at least that initial barrier.

Just as the research to be performed is varied, so are the requirements for Twitter data.

- Are historical tweets needed? Or current tweets?
- Is a complete dataset needed (i.e., every tweet that meets criteria) or is an incomplete or sampled dataset acceptable?

In addition, other relevant factors of the research include:

- Does the researcher have funding to acquire Twitter data?
- Does the researcher need to share the Twitter dataset as part of publication / reproducible research?
- What are the technical skills of the researcher?
- How will the researcher be performing analysis? With her own tools? Or would analytic tools for Twitter data be beneficial?

These factors will determine the most appropriate means of acquiring a Twitter dataset.

There are 4 primary ways of acquiring Twitter data (and I’m not including “cutting and pasting” from the Twitter website!), including:

- Retrieve from the Twitter public API.
- Access or purchase from a Twitter service provider.

Retrieve from the Twitter public API

API is short for “Application Programming Interface” and in this case is a way for software to access the Twitter platform (as opposed to the Twitter website, which is how humans access Twitter). While supporting a large number of functions for interacting with Twitter, the API functions most relevant for acquiring a Twitter dataset include:

- Retrieving tweets from a user timeline (i.e., the list of tweets posted by an account).
- Filtering real-time tweets (i.e., the tweets as they are passing through the Twitter platform upon posting).

While you can write your own software for accessing the Twitter API, a number of tools already exist. They are quite varied in their capabilities and require different levels of technical skills and infrastructure.

- Software libraries (e.g., Tweepy for Python and rtweet for R).
- Web applications (e.g., DMI-TCAT and our very own Social Feed Manager).
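
To make the user timeline option a little more concrete, here is a minimal Python sketch of the kind of loop that "write your own software" usually involves: walking backwards through an account's tweets one page at a time. It makes no network calls. The names `collect_timeline`, `fetch_page`, and `make_fake_fetcher` are hypothetical, and the paging scheme (newest-first pages, with a `max_id` cursor pointing at the oldest tweet still wanted) is an assumption modeled on how Twitter's v1.1 timeline endpoint paged, so treat it as an illustration rather than as the API itself. With a library such as Tweepy, `fetch_page` would be a thin wrapper around the library's own timeline call.

```python
from typing import Callable, Dict, List, Optional

Tweet = Dict[str, object]
# A page fetcher takes (max_id, count) and returns tweets newest-first.
FetchPage = Callable[[Optional[int], int], List[Tweet]]


def collect_timeline(fetch_page: FetchPage, page_size: int = 200,
                     limit: Optional[int] = None) -> List[Tweet]:
    """Walk backwards through a timeline one page at a time."""
    collected: List[Tweet] = []
    max_id: Optional[int] = None  # None means "start from the newest tweet"
    while limit is None or len(collected) < limit:
        page = fetch_page(max_id, page_size)
        if not page:
            break  # no older tweets are available
        collected.extend(page)
        # Next, ask for tweets strictly older than the oldest one seen so far.
        max_id = min(int(t["id"]) for t in page) - 1
    return collected if limit is None else collected[:limit]


def make_fake_fetcher(tweets: List[Tweet]) -> FetchPage:
    """Hypothetical stand-in for a real API call, serving an in-memory list."""
    ordered = sorted(tweets, key=lambda t: int(t["id"]), reverse=True)

    def fetch_page(max_id: Optional[int], count: int) -> List[Tweet]:
        eligible = [t for t in ordered
                    if max_id is None or int(t["id"]) <= max_id]
        return eligible[:count]

    return fetch_page


if __name__ == "__main__":
    sample = [{"id": i, "text": f"tweet {i}"} for i in range(1, 8)]
    timeline = collect_timeline(make_fake_fetcher(sample), page_size=3)
    print([t["id"] for t in timeline])
```

Passing the fetcher in as a function keeps the paging logic separate from credentials and rate limits, which is also roughly the work that the existing tools take off your hands.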