We recently presented a technical research paper at Black Hat USA 2018 called Don’t @ Me: Hunting Twitter Bots at Scale. This paper provides an in-depth look at the entire process of gathering a large Twitter dataset and using a practical data science approach to identify automated accounts within that dataset.
In the “Anatomy of Twitter Bots: Fake Followers” blog post, we explored fake followers, which are bots that follow accounts to artificially boost those accounts’ popularity. Among other things, we showed how examining an account in relation to its social network can reveal its purpose.
Fake followers are not the only type of Twitter bot. In this post, we explore another part of the bot ecosystem: bots that exist not to amplify accounts, but to amplify content through artificial retweets and likes. We call these amplification bots. We’ll look at the characteristics that make up amplification bots and how to build a crawler that can map out entire botnets of them.
01. Searching for Normal
Amplification bots boost content through likes and retweets. In this post, we focus on bots that inflate a tweet’s popularity by retweeting it. There are a couple of reasons why we focus on bots that spread information through retweets. First, as mentioned in the research paper, there is no Twitter API endpoint for determining which accounts liked a tweet. Second, we consider this automated retweeting of a tweet to be more damaging to social network conversation, since it actively spreads content as opposed to just artificially boosting the content’s popularity.
When looking for bots, we first have to ask the question: “What’s normal behavior?” Having used Twitter for a while, we would expect the number of likes for a particular tweet to be higher than the number of retweets, since liking a tweet is a lower-impact action. While this feels intuitively correct, we need to validate it with real data to see what normal really looks like in terms of the likes and retweets that tweets actually receive.
Fortunately, as part of our Don’t @ Me: Hunting Twitter Bots at Scale project, we collected a dataset of 576 million tweets that we can use to get hard numbers on what this ratio looks like at a large scale.
To get these results, we filtered our dataset for tweets that had greater than 50 retweets. This helped avoid the case where low numbers of likes and retweets could cause ratios that might skew the overall numbers.
Plotting these ratios shows that half of the tweets in our dataset have nearly a 2:1 ratio of likes to retweets, while 80 percent of the tweets have more likes than retweets (a ratio greater than 1:1).
In addition to examining the ratio of likes to retweets, we sought to understand the composition of an average user’s timeline (last 200 tweets). We expect bots that amplify content through retweets to have a timeline that is mostly, if not completely, composed of retweets, while normal users would have timelines that are a mixture of retweets, replies and tweets of original content.
In our dataset, we found that an average account’s timeline is 37.6 percent retweets, while a timeline at the 90th percentile is 75 percent retweets. Because our dataset of tweets does include accounts that exhibit bot-like characteristics, it’s important to note that the overall distribution of retweets in an account’s timeline may be affected by their behavior.
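To make the filtering and ratio calculation concrete, here is a minimal Python sketch. It assumes each tweet is a dictionary carrying `retweet_count` and `favorite_count` fields; the function names and the toy data are illustrative only and are not taken from our actual analysis code.

```python
from statistics import median


def like_to_retweet_ratios(tweets, min_retweets=50):
    """Return likes/retweets for every tweet with more than `min_retweets` retweets."""
    return [
        t["favorite_count"] / t["retweet_count"]
        for t in tweets
        if t["retweet_count"] > min_retweets
    ]


def summarize(ratios):
    """Median ratio, plus the share of tweets with more likes than retweets."""
    more_likes = sum(1 for r in ratios if r > 1)
    return {"median": median(ratios), "share_more_likes": more_likes / len(ratios)}


# Toy data (made up for illustration).
tweets = [
    {"retweet_count": 100, "favorite_count": 250},
    {"retweet_count": 60, "favorite_count": 120},
    {"retweet_count": 900, "favorite_count": 150},
    {"retweet_count": 10, "favorite_count": 1},  # dropped: 50 retweets or fewer
]

ratios = like_to_retweet_ratios(tweets)
summary = summarize(ratios)
```

Dropping tweets with few retweets keeps a tweet with, say, one retweet and zero likes from distorting the distribution.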
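As a rough sketch of how the retweet share of a timeline could be measured, the snippet below assumes each tweet is a dictionary and that a retweet is recognizable by the presence of a `retweeted_status` key. The helper name `retweet_share` is hypothetical.

```python
def retweet_share(timeline):
    """Fraction of tweets in `timeline` that are retweets (0.0 for an empty timeline)."""
    if not timeline:
        return 0.0
    retweets = sum(1 for tweet in timeline if "retweeted_status" in tweet)
    return retweets / len(timeline)


# Toy timeline: three retweets and one original tweet.
timeline = [
    {"text": "RT one", "retweeted_status": {"text": "one"}},
    {"text": "RT two", "retweeted_status": {"text": "two"}},
    {"text": "my own thought"},
    {"text": "RT three", "retweeted_status": {"text": "three"}},
]

share = retweet_share(timeline)
```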
Finally, we believe that a genuine account’s timeline will generally be in chronological order due to how content appears on the app or the web user interface and the tweets that they author themselves. On the other hand, we believe an amplification bot’s timeline would be scattered. The difference between the two timelines is illustrated below. The x-axis represents the order in which a tweet appears on a user’s timeline from newest to oldest, and the y-axis is the date a tweet was authored by the target user or the user whose tweet they retweeted.
From looking at the two charts, it is evident that the amplification bot’s timeline is indeed more scattered than Jordan’s (@jw_sec). To measure this quantitatively, we determined the inversion count, which, in this case, measures the sortedness of a user’s timeline. We would expect the inversion count of an amplification bot’s timeline to be higher than that of a genuine account. In our example above, the inversion count for Jordan’s timeline is 63, while the inversion count is 2028 for the amplification bot.
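One common way to compute an inversion count is with a merge sort, which counts out-of-order pairs in O(n log n) time. The sketch below is an illustration under assumptions, not necessarily the exact method we used: it takes the authored timestamps in timeline order (newest first) and counts how many pairs break strict reverse-chronological order. The function names are hypothetical.

```python
def count_inversions(values):
    """Count pairs (i, j) with i < j and values[i] > values[j], using merge sort."""

    def sort_count(seq):
        if len(seq) <= 1:
            return seq, 0
        mid = len(seq) // 2
        left, left_inv = sort_count(seq[:mid])
        right, right_inv = sort_count(seq[mid:])
        merged, inversions = [], left_inv + right_inv
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every remaining element of left.
                merged.append(right[j])
                j += 1
                inversions += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inversions

    return sort_count(list(values))[1]


def timeline_inversions(authored_newest_first):
    """Inversions relative to a timeline that should run from newest to oldest."""
    # Reversed, a well-ordered timeline is ascending, so any inversion is a misplaced pair.
    return count_inversions(list(reversed(authored_newest_first)))


ordered = timeline_inversions([5, 4, 3, 2, 1])    # perfectly chronological
scattered = timeline_inversions([5, 3, 4, 1, 2])  # two pairs out of order
```

Timestamps can be any comparable values, such as Unix epoch seconds. For a retweet, the value would be the date of the original tweet rather than the date of the retweet.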
02. So We Know What’s Normal, Now Let’s Find Bots
Now that we know what normal activity looks like, we can search for amplification bots that consistently have abnormal behavior. In our case, this means finding accounts that not only have a large number of retweets compared to original tweets, but whose retweeted statuses also consistently have a high retweet-to-like ratio.
Here’s an example of an amplification bot:
At first glance, there’s nothing about this account that screams that it is a bot. The screen name and display name both seem normal enough. The account doesn’t have many followers in comparison to the number of accounts it is following, but since people connect and use Twitter differently, that alone doesn’t make the account a bot let alone an amplification bot.
However, the account’s most recent (re)tweet has 969 retweets and 164 likes, which is strange. Most tweets with that many retweets won’t have a retweet-to-like ratio of almost 6:1. To put some numbers to how rare this is, only 0.2 percent of tweets in our dataset had at least 900 retweets and a similar retweet-to-like ratio.
As we continue down the account’s timeline, we see many other retweets that appear to be amplified given their retweet-to-like ratio. Also, the account appears never to have authored a tweet of its own. All of these characteristics point to this account being a potential amplification bot.
We can assume that no one purchases just one retweet. Moreover, accounts that are used to amplify one tweet are most likely used to amplify others. With that in mind, we wrote a script that, given the ID of an account that is deemed to be an amplification bot, can find other potential amplification bots.
Here’s how it works:
- Fetches the tweets of the seed account.
- Filters out original tweets or tweets that contain “retweet” in the text.
  - Most of the tweets with the string “retweet” in them aren’t amplified; rather, users are retweeting to pick one choice over another or to support a cause.
- Finds amplified tweets among the remaining retweets.
  - We deemed a tweet to be amplified if it had more than 50 retweets and a retweet-to-like ratio greater than five.
- Finds retweeters of amplified retweets through the use of the
  - This endpoint only returns 100 user IDs and only allows for 75 requests in a 15-minute window. Because of these constraints, we opted to only use this endpoint once per tweet.
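The filtering steps above can be sketched roughly as follows. This is a minimal illustration, not our actual script. It assumes tweets are dictionaries with `text`, `retweet_count` and `favorite_count` fields, and that a retweet carries the original tweet under `retweeted_status`. The case-insensitive match on “retweet” and the handling of tweets with zero likes are also assumptions.

```python
def is_amplified(tweet, min_retweets=50, min_ratio=5.0):
    """True if the tweet has more than 50 retweets and a retweet-to-like ratio above five."""
    retweets = tweet["retweet_count"]
    likes = tweet["favorite_count"]
    if retweets <= min_retweets:
        return False
    # Assumption: a tweet with many retweets and zero likes counts as amplified.
    return likes == 0 or retweets / likes > min_ratio


def amplified_retweets(timeline):
    """Return the original tweets behind the amplified retweets in `timeline`."""
    found = []
    for tweet in timeline:
        original = tweet.get("retweeted_status")
        if original is None:
            continue  # skip tweets the account authored itself
        if "retweet" in original.get("text", "").lower():
            continue  # skip "retweet to vote"-style tweets
        if is_amplified(original):
            found.append(original)
    return found


# Toy timeline; the first retweet uses the counts from the example account above.
timeline = [
    {"text": "my own tweet", "retweet_count": 0, "favorite_count": 2},
    {"retweeted_status": {"id": 1, "text": "Big news", "retweet_count": 969, "favorite_count": 164}},
    {"retweeted_status": {"id": 2, "text": "Retweet for pizza", "retweet_count": 5000, "favorite_count": 400}},
    {"retweeted_status": {"id": 3, "text": "Nice photo", "retweet_count": 120, "favorite_count": 300}},
]

amplified_ids = [t["id"] for t in amplified_retweets(timeline)]
```

In this toy timeline only the tweet with 969 retweets and 164 likes passes: its ratio is roughly 5.9, just above the threshold of five.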
Now that we have the IDs of the retweeters, we need to determine if they are, in fact, amplification bots. We made this determination by checking the following:
- If at least 90 percent of their tweets were retweets
- If at least ⅓ of their retweets were amplified
- If the number of inversions for their timeline was greater than 100
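Expressed as code, the three checks might look like the sketch below. The `TimelineStats` container and the function name are hypothetical; the thresholds are the ones listed above. Integer arithmetic is used so the 90 percent and one-third comparisons aren’t affected by floating-point rounding.

```python
from dataclasses import dataclass


@dataclass
class TimelineStats:
    """Precomputed counts for one account's timeline (hypothetical container)."""
    total_tweets: int
    retweets: int
    amplified_retweets: int
    inversions: int


def is_potential_amplification_bot(stats):
    """Apply all three criteria; an account must meet every one of them."""
    if stats.total_tweets == 0 or stats.retweets == 0:
        return False
    mostly_retweets = 10 * stats.retweets >= 9 * stats.total_tweets  # at least 90 percent
    often_amplified = 3 * stats.amplified_retweets >= stats.retweets  # at least one third
    scattered = stats.inversions > 100
    return mostly_retweets and often_amplified and scattered


# Toy accounts; the inversion counts reuse the two example timelines from earlier.
bot_like = is_potential_amplification_bot(TimelineStats(200, 190, 70, 2028))
genuine = is_potential_amplification_bot(TimelineStats(200, 75, 5, 63))
```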
If an account met all of those criteria, we would repeat the process of gathering their retweets, obtaining retweeters and determining if they were a potential amplification bot. When gathering a user’s retweets, we used the statuses/user_timeline API endpoint, which returns up to 200 of a user’s most recent tweets per request. Instead of analyzing all of a user’s retweets, we opted for breadth and only used those 200 tweets.
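Taken together, the process is a breadth-first crawl that starts from a seed account. The sketch below shows only the traversal logic. The three callables (`get_amplified_tweet_ids`, `get_retweeter_ids`, `is_bot`) are hypothetical stand-ins for the API calls and checks described above, and `max_accounts` is an assumed soft cap rather than something from our script.

```python
from collections import deque


def crawl(seed_id, get_amplified_tweet_ids, get_retweeter_ids, is_bot, max_accounts=1000):
    """Breadth-first search for potential amplification bots, starting from `seed_id`."""
    bots = {seed_id}
    checked = {seed_id}       # accounts already evaluated, so each is checked once
    seen_tweets = set()       # tweets already expanded, so retweeters are fetched once per tweet
    queue = deque([seed_id])

    while queue and len(bots) < max_accounts:
        account = queue.popleft()
        for tweet_id in get_amplified_tweet_ids(account):
            if tweet_id in seen_tweets:
                continue
            seen_tweets.add(tweet_id)
            for user_id in get_retweeter_ids(tweet_id):
                if user_id in checked:
                    continue
                checked.add(user_id)
                if is_bot(user_id):
                    bots.add(user_id)
                    queue.append(user_id)
    return bots, seen_tweets


# Toy graph standing in for live API responses.
amplified_by_account = {1: [100], 2: [100, 101], 3: []}
retweeters_by_tweet = {100: [1, 2, 9], 101: [3, 9]}
known_bots = {1, 2, 3}

bots, tweets = crawl(
    1,
    lambda account: amplified_by_account.get(account, []),
    lambda tweet: retweeters_by_tweet.get(tweet, []),
    lambda user: user in known_bots,
)
```

Tracking `seen_tweets` mirrors the choice to query retweeters only once per tweet, which matters given the rate limit on that lookup.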
After running this search for just over one day, we found over 7,000 potential amplification bots. In the video below, you can see the amplified tweets (in green) and the potential amplification bots (in black) that we found over time.
In theory, if we continued to let our script run, we would continue to find amplification bots.
In our approach, we established criteria by which we deemed an account to be an amplification bot or not. These criteria were based on the data that we had available along with some assumptions. There is the possibility that our thresholds were too lenient or too strict.
For example, we said a tweet was amplified if its retweet-to-like ratio was greater than 5 because of the distribution of this ratio among popular tweets. It’s certainly possible that there are amplified tweets with a lower ratio. Also, we chose to only get 100 retweeters from each tweet. If we had fetched every single retweeter for each tweet, we would have had more black nodes and fewer green nodes in the graph above.
One of the things that makes social networks great is that we are allowed to share content whether we are the author or someone else on the platform is. Moreover, on Twitter, we are able to express our endorsement and/or appreciation of this content and share it through a retweet. By artificially inflating the popularity of content, amplification bots not only affect how content spreads, but also its perceived credibility.
In this post, we showed how finding one bot and fetching its retweets can uncover thousands of additional bots.
For more information on how to gather a large Twitter dataset and find bots within that dataset, be sure to check out our research paper Don’t @ Me: Hunting Twitter Bots at Scale.