Social networks like Twitter, Facebook and Instagram allow people to share content and build communities. Each user has their own social network of other users they're connected to. This means that we can think of a social network as a large graph, with accounts as nodes in the graph and the connections between accounts as edges.
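As a minimal sketch of this idea, a follow graph can be stored as a mapping from each account to the set of accounts it follows. The account names below are made up for illustration.

```python
# Hypothetical accounts: each key follows the accounts in its set.
follows = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": set(),
}

# Nodes are all accounts that appear as a follower or as someone followed.
nodes = set(follows) | {target for targets in follows.values() for target in targets}

# Edges are directed pairs: (follower, followed).
edges = sorted(
    (source, target) for source, targets in follows.items() for target in targets
)

# Mutual connections are pairs of accounts that follow each other.
mutual = sorted(
    {tuple(sorted((a, b))) for a, b in edges if a in follows.get(b, set())}
)

print(len(nodes), "nodes,", len(edges), "edges, mutual pairs:", mutual)
```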
Earlier this month, we presented a technical research paper at Black Hat USA 2018 called Don’t @ Me: Hunting Twitter Bots at Scale that details the process of gathering a large public Twitter dataset and finding automated accounts (bots) within that dataset. In this research, we examined a large botnet consisting of over 15,000 bots that actively spread a cryptocurrency giveaway scam. Moreover, we showed how mapping out the connections between accounts allowed us to discover the structure and organization of the botnet.
Visually mapping social network connections reveals patterns that may otherwise be hidden in the data. If that weren’t enough, these graphs also serve as compelling pieces of generated artwork.
This post shows the step-by-step process to create a graph of your own social network using Gephi.
Introduction to Gephi
Gephi is open-source software that makes it easy to generate beautiful layouts of graphs and networks. The graphs generated by Gephi can be explored, analyzed, filtered and modified.
Gephi offers many options for formatting a graph so that it tells your story as effectively as possible. Everyone's workflow is different, but my standard process when building a graph with Gephi is:
- Import the GEXF file
- Apply a layout
- Color/resize the nodes and edges by attribute or through automatic community detection
- Export the resulting SVG
While our previous work focused on using this process to map relationships between bots, the same process can be applied to any graph of social networks.
To show how this process works, let's take a look at how to create a map of your own social network.
Graphing Your Social Network
Gathering the Data
In order to make a graph of a social network, we first need to gather the data. As part of our research, we open-sourced a script, crawl_network.py, that crawls the social network for a user and exports the results in GEXF format.
To fetch my own social network, I can run the script like this:
python crawl_network.py --degree=1 --max-connections=5000 --root-connections jw_sec
The first thing to consider when running the script is that gathering connections is a slow process. To gather the data, we use the followers/ids and friends/ids API endpoints. At the time of this writing, these endpoints are rate limited to 15 requests per 15 minutes. This means that crawling thousands of accounts may take multiple days. We can limit how many degrees of separation we crawl using the --degree flag. We can further limit the number of connections we fetch per account using the --max-connections flag.
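To get a feel for how quickly the rate limit adds up, here is a rough back-of-the-envelope estimate. It is only a sketch under two assumptions that are not stated in this post: each request returns up to 5,000 IDs, and the two endpoints are rate limited separately so they can be crawled side by side. The function name is hypothetical.

```python
import math

REQUESTS_PER_WINDOW = 15  # rate limit quoted in this post
WINDOW_MINUTES = 15
IDS_PER_REQUEST = 5000    # assumption: page size of a single IDs request


def estimate_crawl_hours(accounts, max_connections):
    """Rough lower bound on crawl time for one endpoint, in hours."""
    # Each account needs at least one request, more if we page through results.
    requests_per_account = max(1, math.ceil(max_connections / IDS_PER_REQUEST))
    total_requests = accounts * requests_per_account
    minutes = total_requests * WINDOW_MINUTES / REQUESTS_PER_WINDOW
    return minutes / 60


hours = estimate_crawl_hours(accounts=2700, max_connections=5000)
print(f"{hours:.0f} hours (~{hours / 24:.1f} days)")  # 45 hours (~1.9 days)
```

Under these assumptions, a network of roughly 2,700 accounts already takes close to two days, which is why limiting the crawl matters.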
The next thing to consider is how much data Gephi can handle. When crawling social networks, it's easy to generate graphs with hundreds of thousands of nodes and edges. Large graphs reduce Gephi’s ability to quickly apply layouts. To help manage the size of our graph, we can use the --root-connections flag to only map connections between nodes that are immediately in our social network.
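The idea behind this kind of filtering can be sketched as follows. This is an illustration of the concept, not the script's actual implementation, and the function and account names are hypothetical.

```python
def keep_root_connections(root, edges):
    """Keep only edges whose endpoints are the root or directly connected to it."""
    # Collect accounts one hop from the root, in either direction.
    neighbors = {root}
    for source, target in edges:
        if source == root:
            neighbors.add(target)
        elif target == root:
            neighbors.add(source)
    # Drop any edge that reaches outside the root's immediate network.
    return [(s, t) for s, t in edges if s in neighbors and t in neighbors]


edges = [
    ("root", "a"),       # root follows a
    ("b", "root"),       # b follows root
    ("a", "b"),          # both ends are in the root's network: kept
    ("a", "stranger"),   # "stranger" is two hops away: dropped
]
filtered = keep_root_connections("root", edges)
print(filtered)  # [('root', 'a'), ('b', 'root'), ('a', 'b')]
```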
Running the script outputs two files: the raw JSON results in ndjson format, and a GEXF file for use in Gephi.
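Before opening a GEXF file in Gephi, it can help to check how large the graph is. The sketch below counts nodes and edges using only Python's standard library. The sample document is hand-written for illustration and is not output from the script.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written GEXF document standing in for a real crawl result.
SAMPLE_GEXF = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph mode="static" defaultedgetype="directed">
    <nodes>
      <node id="1" label="alice"/>
      <node id="2" label="bob"/>
    </nodes>
    <edges>
      <edge id="0" source="1" target="2"/>
    </edges>
  </graph>
</gexf>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def count_graph(gexf_text):
    """Return (node_count, edge_count) for a GEXF document given as text."""
    root = ET.fromstring(gexf_text)
    nodes = sum(1 for element in root.iter() if local_name(element.tag) == "node")
    edges = sum(1 for element in root.iter() if local_name(element.tag) == "edge")
    return nodes, edges


print(count_graph(SAMPLE_GEXF))  # (2, 1)
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring`, where `path` is wherever your GEXF file was written.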
Visualizing the Graph
When opening up the GEXF file in Gephi, we're presented with a large group of nodes clustered together.
We first want to clean up the layout of the graph, which will make it easier to see how the nodes and edges are structured.
Graphs are visualized using layouts, which are algorithms that position nodes and edges in different ways. Our graph has 2,700 nodes, making it a good candidate for using Force Atlas 2 as the layout. You can adjust settings as needed in the “Layout” window, but I'll typically apply a "Scaling" factor greater than 1,000 to help spread out the nodes. Then, I'll enable the "Stronger Gravity" option to help keep the overall graph contained.
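To build some intuition for what a force-directed layout does, here is a toy version in plain Python. It is not Gephi's Force Atlas 2 implementation: every pair of nodes repels, every edge attracts, and a gravity term pulls nodes toward the center. The `scaling` and `gravity` parameters only loosely mirror the "Scaling" and "Stronger Gravity" options, and all names and default values are assumptions chosen for illustration.

```python
import math
import random


def toy_layout(nodes, edges, iterations=200, scaling=10.0, gravity=1.0, step=0.01):
    """Toy force-directed layout; returns {node: [x, y]}."""
    rng = random.Random(42)  # fixed seed so repeated runs give the same result
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    degree = {n: 0 for n in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1

    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Repulsion: every pair pushes apart, more strongly for high-degree nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = scaling * (degree[a] + 1) * (degree[b] + 1) / dist
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        # Attraction: each edge pulls its endpoints together, proportional to distance.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            force[a][0] -= dx
            force[a][1] -= dy
            force[b][0] += dx
            force[b][1] += dy
        # Gravity: a pull toward the origin that grows with distance from it.
        for n in nodes:
            force[n][0] -= gravity * (degree[n] + 1) * pos[n][0]
            force[n][1] -= gravity * (degree[n] + 1) * pos[n][1]
        # Move each node a small, capped distance along its net force.
        for n in nodes:
            fx, fy = force[n]
            magnitude = math.hypot(fx, fy)
            if magnitude > 0:
                move = min(step * magnitude, 1.0)
                pos[n][0] += fx / magnitude * move
                pos[n][1] += fy / magnitude * move
    return pos


# Two triangles joined by a single edge.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
positions = toy_layout(nodes, edges)
for name, (x, y) in positions.items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

In this sketch, a larger `scaling` value strengthens repulsion and should spread nodes further apart, while a larger `gravity` value should keep the whole graph more compact.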
These are the options I used to graph my social network:
After setting our options, we can click "Run" to apply the layout. I let this run for a few moments to stabilize before hitting "Stop," resulting in the following graph:
You’ll notice that the nodes in my social network are tightly connected to one another, creating a circular graph. Most of the accounts in my network are related to infosec, so it’s expected that many of them follow each other.
If you want a black and white graph, you can skip to the "Exporting a Work of Art" section below. Otherwise, let's add some color to our graph.
Adding Some Color
Gephi allows you to assign colors to nodes using attributes. Some attributes can be provided in the GEXF file (such as whether an account is a bot or not), while others can be calculated in Gephi itself.
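As a hedged sketch of how an attribute can be carried in a GEXF file, the snippet below builds a small document in which each node has a boolean attribute. The `is_bot` attribute name, the function name, and the sample accounts are assumptions for illustration, and edges are omitted to keep the example short.

```python
import xml.etree.ElementTree as ET

NS = "http://www.gexf.net/1.2draft"
ET.register_namespace("", NS)  # serialize GEXF as the default namespace


def tag(name):
    """Return a namespace-qualified tag name for ElementTree."""
    return f"{{{NS}}}{name}"


def build_gexf(accounts):
    """Build a GEXF tree from (label, is_bot) pairs."""
    gexf = ET.Element(tag("gexf"), version="1.2")
    graph = ET.SubElement(gexf, tag("graph"), defaultedgetype="directed")
    # Declare the attribute once, then reference it by id on each node.
    attributes = ET.SubElement(graph, tag("attributes"), {"class": "node"})
    ET.SubElement(attributes, tag("attribute"), id="0", title="is_bot", type="boolean")
    nodes = ET.SubElement(graph, tag("nodes"))
    for node_id, (label, is_bot) in enumerate(accounts):
        node = ET.SubElement(nodes, tag("node"), id=str(node_id), label=label)
        attvalues = ET.SubElement(node, tag("attvalues"))
        value = "true" if is_bot else "false"
        ET.SubElement(attvalues, tag("attvalue"), {"for": "0", "value": value})
    return gexf


xml_text = ET.tostring(build_gexf([("alice", False), ("spam_bot", True)]), encoding="unicode")
print(xml_text)
```

Once a file like this is imported, the attribute should be available for partitioning and coloring in the same way as the attributes Gephi calculates itself.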
A common practice is to color groups of nodes by communities. Gephi can run an algorithm that determines which nodes are likely in the same community based on their connections, and then color each community differently. This is useful when we want to find groups of users within a population.
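Community detection of this kind tries to find a grouping of nodes with a high modularity score, which measures how many edges fall inside communities compared with what their degrees alone would predict. The sketch below computes that score for a given grouping. It does not search for the best grouping as Gephi does, and it treats the graph as undirected and unweighted for simplicity. The function name and sample graph are made up for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends inside a community
    degree_sum = defaultdict(int)  # total degree of the nodes in a community
    for a, b in edges:
        degree_sum[community[a]] += 1
        degree_sum[community[b]] += 1
        if community[a] == community[b]:
            internal[community[a]] += 1
    # Q = sum over communities of (internal edge share - expected share).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]

split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {name: 0 for name in split}

print(round(modularity(edges, split), 3))   # 0.357
print(round(modularity(edges, merged), 3))  # 0.0
```

Splitting the graph into its two triangles scores higher than lumping every node together, which matches the intuition that the triangles are separate communities.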
To identify the communities in a graph, run the "Modularity" option located under the "Statistics" sidebar, accepting the default options. After this is completed, you'll be presented with a screen showing how many communities were found and the number of nodes in those communities.
Running this process also adds an attribute that can be used to assign colors. To color the graph based on these communities, select the "Nodes" tab under the "Appearance" panel in the left sidebar. Then, select the nested "Partition" tab. Finally, select "Modularity Class" as the attribute you want to use for assigning color.
Gephi will automatically assign colors for you, but these can be changed. For now, we'll keep the defaults and click "Apply" to color the graph.
It’s interesting to explore the communities that Gephi uncovers. In my case, most of the communities were still related to infosec, but the green community in the bottom left of the graph consisted largely of accounts of software developers or designers.
With the colors applied, we're ready to export our graph!
Exporting a Work of Art
Now we have a graph that is organized and colored by communities. All that’s left is to generate a clean, high-resolution work of art!
Opening the “Preview” pane gives a list of options that determine what the final graph looks like. Gephi offers a few presets to make this easier. A common preset you’ll see used is “Black Background,” which creates a graph with curved edges on a black background.
In our case, I used the “Default Curved” preset and reduced the opacity of the edges. I also removed the labels since, for this graph, there are so many nodes that they would be difficult to read.
Here are the final options that I used to generate my graph:
After setting the options, pressing the “Refresh” button will generate a preview of your graph:
With the layout the way we want it, we can use the “Export” button to export our final image:
This was just a brief introduction to how Gephi is used to create compelling graphs of networks. There are quite a few ways to customize the graph to fit your preferences, so I encourage you to explore the various options and tutorials offered by Gephi.
If you’re interested in how we used this technique to map out relationships between bots in a botnet, we encourage you to check out our research, Don't @ Me: Hunting Twitter Bots at Scale.
After releasing the initial research, we got feedback asking us to release the graphs as wallpapers. As part of this post, we are excited to announce that we’re releasing high-resolution wallpapers of each graph that you are free to use.