Aug 8, 2017

Hunting Malicious npm Packages

By Jordan Wright

Last week, a tweet revealed that multiple packages published to npm, a JavaScript package manager, were stealing users’ environment variables:

@kentcdodds Hi Kent, it looks like this npm package is stealing env variables on install, using your cross-env package as bait: pic.twitter.com/REsRG8Exsx
— Oscar Bolmsten (@o_cee) August 1, 2017

The names of these malicious packages were typosquats of popular legitimate packages. In this case, the attackers relied on developers incorrectly typing in the name of the package when they ran npm install.

This is dangerous because many environments store secret keys or other sensitive bits of information in environment variables. If administrators mistakenly installed these malicious packages, these keys would be harvested and sent to the attacker. And, in this particular attack, the malicious packages listed their legitimate counterparts as dependencies, so the correct package would still be installed and the developer would be none the wiser.
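To make the typosquatting idea concrete, here is a minimal Python sketch of how a lookalike name might be flagged before installation. The `POPULAR` list, the normalization rules, and the 0.85 similarity threshold are all assumptions chosen for illustration; this is not a description of any check npm performs.

```python
from difflib import SequenceMatcher

# Hypothetical list of well-known package names. A real check would
# build this from download statistics rather than hard-coding it.
POPULAR = ["cross-env", "mongoose", "babel-cli", "jquery"]


def _normalize(name):
    """Lower-case the name and drop separators so that names differing
    only by a hyphen, underscore, or dot compare as equal."""
    return name.lower().replace("-", "").replace("_", "").replace(".", "")


def possible_typosquat(name, popular=POPULAR, threshold=0.85):
    """Return the popular package that `name` closely resembles, or None.

    The threshold is an arbitrary value picked for this sketch.
    """
    if name in popular:
        return None  # exact match: this is the legitimate package
    for target in popular:
        ratio = SequenceMatcher(None, _normalize(name), _normalize(target)).ratio()
        if ratio >= threshold:
            return target
    return None
```

With these assumptions, a name such as `crossenv` is reported as resembling `cross-env`, while `cross-env` itself and unrelated names return `None`. A similarity check like this produces false positives on legitimately similar names, so it is better suited to prompting a second look than to blocking installs.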

With npm having a history of dealing with malicious packages - either hijacked legitimate packages or malicious packages created from scratch - we decided to analyze the entire npm package repository for other malicious packages.

This Isn't npm's First Rodeo

This isn't the first time npm has had incidents like this. In 2016, an author unpublished their npm packages in response to a naming dispute. Some of these packages were listed as dependencies of many other npm packages, causing widespread disruption and raising concerns that attackers could hijack the vacated package names.

In a separate study published earlier this year, a security researcher was able to gain direct access to 14% of all npm packages (and indirect access to 54% of packages) by either brute-forcing weak credentials or by reusing passwords discovered from other unrelated breaches, leading to mass password resets across npm.

The impact of hijacked or malicious packages is compounded by how npm is structured. Npm encourages making small packages that aim to solve a single problem. This leads to a network of small packages that each depend on many other packages. In the case of the credential compromise research, the author was able to gain access to some of the most highly depended-upon packages, giving them a much wider reach than they would have otherwise had.
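The gap between "direct" and "indirect" access comes from transitive dependencies. As a rough illustration, the Python sketch below walks a toy dependency graph and collects every package that would pull in a compromised one. The package names in `TOY_GRAPH` are made up, and the graph shape (a mapping from each package to its dependencies) is an assumption for this example.

```python
from collections import defaultdict, deque


def transitive_dependents(dependencies, package):
    """Return every package that depends on `package`, directly or
    indirectly. `dependencies` maps a package name to the list of
    packages it depends on."""
    # Invert the graph: for each package, who depends on it?
    reverse = defaultdict(set)
    for pkg, deps in dependencies.items():
        for dep in deps:
            reverse[dep].add(pkg)

    # Breadth-first walk over the inverted edges.
    seen = set()
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for dependent in reverse[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Toy graph with invented package names.
TOY_GRAPH = {
    "web-app": ["http-helper", "logger"],
    "http-helper": ["string-pad"],
    "logger": ["string-pad"],
    "cli-tool": ["logger"],
    "string-pad": [],
}
```

In this toy graph, compromising the tiny `string-pad` package reaches all four other packages, even though only two of them depend on it directly.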

For example, here's a map showing the dependency graph of the top 100 npm packages (source: GraphCommons):

[Image: Dependency Graph]

How Malicious npm Packages Take Over Systems

In both of the previous cases, access to the packages was gained by researchers. However, the question stands - what if an attacker gained access to the packages? How can they use this access to gain control of systems?

The easiest way, which is also the one used by the malicious typosquat packages, is to abuse npm's ability to run preinstall or postinstall scripts. These are arbitrary system commands, specified in the package's package.json file, that run either before or after the package is installed. These commands can be anything.

Having this ability is not, by itself, an issue. In fact, these installation scripts are often used to help set up packages in complex ways. However, they give attackers a simple way to turn access to a package, whether hijacked or created from scratch, into a compromise of every system that installs it.
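For readers who have not seen one, the sketch below shows what an installation hook looks like in a package.json file and how it could be pulled out with a few lines of Python. The manifest is an invented example, not one of the packages discussed in this post, and `install_hooks` is a hypothetical helper name.

```python
import json

# Script names that npm runs automatically as part of `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")

# Invented manifest for illustration only.
EXAMPLE_MANIFEST = """
{
  "name": "example-package",
  "version": "1.0.0",
  "scripts": {
    "test": "mocha",
    "postinstall": "node setup.js"
  }
}
"""


def install_hooks(package_json_text):
    """Return the install-time scripts declared in a package.json string."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    if not isinstance(scripts, dict):
        return {}  # malformed "scripts" field: nothing to report
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}
```

Here the `test` script is ignored because it only runs when a developer asks for it, while `postinstall` is reported because it runs without any further action from the person installing the package.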

With this in mind, let's analyze the entire npm space to hunt down other potentially malicious packages.

Hunting for Malicious npm Packages

Getting the Packages

The first step in our analysis is getting the package information. The npm registry runs on CouchDB at registry.npmjs.org. There used to be an endpoint at /-/all that returned all the package information as JSON, but it has since been deprecated.

Instead, we can leverage a replica instance of the registry at replicate.npmjs.com. We can use the same technique leveraged by other libraries to get a copy of the JSON data for every package:

curl https://replicate.npmjs.com/registry/_design/scratch/_view/byField > npm.json

Then, we can use the JSON processing tool jq to parse out the package name, the scripts, and the download URL with this nifty one-liner:

cat npm.json | jq '[.rows | to_entries[] | .value | objects | {"name": .value.name, "scripts": .value.scripts, "tarball": .value.dist.tarball}]' > npm_scripts.json
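If jq isn't available, roughly the same extraction can be sketched in Python. This is an approximation of the jq filter above rather than an exact port: it assumes each row carries the package document under a "value" key, as the jq expression implies, and it fills in None for missing fields instead of failing.

```python
import json


def extract_scripts(registry):
    """Reduce the registry dump to name, scripts, and tarball URL per row."""
    results = []
    for row in registry.get("rows", []):
        if not isinstance(row, dict):
            continue  # mirrors the `objects` filter in the jq expression
        value = row.get("value")
        if not isinstance(value, dict):
            value = {}
        dist = value.get("dist")
        if not isinstance(dist, dict):
            dist = {}
        results.append({
            "name": value.get("name"),
            "scripts": value.get("scripts"),
            "tarball": dist.get("tarball"),
        })
    return results


def convert(src="npm.json", dst="npm_scripts.json"):
    """Read the downloaded dump and write the reduced file."""
    with open(src, encoding="utf-8") as handle:
        registry = json.load(handle)
    with open(dst, "w", encoding="utf-8") as handle:
        json.dump(extract_scripts(registry), handle)
```

Note that `json.load` reads the whole dump into memory, which may be impractical for a file this large; a streaming parser would be a better fit for the full registry.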

To make analysis easier, we'll write a quick Python script to find packages with preinstall, postinstall, or install scripts; find files executed by the script; and search those files for strings that could indicate suspicious activity.
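The three steps can be sketched as follows. This is a simplified illustration rather than the script used for this research: the `SUSPICIOUS` keyword list is an assumption, the file detection only looks for tokens ending in a few common extensions, and `read_file` is a hypothetical callback standing in for whatever code downloads and unpacks the package tarball.

```python
import re
import shlex

INSTALL_HOOKS = ("preinstall", "install", "postinstall")

# Assumed indicator strings; a real hunt would tune this list.
SUSPICIOUS = ("npm owner add", "npm publish", "process.env",
              "child_process", "curl ", "wget ")


def referenced_files(command):
    """Guess which files a hook executes, e.g. 'node scripts/setup.js'."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes: fall back to a naive split
        tokens = command.split()
    return [t for t in tokens if re.search(r"\.(js|sh|py)$", t)]


def suspicious_strings(text):
    """Return the indicator strings that occur in `text`."""
    return [s for s in SUSPICIOUS if s in text]


def triage(entry, read_file):
    """Check one npm_scripts.json entry.

    `read_file` maps a path inside the package to its text, or None
    when the file cannot be found.
    """
    scripts = entry.get("scripts")
    if not isinstance(scripts, dict):
        return {}
    findings = {}
    for hook in INSTALL_HOOKS:
        command = scripts.get(hook)
        if not isinstance(command, str):
            continue
        hits = set(suspicious_strings(command))
        for path in referenced_files(command):
            content = read_file(path)
            if content:
                hits.update(suspicious_strings(content))
        if hits:
            findings[hook] = sorted(hits)
    return findings
```

A string match is only a starting point for manual review. Many legitimate install scripts read environment variables or spawn child processes, so every hit still needs a human to look at the code.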

Findings

PoC Packages

Developers have known about the potential implications of installation scripts for quite a while. One of the first things we noticed when doing this research was a set of packages that aimed to show the impact of these exact issues in a seemingly benign way:

Tracking Scripts

Next, we found scripts that tracked when the packages were installed. Npm provides some download metrics on the package listing itself, but it appears that some authors wanted more granular data, raising potential concerns around user privacy. Here are some packages using Google Analytics or Piwik to track installations:

Some packages were less obvious about their tracking, in that they hid the tracking scripts within JavaScript installation files rather than just embedding shell commands in the package.json.

Here are the other tracking packages we discovered:

ikst (Tracking script)
botbait (Tracking script)
mktmpio (Tracking script)
anarchy (Tracking script)

Malicious Scripts

Finally, we looked for packages that had installation scripts that were obviously malicious in nature. If installed, these packages could have disastrous effects on the user's system.

The Case of `mr_robot`

Digging into the remaining packages, we came across an interesting installation script in the shrugging-logging package. The package's claims are simple: it adds the ASCII shrug, ¯\_(ツ)_/¯, to log messages. But it also includes a nasty postinstall script which adds the package's author, mr_robot, as an owner of every npm package owned by the person who ran npm install.

Here's a relevant snippet. You can find the full function listing here.

This script first uses the npm whoami command to get the current user’s username. Then, it scrapes the npmjs.org website for any packages owned by this user. Finally, it uses the npm owner add command to add mr_robot as an owner to all of these packages.
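On the defensive side, a maintainer who wants to check whether a package has picked up an unexpected owner could compare the output of npm owner ls against a known list. The sketch below is one way to do that in Python. It assumes each output line starts with the owner's username followed by whitespace, and `owners_of` requires the npm CLI to be installed; both helper names are hypothetical.

```python
import subprocess


def parse_owner_list(output):
    """Extract usernames from `npm owner ls` output.

    Assumes one owner per line with the username as the first
    whitespace-separated token.
    """
    owners = []
    for line in output.splitlines():
        line = line.strip()
        if line:
            owners.append(line.split()[0])
    return owners


def unexpected_owners(output, expected):
    """Return owners present in the output but not in `expected`."""
    return sorted(set(parse_owner_list(output)) - set(expected))


def owners_of(package):
    """Run `npm owner ls <package>` and return its raw output."""
    result = subprocess.run(
        ["npm", "owner", "ls", package],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, `unexpected_owners(owners_of("my-package"), {"alice"})` would list every owner other than alice, where my-package and alice are placeholder names.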

This author has also published these packages, which include the same backdoor:

test-module-a
pandora-doomsday

Worming into Local Packages

The last malicious package we discovered had code that was, in many ways, identical to the packages from mr_robot, but had a different trick up its sleeve. Instead of just modifying the owners of any locally-owned npm packages, the sdfjghlkfjdshlkjdhsfg module shows a proof of concept of how to infect and re-publish these local packages.

The sdfjghlkfjdshlkjdhsfg installation script shows what this process would look like by modifying and re-publishing itself:

You can find the full source here.

While this is a proof-of-concept, this exact technique can be easily modified to worm into any local package owned by the person doing the install.

Conclusion

It’s important to note that these issues don't just apply to npm. Most, if not all, package managers allow maintainers to specify commands to be executed when a package is installed. The issue is arguably more impactful for npm because of the dependency structure discussed earlier.

This is also a hard problem to solve. Static analysis of npm packages as they are uploaded is difficult - so much so that there are companies dedicated to solving the problem.

There are also reports from npm developers suggesting that work is under way to use various metrics to help prevent users from downloading malicious packages:

I'm working on a thing that uses quality metrics and prompts users. It would probably catch just about everything folks have brought up
— 多分◯ちゃんよね🕵🏼‍♀️ (@maybekatz) August 1, 2017

In the meantime, we recommend staying cautious when adding dependencies to projects. In addition to minimizing the number of dependencies, we recommend enforcing strict versioning and integrity checking of all dependencies, which can be done natively using yarn or using the npm shrinkwrap command. This is an easy way to get peace of mind that the code used in development will be the same code used in production.
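Integrity checking boils down to recording a cryptographic hash of each dependency's tarball and refusing to install anything that doesn't match. The Python sketch below illustrates the idea using an "algorithm-base64digest" string, similar in spirit to the hashes stored in lock files. The exact string format is an assumption for this example, and it handles only a single hash per string.

```python
import base64
import hashlib
import hmac

SUPPORTED = ("sha1", "sha256", "sha384", "sha512")


def make_integrity(data, algorithm="sha512"):
    """Build an 'algorithm-base64digest' string for the given bytes."""
    if algorithm not in SUPPORTED:
        raise ValueError("unsupported algorithm: " + algorithm)
    digest = hashlib.new(algorithm, data).digest()
    return algorithm + "-" + base64.b64encode(digest).decode("ascii")


def verify_integrity(data, integrity):
    """Return True if `data` hashes to the value recorded in `integrity`."""
    algorithm, _, expected = integrity.partition("-")
    if algorithm not in SUPPORTED:
        raise ValueError("unsupported algorithm: " + algorithm)
    actual = make_integrity(data, algorithm).partition("-")[2]
    # Constant-time comparison; not strictly needed here, but a safe default.
    return hmac.compare_digest(actual, expected)
```

The hash recorded at development time is the reference: if the bytes fetched in production differ, for example because a version was re-published with a backdoor, `verify_integrity` returns False and the install can be aborted.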
