Keeping Dependencies Straight in the Software Supply Chain

A new attack method illustrates how developers' "blind trust" in the software components they use could be abused by attackers to breach internal networks.

Security researcher Alex Birsan developed a "dependency confusion" attack, in which he tricked build systems into pulling code from malicious packages and libraries instead of the legitimate ones when building an application. He showed that if he knew the names of private libraries a company used to build its applications, he could register those names with public package repositories, and the build systems would grab the public components instead of looking for the internal ones. The resulting backdoored applications could give an attacker access to the network to carry out activities such as siphoning off data or infecting other machines.

"Squatting valid internal package names was a nearly sure-fire method to get into the networks of some of the biggest tech companies out there," Birsan wrote in a Medium post. "The success rate was simply astonishing."

Birsan developed this method using Node's npm and the npm registry, Python's pip and the Python Package Index, and RubyGems. He successfully gained access to the networks of more than 35 technology companies, including well-known brands such as Microsoft, Apple, PayPal, Shopify, Netflix, Yelp, Tesla, and Uber. The companies gave him express permission to use this method, and he collected $130,000 in bug bounties for the work.

Developers put a lot of trust in the installers used to install software packages from code repositories. When a Python developer types pip install to download and install third-party code, the developer is trusting that the code is authentic and not malicious.

Security experts have long warned developers to be aware of "dependency-chain" attacks, where a developer accidentally grabs a different software component from the one originally intended, but those attacks typically rely on some form of social engineering or human error to succeed. Typosquatting, where the names of malicious packages look similar to those of legitimate ones, is very common. One example is the malicious JavaScript package electorn (pretending to be electron, a framework for writing cross-platform desktop applications), which npm removed last September. Another common method involves uploading malicious code under expired dependencies and forcing the application to downgrade to that version. The difference between Birsan's method and these attacks is that Birsan didn't need to trick the developers. There was no social engineering involved.
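The typosquatting case described above lends itself to a simple automated check. The sketch below is only an illustration, not a tool used by npm or by Birsan: it compares a requested package name against a small, hypothetical list of well-known names using Python's standard difflib module and flags near misses such as electorn. The list contents and the 0.8 similarity cutoff are assumptions chosen for the example.

```python
import difflib

# Hypothetical allowlist of well-known package names (an assumption for this
# example); a real check would need a far larger list.
KNOWN_PACKAGES = ["electron", "express", "react", "lodash"]


def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return the known package that `name` closely resembles, or None."""
    if name in known:
        return None  # exact match, so this is the real package name
    # get_close_matches returns the best matches whose similarity ratio
    # is at least `cutoff`, most similar first.
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for candidate in ("electorn", "electron", "left-pad"):
        print(candidate, "->", possible_typosquat(candidate))
```

A check like this would not catch dependency confusion, because in that attack the malicious package carries exactly the expected name.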

Birsan and another researcher, Justin Gardner, were looking through the package.json file for an application PayPal uses internally when they realized the file listed both public packages available from npm and non-public package names that did not exist on the public npm registry. The non-public packages were most likely being hosted internally by PayPal. Build systems rely on the package.json file to obtain the correct versions of software components so that every developer working on the application is using the same components.
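A team could run a similar comparison against its own manifests to see which dependency names are exposed. The following is a minimal sketch under stated assumptions: the manifest contents and the package name acme-internal-auth are invented for illustration, and the set of public names is passed in by the caller rather than looked up on the npm registry.

```python
import json


def unclaimed_dependency_names(package_json_text, public_names):
    """List dependency names in a package.json that are not known to be public.

    `public_names` is a set supplied by the caller; this sketch does not
    query any registry.
    """
    manifest = json.loads(package_json_text)
    names = set()
    for section in ("dependencies", "devDependencies"):
        names.update(manifest.get(section, {}))
    return sorted(name for name in names if name not in public_names)


if __name__ == "__main__":
    # Hypothetical manifest; "acme-internal-auth" is an invented private name.
    manifest_text = json.dumps({
        "name": "example-app",
        "dependencies": {"express": "^4.17.1", "acme-internal-auth": "^1.2.0"},
        "devDependencies": {"mocha": "^8.2.1"},
    })
    print(unclaimed_dependency_names(manifest_text, {"express", "mocha"}))
```

Any name this reports is one that an outsider could register on the public registry if it is not already claimed or scoped.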

Birsan uploaded packages of his own to npm with the same names as PayPal's internal components. His packages collected basic information about each machine, including the hostname, the username, and the file path where the package was being installed. When a developer ran npm on his or her computer for that application (perhaps to update some other dependency or to start new work), npm ran through the list of dependencies and installed the public packages instead of the private ones.
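The article does not spell out why the public package wins. One commonly described cause, treated here as an assumption, is that an installer configured with both a private and a public index picks the highest version number it can find, regardless of which index offers it. The toy resolver below models only that rule. It is not how npm, pip, or RubyGems are actually implemented, and the package name and version numbers are invented.

```python
def parse_version(text):
    """Turn a dotted version such as '1.2.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def resolve(name, indexes):
    """Pick the highest version of `name` across all indexes.

    `indexes` maps an index label to {package name: [version strings]}.
    Returns (index label, version string).
    """
    candidates = [
        (parse_version(version), label)
        for label, packages in indexes.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"{name} not found in any index")
    version, label = max(candidates)
    return label, ".".join(str(part) for part in version)


if __name__ == "__main__":
    # Hypothetical indexes; the attacker later publishes a very high version.
    indexes = {"private": {"acme-internal-auth": ["1.2.0"]}, "public": {}}
    print(resolve("acme-internal-auth", indexes))
    indexes["public"]["acme-internal-auth"] = ["9000.0.0"]
    print(resolve("acme-internal-auth", indexes))
```

Under this rule the source of a package never enters the decision, which is why no developer has to be tricked for the substitution to happen.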

“Along with the external IPs, this was just enough data to help security teams identify possibly vulnerable systems based on my reports, while avoiding having my testing be mistaken for an actual attack,” Birsan said. He encoded the data he'd collected into DNS queries, knowing that those are less likely to be blocked on the way out of the network.
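The article does not describe the exact encoding Birsan used. As a rough illustration of the general idea, the sketch below hex-encodes a short string and splits it into DNS labels, respecting the protocol's limits of 63 bytes per label and 253 characters per name. It also includes the matching decoder that a team reviewing DNS logs might use. The domain collector.example and the sample data are hypothetical, and the code only builds strings; it sends nothing.

```python
MAX_LABEL = 63   # DNS limits each dot-separated label to 63 bytes
MAX_NAME = 253   # and a full name to 253 characters


def encode_as_dns_name(text, domain):
    """Hex-encode `text` and prepend it to `domain` as one or more labels."""
    hex_data = text.encode("utf-8").hex()
    labels = [hex_data[i:i + MAX_LABEL] for i in range(0, len(hex_data), MAX_LABEL)]
    name = ".".join(labels + [domain])
    if len(name) > MAX_NAME:
        raise ValueError("data is too long for a single DNS name")
    return name


def decode_dns_name(name, domain):
    """Reverse encode_as_dns_name, e.g. when reviewing DNS query logs."""
    hex_data = name[: -len(domain) - 1].replace(".", "")
    return bytes.fromhex(hex_data).decode("utf-8")


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    query = encode_as_dns_name("host=build01;user=ci", "collector.example")
    print(query)
    print(decode_dns_name(query, "collector.example"))
```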

The dependency confusion attack relies on the targeted company using both private and public software components, which typically implies a large company with mature software development processes. Birsan said the majority of the affected companies had more than a thousand employees, “which most likely reflects the higher prevalence of internal library usage within larger organizations.”

Birsan found internal project names inside publicly visible package.json files, internal packages hosted on public repositories, and even within posts on Internet forums. The best place to find private package names turned out to be inside JavaScript files, since project dependencies are embedded into public script files.

JavaScript may be highly susceptible to the dependency confusion attack, but Birsan found that the method also worked with PyPI (Python Package Index) and RubyGems packages. Birsan identified internal Ruby gems for eight organizations, and he was able to get inside four of them, including Shopify, using this method.

"None of the package hosting services can ever guarantee that all the code its users upload is malware-free," Birsan wrote.

Microsoft paid Birsan $40,000 for his work and published a white paper describing the technique, which the company referred to as a "substitution attack." The issue is identified as CVE-2021-24105 for its Azure Artifacts package repository. The white paper is intended for organizations relying on both public and private library sources and includes recommendations such as "configuring the client to reference a single private feed," which could require manually moving public packages into the private feed. Developers working with hybrid package managers should use controlled scopes, namespaces, or prefixes to indicate where packages are located and which ones to include. Using client-side verification features, such as version pinning and integrity verification, can prevent forced downgrades, according to the white paper.
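Integrity verification generally comes down to comparing a downloaded artifact against a hash recorded when the dependency was first vetted. The sketch below shows that comparison using SHA-256 from Python's standard library. It is a simplified illustration rather than the mechanism of any particular package manager, and the artifact bytes are placeholders.

```python
import hashlib
import hmac


def matches_pinned_hash(artifact_bytes, pinned_sha256):
    """Return True if the artifact's SHA-256 digest equals the pinned value."""
    actual = hashlib.sha256(artifact_bytes).hexdigest()
    # compare_digest performs a constant-time string comparison.
    return hmac.compare_digest(actual, pinned_sha256.lower())


if __name__ == "__main__":
    # Placeholder contents standing in for real package archives.
    vetted = b"contents of the vetted internal package"
    pinned = hashlib.sha256(vetted).hexdigest()  # recorded at vetting time

    print(matches_pinned_hash(vetted, pinned))
    print(matches_pinned_hash(b"contents of a same-named public package", pinned))
```

With a pinned hash, a same-named package from a different source fails the check even when its name and version look acceptable.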

Microsoft also noted that this method wasn't actually exploiting a software bug in its code repository but rather a design flaw in package managers. "While we are treating this as a severe security issue, it ultimately has to be fixed by reconfiguring installation tools and workflows, and not by correcting anything in the package repositories themselves," Microsoft told BleepingComputer's Ax Sharma. "We consider the root cause of this issue to be a design flaw (rather than a bug) in package managers that can be addressed only through reconfiguration."

Birsan didn't consider the research complete, as "there is more left to discover." There could be other ways to uncover internal package names, and other programming languages and repositories (such as JFrog and NuGet) were probably also vulnerable to dependency confusion. "I believe that finding new and clever ways to leak internal package names will expose even more vulnerable systems, and looking into alternate programming languages and repositories to target will reveal some additional attack surface for dependency confusion bugs," Birsan wrote.
