An unknown threat actor recently launched a series of unprecedentedly large DDoS attacks against a number of cloud services and organizations, using a previously unknown weakness in the HTTP/2 protocol that allows a client to rapidly open and then cancel request streams on a single connection, consuming vast amounts of server resources.
Beginning in late August, the attacks targeted customers of many of the large cloud service providers, including AWS, Cloudflare, and Google, and one of the attacks was three times larger than any observed before, peaking at more than 201 million requests per second. Although the attacks disclosed Tuesday mainly targeted the cloud platforms, researchers said the weakness affects any web server that has a standard implementation of HTTP/2. The threat actor involved in the massive attack that Cloudflare observed used a relatively small botnet of about 20,000 machines to generate the huge volume of traffic.
Aside from the massive volume of the attacks, what caught the attention of researchers at the cloud providers was the use of the HTTP/2 weakness, which had not been identified previously. The attacks took advantage of a feature in the protocol called stream multiplexing that enables a client to open multiple concurrent streams in one TCP connection with a given server. And unlike HTTP/1.1, which processes requests on a connection serially, one after another, HTTP/2 processes them in parallel.
The new attack, which researchers named HTTP/2 Rapid Reset, takes advantage of this feature, along with a separate feature that allows a client to reset a given stream by sending a RST_STREAM frame to the server.
“This attack is called Rapid Reset because it relies on the ability for an endpoint to send a RST_STREAM frame immediately after sending a request frame, which makes the other endpoint start working and then rapidly resets the request. The request is canceled, but leaves the HTTP/2 connection open,” Google engineers said in a post.
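For readers who want to see how small the cancellation message is, here is a minimal sketch of a serialized RST_STREAM frame, following the frame layout in the HTTP/2 specification (RFC 9113): a 9-byte frame header followed by a 4-byte error code. The helper name `rst_stream_frame` is hypothetical and exists only for illustration.

```python
import struct

RST_STREAM = 0x3   # frame type defined by the HTTP/2 specification
CANCEL = 0x8       # error code meaning the stream is no longer needed


def rst_stream_frame(stream_id: int, error_code: int = CANCEL) -> bytes:
    """Serialize one RST_STREAM frame (hypothetical helper for illustration)."""
    if not 0 < stream_id < 2**31:
        raise ValueError("stream_id must be nonzero and fit in 31 bits")
    payload = struct.pack(">I", error_code)       # 4-byte error code
    length = len(payload).to_bytes(3, "big")      # 24-bit payload length
    # Type (1 byte), flags (1 byte), reserved bit + 31-bit stream identifier.
    header = length + struct.pack(">BBI", RST_STREAM, 0, stream_id)
    return header + payload


frame = rst_stream_frame(1)
print(len(frame), frame.hex())  # 13 00000403000000000100000008
```

At 13 bytes on the wire, the cancellation costs the client almost nothing, while the server may already have started work on the request.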
“The HTTP/2 Rapid Reset attack built on this capability is simple: The client opens a large number of streams at once as in the standard HTTP/2 attack, but rather than waiting for a response to each request stream from the server or proxy, the client cancels each request immediately. The ability to reset streams immediately allows each connection to have an indefinite number of requests in flight. By explicitly canceling the requests, the attacker never exceeds the limit on the number of concurrent open streams. The number of in-flight requests is no longer dependent on the round-trip time (RTT), but only on the available network bandwidth.”
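The shift from an RTT-bound to a bandwidth-bound request rate can be illustrated with a rough back-of-the-envelope model. Everything in this sketch is an assumption chosen for illustration, including the function names, the 100-stream limit, the 50 ms round trip, the 100 Mbit/s uplink, and the 50-byte request size; none of these figures are measurements from the attacks.

```python
def standard_rps(max_concurrent_streams: int, rtt_seconds: float) -> float:
    """Requests/s when each stream occupies a slot for one full round trip."""
    return max_concurrent_streams / rtt_seconds


def rapid_reset_rps(bandwidth_bits_per_s: float, bytes_per_request: int) -> float:
    """Requests/s when each request is cancelled at once, leaving bandwidth as the only limit."""
    return bandwidth_bits_per_s / 8 / bytes_per_request


# Assumed, illustrative values (not data from the attacks):
MAX_STREAMS = 100     # assumed advertised concurrent-stream limit
RTT = 0.05            # assumed 50 ms round trip
BANDWIDTH = 100e6     # assumed 100 Mbit/s client uplink
HEADERS_BYTES = 50    # assumed size of a compressed request (HEADERS frame)
RST_BYTES = 13        # RST_STREAM: 9-byte frame header + 4-byte error code

print(round(standard_rps(MAX_STREAMS, RTT)))
print(round(rapid_reset_rps(BANDWIDTH, HEADERS_BYTES + RST_BYTES)))
```

Under these assumptions, a single connection goes from roughly 2,000 requests per second to roughly 198,000, which gives a sense of how a relatively small botnet could generate such large request volumes.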
Once they detected the attacks and discovered the weakness that the threat actors were using, researchers developed custom mitigations and coordinated the response with their peers at other cloud platforms and organizations. Since then, they have seen a few different variants of the Rapid Reset attack. One variant doesn’t cancel streams immediately after opening them, but instead opens a large number of streams, leaves them open for a long time, cancels them, and then repeats the process. Another variant tries to open more streams than the target server advertises, attempting to keep the server’s request pipeline jammed.
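The providers' custom mitigations are not detailed here, so the following is only a generic sketch of one way a server might spot abusive connections: count how many streams the client opens and how many it cancels, and flag the connection when most of them were reset. The class name, method names, and thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConnectionStats:
    """Per-connection counters; names and thresholds are hypothetical."""
    opened: int = 0
    client_resets: int = 0

    def on_stream_opened(self) -> None:
        self.opened += 1

    def on_client_reset(self) -> None:
        self.client_resets += 1

    def should_close(self, min_streams: int = 100,
                     max_reset_ratio: float = 0.5) -> bool:
        """Flag the connection once most of a meaningful sample was cancelled."""
        if self.opened < min_streams:
            return False  # too few streams to judge fairly
        return self.client_resets / self.opened > max_reset_ratio


# A client that cancels every stream it opens is flagged.
stats = ConnectionStats()
for _ in range(1000):
    stats.on_stream_opened()
    stats.on_client_reset()
print(stats.should_close())  # True
```

A server using a check like this would presumably close the flagged connection, for example by sending a GOAWAY frame. A simple ratio would not necessarily catch the slower variants described above, which is one reason a single heuristic is unlikely to be sufficient.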
Though the major cloud providers have implemented mitigations to address these attacks, other organizations that have implemented HTTP/2 may also be vulnerable.
“Because the attack abuses an underlying weakness in the HTTP/2 protocol, we believe any vendor that has implemented HTTP/2 will be subject to the attack. This included every modern web server,” Cloudflare engineers said.
Cloudflare CSO Grant Bourzikas urged other CSOs and security leaders to move quickly to address any exposures in their environments.
“To me, this is reminiscent of a vulnerability like Log4J, due to the many variants that are emerging daily, and will continue to come to fruition in the weeks, months, and years to come. As more researchers and threat actors experiment with the vulnerability, we may find different variants with even shorter exploit cycles that contain even more advanced bypasses,” Bourzikas said.
“And just like Log4J, managing incidents like this isn’t as simple as “run the patch, now you’re done”. You need to turn incident management, patching, and evolving your security protections into ongoing processes — because the patches for each variant of a vulnerability reduce your risk, but they don’t eliminate it.”