Cloudbleed: Rethinking The Current State of Internet Security

The Cloudflare incident that derailed weekend plans for many cybersecurity professionals is turning out to be a huge eye-opener. The culprit, a bug now known as Cloudbleed, caused a serious data leak that affected a large number of websites. As the dust began to settle, it became clear that the implications were far more severe than anticipated. The Internet has undergone considerable changes, and security issues must now be handled differently.


An overview of Cloudflare

In order to understand the extent of Cloudbleed, it is crucial to become familiar with the role of Cloudflare in today’s Internet. While most people might not know what Cloudflare is, there is a good chance that these same individuals actually interact with a Cloudflare service multiple times a day. Take, for instance, the following example:


The above image is that of a Cloudflare feature in action. It protects the site that a given individual is about to visit, keeping it safe from spammers and bots by making sure the visitor is, in fact, human. Cloudflare is a content delivery network that also provides DDoS protection, web application firewalls, DNS (Domain Name System), and reverse proxy services. This network serves millions of websites, acting as a layer between those sites and their visitors.


Sitting between sites and their visitors has both advantages and disadvantages. One disadvantage is that if something goes wrong at that layer, a lot of websites are affected at once.


Cloudbleed in a nutshell

Cloudbleed was a bug in the Cloudflare infrastructure that resulted in a data leak involving passwords, keys, cookies, POST data, and HTTPS requests: the whole nine yards. Some of this data, which came from Cloudflare customer websites, included PII (personally identifiable information), private messages, and other confidential information. Although data would only leak out if certain conditions were met, the sheer size of Cloudflare’s customer base made the incident a major issue.

To make matters worse, some of the information that leaked was picked up by search engine crawlers and stored in their caches. This means that, even after the bug was identified and fixed, one critical, and rather tricky, task still remained.

Cloudflare had to work with Google and other search engine providers to make sure leaked data was completely scrubbed from search engine caches. Otherwise, the data would run the risk of being targeted by shady (or simply curious) characters as soon as details of the security incident broke out.

Technically speaking, the bug only affected three features:

  1. Email Obfuscation – a feature that prevents bots from harvesting email addresses (which attackers use in their spamming campaigns) from web pages;
  2. Server-Side Excludes – a feature that hides sensitive content on your site from suspicious visitors; and
  3. Automatic HTTPS Rewrites – a feature that rewrites HTTP links to HTTPS.

Even with one of those features enabled, the bug was only triggered if the HTML page being served was malformed; specifically, the page had to end with an unterminated attribute.

However, because much of Cloudflare’s infrastructure is shared by its large customer base, a lot of websites were still affected by the bug, even if those sites did not have any of those three features enabled. Some of the popular websites known to have been impacted included Uber, Fitbit, and OkCupid. A much longer list of potentially affected sites can be found on GitHub.

In order to perform their specific functions, all three features (email obfuscation, server-side excludes, and automatic HTTPS rewrites) have to parse and modify HTML pages as the pages pass through Cloudflare’s edge servers and are subsequently served to clients. To do this, the three features rely on what is known as an HTML parser. It was this HTML parser that contained the bug: under the conditions described above, it read past the end of its buffer, so responses could include chunks of unrelated memory from the edge server.
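The failure mode can be illustrated with a toy simulation. Cloudflare’s public incident report attributes the leak to an end-of-buffer check that tested for equality rather than "greater than or equal", so a parser pointer that stepped over the end of the buffer was never caught. The sketch below is not Cloudflare’s code: the `serve` function, the flat `memory` layout, and the rule that `=` makes the scanner skip one byte are all assumptions made purely for illustration.

```python
def serve(memory: bytes, end: int, check: str) -> bytes:
    """Copy a page out of `memory`, stopping at the end-of-buffer check.

    memory: the page's bytes followed by unrelated data (simulated RAM)
    end:    index one past the last byte of the page
    check:  "==" for the flawed test, ">=" for the corrected one
    """
    out = bytearray()
    p = 0
    while p < len(memory):  # simulation-only guard so the loop always ends
        at_end = (p == end) if check == "==" else (p >= end)
        if at_end:
            break
        out.append(memory[p])
        # Toy rule (an assumption): "=" opens an attribute value, so the
        # scanner also steps over the next byte, the expected quote.
        p += 2 if memory[p] == ord("=") else 1
    return bytes(out)


# A page that ends with an unterminated attribute, followed in memory
# by data belonging to some other request (hypothetical values).
html = b"<p>hi</p><img src="
neighbor = b"SECRET-COOKIE"
memory = html + neighbor

leaky = serve(memory, len(html), "==")  # pointer jumps past `end` and keeps going
fixed = serve(memory, len(html), ">=")  # stops as soon as the pointer reaches `end`
```

With the equality check, the pointer moves from the final `=` straight to one position past `end`, never equals `end`, and the function keeps copying the neighboring bytes into the response. With `>=`, the same input yields only the page itself. A well-formed page never skips over `end`, which is why both checks behave identically there and the flaw stayed hidden.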

Those wanting to dive into more technical information can find Cloudflare’s detailed explanation in its incident report on the Cloudflare blog.


Discovery and Remediation

The data leak was accidentally discovered by Tavis Ormandy, a security analyst at Google’s Project Zero.

“On February 17th 2017, I was working on a corpus distillation project, when I encountered some data that didn’t match what I had been expecting”, Tavis posted in a bug report. He added that “…It became clear after a while we were looking at chunks of uninitialized memory interspersed with valid data.”

When he and other members of the team managed to fetch some live samples, they were able to obtain data that should not have been out in the open: encryption keys, cookies, passwords, chunks of POST data, and even HTTPS requests. As soon as they were able to pinpoint Cloudflare as the source of the leaked data, they immediately tried to establish contact with Cloudflare’s security team. Tavis decided to reach out through Twitter.

Fortunately, some members of the Cloudflare security team were active on social media and word of the tweet reached them. This led to a series of exchanges and some collaborative work. As soon as the bug’s location was identified, and before a patch was released, Cloudflare activated what it called a “global kill” switch that disabled the features in question throughout its global network. The bug was found and a patch was deployed in less than a day.

However, it took more time to work with search engine engineers to purge leaked data that was already cached. Further investigation revealed that the bug may have been leaking data for months. The leak is estimated to have stretched from late September 2016 to February 2017.

That estimated leak period is quite alarming, considering that search engines aren’t the only entities that run crawlers across the Web. Ideally, leaked data cached in those systems would be purged as well.

To be on the safe side, all Cloudflare customers should assume that their sites might have leaked some amount of data and take appropriate measures. Some of the countermeasures include resetting passwords, forcing re-authentication, invalidating session cookies, rolling internal authorization tokens, adopting two-factor authentication, and educating users about potential risks.
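As a rough illustration of the "invalidate session cookies and force re-authentication" step, the sketch below drops every session issued before a chosen cutoff and lists the users who need to log in again. The `Session` record, the `revoke_exposed` helper, and the cutoff date are all hypothetical; a real application would apply the same idea to its own session store and pick a cutoff that suits its own exposure.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Session:
    user: str
    token: str
    issued_at: datetime


def revoke_exposed(sessions, cutoff):
    """Split sessions into those kept and the users who must re-authenticate.

    Any session issued before `cutoff` is treated as potentially leaked.
    """
    kept = [s for s in sessions if s.issued_at >= cutoff]
    reauth_users = sorted({s.user for s in sessions if s.issued_at < cutoff})
    return kept, reauth_users


# Hypothetical cutoff: a date safely after the fix was deployed.
CUTOFF = datetime(2017, 3, 1, tzinfo=timezone.utc)

sessions = [
    Session("alice", "tok-1", datetime(2016, 12, 5, tzinfo=timezone.utc)),
    Session("bob", "tok-2", datetime(2017, 3, 2, tzinfo=timezone.utc)),
]
kept, reauth_users = revoke_exposed(sessions, CUTOFF)
```

Using a single cutoff rather than the estimated leak window is a deliberately conservative choice: a session created before the leak began could still have been in use, and therefore exposed, while the bug was live.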


Cloudbleed and the Current State of the Internet

Cloudbleed (and the recent Amazon outage) points to two things. First, the Internet, as many of us know it, is now heavily dependent on just a handful of organizations. The downside is that, when something goes wrong with the infrastructure of those few organizations, a massive fraction of the Internet, and everyone using it, suffers.

Indeed, if services from top providers like Amazon and Cloudflare go offline or develop some form of vulnerability, a lot of businesses will be affected. Second, there is an upside to this concentration. Because things are provided as-a-service by a single provider, a fix can be made in a central location and then rapidly propagated across all nodes.

In the case of Cloudbleed, a simple activation of a kill switch at Cloudflare switched off all affected features, thereby plugging the holes throughout its entire global infrastructure in a matter of seconds. As soon as the bug was fixed, that fix was then propagated throughout the infrastructure, almost instantly.
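The general idea of a centrally controlled kill switch can be sketched as a feature flag that every node consults before running a feature. This is only an illustration of the pattern, not a description of how Cloudflare implements it: the `FlagStore` class, the `process` function, the feature name, and the toy email-obfuscation rule are all assumptions.

```python
class FlagStore:
    """Stand-in for a central configuration service shared by all nodes."""

    def __init__(self):
        self._disabled = set()

    def kill(self, feature: str) -> None:
        # Disabling a feature here takes effect for every node that asks.
        self._disabled.add(feature)

    def is_enabled(self, feature: str) -> bool:
        return feature not in self._disabled


def process(html: str, flags: FlagStore) -> str:
    """What an edge node might do to a page on its way to the client."""
    if flags.is_enabled("email_obfuscation"):
        # Toy obfuscation rule, for illustration only.
        html = html.replace("@", " [at] ")
    return html


flags = FlagStore()
before = process("mail me: a@b.c", flags)  # feature runs
flags.kill("email_obfuscation")            # one central change
after = process("mail me: a@b.c", flags)   # feature is skipped; page passes through untouched
```

Because the risky code path is simply skipped once the flag is off, the exposure stops without waiting for a patch to be written, tested, and rolled out.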

The rise of ubiquitous cloud service providers has altered the Internet landscape in a big way. In order to mitigate the risks that accompany this new landscape, one can take the lessons learned from Cloudbleed into consideration when developing new security strategies.