Cloudflare, a network services provider, suffered what amounted to a DDoS attack last week, and it was unintentionally the source of that attack. The company was previously tied to a major outage in June, when a Google Cloud failure affected services including Spotify, Google, Snapchat, Discord, and Character.ai. The latest incident was less severe than the summer outage, and this time it was self-inflicted.
“We experienced an outage in our Tenant Service API, resulting in a widespread outage of many of our APIs and the Cloudflare Dashboard,” wrote Tom Lianza, Cloudflare’s VP of engineering, and Joaquin Madruga, VP of engineering for the developer platform, in a blog post dated Sept. 13. “The incident’s effects arose from various issues, but the immediate cause was a bug in the dashboard.”
The bug caused “repeated, unnecessary requests to the Tenant Service API.” According to the post, Cloudflare had inadvertently included a “problematic object in its dependency array.” Because that object was recreated, it was treated as new each time, which made the logic depending on it rerun repeatedly. The result was many API calls during a single dashboard render instead of just one.
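To illustrate the mechanism, here is a minimal sketch in TypeScript. It is not Cloudflare's code and does not use any UI framework. It assumes the "dependency array" works the way such arrays commonly do in effect hooks, where each entry is compared to its previous value by identity. The names `depsChanged`, `countApiCalls`, and `tenantId` are hypothetical.

```typescript
// Hypothetical stand-in for an effect hook's dependency check:
// each dependency is compared to its previous value by identity (Object.is).
type Deps = readonly unknown[];

function depsChanged(prev: Deps | undefined, next: Deps): boolean {
  if (prev === undefined || prev.length !== next.length) return true;
  return next.some((dep, i) => !Object.is(dep, prev[i]));
}

// Simulates `renders` re-renders and counts how often the effect
// (here, a pretend Tenant Service API call) would run.
function countApiCalls(renders: number, makeDeps: () => Deps): number {
  let prev: Deps | undefined;
  let calls = 0;
  for (let i = 0; i < renders; i++) {
    const next = makeDeps();
    if (depsChanged(prev, next)) calls++;
    prev = next;
  }
  return calls;
}

// Buggy: a fresh object literal is created on every render, so it never
// matches the previous one even though its contents are identical.
const buggyCalls = countApiCalls(5, () => [{ tenantId: "t-1" }]);

// Fixed: reuse one stable reference across renders.
const stableQuery = { tenantId: "t-1" };
const fixedCalls = countApiCalls(5, () => [stableQuery]);

console.log(buggyCalls, fixedCalls); // 5 1
```

In this sketch the recreated object triggers the call on all five renders, while the stable reference triggers it only once. Depending on a primitive value such as the ID string itself would have the same effect as the stable reference.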
“When the Tenant Service became overwhelmed, it affected other APIs and the dashboard because Tenant Service is integral to our API request authorization process. Without Tenant Service, API request authorization cannot be assessed. When authorization assessment fails, API requests respond with 5xx status codes,” the blog post explains.
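The failure chain in that quote can be sketched roughly as follows. This is an illustrative model only, not Cloudflare's authorization code: `handleApiRequest` and `TenantLookup` are hypothetical names, and 503 is an assumed choice, since the post says only that requests returned 5xx status codes.

```typescript
type Decision = { allowed: boolean };

// Hypothetical tenant lookup; it throws when the service cannot respond.
type TenantLookup = (token: string) => Decision;

// Every API request must pass the authorization check first, so a failed
// lookup turns into a server error regardless of which API was called.
function handleApiRequest(token: string, lookupTenant: TenantLookup): number {
  try {
    const decision = lookupTenant(token);
    return decision.allowed ? 200 : 403;
  } catch (_err) {
    // Authorization could not be evaluated at all (assumed status: 503).
    return 503;
  }
}

const healthy: TenantLookup = (token) => ({ allowed: token === "valid-token" });
const overloaded: TenantLookup = () => {
  throw new Error("tenant service unavailable");
};

console.log(handleApiRequest("valid-token", healthy));    // 200
console.log(handleApiRequest("valid-token", overloaded)); // 503
```

The point of the sketch is that a valid request fails in the second case through no fault of its own, which is why an overload in one service surfaced as errors across many unrelated APIs.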
Cloudflare has since resumed normal operations. “We deeply regret the disruption,” the blog post says. “We will keep investigating this matter and enhance our systems and processes.”