Trevor I. Lasn

Staff Software Engineer, Engineering Manager

When Regex Goes Wrong

Issues and catastrophic failures caused by regex

In 2016, Stack Overflow experienced a 34-minute outage. The root cause? A regular expression (regex) that trimmed whitespace from user-submitted text.

It took 10 minutes to pinpoint the issue, 14 minutes to write the fix, and another 10 minutes to deploy the solution and restore Stack Overflow’s availability.

What caused the outage?

^[\s\u200c]+|[\s\u200c]+$

According to Stack Overflow: “While this looks straightforward—matching ‘all the spaces at the end of the string’—it can be problematic for backtracking regex engines. In this case, a malformed post included roughly 20,000 consecutive whitespace characters on a comment line.

If the string contains 20,000 space characters in a row but not at the end, the regex engine starts checking each space. After the 20,000th space, it encounters a different character but still expects a space or the end of the string.

The engine backtracks, attempting to match \s+$ starting from the second space, the third space, and so on, leading to a total of 199,990,000 checks. This was sufficient to cause a significant delay. The regex has since been replaced with a substring function.”
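The quote says the regex was replaced with a substring function. Here is a minimal Python sketch of that idea. It is not Stack Overflow's actual code, and the names `ZWNJ`, `is_trimmable`, and `trim` are my own. It only approximates the original character class, since `str.isspace()` may not cover exactly the same characters as `\s` in their engine. The point is that it walks inward from each end once, so the work grows linearly with the text length no matter where the whitespace sits.

```python
ZWNJ = "\u200c"  # zero-width non-joiner, the extra character in the original pattern


def is_trimmable(ch: str) -> bool:
    """Rough stand-in for the character class [\\s\\u200c]."""
    return ch.isspace() or ch == ZWNJ


def trim(text: str) -> str:
    """Strip trimmable characters from both ends with a single pass per end."""
    start, end = 0, len(text)
    # Move the left edge forward past leading whitespace.
    while start < end and is_trimmable(text[start]):
        start += 1
    # Move the right edge backward past trailing whitespace.
    while end > start and is_trimmable(text[end - 1]):
        end -= 1
    # One slice ("substring") at the end; nothing is ever re-scanned.
    return text[start:end]


if __name__ == "__main__":
    print(repr(trim("  \u200c hello  world \u200c ")))  # 'hello  world'
```

With an input like `"a" + " " * 20000 + "b"`, this function inspects one character at each end and returns the text unchanged. The long run of spaces in the middle is never visited.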

What Is Backtracking?

Backtracking is how many regex engines recover when part of a pattern fails to match. The engine remembers each point where it had a choice, such as how many characters a quantifier like + consumed or which alternative it picked. When a later part of the pattern fails, the engine returns to the most recent choice and tries the next option. Each retry costs time, so a pattern with many choice points on a long input can become slow.
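To make the cost concrete, here is a toy model in Python of how a naive backtracking engine handles `\s+$`. This is a simplified sketch, not how any particular engine is implemented: it only counts the "is this character whitespace?" checks and ignores other bookkeeping. The function name `count_class_checks` is my own.

```python
def count_class_checks(text: str) -> int:
    """Count whitespace-class checks a naive engine makes searching for \\s+$."""
    checks = 0
    # The engine tries the pattern at every starting position in turn.
    for start in range(len(text)):
        pos = start
        while pos < len(text):
            checks += 1  # one "does text[pos] belong to \s?" check
            if not text[pos].isspace():
                break
            pos += 1
        # \s+ consumed at least one character and reached the end: $ matches.
        if pos == len(text) and pos > start:
            return checks
        # Otherwise the attempt failed and the engine restarts one position later.
    return checks


if __name__ == "__main__":
    print(count_class_checks(" " * 100 + "x"))  # 5151
    print(count_class_checks(" " * 200 + "x"))  # 20301
```

Doubling the run of spaces roughly quadruples the number of checks. Under this model, a run of 20,000 spaces followed by another character costs about 200 million checks, the same order of magnitude as the figure in the Stack Overflow quote.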

Catastrophic Backtracking

Catastrophic backtracking is the extreme case. The number of combinations the engine has to try grows quadratically, or even exponentially, with the length of the input, so a modest input can keep a CPU busy for seconds or minutes. The usual cause is nested or adjacent quantifiers that can match the same characters, such as (a+)+ or .*.*, and the cost shows up mostly on inputs that almost match but ultimately fail.

The Stack Overflow incident is a reminder of how easily a regex can go wrong. A 34-minute outage might not sound like a major issue, but regex has caused bigger problems elsewhere.
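A classic textbook example is the pattern `(a+)+$` run against a string of `a` characters followed by a `b`. The Python sketch below is a simplified model, not a real regex engine: it counts how many times an engine would end up testing `$` while trying every way to split the run of `a`s into groups. The function name `count_end_checks` is hypothetical.

```python
def count_end_checks(run_length: int) -> int:
    """Simulated `$` checks for (a+)+$ against 'a' * run_length + 'b'."""
    checks = 0
    # The next group can take run_length, run_length - 1, ..., 1 characters.
    for group_size in range(run_length, 0, -1):
        # After this group the engine tests `$`, which always fails here
        # because a 'b' (or more 'a's) still follows.
        checks += 1
        # It also tries to start another group with whatever is left.
        checks += count_end_checks(run_length - group_size)
    return checks


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, count_end_checks(n))  # 31, 1023, 32767
```

In this model every extra `a` doubles the work (the count is 2^n - 1), which is why a few dozen characters can be enough to stall a backtracking engine on a pattern like this.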

Cloudflare Outage (2019)

The Cloudflare outage on July 2, 2019, was another major example of how regex issues can lead to catastrophic failures.

An engineer wrote a regular expression prone to severe backtracking and deployed it in a new rule for Cloudflare's Web Application Firewall (WAF). The regex exhausted CPU on the machines handling HTTP traffic, driving CPU usage to 100% across Cloudflare's global network.

(?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|\-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*)))

According to Cloudflare: “The regex was designed in a way that could cause significant backtracking. The critical part of the regex, .*(?:.*=.*), involves a non-capturing group, which can lead to excessive CPU usage when processing certain patterns. This inefficiency in the regex caused severe performance issues, ultimately leading to a massive outage.”
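One way to see why that fragment is wasteful: the two adjacent `.*` pieces can split the same characters between them in many ways, yet the extra `.*` does not let the pattern match anything new. The Python sketch below compares the fragment with a simplified version on a few sample strings. This is only an illustration, not the change Cloudflare made, and the names `ORIGINAL_TAIL`, `SIMPLIFIED_TAIL`, `SAMPLES`, and `same_verdict` are my own.

```python
import re

# The fragment called out in the quote, and a version without the redundant .*
ORIGINAL_TAIL = re.compile(r".*(?:.*=.*)")
SIMPLIFIED_TAIL = re.compile(r".*=.*")

# Short, newline-free samples: with and without an equals sign.
SAMPLES = ["x=x", "=", "a=b=c", "no equals sign", "", "x" * 30, "x=" + "x" * 30]


def same_verdict(text: str) -> bool:
    """True if both patterns agree on whether the whole text matches."""
    original = ORIGINAL_TAIL.fullmatch(text) is not None
    simplified = SIMPLIFIED_TAIL.fullmatch(text) is not None
    return original == simplified


if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample[:20]), same_verdict(sample))
```

Both patterns accept exactly the strings that contain an `=`. The difference is how much work a backtracking engine does before giving up on a string that has none.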

CrowdStrike Kernel Issue (2024)

In July 2024, CrowdStrike pushed a content update for its Falcon sensor that crashed Windows systems worldwide. The fault sat in the regex-based content matching of its kernel driver, and the crashes disrupted operations across various sectors, including businesses and government systems.

The root cause was a mismatch between the number of input parameters the content template expected (21) and the number the sensor actually supplied (20) to the Content Interpreter, which processes regex-based Rapid Response Content. When a new content update required a match against the 21st parameter, the Content Interpreter read past the end of the 20-element input array, and the out-of-bounds access crashed the system.
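The missing safeguard here is a bounds check. The sketch below shows the general idea in Python. It is a hypothetical illustration and not CrowdStrike's code: the `WILDCARD` marker, the `matches` function, and the use of plain equality instead of regex matching are all my own simplifications. Python would raise an exception rather than read stray memory, but the principle of validating the input count before interpreting it is the same.

```python
from typing import Optional, Sequence

WILDCARD = None  # hypothetical marker meaning "this field matches anything"


def matches(inputs: Sequence[str], criteria: Sequence[Optional[str]]) -> bool:
    """Compare each criterion against the input at the same position."""
    # Refuse to interpret content that expects more fields than were supplied.
    if len(inputs) < len(criteria):
        raise ValueError(
            f"template expects {len(criteria)} inputs, got {len(inputs)}"
        )
    return all(
        criterion is WILDCARD or criterion == value
        for criterion, value in zip(criteria, inputs)
    )


if __name__ == "__main__":
    criteria = [WILDCARD] * 20 + ["needle"]  # 21 fields, the last one is specific
    try:
        matches(["value"] * 20, criteria)  # only 20 inputs supplied
    except ValueError as error:
        print("rejected:", error)
```

With 20 inputs and 21 criteria the call is rejected up front instead of reaching for a 21st value that does not exist.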

This incident shows the serious problems regex can cause in critical systems.

Should We Move Away from Regex?

I’m not claiming to know everything about regex. Still, these incidents show why we need to be careful with regular expressions and consider alternatives. Is it time to phase them out?

This article was originally published on https://www.trevorlasn.com/blog/when-regex-goes-wrong. It was written by a human and polished using grammar tools for clarity.