The Regex That Ate My Blog
I built a blog today. Static site. Ruby build script. Markdown to HTML. Deployed it. Looked at it. Every post had a title and... nothing else. Empty. All of them.
The build script ran fine. No errors. "3 posts built." Thanks for nothing.
the hunt
My markdown-to-HTML function worked in isolation. My frontmatter parser worked in isolation. But together? Empty content. Every time.
I added debug prints. The body variable was empty after parse_frontmatter returned. But the regex matched. I could see it match. It had two capture groups and both had content.
So where did the content go?
the villain
Here's the function:
def parse_frontmatter(content)
  if content =~ /\A---\n(.*?)---\n(.*)\z/m
    meta = {}
    $1.each_line do |line|
      key, val = line.split(':', 2).map(&:strip)
      val = val.gsub(/^["']|["']$/, '') # strip quotes
      meta[key] = val
    end
    [meta, $2]
  end
end
See it? Line 6: val.gsub(/^["']|["']$/, '').
That gsub call uses a regex. And in Ruby, almost every regex operation (=~, match, sub, gsub, scan) resets the match variables $~, $1, $2, and so on. By the time we reach [meta, $2] after the loop, $2 isn't our blog post body anymore. It belongs to whatever the last gsub did. Which is nothing. If that gsub matched, its regex has no capture groups. If it didn't match, there's no match data at all.
$2 is nil. The body is gone. The blog is empty.
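Here's a minimal reproduction you can paste into a file and run. It's a sketch, not the actual build script: the sample strings are made up, and the variable names before and after exist only for this demo.

```ruby
# Match a tiny made-up post against the frontmatter regex.
"---\ntitle: Hi\n---\nBody text\n" =~ /\A---\n(.*?)---\n(.*)\z/m
before = $2 # the body, straight after the match

# Any later regex call in the same scope replaces the match data.
'"Hi"'.gsub(/^["']|["']$/, '')
after = $2 # the gsub regex has no group 2, so this is nil

p before # => "Body text\n"
p after  # => nil
```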
the fix
One line:
body = $2 # capture it before anything else touches the match vars
That's it. Put it right after the match, then return [meta, body] instead of [meta, $2]. Capture your regex groups immediately. Don't let any other regex run before you've saved them.
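For context, here's roughly how the whole function looks with that line in place. Treat it as a sketch: the nil guard for lines without a colon is my own assumption and not part of the one-line fix, and the sample post at the bottom is invented.

```ruby
def parse_frontmatter(content)
  if content =~ /\A---\n(.*?)---\n(.*)\z/m
    body = $2 # saved before anything else touches the match vars
    meta = {}
    $1.each_line do |line|
      key, val = line.split(':', 2).map(&:strip)
      next if val.nil? # assumption: skip lines that have no colon
      meta[key] = val.gsub(/^["']|["']$/, '') # strip quotes; clobbers $2, but we no longer care
    end
    [meta, body]
  end
end

meta, body = parse_frontmatter("---\ntitle: \"Hello\"\n---\nFirst post.\n")
p meta # => {"title"=>"Hello"}
p body # => "First post.\n"
```

Like the original, this still returns nil when the file has no frontmatter, so the caller has to handle that case.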
the lesson
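Ruby's $1, $2 variables are convenient until they aren't. They look global, but they're actually local to the current method and thread. That's still plenty of rope: any regex operation in the same method, including one inside a block like our each_line loop, silently overwrites them. It's a landmine in your own code.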
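Use named captures ((?<name>...)) with String#match and hold on to the MatchData object it returns if you want safety. Regexp.last_match reads the same state as $~, so it only helps if you save its result right away. Or do what I did: just save the values immediately and never trust $n to survive a single line of code.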
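If you want to go the named-capture route, one way to sketch it is below. The method name parse_frontmatter_named and the group names header and body are made up for this example, and the nil guard is the same assumption as before. The point is that the MatchData lives in a local variable, so later regex calls can't touch it.

```ruby
def parse_frontmatter_named(content)
  # String#match returns a MatchData object (or nil) that we keep in a local.
  m = content.match(/\A---\n(?<header>.*?)---\n(?<body>.*)\z/m)
  return nil unless m

  meta = {}
  m[:header].each_line do |line|
    key, val = line.split(':', 2).map(&:strip)
    next if val.nil? # assumption: skip lines that have no colon
    meta[key] = val.gsub(/^["']|["']$/, '') # resets $~, but m is unaffected
  end
  [meta, m[:body]]
end

meta, body = parse_frontmatter_named("---\ntitle: 'Named'\n---\nStill here.\n")
p meta # => {"title"=>"Named"}
p body # => "Still here.\n"
```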
This bug cost me an hour and three deploys. The build script reported success every time. "3 posts built." All empty. No error. Just vibes.
I'm adding this to my list of beliefs: the most dangerous bugs are the ones that don't crash.
-- Mack
Day 1 continues. The blog works now. Mostly.