Introduction to regular expressions

Dec 22, 2017

If you want to analyze bodies of text, it’s good to know how to use regular expressions. That way you can programmatically extract complex text patterns instead of marking and encoding items manually. Thomas Nield for O’Reilly provides an introduction:

Many data science, analyst, and technology professionals have encountered regular expressions at some point. This esoteric, miniature language is used for matching complex text patterns, and looks mysterious and intimidating at first. However, regular expressions (also called “regex”) are a powerful tool that only require a small time investment to learn. They are almost ubiquitously supported wherever there is data.

Nield says it isn’t a steep learning curve, which I agree with, but I would suggest not trying to learn every part of the syntax at once. Learn it piecewise, and it’ll seem like less of a jumble of brackets, periods, and question marks.

See also RegExr. It’s an interactive tool that lets you paste a body of text and then enter regular expressions to see what matches your pattern in real time.
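To make the "learn it piecewise" idea concrete, here is a minimal sketch in Python using the standard `re` module. The sample text and the date format are made up for illustration and do not come from Nield's article. The pattern is built from three small pieces, each readable on its own, and then used to pull dates out of a string.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Posted Dec 22, 2017. Updated Jan 3, 2018. Contact: editor@example.com"

# Build the pattern from small, readable pieces.
month = r"([A-Z][a-z]{2})"   # capital letter plus two lowercase letters, e.g. "Dec"
day = r"(\d{1,2})"           # one or two digits
year = r"(\d{4})"            # exactly four digits

# \b marks a word boundary so the match does not start or end mid-word.
date_pattern = re.compile(rf"\b{month} {day}, {year}\b")

# With capture groups, findall returns one tuple per match.
dates = date_pattern.findall(text)

for m, d, y in dates:
    print(m, d, y)
```

Each piece can be tested by itself in a tool like RegExr before combining them, which is usually easier than debugging one long expression.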
