Online Learning Resources: Regular Expressions

This post is part of a series on online learning resources for data science and programming.

Regular expressions, or regex for short, are used to search for patterns in text. By combining ordinary characters with special ones, you can write an expression that matches not only “dog” but any word that starts with the letter d, or not only one specific phone number but any text shaped like a phone number.
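To make the two examples above concrete, here is a minimal sketch in Python. The sample sentence and the phone format (a simplified US-style number) are assumptions chosen for illustration; real phone numbers come in more shapes than this pattern covers.

```python
import re

# Made-up sample text for illustration.
text = "My dog dug a ditch; call 312-555-0142 or (847) 555-0199 about the cat."

# \b marks a word boundary, [dD] matches the letter d in either case,
# and \w* matches any remaining word characters.
d_words = re.findall(r"\b[dD]\w*", text)

# Simplified phone pattern (an assumption, not a complete validator):
# optional parentheses around a 3-digit area code, then 3 digits and
# 4 digits, each optionally separated by a space, dot, or hyphen.
phone_pattern = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")
phones = phone_pattern.findall(text)

print(d_words)  # ['dog', 'dug', 'ditch']
print(phones)   # ['312-555-0142', '(847) 555-0199']
```

The same two patterns should carry over to most other regex tools with little or no change; what differs is the function you call to run them.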

Regular expressions are an essential tool for anyone working with text data. Writing regular expressions can be a bit like casting magical spells. It takes some work to get them just right, but when you do, it can feel like magic.

Regular expressions are supported in most programming languages. The functions you call to search and match differ from language to language, as do some syntax details, but the core pattern syntax is largely the same.

RegexOne: learn regular expressions online through short exercises that clearly show you what the regular expressions you write are matching. Focuses on general principles applicable across programming languages. If you want to get started writing regular expressions and seeing how they work right away, start here.

Mastering Regular Expressions, by Jeffrey E.F. Friedl. This book is available through the Northwestern Library as part of O’Reilly Books Online. Several chapters apply across programming languages, and others detail how regular expressions are implemented in Java, Perl, and a few other languages. You can learn the basics from the introductory chapter, or keep reading to learn how regular expressions are actually processed by different systems.

R for Data Science: Strings, by Hadley Wickham and Garrett Grolemund. This book covers the basics of regular expressions and how to apply them in R with the stringr package. A great place to get started if you are an R user.

RegExr: trying to build a regular expression and want to see if it’s working? RegExr lets you test an expression against text that you paste into the tool, or build a set of positive and negative examples to test your expression against. It will also tell you what different components of a regular expression mean, which can be helpful if you’re trying to decipher a regular expression someone else has written.

Regular Expressions: Regexes in Python by John Sturtz. This tutorial explains how to use the re module in Python. For some additional practice using regular expressions in Python, see the string patterns lesson from our Next Steps in Python workshop series.
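As a rough illustration of what working with Python's re module looks like, the sketch below first checks a pattern against examples that should and should not match, then applies it to a line of text. The date pattern, the example strings, and the sample line are all hypothetical and are not taken from any of the resources above.

```python
import re

# Hypothetical pattern for ISO-style dates (YYYY-MM-DD). It checks the shape
# only, so an impossible date such as 2023-99-99 would still match.
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

should_match = ["2023-05-17", "1999-12-31"]
should_not_match = ["2023-5-17", "17-05-2023", "2023/05/17"]

# fullmatch succeeds only if the whole string fits the pattern.
unexpected = [s for s in should_match if date_pattern.fullmatch(s) is None]
unexpected += [s for s in should_not_match if date_pattern.fullmatch(s) is not None]
print(unexpected)  # [] means every example behaved as intended

line = "Order 1042 shipped on 2023-05-17; order 1043 ships on 2023-06-02."

# search returns the first match object (or None); groups() gives the captured parts.
first = date_pattern.search(line)
year, month, day = first.groups()

# finditer yields every non-overlapping match; group(0) is the full matched text.
all_dates = [m.group(0) for m in date_pattern.finditer(line)]

# sub replaces each match with the given string.
masked = date_pattern.sub("<date>", line)

print(year, month, day)  # 2023 05 17
print(all_dates)         # ['2023-05-17', '2023-06-02']
print(masked)            # Order 1042 shipped on <date>; order 1043 ships on <date>.
```

Compiling the pattern once with re.compile is a design choice rather than a requirement; the module-level functions such as re.search accept the pattern string directly.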

Stuck?

If you have a question about using regular expressions, remember that you can always request a free consultation with our data science consultants.