Three Ways to Warm Up to Regex

Despite my initial fear of regex, I now take delight in using it to parse data. Don’t get me wrong: regex can still be challenging, but writing a pattern that matches exactly what you want is exhilarating. In this blog post, I will share how I grew to like regex through exposure and practice.

Review RegexOne

I recently learned about RegexOne, a site where I can learn regular expressions with simple, interactive exercises. The lessons are organized and scaffolded well. Further, the Lesson Notes sidebar is always available to reference. This is a good place to start learning and building confidence in regex.

Practice through Alteryx Challenges

Practice makes better. You can take on the same challenges that Bianca Ng tackles in her Parsing and Data Prep series.

In addition, I recommend Challenge #35: Data Cleansing Practice. There are four opportunities to exercise regex in a single challenge!

Retrieve data via an API

This is the most practical use of regex. After downloading data from an API, the data usually needs to be reshaped. To crosstab it from a tall table to a wide one, it helps to group the records by an ID. Parsing elements out of the JSON_Name column gives us that ID.

Let’s go through an example.

After using the Download tool on an API’s URL and a JSON Parse tool to output values into data type specific fields, my table looks like this:


Let’s use a Regex tool to parse out parts of the JSON_Name column.


The number in the middle (highlighted in yellow) can be the Spell ID – Spell ID 0 is Acid Arrow and Spell ID 1 is Acid Splash. Additionally, the word to the right of the number (highlighted in orange) would make a perfect column name.
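For readers who want to try the same idea outside of Alteryx, here is a minimal Python sketch of the parsing step. It assumes the JSON_Name values look like `results.0.name` (a prefix, a number, then a word, separated by dots). That format, the sample values, and the function name `parse_json_name` are my assumptions for illustration, so adjust the pattern to whatever your API returns.

```python
import re

# Assumed shape of JSON_Name: <prefix>.<number>.<word>, e.g. "results.0.name".
# Group 1 captures the number in the middle (the Spell ID);
# group 2 captures the word to its right (the future column name).
JSON_NAME_PATTERN = re.compile(r"^[^.]+\.(\d+)\.(\w+)$")


def parse_json_name(json_name):
    """Return (spell_id, field), or (None, None) if the name has no number in the middle."""
    match = JSON_NAME_PATTERN.match(json_name)
    if match is None:
        return None, None
    return int(match.group(1)), match.group(2)


# Hypothetical sample values, not taken from a real API response.
for name in ["count", "results.0.name", "results.0.url", "results.1.name"]:
    print(name, parse_json_name(name))
```

Names without a number in the middle, such as the hypothetical `count`, come back as `(None, None)`, which makes them easy to filter out in the next step.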


After parsing out those elements from the JSON_Name column, we can filter out records without a Spell ID and then crosstab the data, like so:
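The filter and crosstab steps can also be sketched in plain Python. This is only an illustration under assumptions: the JSON_Name format (`results.0.name`), the `url` values, and the helper name `crosstab` are hypothetical, while the two spell names come from the example above.

```python
import re
from collections import defaultdict

# Assumed shape of JSON_Name: <prefix>.<number>.<word>, e.g. "results.0.name".
PATTERN = re.compile(r"^[^.]+\.(\d+)\.(\w+)$")

# Hypothetical tall table of (JSON_Name, value) records.
tall = [
    ("count", "2"),
    ("results.0.name", "Acid Arrow"),
    ("results.0.url", "/spells/acid-arrow"),
    ("results.1.name", "Acid Splash"),
    ("results.1.url", "/spells/acid-splash"),
]


def crosstab(records):
    """Pivot tall records into one row (a dict) per Spell ID."""
    wide = defaultdict(dict)
    for json_name, value in records:
        match = PATTERN.match(json_name)
        if match is None:
            continue  # filter out records without a Spell ID
        spell_id = int(match.group(1))
        field = match.group(2)
        wide[spell_id][field] = value  # the parsed word becomes the column name
    return dict(wide)


wide_table = crosstab(tall)
for spell_id in sorted(wide_table):
    print(spell_id, wide_table[spell_id])
```

Each Spell ID ends up as one row, with the parsed words acting as column headers, which mirrors what the Filter and Cross Tab tools do in the workflow.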


The results are exactly what I was looking for to continue my workflow:


If you are intimidated by an entire API project, you can try Alteryx Challenge #7: Download Data and Parse JSON instead.


There are many benefits to knowing regex, whether you’re searching for specific patterns or parsing data from a website. Regex can be a powerful tool for automating and streamlining your Alteryx workflows. With a little practice and patience, you too can enjoy regex. Happy coding!

Author:
Elaine Yuan