Basics of Regex

Today, at The Data School, I had the opportunity to learn Regular Expressions (Regex) for the first time, and the concept was entirely new to me. As a data professional, I was excited and curious to explore such a powerful tool. The underlying principles were challenging to grasp at first, but with theoretical discussions and hands-on practice I began to understand its basic building blocks.

In this blog, I aim to reflect on my learning with Regex and share some of the valuable expressions I discovered along the way.

Regex, short for Regular Expression, is a pattern-matching tool used in data processing to search for, find, and manipulate specific patterns or sequences of characters within text. It is more versatile than a plain text search and is widely supported, so it can be used across programming and data analytics platforms such as Alteryx, Tableau, Python, and R, although the exact syntax can vary slightly between them.

Here are some of the major expressions I learned today to find specific patterns of characters within the text.

Qualifier Expressions

Qualifiers specify which kind of character should match at a given position in the text. (They are more commonly known as character classes.)

Qualifiers    Represents
.             Wildcard (any character, usually except a newline)
\w            Alphanumeric character or _
\d            Digit (0-9)
\s            Whitespace (space, tab, newline)
\W            Not alphanumeric or _
\D            Not a digit
\S            Not whitespace
[A-Z]         Uppercase letters from A to Z
[a-z]         Lowercase letters from a to z
[0-9]         Digits from 0 to 9
[abc]         a or b or c (any one character from the set)
[^abc]        Any one character except those in the set
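
To make the table above concrete, here is a minimal sketch in Python using the standard `re` module. Python is my choice for illustration only; the sentence is the same one used in the examples later in this post, and the variable names are hypothetical.

```python
import re

text = "I learned Regex in The Data School on July 24."

# \d matches a single digit, so each digit is returned separately
digits = re.findall(r"\d", text)

# [A-Z] matches one uppercase letter at a time
capitals = re.findall(r"[A-Z]", text)

# \s matches whitespace; this sentence only contains plain spaces
whitespace = re.findall(r"\s", text)

# \W matches anything that is not alphanumeric or _ (spaces and the full stop)
non_word = re.findall(r"\W", text)

# [^a-z\s] matches any character that is neither a lowercase letter nor whitespace
not_lower_or_space = re.findall(r"[^a-z\s]", text)

print(digits)              # ['2', '4']
print(capitals)            # ['I', 'R', 'T', 'D', 'S', 'J']
print(len(whitespace))     # 9
print(not_lower_or_space)  # ['I', 'R', 'T', 'D', 'S', 'J', '2', '4', '.']
```

Each of these expressions matches exactly one character per match, which is why quantifiers are needed to match whole words or numbers.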

Quantifier Expressions

Quantifiers specify how many times the preceding character or pattern may occur in the text.

Quantifiers   Represents
+             One or more
*             Zero or more
?             Zero or one (when used after a qualifier)
?             As few as possible (when used after another quantifier)
{x}           Exactly x times
{x,y}         Between x and y times
{x,}          At least x times
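
The quantifiers above can be tried out with a short sketch in Python's `re` module. The sample strings "color colour" and "<a><b>" are my own additions for illustration and are not part of the original exercise.

```python
import re

text = "I learned Regex in The Data School on July 24."

# + : one or more digits, so the whole number is returned as one match
numbers = re.findall(r"\d+", text)

# {3,} : a capital letter followed by at least three lowercase letters
long_words = re.findall(r"[A-Z][a-z]{3,}", text)

# {2} : a capital letter followed by exactly two lowercase letters
prefixes = re.findall(r"[A-Z][a-z]{2}", text)

# ? after a character: the "u" is optional (zero or one)
spellings = re.findall(r"colou?r", "color colour")

# .+ is greedy (as much as possible); .+? is lazy (as few as possible)
greedy = re.findall(r"<.+>", "<a><b>")
lazy = re.findall(r"<.+?>", "<a><b>")

print(numbers)     # ['24']
print(long_words)  # ['Regex', 'Data', 'School', 'July']
print(prefixes)    # ['Reg', 'The', 'Dat', 'Sch', 'Jul']
print(spellings)   # ['color', 'colour']
print(greedy)      # ['<a><b>']
print(lazy)        # ['<a>', '<b>']
```

The last two lines show the two meanings of `?`: after a character it makes that character optional, and after a quantifier it switches the match from greedy to lazy.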

Others

Others        Represents
|             OR operator
^             Start anchor (start of the text)
$             End anchor (end of the text)
(…)           Capture group
\             Escape character (treats the next special character literally)
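
As a rough illustration of these remaining expressions, the sketch below again uses Python's `re` module on the sample sentence from this post. The pattern `on (\w+) (\d+)` is a hypothetical example of mine, not one from the class.

```python
import re

text = "I learned Regex in The Data School on July 24."

# | : match either alternative
either = re.findall(r"Regex|July", text)

# ^ : the pattern must match at the start of the text
starts_with_i = re.search(r"^I", text) is not None
starts_with_regex = re.search(r"^Regex", text) is not None

# $ and \ : the text must end with "24" followed by a literal full stop
ends_with_24 = re.search(r"24\.$", text) is not None

# (...) : capture groups return only the parts inside the brackets
match = re.search(r"on (\w+) (\d+)", text)
month, day = match.group(1), match.group(2)

# An unescaped . matches every character; an escaped \. matches only the full stop
any_char_count = len(re.findall(r".", text))
literal_dots = re.findall(r"\.", text)

print(either)             # ['Regex', 'July']
print(starts_with_i)      # True
print(starts_with_regex)  # False
print(ends_with_24)       # True
print(month, day)         # July 24
print(literal_dots)       # ['.']
```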

Let's see some examples based on the following text:

I learned Regex in The Data School on July 24.

  • Expression to capture the text "Regex" is [A-Z]\w.+x
  • Expression to capture the number "24" is \d\d
  • Expression to capture the text "July" is (J\w.+)\s
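
The three expressions above can be checked with a short Python sketch using the `re` module. The second sentence (`longer_text`) and the tighter patterns are my own assumptions, added to show that `.+` is greedy: the original expressions work here because this sentence contains only one "x" and only one space after "J".

```python
import re

text = "I learned Regex in The Data School on July 24."

# The three expressions from the list above
word = re.search(r"[A-Z]\w.+x", text).group()
number = re.search(r"\d\d", text).group()
month = re.search(r"(J\w.+)\s", text).group(1)  # group(1) is the capture group

print(word, number, month)  # Regex 24 July

# A hypothetical longer sentence where the greedy .+ overshoots
longer_text = "I learned Regex and fixed a box on July 24 2023."

greedy_word = re.search(r"[A-Z]\w.+x", longer_text).group()
greedy_month = re.search(r"(J\w.+)\s", longer_text).group(1)

# Replacing .+ with \w+ keeps each match inside a single word
tight_word = re.search(r"[A-Z]\w+x", longer_text).group()
tight_month = re.search(r"(J\w+)\s", longer_text).group(1)

print(greedy_word)   # Regex and fixed a box
print(greedy_month)  # July 24
print(tight_word)    # Regex
print(tight_month)   # July
```

Using `\w+` instead of `.+` is one possible way to make the patterns less dependent on the exact sentence; it is a suggestion rather than the approach taught in the session.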

Author: Nitesh Shrestha