Python Regular Expressions: Complete Guide with Examples

Master Python regular expressions, a powerful tool for pattern matching and text processing. Learn regex syntax and the re module functions, then apply them to real-world scenarios.

What is a regular expression?

A regular expression (regex) is a sequence of characters used to:

  • search text
  • match patterns
  • validate data
  • extract information

Python provides the built-in module re for working with regex.

Why use regex?

Regex is useful for:

  • Email validation
  • Password validation
  • Searching words
  • Finding numbers
  • Data extraction
  • Text replacement

Importing the re module

import re

Basic re functions

Function        Purpose
re.match()      Match at beginning
re.search()     Search anywhere
re.findall()    Return all matches
re.finditer()   Return iterator of matches
re.sub()        Replace matches
re.split()      Split string
re.compile()    Compile regex pattern

1. re.match()

Matches the pattern only at the beginning of the string.

Match at start
import re

text = "Python is easy"

result = re.match("Python", text)

print(result)

Output:

<re.Match object; span=(0, 6), match='Python'>

Match failure

If the pattern is not at the start, re.match() returns None.

Pattern not at beginning
import re

text = "I love Python"

result = re.match("Python", text)

print(result)

Output:

None
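
2. re.search()

The function table lists re.search() as searching anywhere in the string. As a minimal sketch, reusing the text from the failed re.match() example, it finds the pattern even though it is not at the start:

```python
import re

text = "I love Python"

# re.search() scans the whole string and returns the first match, or None
result = re.search("Python", text)

print(result)         # <re.Match object; span=(7, 13), match='Python'>
print(result.span())  # (7, 13)
```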

3. re.findall()

Returns all matches as a list.

All occurrences
import re

text = "cat bat mat"

result = re.findall("at", text)

print(result)

Output:

['at', 'at', 'at']

4. re.finditer()

Returns an iterator of match objects (memory-friendly for many matches).

Iterator of matches
import re

text = "cat bat mat"

matches = re.finditer("at", text)

for match in matches:
    print(match.start())

Output:

1
5
9
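
Each match object also carries the matched text and its end position. A small sketch (the variable name positions is only for illustration) collects all three:

```python
import re

text = "cat bat mat"

# Collect (matched text, start, end) for every match; end is exclusive
positions = [(m.group(), m.start(), m.end()) for m in re.finditer("at", text)]

print(positions)  # [('at', 1, 3), ('at', 5, 7), ('at', 9, 11)]
```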

5. re.sub()

Replaces matched text with a replacement string.

Replace matches
import re

text = "I like Java"

result = re.sub("Java", "Python", text)

print(result)

Output:

I like Python
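
re.sub() also accepts an optional count argument that limits how many matches are replaced. A brief sketch with made-up sample text:

```python
import re

text = "Java, Java, Java"

# count=1 replaces only the first match; the default (0) replaces all
first_only = re.sub("Java", "Python", text, count=1)
all_matches = re.sub("Java", "Python", text)

print(first_only)   # Python, Java, Java
print(all_matches)  # Python, Python, Python
```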

6. re.split()

Splits a string using a regex pattern as the delimiter.

Split on comma or semicolon
import re

text = "apple,banana;orange"

result = re.split("[,;]", text)

print(result)

Output:

['apple', 'banana', 'orange']
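
Real input often has spaces around the delimiters. One possible way to handle that, shown here with made-up sample text, is to make the surrounding whitespace part of the delimiter pattern:

```python
import re

text = "apple, banana ;orange"

# \s* absorbs any whitespace on either side of the comma or semicolon
parts = re.split(r"\s*[,;]\s*", text)

print(parts)  # ['apple', 'banana', 'orange']
```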

Special regex characters

Symbol   Meaning
.        Any character except newline
^        Start of string
$        End of string
*        Zero or more
+        One or more
?        Zero or one
{}       Repetition count, e.g. {n} or {m,n}
[]       Character set
|        OR operator
()       Grouping
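
A few of these symbols in action. This is only an illustrative sketch; the sample text and variable names are made up:

```python
import re

text = "cat hat cut"

any_char = re.findall(r"c.t", text)    # . stands for any one character
at_start = re.findall(r"^cat", text)   # ^ anchors at the start of the string
at_end = re.findall(r"cut$", text)     # $ anchors at the end of the string
either = re.findall(r"cat|hat", text)  # | matches either alternative

print(any_char)  # ['cat', 'cut']
print(at_start)  # ['cat']
print(at_end)    # ['cut']
print(either)    # ['cat', 'hat']
```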

Character classes

Pattern   Meaning
\d        Digit
\D        Non-digit
\w        Word character
\W        Non-word character
\s        Whitespace
\S        Non-whitespace

Example: digits

Find digits
import re

text = "Age is 25"

print(re.findall(r"\d", text))

Output:

['2', '5']

Example: word characters

Word characters
import re

text = "Python123"

print(re.findall(r"\w", text))

Output:

['P', 'y', 't', 'h', 'o', 'n', '1', '2', '3']
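
The remaining classes from the table behave the same way. A short sketch on a made-up string:

```python
import re

text = "A1 b2"

non_digits = re.findall(r"\D", text)  # everything that is not a digit
spaces = re.findall(r"\s", text)      # whitespace characters
non_spaces = re.findall(r"\S", text)  # everything that is not whitespace
non_words = re.findall(r"\W", text)   # everything that is not a word character

print(non_digits)  # ['A', ' ', 'b']
print(spaces)      # [' ']
print(non_spaces)  # ['A', '1', 'b', '2']
print(non_words)   # [' ']
```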

Quantifiers

Quantifier   Meaning
*            0 or more
+            1 or more
?            0 or 1
{n}          Exactly n times
{m,n}        Between m and n

Example: +

One or more
import re

text = "Hellooo"

print(re.findall(r"o+", text))

Output:

['ooo']

Example: *

Zero or more
import re

text = "Hellooo"

print(re.findall(r"lo*", text))

Output:

['l', 'looo']

The first l is followed by zero o characters, so it matches on its own.
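
The other quantifiers from the table, ?, {n}, and {m,n}, can be sketched the same way. The sample strings here are made up for illustration:

```python
import re

optional = re.findall(r"colou?r", "color colour")  # ? makes the u optional
exact = re.findall(r"\d{3}", "12 345 6789")        # exactly three digits
ranged = re.findall(r"\d{2,3}", "1 22 333 4444")   # two or three digits, as many as possible

print(optional)  # ['color', 'colour']
print(exact)     # ['345', '678']
print(ranged)    # ['22', '333', '444']
```

Note that findall() scans left to right without overlapping, so 6789 yields only 678 and the leftover 9 is too short to match.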

Character sets

Match cat or bat, not mat
import re

text = "cat bat mat"

print(re.findall(r"[cb]at", text))

Output:

['cat', 'bat']
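
Character sets also support ranges such as [a-m] and negation with a leading ^. A minimal sketch, using a slightly extended sample string:

```python
import re

text = "cat bat mat c4t"

in_range = re.findall(r"[a-m]at", text)  # any letter from a to m before "at"
negated = re.findall(r"[^cb]at", text)   # any character except c or b before "at"
digit = re.findall(r"c[0-9]t", text)     # a digit between c and t

print(in_range)  # ['cat', 'bat', 'mat']
print(negated)   # ['mat']
print(digit)     # ['c4t']
```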

Validation examples

Email

Email validation
import re

email = "test@gmail.com"

pattern = r"^[\w\.-]+@[\w\.-]+\.\w+$"

if re.match(pattern, email):
    print("Valid Email")
else:
    print("Invalid Email")

Output:

Valid Email
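
One caveat: $ also matches just before a trailing newline, so re.match() with the pattern above accepts "test@gmail.com\n". A possible alternative is re.fullmatch(), which requires the whole string to match. The sketch below uses a hypothetical helper name, is_valid_email, and the same simplified pattern, which is not a complete email validator:

```python
import re

# Same simplified pattern as above, without the ^ and $ anchors
pattern = r"[\w\.-]+@[\w\.-]+\.\w+"

def is_valid_email(value):  # hypothetical helper, for illustration only
    # fullmatch() succeeds only if the entire string matches the pattern
    return re.fullmatch(pattern, value) is not None

print(is_valid_email("test@gmail.com"))    # True
print(is_valid_email("test@gmail.com\n"))  # False
print(is_valid_email("test@gmail"))        # False
```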

Mobile number

10-digit mobile
import re

mobile = "9876543210"

pattern = r"^[0-9]{10}$"

if re.match(pattern, mobile):
    print("Valid Number")
else:
    print("Invalid Number")

Output:

Valid Number

Password strength

Password pattern
import re

password = "Pass@123"

pattern = r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d).{8,}$"

if re.match(pattern, password):
    print("Strong Password")
else:
    print("Weak Password")

Output:

Strong Password
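
Each (?=...) in the pattern is a lookahead: it requires something to appear later in the string without consuming any characters. To show what the pattern checks, the same rules can be written as separate searches. This is only a sketch; the helper name password_problems and the messages are made up:

```python
import re

def password_problems(password):  # hypothetical helper, for illustration only
    # One simple pattern per rule, mirroring the lookaheads in the pattern above
    checks = [
        (r"[A-Z]", "needs an uppercase letter"),
        (r"[a-z]", "needs a lowercase letter"),
        (r"\d", "needs a digit"),
        (r".{8,}", "needs at least 8 characters"),
    ]
    return [message for rule, message in checks if not re.search(rule, password)]

print(password_problems("Pass@123"))  # []
print(password_problems("pass"))
# ['needs an uppercase letter', 'needs a digit', 'needs at least 8 characters']
```

Separate checks are longer than one combined pattern, but they can tell the user which rule failed.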

Raw strings in regex

Use an r prefix before the opening quote, as in r"...", so that Python passes backslashes to the regex engine unchanged. This avoids escape-sequence problems with patterns such as \d and \w.

Raw string pattern
pattern = r"\d+"

# Same idea in use:
import re
print(re.findall(pattern, "Room 404"))

Output:

['404']
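
The difference matters most with \b: in a normal string Python turns it into a backspace character, while in a raw string it reaches the regex engine as a word boundary. A small sketch with made-up sample text:

```python
import re

text = "cat concat"

plain = re.findall("\bcat\b", text)  # "\b" is a backspace character here
raw = re.findall(r"\bcat\b", text)   # r"\b" is a regex word boundary

print(plain)  # []
print(raw)    # ['cat']
```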

Groups in regex

Parentheses () capture parts of the match; use .group(1), .group(2), … to read them.

Date groups
import re

text = "2026-05-09"

pattern = r"(\d{4})-(\d{2})-(\d{2})"

match = re.search(pattern, text)

print(match.group(1))
print(match.group(2))
print(match.group(3))

Output:

2026
05
09
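
Groups can also be named with (?P<name>...), and it is safer to check for None before reading them. A minimal sketch; the group names and sample text are chosen only for illustration:

```python
import re

pattern = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"

match = re.search(pattern, "Released on 2026-05-09")

if match:  # re.search() returns None when nothing matches
    print(match.group("year"))  # 2026
    print(match.groups())       # ('2026', '05', '09')
    print(match.groupdict())    # {'year': '2026', 'month': '05', 'day': '09'}
```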

re.compile()

Compile a pattern once and reuse it. This is clearer when you run many operations with the same pattern, and it can be faster because the pattern does not have to be looked up or recompiled on each call.

Compiled pattern
import re

pattern = re.compile(r"\d+")

print(pattern.findall("A1 B22 C333"))

Output:

['1', '22', '333']
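
A compiled pattern offers the same operations as the module-level functions (findall, sub, search, and so on) as methods. A short sketch of reusing one compiled object across several inputs; the names number and lines are made up:

```python
import re

number = re.compile(r"\d+")

lines = ["A1 B22", "no digits", "C333"]

# The same compiled object is reused for every line
found = [number.findall(line) for line in lines]
masked = [number.sub("#", line) for line in lines]

print(found)   # [['1', '22'], [], ['333']]
print(masked)  # ['A# B#', 'no digits', 'C#']
```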

Flags in regex

Flag   Purpose
re.I   Ignore case
re.M   Multiline
re.S   Dot matches newline

Ignore case

re.I
import re

text = "PYTHON"

print(re.findall("python", text, re.I))

Output:

['PYTHON']
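
The other two flags can be sketched in the same way. With re.M, ^ and $ also match at the start and end of each line; with re.S, the dot matches newline characters as well. The sample text is made up:

```python
import re

text = "first line\nsecond line"

# Without re.M, ^ matches only at the very start of the string
no_flag = re.findall(r"^\w+", text)
multiline = re.findall(r"^\w+", text, re.M)

# Without re.S, . stops at the newline
dot_default = re.findall(r"first.*", text)
dot_all = re.findall(r"first.*", text, re.S)

print(no_flag)      # ['first']
print(multiline)    # ['first', 'second']
print(dot_default)  # ['first line']
print(dot_all)      # ['first line\nsecond line']
```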

Advantages of regex

  • Powerful text processing
  • Fast searching
  • Data validation
  • Pattern matching
  • Reduces code complexity

Disadvantages of regex

  • Hard to read for beginners
  • Complex patterns can confuse
  • Debugging may be difficult

Real-life applications

Application        Example
Validation         Email, password
Search engine      Keyword search
Data extraction    Logs, reports
Text replacement   Editing documents

Regex key takeaways

  • Use import re; prefer raw strings r"..." for patterns.
  • re.match() checks the start; re.search() finds the first match anywhere.
  • re.findall() returns strings; re.finditer() yields match objects with positions.
  • re.sub() and re.split() transform and split text by pattern.
  • re.compile() is ideal when reusing the same pattern many times.
  • Combine character classes, quantifiers, and groups for validation and parsing.