Python RegEx Tutorial: Mastering Regular Expressions

Last updated 2 weeks, 2 days ago

Tags:- Python

Regular Expressions (RegEx or regex) are powerful tools for pattern matching and text manipulation. Python’s built-in re module enables you to work with regex seamlessly.

Whether you're validating emails, parsing logs, or scraping data, regex helps you process strings efficiently and flexibly.


What Is a Regular Expression?

A regular expression is a string that describes a search pattern. You can use it to match, find, replace, or split text based on rules that plain substring search cannot express.

Example:

import re

pattern = r"\d+"
text = "There are 123 apples"
result = re.findall(pattern, text)
print(result)  # Output: ['123']

Python re Module Functions

Function        Description
re.match()      Checks for a match at the beginning of a string
re.search()     Searches the entire string for a match
re.findall()    Returns all non-overlapping matches
re.finditer()   Returns an iterator over all matches
re.sub()        Replaces matches
re.split()      Splits a string using a pattern
re.compile()    Compiles a regex pattern for reuse
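The sections below show most of these functions, but not re.finditer() or re.split(). A minimal sketch of both, using made-up sample strings, might look like this:

```python
import re

text = "Order 66 shipped on day 12, order 7 on day 30"

# re.finditer() yields match objects one at a time, so the position
# of each match is available alongside the matched text.
numbers = [(m.group(), m.start()) for m in re.finditer(r"\d+", text)]
print(numbers)

# re.split() cuts the string wherever the pattern matches.
parts = re.split(r"\s*,\s*", "red, green ,blue")
print(parts)  # ['red', 'green', 'blue']
```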

Regex Syntax Basics

Pattern   Meaning                        Example
.         Any character except newline   a.c → matches abc, axc
^         Start of string                ^Hello
$         End of string                  world$
*         0 or more                      ab*c → matches ac, abc, abbc
+         1 or more                      ab+c → matches abc, abbc
?         0 or 1                         colou?r → matches color, colour
[]        Character set                  [aeiou]
|         OR                             cat|dog → matches cat or dog
\d        Digit                          [0-9]
\w        Word character                 [a-zA-Z0-9_]
\s        Whitespace                     space, tab, newline
{n}       Exactly n times                \d{3}
(…)       Group                          (\d{3})-(\d{2})
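To see how the quantifiers behave, the following sketch checks a few patterns against sample strings. The sample strings and the cat|dog alternation are illustrative additions, and re.fullmatch() is used so that a string counts only when the whole string fits the pattern:

```python
import re

# Sample strings are made up for illustration.
samples = {
    r"ab*c": ["ac", "abc", "abbc", "adc"],
    r"ab+c": ["ac", "abc", "abbc"],
    r"colou?r": ["color", "colour", "colouur"],
    r"\d{3}": ["123", "12"],
    r"cat|dog": ["cat", "dog", "cow"],
}

results = {}
for pattern, candidates in samples.items():
    # Keep only the candidates that the pattern matches in full.
    results[pattern] = [s for s in candidates if re.fullmatch(pattern, s)]
    print(pattern, "->", results[pattern])
```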

Using re.match()

Matches pattern only at the start of the string.

import re

result = re.match(r"Hello", "Hello World")
print(result.group())  # Output: Hello

Using re.search()

Searches for the pattern anywhere in the string.

result = re.search(r"World", "Hello World")
print(result.group())  # Output: World
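Both functions return None when nothing matches, so calling .group() without a check can raise an AttributeError. A short sketch of the difference between the two, with a guard before using the result:

```python
import re

text = "Hello World"

# re.match() only tries at position 0, so "World" is not found.
at_start = re.match(r"World", text)
print(at_start)  # None

# re.search() scans forward until it finds a match.
found = re.search(r"World", text)
if found:  # guard against None before calling .group()
    print(found.group(), found.span())  # World (6, 11)
```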

Using re.findall()

Returns a list of all matches.

text = "Contact: 123-456-7890 or 987-654-3210"
phones = re.findall(r"\d{3}-\d{3}-\d{4}", text)
print(phones)

Output:

['123-456-7890', '987-654-3210']

Using re.sub() for Replacements

text = "The price is $100"
new_text = re.sub(r"\$\d+", "$XXX", text)
print(new_text)  # Output: The price is $XXX
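re.sub() also accepts a function as the replacement and an optional count argument. The sketch below assumes a hypothetical helper, double_price, to show both:

```python
import re

def double_price(match):
    # group(1) is the run of digits captured after the dollar sign.
    return "$" + str(int(match.group(1)) * 2)

text = "Apples cost $3 and pears cost $12"

# The function is called once per match and its return value is inserted.
doubled = re.sub(r"\$(\d+)", double_price, text)
print(doubled)  # Apples cost $6 and pears cost $24

# count=1 limits the substitution to the first match only.
first_only = re.sub(r"\$\d+", "$XXX", text, count=1)
print(first_only)  # Apples cost $XXX and pears cost $12
```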

Grouping with () and Accessing with .group()

text = "Name: John, Age: 30"
match = re.search(r"Name: (\w+), Age: (\d+)", text)
if match:
    print(match.group(1))  # Output: John
    print(match.group(2))  # Output: 30
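Numbered groups become hard to follow as patterns grow. One option is named groups, written (?P<name>…), which can be read back by name. A small sketch on the same text:

```python
import re

text = "Name: John, Age: 30"
match = re.search(r"Name: (?P<name>\w+), Age: (?P<age>\d+)", text)
if match:
    print(match.group("name"))  # John
    # groupdict() returns every named group as a dictionary.
    print(match.groupdict())    # {'name': 'John', 'age': '30'}
```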

⚡ Using re.compile() for Reusability

pattern = re.compile(r"\d{4}-\d{2}-\d{2}")
dates = ["2024-01-01", "Date: 2025-05-07"]
for date in dates:
    match = pattern.search(date)
    if match:
        print(match.group())

Real-World Example: Validate Email Address

import re

def is_valid_email(email):
    pattern = r"^[\w\.-]+@[\w\.-]+\.\w{2,}$"
    return re.match(pattern, email) is not None

print(is_valid_email("user@example.com"))  # True (placeholder address)
print(is_valid_email("bad@email"))         # False
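One subtlety: $ also matches just before a trailing newline, so the anchored pattern above accepts "user@example.com\n". A stricter variant, sketched here with a hypothetical is_valid_email_strict helper, uses re.fullmatch() instead of anchors:

```python
import re

EMAIL_PATTERN = re.compile(r"[\w.-]+@[\w.-]+\.\w{2,}")

def is_valid_email_strict(email):
    # fullmatch() requires the pattern to cover the entire string,
    # so no ^ or $ anchors are needed.
    return EMAIL_PATTERN.fullmatch(email) is not None

# The anchored version lets a trailing newline through.
loose = re.match(r"^[\w\.-]+@[\w\.-]+\.\w{2,}$", "user@example.com\n") is not None
print(loose)                                         # True
print(is_valid_email_strict("user@example.com\n"))   # False
print(is_valid_email_strict("user@example.com"))     # True
```

This is still a simple format check, not a full validation of what the email standards allow.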

⚠️ Common Pitfalls

Pitfall                             Issue                                       Solution
Using match() instead of search()   match() only checks the start of a string   Use search() to scan the whole string
Forgetting raw strings (r"…")       Backslashes may be interpreted by Python    Always use raw strings
Greedy matches                      .* consumes too much                        Use .*? for non-greedy matching
Misusing character sets             [abc] is not the same as (abc)              Use () for grouping, [] for sets
Overusing regex                     Simple substring search does not need it    Use the in operator; use regex only when needed
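The greedy-match pitfall is easiest to see side by side. In this sketch (the HTML snippet is a made-up sample), .* runs to the last > in the string, while .*? stops at the first one:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .* grabs as much as possible, so one match spans the whole string.
greedy = re.findall(r"<.*>", html)
print(greedy)  # ['<b>bold</b> and <i>italic</i>']

# Non-greedy: .*? grabs as little as possible, so each tag matches separately.
lazy = re.findall(r"<.*?>", html)
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```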

Tips for Working with RegEx in Python

  • ✅ Always prefix patterns with r"…", e.g., r"\d+"

  • ✅ Use re.compile() for performance in repeated matches

  • ✅ Break long patterns using verbose mode (re.VERBOSE)

  • ✅ Use finditer() for large texts (returns match objects lazily)

  • ✅ Test patterns online (e.g., regex101.com)
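As an illustration of the verbose-mode tip, here is a sketch of the date pattern from earlier rewritten with re.VERBOSE. Whitespace inside the pattern is ignored and # starts a comment; the group names are illustrative additions:

```python
import re

date_pattern = re.compile(r"""
    (?P<year>\d{4})  -   # four-digit year
    (?P<month>\d{2}) -   # two-digit month
    (?P<day>\d{2})       # two-digit day
""", re.VERBOSE)

m = date_pattern.search("Date: 2025-05-07")
if m:
    print(m.groupdict())  # {'year': '2025', 'month': '05', 'day': '07'}
```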


Complete Working Code Example

import re

def extract_emails(text):
    pattern = r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"
    return re.findall(pattern, text)

def anonymize_emails(text):
    return re.sub(r"([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)", r"\1@***.com", text)

# Example usage
text = "Contact us at support@example.com or sales@example.org"  # placeholder addresses
emails = extract_emails(text)
print("Found emails:", emails)

anonymized = anonymize_emails(text)
print("Anonymized text:", anonymized)

Summary Table

Method          Use
re.match()      Match from the beginning
re.search()     Search anywhere in string
re.findall()    Find all non-overlapping matches
re.sub()        Replace matched patterns
re.split()      Split by pattern
re.compile()    Compile regex for reuse

Conclusion

Python’s re module gives you powerful pattern matching capabilities. With regex, you can automate everything from simple string validation to complex data extraction.

Start small, build your patterns step by step, and always test thoroughly!

 

Tips and Tricks


What is pass in Python?

Python | Pass Statement

The pass statement is a placeholder for future code. It is a null operation: nothing happens when it executes. It is generally used to fill blocks that Python requires syntactically but whose code has not been written yet.

 

def myfunction():
    pass
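pass is not limited to functions. The sketch below uses hypothetical names (NotReadyError, load_config) to show it in a class body and a function stub:

```python
class NotReadyError(Exception):
    """Custom exception; no extra behaviour is needed yet."""
    pass

def load_config():
    # Placeholder body: the function can be called but does nothing.
    pass

print(load_config())  # None
```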

 


How can you generate random numbers?

Python | Generate random numbers

Python provides a module called random for generating random numbers.

Import the random module and call its random() function as shown below:

import random

print(random.random())

The random() function returns a random float in the half-open range [0.0, 1.0).

To generate random integers from a specified range, use the randrange() function. The stop value itself is never returned.
Syntax: randrange(start, stop, step)
 

import random

print(random.randrange(5,100,2))
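A few related functions from the same module are sketched below. The seed value and the list of colours are arbitrary choices for illustration; no outputs are shown because they depend on the generator state:

```python
import random

random.seed(42)  # fixed seed so repeated runs give the same sequence

# Starting at 5 with step 2 means only odd numbers 5, 7, ..., 99 can occur.
value = random.randrange(5, 100, 2)

# randint() includes both endpoints, unlike randrange().
die = random.randint(1, 6)

# choice() picks one element from a non-empty sequence.
pick = random.choice(["red", "green", "blue"])

print(value, die, pick)
```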

 


What is lambda in Python?

Python | Lambda function

A lambda function is a small anonymous function. It can take any number of parameters, but its body must be a single expression.

Syntax:
lambda arguments : expression
 

a = lambda x,y : x+y

print(a(5, 6))

It also provides a nice way to write closures. With that power, you can do things like this.

def adder(x):
    return lambda y: x + y

add5 = adder(5)

print(add5(1))  # Output: 6

In this snippet, the function adder takes an argument x and returns an anonymous function, or lambda, that takes another argument y. The returned lambda remembers the value of x, which lets you create functions from functions. This is a simple example, but it should convey the power lambdas and closures have.
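A common everyday use of lambda is as a short key function. The sketch below sorts made-up (name, count) pairs by their second element:

```python
pairs = [("pear", 3), ("apple", 10), ("fig", 1)]

# The lambda receives one tuple and returns the value to sort by.
by_count = sorted(pairs, key=lambda item: item[1])
print(by_count)  # [('fig', 1), ('pear', 3), ('apple', 10)]
```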
 


What is the swapcase() function in Python?

Python | swapcase() Function

swapcase() is a string method that returns a copy of the string with uppercase characters converted to lowercase and vice versa. Non-alphabetic characters are left unchanged.
 

string = "IT IS IN LOWERCASE."

print(string.swapcase())  # Output: it is in lowercase.

 


How to remove whitespaces from a string in Python?

Python | strip() Function | Remove whitespaces from a string 

To remove leading and trailing whitespace from a string, Python provides the string method strip([chars]). It returns a copy of the string with that whitespace removed; if there is nothing to remove, the result equals the original string. When the optional chars argument is given, those characters are stripped instead of whitespace.
 

string = "  Python " 
 
print(string.strip())  
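To trim only one side, or to strip characters other than whitespace, a sketch along these lines may help (the sample strings are made up):

```python
raw = "  Python  "

left = raw.lstrip()    # removes leading whitespace only
right = raw.rstrip()   # removes trailing whitespace only
print(repr(left))      # 'Python  '
print(repr(right))     # '  Python'

# Passing characters strips those instead of whitespace.
trimmed = "--Python--".strip("-")
print(trimmed)         # Python
```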

 


What is the usage of enumerate() function in Python?

Python | enumerate() Function

The enumerate() function is used to iterate through the sequence and retrieve the index position and its corresponding value at the same time.
 

lst = ["A", "B", "C"]

print(list(enumerate(lst)))
# Output: [(0, 'A'), (1, 'B'), (2, 'C')]
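enumerate() is most often used directly in a for loop. It also accepts a start argument when counting should not begin at 0, as in this sketch:

```python
lst = ["A", "B", "C"]

# start=1 makes the counter begin at 1 instead of 0.
numbered = list(enumerate(lst, start=1))

for position, letter in numbered:
    print(position, letter)
```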

 


Can you explain the filter(), map(), and reduce() functions?

Python | filter(), map(), and reduce() Functions

  • filter() accepts two arguments, a function and an iterable, and returns an iterator over the elements for which the function returns a true value.
    >>> set(filter(lambda x:x>4, range(7)))
    
    # {5, 6}
    
    

     

  • map() calls the specified function on each item of an iterable and returns an iterator over the results (in Python 3 it does not return a list).

    >>> set(map(lambda x:x**3, range(7)))
    
    # {0, 1, 64, 8, 216, 27, 125}

     

  • reduce() (imported from functools in Python 3) applies a two-argument function cumulatively to the items of a sequence until a single value remains.

    >>> from functools import reduce
    >>> reduce(lambda x,y:y-x, [1,2,3,4,5])
    
    # 3
    

    Let’s understand this. At each step, x is the running result and y is the next item, and the function returns y - x:

    2-1=1
    3-1=2
    4-2=2
    5-2=3

    Hence, 3.
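For filter() and map(), a list comprehension is often a more readable alternative, and reduce() accepts an optional initial value. A short sketch, with variable names chosen for illustration:

```python
from functools import reduce

numbers = range(7)

# Comprehension equivalents of the filter() and map() calls above.
big = [x for x in numbers if x > 4]
cubes = [x ** 3 for x in numbers]
print(big)    # [5, 6]
print(cubes)  # [0, 1, 8, 27, 64, 125, 216]

# The third argument is the starting value of the accumulator.
total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4, 5], 0)
print(total)  # 15
```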

 


What is a namedtuple?

Python | namedtuple

A namedtuple will let us access a tuple’s elements using a name/label. We use the function namedtuple() for this, and import it from collections.

>>> from collections import namedtuple

#format
>>> result=namedtuple('result','Physics Chemistry Maths') 

#declaring the tuple
>>> Chris=result(Physics=86,Chemistry=92,Maths=80) 

>>> Chris.Chemistry
# 92
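A namedtuple is still a tuple, so index access works and the values cannot be changed in place. The sketch below (using the class name Result as an illustrative choice) shows _asdict() and _replace(), which returns a modified copy:

```python
from collections import namedtuple

Result = namedtuple("Result", "Physics Chemistry Maths")
chris = Result(Physics=86, Chemistry=92, Maths=80)

print(chris[1])          # 92, same as chris.Chemistry
print(chris._asdict())   # field names mapped to values

# _replace() builds a new tuple; the original is left untouched.
updated = chris._replace(Maths=85)
print(updated.Maths, chris.Maths)  # 85 80
```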

 


Write code to add the values of matching keys in two dictionaries and return a new dictionary.

We can use the Counter class from the collections module

from collections import Counter

dict1 = {'a': 5, 'b': 3, 'c': 2}
dict2 = {'a': 2, 'b': 4, 'c': 3}

new_dict = Counter(dict1) + Counter(dict2)


print(new_dict)
# Print: Counter({'a': 7, 'b': 7, 'c': 5})
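One caveat: adding two Counter objects drops any key whose total is zero or negative. If those keys must be kept, a plain dictionary comprehension is one alternative. The dictionaries below are made-up values chosen to show the difference:

```python
from collections import Counter

dict1 = {'a': 5, 'b': -3}
dict2 = {'a': 2, 'b': 1, 'c': 0}

# Counter addition keeps only positive totals.
counted = Counter(dict1) + Counter(dict2)
print(counted)  # Counter({'a': 7})

# A comprehension over the union of keys keeps every total.
keys = dict1.keys() | dict2.keys()
merged = {k: dict1.get(k, 0) + dict2.get(k, 0) for k in sorted(keys)}
print(merged)   # {'a': 7, 'b': -2, 'c': 0}
```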


 


Python In-place swapping of two numbers

 Python | In-place swapping of two numbers

>>> a, b = 10, 20
>>> print(a, b)
10 20

>>> a, b = b, a
>>> print(a, b)
20 10

 


Reversing a String in Python

Python | Reversing a String

>>> x = 'PythonWorld'
>>> print(x[::-1])
dlroWnohtyP

 


Python join all items of a list to convert into a single string

Python | Join all items of a list to convert into a single string

>>> x = ["Python", "Online", "Training"]
>>> print(" ".join(x))
Python Online Training

 


Python return multiple values from functions

Python | Return multiple values from functions

>>> def A():
...     return 2, 3, 4

>>> a, b, c = A()

>>> print(a, b, c)
2 3 4

 


Python Print String N times

Python | Print String N times

>>> s = 'Python'
>>> n = 5

>>> print(s * n)
PythonPythonPythonPythonPython

 


Python check the memory usage of an object

Python | Check the memory usage of  an object

>>> import sys
>>> x = 100

>>> print(sys.getsizeof(x))
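28

The exact value depends on the Python implementation and platform; 28 bytes is typical for a small integer on 64-bit CPython.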