Published on August 03, 2024 by DeveloperBreeze

Python Regular Expressions Cheat Sheet

Basic Syntax

Every regular expression operation in Python requires importing the re module:

import re

Special Characters

Character   Description
.           Matches any character except a newline (unless re.DOTALL is set).
^           Matches the start of the string (and of each line with re.MULTILINE).
$           Matches the end of the string or just before a trailing newline (and the end of each line with re.MULTILINE).
*           Matches 0 or more repetitions of the preceding pattern.
+           Matches 1 or more repetitions of the preceding pattern.
?           Matches 0 or 1 repetition of the preceding pattern.
{n}         Matches exactly n repetitions of the preceding pattern.
{n,}        Matches n or more repetitions of the preceding pattern.
{n,m}       Matches between n and m repetitions of the preceding pattern.
[]          Matches any single character listed in the brackets, e.g. [abc] or [a-z].
[^]         Matches any single character not listed in the brackets, e.g. [^0-9].
\           Escapes a special character, e.g. \. matches a literal dot.
|           Matches either the pattern before or the pattern after the |.
()          Groups patterns and captures the matched text.
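
As a rough illustration of how these special characters combine, the sketch below runs a few small patterns against made-up sample strings. The variable names and inputs are chosen only for this example.

```python
import re

# ? makes the preceding "u" optional, so both spellings match.
colour_hits = re.findall(r'colou?r', 'color colour colouur')

# (?:...) groups the alternation without capturing; s? allows an optional plural.
pets = re.findall(r'(?:cat|dog)s?', 'cats, dog, bird')

# A character set matches any one of the listed characters.
vowels_removed = re.sub(r'[aeiou]', '', 'regular')

# A backslash turns the special "." into a literal dot.
dots = re.findall(r'\.', 'a.b.c')

# {2,3} limits the repetition count; \b keeps longer digit runs from matching in part.
codes = re.findall(r'\b\d{2,3}\b', '7 42 123 4567')

print(colour_hits)     # ['color', 'colour']
print(pets)            # ['cats', 'dog']
print(vowels_removed)  # rglr
print(dots)            # ['.', '.']
print(codes)           # ['42', '123']
```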

Character Classes

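Character   Description
\d          Matches any digit. For str patterns this includes other Unicode decimal digits; it is equivalent to [0-9] only with re.ASCII or in bytes patterns.
\D          Matches any non-digit; the opposite of \d.
\w          Matches any word character (letters, digits, and underscore). Equivalent to [a-zA-Z0-9_] with re.ASCII; otherwise Unicode letters and digits match as well.
\W          Matches any non-word character; the opposite of \w.
\s          Matches any whitespace character (space, tab, newline, carriage return, form feed, vertical tab).
\S          Matches any non-whitespace character.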
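
The difference between Unicode and ASCII matching is easy to miss, so here is a small sketch comparing the two modes. The sample text is invented for this example and contains an accented letter and an Arabic-Indic digit (written as escapes to avoid encoding issues).

```python
import re

# Sample text: an ASCII id, "cafe" with an accented e, and the Arabic-Indic digit three.
text = 'id_42 caf\u00e9 \u0663'

# Default (Unicode) matching for str patterns.
unicode_digits = re.findall(r'\d', text)
unicode_words = re.findall(r'\w+', text)

# re.ASCII restricts \d and \w to [0-9] and [a-zA-Z0-9_].
ascii_digits = re.findall(r'\d', text, re.ASCII)
ascii_words = re.findall(r'\w+', text, re.ASCII)

print(unicode_digits)  # ['4', '2', '\u0663' shown as the Arabic-Indic digit]
print(ascii_digits)    # ['4', '2']
print(ascii_words)     # ['id_42', 'caf']
```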

Common Patterns

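Pattern   Description
r"\b"     Matches a word boundary.
r"\B"     Matches a position that is not a word boundary.
r"\A"     Matches only at the start of the string, even with re.MULTILINE.
r"\Z"     Matches only at the end of the string, even with re.MULTILINE.
r"\G"     Not supported by Python's re module; using it in a pattern raises re.error.
r"\n"     Matches a newline character.
r"\t"     Matches a tab character.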
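
To make the anchors above more concrete, the following sketch contrasts \b with \B and \A with ^ on invented sample strings. The variable names are illustrative only.

```python
import re

text = 'cat concat category'

# \b on both sides: only the standalone word matches.
whole_words = re.findall(r'\bcat\b', text)

# \b at the start only: words that begin with "cat".
prefixed = re.findall(r'\bcat\w*', text)

# \B at the start: "cat" must be preceded by another word character.
inner = [m.start() for m in re.finditer(r'\Bcat\b', text)]

# With re.MULTILINE, ^ matches at every line start, but \A still matches only once.
lines = 'one\ntwo'
caret_hits = re.findall(r'^\w+', lines, re.MULTILINE)
start_hits = re.findall(r'\A\w+', lines, re.MULTILINE)

print(whole_words)  # ['cat']
print(prefixed)     # ['cat', 'category']
print(inner)        # [7]
print(caret_hits)   # ['one', 'two']
print(start_hits)   # ['one']
```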

re Module Functions

Compiling Regular Expressions

re.compile(pattern, flags=0): Compiles a regex pattern for reuse.

pattern = re.compile(r'\d+')
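
A compiled pattern exposes the same operations as the module-level functions, as methods. The sketch below shows one way to reuse it; the sample sentence and variable names are made up for this example.

```python
import re

# Compile once, then reuse the pattern object for several operations.
number = re.compile(r'\d+')
sentence = 'Order 66 shipped in 3 days'

first = number.search(sentence)         # first match, or None
all_numbers = number.findall(sentence)  # every match as a list of strings
masked = number.sub('#', sentence)      # replace every match

print(first.group())  # 66
print(all_numbers)    # ['66', '3']
print(masked)         # Order # shipped in # days
```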
  

Basic Functions

re.search(pattern, string, flags=0): Scans the string for the first location where the regex pattern matches and returns a Match object, or None if there is no match.

match = re.search(r'\d+', 'The price is 100 dollars')
if match:
    print(match.group())  # Output: 100
  

re.match(pattern, string, flags=0): Determines if the regex pattern matches at the start of the string.

match = re.match(r'\d+', '123 apples')
if match:
    print(match.group())  # Output: 123
  

re.fullmatch(pattern, string, flags=0): Checks if the entire string matches the regex pattern.

match = re.fullmatch(r'\d+', '12345')
if match:
    print(match.group())  # Output: 12345
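
The three functions differ only in where the match is allowed to sit in the string. A minimal side-by-side sketch, using invented inputs, might look like this:

```python
import re

pattern = r'\d+'

# search: the match may start anywhere in the string.
found_anywhere = re.search(pattern, 'abc 123')

# match: the match must begin at index 0, so this returns None.
found_at_start = re.match(pattern, 'abc 123')

# fullmatch: the whole string must match, so trailing text makes it fail.
whole_ok = re.fullmatch(pattern, '123')
whole_fail = re.fullmatch(pattern, '123 apples')

print(found_anywhere.group())  # 123
print(found_at_start)          # None
print(whole_ok.group())        # 123
print(whole_fail)              # None
```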
  

Finding All Matches

re.findall(pattern, string, flags=0): Returns all non-overlapping matches of the pattern in the string as a list of strings.

matches = re.findall(r'\d+', 'There are 12 apples and 5 oranges')
print(matches)  # Output: ['12', '5']
  

re.finditer(pattern, string, flags=0): Returns an iterator yielding Match objects over all non-overlapping matches.

matches = re.finditer(r'\d+', 'There are 12 apples and 5 oranges')
for match in matches:
    print(match.group())  # Output: 12, then 5 (one per line)
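
Two details are worth seeing in action: findall returns tuples when the pattern contains more than one capturing group, and finditer gives access to match positions. The sketch below reuses the sample sentence from above; the variable names are illustrative.

```python
import re

text = 'There are 12 apples and 5 oranges'

# With two capturing groups, findall returns a list of tuples, one per match.
pairs = re.findall(r'(\d+) (\w+)', text)

# finditer yields Match objects, so positions are available as well as text.
spans = [(m.group(), m.span()) for m in re.finditer(r'\d+', text)]

print(pairs)  # [('12', 'apples'), ('5', 'oranges')]
print(spans)  # [('12', (10, 12)), ('5', (24, 25))]
```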
  

Substitution

re.sub(pattern, repl, string, count=0, flags=0): Returns the string obtained by replacing the leftmost non-overlapping occurrences of the pattern with the replacement string.

result = re.sub(r'\d+', '#', 'There are 12 apples and 5 oranges')
print(result)  # Output: There are # apples and # oranges
  

re.subn(pattern, repl, string, count=0, flags=0): Returns a tuple containing the new string and the number of substitutions made.

result, num_subs = re.subn(r'\d+', '#', 'There are 12 apples and 5 oranges')
print(result, num_subs)  # Output: There are # apples and # oranges 2
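
The replacement argument can also refer back to captured groups or be a function, and count limits how many replacements are made. Here is a short sketch of those variations; the date string and the helper name double are invented for this example.

```python
import re

text = 'There are 12 apples and 5 oranges'

# Backreferences \1, \2, \3 reuse captured groups in the replacement.
iso_date = re.sub(r'(\d{2})/(\d{2})/(\d{4})', r'\3-\2-\1', 'Due 25/12/2024')

# A function receives each Match object and returns the replacement text.
def double(match):
    return str(int(match.group()) * 2)

doubled = re.sub(r'\d+', double, text)

# count=1 replaces only the first occurrence.
first_only = re.sub(r'\d+', '#', text, count=1)

print(iso_date)    # Due 2024-12-25
print(doubled)     # There are 24 apples and 10 oranges
print(first_only)  # There are # apples and 5 oranges
```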
  

Splitting Strings

re.split(pattern, string, maxsplit=0, flags=0): Splits the string by occurrences of the pattern.

parts = re.split(r'\s+', 'Split this string by spaces')
print(parts)  # Output: ['Split', 'this', 'string', 'by', 'spaces']
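
Two options change what re.split returns: maxsplit caps the number of splits, and a capturing group in the pattern keeps the separators in the result. A small sketch with made-up inputs:

```python
import re

# maxsplit=2 performs at most two splits; the rest of the string stays intact.
limited = re.split(r'\s+', 'Split this string by spaces', maxsplit=2)

# Without a capturing group, the separators are dropped.
fields = re.split(r'[,;]\s*', 'a, b;c')

# With a capturing group, the captured separators appear in the result list.
with_separators = re.split(r'\s*([,;])\s*', 'a, b;c')

print(limited)          # ['Split', 'this', 'string by spaces']
print(fields)           # ['a', 'b', 'c']
print(with_separators)  # ['a', ',', 'b', ';', 'c']
```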
  

Flags

Flag                   Description
re.IGNORECASE (re.I)   Case-insensitive matching.
re.MULTILINE (re.M)    ^ and $ match at the start and end of each line.
re.DOTALL (re.S)       . matches any character, including a newline.
re.VERBOSE (re.X)      Allows whitespace and comments inside the pattern for readability.
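
The following sketch shows each flag in use, plus how flags can be combined with the | operator. All sample strings and variable names are assumptions made for this example.

```python
import re

# re.IGNORECASE: letter case is ignored.
any_case = re.findall(r'python', 'Python PYTHON python', re.IGNORECASE)

# re.MULTILINE: ^ matches at the start of every line.
line_starts = re.findall(r'^\w+', 'first line\nsecond line', re.MULTILINE)

# re.DOTALL: without it, "." does not match the newline.
no_dotall = re.search(r'a.b', 'a\nb')
with_dotall = re.search(r'a.b', 'a\nb', re.DOTALL)

# re.VERBOSE: whitespace and # comments inside the pattern are ignored.
phone = re.compile(r'''
    \d{3}   # area code
    [-.]    # separator
    \d{3}   # prefix
    [-.]    # separator
    \d{4}   # line number
''', re.VERBOSE)
phones = phone.findall('Contact me at 123-456-7890 or 987.654.3210')

# Flags combine with |.
errors = re.findall(r'^error.*$', 'Error: disk\nerror: net\nok', re.I | re.M)

print(any_case)     # ['Python', 'PYTHON', 'python']
print(line_starts)  # ['first', 'second']
print(no_dotall)    # None
print(phones)       # ['123-456-7890', '987.654.3210']
print(errors)       # ['Error: disk', 'error: net']
```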

Match Object Methods

When a match is found, a Match object (re.Match) is returned, providing several useful methods:

.group([group1, ...]): Returns one or more subgroups of the match.

.start([group]): Returns the starting position of the match.

.end([group]): Returns the ending position of the match.

.span([group]): Returns a tuple containing the start and end positions of the match.

match = re.search(r'(\d+)', 'The price is 100 dollars')
if match:
    print(match.group())  # Output: 100
    print(match.start())  # Output: 13
    print(match.end())    # Output: 16
    print(match.span())   # Output: (13, 16)
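
The optional group argument of these methods becomes more useful with several groups, especially named ones. Below is a minimal sketch; the group names amount and unit are chosen for this example and are not part of the original pattern.

```python
import re

# (?P<name>...) defines a named group that can be read by name or by number.
match = re.search(r'(?P<amount>\d+) (?P<unit>\w+)', 'The price is 100 dollars')

whole = match.group(0)         # the entire match
amount = match.group('amount')  # a group by name
both = match.group(1, 2)       # several groups at once, as a tuple
as_dict = match.groupdict()    # all named groups as a dictionary
unit_span = match.span('unit')  # start and end of one group

print(whole)      # 100 dollars
print(amount)     # 100
print(both)       # ('100', 'dollars')
print(as_dict)    # {'amount': '100', 'unit': 'dollars'}
print(unit_span)  # (17, 24)
```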

Examples

Validate an Email Address

email_pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
email = 'example@example.com'
if re.match(email_pattern, email):
    print('Valid email')
else:
    print('Invalid email')
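
One subtlety with the pattern above: $ also matches just before a trailing newline, so input ending in "\n" still passes. A possible variant, sketched here with the hypothetical helper is_valid_email, uses re.fullmatch instead of the ^ and $ anchors. The pattern remains a simplified check and does not cover every address the email standards allow.

```python
import re

# Same character rules as above, without ^ and $; fullmatch anchors both ends.
EMAIL_PATTERN = re.compile(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}')

def is_valid_email(candidate):
    """Return True if the whole string looks like an email address."""
    return EMAIL_PATTERN.fullmatch(candidate) is not None

# The anchored version accepts a trailing newline because $ matches before it.
anchored = re.match(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$',
                    'example@example.com\n')

print(anchored is not None)                     # True
print(is_valid_email('example@example.com'))    # True
print(is_valid_email('example@example.com\n'))  # False
print(is_valid_email('not-an-email'))           # False
```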

Extract Phone Numbers

text = "Contact me at 123-456-7890 or 987.654.3210"
phone_pattern = r'\d{3}[-.]\d{3}[-.]\d{4}'
phones = re.findall(phone_pattern, text)
print(phones)  # Output: ['123-456-7890', '987.654.3210']
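
If the extracted numbers need a consistent format, capturing groups make that straightforward. The sketch below is one possible approach: it wraps each digit block in a group and rejoins the blocks with hyphens. The choice of hyphens as the target format is an assumption for this example.

```python
import re

text = "Contact me at 123-456-7890 or 987.654.3210"

# Each digit block is captured so the separators can be replaced.
phone_pattern = re.compile(r'(\d{3})[-.](\d{3})[-.](\d{4})')

# groups() returns the three captured blocks; join them with a single separator.
normalized = ['-'.join(m.groups()) for m in phone_pattern.finditer(text)]

print(normalized)  # ['123-456-7890', '987-654-3210']
```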

This cheat sheet provides a quick reference to the most common regex patterns and functions in Python. For more complex regex patterns and usage, consider exploring the [Python re module documentation](https://docs.python.org/3/library/re.html).
