DeveloperBreeze

Basic Syntax

  • import re: Import the re module to work with regular expressions.

Special Characters

Character  Description
.          Matches any character except a newline.
^          Matches the start of a string.
$          Matches the end of a string.
*          Matches 0 or more repetitions of the preceding pattern.
+          Matches 1 or more repetitions of the preceding pattern.
?          Matches 0 or 1 repetition of the preceding pattern.
{n}        Matches exactly n repetitions of the preceding pattern.
{n,}       Matches n or more repetitions of the preceding pattern.
{n,m}      Matches between n and m repetitions of the preceding pattern.
[]         Matches any single character in brackets.
[^]        Matches any single character not in brackets.
\          Escapes a special character.
|          Matches either the pattern before or after the |.
()         Groups patterns.
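
A few of the special characters above can be tried out as follows. This is a minimal sketch; the sample strings and variable names are made up for illustration, and (?:...) is a non-capturing group, used here so that findall returns whole matches.

```python
import re

# "?" makes the preceding "u" optional.
spellings = re.findall(r'colou?r', 'color and colour')

# "|" chooses between alternatives inside a (non-capturing) group.
pets = re.findall(r'(?:cat|dog)s?', 'cats, a dog, and a bird')

# "{4}" requires exactly four repetitions of the bracketed class.
year = re.search(r'[0-9]{4}', 'Released in 2019')

# "\" escapes the dot so it matches a literal "." only.
literal_dots = re.findall(r'\.', 'a.b.c')

# "[^...]" matches any single character NOT listed in the brackets.
not_vowels = re.findall(r'[^aeiou ]', 'regex is fun')

print(spellings)      # ['color', 'colour']
print(pets)           # ['cats', 'dog']
print(year.group())   # 2019
print(literal_dots)   # ['.', '.']
print(not_vowels)     # ['r', 'g', 'x', 's', 'f', 'n']
```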

Character Classes

Character  Description
\d         Matches any digit; equivalent to [0-9] in ASCII mode (re.ASCII). For str patterns it also matches other Unicode digits.
\D         Matches any non-digit; the opposite of \d.
\w         Matches any word character (alphanumeric and underscore); equivalent to [a-zA-Z0-9_] in ASCII mode. For str patterns it also matches Unicode letters and digits.
\W         Matches any non-word character; the opposite of \w.
\s         Matches any whitespace character (space, tab, newline, and similar).
\S         Matches any non-whitespace character.
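
The difference between the default (Unicode) behavior and ASCII mode is easiest to see on a string with an accented letter. A minimal sketch, with an invented sample string:

```python
import re

text = 'caf\u00e9 x_1'   # "café x_1", written with an escape for clarity

# By default, \w matches Unicode word characters in str patterns.
unicode_words = re.findall(r'\w+', text)

# With re.ASCII, \w is restricted to [a-zA-Z0-9_].
ascii_words = re.findall(r'\w+', text, flags=re.ASCII)

# \s covers spaces, tabs, and newlines alike.
fields = re.split(r'\s+', 'a b\tc\nd')

print(unicode_words)  # ['café', 'x_1']
print(ascii_words)    # ['caf', 'x_1']
print(fields)         # ['a', 'b', 'c', 'd']
```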

Common Patterns

Pattern  Description
r"\b"    Matches a word boundary.
r"\B"    Matches a non-word boundary.
r"\A"    Matches the start of a string.
r"\Z"    Matches the end of a string.
r"\G"    Not supported by Python's re module (in other regex flavors it matches at the end of the previous match).
r"\n"    Matches a newline character.
r"\t"    Matches a tab character.
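
The anchors above behave as follows on some invented sample text. This sketch also shows how \A and \Z differ from ^ and $: the former are never affected by re.MULTILINE, and \Z does not match before a trailing newline.

```python
import re

text = 'cat concat cats category'

# \b on both sides: only the standalone word "cat".
whole_word = re.findall(r'\bcat\b', text)

# \b at the start only: every word that begins with "cat".
prefixed = re.findall(r'\bcat\w*', text)

# \B at the start: "cat" preceded by another word character (inside "concat").
inner = re.search(r'\Bcat\b', text)

# ^ matches at each line start under re.MULTILINE, \A only at the string start.
line_starts = re.findall(r'^\w+', 'one\ntwo', re.MULTILINE)
string_start = re.findall(r'\A\w+', 'one\ntwo', re.MULTILINE)

# $ tolerates one trailing newline, \Z does not.
dollar = re.search(r'two$', 'one\ntwo\n')
big_z = re.search(r'two\Z', 'one\ntwo\n')

print(whole_word)            # ['cat']
print(prefixed)              # ['cat', 'cats', 'category']
print(inner.start())         # 7
print(line_starts)           # ['one', 'two']
print(string_start)          # ['one']
print(dollar is not None)    # True
print(big_z is not None)     # False
```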

re Module Functions

Compiling Regular Expressions

pattern = re.compile(r'\d+')
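
A compiled pattern exposes the same operations as the module-level functions, which is convenient when one pattern is reused many times. A short sketch with an invented sample string:

```python
import re

number = re.compile(r'\d+')
text = 'Order 66 has 3 items'

first = number.search(text)        # first match only
all_numbers = number.findall(text) # every match, as strings
masked = number.sub('#', text)     # replace every match

print(first.group())  # 66
print(all_numbers)    # ['66', '3']
print(masked)         # Order # has # items
```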

Basic Functions

match = re.search(r'\d+', 'The price is 100 dollars')
if match:
    print(match.group())  # Output: 100

match = re.match(r'\d+', '123 apples')
if match:
    print(match.group())  # Output: 123

match = re.fullmatch(r'\d+', '12345')
if match:
    print(match.group())  # Output: 12345
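
The three functions differ only in where the match is allowed to sit: search scans the whole string, match anchors at the start, and fullmatch requires the entire string to match. Running the same pattern over a few invented inputs makes this visible:

```python
import re

pattern = r'\d+'

# search: finds digits anywhere in the string.
found_anywhere = re.search(pattern, 'abc 123')

# match: only succeeds if the string STARTS with digits.
at_start_fail = re.match(pattern, 'abc 123')
at_start_ok = re.match(pattern, '123abc')

# fullmatch: the whole string must consist of digits.
whole_fail = re.fullmatch(pattern, '123abc')
whole_ok = re.fullmatch(pattern, '123')

print(found_anywhere.group())  # 123
print(at_start_fail)           # None
print(at_start_ok.group())     # 123
print(whole_fail)              # None
print(whole_ok.group())        # 123
```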

Finding All Matches

matches = re.findall(r'\d+', 'There are 12 apples and 5 oranges')
print(matches)  # Output: ['12', '5']

matches = re.finditer(r'\d+', 'There are 12 apples and 5 oranges')
for match in matches:
    print(match.group())  # Output: 12, then 5 (one per line)
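
Unlike findall, finditer yields match objects, so positions are available alongside the matched text. A small sketch using the same sentence:

```python
import re

text = 'There are 12 apples and 5 oranges'

# Collect (matched text, start index, end index) for every number.
found = [(m.group(), m.start(), m.end()) for m in re.finditer(r'\d+', text)]

print(found)  # [('12', 10, 12), ('5', 24, 25)]
```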

Substitution

result = re.sub(r'\d+', '#', 'There are 12 apples and 5 oranges')
print(result)  # Output: There are # apples and # oranges

result, num_subs = re.subn(r'\d+', '#', 'There are 12 apples and 5 oranges')
print(result, num_subs)  # Output: There are # apples and # oranges 2
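
re.sub also accepts backreferences in the replacement string, a function as the replacement, and a count limit. The following sketch illustrates each; the helper name double and the sample strings are invented for this example.

```python
import re

# Backreferences (\1, \2, \3) reorder captured groups: DD/MM/YYYY -> YYYY-MM-DD.
iso = re.sub(r'(\d{2})/(\d{2})/(\d{4})', r'\3-\2-\1', 'Due 25/12/2024')

# A function receives each match object and returns the replacement text.
def double(match):
    return str(int(match.group()) * 2)

doubled = re.sub(r'\d+', double, 'There are 12 apples and 5 oranges')

# count limits how many replacements are made.
first_only = re.sub(r'\d+', '#', '1 2 3', count=1)

print(iso)         # Due 2024-12-25
print(doubled)     # There are 24 apples and 10 oranges
print(first_only)  # # 2 3
```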

Splitting Strings

parts = re.split(r'\s+', 'Split this string by spaces')
print(parts)  # Output: ['Split', 'this', 'string', 'by', 'spaces']
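
Two further behaviors of re.split are worth knowing: maxsplit caps the number of splits, and a capturing group in the pattern keeps the separators in the result. A brief sketch with invented inputs:

```python
import re

# Split on commas, ignoring any whitespace around them.
items = re.split(r'\s*,\s*', 'a , b,c ,  d')

# maxsplit=2 performs at most two splits; the rest stays in one piece.
limited = re.split(r'\s+', 'Split this string by spaces', maxsplit=2)

# A capturing group keeps the separators in the output list.
with_separators = re.split(r'([,;])', 'a,b;c')

print(items)            # ['a', 'b', 'c', 'd']
print(limited)          # ['Split', 'this', 'string by spaces']
print(with_separators)  # ['a', ',', 'b', ';', 'c']
```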

Flags

Flag                  Description
re.IGNORECASE (re.I)  Case-insensitive matching.
re.MULTILINE (re.M)   ^ and $ match the start and end of each line.
re.DOTALL (re.S)      . matches any character, including newline.
re.VERBOSE (re.X)     Allows for more readable regex with comments and whitespace.
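
Each flag can be passed as the last argument to the re functions or to re.compile, and several flags can be combined with |. The sketch below exercises all four on invented sample strings:

```python
import re

# re.IGNORECASE: letter case is ignored.
any_case = re.findall(r'python', 'Python PYTHON python', re.IGNORECASE)

# re.MULTILINE: ^ matches at the start of every line.
line_firsts = re.findall(r'^\w+', 'first line\nsecond line', re.MULTILINE)

# re.DOTALL: "." also matches a newline.
without_dotall = re.search(r'a.b', 'a\nb')
with_dotall = re.search(r'a.b', 'a\nb', re.DOTALL)

# re.VERBOSE: whitespace and comments inside the pattern are ignored.
phone = re.compile(r"""
    \d{3}   # first three digits
    [-.]    # separator
    \d{3}   # next three digits
    [-.]    # separator
    \d{4}   # last four digits
""", re.VERBOSE)
phones = phone.findall('Call 123-456-7890')

# Flags combine with the | operator.
combined = re.findall(r'^total', 'Total\ntotal', re.IGNORECASE | re.MULTILINE)

print(any_case)        # ['Python', 'PYTHON', 'python']
print(line_firsts)     # ['first', 'second']
print(without_dotall)  # None
print(with_dotall is not None)  # True
print(phones)          # ['123-456-7890']
print(combined)        # ['Total', 'total']
```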

Match Object Methods

match = re.search(r'(\d+)', 'The price is 100 dollars')
if match:
    print(match.group())  # Output: 100
    print(match.start())  # Output: 13
    print(match.end())    # Output: 16
    print(match.span())   # Output: (13, 16)
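
Match objects also give access to individual groups, by number or by name. A minimal sketch, assuming the group names amount and unit, which are invented for this example:

```python
import re

match = re.search(r'(?P<amount>\d+) (?P<unit>\w+)', 'The price is 100 dollars')

whole = match.group(0)          # the entire match
amount = match.group('amount')  # group by name
unit = match.group(2)           # group by number
all_groups = match.groups()     # tuple of all groups
as_dict = match.groupdict()     # named groups as a dict
unit_span = match.span('unit')  # position of one group

print(whole)       # 100 dollars
print(amount)      # 100
print(unit)        # dollars
print(all_groups)  # ('100', 'dollars')
print(as_dict)     # {'amount': '100', 'unit': 'dollars'}
print(unit_span)   # (17, 24)
```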

Examples

Validate an Email Address

email_pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
email = 'example@example.com'
if re.match(email_pattern, email):
    print('Valid email')
else:
    print('Invalid email')
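
One subtlety: $ also matches just before a trailing newline, so the anchored pattern accepts 'example@example.com\n'. A possible variant, sketched here with the hypothetical names EMAIL_RE and is_valid_email, drops the anchors and uses re.fullmatch instead. Like the original, it is a simple format check, not a full validator.

```python
import re

# Same character classes as the pattern above, without ^ and $.
EMAIL_RE = re.compile(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}')

def is_valid_email(address):
    """Return True if the whole string looks like an email address."""
    return EMAIL_RE.fullmatch(address) is not None

# The anchored pattern with re.match lets a trailing newline through.
anchored = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
newline_accepted = re.match(anchored, 'example@example.com\n') is not None

print(is_valid_email('example@example.com'))    # True
print(is_valid_email('user@host'))              # False
print(is_valid_email('example@example.com\n'))  # False
print(newline_accepted)                         # True
```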

Extract Phone Numbers

text = "Contact me at 123-456-7890 or 987.654.3210"
phone_pattern = r'\d{3}[-.]\d{3}[-.]\d{4}'
phones = re.findall(phone_pattern, text)
print(phones)  # Output: ['123-456-7890', '987.654.3210']
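
Adding capturing groups to the phone pattern changes what findall returns: one tuple of groups per match, which makes it easy to normalize the separators. A short sketch building on the example above:

```python
import re

text = "Contact me at 123-456-7890 or 987.654.3210"

# Three capturing groups: findall now returns a tuple per phone number.
grouped_pattern = re.compile(r'(\d{3})[-.](\d{3})[-.](\d{4})')
parts = grouped_pattern.findall(text)

# Rejoin each tuple with a single, consistent separator.
normalized = ['-'.join(p) for p in parts]

print(parts)       # [('123', '456', '7890'), ('987', '654', '3210')]
print(normalized)  # ['123-456-7890', '987-654-3210']
```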

This cheat sheet provides a quick reference to the most common regex patterns and functions in Python. For more complex regex patterns and usage, consider exploring the Python re module documentation.
