This guide covers advanced regular expression (regex) concepts used for searching, matching, and manipulating text and data.
Common Regex Symbols
.
- Matches any single character except a newline (by default).
*
- Matches zero or more of the preceding element.
+
- Matches one or more of the preceding element.
?
- Matches zero or one of the preceding element.
[]
- Defines a character class, matching any one of the characters inside.
^
- Asserts the position at the start of the string, or at the start of each line in multiline mode.
$
- Asserts the position at the end of the string, or at the end of each line in multiline mode.
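The symbols above can be tried out with a short sketch. This one assumes Python's standard `re` module; the sample strings and variable names are made up for illustration, and `\w` (a word character) is used in the anchor examples even though it is not in the list above.

```python
import re

words = "ct cat caat color colour"

# .  exactly one character between "c" and "t"
any_char = re.findall(r"c.t", words)        # ['cat']

# *  zero or more "a" characters
zero_or_more = re.findall(r"ca*t", words)   # ['ct', 'cat', 'caat']

# +  one or more "a" characters
one_or_more = re.findall(r"ca+t", words)    # ['cat', 'caat']

# ?  the "u" is optional
optional = re.findall(r"colou?r", words)    # ['color', 'colour']

# [] "c" followed by either "a" or "o"
char_class = re.findall(r"c[ao]", words)    # ['ca', 'ca', 'co', 'co']

lines = "first line\nsecond line"

# ^  start of the string by default, start of every line with re.MULTILINE
start_default = re.findall(r"^\w+", lines)                 # ['first']
start_multiline = re.findall(r"^\w+", lines, re.MULTILINE) # ['first', 'second']

# $  end of the string by default, end of every line with re.MULTILINE
end_default = re.findall(r"\w+$", lines)                   # ['line']
end_multiline = re.findall(r"\w+$", lines, re.MULTILINE)   # ['line', 'line']
```

Other regex engines use the same symbols, but the way multiline mode is switched on differs between them.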
Advanced Techniques
- Lookaheads and Lookbehinds: These zero-width assertions check whether a pattern does or does not occur immediately after (lookahead) or immediately before (lookbehind) the current position, without consuming any of the text.
- Example:
(?=hello)
matches the empty position immediately before "hello"; the text "hello" itself is not included in the match.
- Named Groups: You can name a capturing group by placing the name in angle brackets after a ? at the start of the group, as in (?<name>...). Python writes this as (?P<name>...).
- Example:
(?<name>\w+)=(?<value>.*)
captures the name and its value separately, as the groups "name" and "value".
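Both techniques can be illustrated with a minimal sketch, again assuming Python's `re` module. The sample strings are invented for illustration, and the named groups use Python's `(?P<name>...)` spelling, since `re` does not accept the `(?<name>...)` form used by several other engines.

```python
import re

# Lookahead is zero-width: the match is an empty string at the position
# just before "hello".
m = re.search(r"(?=hello)", "say hello")
position = (m.start(), m.end(), m.group())   # (4, 4, '')

# Positive lookahead: digits followed by "px", without capturing "px".
pixels = re.findall(r"\d+(?=px)", "10px 20em 30px")             # ['10', '30']

# Positive lookbehind: digits preceded by a dollar sign.
dollars = re.findall(r"(?<=\$)\d+", "cost $40 or 50 EUR")       # ['40']

# Negative lookbehind: whole numbers NOT preceded by a dollar sign.
others = re.findall(r"(?<!\$)\b\d+", "cost $40 or 50 EUR")      # ['50']

# Named groups: read the captures by name instead of by index.
pair = re.match(r"(?P<name>\w+)=(?P<value>.*)", "color=blue")
captured = pair.groupdict()   # {'name': 'color', 'value': 'blue'}
```

Reading captures by name keeps the calling code stable if more groups are later added to the pattern.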
Example
Suppose you want to extract the email addresses from a given text. You can use the following regex pattern:
\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b
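As a rough sketch of how this pattern might be applied, the snippet below uses Python's `re` module. The helper name `extract_emails` and the sample text are hypothetical. The final character class is written as `[A-Za-z]`, because a `|` inside square brackets is a literal pipe character rather than an alternation.

```python
import re

# Simplified email pattern; it does not cover every address that the
# email standards allow.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def extract_emails(text: str) -> list[str]:
    """Return every substring of text that matches the email pattern."""
    return EMAIL_RE.findall(text)


sample = (
    "Contact alice@example.com or bob.smith@mail.example.org. "
    "Not an address: user@localhost"
)
emails = extract_emails(sample)
# ['alice@example.com', 'bob.smith@mail.example.org']
```

The pattern requires a dot followed by at least two letters at the end of the domain, which is why `user@localhost` is skipped and the sentence-ending period after the second address is left out of the match.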
Resources
For more information on regex, you can check out our Regex Basics Guide.