Introduction to Regular Expressions in Python

In this tutorial we’ll learn how to use regular expressions in Python, including their syntax and how to construct them with Python’s built-in re module. Along the way we’ll cover the different operations the re module provides and how to use them in your Python applications.

What are Regular Expressions?

A regular expression is a sequence of characters that defines a search pattern for finding text. This “search engine” is embedded within the Python programming language (and many other languages as well) and is made available through the re module.

To use regular expressions (or “regex” for short) you usually specify the rules for the set of possible strings that you want to match, and then ask questions such as “Does this string match the pattern?” or “Is there a match for the pattern anywhere in this string?”.
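As a rough illustration of those two questions, the sketch below uses re.match, which only tests at the start of the string, and re.search, which scans the whole string. The pattern and the sample strings are made up for this example.

```python
import re

# Hypothetical pattern and inputs, chosen only for illustration.
pattern = r"cat"

# "Does this string match the pattern?" (re.match checks from the start only)
starts_with_cat = re.match(pattern, "catalog") is not None

# re.match fails here because "cat" is not at the beginning of the string.
match_in_middle = re.match(pattern, "concatenate") is not None

# "Is there a match for the pattern anywhere in this string?"
found_anywhere = re.search(pattern, "concatenate") is not None

print(starts_with_cat, match_in_middle, found_anywhere)  # True False True
```

Both functions return a match object on success and None otherwise, which is why the results are compared against None here.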

You can also use regexes to modify a string or to split it apart in various ways. These “higher order” operations all start by matching text against the regex; once a match is found, the string can be manipulated (for example, split). All of this is made possible by Python’s re module, which we’ll look at further in later sections.
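A minimal sketch of both kinds of operation, using re.split and re.sub from the re module, might look like the following. The sample text and the separator pattern are assumptions made for this example only.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "apples, oranges;bananas"

# Separator pattern: a comma or semicolon followed by optional whitespace.
separator = r"[,;]\s*"

# Split the string wherever the separator pattern matches.
parts = re.split(separator, text)

# Replace every matched separator with " | ".
replaced = re.sub(separator, " | ", text)

print(parts)     # ['apples', 'oranges', 'bananas']
print(replaced)  # apples | oranges | bananas
```

In both calls the regex is matched first, and the split or substitution is then applied at each place where a match was found.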

Regular Expression Syntax

A regular expression specifies a pattern to be matched against an input string. In this section we’ll show some of the special characters and patterns you can use to match strings.

Matching Characters

Regular expressions can contain both special and ordinary characters. Most ordinary characters, like ‘A’, ‘a’, or ‘0’, are the simplest regular expressions; they simply match themselves. The special characters, also called metacharacters, do not match themselves unless they are escaped with a backslash: ., ^, $, *, +, ?, {, }, [, ], \, |, (, and ). This is because they are reserved for higher-order matching functionality.
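The difference between ordinary and special characters can be sketched with a short example. The sample strings are made up for illustration; the example uses + as the special character and shows two ways to match it literally, a backslash escape and the re.escape helper.

```python
import re

# Ordinary characters simply match themselves.
ordinary = re.search(r"a0", "xa0y") is not None

# An unescaped "+" means "one or more of the preceding item", so r"1+1"
# looks for two or more consecutive "1" characters, not the text "1+1".
unescaped = re.search(r"1+1", "1+1=2")

# Escaping the "+" with a backslash makes it match a literal plus sign.
escaped = re.search(r"1\+1", "1+1=2")

# re.escape builds an escaped pattern from a literal string.
auto_escaped = re.search(re.escape("1+1"), "1+1=2")

print(ordinary)              # True
print(unescaped)             # None
print(escaped.group())       # 1+1
print(auto_escaped.group())  # 1+1
```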

