Regular Expressions in Python - Lecture Notes

Introduction

Topic: Working with regular expressions (regex) using Python's re module
Definition: Regex provides a method to search and match text patterns
Usage: Can be used across various programming languages, text editors, etc.
Importance: Useful for searching, modifying, and manipulating text based on patterns

Importing re module: import re
Text to Search: Can be a multi-line string containing various text patterns (e.g., lower and upper case letters, digits, URLs, phone numbers, etc.)
Raw Strings: Strings prefixed with r (e.g., r'string') so that Python leaves backslashes as-is instead of treating them as escape sequences (r'\t' is a backslash followed by t, not a tab)
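
A minimal sketch of the setup described above. The sample text and the variable names (text_to_search, normal, raw) are invented for illustration and are not from the lecture.

```python
import re

# Hypothetical multi-line text to search, invented for this illustration.
text_to_search = """abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
1234567890
example.com
321-555-4321
"""

# In a normal string, "\t" is a single tab character.
normal = "\tTab"
# In a raw string, r"\t" stays as two characters: a backslash and the letter t.
raw = r"\tTab"

print(len(normal))  # 4
print(len(raw))     # 5

# A raw string saves doubling every backslash in a pattern.
print(r"\d" == "\\d")  # True
```

Writing patterns as raw strings means the regex engine receives the backslashes exactly as typed.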

Compiling Patterns: re.compile(pattern)
Finding Matches: pattern.finditer(text) returns an iterator of match objects
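Match Objects: Contain information about each match, available through methods such as span() (a tuple of start and end indices) and group() (the matched text)
String Slicing: Using text[start:end], where start, end = match.span(), returns exactly the matched text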
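
The steps above can be sketched as follows. The sample text is a made-up assumption; re.compile, finditer, span, and group are the standard re module calls named in these notes.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "abc 123 abc"

# Compile the pattern once so it can be reused.
pattern = re.compile(r"abc")

# finditer returns an iterator of match objects; list() keeps them for reuse.
matches = list(pattern.finditer(text))

for match in matches:
    start, end = match.span()
    # Slicing the text with the span gives the same string as group().
    print(match.span(), match.group(), text[start:end])
# (0, 3) abc abc
# (8, 11) abc abc
```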

Literal Matches: Direct text matches
Escaping Characters: Use backslash (\) to escape special characters (e.g., . becomes \.)
Dot (.): Matches any character except newline
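Digit (\d): Matches a digit (0-9; with str patterns in Python 3, other Unicode decimal digits also match unless the re.ASCII flag is set)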
Non-digit (\D): Matches any character except a digit
Word Character (\w): Matches alphanumeric characters and underscore
Non-word Character (\W): Matches any character except alphanumeric and underscore
Whitespace (\s): Matches any whitespace character (space, tab, newline)
Non-whitespace (\S): Matches any non-whitespace character
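Anchors: ^ (start of string) and $ (end of string, or just before a trailing newline); with the re.MULTILINE flag they also match at the start and end of each line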
Word Boundaries: \b (word boundary), \B (non-word boundary)
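
A short sketch that exercises each of the patterns listed above with re.findall. The sample strings are invented for illustration.

```python
import re

# Hypothetical sample string, invented for this illustration.
sample = "cat. 42 Ha HaHa"

# An unescaped dot matches any character except a newline.
print(re.findall(r".", "a\nb"))        # ['a', 'b']
# An escaped dot matches only a literal period.
print(re.findall(r"\.", sample))       # ['.']

# Character classes and their negations.
print(re.findall(r"\d", sample))       # ['4', '2']
print(re.findall(r"\D", "a1"))         # ['a']
print(re.findall(r"\w", "a_1!"))       # ['a', '_', '1']
print(re.findall(r"\W", "a_1!"))       # ['!']
print(len(re.findall(r"\s", sample)))  # 3
print(re.findall(r"\S", "a b"))        # ['a', 'b']

# Anchors: ^ ties the match to the start, $ to the end.
print(re.findall(r"^cat", sample))     # ['cat']
print(re.findall(r"^Ha", sample))      # []
print(re.findall(r"HaHa$", sample))    # ['HaHa']

# \b requires a word boundary before "Ha"; \B requires that there is none.
print(len(re.findall(r"\bHa", sample)))  # 2
print(len(re.findall(r"\BHa", sample)))  # 1
```

Only the second half of "HaHa" is preceded by a word character, which is why \BHa matches once and \bHa matches twice.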

Grouping Patterns: Use parentheses ( ) to create groups
Capture and Reference: Capture parts of the pattern and reference them using backreferences
Example: Capturing domain and top-level domain in URLs
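
One possible sketch of the URL example. The URLs and the exact pattern are assumptions made for illustration; the lecture's own pattern may differ.

```python
import re

# Hypothetical URLs, invented for this illustration.
urls = """https://www.google.com
http://example.org
https://youtube.com
"""

# Group 1: optional "www.", group 2: domain name, group 3: top-level domain.
pattern = re.compile(r"https?://(www\.)?(\w+)(\.\w+)")

for match in pattern.finditer(urls):
    # group(0) is the whole match; group(2) and group(3) are the captures.
    print(match.group(2), match.group(3))
# google .com
# example .org
# youtube .com

# Backreferences \2 and \3 in the replacement keep only domain + TLD.
domains = pattern.sub(r"\2\3", urls)
print(domains)

# A backreference can also be used inside the pattern itself,
# here to find a word that is immediately repeated.
repeated = re.search(r"\b(\w+) \1\b", "it was the the best")
print(repeated.group())  # the the
```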

Regular expressions are powerful tools for text pattern matching and manipulation
Practice and familiarity are key to mastering regex
Future videos will cover advanced topics in regex for deeper understanding
