Python File Handling Overview

Aug 5, 2025

Overview

This lecture covers file handling in Python for both text and binary files: opening, closing, reading, writing, and traversing files, and using the pickle module to store binary data.

Introduction to Files

  • Files are used to store data permanently on secondary storage (e.g., hard disk, pen drive).
  • Without permanent storage, data entry would be repetitive and impractical.
  • A file is a named location for storing data for later access.

Types of Files

  • Two main file types: text files, whose bytes are interpreted as characters (letters, digits, symbols) through an encoding such as UTF-8, and binary files, whose bytes are used as-is without being interpreted as text.
  • Text file examples: .txt, .py, .csv; read as characters, but stored on disk as bytes like any other file.
  • Binary file examples: images (.jpg), audio (.mp3), executables (.exe); not human-readable.
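
To make the text/binary distinction concrete, here is a minimal sketch that reads the same file in both ways. The file name sample.txt and the temporary directory are assumptions made only for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Write four characters as text, encoded as UTF-8.
with open(path, "w", encoding="utf-8") as f:
    f.write("Hi 5")

# Text mode decodes the stored bytes into a str.
with open(path, "r", encoding="utf-8") as f:
    as_text = f.read()        # 'Hi 5'

# Binary mode returns the stored bytes unchanged.
with open(path, "rb") as f:
    as_bytes = f.read()       # b'Hi 5'

# Each character here is stored as one byte with a numeric value.
byte_values = list(as_bytes)  # [72, 105, 32, 53]
```

The file on disk is identical in both cases; only the mode decides whether Python hands back characters or raw bytes.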

Opening and Closing a Text File

  • Use Python's open() function with filename and mode parameters (e.g., 'r' for read).
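  • File modes include: 'r' (read, the default), 'w' (write, erasing existing content), 'a' (append), 'x' (create, failing if the file exists), a 'b' suffix for binary (e.g., 'rb', 'wb'), and a '+' suffix for combined reading and writing (e.g., 'r+').
  • Always close a file using .close() so that buffered data is flushed and system resources are released; data that was never flushed can be lost.
  • The with statement closes the file automatically when its block ends, even if an error occurs inside it.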
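
The two ways of closing a file described above might look like the following sketch. The file name notes.txt and the temporary directory are assumptions for illustration only.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# Explicit open/close: try/finally makes sure close() runs even on error.
f = open(path, "w", encoding="utf-8")
try:
    f.write("first line\n")
finally:
    f.close()
closed_after_explicit = f.closed   # True

# The with statement closes the file when the block ends.
with open(path, "r", encoding="utf-8") as f:
    content = f.read()             # 'first line\n'
closed_after_with = f.closed       # True
```

In most code the with form is preferred because there is no close() call to forget.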

Writing to a Text File

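  • Files must be opened in a mode that allows writing, such as write ('w'), append ('a'), or a plus mode like 'r+'.
  • The .write() method writes a string to a file and returns the number of characters written; it does not add a line ending, so include \n where a new line is needed.
  • .writelines() writes each string from a list or tuple (or any iterable) in order; it does not add \n between them.
  • Data is buffered in memory first; it reaches the disk when the buffer fills, when .flush() is called, or when the file is closed.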
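
As a rough sketch of the writing methods above, the following writes a few lines, flushes, and then appends. The file name log.txt and its contents are made up for this example.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "log.txt")

with open(path, "w", encoding="utf-8") as f:
    count = f.write("alpha\n")           # returns characters written: 6
    f.writelines(["beta\n", "gamma\n"])  # newlines must be supplied by us
    f.flush()                            # hand buffered data to the OS now

# Append mode adds to the end instead of erasing the file.
with open(path, "a", encoding="utf-8") as f:
    f.write("delta\n")

with open(path, "r", encoding="utf-8") as f:
    written = f.read()
```

Reopening the file in 'w' mode instead of 'a' would have discarded the first three lines.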

Reading from a Text File

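  • Files must be opened in a mode that allows reading: 'r', 'r+', 'w+', or 'a+'. Note that 'w+' empties the file on opening, and 'a+' starts with the pointer at the end of the file.
  • .read(n) reads up to n characters in text mode (n bytes in binary mode); with no argument it reads the rest of the file.
  • .readline() reads one line at a time and returns an empty string at end of file, so a loop can use it to read all lines.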
  • .readlines() returns all lines as a list of strings.
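
The three reading methods can be compared side by side. This is a minimal sketch; the file name numbers.txt and its three lines are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("one\ntwo\nthree\n")

# read(n) takes n characters; read() takes whatever is left.
with open(path, "r", encoding="utf-8") as f:
    first_three = f.read(3)   # 'one'
    rest = f.read()           # '\ntwo\nthree\n'

# readline() returns '' at end of file, which ends the loop.
lines_via_readline = []
with open(path, "r", encoding="utf-8") as f:
    line = f.readline()
    while line:
        lines_via_readline.append(line)
        line = f.readline()

# readlines() returns every line, newline included, in one list.
with open(path, "r", encoding="utf-8") as f:
    all_lines = f.readlines()  # ['one\n', 'two\n', 'three\n']
```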

Setting Offsets in a File

  • A file pointer (offset) indicates the current read/write position.
  • .tell() returns the current pointer position in the file.
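  • .seek(offset) moves the pointer to offset, measured from the start of the file; an optional second argument (0 = start, 1 = current position, 2 = end) changes the reference point. Nonzero offsets relative to the current position or the end require binary mode.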
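
A small sketch of moving the file pointer follows. It uses binary mode as an assumption, because byte offsets are unambiguous there; the file name data.bin and its ten bytes are invented for the example.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"abcdefghij")

with open(path, "rb") as f:
    start = f.tell()          # 0: pointer begins at the start
    f.read(4)
    after_read = f.tell()     # 4: reading advanced the pointer
    f.seek(0)                 # jump back to the beginning
    again = f.read(2)         # b'ab'
    f.seek(-3, os.SEEK_END)   # three bytes before the end
    tail = f.read()           # b'hij'
```

os.SEEK_END is the named constant for the value 2; os.SEEK_SET and os.SEEK_CUR correspond to 0 and 1.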

Creating and Traversing a Text File

  • To create a new file, use open(filename, 'x'); this raises FileExistsError if the file already exists.
  • Opening in 'w' or 'a' will create the file if it doesn't exist; 'w' also erases the contents of an existing file.
  • Traversing means reading each character/line, typically using a loop.
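
Creating a file with 'x' and then traversing it could be sketched as below. The file name colors.txt and its contents are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "colors.txt")

# 'x' creates the file and fails if it is already there.
with open(path, "x", encoding="utf-8") as f:
    f.write("red\ngreen\nblue\n")

# A second 'x' open on the same path raises FileExistsError.
try:
    open(path, "x").close()
    second_create_failed = False
except FileExistsError:
    second_create_failed = True

# Line-by-line traversal: a file object yields one line per iteration.
names = []
with open(path, "r", encoding="utf-8") as f:
    for line in f:
        names.append(line.rstrip("\n"))

# Character-by-character traversal: loop over each line's characters.
char_count = 0
with open(path, "r", encoding="utf-8") as f:
    for line in f:
        for ch in line:
            char_count += 1   # counts newline characters too
```

Iterating over the file object reads one line at a time, so it also works for files too large to load with .read() or .readlines().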

Pickle Module and Binary File Handling

  • The pickle module serializes (pickles) Python objects into a binary file and deserializes (unpickles) them back.
  • Serialization: converting objects to byte streams for storage.
  • Deserialization: converting byte streams back to Python objects.
  • pickle.dump(obj, file) writes (dumps) an object; file must be in 'wb' mode.
  • pickle.load(file) reads (loads) an object; file must be in 'rb' mode.
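
A round trip through pickle.dump and pickle.load might look like this sketch. The file name student.pkl and the sample record are invented for the example.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file and record, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "student.pkl")
record = {"name": "Asha", "marks": [91, 85], "passed": True}

# Pickling: the file must be opened for binary writing.
with open(path, "wb") as f:
    pickle.dump(record, f)

# Unpickling: the file must be opened for binary reading.
with open(path, "rb") as f:
    restored = pickle.load(f)

same_value = restored == record      # True: contents are preserved
same_object = restored is record     # False: load builds a new object
```

Only unpickle files from a trusted source, since loading a pickle can run arbitrary code.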

Key Terms & Definitions

  • File: A named location for storing data persistently.
  • Text File: Stores data as readable characters.
  • Binary File: Stores data as raw bytes that are not interpreted as text; not human-readable.
  • File Mode: Setting that defines how a file is opened (read, write, append, binary).
  • Buffer: Temporary memory area that holds file data before it is written to disk.
  • Pointer/Offset: Current position marker within a file.
  • Serialization (Pickling): Converting Python objects to byte streams.
  • Deserialization (Unpickling): Converting byte streams back to Python objects.
  • EOL (End of Line): Special character (default '\n') marking the end of a line.

Action Items / Next Steps

  • Practice all file operations: opening, closing, reading, writing, traversing, and using the pickle module.
  • Review and write code using all file modes and methods discussed.
  • Complete any assigned exercises or practice problems on file handling.