CS50 Week on File I/O

Jul 1, 2024

CS50 - Introduction to Programming with Python

Week on File I/O (Input and Output of Files)

Key Concepts

  • File I/O: Writing code to read from (load information) and write to (save information) files.
  • Storage in Memory vs. Storage in Files: Data held in memory is lost when the program exits (non-persistent); data saved to a file survives between runs (persistent).

Working With Lists

  • Lists and Memory: Lists store multiple values in memory. Upon program exit, the contents in memory are lost.
  • Program Example: Collecting names using a list.
    names = []  # Start with an empty list
    for _ in range(3):
        name = input("What's your name? ")
        names.append(name)
    # Greet everyone in alphabetical order
    for name in sorted(names):
        print(f"Hello, {name}")
    
  • Problem: Once the program exits, the collected names are lost.

Reading and Writing to Files

  • Writing to Files: Use the open function with mode 'w' (write), which creates the file or overwrites its existing contents.
    with open("names.txt", "w") as file:
        name = input("What's your name? ")
        file.write(name + "\n")  # End each name with a newline
    
  • Appending to Files: Use mode 'a' (append) to add to the end of the file without overwriting it.
  • Common Issue: Writing names without a trailing newline runs them together on one line.
  • Best Practice: Use a with statement so the file is opened and closed automatically.
    with open("names.txt", "a") as file:
        name = input("What's your name? ")
        file.write(f"{name}\n")
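
The difference between modes 'w' and 'a', and the missing-newline issue noted above, can be seen in a small sketch. It assumes a scratch file in a temporary directory (so no real names.txt is touched); the helper names save and contents are hypothetical and exist only for this illustration.

```python
import os
import tempfile

# Assumption: a scratch file in a temporary directory stands in for names.txt.
path = os.path.join(tempfile.mkdtemp(), "names.txt")


def save(name, mode):
    """Write one name followed by a newline, opening the file in the given mode."""
    with open(path, mode) as file:
        file.write(f"{name}\n")


def contents():
    """Return the whole file as a single string."""
    with open(path) as file:
        return file.read()


# Mode "w" starts from an empty file each time, so only the last name survives.
save("Hermione", "w")
save("Harry", "w")
after_write = contents()

# Mode "a" keeps what is already there and adds to the end.
save("Ron", "a")
after_append = contents()

# Leaving out the newline runs the names together on one line.
with open(path, "a") as file:
    file.write("Draco")
    file.write("Hermione")
after_missing_newline = contents()

print(repr(after_write))
print(repr(after_append))
print(repr(after_missing_newline))
```

Printing with repr makes the newline characters visible, which helps when checking whether each name really ended up on its own line.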
    

Reading from Files

  • Reading All Lines: Use readlines() to read every line into a list at once; each line keeps its trailing newline.
    with open("names.txt") as file:
        lines = file.readlines()
    for line in lines:
        print(f"Hello, {line.strip()}")  # Stripping new lines
    
  • Looping Over the File Directly: Iterate over the file object to read one line at a time, without first loading every line into a list:
    with open("names.txt") as file:
        for line in file:
            print(f"Hello, {line.strip()}")
    
  • Sorting Names: Load, sort, and then print for ordered output.
    names = []
    with open("names.txt") as file:
        for line in file:
            names.append(line.strip())
    for name in sorted(names):
        print(f"Hello, {name}")
    
  • Reverse-Order Sorting: Pass reverse=True to the sorted function, as in sorted(names, reverse=True).
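
As a rough sketch of the load, sort, and print steps with reverse=True, the example below first creates its own scratch file (an assumption, so that it runs without an existing names.txt) and then greets the names in Z-to-A order.

```python
import os
import tempfile

# Assumption: a scratch file in a temporary directory stands in for names.txt.
path = os.path.join(tempfile.mkdtemp(), "names.txt")
with open(path, "w") as file:
    file.write("Hermione\nHarry\nRon\nDraco\n")

# Load every line, dropping the trailing newline from each.
names = []
with open(path) as file:
    for line in file:
        names.append(line.rstrip())

# sorted() returns a new list; reverse=True gives descending order.
descending = sorted(names, reverse=True)

for name in descending:
    print(f"Hello, {name}")
```

Because sorted returns a new list, names itself stays in file order and can still be reused afterwards.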

CSV Files

  • CSV (Comma-Separated Values): A plain-text format that stores several related values per line, separated by commas.
  • CSV Example:
    name,house
    Hermione,Gryffindor
    Harry,Gryffindor
    Ron,Gryffindor
    Draco,Slytherin
    
  • Reading CSV: Use csv.reader to get each row as a list, or csv.DictReader to access values by column name (taken from the header row).
    import csv
    with open("students.csv") as file:
        reader = csv.DictReader(file)
        for row in reader:
            print(f"{row['name']} is in {row['house']}")
    
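  • Writing CSV: Use csv.writer to write each row as a list; csv.DictWriter is the variant that takes field names.
    import csv
    name = input("What's your name? ")
    house = input("What's your house? ")
    # newline="" lets the csv module control line endings itself
    with open("students.csv", "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, house])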
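
The notes mention csv.reader and field names without showing either, so here is a minimal sketch that puts csv.DictWriter, csv.reader, and csv.DictReader side by side. It assumes a scratch file in a temporary directory rather than a real students.csv, and reuses two rows from the CSV example above.

```python
import csv
import os
import tempfile

# Assumption: a scratch file in a temporary directory stands in for students.csv.
path = os.path.join(tempfile.mkdtemp(), "students.csv")

# csv.DictWriter takes the column names up front and writes dicts as rows.
with open(path, "w", newline="") as file:
    writer = csv.DictWriter(file, fieldnames=["name", "house"])
    writer.writeheader()
    writer.writerow({"name": "Hermione", "house": "Gryffindor"})
    writer.writerow({"name": "Draco", "house": "Slytherin"})

# csv.reader yields each row as a plain list, header row included.
with open(path, newline="") as file:
    rows = list(csv.reader(file))

# csv.DictReader consumes the header row and yields one dict per data row.
with open(path, newline="") as file:
    students = list(csv.DictReader(file))

print(rows)
print(students)
```

With csv.reader the header arrives as an ordinary first row that the program has to skip itself, whereas csv.DictReader uses it as the keys, so columns can be reordered in the file without changing the code.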
    

Advanced CSV Operations

  • Using Lambda Functions: Pass a lambda as the key argument of sorted to choose which field to sort by (students here is a list of dictionaries).
    sorted(students, key=lambda student: student['name'])
    
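  • Handling CSV Headers: Keep a header row in the CSV file so that csv.DictReader can map each value to its column name; the keys used in code must match the header exactly.
    import csv
    with open("students.csv") as file:
        reader = csv.DictReader(file)
        for row in reader:
            # Keys come from the header row (name,house)
            print(f"{row['name']} is from {row['house']}")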
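
To make the lambda sorting key from this section concrete, the following sketch builds the students list by hand (an assumption; in practice it would typically be loaded with csv.DictReader) and sorts it by two different fields.

```python
# Assumption: students is a list of dictionaries, one per row of the CSV example.
students = [
    {"name": "Hermione", "house": "Gryffindor"},
    {"name": "Harry", "house": "Gryffindor"},
    {"name": "Ron", "house": "Gryffindor"},
    {"name": "Draco", "house": "Slytherin"},
]

# The lambda receives one student and returns the value to sort by.
by_name = sorted(students, key=lambda student: student["name"])

# The same idea with a different field, combined with reverse=True.
by_house_desc = sorted(students, key=lambda student: student["house"], reverse=True)

for student in by_name:
    print(f"{student['name']} is in {student['house']}")
```

A lambda suits this case because the key function is a single expression that is used only once; a named function defined with def would work equally well.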
    

Handling Binary Files (Images)

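  • Pillow Library: For image manipulation, including combining images into an animated GIF.
    import sys
    from PIL import Image
    images = []
    # Open every image file named on the command line
    for arg in sys.argv[1:]:
        image = Image.open(arg)
        images.append(image)
    # Save the first frame and append the rest; loop=0 repeats forever
    images[0].save("animation.gif", save_all=True, append_images=images[1:], loop=0)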
    

Summary

  • Files are essential for persistent data storage.
  • Libraries such as csv and Pillow handle more complex file operations and formats.
  • Reading, writing, and appending are the basic file operations; stripping newlines and keeping CSV headers consistent keeps the data clean.

This concludes the lecture on file I/O and handling different file types in Python.