Understanding Python Threading Misconceptions

Aug 23, 2024

Lecture on Misconceptions about Python Threading

Overview

  • Two Common Misconceptions
    • Misconception 1: Python is a single-threaded language.
    • Misconception 2: To use multiple cores in Python, avoid threading in favor of multiprocessing.

Misconception 1: Python is a Single-threaded Language

  • Reality:

    • Python supports threading with real system threads, not just green threads.
    • Demonstration with htop and IPython showing multiple threads.
    • Threading Example:
      • Import threading.
      • Create and start a new thread with a task.
      • Threads show up in htop as system-level threads.
  • Async in Python:

    • Suitable for IO-bound tasks and web-related activities.
    • Runs coroutines cooperatively on an event loop (green-thread-style concurrency), all within a single thread.
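
The contrast between real threads and async can be checked from inside Python as well as in htop. The following is a minimal sketch, not the lecture's own demo: the helper names (report, coro, run_coroutines) are made up for illustration, and it assumes Python 3.8+ for threading.get_native_id().

```python
import asyncio
import threading


def report(label, seen):
    # Record the OS-level (native) ID of the thread running this call.
    seen[label] = threading.get_native_id()


async def coro(label, seen):
    await asyncio.sleep(0)  # hand control back to the event loop once
    report(label, seen)


async def run_coroutines(seen):
    await asyncio.gather(coro("coro-a", seen), coro("coro-b", seen))


seen = {}
report("main", seen)

# A threading.Thread is backed by a real system thread with its own native ID.
worker = threading.Thread(target=report, args=("worker", seen))
worker.start()
worker.join()

# Coroutines all run on the thread that runs the event loop (here: main).
asyncio.run(run_coroutines(seen))

for label, native_id in seen.items():
    print(f"{label}: native thread id {native_id}")
```

The worker reports a different native ID from the main thread, while both coroutines report the main thread's ID. The worker here exits almost immediately; to watch it in htop, the task would need to run for a few seconds.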

Misconception 2: Use Multiprocessing for Multiple Cores

  • Context-Dependent Reality:

    • Technically true for pure Python code due to the Global Interpreter Lock (GIL).
    • GIL:
      • Forces every thread running Python bytecode to hold a shared lock, so only one thread executes bytecode at a time and pure Python code is limited to roughly one core.
      • Multiprocessing launches separate interpreter processes, overcoming this limitation.
  • Exceptions with Libraries:

    • Libraries like NumPy, Numba, Cython, and others can release the GIL while running compiled code.
    • Example with NumPy:
      • Threading can be efficient when the work is done on NumPy arrays, because NumPy releases the GIL internally during many array operations.
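
The GIL's effect on pure Python code can be sketched with a small timing comparison. This is an illustration under assumptions, not the lecture's benchmark: count_primes, LIMIT, and JOBS are hypothetical, and the behavior described applies to a standard GIL-enabled CPython build.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound, pure-Python work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


LIMIT = 30_000  # hypothetical problem size
JOBS = 4        # hypothetical number of identical tasks

# Run the tasks one after another on the main thread.
start = time.perf_counter()
sequential = [count_primes(LIMIT) for _ in range(JOBS)]
sequential_s = time.perf_counter() - start

# Run the same tasks on a thread pool.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=JOBS) as pool:
    threaded = list(pool.map(count_primes, [LIMIT] * JOBS))
threaded_s = time.perf_counter() - start

print(f"sequential: {sequential_s:.2f}s, threads: {threaded_s:.2f}s")
```

On a GIL-enabled interpreter the two timings are expected to be similar, since the threads take turns holding the lock. Swapping in concurrent.futures.ProcessPoolExecutor (under an `if __name__ == "__main__":` guard) is the usual way to spread this kind of work across cores.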

Demonstration of Bypassing the GIL

  • Example Function:

    • JIT compiled function with Numba.
    • Initial setup underestimated CPU capacity, requiring adjustment.
    • Use of Thread Pool Executors:
      • First run, without releasing the GIL: showed the resulting CPU utilization as a baseline.
      • Second run, with the GIL released: showed increased CPU usage.
  • Matrix Multiplication Example:

    • Showed improved CPU usage with GIL released.
    • Illustrates the potential when the GIL restriction is bypassed.
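
A rough sketch of the matrix multiplication case, assuming NumPy is installed. The matrix sizes, worker count, and the helper name square are illustrative choices, not values from the lecture. The lecture's Numba function is not reproduced here; with Numba, the analogous step is presumably the nogil=True option of numba.njit.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical workload: eight 300x300 matrices.
matrices = [rng.random((300, 300)) for _ in range(8)]


def square(m):
    # The multiplication runs in compiled code, where NumPy can release the GIL,
    # so several threads may make progress at the same time.
    return m @ m


with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(square, matrices))

# Same work on the main thread, used only to check the threaded results.
sequential = [square(m) for m in matrices]
results_match = all(np.allclose(t, s) for t, s in zip(threaded, sequential))
print(f"threaded results match sequential results: {results_match}")
```

One caveat when reading CPU usage for a demo like this: many NumPy builds link against a BLAS library that is itself multi-threaded, so a single large multiplication may already use several cores regardless of Python-level threads.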

Comparison of Threading to Multiprocessing

  • Threading Advantages:

    • Lower RAM impact.
    • Potentially higher CPU usage when GIL can be bypassed.
  • Multiprocessing:

    • Higher RAM requirements.
    • In the example, the process pool used more RAM and made less efficient use of the cores.
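
One reason for the RAM difference can be sketched without starting any processes: threads see the same in-memory objects, while arguments handed to a process pool are pickled and copied into each worker. The dataset and helper name below are hypothetical, and this does not measure the per-process interpreter overhead, which adds to the total.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

data = list(range(1_000_000))  # stand-in for a large in-memory dataset


def object_id_seen_by_worker(_):
    # Threads live in the same process, so they see the very same object.
    return id(data)


with ThreadPoolExecutor(max_workers=4) as pool:
    ids = list(pool.map(object_id_seen_by_worker, range(4)))

shared = all(i == id(data) for i in ids)

# Arguments passed to a process pool are serialized with pickle and
# rebuilt as a separate copy inside the worker process.
payload_bytes = len(pickle.dumps(data))

print(f"threads share one object: {shared}")
print(f"bytes serialized to send `data` to one worker process: {payload_bytes:,}")
```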

Conclusion

  • Detailed explanation of the nuances of Python threading and the GIL.
  • Importance of understanding when and how to bypass the GIL for improved performance.
  • Encouragement to explore further nuances and subscribe for more Python-related content.