Dave's Lecture on the CrowdStrike Issue

Jul 22, 2024

Introduction

  • Speaker: Dave Plummer, retired software engineer from Microsoft.
  • Topics:
    • CrowdStrike issue and blue screen crashes.
    • Kernel mode vs. user mode.
    • Fixing affected machines.

Background

  • Context:
    • Dave's past experience with Windows NT and handling blue screen crashes.
    • Explanation of how Microsoft handled system stress tests.
      • Stress tests: Automated tests run on machines to catch bugs in real time.
      • Debugging: Connecting to the target machines and troubleshooting in assembly language.

Kernel Mode vs. User Mode

  • Kernel Mode:

    • Core system tasks: hardware interaction, memory management, thread scheduling.
    • Runs at a higher privilege level and has full system access.
    • A crash in kernel mode brings down the whole system (blue screen).
  • User Mode:

    • Runs application code with limited privileges.
    • Application crashes don't affect the entire system.
  • Interaction:

    • When user-mode code needs a kernel service, it makes a system call that traps into kernel mode; the kernel performs the work and returns the result to user mode.
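
To illustrate the point that a user-mode crash stays contained, here is a minimal sketch in Python. It is not from the lecture: the helper name run_crashing_child is hypothetical, and os.abort() merely stands in for an application bug. The parent process starts a child that terminates abnormally, then carries on unaffected.

```python
import subprocess
import sys


def run_crashing_child() -> int:
    """Start a child Python process that aborts itself and return its exit status."""
    # os.abort() ends the child abnormally, standing in for an application crash.
    result = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    status = run_crashing_child()
    # The operating system cleaned up the crashed child; this process keeps running.
    print(f"child exited abnormally with status {status}; parent is still running")
```

The exact status value differs between operating systems, so the sketch only relies on it being nonzero. A fault of the same kind inside a kernel-mode driver has no such boundary around it, which is why it ends in a blue screen.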

CrowdStrike Specifics

  • Falcon Sensor: CrowdStrike security product operating in kernel mode.

    • Analyzes application behavior to detect new attacks.
    • Operates as a device driver, giving it deep system access.
  • WHQL (Windows Hardware Quality Labs) Certification: Microsoft's testing and signing program, intended to ensure drivers are tested and certified safe for Windows.

Issue Explanation

  • Dynamic Definition Files: Used to keep CrowdStrike updated against new threats.

    • Allows faster updates but adds risk of unsigned code running in kernel mode.
  • Bug Details:

    • Crash example: Invalid pointer causing a crash (Twitter example).
    • Cause: A CrowdStrike update shipped zeroed-out data files, which led to the crashes.
  • Postmortem Debugging: Identifying the null pointer issue and its upstream causes.

    • Inadequate parameter validation in CrowdStrike driver.
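
The kind of parameter validation the notes describe as missing can be sketched as follows. This is an illustration only: the file layout (a 4-byte signature, an entry count, fixed-size entries), the names MAGIC, HEADER, ENTRY_SIZE, and validate_definition are all hypothetical and do not reflect CrowdStrike's actual format or code. The idea is simply to reject a zeroed-out or malformed file before anything tries to interpret its contents.

```python
import struct

# Hypothetical layout, invented for this sketch (not CrowdStrike's real format):
# a 4-byte signature, a little-endian 32-bit entry count, then fixed-size entries.
MAGIC = b"DEFS"
HEADER = struct.Struct("<4sI")
ENTRY_SIZE = 16


def validate_definition(data: bytes) -> None:
    """Raise ValueError if the definition data is empty, zeroed out, or malformed."""
    if len(data) < HEADER.size:
        raise ValueError("file too short to contain a header")
    if not any(data):
        # Every byte is zero: nothing in the file can be trusted.
        raise ValueError("file consists entirely of zero bytes")
    magic, count = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("unexpected signature")
    expected_size = HEADER.size + count * ENTRY_SIZE
    if len(data) != expected_size:
        raise ValueError(f"expected {expected_size} bytes, got {len(data)}")


if __name__ == "__main__":
    try:
        validate_definition(bytes(64))  # a zeroed-out file, as in the incident
    except ValueError as err:
        print(f"rejected: {err}")
```

Failing early with a clear error lets the caller skip the bad file and keep running, instead of following a pointer built from garbage data.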

System Recovery

  • Safe Mode: Loads only a limited set of drivers, so the machine can boot far enough to be fixed.
    • Steps:
      1. Boot into safe mode.
      2. Navigate to the C:\Windows\System32\drivers\CrowdStrike folder.
      3. Delete the faulty channel file matching C-00000291*.sys.
      4. Reboot the system, which should then function normally.
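
The deletion step can be sketched as a small script. This is a sketch under assumptions, not part of the lecture: the file-name pattern C-00000291*.sys is taken from CrowdStrike's published remediation guidance, the function name remove_faulty_channel_files is hypothetical, and the default folder assumes Windows is installed on C:. On a machine stuck in safe mode the same step would more likely be done by hand.

```python
from pathlib import Path

# Default location from the recovery steps; assumes Windows is installed on C:.
DEFAULT_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
# File-name pattern from CrowdStrike's published remediation guidance.
PATTERN = "C-00000291*.sys"


def remove_faulty_channel_files(directory: Path = DEFAULT_DIR, dry_run: bool = True) -> list[Path]:
    """Return the files matching PATTERN in directory, deleting them unless dry_run is True."""
    matches = sorted(directory.glob(PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Dry run by default: list what would be deleted without touching anything.
    for path in remove_faulty_channel_files(dry_run=True):
        print(f"would delete {path}")
```

Defaulting to a dry run is a deliberate choice here: the folder also holds files that must stay in place, so it is worth checking the matches before passing dry_run=False.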

Conclusion

  • Emphasis on understanding kernel and user mode differences and the implications for system stability.
  • Practical steps to fix systems affected by the CrowdStrike issue.
  • Additional resources: Dave's book on the autism spectrum.