Low Latency C++ Trading System Notes

Jun 25, 2024

Introduction

  • Importance of building automated trading systems.
  • Challenges: trading problem and technological problem.
  • Increase in market participation over the last decade.

Why Low Latency is Essential

  • Market makers like Optiver have numerous orders (hundreds of thousands to millions) in the market at any given time.
  • Immediate action required during market-moving events (e.g., canceling obsolete buy orders).
  • Building Blocks of Low Latency Systems:
    • Data Model Access
    • System Tuning
    • Performance Measurement

Designing for Performance

  • Donald Knuth: "Premature optimization is the root of all evil."
  • Even so, performance should be considered from the beginning of the design.
  • Strategy vs. Tactics:
    • Strategy: Overall approach to meet performance goals.
    • Tactics: Implementation details.
  • Importance of data modeling for performance.

Goals and Current Latency

  • Goal: handle events with a certain latency distribution (e.g., average less than 100 nanoseconds).
  • Financial market latency requirements are very stringent (e.g., ~10 nanoseconds with FPGA implementation).
  • C++ is still relevant for low latency because trading systems are complex and involve multiple processing steps.

Data Model Access

  • Profile results show that improper data modeling leads to widespread slowness.
  • Example of optimizing an instrument store: using a stable vector instead of a std::unordered_map for better locality and cache efficiency.
  • Importance of keeping working set size (WSS) small and localized in memory.
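
The instrument-store idea above can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: the `Instrument` fields, the `StableVector` and `InstrumentStore` names, and the chunk size are all assumptions. It assumes instruments are registered at startup and then addressed by a dense index, so a hot-path lookup is two array indexings with no hashing and no per-node allocation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical instrument record; the fields are illustrative only.
struct Instrument {
    std::uint32_t id;
    std::int64_t  last_price;  // price in ticks
    std::int32_t  position;
};

// Minimal chunked "stable vector": elements live in fixed-size chunks, so
// growing the container never moves existing elements and references to
// them stay valid. Elements within a chunk are contiguous in memory.
template <typename T, std::size_t ChunkSize = 1024>
class StableVector {
public:
    T& push_back(const T& value) {
        if (size_ % ChunkSize == 0) {
            // Current chunks are full (or none exist yet): allocate a new one.
            chunks_.push_back(std::make_unique<std::array<T, ChunkSize>>());
        }
        T& slot = (*chunks_[size_ / ChunkSize])[size_ % ChunkSize];
        slot = value;
        ++size_;
        return slot;
    }

    T& operator[](std::size_t i) {
        return (*chunks_[i / ChunkSize])[i % ChunkSize];
    }
    const T& operator[](std::size_t i) const {
        return (*chunks_[i / ChunkSize])[i % ChunkSize];
    }

    std::size_t size() const { return size_; }

private:
    std::vector<std::unique_ptr<std::array<T, ChunkSize>>> chunks_;
    std::size_t size_ = 0;
};

// Hypothetical store: add() hands back a dense index that callers keep
// and use as a handle instead of looking instruments up by key.
class InstrumentStore {
public:
    std::size_t add(const Instrument& inst) {
        instruments_.push_back(inst);
        return instruments_.size() - 1;
    }
    Instrument& at(std::size_t index) { return instruments_[index]; }
    std::size_t size() const { return instruments_.size(); }

private:
    StableVector<Instrument> instruments_;
};
```

Compared with a std::unordered_map keyed by instrument id, neighbouring instruments here share cache lines and a lookup touches no hash buckets. The trade-off is that callers must carry the index around, and the sketch does not support removal.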

Concurrent Access

  • Sharing Data between Applications:
    • Use of ring buffer for events.
    • State and Event approaches for data sharing.
  • Example: seqlock (sequence lock) for efficient read-write operations.
    • Good for systems with one producer and many consumers.
  • Event Notifications using Shared Memory queues for low-latency communication.
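
A seqlock for the one-producer, many-consumers case can be sketched as below. This is a minimal single-writer version under stated assumptions, not the implementation from the talk: the `Seqlock` class and the `Quote` payload are hypothetical names, the payload must be trivially copyable, and it is stored as relaxed atomic 64-bit words so that concurrent reads and writes are not a data race under the C++ memory model. The writer makes the sequence counter odd while it is writing; readers retry if they see an odd counter or if the counter changed during their copy.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Single-writer seqlock sketch. Readers never block the writer.
template <typename T>
class Seqlock {
    static_assert(std::is_trivially_copyable<T>::value,
                  "payload must be trivially copyable");
    static_assert(std::is_default_constructible<T>::value,
                  "payload must be default constructible");
    static constexpr std::size_t kWords =
        (sizeof(T) + sizeof(std::uint64_t) - 1) / sizeof(std::uint64_t);

public:
    Seqlock() {
        for (std::size_t i = 0; i < kWords; ++i) {
            words_[i].store(0, std::memory_order_relaxed);
        }
    }

    // Must only be called from one thread (the single producer).
    void store(const T& value) {
        std::uint64_t buf[kWords] = {};
        std::memcpy(buf, &value, sizeof(T));

        const std::uint64_t seq = seq_.load(std::memory_order_relaxed);
        seq_.store(seq + 1, std::memory_order_relaxed);  // odd: write in progress
        std::atomic_thread_fence(std::memory_order_release);
        for (std::size_t i = 0; i < kWords; ++i) {
            words_[i].store(buf[i], std::memory_order_relaxed);
        }
        seq_.store(seq + 2, std::memory_order_release);  // even: write complete
    }

    // May be called from any number of consumer threads.
    T load() const {
        std::uint64_t buf[kWords];
        std::uint64_t before, after;
        do {
            before = seq_.load(std::memory_order_acquire);
            for (std::size_t i = 0; i < kWords; ++i) {
                buf[i] = words_[i].load(std::memory_order_relaxed);
            }
            std::atomic_thread_fence(std::memory_order_acquire);
            after = seq_.load(std::memory_order_relaxed);
        } while (before != after || (before & 1));  // retry on a torn read

        T out;
        std::memcpy(&out, buf, sizeof(T));
        return out;
    }

private:
    std::atomic<std::uint64_t> seq_{0};
    std::atomic<std::uint64_t> words_[kWords];
};

// Hypothetical payload shared between producer and consumers.
struct Quote {
    std::int64_t bid;
    std::int64_t ask;
};
```

The writer never waits for readers, which is why this fits a market-data style producer well. Readers may spin if the writer updates continuously, and for use across processes the object would have to be placed in shared memory, which the sketch does not show.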

System Tuning

  • Importance of CPU Clock Frequency and Power States (C-states and P-states) for consistent performance.
  • System tuning practices to minimize jitter and ensure stable performance (disable power-saving features when necessary).
  • Optimization techniques include interrupt affinity, CPU core isolation, leveraging huge pages, and NUMA awareness.
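
Core isolation is configured at the operating-system level, but the application still has to place its latency-critical thread on the chosen core. The following is a minimal, Linux-only sketch of that step using sched_getaffinity and sched_setaffinity; the helper names `first_allowed_cpu` and `pin_current_thread` are hypothetical, and which core to use in practice depends on how the machine has been set up.

```cpp
#include <sched.h>  // Linux-specific: cpu_set_t, sched_getaffinity, sched_setaffinity

// Returns the lowest-numbered CPU the calling thread is allowed to run on,
// or -1 if the affinity mask could not be read.
int first_allowed_cpu() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        return -1;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set)) {
            return cpu;
        }
    }
    return -1;
}

// Restricts the calling thread to a single CPU. Returns true on success.
// A pid of 0 means "the calling thread" for sched_setaffinity.
bool pin_current_thread(int cpu) {
    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        return false;
    }
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```

A typical use would be to call `pin_current_thread` once at the start of the hot-path thread, with a core number taken from configuration rather than from `first_allowed_cpu`, which is only here to make the sketch self-contained.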

Performance Measurement and Scaling

  • Continual measurement and monitoring are essential for long-term system performance.
  • Building performance metrics and logging into the system from the beginning.
  • Example of using the RDTSC instruction for timestamping and evaluating system performance over time.
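
Timestamping with RDTSC and summarizing the resulting latency distribution can be sketched as below. This is an illustration under assumptions: `read_ticks`, `measure`, and `percentile` are hypothetical helper names, the `__rdtsc()` intrinsic is taken from `<x86intrin.h>` (GCC and Clang on x86), and other platforms fall back to std::chrono::steady_clock. The values are raw ticks, not nanoseconds; converting them requires knowing the TSC frequency, which the sketch does not attempt.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
// Reads the CPU time-stamp counter. RDTSC is not a serializing instruction,
// so very short measurements can be blurred by out-of-order execution.
inline std::uint64_t read_ticks() { return __rdtsc(); }
#else
// Fallback for non-x86 targets: steady_clock ticks instead of TSC ticks.
inline std::uint64_t read_ticks() {
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
}
#endif

// Runs f() `iterations` times and records the tick delta of each call.
template <typename F>
std::vector<std::uint64_t> measure(F&& f, std::size_t iterations) {
    std::vector<std::uint64_t> samples;
    samples.reserve(iterations);  // avoid allocating inside the timed loop
    for (std::size_t i = 0; i < iterations; ++i) {
        const std::uint64_t start = read_ticks();
        f();
        const std::uint64_t end = read_ticks();
        samples.push_back(end - start);
    }
    return samples;
}

// Nearest-rank percentile for p in [0, 100]; returns 0 for an empty input.
std::uint64_t percentile(std::vector<std::uint64_t> samples, double p) {
    if (samples.empty()) {
        return 0;
    }
    std::sort(samples.begin(), samples.end());
    std::size_t rank = static_cast<std::size_t>(
        std::ceil(p / 100.0 * static_cast<double>(samples.size())));
    rank = std::max<std::size_t>(rank, 1);
    rank = std::min(rank, samples.size());
    return samples[rank - 1];
}
```

Reporting the median alongside high percentiles such as the 99th, rather than only an average, shows the tail of the distribution, which is where jitter from the system-tuning issues above tends to appear.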

Conclusion

  • Summary of key strategies and tactics discussed.
  • Importance of robustness and speed in trading systems.
  • References for further reading on low-latency and data modeling.

References

  • Ulrich Drepper’s paper on memory and cache profiling.
  • Mike Acton’s talk on data-oriented design.
  • Papers on seqlocks and performance tuning.