Resiliency

Feb 23, 2025

Information Security and High Availability

Importance of Uptime and Availability

  • Key focus in information security: maintaining system uptime and availability.

High Availability (HA)

  • Provides enhanced system resiliency.
  • Systems are designed to stay on and available.
  • If one system fails, another takes over immediately.
  • Cost: The additional components and systems make HA more expensive.
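
The "another takes over" behavior can be sketched as a priority-ordered failover check. This is a minimal illustration only: the Node class, the node names, and the pick_active function are hypothetical, and a real HA setup would rely on health checks and a cluster manager rather than a manually set flag.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a server with a health flag."""
    name: str
    healthy: bool = True


def pick_active(nodes: List[Node]) -> Optional[Node]:
    """Return the first healthy node in priority order, or None if all are down."""
    for node in nodes:
        if node.healthy:
            return node
    return None


nodes = [Node("primary"), Node("standby")]
active_before = pick_active(nodes)  # the primary serves traffic
nodes[0].healthy = False            # simulate a failure of the primary
active_after = pick_active(nodes)   # the standby takes over
```

The standby only adds value if it is already running when the primary fails, which is where the extra cost of HA comes from.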

Engineering for High Availability

  • Multiple Systems: Necessary for redundancy.
  • Upgraded Power: Required for continuous operation.
  • Higher-Quality Components: More reliable hardware, which adds to overall cost.

Server Clustering

  • Definition: Multiple servers acting as one large server.
  • Scalability: Ability to add/remove servers as needed.
  • Interoperability: Servers typically run identical operating systems.
  • Shared Storage: Servers typically use shared storage so that every node works from the same data.

Load Balancing

  • Function: Distributes load across multiple servers.
  • Independence: Servers are unaware of each other.
  • Flexibility: Servers behind the load balancer can run different operating systems.
  • Dynamic Allocation: Allows adding/removing servers.
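
One common way to distribute load is round-robin, where requests go to each server in turn. The notes do not say which algorithm a load balancer uses, so the sketch below is an assumption chosen for simplicity. The RoundRobinBalancer class and the server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hands out servers in rotation; servers can be added or removed at any time."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0  # index of the server that gets the next request

    def add_server(self, server):
        self._servers.append(server)

    def remove_server(self, server):
        self._servers.remove(server)
        # Keep the rotation index inside the bounds of the shorter list.
        self._next = self._next % len(self._servers) if self._servers else 0

    def next_server(self):
        if not self._servers:
            raise RuntimeError("no servers available")
        server = self._servers[self._next]
        self._next = (self._next + 1) % len(self._servers)
        return server


lb = RoundRobinBalancer(["web-1", "web-2"])
first_four = [lb.next_server() for _ in range(4)]

lb.add_server("web-3")  # scale out without touching the other servers
next_three = [lb.next_server() for _ in range(3)]
```

The servers never reference one another; only the balancer knows the full list, which is why they can differ in operating system and be added or removed independently.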

Site Resiliency

  • Definition: Recovery sites in different locations for disaster recovery.
  • Types of Recovery Sites:
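    • Hot Site: Exact replica of the main data center, kept ready to take over.
    • Cold Site: Empty building; equipment and data must be brought in and set up during a disaster.
    • Warm Site: Middle ground between hot and cold, with some equipment and data already in place.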
  • Geographical Dispersion: Sites should be far enough apart to avoid simultaneous disaster impact.
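
Whether two sites are "far enough apart" can be checked roughly by computing the great-circle distance between them with the haversine formula. This is a sketch under stated assumptions: the function names are hypothetical, the Earth is treated as a sphere, and the minimum distance is a parameter because the notes give no specific figure.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def far_enough(site_a, site_b, min_km):
    """True if the two (lat, lon) sites are at least min_km apart."""
    return distance_km(*site_a, *site_b) >= min_km
```

Distance is only one factor. Two sites can be far apart and still share a flood plain, a power grid, or a network provider, so a check like this complements a risk assessment rather than replacing it.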

Platform Diversity

  • Vulnerability Mitigation: Use different operating systems so that a single vulnerability does not affect every system.
  • Example: Mixing Windows, Linux, and macOS.

Cloud Resilience

  • Multiple Providers: Use different cloud providers for redundancy (e.g., AWS, Azure, Google Cloud).
  • Security Perspective: Reduces risk if one provider is compromised.
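
The multi-provider idea can be sketched as writing the same object to every provider and tolerating individual failures. Everything here is hypothetical: the replicate function, the provider names, and the upload callables are in-memory stand-ins, not real AWS, Azure, or Google Cloud SDK calls.

```python
from typing import Callable, Dict, List


def replicate(key: str, data: bytes,
              uploaders: Dict[str, Callable[[str, bytes], None]]) -> List[str]:
    """Try every provider and return the names of those that stored the object."""
    stored = []
    for name, upload in uploaders.items():
        try:
            upload(key, data)
            stored.append(name)
        except Exception:
            # A failure at one provider must not stop writes to the others.
            continue
    return stored


store_a = {}  # in-memory stand-in for provider A's storage


def upload_a(key, data):
    store_a[key] = data


def upload_b(key, data):
    raise ConnectionError("provider B unreachable")  # simulated outage


result = replicate("backup.tar", b"payload",
                   {"provider-a": upload_a, "provider-b": upload_b})
```

In this run the data still lands with provider A even though provider B is down, which is the redundancy the bullet points describe.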

Continuity of Operations Planning (COOP)

  • Non-Technical Alternatives: Plan how to keep providing services when technology is unavailable.
  • Examples:
    • Manual transaction completion.
    • Paper receipts.
    • Phone-based credit card approvals.

Conclusion

  • Planning for high availability and disaster recovery is crucial.
  • Implementing diverse and redundant systems minimizes downtime and helps maintain continuity.