AI and Machine Learning in Network Operations
May 9, 2025
Lecture Notes: AI and Machine Learning in Network Operations
Introduction
Application of AI and machine learning to prevent IT infrastructure failures.
Focus on eliminating noise and reducing remediation time.
OCTA's role in transforming network operations.
Overview of OCTA
Founded by the speaker, who is the CEO.
Applies AI/ML to operational data in networks, servers, and infrastructure.
Operates in data centers, service provider networks, and large-scale LLM infrastructures.
Challenges in Network Operations
Historically, network operations have been noisy and reactive.
Vast amounts of operational data are spread across many layers (cloud, data center, 5G, etc.).
Operators have relied on siloed tools to mine this data.
OCTA's AI and ML Solutions
Uses real-time algorithms, largely unsupervised, to find insights from data.
Detects anomalies, predicts issues, and correlates events to reduce noise.
Proven to reduce ticket volume by 70-90% and to cut detection time from 47 minutes to 1 minute.
Platform is software-only, scalable, and can be deployed on-premise or as a SaaS.
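As an illustration of what unsupervised, real-time anomaly detection on a metric stream can look like, here is a minimal sketch using a trailing-window z-score. This is not OCTA's algorithm (the notes say those are custom-built); the function name, window size, and threshold are assumptions chosen for the example.

```python
from collections import deque
from statistics import mean, pstdev


def rolling_zscore_anomalies(samples, window=10, threshold=3.0, min_sigma=1e-6):
    """Return (index, value, zscore) for samples far from the trailing window.

    No labels are needed: each sample is compared only with the mean and
    standard deviation of the previous `window` samples.
    """
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(samples):
        if len(history) == window:
            mu = mean(history)
            # Floor sigma so a perfectly flat baseline does not divide by zero.
            sigma = max(pstdev(history), min_sigma)
            z = (value - mu) / sigma
            if abs(z) >= threshold:
                anomalies.append((i, value, z))
        history.append(value)
    return anomalies


# Hypothetical per-minute error counts with one spike at index 30.
baseline = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]
series = baseline * 3 + [60] + [10, 11]
for index, value, z in rolling_zscore_anomalies(series):
    print(f"sample {index}: value={value} z={z:.1f}")
```

A trailing window keeps the baseline adaptive, at the cost of a short blind spot right after a spike, when the spike itself inflates the window's standard deviation.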
Integration and Data Collection
Collects data from infrastructure components (switches, routers, servers, etc.).
Integrates with data lakes like Prometheus and Splunk.
Designed for big data and streaming telemetry.
Unique Algorithms and Approach
Built custom algorithms after finding open-source solutions inadequate.
Focuses on misbehavior detection across the TCP, optical, and HTTP layers.
Automates actions such as ticket generation and issue remediation.
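To make the link between event correlation and ticket generation concrete, the sketch below groups alerts that arrive close together in time into a single incident, so one ticket is opened per incident rather than per alert. The `Alert` type, the `group_alerts` helper, and the 60-second gap are hypothetical; the notes do not describe how OCTA correlates events.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since some reference point
    device: str
    message: str


def group_alerts(alerts, gap_seconds=60.0):
    """Group alerts into incidents by time proximity.

    An alert joins the current incident when it arrives within
    `gap_seconds` of the previous alert; otherwise it starts a new one.
    """
    incidents = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if incidents and alert.timestamp - incidents[-1][-1].timestamp <= gap_seconds:
            incidents[-1].append(alert)
        else:
            incidents.append([alert])
    return incidents


# Hypothetical alert burst followed by an unrelated later event.
alerts = [
    Alert(0, "leaf1", "link flap"),
    Alert(10, "leaf1", "BGP session down"),
    Alert(50, "spine2", "route withdrawn"),
    Alert(300, "srv7", "disk latency high"),
    Alert(320, "srv7", "I/O timeout"),
]
incidents = group_alerts(alerts)
print(f"{len(alerts)} alerts -> {len(incidents)} tickets")
```

Time proximity is the simplest possible correlation key; a production system would likely also use topology or shared attributes, which this sketch leaves out.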
Use Cases and Applications
Observability and AI Ops across hybrid cloud, data center, 5G, and similar environments.
Use cases include:
TCP retransmissions and congestion correlation.
Optical misbehavior detection ahead of failures.
Post-change verification for BGP and other changes.
LLM infrastructure job metrics analysis.
Synthetic probing for quick issue identification.
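The first use case, correlating TCP retransmissions with congestion, can be sketched as a Pearson correlation between two time-aligned series. The metric names and sample values below are made up for illustration, and the notes do not say which statistic OCTA uses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


# Hypothetical per-minute counters from the same interval.
tcp_retransmits = [1, 2, 1, 8, 9, 2, 1]
queue_drops = [0, 1, 0, 7, 9, 1, 0]
r = pearson(tcp_retransmits, queue_drops)
print(f"correlation between retransmits and queue drops: {r:.2f}")
```

A coefficient close to 1 suggests the retransmissions rise and fall with congestion on that path, which is a hint about cause rather than proof of it.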
AI Ops and the Role of LLMs
AI Ops is a reality, already implemented at large scales.
LLMs play a supporting role but are not central to OCTA's strategy.
Unsupervised ML is used for log analysis and anomaly detection.
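One common unsupervised approach to log analysis is to mask the variable parts of each line (numbers, IP addresses) to obtain a template, then flag templates that occur rarely. The sketch below assumes that approach purely for illustration; the helper names and the masking regex are hypothetical and not taken from the talk.

```python
import re
from collections import Counter

# Mask IPv4 addresses, hex literals, and standalone integers.
_VARIABLE = re.compile(r"\b(?:\d{1,3}(?:\.\d{1,3}){3}|0x[0-9a-fA-F]+|\d+)\b")


def template_of(line):
    """Replace variable tokens with a placeholder to get the line's template."""
    return _VARIABLE.sub("<*>", line)


def rare_templates(lines, max_count=1):
    """Return templates seen at most `max_count` times, in first-seen order."""
    counts = Counter(template_of(line) for line in lines)
    return [template for template, count in counts.items() if count <= max_count]


# Hypothetical syslog-style messages.
logs = [
    "BGP neighbor 10.0.0.1 Up",
    "BGP neighbor 10.0.0.2 Up",
    "BGP neighbor 10.0.0.3 Up",
    "Fan tray 2 failure detected",
]
print(rare_templates(logs))
```

No labeled examples are required: routine messages collapse into a few frequent templates, and whatever is left over is a candidate anomaly for a human or a downstream model to review.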
Market Impact and Future
Adoption of and interest in AI Ops are growing, and users report significant ROI.
Gartner projects a major increase in enterprise adoption by 2030.
OCTA aims to help enterprises mature their network operations with AI Ops.
Conclusion
AI Ops in the network is mature and beneficial.
OCTA offers a comprehensive approach to transform network operations.
Encouragement to engage with OCTA for operational transformation.