Overview
This lecture introduces software sockets, covering what they are, how they work in network communication, their types, lifecycle, and their crucial role in modern computing.
What is a Socket?
- A socket is an operating system abstraction that lets processes communicate, whether they run on the same machine or on different hosts across a network.
- Each socket is one endpoint of a two-way communication channel. A network socket combines a transport protocol (TCP or UDP), an IP address, and a port number.
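As a rough illustration of what a socket "wraps", the sketch below creates a socket with Python's standard `socket` module, binds it to the loopback address, and reports its protocol type, IP address, and port. The helper name `socket_summary` is hypothetical and exists only for this example; binding to port 0 is an assumption made so that the operating system picks any free port.

```python
import socket


def socket_summary(kind: int) -> dict:
    """Create a loopback socket of the given kind and describe what it wraps.

    `kind` is socket.SOCK_STREAM (TCP) or socket.SOCK_DGRAM (UDP).
    """
    with socket.socket(socket.AF_INET, kind) as sock:
        # Port 0 asks the OS to assign any free port (assumption for the demo).
        sock.bind(("127.0.0.1", 0))
        ip, port = sock.getsockname()
        return {
            "family": sock.family.name,  # e.g. "AF_INET" (IPv4)
            "type": sock.type.name,      # "SOCK_STREAM" or "SOCK_DGRAM"
            "ip": ip,
            "port": port,
        }


if __name__ == "__main__":
    print(socket_summary(socket.SOCK_STREAM))
    print(socket_summary(socket.SOCK_DGRAM))
```

The port printed will differ from run to run because the operating system chooses it.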
Sockets and the OSI Model
- The socket API sits at the boundary between the application and the transport layer (Layer 4) of the OSI model.
- Application-layer (Layer 7) code reads from and writes to sockets; the operating system's transport layer packages that data into TCP segments or UDP datagrams for transmission.
Types of Sockets (TCP vs UDP)
- TCP sockets are connection-oriented and reliable: they provide ordered, error-checked delivery, retransmit lost data, and establish each connection with a three-way handshake (SYN, SYN-ACK, ACK).
- UDP sockets are connectionless and unreliable: they guarantee neither delivery nor ordering, but their lower overhead and latency suit real-time applications such as live video streaming and voice calls.
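The contrast between the two types can be sketched with Python's standard `socket` module. The function names `tcp_roundtrip` and `udp_roundtrip` are hypothetical, and both run entirely on the loopback interface, where UDP loss is very unlikely; on a real network the UDP version could lose or reorder datagrams.

```python
import socket


def tcp_roundtrip(message: bytes) -> bytes:
    """Send bytes over a TCP connection; a connection must exist first."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen()
        # connect() triggers the three-way handshake with the listener.
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            conn, _addr = listener.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(message)
                # TCP is a byte stream, so read until the full message arrives.
                received = b""
                while len(received) < len(message):
                    chunk = conn.recv(4096)
                    if not chunk:
                        break
                    received += chunk
                return received


def udp_roundtrip(message: bytes) -> bytes:
    """Send one datagram over UDP; no listen(), accept(), or connect()."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)  # a lost datagram would otherwise block forever
        sender.sendto(message, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)  # one call returns one datagram
        return data


if __name__ == "__main__":
    print(tcp_roundtrip(b"hello over tcp"))
    print(udp_roundtrip(b"hello over udp"))
```

Note that the UDP receiver needs a timeout because nothing in the protocol reports a missing datagram, whereas the TCP side would see an error or a closed connection.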
Socket Lifecycle and Server/Client Interaction
- Servers create a listening socket bound to an IP and port, which waits for incoming connections.
- When a client connects, the server accepts the connection and receives a new dedicated socket for that client. Simple blocking designs assign one thread per client, while more scalable systems use non-blocking I/O or event-driven architectures.
- Event notification mechanisms (select, poll, epoll, kqueue) let a single thread monitor many sockets and act only on those that are ready, which keeps CPU usage low.
- Clients create a socket, connect to the server's IP address and port, and then communicate through read/write operations. On Unix-like systems, each socket is exposed to the process as a file descriptor (an integer).
- Both sides must close sockets after use to free resources and avoid leaks.
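The lifecycle above can be sketched as a small event-driven echo server and client in Python. This is a minimal sketch, not a production design: `serve_echo`, `echo_once`, and `demo` are hypothetical names, the server assumes messages small enough to send without blocking, and `selectors.DefaultSelector` stands in for whichever of epoll, kqueue, or select the platform provides.

```python
import selectors
import socket
import threading


def serve_echo(listener: socket.socket, stop: threading.Event) -> None:
    """Echo data back to every client from one thread using readiness events."""
    sel = selectors.DefaultSelector()  # picks epoll, kqueue, or select
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    try:
        while not stop.is_set():
            for key, _events in sel.select(timeout=0.1):
                sock = key.fileobj
                if sock is listener:
                    # A client connected: accept() yields a dedicated socket.
                    conn, _addr = listener.accept()
                    conn.setblocking(False)
                    sel.register(conn, selectors.EVENT_READ)
                else:
                    data = sock.recv(4096)
                    if data:
                        sock.sendall(data)  # sketch: assumes small messages
                    else:
                        # Empty read means the client closed its side.
                        sel.unregister(sock)
                        sock.close()
    finally:
        # Close every remaining socket, including the listener, to avoid leaks.
        for key in list(sel.get_map().values()):
            sel.unregister(key.fileobj)
            key.fileobj.close()
        sel.close()


def echo_once(address, message: bytes) -> bytes:
    """Client side: create a socket, connect, write, read, then close."""
    with socket.create_connection(address, timeout=2.0) as client:
        client.sendall(message)
        received = b""
        while len(received) < len(message):
            chunk = client.recv(4096)
            if not chunk:
                break
            received += chunk
        return received


def demo() -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    listener.listen()
    address = listener.getsockname()
    stop = threading.Event()
    worker = threading.Thread(target=serve_echo, args=(listener, stop))
    worker.start()
    try:
        return echo_once(address, b"hello")
    finally:
        stop.set()
        worker.join()


if __name__ == "__main__":
    print(demo())
```

The server thread here exists only so the client can run in the same process; the server itself handles all of its clients from a single thread.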
Socket States and Uniqueness
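- TCP sockets follow a state machine (e.g., LISTEN, ESTABLISHED, TIME_WAIT) that governs connection setup and teardown; inspecting these states is useful when troubleshooting.
- An established connection is uniquely identified by the 5-tuple: protocol, source IP, source port, destination IP, destination port. A listening socket is identified by its protocol, local IP, and local port alone, which is why many client connections can share a single server port.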
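To make the 5-tuple concrete, the following sketch opens one TCP connection on the loopback interface and reads the tuple from both ends using `getsockname()` and `getpeername()`. The function name `connection_tuples` and the dictionary keys are hypothetical; the literal `"tcp"` is filled in by hand because the socket API does not return the protocol as a string.

```python
import socket


def connection_tuples() -> dict:
    """Open one loopback TCP connection and report how each end identifies it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen()
        listener_port = listener.getsockname()[1]
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            conn, _addr = listener.accept()
            with conn:
                return {
                    # (protocol, local IP, local port, remote IP, remote port)
                    "client_view": ("tcp", *client.getsockname(), *client.getpeername()),
                    "server_view": ("tcp", *conn.getsockname(), *conn.getpeername()),
                    "listener_port": listener_port,
                    # Each open socket has its own file descriptor.
                    "listener_fd": listener.fileno(),
                    "conn_fd": conn.fileno(),
                }


if __name__ == "__main__":
    for name, value in connection_tuples().items():
        print(name, value)
```

The two views are mirror images of each other, and the accepted socket reuses the listener's local port, so only the remote IP and port distinguish one client's connection from another's.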
Unix Domain Sockets (UDS)
- Unix domain sockets provide interprocess communication on the same host. They are addressed by a file system path rather than an IP address and port, and they are typically faster than loopback network sockets because they bypass most of the network stack.
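A minimal sketch of a Unix domain socket exchange in Python follows, assuming a Unix-like system where `socket.AF_UNIX` is available. The function name `uds_roundtrip` and the file name `demo.sock` are hypothetical, and the socket file is placed in a temporary directory so that it is cleaned up afterwards.

```python
import os
import socket
import tempfile


def uds_roundtrip(message: bytes) -> bytes:
    """Send a short message between two endpoints over a Unix domain socket."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical socket file name
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
            listener.bind(path)  # the address is a file path, not IP and port
            listener.listen()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                conn, _addr = listener.accept()
                with conn:
                    conn.settimeout(2.0)
                    client.sendall(message)
                    # Sketch: a short message is assumed to arrive in one read.
                    return conn.recv(4096)


if __name__ == "__main__":
    print(uds_roundtrip(b"ipc"))
```

Apart from the address family and the path-based address, the calls are the same as for a TCP socket, which is why code can often switch between the two with little change.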
Socket Security
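- Sockets provide no encryption or authentication by default, so data travels in plaintext. Wrapping a socket in TLS (Transport Layer Security) adds encryption, integrity checking, and authentication of the peer, which protects data against eavesdropping and tampering.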
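Wrapping a socket in TLS might look like the sketch below, which uses Python's standard `ssl` module. The function names `make_client_context` and `open_tls_connection` are hypothetical, the default port 443 is an assumption, and the sketch covers only the client side; a server would additionally need its own certificate and private key.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate verification enabled."""
    # create_default_context() loads the system's trusted CA certificates and
    # turns on both certificate and hostname verification.
    return ssl.create_default_context()


def open_tls_connection(host: str, port: int = 443, timeout: float = 5.0) -> ssl.SSLSocket:
    """Open a TCP socket to host:port and wrap it in TLS.

    The caller is responsible for closing the returned socket.
    """
    context = make_client_context()
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        # The TLS handshake runs here; server_hostname enables hostname checks.
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()  # avoid leaking the plain socket if the handshake fails
        raise


if __name__ == "__main__":
    ctx = make_client_context()
    print(ctx.verify_mode, ctx.check_hostname)
```

Once wrapped, the socket is read and written like a plain one, but the bytes on the wire are encrypted.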
Sockets in Distributed Systems
- Distributed systems and microservices rely heavily on sockets: REST APIs, gRPC, load balancers, and data stores such as Redis and Cassandra all communicate over them.
Key Terms & Definitions
- Socket — Software endpoint for bidirectional communication over a network or within the same machine.
- Port — Numeric identifier for a specific process or network service on a host.
- File Descriptor — Integer that uniquely represents an open file or socket in a process.
- TCP Socket — Connection-oriented socket ensuring reliable, ordered data transfer.
- UDP Socket — Connectionless socket focused on fast, unreliable data transfer.
- Unix Domain Socket (UDS) — Socket for fast interprocess communication on the same host.
- TLS — Protocol for securing data transmitted over sockets.
Action Items / Next Steps
- Review the use of sockets in your preferred programming language.
- Explore implementing a simple client-server demo using TCP and UDP sockets.
- Study event-driven networking approaches (e.g., epoll or Node.js) for scalable socket handling.