Understanding Interprocess Communication (IPC)

This video was sponsored by JetBrains.

What I'm showing here looks like an image, but it's actually a program. I'm using two processes to render a fractal. The part implemented in Rust performs the mathematical calculations to determine the color of each pixel in the fractal. It adds this data to an array and, once all pixel calculations are complete, it triggers an event, sending the array as a message. In TypeScript, I listen for that event, receive the data, and then pass it to a function that draws the pixels on an HTML canvas. Applications like this one, built in a modular way, are possible thanks to interprocess communication. Today we will talk about how processes, which are isolated by default, can interact with other processes by exchanging information.

Hi friends, my name is George, and this is Core Dumped.

In a previous episode, we learned that when a program is loaded into memory to begin execution, it receives a new name: a process. But a process is more than just executable code loaded into memory. It also includes a CPU state, the memory region allocated to the process, a list of open files, and other resources such as I/O devices. Thus, a process is not just a program; it's the entire context in which the program operates.

Keeping each process isolated in its own context, however, introduces a significant drawback, as there are many scenarios in which a process needs to cooperate with others. If processes were entirely isolated, there would be no way for them to collaborate.

So processes can be classified based on whether they cooperate with other processes: a process can be either independent or cooperating. Two processes that share data between them are clearly cooperating processes, but there are additional reasons for enabling process cooperation. For example, computational speedup: in systems that support parallelism, complex tasks can be divided into smaller ones that can be executed simultaneously, reducing the total time required to complete the larger task. Or modularity: constructing a system in a modular way allows
for the division of system functions into separate processes. But notice that in both cases, processes need a way to coordinate their activities to function correctly. Cooperating processes require a mechanism for interprocess communication that enables them to exchange data, sending and receiving information between each other.

There are two fundamental models of interprocess communication: shared memory and message passing. We will discuss how both mechanisms work, along with examples from modern operating systems.

Let's start with shared memory. As the name suggests, this mechanism allows processes to share a memory space directly. Remember, when multiple processes are running on a system, the operating system allocates a separate block of memory for each one. This is known as the process's address space. Through a mechanism called privileged instructions, the operating system enforces isolation, preventing processes from accessing each other's memory. If a process attempts to read or write to another process's address space, the operating system will immediately interrupt and terminate that process, enforcing the isolation policy to ensure data safety.

So, to allow two or more processes to share a memory area, the operating system requires them to agree to remove this restriction. Typically, a shared memory region is created using system calls and resides in the address space of the process that created it. Any process wishing to communicate through this region must attach it to its own address space, also through system calls. Once the shared memory region is established, both processes can communicate by writing to and reading from this shared area.

Here's an important detail: once the operating system grants the shared memory region, it no longer manages what the processes do with it. What I mean is that the way the data is structured, and the specific locations within the shared region where that data is written, are entirely determined by the processes, not the
operating system. For example, consider a producer-consumer model, where one process, the producer, generates data and the other, the consumer, reads it. Suppose the shared region contains an array where the producer writes 8-bit signed numbers. If the consumer reads these as unsigned 8-bit numbers, it will interpret the exact same bits but with a different meaning. Similarly, if the consumer expects a different data type, like an unsigned 32-bit integer, or reads from the wrong address, misinterpretations will occur.

As you can see, there's a lot that can go wrong here. The consumer must know precisely how the data is structured and where it's being written. Failing to follow these conventions could lead to failures in one or both processes or, worse, undefined behavior. The processes are also responsible for ensuring they don't write to the same location at the same time; otherwise they could encounter race conditions, an issue we will cover in a future episode on thread synchronization.

Okay, but are there popular programs that use shared memory? I'll share the answer after a quick message from JetBrains.

If you're a developer just learning, self-educating, or working on personal projects, setting up your entire development environment might not be the most productive way to use your time. You might need a full IDE loaded with all the tools you need right out of the box. That's where JetBrains products come in. With Rider and WebStorm, you get just that, and I'm glad to tell you that you can now access the full versions of these IDEs for free for non-commercial use. Whether you're developing games, building web apps, or exploring the .NET ecosystem, Rider offers all the tools you need, including full support for Unity, Unreal, ASP.NET, and more. WebStorm is the ideal IDE for modern JavaScript and TypeScript development, supporting frameworks like React, Node.js, Angular, and so much more. This is perfect for those of you who want to grow, create, and stay productive. Both IDEs come loaded with features like
code analysis, safe refactoring, project-wide navigation, and version control, giving you everything you need to work efficiently and write great code. Getting started is simple: just download Rider or WebStorm, pick a free non-commercial license, and you're all set. Head to the link in the description to start your development journey with JetBrains.

And now, back to the video. You might have noticed that when using Chrome, multiple processes are created even if you're only using a single tab. Any Chromium-based browser is a perfect example of a modular system that relies on interprocess communication. As we know, dynamic web pages often contain JavaScript, which can sometimes come with bugs that could potentially crash the browser. To address this problem, Chromium uses a shared memory approach, dividing tasks among three types of processes: browser, renderer, and plug-in processes. The browser process manages the user interface and I/O operations and is created only once, when Chrome starts. Renderer processes handle web page content such as HTML, JavaScript, and images, with a new renderer process for each open tab. And plug-in processes manage specific plug-ins like Flash, QuickTime, or PDF readers. This isolation ensures that if one tab crashes, only its renderer process fails, leaving other tabs unaffected.

Other examples of systems that use shared memory are simulation software, game engines, database management systems, and deep learning frameworks.

Now let's talk about the second approach: message passing. Sharing memory isn't always the best way for processes to communicate, as it can be error-prone and might place more responsibility on programmers than they're comfortable with. An alternative method is for the operating system to provide a mechanism that allows processes to communicate and synchronize their actions without sharing the same address space. Address spaces can remain isolated and, instead of writing to shared memory, processes can communicate by sending messages. Popular message passing
mechanisms include pipes, sockets, and remote procedure calls. Although the specific implementations of these mechanisms are beyond the scope of this video, the general concept is as follows: a message passing facility provides a set of basic operations.

For instance, suppose we have two processes, A and B. If process A wants to send messages to process B, it must invoke a system call to request that the operating system establish a link with process B. This link serves as the logical pathway through which messages are sent. And notice that I just said logical path. What is actually happening is that the operating system's kernel creates a queue in its own address space, and this queue will serve as a mailbox where process A can send messages and process B can receive them.

The behavior of this queue can vary based on communication requirements. For instance, we may need asynchronous instead of synchronous communication, or a buffered queue if we want to limit the number of messages the mailbox can hold. For full-duplex communication, a mailbox can be implemented with two queues, allowing both processes to send messages to each other simultaneously without conflict.

But some of you might have already noticed that there's something odd here: since the mailbox resides in the operating system's address space, processes shouldn't be able to access it. And yes, that is true. Processes cannot directly read or write to mailboxes. Thus, the operating system must provide system calls for at least two operations: send and receive. When process A needs to send a message to process B, it uses a system call to tell the operating system: I need to place this message in the mailbox shared with process B, but I cannot access it directly, so please handle this for me. Because the mailbox exists in the kernel's address space, the operating system can copy the message to the mailbox. Similarly, when process B wants to check for messages, it uses its corresponding system call to ask the operating system: I cannot read the
mailbox shared with process A directly. Could you please check if there are any messages for me? If there's a message, the operating system can return it as the return value of the function. If you don't fully understand why system calls are necessary for this process, I highly recommend reviewing the previous videos in this series. And this is the general idea behind message passing communication.

One of the first operating systems to introduce the idea of shared mailboxes was Mach, an OS whose derivatives form the foundation of some modern systems, including iOS and macOS. Mach treated processes as tasks and had a very specific name for mailboxes: ports. It's important to remember that messages are sent to ports, not directly to processes. This distinction matters because two processes can have more than one communication link, and not all ports are associated with exactly two processes. A process may keep an open port to receive messages from any other process. This type of port is known as a listening port, and its purpose is to handle special messages called connection requests to establish new private communication links between processes.

Some of these concepts may have started to sound familiar. You see, message passing has a huge advantage: since it doesn't require two processes to share their address space, processes don't even need to be on the same machine to communicate. Of course, this requires a more complex underlying implementation, such as networking capabilities to establish a connection between computers. This involves driver-level programming for components like network interface cards. But if implemented correctly at the operating system level, the process should be seamless for developers: message passing between processes on different machines will function the same way as it does between processes on the same machine.

The client-server architecture is typically implemented using the socket interface, which is an example of a message passing mechanism supported by
nearly all mainstream operating systems today. It is usually illustrated this way, although I'm not sure I completely agree, as it suggests that the clients and server are machines. In reality, they are simply processes running on those machines, and communication doesn't always require a network. The client and server processes can run on the same machine, as we see when using localhost to test projects. Anyway, when a client sends a request to a server, the IP address identifies the machine hosting the server process, while the port represents the mailbox through which the server process receives requests.

Servers providing specific services like HTTP, FTP, or SSH rely on the socket interface, which reminds me: the folks at CodeCrafters have a nice challenge-based course where you can learn how to use sockets to implement an HTTP server from scratch in the programming language of your choice. I'll leave a link in the description in case you want to support me when subscribing.

Unfortunately, message passing systems do have drawbacks. Because ports reside in the kernel's address space, processes must use system calls each time they want to send or receive messages. This can be quite costly in terms of performance. This limitation does not apply to the shared memory approach. With shared memory, system calls are only necessary when creating the shared region and attaching processes to it. Once these steps are complete, processes can read from and write to the shared region as if it were part of their own address space, without needing further system calls. This results in communication that is extremely fast, essentially as fast as direct memory access.

And just to clarify, I'm not saying that message passing is slow; shared memory is simply faster. In 99% of cases, message passing is more than sufficient.

And if you learned something today, don't forget to like and subscribe. See you in the next one.
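
As a companion to the producer-consumer example in the shared memory part, here is a minimal Python sketch of how the same bytes change meaning depending on how the consumer reads them. The 4-byte buffer stands in for the shared region, and the values, the variable names, and the little-endian layout for the 32-bit read are all assumptions made for illustration.

```python
import struct

# Hypothetical data written by the producer: four signed 8-bit numbers.
produced = [-1, -128, 127, 5]

# The raw bytes as they would sit in the shared region.
raw = struct.pack("4b", *produced)

# A consumer that follows the agreed convention (signed 8-bit) recovers the data.
as_signed = list(struct.unpack("4b", raw))

# A consumer that assumes unsigned 8-bit reads the same bits with another meaning.
as_unsigned = list(struct.unpack("4B", raw))

# A consumer that assumes one unsigned 32-bit integer (little-endian here)
# collapses all four values into a single unrelated number.
as_u32 = struct.unpack("<I", raw)[0]

print(as_signed)
print(as_unsigned)
print(as_u32)
```

Nothing in the buffer itself says which reading is the right one, which is why both processes have to agree on the layout ahead of time.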
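
The create-and-attach steps described for shared memory can be sketched with Python's standard `multiprocessing.shared_memory` module. This is a simplification: both handles live in a single process here, whereas in a real setup the second handle would be opened by another process that was given the region's name. The function name `demo` and the 16-byte size are arbitrary choices for the example.

```python
from multiprocessing import shared_memory


def demo():
    # Step 1: one side asks the operating system to create the shared region.
    region = shared_memory.SharedMemory(create=True, size=16)
    try:
        # Step 2: the other side attaches to the same region by name.
        # (Normally this happens in a different process.)
        peer = shared_memory.SharedMemory(name=region.name)
        try:
            # Step 3: from here on, communication is plain memory access.
            region.buf[0:5] = b"hello"
            received = bytes(peer.buf[0:5])
        finally:
            peer.close()
    finally:
        # Detach and ask the operating system to release the region.
        region.close()
        region.unlink()
    return received


if __name__ == "__main__":
    print(demo())
```

Note that the write and the read in step 3 are ordinary buffer operations; the module only talks to the operating system when creating, attaching, and releasing the region.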
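
The mailbox idea from the message passing part can be modeled in a few lines of Python. This is a toy model, not a real operating system API: the `Mailbox` and `DuplexLink` classes and their `send` and `receive` methods are hypothetical stand-ins for the kernel-owned queue and the two system calls, and the capacity limit plays the role of the buffered queue mentioned in the video.

```python
from collections import deque


class Mailbox:
    """Toy model of a kernel-owned message queue with limited capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._queue = deque()

    def send(self, message):
        """Stand-in for the send system call; False means the mailbox is full."""
        if len(self._queue) >= self.capacity:
            return False
        self._queue.append(message)
        return True

    def receive(self):
        """Stand-in for the receive system call; None means nothing is waiting."""
        if not self._queue:
            return None
        return self._queue.popleft()


class DuplexLink:
    """Full-duplex link modeled as two one-way mailboxes."""

    def __init__(self, capacity):
        self.a_to_b = Mailbox(capacity)  # A sends, B receives
        self.b_to_a = Mailbox(capacity)  # B sends, A receives
```

In this model, processes never touch the queue directly; they only go through `send` and `receive`, which mirrors how a real process has to ask the kernel to do the copying on its behalf.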
