Transcript for:
Exploring the Linux Kernel and System Calls

Hi, and I hope you are doing well. I'm Jadi, and today I was doing a task which was very fun, kind of a security thing, and I thought it would be nice to share the general method with you guys. I was checking system calls to make sure a program is doing what it claims to do. Along with showing you the method, we can also have a quick chat about how the Linux kernel works internally and, in general, what an operating system does.

Technically, when you run a command in Linux, what actually happens behind the scenes is very, very interesting. The kernel has to manage lots of things to run a command and, more than that, to run your computer: your GUI, your different programs, and everything. The Linux kernel, like any other kernel, is the core of the operating system. It is what stands between your apps and your hardware; it's a bridge between hardware and software.

If I want to show it to you as a schematic, I can draw the kernel. This is the Linux kernel. You have some hardware here: let it be your memory, your CPU, your hard disk, and other stuff such as a network card, screen, mouse, keyboard, whatever input and output you have. And you have some programs here. For example, in this case I'm running a presentation tool in my terminal. I have OBS running, which records the screen. I have another terminal here, I have a browser behind this terminal (I will show it later), and lots and lots of other processes, at least 100, just to run the operating system.

My kernel has to manage all of these and give each of them a share of CPU time. Technically you can't run all of them in parallel; in reality they are switched very fast, so they appear to run alongside each other. If this program needs some memory, it asks the kernel for it. It says, "I need 10 megabytes of memory." The kernel goes to the memory, and if this is my memory, it allocates this section and gives it back, so the program can use it as virtual memory. If they want to write to a file, the same thing happens: they will ask for a file and they will write to it, but
everything goes through the kernel. This is the monolithic kernel design, which Linux uses. There are other types, like microkernels, but in our case this is what the kernel does. This is called user space, and this is called kernel space. Whatever happens in kernel space is much more important: it is higher privileged, controlled by the kernel, with higher access levels and that kind of stuff. Whatever happens in user space is lower, not in a bad sense, but it has to go through the kernel to do what it needs to do. I have a chart prepared here, from the Wikipedia page on user space and kernel space. You can study it, but this is what happens behind the scenes in a kernel.

Suppose you want to write your own kernel. We were making fun of one such attempt, but it is not a bad thing if you are doing it to learn, to study, to educate yourself, or to do something amazing. You just should not brag about it when you have only started. You cannot say, "I'm going to write something which is better than Windows, Linux, Mac, and everything, or combine them in one."

When a PC boots, a bootloader hands over execution to your kernel. This is your small kernel. It may just write "hello world" on the screen and die. That is still an operating system, but a boring one. If you want to add to it, you have to be able to run another executable, for example an ELF executable. You can say: okay, show a prompt; if the user types something, try to read the disk, load this and execute it, and when it's finished, return to the kernel and this prompt. You will have something like old-school DOS. If you want to go further, you have to be able to run things in parallel, and support more and more file systems. You should go further than, for example, FAT, and start to grow your operating system. At some point you will have an operating system which is much, much better: it will understand different file systems, it can run programs in parallel, it can access
the network, but it's still not very useful, because you have to develop whatever you want, whatever is needed, on it yourself.

There is an amazing project called Holy OS, if I'm not wrong. Was it Holy OS? A guy with paranoia did this. TempleOS, sorry, it's called TempleOS. It was written in HolyC, an amazing programming language. I may cover it some time. Many people do that just to gather views. It's fun, so I may do it. Views are good; tell your friends. Anyway, if you continue writing your own operating system like that, you will end up with something like TempleOS, an operating system which is super amazing for a one-person job but useless in everyday life, because you cannot, for example, use SSH on it unless you write your own SSH. We need user-space programs like SSH, bash, and Firefox to be able to run on our operating system.

So what should happen? One solution is to first write different modules: process scheduling, so different processes can run; IPC, so processes can speak with each other; memory management, so you can dedicate parts of the memory to each process; a virtual file system, so programs will be able to write to disk; and networking. TempleOS has all of this, I think, except networking. He didn't believe in networking; at least, God told him not to cover networking, if I'm not mistaken.

Then you should let programs speak with your operating system. That happens with system calls. You have to have different system calls, so if a program needs memory, it will call your mmap system call, and you will provide the memory and return it. If it wants to write to a file (and in Linux everything is a file, so many different things are treated as files), it will call your write system call, and you will answer back. Now user-mode programs are able to do useful things with your hardware and interact with your operating system. So these system calls are like gateways to your operating system, which is super cool, but
now different programs have to respect your system calls. As soon as you have these system calls, if you implement exactly these 380 system calls, you will have a Unix-like operating system. So BSD, macOS, Linux, and even to some extent Windows have at least the same logic, and Unix-like systems have largely the same system calls.

But it's difficult to use these for actual work. Not very difficult, but they are difficult: we are low level here, in kernel mode, and the hardware is here. To make things easier, programming languages provide another layer here, which is called the C standard library, because it comes from C, the old-school C. Practically, many, many programs use it: things that we know like malloc (memory allocation), memcpy (memory copy), localtime, opening a file. Whenever one of these needs the kernel, it is translated to the equivalent system calls.

Now, if you're writing a program with the C standard library and you have the C standard library on your operating system, it works regardless of what your operating system is. If you are using malloc and you are on Linux, glibc will translate your malloc to the appropriate system calls on the Linux kernel. Fun: the same program will work on Windows if your Windows has a C standard library, which it has. If you're writing an operating system called TempleOS and you provide the C standard library, any standard program which is written against the C standard library will run on your operating system, because the C standard library will translate the calls to your operating system's system calls.

The good thing about Unix is that these are a standard, so all operating systems provide all of them, or most of them at least, based on a standard, and all Unix systems can work with each other. This is super cool. If you want a complete list, you can search for these system calls. You can search for Unix system calls; what I did was search for a Linux system call table. For example, this one is nice: it's searchable and all the system calls are here. You can go through them,
implement them one by one, and you will end up with a Unix-compatible system. There are 456 now, because these are for Linux; these are not standard Unix. Anyway.

Okay, so this is how the system works, layer by layer. On a higher level than this one, we have system components like your init daemon, sshd and other system daemons, and your windowing systems like X11, which can draw a rectangle, show the mouse pointer, understand where the mouse is, and that kind of stuff. On another level, not really higher, are GTK, Qt, and others. When you are creating a graphical application you can use Qt, and Qt will work on X11. With Qt you say things like: create a window, create a button here and an input there, and if I push this, do that. GTK is the same, and there are many others. So your program, like Mozilla Firefox, Blender, whatever you write, or bash or a terminal, will use these. They will call the C standard library, and the C standard library will make syscalls to talk with the kernel. This is how an operating system works at a lower level.

Today I was checking the system calls of a program. Sometimes you have this: there is a new program, and they say, "Run this on your server, this is a cool program." Your manager says, "Check it. What is this?" You go and check every single system call this program makes, and you get a much better understanding of whether this program does what it says or not. You cannot have the final word, but you will have a much better understanding. This is also very useful in debugging.

So this is how an operating system works. For scheduling and processes, you have commands like top, which shows all the processes and what is running: this is my CPU, and this is the memory this program is using. ps will show you all the processes too. You also have htop, which is a fancier version of top, with the CPU percentage and memory percentage of each program. You can click and say, "Show me the programs with the most CPU-intensive tasks." This is my memory, this much is used, these are my CPUs, and whatever data
you want. For memory, you have things like free -h, which will show you how much memory you have, or you can go with cat /proc/meminfo. Thanks to Warp, it's here; it's cool. So you have different items: your total memory, free memory, and whatever is going on. This much memory is available at the moment, although this much is cached, so this inactive memory can be freed. You can see lots and lots of different things, for example dirty pages, which are the ones that have not been written to disk yet, and other stuff. Anyway, this is cool; you can get lots of information here. Mapped: this relates to the mmap system call we saw, not one to one, but very closely related.

But here I wanted to show you the system calls a program makes. Let's go to a directory and create some files. I can do an ls, and it will show me the names of the files I just created with touch. If I want to see what system calls ls makes, there is a very cool command called strace. You can do strace ls, and it traces all the system calls this program makes, showing you at a very low level which system call is being called by the command you're using: technically, how this program is talking with the kernel. So you see how it's accessing the network, how it's accessing the disk, how it's requesting memory, when, and which files, and you will understand the general behavior of a program. If you have a virus which does lots of access to other files, you will see it here.

How does it work? When you do strace ls, you are running ls and tracing all its system calls. It's not very difficult to read: this is the system call, and these are its parameters. So at the start you can see exec with this file. Cool. When you say exec, it calls a system call to execute something and passes this file on the disk to the system call. Then it prepares some memory, and then it accesses the preload libraries. When you're writing a program, you often use different libraries. So this is my program, and these are the libraries I'm using. For example, if I'm using ls, I might need to
say ls a*. What does this star mean? I have to expand it. This is like a regex, and all sorts of things can use regexes, so you use a regex library. You may use a fancy printing library or a networking library in different programs. So here strace shows me that it checks the preload libraries to see if any preloaded library is there. The result of this system call is -1, so there is no preloaded one. Then it checks the cache of our libraries for the linker, maps something, and closes it.

You will see lots of things with this pattern. It opens a library, for example SELinux here. SELinux is Security-Enhanced Linux, which does lots of security checks on accesses. So here I'm opening this shared object, which is a library: I'm reading it, putting it in memory, and closing it. This happens again and again with any executable you run. Technically, now you can check which libraries this program uses. ls doesn't use many, because it's ls and it should work without many dependencies. It needs libc, so it opens libc, reads it, allocates memory, and closes it. Then it goes to libpcre, which I think is a common regex library (Perl-compatible regular expressions). This is the one I was telling you about: it opens it, reads it, puts it in memory, and closes it. So it's loading all the libraries one by one.

First there was the execution, then the loading of the libraries, and then it goes to some system calls for threads: it prepares for threads, and prlimit checks for limits. Every single one of these is a system call in a Linux system, so you can look each one up, for example in that searchable list of Linux system calls. I searched for this one. Okay, from here you can go directly to the kernel source where it is defined; I will show you later. Here it checks SELinux again. Then getrandom: many programs will call getrandom eventually, because they want to do some security stuff and need some random data. Then it checks /proc/filesystems, because if it reads some data
it should understand the file systems. Then read, blah blah. It checks SELinux again; it's -1, so SELinux is not being used. Then some memory calls, and we are at the end, so let me get my pen. Here is openat with some flags: it opens the current directory with some flags and gets back the result. Cool. Then getdents64. I'm not sure what this does, so let's search and we will learn. Even knowing the name is cool, and you learn like this, step by step: get directory entries. Cool.

So here is the main part. It opens the directory, gets the directory entries, reads the entries, and closes. Now it has the entries, and it needs the write system call. It writes 1, 2, 3, and a newline. The result is 8, maybe the number of bytes written or something. So 1, 2, 3, 4, 5, 6, 7, 8: yes, write returns the number of bytes written. And then it closes.

Now you saw exactly, step by step, what this program does, so we know that ls is legit. It doesn't do anything extra on my kernel. It may do some calculations behind the scenes, but it is not calling the kernel for any favors, and it's not sending anything over the network or such. That was cool. The main part was here: it opened the directory, got the directory entries, and then wrote them to the terminal. This was very nice.

You can check these one by one on the web, as we did, but as you can see, this is like a manual page on Debian. So what you can do is go with man write. Here it shows write, "send a message to another user", because we are in section 1 of write. I prefer to see the sections as different books; book one is about executables. You can check that with a fun command, man man, which checks the manual of the manual. It shows you that there are different sections or, as I prefer, books: book one is executable programs, book two is system calls, book three is library calls, book four is special files. So we need write from book two, or section 2, so I will go with man 2 write. It gives me the write system call information: write to a file descriptor, standard C library. It works like this: write with an fd, a buffer, and a size, and
it returns the size that was written. Okay, this is cool. If I want to check the source, I can go to the Linux source code on GitHub. This is not where it is developed; this is kind of a mirror of the whole thing. There are pull requests, so maybe this is not a mirror. I might be wrong, I don't know. Can I open a pull request? I don't know, I cannot remember. But I think the Linux kernel is not being developed here; for sure the patches go through email. I'm not sure, though, because this is a pull request. Anyway, this is another story.

So here I can do a search for sys_write. Let's go for this one. System calls use a macro called SYSCALL_DEFINE, and there are different macros based on the number of parameters. How many parameters did we have? 1, 2, 3, so we need the one for three. If you didn't know about this, you could browse the tree: go to fs, then go to read_write.c, or whatever it is called. This is it. It's defined here. As you can see, SYSCALL_DEFINE3 is a macro which defines one of the system calls, write, and takes the same parameters. It simply calls ksys_write, so let's check that one. On GitHub nowadays you can do this: when you click on the name, it shows you where more information is. It is defined here and referenced here, so defined once, referenced once. We will go to the definition. The definition is this one, in the same file, fs/read_write.c.

Nowadays Linux is very complicated, because it has to run on lots and lots of architectures and on lots and lots of file systems, so this is not an easy call. If it were my own Unix-like operating system, I would just open a file, write into it, and answer back. But Linux needs to understand file systems, run on different architectures, and many other things, so you will see it go step by step, checking everything and calling different things based on different scenarios. So ksys_write technically calls the virtual file system
write, vfs_write, which I was just telling you about. It's defined in this file, the same file, a little bit higher up. This looks cool; I think this is exactly what is happening. vfs_write gets a file, gets the buffer, and most probably gets the size. It does some checks and verifies the area for read/write access, since you should have access to write. Then it starts the write, does the writing based on different criteria, for example whether this file has a write method or a write iterator, and at the end it finishes the write. Very interesting.

Now you're looking into the kernel, at one of the system calls. This is clearer, but it's still very difficult to understand. If you want to understand it, you now have to understand what these two are and how they are called. For example, I have to go to this one, see where it is defined and why it's different from the other one. This is new_sync_write, sorry, sync writing, which should be easier. And you can go deeper and deeper, and all of them call other things, and so on.

So if you really want to start understanding the Linux kernel, I have another slide for you. There are three books which can help you a lot. In this video I mainly wanted to share strace with you, along with a general understanding of the Linux kernel and how it works. But if you want to go further, Linux Kernel Development by Robert Love is a very good book. This is the one I have seen many people point to when they want to tell you how to understand the kernel. Understanding the Linux Kernel from O'Reilly is very nice. And there is something called A Heavily Commented Linux Kernel. It covers a very, very old version of the kernel, but it is heavily commented: technically each line has at least one line of comment, and in most cases more than one, for each part, so it's very easy to read. Also, if you want to understand how a car engine works, you cannot just lift up the hood, watch, and see how it works. That's very difficult. It would be much, much easier if
you could go 100 years back and check an engine from then, which is much easier to understand, then go forward 10 years, and go forward 10 years again. That would be much easier. The same thing happens with this book, A Heavily Commented Linux Kernel. I think a Chinese guy, a long time ago, commented all the lines and told us what each does, and it's a very old version, so it's much easier to understand. You will have a holistic understanding after reviewing it, and then you can switch to the newer books. Also, Understanding the Linux Kernel from O'Reilly is very nicely explained, and Linux Kernel Development is what people talk about nowadays.

If you're here, I would be happy if you follow me and tell your friends about this. I will continue creating these videos even if you don't follow and don't tell your friends, but that will encourage me more. Like, and tell me what you want next: I was thinking about maybe writing a small kernel module to show you how it works, or describing scheduling methodologies, or something else. You tell me. Have fun. I was Jadi, and still Jadi.
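
A note to go with the strace ls walkthrough in the video: the main part of that trace (open the directory, get the directory entries, close, write the names, get back 8) can be roughly reproduced from user space. The sketch below is only an illustration in Python, not what ls itself does internally. The function name list_directory and the two-space separator are assumptions made for this example (the separator was chosen because it gives the same 8 bytes for files named 1, 2 and 3), and the comments about which system call sits behind each line describe CPython on Linux.

```python
import os
import tempfile


def list_directory(path: str, out_fd: int = 1) -> int:
    """Write the names in `path` to `out_fd`; return the number of bytes written.

    Hypothetical helper for illustration only. On Linux the calls below show up
    in strace as openat, getdents64, close and write.
    """
    # Open the directory itself (strace shows this as openat with O_DIRECTORY).
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Reading the entries of an open directory ends up in getdents64 on Linux.
        names = sorted(os.listdir(dir_fd))
    finally:
        os.close(dir_fd)

    # Assumed output format: names separated by two spaces, then a newline.
    line = ("  ".join(names) + "\n").encode()

    # os.write is a thin wrapper over the write system call and, like it,
    # returns the number of bytes actually written.
    return os.write(out_fd, line)


if __name__ == "__main__":
    # Recreate the demo directory from the video: three empty files 1, 2, 3.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("1", "2", "3"):
            open(os.path.join(demo_dir, name), "w").close()
        written = list_directory(demo_dir)
        print("bytes written:", written)
```

If this is saved to a file (say sketch.py, a made-up name), running it as strace -e trace=openat,getdents64,write,close python3 sketch.py should show the same pattern, although the Python interpreter adds many system calls of its own before the interesting part starts.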