COMP1521 Week 2 Lecture 1 MIPS Assembly Overview

Hello everyone, and welcome to your third lecture. I'm recording this from the comfort of my own home on this cold and windy day, and I hope that you too are watching it from the comfort of your own home.

Before we get started, I wanted to announce that Anna Brew has kindly arranged for C revision sessions to take place. This will be this week, on the 12th of June, 10:00 a.m. to 12:00 p.m., via Blackboard Collaborate. You can find more information on the forum under announcements. I hope that everyone has had a chance to access the forum by now. I see there's been a little bit of chatter on there; I'd like to see a lot more discussion happening.

In today's lecture we're going to recap what we covered in lecture two. We'll be talking a little bit more about loops, specifically covering the break and continue statements in C, and then we're going to move on to look more at data and memory and how we use them in MIPS.

Okay, so a recap of the last lecture, or the last few lectures actually. We looked at writing instructions that act on registers. We looked at instructions that perform simple arithmetic operations; these always operate on registers, or on immediate values (constants). Then we looked at system calls so that we can print hello world, the most important part of programming. We looked at converting constructs like loops and conditionals into goto and branch, so into simplified C.

The registers that we looked at: we have 32 of them, and they're all given a number, but we prefer to use the symbolic names in that center column there, because some of these have a special purpose. For putting data into registers, we learned about the li instruction for loading a constant, or an immediate, into a register, and la for loading an address into a register. We can use labels there instead of a hard-coded address. And we also learned about move. We use move to copy data from one
register, in this case $t1, into another register, $t0.

We learned about simple arithmetic instructions. We had two different types: the I-type and the R-type. The I-type, that one's last there, takes an immediate value, while all the others take registers.

System calls that we looked at; these are the key ones. We looked at print int, our printf, which is system call number 1: we load that into $v0 and the argument into $a0. The other one we looked at is read int, our scanf. We put the system call number 5 into $v0; there are no arguments for this one, but we get our value returned in $v0. We also looked at system calls 11 and 12 for writing a character and reading a character. And we looked at printing strings as well, but I'm sorry, this isn't in my list here.

And we finished off by looking at jump and branch instructions. There's a whole collection of instructions here for branching: given some condition, we branch to a different part of our code. And then these jump instructions here; there's no condition for these ones.

Okay, we looked at simplified C. When we have a while loop here, the idea is to turn each line of C code into a line of assembly. So this while loop gets turned into the different sections: loop init, where we initialize the loop variables; loop condition, where we check the condition of the loop and go to the end of the loop if the condition no longer holds; the loop body, where all the exciting stuff happens; and the loop step, which we must make sure we don't forget, where we increment the loop counter, and then we go back to the beginning and check that loop condition again.

Okay, so on to the new stuff. A side note on C break. Break is not recommended in C because, just like goto, it makes it difficult to follow the flow of the program when we're reading it later on. But break can be used in a loop to exit the loop
unconditionally. The loop condition here makes this look like an infinite loop: while (1), and 1 is always going to be true, so this looks like it's going to loop indefinitely. But if we look closer at the loop body, this break here means that it's possible to leave the loop. We do have it guarded by this condition: if the character that we receive is EOF, then we break out of this loop. In simplified C, break is equivalent to going to the loop's end label. So this in simplified C would look like this. You can see here our conditional: if c equals EOF, go to the end of the loop.

Okay, continue. Continue is similar to break, but continue proceeds to the next iteration of the loop. It's a way of skipping the rest of the loop body. So this very terrible code prints only even numbers. We're looping through all the numbers between 0 and 10, and we check if the number is odd, uh, is even, sorry. And if it's even we skip the body and we jump back to the loop condition here, which also includes the increment; otherwise we print the number. In simplified C, continue is equivalent to going to the loop's step label.

Okay, so now we'll move on to data and memory in MIPS. How do we store and use interesting data? How does a data segment really work? How do we store simple types like characters and integers, store and increment a global variable, work with pointers, work with 1D arrays, work with 2D arrays, and how are structs handled? These are all the things that we're going to be considering in this topic.

So what is RAM? Over here on the left I've got an image of a Raspberry Pi. This is a very popular embedded computer for hobbyists and students as well. This silver chip here will be the CPU; it's an ARM-based CPU. And this is a little RAM chip here. This is where all the memory that the CPU will access is. Notice that this is very
close to the CPU, so we can get very fast access. Top right is what you're probably most familiar with: this is the dual in-line memory that you would find in your laptop. We have the same style of RAM chips here. There's four on the front; there's probably another four on the back of this one as well. And this is what AI thinks RAM looks like. I'm not sure what kind of motherboard this would plug into with all these crazy notches on the side. And you can see that on these chips the AI has decided to put some legs, some wire legs. In practice these RAM chips have a lot of pins: a lot of address pins, a lot of data pins. So you'll find that the contacts for these chips are arranged as a grid underneath the chip itself. They don't use legs anymore for chips that dense.

Okay, so a bit of a recap on the memory layout. We have the text section for code and the data section for your data. We typically don't have a heap in MIPS, but we do have a stack, and that will grow down. We will cover the stack next lecture. And then we have the ktext, which would be the operating system kernel code section, and the kdata, the operating system data section. Note that these addresses here are all 32 bits because we're on a 32-bit machine, so that means each address is four bytes long.

Okay, so the text section, or the code section, is the only section here that is executable. If we're executing code from the data section then there's something wrong with our program. The text section is also writable. Typically we don't want our text section to be writable, because then we could accidentally modify the code in our application, and then our application can do some really weird and wonderful things. So we typically don't leave that section writable, and the operating system will take care of this for
us and set the permissions on that section. But in our case, in MIPS, unlike a real system, our code section is writable. So be careful not to write data to that section.

Okay, so memory addressing. The data will live at an address in memory somewhere. We can think of memory like a really large 1D array of bytes. Each byte, which is usually eight bits, has a unique address associated with it. So memory can be thought of as a large array, and the address is somewhat like an index into that array. For example here, counting from zero, 0, 1, 2, 3, 4, the byte at address 4 has the value 26 in it.

The common data types in C. Note that all of this can be found out by using the sizeof operator in C. That takes one argument, which can be a type itself or a variable, and it gives the size in bytes of the data type or the variable. So you can find out the size of all of these types yourself if you like. In C the character is one byte. This might be stored in memory, or we can store a byte in a register as well. An integer is typically four bytes. Again, this can be stored in memory, or, with our 32-bit registers, we can store an integer in one register. A pointer: on 32-bit architectures this will be four bytes; on 64-bit it will be eight bytes. MIPS being 32-bit, it will be four bytes in size. And again, this can be stored in memory or it can be in a register somewhere.

An array is a little bit more complicated. An array is a sequence of these basic types, and it's accessed by some calculated index. In the CPU we don't have any notion of an array of registers. I know that they have numbers associated with them, but we can't treat them like an array; each register needs to be treated independently. So an array technically only lives in memory, and we read elements of that array into our
CPU registers one at a time. A struct is similar to an array, except each element of a struct can be a different type. With the same principles, this would live in memory and we would load the individual elements into registers as needed. The offsets here are a little bit more complicated to calculate, because the sizes of the types in a struct can be different, whereas in an array each element has the same type.

Local versus global variables. Local variables we like to store in registers, because they're much quicker to access. Otherwise we store them on the stack, and that will be covered next week. Global variables are what we'll mostly be focusing on today, and global variables are all stored in the data segment.

Okay, so how do we initialize data in the data segment, global data? We use directives in the assembly language to allocate and initialize this memory. This is all specific to assembly programming, not specific to MIPS, so you can take this away to any instruction set architecture and use these directives there. The first one I'll mention is .word. This will allocate a four-byte value and initialize the memory that was allocated. .half: I'm not sure if you've heard this term before, a half word. That's literally just half a word, so this line here will store the value 7 into that half word. We have .byte, which will store a byte. In this case we're specifying a character code, the letter A. That'll get converted into a byte value and stored in this byte. And we touched on .asciiz last lesson. This stores a string; the z means that it's zero-terminated. One thing I might add here is that this string is then six bytes long, because we've got the null terminator in there.

Okay, another thing that we can do is just ask for some memory of some size. For that we say .space and then 8, and this
will allocate eight bytes to us. But note that this space will not be initialized to any value in particular, and we don't have a type here; it's just eight bytes. There's no notion of whether these bytes represent words or half words or what; we just have bytes there.

The C equivalent of that would be, in global memory, with main down the bottom: we have an integer and assign 42. A short is half the size of an integer; we assign 7 to that. A character; we put the letter A in there. And we have an array d here, six characters wide, and we put hello inside there. And for the space, you see this character array as well, but we don't have any initializer here.

Okay, accessing memory. To perform any computations we first need to load the data for these computations from memory into the CPU registers. Remember, the arithmetic instructions only operate on registers or constants; they can't operate directly on memory. Storing data: if we're performing some computation, we probably need to store the result back into memory to be used later on. So the modified data must be written back from the CPU registers to memory if we want to use it later.

More on loading from memory. Loading the byte from address 4 would load the byte containing 26 into the specified register. We have an instruction to load a byte, we give it the address 4, and out pops the number 26. Typically, small groups of bytes can be loaded and stored at once. If we want to load or store a byte, we use lb and sb, for load byte and store byte. For two bytes, or a half word, we use the load half word (lh) or store half word (sh) instruction. And for four bytes, or a word, load word (lw) and store word (sw).

Okay, so in this example we're loading a half word. We provide the address and we load this half word out of memory. So what value do you think will be
loaded from here? We have the number 26 and the number 32; let's assume these are hexadecimal. This is a little-endian machine. That means the little end, the least significant end, of a number comes at the lowest address. So this number would actually be 0x3226; we need to reverse the order there. I'll touch on this later on as well.

Okay, so more on working with memory. Accessing memory with load and store instructions has two parts. First we need to load the address, in this case 4, into a register, and then we issue the instruction to read from that address into a register. So here we're loading the address 4 into $t0. Then with the load byte instruction we're telling the CPU to look at memory at that location and put whatever is at that location into $t1.

Okay, loading and storing a byte from memory. Load address: we have this nice big constant here, and then we load from that nice big constant into $t1. Now, I assume this big constant is probably referring to this byte down here, but it's not so clear. And over here we do the same for a store. We're loading a value into a register so that we can store that value to memory. Then we load the address, which is this nice magic number, and then we store the byte in $t0 to that address.

But we don't want to keep track of the memory locations and hard-code them ourselves, right? It's not clear that this address here maps to this byte here. And if we ever added anything before this byte, then all these memory addresses that we have here would have to change. So we do not want to keep track of memory locations and hard-code them ourselves, as in the previous example. What if we add and remove variables as we develop our code? So we use labels, which are used by the assembler to represent the memory locations. Here is a better example of storing and loading
a byte. We have added a label here before the byte, and when we issue the load address instruction we're providing that label. And over on the storing side, again we're using this label. We're providing this .space directive of 1 to reserve space for this letter.

Similarly, loading and storing a word. Here we allocate a word and initialize it to 10, and we provide the label for the word. And over here for .space we now have 4, because it's a word instead of a byte, and we use a store word instruction here.

Okay, so we do have some shortcuts. Sorry, one moment. Okay, so we do have some shortcuts for these load and store instructions. We can just provide the label to the store word instruction. We don't need to load that address into a register first, and the assembler will take care of all the heavy lifting for us to set up the correct instructions. Similarly, you've seen the store instructions in this form, with a constant zero in the front and then the brackets with the register. We can omit that zero and just put the register in brackets. And this is another form that you've seen here: we provide a label directly to the store byte instruction, so we don't need to load that into a register first. We can provide an offset for that as well. The assembler will take care of all the arithmetic and put that into the machine code for us. And that zero up the front here was an offset from the address held in the register, so we can also put a different constant out the front to say what offset from that register value to use.

Okay, demo program time. We're going to be writing this global increment program. Of course, I've already got my assembler cheat sheet open. Now, old friend mipsy web. Let's get rid of this. Okay, so we had our global
counter. We've got our global counter up the top. It's an int, so it's of type word. Let's start a data section. We will label this global_counter, and it's going to be a word initialized with zero. Okay, then we have our main; I'll start up here.

So, global_counter++. What's the first thing we need to do? This is the same as global_counter = global_counter + 1, so the first thing we need to do is load our global counter. So la $t0, global_counter. And then load word: lw $t1, ($t0). All right, now we need to increment it. What was the instruction for incrementing? We don't quite have one, but we can add an immediate: addi $t1, $t1, 1. And now we need to store this back to the global counter. We still have our global counter address in $t0, so we can store word $t1 to ($t0).

Okay, and don't forget, oh, we need to print that as well. Put this in. Where is our counter value? $t0, so we can use $t0 again for this; that's no longer needed. What am I missing? We need our argument in $a0, so $a0 from $t1. Now we syscall. Looks good. Put char, newline character. So in $v0, and I've forgotten what the syscall number is: print character, number 11. Can't forget to return zero so our program doesn't crash. All right. Oh, what have I done? Zero. All right, wonderful. Let's cross fingers. Right, excellent: we print the number one.

We can look at our data section here as well in mipsy. We can see our value one here. This is our word, which is the global counter, and remember, little-endian, so we start reading from the left to right. So the value here is actually the number 1, not 0x01000000. And in case this was a string, we'd be able to read the characters over here as well. And all these underlines here, and here too: this just means that this region of memory hasn't been initialized yet. In a real system this is normally initialized with zero, but that's not guaranteed; it could be garbage. So mipsy helps us here to know which memory has been
initialized and which hasn't. Okay, all right, excellent.

Okay, alignment. The C standard and MIPS require types of size n bytes to be stored only at addresses which are divisible by n. That means a four-byte int must be stored at an address divisible by 4; an 8-byte double must be stored at an address divisible by 8. The compound types, so arrays and structs, must be aligned so that their components are aligned. This is pretty easy with arrays, because each item has the same size. But in a struct, each of the components must be aligned to the size of that component. So, for example, if you're using load word or store word, you must be loading and storing the four bytes from or to an address divisible by 4.

So here's a demo of the alignment problem. Oh, I can't copy that. Yes, I can. So here's an example of our alignment problem. I'll run this program over here in mipsy once I get the indenting done correctly. Maybe a bit of OCD. All right, so if we run this program we get an exception here in the mipsy output: unaligned access. It tells us exactly what the instruction that failed was, and at what address. One thing I might highlight here is that it becomes really easy, when you're looking at hex, to know whether an address is aligned or not. For four-byte alignment the last digit is 0, 4, 8, or C, and then we're back to 0 again. So I know that this is aligned. But it's not showing me what the access was. Oh yes, it is. It is trying to access 0x10010006. 6 is not divisible by 4, so this is an unaligned access. And the reason for this is we have .asciiz here. Remember, this hello will be six bytes long, because we have the null terminator on the end there. So this .space ends up coming immediately after that hello, and that means it won't be aligned.

So if we change this to a store byte, it is aligned. Let's make this a hexadecimal value, 0xFF, step through, and keep an eye on our data section. Okay, so we've just
written to memory there. You can see that this address here is not aligned, so that's why our store word wasn't working. So what can we do to align this? Well, .space 2, that will surely align it. Run. Oh, this doesn't actually do anything. Fat fingers. Okay, so it doesn't actually do anything, but it shouldn't crash when we run it. Okay, so we have successfully aligned it. But what if we change this? What if we put an exclamation mark; we really want to say hello! Then we need to change this to 1. And this will happen whenever we make a change to this string here.

So another option for making this alignment is the .align directive. This takes an argument n, and the next item will be aligned to 2 to the power of n bytes. So if we have 1, then it's aligned to two bytes. If we have 2, it's aligned to four bytes. With 3 this will be aligned to eight bytes, but we only need to align to four bytes here. Still a store word. All right, so we run: no crashing. And you can see that we've still got this uninitialized byte here, because it's aligned that space section over here.

Okay, so just summarizing what we've just done. We can use .space to pad this section so that the next .space will be aligned. We have to calculate this padding ourselves, so it's error-prone and may break if we modify our string hello. So this would be a much better way of fixing our problem: we align the next object to a four-byte address, two to the power of two, and this is much less error-prone.

All right, now we have an example using pointers. So what would this print? And how can we write this in MIPS? First of all, we have the answer at the top, which is 42, the answer to everything. And here we're declaring a pointer; we're assigning that pointer to be the memory address of the answer. Then we say i equals: we dereference the pointer
to take the value at that address, and we print that value. Then we assign the value at that address to be 27 and print our global variable here, which we should have modified to 27.

Okay, so these are our global variables. So let's do .data: answer is a word initialized to 42. And then let's go back and do our text section. i in $t0, p in $t1. We're not initializing yet. So p is equal to the address of answer, and p is in $t1, so load address answer into $t1, which is our p. All right, fantastic. You'll notice that we didn't have to put the ampersand here in front of answer, because the assembler knows that this is a label and therefore an address.

Now i is the value at that address, and it is a word, so we need to load word. We'll load into $t0, because that's i, and we're loading from ($t1). Okay, now we want to print i. I should have left these constants up here. Let's move this down here. What was print int? That's 1. So that means in $v0 we have the syscall number for print int, and our argument, which is in $a0, so we need to put $t0 in there. And then we syscall. All right, makes sense. I'm sure I've probably made a bug in there, but we'll find out; we'll debug it in just a moment.

Okay, so now the next instruction: we want to store to that memory address the number 27. So what do we have access to? Let's just use $t2: li, then store word $t2, and we still have p in $t1. Right, now we want to print the answer again. That's the global, so we need to load that again. But we're not loading it from p; we want to take answer straight away. So into $t2, and now we can print int again. How did we print int? Okay, in this case we're taking $t2. syscall. Can't forget the return: li $v0, 0, then jump register to the return address.

All right, I don't have you guys here to help me fix any problems that I might have, so let's just hope it works first go. Run. 4227. Forgot the newline on the end. All right. So word, load instruction, $t1. What don't we need anymore? Ah, let's just go $t3. So we can load $a0
directly here, can't we? Print, if I remember, that is number seven. Don't make it, of course. Much better. There we go.

Okay, so while we're here, why don't we take a look at our data section? We have 1B here. Let's make this something a little easier to understand: 0x, and we need to step through a couple of times, so 1 2 3 4 5 6 7 8. Fantastic. That is our word. Oops.

So what would happen if we added a half word here, ABCD, and then we'll repeat this word afterwards? Maybe autocomplete can help me out. Oh, I think I've forgotten how to store a half word. Here we go, it's in the instructions: .half. I don't use half words that often. All right, excellent. Step through. Okay, so this is our first word, and then this is our half word. And you can see that second word that we added on the end there: the assembler has aligned this on our behalf. So the assembler knows that these data items need to be aligned to the size of that data. So it's just .space that we need to worry about, because it has no idea how we're going to be accessing a space of eight bytes.

Okay, I think we might actually finish early today, because I don't have so many questions. Unfortunately I'm not able to see any questions that you might have and respond to them in real time, but if you have any questions please reach out on the forum and I'll answer those as quickly as I can.

Okay, so a little bit of bonus content: dealing with instruction set architecture extensions. There's actually a true story associated with this; this is something that I had to do. Now suppose that a CPU is released with an extension to the instruction set architecture. Suppose syscall is a new instruction, and that we don't have an assembler yet that understands the encoding of the syscall instruction. One thing that we can do here is use the directives to inject an instruction into our program. These directives are not limited to the data section
alone; we can put them in the text section. So here you can see I am trying to perform a system call for print int, but I don't have the syscall instruction. But I know that the encoding for this syscall instruction is 0x0000000C. So I can reserve a word in my instruction stream here, initialized to that value, and that will function exactly like the syscall instruction. And if you don't believe me... we better put in our return, li. Okay, first of all, you'll notice, if we go to the decompiled program, it has inferred for us that this machine code here is the syscall instruction. So it knows what we're trying to do here. But let's run it just for fun. All right, silly me, we already have li, jr there. This is why structuring your program is important. Okay, so run. There we go. So we get a system call to print the number 42, just like our program says, with machine code inserted into our assembly. So you could write your whole entire program like this using machine code, but I definitely would not recommend it.

All right, so what did we learn today? We did a recap of our if statements and loops, and we started talking about MIPS data: how we load and store data from memory. We looked at integers, characters and pointers, and importantly we looked at alignment. So make sure that you align your .space reservations. In our next lecture we're going to be covering 1D arrays and 2D arrays, and we'll be looking at structs.

All right, so thanks very much. And for those who have been reading my previous lecture slides, you may have noticed that I've been trying to get a tiger with AI. But I've been struggling to get AI to generate a transparent background, or even a white background without any shadows. So I gave up and said, "Hey, let's just give it an exciting background."
Okay, but who would ever believe that I created this myself anyway? If you have any course-related questions, please reach out on the forum. Again, I'm seeing a little bit of noise there, but I'd like to see a bit more; get some discussions going. For any admin-related questions, please email the course account, c1521c.w.edu.au. And if you need any support with your mental health or other things, we have all these services. Please reach out if you need anything. Thank you very much. I'll see you on Wednesday.
