Transcript for:
[Lecture 7] Fundamentals of Computer Architecture

Do you know why the camera is like this? Is the livestream working? Okay, good. We can use the camera here... actually, this makes little sense; let's just use the camera over there. Let's see... yeah, that's better. Okay, I think there's some background issue, though, as you can see; it's related to this computer. Okay, let's get started. Can everybody hear me in the back? Good. Looks like we've lost some people; this room was more crowded in the past, don't you agree? I went to a conference and everybody's gone. Well, not everybody — you're still here.

Today we're going to start with the architecture part of this class, and we're going to move a little bit higher up in the stack, in the levels of transformation. We're going to talk about the von Neumann model of computing and then instruction set architectures, and then we're going to bridge the gap from there to the logic later on. But first, some reminders. There are office hours; these are hybrid, Mondays 1 to 2 p.m., and there's a Moodle announcement you can check — that's correct, right? Your extra credit assignments are also coming up: one is on March 21st — some of you may have watched it, some of you asked questions about it — and another one is coming on April 1st. How many people have done at least one of the extra credit assignments? Okay. Is it interesting? Okay, hopefully you'll get the extra credit.

So what have we learned so far? We have covered three weeks of lectures, plus we've started the labs — hopefully the labs are going well; I'll ask you about them later. Essentially, we started at the device-level abstraction, the transistor as a switch, and then we built logic on top of that. We've talked about how to design that logic, both combinational and sequential, how to design finite state machines, how to describe that logic using a hardware description language so that we can actually instantiate these things at the lower level, and how to verify the functionality as well as analyze the timing of a circuit. Does that sound good? This is all digital design, so we are really done with the digital design part of this course. Digital design is a much bigger field — you could take multiple courses on that topic — but since we don't have a lot of time in the bachelor's program in computer science here, we're dedicating one quarter of this course to digital design.

Now we're going to move up a little bit higher. We're not going to go into microarchitecture immediately, because I want you to get the higher-level perspective first, which is really how a computer executes things. So we're going to talk about the von Neumann model, and we're going to look at example von Neumann machines — essentially instruction set architectures that operate based on von Neumann principles — LC-3 and MIPS specifically. I'm going to start with LC-3, whose fundamentals the Patt and Patel book does a great job of covering, and then we're going to give examples from these two instruction set architectures. LC-3, the Little Computer 3, is an educational ISA; MIPS is a real ISA that's used in the field. There are many, many other instruction set architectures — x86, ARM, RISC-V, etc. — that we're not going to talk about. Then we're going to talk a little bit about assembly and programming, so today we're really going to talk about the programming interface to hardware, and later we're going to start on microarchitecture.
We're going to cover a lot of interesting microarchitectural concepts on how to implement an instruction set architecture such that you obey the software/hardware contract, and do that while exploiting the logic gates that we have designed in the past. Does that sound good?

So basically, we're going to build a computer. We will learn the basic elements of a computer today, and we'll look at an example von Neumann machine today as well. Today and tomorrow we're going to cover instruction set architectures, including different types of instructions — operate instructions, data movement instructions, control instructions — and we're going to talk about instruction formats, addressing modes, data types, etc. These are some readings; the ones in red are what we're going to follow today and tomorrow. I'm going to follow Patt and Patel especially, but I'm going to give you examples from Harris and Harris as well. Tomorrow we're going to talk a little bit about programming, assembly-level programming. I'm sure you've seen programming in other languages — high-level languages — and we're going to try to bridge the gap between a high-level language and assembly and the ISA a little bit. Next week, hopefully, we'll start microarchitecture; we'll start implementing the ISA.

Okay, so let's jump into it. How many of you have heard about the von Neumann model? Does this mean anything to you? Okay, how many people have not heard about this? It's okay — based on the hands I've seen, most people haven't heard of it. For those who have heard about it: did you hear about it in classes? No? I see heads shaking, so probably you heard about it somewhere else. Basically, this model is about how to build a computing system. I'm going to give you an alternative model, hopefully tomorrow — we don't have time today. This is not the only model, but it is really the most successful model we've had so far, partially because it came first, and secondarily, I would say, because it's easier to program.

So recall this picture — I've shown it to you before at some point. We will cover all of these components — processing, memory, and communication — but we'll especially focus on processing initially. For this to work, we need a model of computation. In past lectures we learned how to design combinational and sequential logic structures, and with those logic structures we showed that you could build an ALU, an arithmetic logic unit — execution units, basically. We could build a decision unit, for example; we could build memory and storage units; we could build communication units. We've seen examples of all of these, and these are the basic elements of a computer. Today we will raise our abstraction level up to the software/hardware interface, plus a little bit of microarchitecture, so that you can see how the software/hardware interface can be implemented in a microarchitecture — we're going to look at different implementation choices later — and we're going to use logic structures to construct this basic computing model.

So what are the basic components of a computer? If you think about the fundamentals, the very basics: you want to get a task done. We want to solve problems, as we discussed in the first lecture, and to solve problems we need to express those problems in a language that the computer can understand. So we need a computer program, basically.
In the end you write a computer program, and you have written computer programs in high-level languages; today we're going to look at computer programs at the level that the machine really understands — we're going to look at ones and zeros, for example. So what is a computer program? A computer program specifies what the computer must do; that's the goal. And then we need to design the computer itself so that it can carry out this program — carry out the specific task that's dictated by the program you have written. Hopefully this is obvious: you need both of these.

So a program is really a set of instructions, and each instruction specifies a well-defined piece of work for the computer to carry out. This is really important, because we're going to have instructions at the level that the hardware can interpret and understand. When you write programs in a high-level language, you may write things that are very high level: it could be, I don't know, an operation on a key-value store, for example — that could be part of your programming language — or you do something on a hashmap; that could also be part of your programming language. The hardware itself may not understand those hashmap operations; you need to map those things onto what the hardware can do. We're going to look at what the hardware can do, at least in the instruction sets that we're going to study, but I'm also going to show you that instruction sets can be designed to be high level as well.

Okay, so what's an instruction? An instruction is the smallest piece of specified work in a program — these definitions are all from Patt and Patel; you can read the parts that I mentioned. The instruction set is all possible instructions that a computer is designed to be able to carry out. We haven't defined the instruction set yet; we're going to build up to it. But before we get to that, there also needs to be a model of execution, and that's where the von Neumann model comes in: if you want to build a computer that can carry out instructions, you need an execution model for processing programs. John von Neumann — whose name you may have heard in the past; how many people have heard of John von Neumann? You may not know the model, but you may have heard the name — proposed, along with his colleagues, a fundamental model in 1946 which became very popular. This is what he looks like, and this is the paper. The model consists of five components; later I'm going to distill two major properties of the von Neumann model, but let's go through these five components first.

We start with memory. Memory stores the program and the data, because you need to have the program somewhere, you need to have the data somewhere, and you need to do something to the data — the processing unit does something to it, essentially; those are the processing units. Then there is an input unit that can take input from the user, and there's an output unit through which the computer can communicate with the outside world. And finally, there's the control unit, which coordinates everything the computer does: it controls the order in which the instructions are carried out. So these are the five components proposed by von Neumann. We're going to examine examples of these, and throughout this lecture we'll examine two examples of the von Neumann model: LC-3, the Little Computer 3, and MIPS. MIPS is a real ISA, as I said. These are good educational examples, but I'll give you examples from other instruction set architectures as well. Does that sound good? Okay, cool.
I should say that all general-purpose computers today use the von Neumann model; that's why we're starting with it. This doesn't mean it's the most efficient model. Later, for example, we're going to see accelerators for machine learning — in lecture 18 or so — that are based on completely different principles. They may still take instructions so that they can carry out some task, but they don't necessarily use the von Neumann model.

Okay, so this is a pictorial representation of the von Neumann model. It has the five components that we have discussed, and we start with memory. I'm going to define some of these terms, and this is not magic: we already built a memory, a simple one, but that was a 12-bit memory, if you remember — four locations, each location storing three bits. We're going to build bigger memories, and we're going to abstract them so that we can actually use instructions to specify locations. That earlier picture was a logic-level view of memory — a memory array.

So what does the memory store in the von Neumann model? It stores both programs and data, and actually there's no distinction between them. If you just look at memory, it's all bits; there's nothing that says "this is a program and this is data." You don't distinguish between them in memory; you distinguish between them when you're processing, when you're actually executing instructions, as we will see. The control logic — the control unit — treats a particular memory location, a memory address, as containing an instruction versus data depending on when it accesses it. This will become more clear.

Memory contains bits, as I said, and bits are logically grouped. In the ISAs — instruction set architectures — that we're going to look at, they're grouped into bytes; a byte is eight bits. This terminology actually comes from the 1960s, from the IBM System/360, which is one of the very interesting computers. A word could be 8, 16, or 32 bits, depending on the instruction set architecture. We actually defined some of these things earlier. The address space is the total number of uniquely identifiable locations in memory. In LC-3, for example, the address space is 2^16, meaning you have 16-bit addresses. In MIPS, the address space happens to be 2^32 — 32-bit addresses. In a more modern ISA, x86-64, the address space is up to 2^48, meaning you can have 48-bit addresses. This address space is what can be referenced by programs: a program can reference a very large amount of memory if you're given this address space. Addressability is how many bits are stored in each location, meaning at each address. For example, you can be 8-bit addressable, in other words byte-addressable — many ISAs today are byte-addressable — or you could be word-addressable, meaning an address specifies 32 bits, for example, assuming your word is 32 bits. In LC-3, for example, a word is 16 bits and the memory is word-addressable. These will become more clear, but these are some definitions, and these are things the programmer sees. Whenever you program at a low level, as an assembly programmer looking at the instruction set of a computer, you see the address space — it's specified in the ISA document — and you see the addressability, so that you know what an instruction is going to do, how big an address will be, and how much data you can specify with a single address: 8 bits, or a single word.
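To make that arithmetic concrete, here is a minimal Python sketch — an illustration only, not from the lecture slides — that computes how much memory each of those address spaces can reference, given an address size and an addressability:

    # Sketch: how much memory an ISA's address space can reference.
    # addressability_bits = number of bits stored at each address.
    def total_bytes(address_bits: int, addressability_bits: int) -> int:
        locations = 2 ** address_bits                  # uniquely identifiable locations
        return locations * addressability_bits // 8    # total capacity in bytes

    print(total_bytes(16, 16))           # LC-3: 2^16 16-bit words -> 131072 bytes (128 KiB)
    print(total_bytes(32, 8) // 2**30)   # MIPS: 2^32 bytes -> 4 GiB
    print(total_bytes(48, 8) // 2**40)   # x86-64: 2^48 bytes -> 256 TiB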
We will also see that there are different instructions that refer to a single word or a single byte, so that we can, for example, get a single byte from memory or get a whole word from memory. In other words, a given instruction can operate on a byte or on a word, and we will see examples of this, hopefully toward the end of this lecture.

Okay, this is a simple example, a representation of memory with eight locations. You have the addresses of these locations; since there are eight locations, you have three bits to identify each location — we should be able to do these calculations very easily by now — and each location happens to contain eight bits, one byte. So you can think of this as byte-addressable memory, and the address space is eight; each location holds eight bits. This is an example from Patt and Patel as well: if you look at it, the value 6 is stored at address 4 — address 4 is this one, and value 6 is stored there — and the value 4 is stored at address 6, just to confuse you. It's not confusing, of course; you know how to deal with your bits by now.

So one question: this is one possible specification of an ISA — you can say my address space is this big and my addressability is 8 bits. You could design an ISA this way, but it is very limiting; clearly you don't want memories that are this small, so modern ISAs are much larger, as we discussed on the earlier slide. But just for fun, if I ask you how we could make the same-size memory bit-addressable: the size of this memory is eight locations of eight bits each, 64 bits total. If it were bit-addressable, you would have 64 locations where each location is a single bit, so your address size would be six bits and each address would specify one bit. It's possible to do that, but most computers today are not programmed that way; you don't have instructions that operate on individual bits, and the smallest granularity at which you can get data from memory is usually 8 bits, meaning one byte. This is changing a little bit with machine learning, because people are figuring out that sometimes you actually want to use smaller data types — I'm going to define data types later on — like one-bit, two-bit, four-bit, so that you can operate on smaller data elements. It turns out operating on larger data elements is expensive in terms of energy and computation power; if you operate on smaller data elements you can be much more energy efficient. So things may change as ISAs and workloads evolve into the future and energy requirements push people to do something different. It's good to know that you can explore trade-offs like this.
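Here is a rough Python sketch of that trade-off — the same 64 bits of storage viewed with two different addressabilities; the numbers match the example just discussed:

    # The same 64 bits of storage, viewed as byte-addressable vs. bit-addressable.
    import math

    total_bits = 64
    for addressability in (8, 1):                   # 8-bit locations vs. 1-bit locations
        locations = total_bits // addressability
        address_bits = int(math.log2(locations))
        print(f"{addressability}-bit locations: {locations} locations, "
              f"{address_bits}-bit addresses")
    # 8-bit locations: 8 locations, 3-bit addresses
    # 1-bit locations: 64 locations, 6-bit addresses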
Okay, so let's talk about word-addressable memory. Word-addressable memory means that each data word has a unique address; if the memory were bit-addressable, each data bit would have a unique address. Let's start with word-addressable: in MIPS, for example, you would have a unique address for each 32-bit data word. In LC-3, this is specified by the ISA designer — somebody designs this hardware/software interface and somebody documents it; it's like a contract, a 4,000-page contract document, as you will see later today or tomorrow, where they specify everything that a programmer needs to know in order to program this hardware. In LC-3, somebody specified that there should be a unique address for each 16-bit data word.

So let's take a look at MIPS memory. MIPS memory looks like this: at the bottom you see word zero, then word one, word two, and word three, and these hold some random data. Here we have a hexadecimal representation — each hex digit is four bits, and you have eight of them, so 32 bits total. Hopefully that makes sense. So you have a word address: this is word address zero, word address one, word address two, word address three. If you say "I want to get the data at word address three," you will have an instruction that enables you to express that to the computer, and the computer will access that memory location — because its address is three — using a decoder, if you remember, and that data will be transferred somewhere, as we will see soon.

Okay, now let's take a look at byte-addressable memory. In this case each byte has a unique address. MIPS actually is byte-addressable; it turns out LC-3 is not byte-addressable, but there's an updated version of LC-3, the Little Computer 3b (LC-3b), that is byte-addressable. So now this is our MIPS memory, having become byte-addressable. I didn't show you the longer version, but I divided the words into four pieces here; each of them is a byte, as you can see, and these are now byte addresses — we don't have a word address here, we have byte addresses. One of them is zero over here — we're going to talk about that soon — so let's assume this is byte 0, byte 1, byte 2, byte 3, and then you have byte 4 over here, then 5, 6, 7, 8, 9, 10, 11, etc.; you can write these in hex or decimal. So now we're byte-addressable: whenever we use an instruction that says "get this byte for me" — let's say I want byte 4 — and you somehow specify the address (we will see how we specify all of this to the computer), the computer will get you F7, assuming some byte ordering, as we will see.
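As a small illustration of the byte-address arithmetic, here is a Python sketch of a byte-addressable memory holding 32-bit words; the byte values other than the 0xF7 mentioned above are made up:

    # Byte-addressable memory holding 32-bit words: word i spans byte addresses 4*i .. 4*i+3.
    memory = bytearray(16)                               # byte addresses 0..15
    memory[4:8] = bytes([0xF7, 0x8C, 0x11, 0x5D])        # hypothetical contents of word 1

    word_address = 1
    byte_address = 4 * word_address                      # first byte of that word
    print(hex(memory[byte_address]))                     # a byte load of address 4 -> 0xf7
    print(memory[byte_address:byte_address + 4].hex())   # all four bytes of the word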
That clearly brings up the question: how are these four bytes ordered? Some data is expressed in terms of bytes and some data is expressed in terms of words — you may have a large data type like 32 bits, but your memory may be byte-addressable, meaning it consists of 8-bit locations. Then, if you think about a word, how you actually lay out the word matters. So "how are these four bytes ordered?" becomes a question. I'm going to quickly digress into that, but basically the question is: which of these four bytes is the most versus the least significant? Is the most significant byte here, or is it here?

You may have read Gulliver's Travels — how many people have read Gulliver's Travels at some point? Not at school; they don't teach it at school anymore. Not for fun either; there might be more fun things these days, I don't know, but I would recommend reading it — these are beautiful stories. Basically, Gulliver encounters folks that are Big-Endians versus Little-Endians, and this has influenced computer terminology as well. These folks differ in the way they break their eggs. Eggs look like this, right? They're asymmetric in general. Some folks decide to break their eggs on this side, the big end — they're called Big-Endians — and some other folks decide to break their eggs on the other side, the little end — they're called Little-Endians. Why do they do it this way? Because they're used to doing it that way. If you're born into a place where everybody breaks their eggs on the big end, then you probably keep doing that. Well, you could start a new trend over there, of course, but you may not be immediately accepted. So this is a convention, basically — a tradition, in other words. Which one is better? Neither of them, maybe; it depends on your taste, perhaps.

We have a similar issue over here. If you look at the memory that we have drawn — and now these are not data, these are the addresses, the byte addresses I have written for each word; the word addresses are the same — in a big-endian machine you lay out the words such that the most significant byte of a word is at the lower address. So if you specify address zero, you get the most significant byte of the word you stored; if you specify address three, you get the least significant byte. Why does this matter? Because if you have a 32-bit value and you want to add it to another value, you want to know which is the most significant byte and which is the least significant, because these are very different things: bit 0 affects the addition the least, because it's the least significant bit, and bit 31 affects the addition the most, because it's the most significant. In the little-endian world, it turns out the least significant byte of a word is at the lower address, and the most significant byte is at a higher address. You can say, "okay, this is more intuitive for me, because it's the least significant and the address is lower — I get that," but in the end you could be living in a world where the other layout is the norm, and that's certainly possible; you just need to rewire your thinking, and if you were always wired that way, that layout is more natural to you. So it's basically a convention. Does it really matter? In the end, if you know where to put your least significant bytes, it doesn't matter. Where it really matters is when a big-endian system communicates with a little-endian system and they need to talk to each other: you have a computer that lays out data one way because its ISA is big-endian, and another computer on the network that uses the little-endian convention. When they communicate with each other, they somehow need to make sure that the least significant byte on one computer stays the least significant byte on the other computer — basically, they need to do some reordering. This is good to know, and it actually carries a lot of lessons that I don't want to go into, but convention really does affect a lot of the designs that we have.
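Here is a small Python sketch of the two layouts just described, using int.to_bytes; the 32-bit value is arbitrary:

    # Big- vs. little-endian layout of the same 32-bit value in byte-addressable memory.
    value = 0x0A0B0C0D

    big    = value.to_bytes(4, byteorder='big')      # most significant byte at the lowest address
    little = value.to_bytes(4, byteorder='little')   # least significant byte at the lowest address

    print(big.hex())      # 0a0b0c0d  -> byte address 0 holds 0x0A
    print(little.hex())   # 0d0c0b0a  -> byte address 0 holds 0x0D

    # Two machines with different conventions agree on the value only if the bytes
    # are reordered appropriately when they communicate:
    assert int.from_bytes(little, 'little') == int.from_bytes(big, 'big') == value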
Okay, so how do you access memory? That was a logical view of memory — least significant bytes, more significant bytes, etc. Now let's talk about accessing it. There are two ways of accessing memory: you need to read, or load, data from a memory location, and you need to write, or store, data into a memory location. Those are the high-level operations, and internally we're going to use two things — the memory address register (MAR) and the memory data register (MDR) — to facilitate this access. This is one implementation; there could be multiple implementations, and we're now blurring the line between the programming abstraction and the implementation, but there's a memory address register in the von Neumann model and there's a memory data register. To read, you first load the memory address register with the address we wish to read from — it could be a 16-bit address, for example — and in the second step, the data in the corresponding location gets placed in the memory data register, using some circuitry in the memory; we're going to demystify that circuitry later. To write, there are also two steps: in the first step you load the memory address register with the address and the MDR with the data we wish to write, because we're going to write that data to that address, and then you activate a write-enable signal to the memory, which ensures that the value in the MDR is written to the address specified by the MAR. Make sense? This is one way of implementing it, and it's perhaps the simplest way of thinking about memory.
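A toy Python model of that two-step read and write, with an explicit MAR and MDR — this is a sketch of the idea, not the actual LC-3 circuitry:

    # Two-step memory access through a memory address register and a memory data register.
    class Memory:
        def __init__(self, num_locations):
            self.cells = [0] * num_locations
            self.mar = 0      # memory address register
            self.mdr = 0      # memory data register

        def read(self, address):
            self.mar = address                  # step 1: load MAR with the address
            self.mdr = self.cells[self.mar]     # step 2: memory places the data in MDR
            return self.mdr

        def write(self, address, data):
            self.mar, self.mdr = address, data  # step 1: load MAR and MDR
            self.cells[self.mar] = self.mdr     # step 2: write enable stores MDR at MAR

    mem = Memory(2**16)        # e.g., an LC-3-sized address space
    mem.write(0x3000, 42)
    print(mem.read(0x3000))    # 42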
Okay, so we've covered the memory part of the von Neumann model; let's talk about the processing unit. The processing unit looks interesting because it has one component that we have seen, the ALU — hopefully people remember the ALU from previous lectures. Good. But you probably don't know what "TEMP" is. TEMP is an important thing, and it's going to introduce a concept that will appear again and again in our lectures, which is locality, and we're going to build a memory hierarchy out of that later.

So, the processing unit: the main function of a processing unit is to perform the actual computations — this is what a computer is designed to do: add, multiply, multiply-and-accumulate, convolution; it could be any operation, XOR for example. The processing unit can consist of many functional units, and we started in the past with an arithmetic logic unit, which executes computation and logic operations. In LC-3 there are only a few operations — ADD, AND, and NOT; actually there are three operations, and an XOR was added in a later version. In MIPS there are a bunch of these operations, as you will see later. There is a NOR — keep that in mind, because there will be a question later on that relates to NOR — but there are other things, like multiply, as well; there's a story about multiply that I'm not going to tell right now, but there are interesting stories here. The ALU processes quantities that are referred to as words, so ALUs operate on words in general, but an ALU can also operate on bytes. In LC-3 the word length is 16 bits; in MIPS the word length is 32 bits. This is the maximum operation length of the ALU. Recall, this is the ALU that we designed in a past lecture — I will not go through it; it's just to jog your memory — so again this is not magic: we know how to design this thing, at least for a subset of the MIPS ISA.

Okay, so what else is there in the processing unit? This is an important concept; it's not just about processing. When you do processing, you operate on data. Where does the data come from? Yes, it comes from memory, but you also need memory close to the processing unit, because memory far away is too far away. This is the fast temporary storage: it's almost always the case that a computer provides this fast temporary storage, essentially a small amount of storage very close to the ALU, and it is visible to the programmer — we're going to explicitly specify this storage separately from the memory. You can think of it as a small amount of memory close to the ALU whose purpose is to store temporary values and access them quickly later. This is how fundamental data access and memory are to our instruction set architectures.

For example, suppose I'm going to do this calculation: (a + b) × c ÷ d, in that order. You have intermediate results: the intermediate result of a + b, once you do the add, can be stored in the temporary storage as opposed to being written back to memory, because it's too slow to store each ALU result in memory and then retrieve it again for future use. If you have some small temporary storage — a small amount of memory very close by — you can do this very quickly. As we discussed in earlier lectures, a memory access is much slower than an addition, multiplication, or division; that's the key here. The same is true for the intermediate result of (a + b) × c: you did an add, then a multiply — where do you store the results? You store them in the temporary storage. We're going to call these registers soon. Yes — basically, this temporary storage is usually a set of registers, also called a register file. It's essentially a small amount of memory that is addressable by the instructions, so that you can store these temporary results without having to go to main memory to access things.

So we have fast temporary storage. Let me motivate it again, because this is going to appear again and again later on: memory is large but slow. What do I mean by memory? This thing over here is large but slow — in x86 it's 2^48 8-bit elements — and we don't want to go there all the time. So we have registers in the processing unit; they ensure fast access to values to be processed in the ALU. Typically one register contains one word — the same as the word length, because you operate on words in an ALU. This is called a register set or register file. It's visible to the programmer, it's part of the instruction set architecture, and it's separate from memory; it's essentially a set of registers that can be manipulated by instructions. LC-3, for example, has eight general-purpose registers, numbered R0 through R7; because you have eight, you need three bits to specify a register. You can think of this as a small memory that has eight locations, where each location stores a word — register size equals word length, as I said. MIPS has 32 general-purpose registers, as you will see in your programs; it gives you a bigger fast temporary storage, and they are specified as R0 through R31, so you have a five-bit register ID, or register number, which requires more space in your instruction. As a result, your instructions need to be larger in MIPS — your instructions are actually smaller in LC-3 — and the register size is larger in MIPS, because the MIPS word length is 32 bits, as we have discussed. So now we've introduced registers; we're going to do a lot of programming on registers later today and also later in this course.

Recall that this is also not magic: we have built this register — we called it a register; this is a 4-bit register, and if you look at it, you can imagine a 32-bit register as well. And this is another example of such a register. These are mnemonics that I'm not going to go through, but there are conventions for how registers are named in MIPS. It's really R0 through R31, but at a higher level, when you program, you may refer to registers by these names, and the compiler can translate them into these numbers. We may get back to this when we do assembly programming.
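Here is a rough Python sketch of that point about temporary storage: keeping the intermediate results of (a + b) × c ÷ d in a small register file instead of writing each one back to slow memory; the register numbers and values are arbitrary:

    # Intermediate results of (a + b) * c / d kept in a small register file.
    regs = [0] * 8                  # eight general-purpose registers, like LC-3's R0-R7

    regs[1], regs[2], regs[3], regs[4] = 10, 20, 3, 5   # a, b, c, d loaded from memory once

    regs[5] = regs[1] + regs[2]     # a + b: intermediate result stays in a register
    regs[5] = regs[5] * regs[3]     # (a + b) * c: again kept close to the ALU
    regs[6] = regs[5] // regs[4]    # final result; only this might go back to memory

    print(regs[6])                  # 18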
Okay, so we've covered memory and the processing unit; let's talk about input and output quickly. We're not going to cover them in detail yet, but they are also critical: they enable information to get into and out of a computer. You can use many devices for input and output — these are also called peripherals. Input can come from a keyboard, mouse, scanner, disks, etc. — many things — and output can go to the monitor, printer, disks, etc. Today we potentially have many more input and output devices. In LC-3, as you will see in the Patt and Patel book, they consider a keyboard and a monitor, and there are special locations in memory to access them. We will not talk about this right now; we may get back to it later.

Now let's talk about the control unit, which really orchestrates how all of these are put together. It's essentially like the conductor of an orchestra: you have an orchestra, everyone is doing different things, and the conductor decides who should be doing what at what point in time. Essentially, the control unit conducts the step-by-step process of executing every instruction of the program in sequence, and we're going to see that today. It keeps track of which instruction is being processed via this register called the PC — oh, sorry, the instruction register. The instruction register holds the bits of the instruction that you're currently processing; it contains the instruction word, if you will, and we will see different instruction words soon. This instruction register gets interpreted by the processor: the processor figures out "oh, this instruction specifies that I should do an add," or "this other instruction specifies that I should do a load" — but you have one instruction at a time in the instruction register. The control unit also keeps track of which instruction to process next via the program counter. The program counter, or instruction pointer, is another register that contains the address of the next instruction to process. Does that make sense? You basically start with the program counter, fetch an instruction, put it in the instruction register, and then increment the program counter. Once the current instruction finishes processing, you fetch the next instruction from the incremented program counter, put it into the instruction register, process that instruction, and increment the program counter; once that instruction is done, you fetch the next instruction into the instruction register, and you keep incrementing the program counter so that you can go to the next instruction in a sequential manner. The instruction register provides storage for the bits of the instruction that specify what the computer should do. So the program counter is the address to fetch the next instruction from, and the instruction register is the data value of the instruction. Does that make sense?
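A minimal Python sketch of that fetch/increment loop driven by the PC and the IR, for a word-addressable machine (so the PC steps by 1; a byte-addressable machine with 4-byte instructions would step by 4). The "instructions" here are just placeholder strings, not real encodings:

    # Fetch an instruction into the IR, bump the PC, execute, repeat.
    memory = {0: "ADD ...", 1: "LOAD ...", 2: "STORE ...", 3: "HALT"}

    pc = 0                      # program counter: address of the next instruction
    while True:
        ir = memory[pc]         # fetch: instruction register gets the instruction word
        pc = pc + 1             # increment the PC to point to the next instruction
        if ir == "HALT":        # ... decode and execute whatever is in the IR ...
            break
        print("executed", ir)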
So basically we have now built up our architectural state. I didn't say it explicitly, but these are all visible to the programmer — except for the instruction register. These are the things that the instruction set architecture exposes to the programmer. There is memory, which is essentially an array of storage locations indexed by address — you have the addressability, the address size, etc. You have the registers, the general-purpose registers; these are given special names in the ISA, as opposed to addresses — the fast temporary storage, as we have discussed. There could also be special-purpose registers, which we're not going to deal with yet. And then there's the program counter: this is the memory address of either the current or the next instruction, depending on where you are in the instruction processing cycle, which we will get to soon. These are three key parts of a machine that are visible to the programmer, and instructions — programs are sets of instructions — specify how to transform the values of this programmer-visible state. For example, an add instruction says: add this register to this other register and put the result into this destination register — basically, an add instruction operates on two registers. A load instruction will load a value from a memory location: load from this address in memory into this register. And then you may have a store instruction that says: take the data value in this register and store it into this memory location. So all of the instructions that we will see operate on these entities, which are visible to the programmer; that's why this is so important and why I'm spending a lot of time on this slide: this is the architectural state visible to the programmer.

We will also see instructions that operate on the program counter. Actually, all instructions operate on the program counter: whenever you fetch an instruction and execute it, you go to the next instruction — you increment the program counter, in other words. But then we're going to do more sophisticated things to the program counter, because executing purely sequentially in a program is not that useful. Sometimes you want to do something else in a program based on some data value: for example, you say "if this data value is greater than five, I execute this code; if it is smaller than five, I execute this other code." You do some conditional action, and if all you're doing in a program is going sequentially by incrementing the program counter, you don't have that. So we'll introduce branches, or control flow, into the program: instructions that take the program counter and change it to a value that takes you to the right place in the program. Does that make sense? If you want to execute a function call, for example — you program with functions; I want to call this function — that requires changing the program counter to the address of the function. A function is essentially a sequence of instructions somewhere in memory that you can jump to by changing the program counter.
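A tiny illustration of such a control-flow instruction in Python — instead of just being incremented, the PC is loaded with a target address when the branch is taken; the addresses and the condition here are made up:

    # Next-PC selection for a conditional branch: taken -> target, not taken -> PC + 1.
    def next_pc(pc, value, branch_target):
        if value > 5:                # "if this data value is greater than five..."
            return branch_target     # taken branch: jump to the target code
        return pc + 1                # not taken: fall through to the next instruction

    print(next_pc(pc=100, value=7, branch_target=200))   # 200
    print(next_pc(pc=100, value=3, branch_target=200))   # 101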
Okay, so we've covered the von Neumann model at some basic level. Let me give you two other properties of the von Neumann model that I have implicitly suggested. It's also called a stored-program computer — instructions are in memory. It has two key properties: one is the stored program, which I'm going to define a little more precisely, and the second is sequential instruction processing, which I harped on earlier. Stored program means instructions are stored in memory in a linear array, and memory is unified between instructions and data: there is basically no distinction in memory between instructions and data. The interpretation of a stored value depends on the control signals — on what part of the instruction cycle the processor is in when it accesses memory, as we will see soon. This may be difficult to understand right now, but it'll become clear soon, hopefully. Sequential instruction processing is easier to understand: you basically have one instruction processed — fetched, executed, completed — at a given time. The program counter (instruction pointer) identifies the current instruction, or, when it's incremented, the next instruction, and the program counter advances sequentially by being incremented, except for control transfer instructions, which are branches or jumps, as we will see. But it's very sequential, and this is part of the success of the von Neumann model: whenever you write a program, you write a sequence of commands, and you expect that one instruction in the sequence is executed before you go to the next instruction. This enables a lot of things, like easy debugging and easy reasoning about how things are performed.

Yes — why is memory slow because it's large? If you have a large memory, say 2^48 locations, imagine, based on what we discussed in earlier lectures, how big a decoder you need, how big a multiplexer you need, how big an array you need: all of them are larger, and based on the timing and verification lectures, that requires a huge amount of combinational logic, so the delay is much larger. If you have only eight registers, your decoder is much smaller, your multiplexer is much smaller, your array is much smaller, so your access times are much smaller. Very good question — there's no magic; we built up to all of these. Yes — that's a great question too, and I'm not going to fully answer it, but it's an important question: what is stored in the registers versus what do you put into memory? That's the job of a good programmer, essentially. Everything can be in memory; if you decide not to use the registers, your program will be terribly slow. Today's compilers do register allocation: they do some locality analysis of your program, try to figure out automatically which variables should be stored in the registers next to the processing unit, and do the register allocation — if you're programming in a nice high-level language with a nice compiler. If you're programming in assembly, you're on your own; you'd better do the register allocation yourself. These issues will come up in the memory hierarchy again and again: what goes into this cache, what goes into that cache, who brings the data from one cache to another — we're going to see that in lectures 20 and beyond, but we're getting there; we just introduced the registers. So, these are the two key characteristics of the von Neumann model that you should never forget.

Okay, now let's start building a von Neumann machine, and I'm going to start with LC-3, because I think it's easier, and the Patt and Patel book does a great job of explaining how it operates — for this part of the discussion I would definitely recommend the Patt and Patel book. So this is another von Neumann machine — a lot of its components are von Neumann components, and basically all the machines we have today are von Neumann machines. This is the LC-3 von Neumann machine, and you can identify its different parts. This is the control unit, and these are the control signals. Just to acclimate you to what is coming next: data is marked with black-edged arrows here, and control is usually marked with white-edged arrows, so that you can distinguish between control signals and data. This is the memory — 2^16 locations, with 16 bits at each address — and you can see the MAR and MDR here, the memory address register and memory data register. You have a keyboard and you have a monitor, with dedicated registers for those that we're not going to go into, but you can read about them in the book. There's an ALU with two inputs and one output.
There's a register file — the fast temporary storage, eight general-purpose registers — and you can read two registers concurrently from the register file. There's an instruction register, as we have discussed; it stores the current instruction that's being processed. There is a program counter; it stores the address of either the current instruction or the next instruction, depending on where you are in the instruction cycle. There is the finite state machine that generates the control signals; it takes the instruction register as input, looks at the bits of the instruction, and says "I need to orchestrate things such that this add gets done" — we're going to see some of that. There's a clock; this is a synchronous machine, as we have discussed in the past. One of the control signals that's generated is the ALU operation: what should the ALU do to process this instruction? And then there's another thing, the tri-state buffer — if you remember tri-state buffers from the combinational logic lecture — which is used to place the result coming out of the ALU onto this big thing called the system bus, the processor bus in this case, so that the result of the ALU can potentially be put into memory, for example, or into the register file. There's a signal coming from the control unit that controls the tri-state buffer, so that you generate the control signal at the right time, when the result is ready, and the result gets put where it needs to go. We're going to see all of this soon; and that's the processor bus. So we're going to build up to this machine, but first I'm going to give you examples of how instructions are processed.

We've already discussed this: instructions and data are stored in memory, and typically the instruction length is the word length. The processor fetches instructions from memory sequentially: it fetches one instruction, decodes and executes the instruction, and continues with the next instruction after that. The address of the current instruction is stored in the program counter, which gets incremented at some point while the instruction is being processed. If you have a word-addressable memory, the processor increments the program counter by one, because an instruction is one word, and you keep incrementing the program counter that way. If you have byte-addressable memory, the processor increments the program counter by the instruction length in bytes — we're going to see increments by four, because your addresses refer to one byte each and each instruction is four bytes, so if you want to go to the next instruction you increment the address by four, as opposed to by one. There are also some interactions with the operating system that we're not going to talk about right now — there should be a way for the system software to say "start this program" to the machine; the system is actually more complicated than what we're going to look at today. But the operating system, for example, when it starts the machine, when it sets it up after reset, points the program counter to a particular location — in this case the particular address where the MIPS designers decided the program should start.

Let me give you this and then we'll take a break. This is a sample MIPS program: you have four instructions stored in consecutive words in memory; it's essentially assembly code.
If you take the MIPS ISA, the MIPS ISA specifies what these are: this is a load word instruction, this is an add instruction, this is an add immediate instruction, and this is a subtract instruction. We're going to see a lot of these; there's no need to understand this program now — we will get back to it. But this program is not what the machine sees. What the machine sees are these encoded instructions, so there needs to be a way of encoding this, because the assembly text is not what you store in memory: in the memory that we designed, we store bits, zeros and ones — there's no "lw," no "$t2," none of that. There's an encoding that needs to happen so that in memory we have the sequence of instructions. The instruction set architecture — that 4,000-page book — in part specifies what that encoding is: if you want to specify a load word or an add, you encode it this way. We're going to see that encoding in the later part of this lecture, and this is important. Someone needs to do that encoding: in the past, humans did it; today, compilers do it, because it's easily automated. In memory, this is what you see — this is what MIPS memory looks like; these are the instructions, with lower meaning lower address. In this particular case I show you byte addresses, so you can see that the first instruction is at this address, the next instruction is at that address plus four, the next instruction is at that address plus eight, and the next one is plus four from the previous one, essentially. And you can see what we store in memory: each hex digit is four bits, and you have eight of them — 32-bit instructions. If initially the program counter is set here, the machine fetches this instruction, puts it in the instruction register, and executes it; once it's done, the machine goes to the next program counter value — meaning it increments the program counter — fetches the next instruction, puts it into the instruction register, decodes it, and executes it, and it keeps doing that. Okay, this is a good time to take a break; let's be back when the bell rings, and then we're going to see the instruction processing cycle, and then hopefully the entire ISA soon.

[Break]

Okay, let's get started with the second part of the lecture. This was the slide where we finished. Now we're going to talk about instructions, and we're going to get into the details of each of these instructions that may be of interest; we're going to talk about the instruction processing cycle, and we're going to map these instructions a little bit onto this picture of the von Neumann machine that I showed you earlier. I should say that this is one particular implementation, or microarchitecture, that implements the LC-3 ISA. We're going to go a lot more into the depth of this and some other microarchitectures later on, but this is one way of implementing things — it's not the only way; there could be thousands of different ways of implementing things. This is a nice, clean way that is relatively simple; we're going to get to different ways of implementing things later on, starting next week.

Okay, now let's talk about these instructions. An instruction is the most basic unit of computer processing. Instructions are the words in the language of a computer, and the hardware understands these instructions.
The instruction set architecture is really the vocabulary of that language. I could imagine different ways of designing this: like any other computer language or human language, there are many different ways of communicating information. The language of the computer can be written as machine language — this is the computer-readable representation, ones and zeros, which is basically what I showed you over here; these are not shown as ones and zeros, but they're grouped into four bits each — this is the hexadecimal notation — so they're basically grouped ones and zeros. And then there's the assembly language, which is the human-readable representation — essentially something that looks like this. Well, not as human-readable as a higher-level language: you can become more and more human-readable, and at some point you don't even need to write — you just speak, and things get translated into these representations. Today we're at a point where that is possible, except maybe it's not perfectly good yet. So think about that.

We will study LC-3 instructions and MIPS instructions. It's good to study different instruction set architectures because the fundamental principles are similar, but I don't want you to walk away with just one ISA: it's good to know different ways of doing things, because by contrasting some simple differences you can understand the trade-offs much better. If you just studied one architecture, you might not be able to see some of the trade-offs that I would like to highlight. A lot of things will be similar, though; the principles are essentially similar in all the other ISAs as well, like x86, ARM, and RISC-V, which are the dominant ISAs that exist today.

Okay, so instructions are made of two parts: opcode and operands. This is very fundamental. What is an opcode? An opcode essentially specifies what the instruction does or should do — operation code, in other words. Operands specify what the instruction should do it to — basically, what are we operating on? Both are specified in the instruction format, or instruction encoding, and we're going to see that specification. For example, an LC-3 instruction encoding consists of 16 bits — that's the word size in LC-3 — and each bit means something, as we will see. This is one example, a 16-bit representation: this is the opcode, and 0001 is the opcode for an ADD. How do I know that? I look at the LC-3 (or LC-3b) ISA manual, and it says that if I want to write ADD in machine code, the top four bits — the opcode — should be 0001. That way, some hardware designer can design hardware that decodes it: once you use a decoder, a 4-to-16 decoder, you can figure out which instruction it is — and we have built these components. Then the next three bits specify a register, the next three bits specify another register, the next three bits should be zero, and the last three bits specify another register. So this is one instruction in LC-3, and we will see why it looks this way. Basically, the opcode is specified in the top four bits, and there are 16 distinct opcodes in LC-3, because with four bits you can specify 2^4 distinct values — we know this; it's bread and butter for you by now — and if you want to decode it, you use a decoder, which is also hopefully becoming bread and butter for you. Bits 11 through 0, the remaining bits, are used to figure out where the operands are, and we're going to call these addressing modes: basically, how you figure out where your operands are — where your source registers are and where your destination register is.
In this particular case, this ADD specifies — because these four bits say ADD — that you're going to do an add. It turns out that because this bit is zero, these three bits are interpreted as a source register; we'll get back to that. If this bit were one, these five bits would be interpreted as an immediate value instead of a register. So this is one of your source registers, this is another source register, and this is the destination register. This encoding specifies that we should do an ADD, and we should add the value in R6 to R2 and store the result in R6 — these are general-purpose registers. That's what this encoding specifies; you just need to build the machine, build the logic that does it, and we're going to build up to that and give you the key parts today.

Okay, so that was an ADD, but there are different instruction types, and there are three main types of instructions: operate instructions execute operations in the ALU; data movement instructions read from or write to memory — also, implicitly, operate instructions use registers, because registers are a critical component sitting right next to the ALU — and control flow instructions change the sequence of execution. Hopefully we're going to look at all of them today, and we'll go into more detail later.

So let's start with some example instructions. This is an example operate instruction. The high-level code — this is not an instruction yet — is a = b + c: you could use your favorite programming language, declare your variables, and say a equals b plus c. The assembly could look like this: add a, b, c, where a is the destination, b is one operand, and c is another operand. This is not exactly low-level assembly, because the register allocation hasn't been done yet; a, b, c are still variables. "add" is called the mnemonic, indicating the operation to perform; b and c are the source operands and a is the destination operand. That's the semantic specification of this assembly. Now you need to map the variables to registers — this is the assembly in LC-3 after somebody does that register allocation; a human can do it, a compiler can do it. Let's do it as humans: I decide b should be register 1, c should be register 2, and a should be register 0. MIPS has a slightly different way of naming registers, but it's essentially the same: $s1, $s2, $s0. So now we've mapped the variables to registers, and this is what the LC-3 instruction looks like after the mapping: ADD R0, R1, R2. R0 is the destination — depending on the convention, the destination can be on the left or on the right, but our convention is destination on the left. Let's look at the field values: if you want to specify this, you go to the manual, and the manual says the ADD opcode is 0001; since we're adding two registers, this bit should be zero and these two bits should be zeros as well — I don't know how strict the manual is on that, but that's how it is — and this is source register two, which is 2, this is source register one, which is 1, and this is the destination register, which is 0. So if you look at the instruction encoding, these are exactly the 16 bits — each bit is put into its place; that's the machine code, essentially. So this is our encoding. You can also write it in hexadecimal, and you put it in memory; when the program counter gets to it, that instruction gets executed — the program counter is the address of the instruction to execute, if you remember what I said earlier.
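Here is a small Python sketch that packs those LC-3 ADD fields — opcode in bits [15:12], DR in [11:9], SR1 in [8:6], bit 5 = 0 for register mode, bits [4:3] = 00, SR2 in [2:0] — into a 16-bit machine word:

    # Pack the fields of LC-3's register-mode ADD into a 16-bit instruction word.
    def encode_lc3_add(dr, sr1, sr2):
        opcode = 0b0001
        return (opcode << 12) | (dr << 9) | (sr1 << 6) | (0 << 3) | sr2

    word = encode_lc3_add(dr=0, sr1=1, sr2=2)   # ADD R0, R1, R2
    print(f"{word:016b}")                       # 0001000001000010
    print(hex(word))                            # 0x1042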
The program counter, remember, is the address of the instruction to execute, if you recall the earlier things I said. Essentially, this instruction specifies that an addition should be done on register one and register two and the result should be stored in register zero. I'm going through this relatively slowly because we're going to do more of these soon. So that was our instruction format; basically, this is our instruction format and encoding. The LC-3 operate instruction format is actually very general. You have an opcode in the top four bits, bits 15 through 12, which are the most significant bits; then you have the destination register, then the source register, then three bits that are always set to zero in this form, and then the second source register over here. OP is the opcode: what the instruction does, or should do. For example, the ADD opcode happens to be 0001 because whoever designed the LC-3 ISA said it should be so, and the AND opcode is 0101. SR1 and SR2 are the source registers and DR is the destination register. So the semantics of ADD is that DR gets the value of SR1 plus SR2: the data value in SR1 gets added to the data value in SR2, and the result gets stored into the destination register DR. The semantics of AND is very similar; it's not an add, it's a logical bitwise AND operation. And this is the example we actually looked at: in this case you do an add of R2 plus R6 and the result goes to R6, so the source and destination register can be the same. You can overwrite the source register after you produce the result. So this is ADD in MIPS, and it's going to be very similar: you add $s1 to $s2 and store the result in $s0. These are the field values; they look a little longer, so let me look at the machine code directly. What this does is, basically, RD, the destination register specified over here, gets RS plus RT. It turns out $s0, $s1, and $s2 are registers 16, 17, and 18; there's a mapping there that is done, but don't worry about that right now. This instruction specifies that you should add register 17 to register 18 and store the result in register 16. This is the machine code, the instruction encoding. The opcode is zero, and this funct field is actually very important because it's an extended opcode. In MIPS things are a little more complicated, as you can see. Instructions are 32 bits long, and there are more general purpose registers, so you need more bits to specify them, five bits each; you need 15 bits to specify two source registers and one destination register. Your opcode, as you can see, is six bits, and then there's funct, the function field, which is also six bits. It turns out that's an extended opcode; you need more operations in a real ISA. And that's the encoding. This is the R-type instruction format: you have three register operands, zero is the opcode, RS and RT are the source registers (that's where they're placed in the encoding because somebody decided it should be so), and RD is the destination register. Shamt you don't need to worry about; it's a shift amount used only for shift operations, and you may see it later on. Funct is very important: it specifies the operation. R-type instructions are basically operate instructions that work on register values. Zero is always the opcode, and funct is the extended opcode that tells you what you really should do.
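The MIPS R-type packing is the same idea with more and wider fields. A small sketch, assuming the standard R-type layout (opcode in bits 31 to 26, rs in 25 to 21, rt in 20 to 16, rd in 15 to 11, shamt in 10 to 6, funct in 5 to 0) and the register numbers mentioned above ($s0 = 16, $s1 = 17, $s2 = 18); the funct value 0x20 for add is not on the slide, it is my assumption from the standard MIPS encoding.

    # Pack add $s0, $s1, $s2: rd <- rs + rt; opcode 0, funct is the extended opcode
    op, rs, rt, rd, shamt, funct = 0, 17, 18, 16, 0, 0x20
    instr = (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct
    print(hex(instr))   # 0x2328020, i.e. 0x02328020 as a 32-bit word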
An opcode of zero doesn't mean anything by itself, in a sense. Zero means you should do something to these registers, and funct says what you're actually going to do, which makes it effectively an opcode too. Okay, we're going to look at that later on. So with operate instructions such as addition, we tell the computer to execute arithmetic or logic computations in the ALU; we already said that. But we also need instructions to access operands in memory. If everything were in registers you wouldn't need anything to access memory with, but registers are small, and you need large amounts of memory for storing data. So we need to load data from memory into registers and we need to store data from registers to memory, and we need instructions to do that; so far we have not looked at those. How to read, or load, from memory is what we'll look at now; writing is performed in a similar way, just in the exact opposite direction, and we will talk about it later. So let's take a look at load word. This is the high-level code, and it's an array operation: you have an array at base address A, and A[i] is the i-th element of the array. How many people have programmed with things like this? Okay, good. How many people have not? Okay, I'm not surprised; you should have programmed with things like this. So you're basically accessing the i-th element of this array, and I'm assuming the array is stored in memory; assume it's a huge array. Later we will see models where things can be stored in registers, like GPUs. Some folks had questions about the GPU execution model; wait until lecture 16 or so, or actually even later. Okay, so this is the assembly. The destination is the variable a, and I have a base address A and an index i. This assembly says: load from that base address, calculate an address that indexes into the actual location in the array starting from that base address, get the data from there, and put it into the variable a. Then we're going to map the variables to registers. LW is the mnemonic indicating the load word operation; it's a load word, meaning get a value of size word, which is 32 bits in MIPS. A is the base address and i is the offset; this is also called an immediate, or a literal, and it's essentially a constant. And a is the destination operand, the destination variable. The semantics look like this; I basically already said it. So let's take a look at how this is encoded in LC-3 and MIPS. The LC-3 assembly looks like this, and in this case the index is 2. We have a load register, LDR, opcode; this register, R0, stores the base address A; the number 2 is encoded in the instruction, a constant encoded in the instruction specifying the index; and this is the destination register. And these are the semantics: basically you take the value in R0, which is hopefully A, add 2 to it, calculate the address, go to memory, get the data, and place the data into register R3. If you look at the MIPS assembly, it's going to be very similar; it's essentially the same thing, the same high-level code translating into a load word instruction, just with slightly different syntax. What we do is this: the base address is in $s0, we get the value from there, add 2 to it, access memory, get the data, and put the result into the destination register $s3. This is assuming that MIPS is word addressable. I'm going to look at a byte-addressable version of it, where we're going to multiply that 2 by 4; basically that's what's going to happen when memory is byte addressable.
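As a tiny sanity check on the word-versus-byte point, here is a sketch, assuming word-sized (4-byte) array elements, a byte-addressable MIPS, and a word-addressable LC-3.

    i = 2                       # we want A[2]
    lc3_offset  = i             # word addressable: the offset is the element index  -> LDR R3, R0, #2
    mips_offset = i * 4         # byte addressable: scale by 4 bytes per word        -> lw  $s3, 8($s0)
    print(lc3_offset, mips_offset)   # 2 8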
If you're byte addressable, then, assuming your elements are word sized, each element is four bytes, right? Okay. So these instructions actually happen to use a particular addressing mode; we're going to study addressing modes later today or tomorrow. The way the address is calculated, the addressing mode, is also specified by the instruction set architecture: it tells you how to calculate the address. In this case we calculate the address using an addressing mode called base plus offset: we have a base register and we add an offset to it, and the offset happens to be an immediate, meaning a constant specified in the instruction. Okay, now let's take a look at load word in byte-addressable MIPS. This is the high-level code. So basically, in the MIPS assembly of this code, assuming you have word-sized elements in this array and your memory is byte addressable, in order to access element two you need to multiply the offset by four: 2 times 4 is 8, so my offset is going to be 8 over here, and that's how I access the second word in a byte-addressable memory. Basically, the byte address is calculated as the word address times the number of bytes in a word, and the number of bytes in a word in MIPS is four. We already discussed that if LC-3 were byte addressable, you'd have two bytes per word and you would multiply by two; you will see that later on. Okay, now let's encode this. These are the instruction formats with an immediate for LC-3 and MIPS, and as you can see they look very similar. I'm looking at the byte-addressable version of MIPS because MIPS really is byte addressable; that's how the ISA is defined. LC-3 is word addressable; that's how that ISA is defined. And these are the field values. Now you can see this is a different kind of format. The opcode in LC-3 is 6, which specifies LDR; the destination register is 3, the base register is 0, and there is a six-bit offset field over here. With a six-bit offset field you can easily encode a 2, and you get 2 over here. So this is the encoding, and basically we use all of the bits. In MIPS it looks like this: you have the opcode, which happens to be 35; you don't have a funct field, because the opcode alone tells you what this is; and the rest of the instruction means you have a source register and a destination register. $s0 happens to be register 16 and $s3 happens to be register 19 in the MIPS encoding, and the immediate is huge, 16 bits as you can see, and it holds the 8. So this is the encoding of this instruction. It's called an I-type, immediate-type, instruction, which means it has a 16-bit immediate value: half of the instruction is dedicated to encoding a constant. Does that make sense? Okay, we're going to see more of these soon; we're just studying the encoding right now. So hopefully this gives you an idea of how instructions are encoded. We're going to see more of these encodings, but before we do that, let's see how these instructions are processed a little bit, because we're going to tie the encoding to the processing soon. We're going to introduce the instruction processing cycle, or instruction cycle. This is not to be confused with the clock cycle; it has nothing to do with the clock cycle. It's really how instructions are processed in the machine: how do these instructions that we encoded get executed? Now we can speak the language of the computer, because we encoded things in a way the computer can understand, and the computer can decode them using a lot of decoders, which you have seen. So we know how to tell the computer to do things.
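To tie those field values together, here is the same packing sketch for the two loads (illustrative; LC-3 LDR with its 6-bit offset, MIPS I-type with its 16-bit immediate, field values as on the slide).

    # LC-3: LDR R3, R0, #2   -> opcode 6, DR 3, BaseR 0, offset6 2
    lc3  = (0b0110 << 12) | (3 << 9) | (0 << 6) | 2
    # MIPS: lw $s3, 8($s0)   -> opcode 35, rs 16 ($s0), rt 19 ($s3), imm 8
    mips = (35 << 26) | (16 << 21) | (19 << 16) | 8
    print(hex(lc3), hex(mips))   # 0x6602 0x8e130008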
We can tell it to execute computations in the ALU, for instance by using an addition instruction, and to access operands in memory by using the load word instruction. But how do these get executed on the computer? The process of executing an instruction is called the instruction cycle, or instruction processing cycle. I like the second name better, instruction processing cycle, because you are processing an instruction; it's called a cycle because you keep doing it repeatedly. So basically every instruction goes through a sequence of steps, or phases, to be executed, and in the general case these are: fetch, decode, evaluate address, fetch operands, execute, store result. These phases may not all be needed; not all instructions require all six phases. Every instruction needs to be fetched: you have a program counter, you fetch the instruction, and you get the data bits of the instruction. Then you need to decode the instruction to figure out what it is telling the machine to do. Then you need to evaluate the address so that you can get the operands, then you fetch the operands, you execute the instruction by doing something to the operands, and then you store the result. But not all instructions require everything. For example, load register does not require execute: you fetch the operand from memory and put it into the destination register, and there's no operation you really do in the ALU for that, as you will see later on. And ADD doesn't require evaluating a memory address; here, evaluate address implicitly means a memory address. At least in LC-3 and MIPS, you don't have ADD instructions that directly operate on memory; all of them operate on registers. This is good to keep in mind, and it's also good to keep in mind that this is not the case in every single ISA in the world; people have made different things. x86, for example, has operate instructions like ADD that operate on memory values as well as register values. In x86 you can specify: I want to add the data value in this memory location to this register, and store the result back into a memory location. This gives you better expressibility. You don't have to do things only in registers and then put the value back into memory; you can operate directly on memory locations without bringing the data into the registers, and there can be value to this, as we will see in later lectures. We're not going to dwell on ISAs like this, and you're not going to implement ISAs like this in your labs, but this is an example instruction in x86 that says: eax is a register, and if you use it as an address you access that memory location, get the data value in that memory location, add to it the value in the edx register, and then store the result back into the memory location addressed by eax. So the source and the destination are the same memory location in this case. Again, you don't need to understand this fully; the point is that there's no single way of doing this. The benefit of operating directly on memory locations is that you don't need to waste a register. Suppose you just need to add something to some location in memory and you don't need that value again. Remember, registers are few and precious; you want to keep things that are used over and over in your register file. If you just want to increment some random location in memory once in a blue moon, because you're keeping a count somewhere, just do it in memory; don't bring it into a register. So there are good reasons why people add these kinds of instructions to ISAs, but they do complicate the ISA, and they do complicate the implementation in the end.
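Before we walk through each phase, here is a small sketch of which phases do real work for the two instructions we have so far, using the phase breakdown exactly as described above.

    phases = ["fetch", "decode", "evaluate address", "fetch operands", "execute", "store result"]
    # 1 = the phase does real work for this instruction, 0 = it is skipped
    needs = {
        "ADD (LC-3)": [1, 1, 0, 1, 1, 1],   # no memory address to evaluate
        "LDR (LC-3)": [1, 1, 1, 1, 0, 1],   # no ALU execute; the add is only for the address
    }
    for name, mask in needs.items():
        print(name, "->", [p for p, used in zip(phases, mask) if used])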
Okay. So after you store the result, fetch happens again; basically, that's why this is called a cycle. Now we're going to go through each of these stages. Does that sound okay? You're already having fun, I can tell. Okay, so the fetch phase. This phase obtains the instruction from memory and loads it into the instruction register. I kind of said that earlier, and every instruction has to go through this. The complete description looks like this, and I'm going to show you a picture also. You load the memory address register with the contents of the program counter, which is the address of the instruction, as you know, and you simultaneously increment the program counter, because you're going to go to the next instruction at some point later. Then you access memory, or interrogate memory, as Patt and Patel put it, and memory returns the result into the MDR. By interrogating memory, what the memory does is take the address in the memory address register, access itself, and put the data into the MDR. What is that data? It's the contents of the instruction, because we're accessing memory with the program counter. Then you take the data that memory put into the MDR, the memory data register, and put it into the instruction register. Does that make sense? So there are three steps to get the contents pointed to by the program counter, which is the address of the instruction that you want. Eventually, after these three steps, the instruction at that program counter, at that memory location, gets placed into the instruction register. Now you have the full contents of the instruction: you've fetched the instruction. What we have done, in other words, is interpret the program counter as the address of an instruction. Because we're accessing memory in the fetch phase using the program counter, and putting the data that comes out of memory into the instruction register, we're interpreting the data at that location as an instruction. This will hopefully become clearer later on, because we're going to access memory in another phase as well, and there we're not going to put the data into the instruction register. The control unit puts the data value coming out of memory into the instruction register because we're in the fetch phase of execution, and so we're treating that data value in memory as an instruction. Okay, let's take a look at how this is done in this high-level picture of the LC-3 microarchitecture. Step one: you load the MAR and increment the program counter. What do you load the MAR with? You basically take the program counter; there's a program counter over here, and the control signals need to be set accordingly in this step, so there needs to be a finite state machine controlling this, as we will see soon. The finite state machine says: I'm fetching the instruction, I'm in the fetch phase, so I'd better set the gatePC signal to enabled so that the value in the PC gets gated onto the bus (remember the tri-state buffer), and that value flows on the bus; and I'd better set the enable signal, the loadMAR signal, to one, so that whatever flows on the bus gets written into the MAR register. LoadMAR means write enable; it's essentially a write enable signal for the MAR. So now we've put together all of the concepts we've seen earlier: we have a register, the PC, we gate its value onto this wire, the bus, using a tri-state buffer, and we write it into the register where we want the address to be placed, the MAR. That could take one clock cycle, but don't worry about clock cycles right now; these are steps.
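Before going through the remaining control signals, here is a toy sketch of those three fetch steps end to end on a made-up machine state (word addressable for simplicity, so the PC increments by 1; LC-3b, being byte addressable, increments it by 2, as we will see).

    state  = {"PC": 0x3000, "MAR": 0, "MDR": 0, "IR": 0}
    memory = {0x3000: 0x1042}             # our encoded ADD R0, R1, R2 lives here

    state["MAR"] = state["PC"]            # step 1: MAR <- PC ...
    state["PC"] += 1                      # ... and concurrently increment the PC
    state["MDR"] = memory[state["MAR"]]   # step 2: memory returns the word into the MDR
    state["IR"]  = state["MDR"]           # step 3: IR <- MDR; these bits are now treated as an instruction
    print(hex(state["IR"]))               # 0x1042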
Then there's another step where we access memory. Accessing memory basically works like this: memory takes the address in the MAR, which is essentially the program counter's value, accesses itself using the decoder (all of those bits) and the multiplexers at the end, and the data that you need, the instruction word, eventually comes out in the MDR. Of course you need to set the loadMDR signal to one so that the MDR gets written. The third step, once that is done, is to load the instruction register, which happens to be over here, with the contents of the MDR. To be able to do that we need a path from the MDR to the IR, and that path already exists. The finite state machine, and you can think of these as three states in a finite state machine, state one, state two, state three, in this state enables the gateMDR signal, sets it to one, such that the value in the MDR goes onto this large bus that we call the processor bus; it also write-enables the instruction register, or loadIR as the book calls it, such that the data value coming off this processor bus gets written into the instruction register. So now hopefully you've connected the dots: everything we have seen previously enables what we have done over here. We're now able to fetch an instruction using both sequential and combinational logic and a finite state machine, one I didn't show you, but you can imagine it, and I'm going to show it to you later on. In one state you do this, then you move to the next state; in the next state you do this, and once the memory says it's done you move to the next state; and in that state the finite state machine does this. So after three states you get the instruction that you want. All of this is done using sequential and combinational logic together, but of course someone needed to design the machine to do that, too. Sounds good? Okay, so that was fetch; we're going to see more of this later on. Let's talk about decode. Now you've fetched the instruction and its contents are in the instruction register. What does the decode phase do? Decode means we decode the instruction: we identify what the instruction is and what we should do afterwards. "What we should do afterwards" means it also generates the set of control signals to process the identified instruction in the later phases of the instruction cycle; basically, it determines your next states in the finite state machine. If it's an add, I execute this sequence of states; if it's a multiply, I execute this other sequence of states; if it's a load, I execute yet another sequence of states in the finite state machine. We're going to see that finite state machine. So basically we're going to use a decoder. In this case we have a four-bit opcode in LC-3, and a 4-to-16 decoder identifies which of the 16 opcodes is going to be processed. The input is four bits, the top four bits of the instruction register, because we've already fetched the instruction from memory, and the remaining 12 bits identify what else is needed to process the instruction, based on those four bits. So let's take a look at decode. We're not going to see a whole lot here, but basically we identify the instruction to be processed.
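Decode itself is not much code. Here is a sketch of pulling out the opcode and of the one-hot output a 4-to-16 decoder would produce.

    ir = 0x1042                    # the fetched ADD R0, R1, R2
    opcode = (ir >> 12) & 0xF      # top four bits, IR[15:12]
    one_hot = 1 << opcode          # a 4-to-16 decoder raises exactly one of its 16 output lines
    print(opcode, f"{one_hot:016b}")
    # The remaining twelve bits (ir & 0xFFF) say what else this opcode needs.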
Essentially, the top four bits of the instruction register go through a decoder and feed this finite state machine that we're going to see later. Decode also generates a set of control signals to process the instruction later; for example, it says, I should go through these states next. We're going to see this more, but that's decode: identifying the instruction and generating the control signals for what should happen next, in the future, because I have fetched this instruction. Okay, fetch and decode. You can see there are a lot of control signals over here; we will demystify this picture later. I don't want to go into a lot of microarchitectural detail right now, but you will see a lot of this later, and you'll see different kinds of control logic. The decoder, hopefully you remember, holds no magic: a 4-to-16 decoder identifies which instruction you're executing. And remember, I said earlier that the input pattern could be the instruction in the program, and the processor needs to decide what action to take based on the instruction opcode; hopefully that's much clearer now. So this is the full finite state machine. I don't want to scare you, but this is actually LC-3b, which is byte addressable, and that makes things slightly more complicated. You can find this online; we should also upload it to the Spring 2025 website if we haven't done so. But just to show you the stages that we talked about: this is the fetch phase. If your finite state machine is here, what you do is generate the control signals such that the PC gets loaded into the memory address register, and concurrently (I forgot to say this on the picture, sorry) you increment the PC by two, because you're going to go to the next instruction at some point later. You could do this somewhere else, too, but they decided to do it in this state. Then you unconditionally jump to the next state; remember, in finite state machines we have the next-state logic, and after one clock cycle you unconditionally go to the next state. In the next state, memory uses the memory address register to access data from itself and places the data into the memory data register, and the machine stays in that state until the ready signal is asserted. The memory may not be ready, because a memory access can take a long time; it's actually an analog plus digital process, as we will see later on. Memory asserts a ready signal once the data is placed into the MDR, and you go to the next state only when memory does that. In the next state, the memory data register gets loaded into the instruction register. So this is our fetch phase, and this finite state machine controls all of those operations. As you can see, again, there's no magic: we've seen finite state machines, we've seen logic; this is the controller that we have. The next phase is decode. What's happening over here is that, again, you unconditionally go to the decode state. There's some stuff there we're not going to talk about right now, but based on IR[15:12] over here, you decide where to go in the finite state machine. The next state is determined by the opcode, and each of those opcodes has a different thing to do. For example, if the opcode is ADD, in the next state you do this: you add SR1 to SR2, store the result in DR, and then you go to state 18, which happens to be the fetch phase. So you go to the next instruction, basically; you've already incremented the PC, and you fetch the next instruction. Does that make sense? So this is the operation of an entire LC-3b machine, which is a simple machine.
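One way to picture that controller in software is as a next-state table keyed by the opcode. This is only a cartoon of the diagram; the handler names and empty bodies are mine, for illustration, and the real per-opcode state sequences are the ones in the LC-3b state diagram.

    def do_add(state):   # SR1 + SR2 -> DR, then back to the fetch states
        pass

    def do_ldw(state):   # MAR <- Base + offset, MDR <- MEM[MAR], DR <- MDR, then fetch
        pass

    dispatch = {0b0001: do_add, 0b0110: do_ldw}   # keyed by IR[15:12]
    opcode = 0b0001
    dispatch[opcode]({})    # decode chooses which sequence of states runs next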
You can describe existing machines the same way: a processor is also a finite state machine. It's a much more complicated finite state machine, but it is a finite state machine. Okay, so hopefully that is clear. The next phase in execution is the evaluate address phase. This computes the address of the memory location that is needed to process the instruction. This phase is necessary in LDR because LDR requires access to memory, and you need to evaluate the address you're going to access. So in LDR you need to compute the address of the data word that is to be read from memory by adding an offset to the contents of a register, the base plus offset addressing mode, if you remember. This is not necessary in ADD, so ADD doesn't have a state in the finite state machine for this. If you look at the finite state machine, ADD doesn't evaluate an address; it just takes SR1 and SR2 and does something. Whereas LDR (this is testing my eyes now; this is LC-3b, so there's an LDW, which is the same as LDR) does do some address evaluation: this is the evaluate address state. So that phase is not necessary for every instruction. Let's take a look at this in the microarchitecture. LDR calculates the address by adding a register and an immediate to each other; this is the specification. If you look at it, you have a base register and you have a sign-extended offset. We're going to see this more and more later, but basically you need to take the register value from one of the registers, add to it the bottom six bits coming from the instruction register, and put that into the MAR, so that you can access memory; we're going to access memory with that. So I've added that adder, because you need it. In this particular case it's simple; that adder will become more and more complicated, and we will see that in the next lecture, but this is not magic either: you can imagine building this, and it's not that hard to build. Okay, let's talk about fetch operand. The fetch operand phase obtains the source operands needed to process the instruction. In LDR, now that we've calculated the address, you load the MAR with the address calculated in the evaluate address phase and you read memory, which places the source operand in the memory data register. In ADD, you obtain the source operands from the register file. In some microprocessors, the operand fetch from the register file can be done at the same time you're decoding the instruction; that's also possible. This custom LC-3 microarchitecture from Patt and Patel doesn't do that, but we're going to see it later on, when we build pipelined processors, or when we build one of our own processors; you will see it. Okay, let's take a look at fetch operand quickly. In LDR, step one is to load the MAR with the address we calculated earlier. Let me jog your memory: we calculated the address over here, and that address needs to get into the MAR, so in the finite state machine you need to set the control signal loadMAR to one so that the address you calculated gets loaded over here. Then you do the memory access, which is very similar to the fetch phase, except it's a different state, because now we're treating memory as a source of data that will be used as an operand, as opposed to a source of an instruction to be placed into the instruction register. So now the data comes into the MDR, and it will be placed somewhere later on; we'll talk about that.
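Here is a small sketch of those two phases, evaluate address and fetch operand, for our LDR, including the sign extension of the 6-bit offset that the specification calls for (toy word-addressable memory; the base address and the data value are made up).

    def sext6(x):
        # sign-extend a 6-bit field: if bit 5 is set, the value is negative
        return x - 64 if (x & 0x20) else x

    regs = [0] * 8
    regs[0] = 0x4000                  # base address A sits in R0
    memory = {0x4002: 0xBEEF}         # the array element A[2]

    mar = regs[0] + sext6(2)          # evaluate address: MAR <- R0 + SEXT(offset6)
    mdr = memory[mar]                 # fetch operand:    MDR <- MEM[MAR]
    print(hex(mar), hex(mdr))         # 0x4002 0xbeef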
The execute phase executes the instruction: in ADD it performs an addition in the ALU, in XOR it performs an XOR in the ALU, and so on. Let's take a look at ADD. ADD adds SR1 and SR2. In this state you set the control signals such that you access the register file and the data goes through the ALU. You have SR1 and SR2, there are control signals that make sure both of these registers go to the inputs of the ALU, and the ALU operation is add; these are all control signals. The result comes out of the ALU, and that result needs to be gated, meaning this tri-state buffer called gateALU should be enabled, such that the result gets put onto this wire, the processor bus, and later, in the next phase, it gets stored into the register file. Make sense? I'm kind of breaking up the phases here, but these may actually be completed in a single cycle. These phases have nothing to do with clock cycles, as I said earlier: multiple phases can be collapsed into a single clock cycle, and a single phase can take many clock cycles; fetch itself takes at least three clock cycles, right? But here, execute and store result actually happen in the same clock cycle for ADD instructions. Okay, store result. This phase writes the result to the designated destination, and once store result is completed, a new instruction cycle starts with the fetch phase. So let's take a look at store result. ADD loads the ALU result into DR. The ALU result was gated onto the processor bus, and then you need to set the control signals such that you write-enable the destination register. You have the destination register ID coming from the instruction register, based on the encoding, if you remember; so there's no magic again. The destination register ID is encoded in the instruction register, it gets connected to the register file, and the register file gets write-enabled such that you write the data coming off the processor bus into the register specified by the instruction. For LDR, you load the data that was placed into the MDR, the memory data register: you set the control signals such that you gate the memory data register, its tri-state buffer, so the MDR value gets onto the processor bus, and you set the load register signal accordingly such that you write-enable the register file, with the destination register again coming from the instruction register bits over here. So you need to set the control signals accordingly such that the MDR gets into the destination register. We're going to see different ways of doing that later on, but now I've given you a complete instruction cycle for multiple types of instructions. Different instructions clearly do different things; we've looked at add and load. Hopefully this makes sense; we're going to see more and more of this.
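As one last sketch, here are the execute and store result phases for ADD, pulling the register IDs out of the instruction register the same way the hardware does (16-bit registers, so the result wraps at 16 bits; the register values are made up).

    regs = [0] * 8
    regs[1], regs[2] = 5, 7               # SR1 = R1, SR2 = R2

    ir  = 0x1042                          # ADD R0, R1, R2
    dr  = (ir >> 9) & 0x7                 # destination register ID comes from the IR
    sr1 = (ir >> 6) & 0x7
    sr2 = ir & 0x7

    alu_result = (regs[sr1] + regs[sr2]) & 0xFFFF   # execute: the ALU adds the two sources
    regs[dr] = alu_result                           # store result: write-enable DR in the register file
    print(regs[0])                                  # 12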
Okay, how much time do we have? Let's see. We can still cover how to change the sequence of execution; there is enough time, it's important, and then there'll be a clean place to stop. So we've seen a couple of examples of instructions. What we have not seen is this: normally we execute instructions in program order, in sequential order, first instruction, second instruction, third instruction, and you keep incrementing the program counter, unless we change the sequence of execution. Control instructions allow us to do this: they allow a program to execute out of sequence. They can change the program counter by loading it during the execute phase, meaning there are instructions that operate directly on the program counter. The program counter is an architecturally visible register to the programmer. The programmer can say, "I'm going to put value X into the program counter," which means the next instruction to execute is going to be at address X. So you're very powerful as a programmer; as you can see, the program doesn't need to execute in sequence. Basically, that wipes out the incremented PC that was loaded during the fetch phase and loads, after the execute phase, whatever you're putting there instead. So let's take a look at an example: unconditional branch, or jump. In LC-3 it looks like this: JMP R2, where R2 is the register. What this does (this is the encoding) is: you have a base register, the base register has a value, and the program counter gets the value in the base register. Makes sense, right? Whatever value you put over there, you can jump to it, anywhere in memory; any place in memory can be construed and interpreted as an instruction, so you could have instructions all over the place. This is the register addressing mode, as we will define later on. There are variations of this. There's RET, a return, which is a special case of jump where the base register is R7; this is how you can return from a function. There's jump to subroutine, which is actually how you really jump to a function, as you will see later on, and there's a jump-to-subroutine-through-a-register form as well. Okay. So there's a jump in MIPS too, the unconditional branch or jump, and it looks like this; in MIPS it's actually a little bit different. If you look over here, the opcode is 2 (the opcode is the top six bits) and the rest is the target. How do you use that target? This is called the J-type instruction in MIPS; somebody defined it this way, not my fault, it's somebody else's definition. The target is essentially the target address, essentially where you want to jump, but the full program counter is computed like this: you take the top four bits of the incremented PC and concatenate to them these 26 bits multiplied by four. Why multiply by four? Because instructions are word aligned, they really are words, so the target is given in words, and multiplying by four gives you the byte address; then you concatenate the top four bits of the PC on top of it. This way you jump relative to where the program counter already is: your program counter is here, its top four bits don't change, and within that region you find a location to jump to. It could be anywhere within the region those 26 bits can reach, on the order of 2 to the 26 instructions. This is called the pseudo-direct addressing mode; we will see it later on, and it is close in spirit to a PC-relative addressing mode, since the upper bits come from the PC. There are variations like jump and link, used for function calls, that I'm not going to talk about. Okay, let me just show you one more thing and then we're done: how do you update the PC? Jump, for example, loads SR1 into the PC. There is a path from the register file that goes through a multiplexer, and you can set the loadPC signal over here; that way you can put a register value into the program counter. Okay, this is where I'll stop, and this is where we will pick up: hopefully we will go into more instructions and look at trade-offs in the ISA. I'll see you tomorrow.