Understanding ARM Cortex-M Load-Store Architecture

In our last video, we talked about the move instruction and how it's used in the ARM Cortex-M architecture. One thing we need to remember about the move instruction is that it can only access registers. The move instruction does not access memory. We need another way of doing that. And the way we access memory is with the load and store instructions.

It's said that the Cortex-M architecture is a load-store architecture because most instructions do not access memory. For example, an add instruction cannot directly read a value from memory and add it to something, and the result from an add instruction can't be put directly into memory. To access memory, either to get something from a memory location or to store something in a memory location, we have to use the load and store instructions. Now, the reason why we can't do this with a move instruction is that accessing a normal 32-bit address in memory would require embedding a 32-bit immediate value, the address, into an instruction. And since instructions cannot be longer than 32 bits, and some of those bits are needed to identify the instruction itself, that's just impossible.

So let's look at how a load instruction works. The load instruction uses a register as a pointer, so it gets the memory address from a register. Note that when a register is used as an address pointer, we enclose its name in square brackets when writing the instruction. In this example, the load instruction uses the contents of register 1 as an address, fetches what is stored in memory at that address, and then puts that into register 5. So when we execute this instruction, what we see is that register 1 is used as an address pointer and that the value in memory goes into register 5. So far so good.

Of course there had better be something meaningful at the specified address. In this example register 1 holds the address A000, so that's the memory address the processor will use to look for the desired data. And I've assumed that the hexadecimal value 0x0000BEAD is in fact stored at that address and then gets copied into register 5.

If we want to copy a value from a register out to some location in memory space, then we need to use the store instruction, or store register instruction. The store instruction uses a register to hold the desired memory address, just like the load register instruction. However, the first operand of the store instruction is the source of the data value to be stored into memory, and the given memory address is the destination.

The store instruction is one of the few instructions that uses the first operand as the source and the second operand as the destination, instead of the other way around. Now, when the processor executes the store instruction, it copies the value stored in register 3, the source, out to the memory address specified by register 1. So the value 0xFEEDC0DE gets stored out to the memory address A000.

And it's important to use the right words to describe how the load and store instructions work. It would be incorrect to say that they use the address of register 1, for example. Instead, we should say that the instructions use the address stored in register 1. Registers don't have an address, they just have a name. But the value stored in a register may be interpreted as an address, and that's what the load and store instructions do.

The real power of the load and store instructions lies in how we can modify the address pointer.
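Before we look at modified pointers, the plain load and store described so far can be modeled in a few lines. This is only a toy sketch in Python, not ARM code. The `regs` and `mem` dictionaries and the helper names `ldr` and `str_` are hypothetical, the assembly spellings in the comments are reconstructed from the narration, and the address and data values are written as 0xA000, 0x0000BEAD and 0xFEEDC0DE on the assumption that those are the values on the slide.

```python
# Toy model of register-indirect load and store (illustrative only).
regs = {"R1": 0xA000, "R3": 0xFEEDC0DE, "R5": 0}
mem = {0xA000: 0x0000BEAD}   # toy memory: address -> 32-bit word


def ldr(rt, rn):
    """Model of LDR rt, [rn]: the value held in rn is used as the address."""
    regs[rt] = mem[regs[rn]]


def str_(rt, rn):
    """Model of STR rt, [rn]: rt is the source, memory at [rn] is the destination."""
    mem[regs[rn]] = regs[rt]


ldr("R5", "R1")          # LDR R5, [R1]
loaded = regs["R5"]      # value fetched from the address stored in R1

str_("R3", "R1")         # STR R3, [R1]
```

Notice that `regs["R1"]` is only ever read as a value. Nothing in the model takes "the address of" a register, which matches the wording we want to use.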

The load instruction shown here uses register offset addressing, which means that a small fixed constant value is added to the value stored in a register, and the result of that addition is used as the desired memory address. This example is just like the previous load instruction, but in this case the processor will add 4 to the value in register 1 before fetching data from memory. When the instruction is executed, the processor calculates that the desired memory address is A004, fetches the value stored at that address, and copies it into register 6. Note that the pointer value in register 1 is not modified, but there are variations of the load and store instructions that can modify the pointer register.

There's a very important special case of the load instruction that can be used to load any 32-bit constant into a register. This is accomplished using what's called the literal addressing mode. One nice aspect of using literal addressing is that we can compile or assemble a piece of code without knowing what the actual value of the constant should be, as long as we know the name of the constant. The compiler or the assembler just reserves a place in program memory where the constant will ultimately be stored, and when all the pieces of our program are linked together, then the value of the constant is stored in the memory word reserved for it.

For example, suppose we have some file that defines the constants needed by our program. It's good programming practice to do that, to keep the important constants all in one place. In that file, we use the .equ assembler directive to associate the value of the Jenny constant, 8675309, with a symbol named Jenny.

Note that the .equ directive does not cause a value to be stored in memory. It just associates a constant value with a symbolic name. We know that constants must eventually be stored in non-volatile program memory, but exactly where and how the constant eventually gets stored depends on how it is used in the code. In another file, we start writing some kind of subroutine, or function, that needs to use the Jenny constant. So the first thing we want to do in our subroutine is load the value of the constant into register 0. We can use a load instruction with register 0 as the destination, but the source is specified as an equal sign followed by the symbolic name that has been or will be associated with the constant value.

It's this special syntax for the source operand that identifies the literal addressing mode. Now we know that the load instruction really needs to use a register as a pointer to where the constant will actually be stored in memory. But at this point, we haven't stored the constant anywhere in memory. Since the constant is needed by our subroutine, one reasonable approach is just to store the constant in a word of program memory that is just after the end of the subroutine's executable code. In other words, we bundle the required constants along with the executable code that needs them.
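The idea of bundling the constant with the code that needs it can be sketched as a toy memory layout. This is a Python illustration, not what any real assembler does internally. The `symbols` table, the `build_image` helper, and the six placeholder code bytes are all hypothetical, and the little-endian byte order is an assumption (it is the usual configuration for Cortex-M parts). In assembly source, the two pieces being modeled would look something like `.equ Jenny, 8675309` in one file and `LDR R0, =Jenny` in another.

```python
import struct

# .equ only records a name/value pair; nothing is placed in memory yet.
symbols = {"Jenny": 8675309}


def build_image(code, literal_names):
    """Append a word-aligned literal pool after the subroutine's code bytes."""
    image = bytearray(code)
    while len(image) % 4:            # pad so each literal starts on a word boundary
        image.append(0)
    pool_offsets = {}
    for name in literal_names:
        pool_offsets[name] = len(image)              # where this literal lands
        image += struct.pack("<I", symbols[name])    # one little-endian 32-bit word
    return bytes(image), pool_offsets


code = bytes(6)                      # placeholder for three 16-bit instructions
image, pool = build_image(code, ["Jenny"])
```

The offset recorded in `pool` is measured from the start of the subroutine, so it stays the same wherever the whole image ends up in memory.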

If we know that the constant is stored along with the subroutine's code, then we will know the distance in bytes between the address of the load instruction and the address of the constant. We know the distance between these two things even before we know exactly where in memory they will be stored. So how does this help? Well, it turns out that there is already a register that knows something about the address where an instruction is stored in memory at the time that instruction is being executed, and that's the program counter. So the literal addressing mode is really just a special case of register offset addressing where the register is the program counter.
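Both cases come down to the same address calculation, which a short Python sketch can make concrete. The function name `effective_address` and the program counter value are hypothetical, the assembly spelling `LDR R6, [R1, #4]` is reconstructed from the narration, and the pointer value is written as 0xA000 for illustration. As a side note on terminology, ARM's own documentation calls the constant-offset form immediate offset addressing.

```python
def effective_address(base, offset):
    """Base register value plus a constant offset, wrapped to 32 bits."""
    return (base + offset) & 0xFFFFFFFF


# Ordinary pointer register: R1 holds 0xA000 and the offset is 4,
# as in LDR R6, [R1, #4]. R1 itself is not changed.
r1 = 0xA000
addr_from_r1 = effective_address(r1, 4)

# Literal addressing: the same calculation, but the base is the program counter.
pc = 0x08000104                  # hypothetical PC value while the LDR executes
addr_from_pc = effective_address(pc, 24)
```

The only difference between the two calls is which register supplies the base value.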

And one very nice feature of using the literal addressing mode is the fact that the code is now relocatable. Even though the program counter value changes when the code is stored in a different location, the distance between the instruction and the literal value does not change.

So, the assembler stores the value of the Jenny constant at the very end of the function or subroutine using a .word directive, which does in fact allocate a word of memory storage for this constant. And in general, we call that area where these constants are stored the literal pool. It's the place where constants get stored for use by the subroutine.

Now, when the load instruction is executed, remember that the program counter actually points to the next instruction. So here is our LDR instruction. It happens to be a 16-bit instruction, and when that instruction is being executed, the program counter is already pointing to the next instruction in memory. Our Jenny constant is placed, let's say, down here at the end of the subroutine, and the distance between the program counter and that constant is one, two, three, four, five, six words. The distance between the program counter and the constant is always specified as some number of words, and that's the value that gets stored as an immediate value in the actual instruction encoding. Now if you look at the disassembled code, you'll see that the value of the offset is specified as a number of bytes from the current program counter.
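The relationship between the byte offset we see in a disassembly and the word count in the encoding can be sketched in Python. This is a minimal model under stated assumptions, not an assembler: the function name `literal_offset` and all addresses are hypothetical. It assumes the 16-bit Thumb form of the PC-relative load, where the base value is the instruction's address plus 4, rounded down to a multiple of 4 (the narration simplifies this to "the next instruction"), and where the offset field is 8 bits wide.

```python
def literal_offset(instr_addr, literal_addr):
    """Return (byte_offset, word_offset) for a 16-bit Thumb PC-relative LDR.

    Assumes the base is the instruction address plus 4, rounded down to a
    multiple of 4, and that the offset is encoded as a count of words.
    """
    base = (instr_addr + 4) & ~3
    byte_offset = literal_addr - base
    if byte_offset < 0 or byte_offset % 4 != 0:
        raise ValueError("literal must be word-aligned and at or after the base")
    word_offset = byte_offset // 4
    if word_offset > 255:
        raise ValueError("literal is out of range for an 8-bit word offset")
    return byte_offset, word_offset


# Hypothetical placement: the literal sits six words past the base.
here = literal_offset(0x08000100, 0x0800011C)

# The same code moved 0x2000 bytes higher in memory: the offsets do not change.
moved = literal_offset(0x08002100, 0x0800211C)
```

Because only the distance is encoded, `here` and `moved` come out identical, which is the relocatable property described above.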

But in fact, the instruction encodes that as a number of words, and the number of bytes therefore must be divisible by four. Now usually we don't have to worry about this. We let the assembler sort out these details using the symbolic constants, and you don't have to pay much attention to it.

So, why would we use the literal addressing mode instead of the MOVW and the MOVT instructions we talked about last time? Well, one reason is that MOVW and MOVT are not available on the Cortex-M0. They were added in the ARMv7-M architecture that's used in the Cortex-M3 and M4. Another reason is that the LDR instruction uses less program memory. If we use a MOVT and a MOVW, we need two 32-bit instructions. If we use the LDR, on the other hand, we need one 16-bit instruction for the LDR and one 32-bit memory location for the constant itself, 48 bits in total instead of 64. So we can use less program memory with the literal addressing mode. One disadvantage of literal addressing is that now we're doing a data access from the program memory, which may interfere with instruction fetches and slow down the processor.

What are the key points? Remember that the Cortex-M architecture is a load-store architecture, which means that there are very few instructions that can directly access memory, and since our I/O devices are mapped to memory, there are very few instructions that can directly access the registers of I/O devices. Remember that we have to use a register as a pointer for a load or a store instruction, and that register has to hold the full 32-bit address of the thing we want to deal with in memory. Remember that literal values, as used in the literal addressing mode, are just 32-bit constants that are placed between segments of executable code. So typically, whatever literal values are needed for a function or subroutine will be placed at the end of its executable code in what's called the literal pool.

And remember that literal values are actually accessed just using an offset from the program counter. That means that your code will be relocatable, and we don't have to actually store the value of that constant until the final linking is done.

Those are the load and store instructions for the Cortex-M architecture. Thanks for watching.
