Transcript for:
Understanding Change of Basis in Linear Algebra

Hey folks, my name is Nathan Johnson and welcome to lecture 8 of Advanced Linear Algebra, which is all about change of bases. The idea for this lecture: recall from a couple of lectures ago that you can represent vectors from vector spaces as coordinate vectors, which are just lists of numbers recording how far a vector points in each of the directions specified by whatever basis you're working with. Well, there's a very natural question: what if you start off with some basis, but then change your mind and say, oh no, actually I'd like to represent this vector in some different basis instead? In other words, what if I'd like to construct a different coordinate vector, corresponding to the different directions of a different basis of my vector space? How can I convert one coordinate vector into the other? How can I change the basis I'm working with? That's the problem we're going to look at in this lecture. We'll start with the definition of the thing that solves this problem, then go through how to show that it actually solves it, and then see how to compute it efficiently.

Starting off with the definition, the setup is the same as always. Suppose you've got some finite-dimensional vector space, with a basis B = {v1, v2, ..., vn} containing finitely many vectors (so the space is n-dimensional), and some other basis C. The idea is that we'd like to convert coordinate vectors with respect to basis B into coordinate vectors with respect to basis C. The matrix that does the job is called the change of basis matrix from B to C, and we denote it P_{C←B}. I'll talk about this notation a bit more later on, but roughly it's saying: this is the matrix that converts B into C. It's an n×n matrix, where n is again the number of vectors in any basis of the vector space — in other words, the dimension of the vector space. And the matrix that does the job is just the matrix that has the coordinate vectors [v1]_C, ..., [vn]_C as its columns, where v1 up to vn are the vectors from B, represented in the new basis C. So what you're doing here is taking the old basis, representing each of its vectors in the new basis, and sticking the resulting coordinate vectors into the matrix as columns. That gets you a matrix that converts any coordinate vector from the old basis to the new basis, from B to C.

Now, that's not immediately obvious — it's not immediately obvious that this matrix does anything useful for us at all. So it's actually a theorem, our first theorem of this lecture, that this matrix does what we want it to do. In particular, property (a) says: if you've got bases B and C of some finite-dimensional vector space, then the matrix we just introduced really does convert coordinate vectors with respect to basis B into coordinate vectors with respect to basis C. It lets you change coordinate vectors from one basis to another. That's the important part, really; that's what property (a) says.
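In symbols, using the bracket notation [v]_B for coordinate vectors from the earlier lectures, the definition and property (a) read:

```latex
\[
  P_{C \leftarrow B}
  \;=\;
  \big[\;[\mathbf{v}_1]_C \;\big|\; [\mathbf{v}_2]_C \;\big|\; \cdots \;\big|\; [\mathbf{v}_n]_C\;\big]
  \qquad \text{where } B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\},
\]
\[
  \text{(a)} \qquad P_{C \leftarrow B}\,[\mathbf{v}]_B \;=\; [\mathbf{v}]_C
  \qquad \text{for all vectors } \mathbf{v}.
\]
```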
Property (b) then gives us a nice relation that hopefully is intuitive enough. It answers: what if you want to go back the other way? Having converted coordinate vectors from basis B into basis C, what if you change your mind and want to go back from basis C to basis B, back to where you started? Part (b) of the theorem says the way to do that is to just invert the matrix. Every change of basis matrix is invertible, and the inverse is exactly what you would expect: inverting the process of going from B to C is going from C back to B, so P_{C←B}^{-1} = P_{B←C}. Again, if you think of matrices as linear transformations — they do things to vectors — this is very intuitive: the inverse undoes the thing that you did.

Furthermore, the change of basis matrix is the *only* matrix with property (a) from this theorem. In other words, if I throw in some other matrix, just called P here, and P[v]_B = [v]_C for all vectors v, then P must be the change of basis matrix we defined above. That's the only matrix that does the job for us.

All right, we're going to prove this theorem, but before we do, a couple of quick notes about the notation and the definition. The first note is about the notation. This notation P_{C←B}, with the backward arrow, looks really weird at first, compared to what you might guess at first: a forward arrow, from B to C. The reason we write it with the backward arrow is so that adjacent subscripts match in this theorem — and there are other theorems next week where this ramps up and gets even more extreme, where it's going to be really convenient to have matching subscripts sitting next to each other. In the expression P_{C←B}[v]_B, we're thinking of this as something in basis B, and that B is being transformed into basis C; you read these things from right to left, just because function notation and matrix multiplication work right to left. So, hey, the B subscripts match up next to each other — great, this coordinate vector can go into this matrix — and after it goes in, all you're left with is something in basis C, which is exactly what's on the right-hand side of the equation too. It all matches up; that's why we like that notation.

The second note is how to remember the definition. This definition of the change of basis matrix has a lot going on in it, and in particular it's very easy to get tripped up: do I represent the vectors from basis B in basis C, or the vectors from basis C in basis B? The way to think about it is that you're transforming the old basis B into the new basis C. So take the old basis vectors, from B, and write them in the new basis, C — after all, you want to end up with coordinate vectors with respect to basis C. Take the old, represent it in the new; that's how we're going. All right, so let's prove this theorem; let's show that change of basis matrices actually do what we want them to do.
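Before the proof, here's a quick numerical sanity check of both parts of the theorem — a minimal Python sketch with a made-up pair of bases of R² (my own illustration, not an example from the lecture):

```python
import numpy as np

# Hypothetical bases of R^2, one basis vector per column.
B_mat = np.array([[1.0, 1.0],
                  [0.0, 1.0]])    # B = {(1, 0), (1, 1)}
C_mat = np.array([[1.0, 1.0],
                  [1.0, -1.0]])   # C = {(1, 1), (1, -1)}

# The columns of P_{C<-B} are the B vectors written in C-coordinates,
# i.e. the solutions x of C_mat @ x = b_j; solving all columns at once:
P_C_from_B = np.linalg.solve(C_mat, B_mat)

v = np.array([3.0, 5.0])
v_B = np.linalg.solve(B_mat, v)   # [v]_B: coordinates of v with respect to B
v_C = np.linalg.solve(C_mat, v)   # [v]_C: coordinates of v with respect to C

# Property (a): P_{C<-B} converts B-coordinates into C-coordinates.
assert np.allclose(P_C_from_B @ v_B, v_C)

# Property (b): its inverse is the change of basis matrix the other way.
P_B_from_C = np.linalg.solve(B_mat, C_mat)
assert np.allclose(np.linalg.inv(P_C_from_B), P_B_from_C)
print("both properties check out")
```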
The first thing we're going to do is give names to the vectors from the basis B: say B = {v1, v2, ..., vn}. Then, because B is a basis, we can write the vector v that we're interested in as a linear combination of its members in exactly one way: v = c1 v1 + c2 v2 + ... + cn vn. And if I write v as a linear combination in this way, that means, exactly by definition, that the coordinate vector of v with respect to this basis is just those coefficients written out in a vector: [v]_B = (c1, c2, ..., cn). That's all just by definition.

For part (a) of the theorem, what we want to show is that if I take my change of basis matrix and multiply it by this coordinate vector, I get the coordinate vector of v with respect to basis C — I want to show that I'm changing basis B into basis C. And the way to do that is to just plop everything in; at this point we're just working from definitions. What is P_{C←B}? It's the matrix with the vectors from basis B, represented in basis C, as its columns. And what is [v]_B? It's just what we wrote down above — but written as a column vector, so that the matrix multiplication works out. Whenever we do matrix multiplication, we think of all of our vectors as column vectors.

Now we just do the matrix multiplication and see what happens. These matrices are partitioned so that we can do block matrix multiplication really easily: the product is the first column times c1, plus the second column times c2, and so on, so you end up with exactly the linear combination c1 [v1]_C + c2 [v2]_C + ... + cn [vn]_C. Then you use the fact that you can pull scalar multiplication and vector addition inside and outside of coordinate vectors: pull c1 inside the first coordinate vector, c2 inside the second, and so on, then all the vector sums inside, and you end up with the coordinate vector of the linear combination. So instead of a linear combination of coordinate vectors, it's the coordinate vector of a linear combination: [c1 v1 + c2 v2 + ... + cn vn]_C. And c1 v1 + c2 v2 + ... + cn vn is exactly the linear combination from up above, which is v. So this is just [v]_C, the coordinate vector of v with respect to C — exactly what I wanted. I wanted to show that the matrix turns coordinate vectors with respect to B into coordinate vectors with respect to C, so part (a) is done. That's all it is: block matrix multiplication, and everything falls out.
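Condensed into one line, that part (a) computation reads:

```latex
\[
  P_{C \leftarrow B}\,[\mathbf{v}]_B
  \;=\; c_1[\mathbf{v}_1]_C + c_2[\mathbf{v}_2]_C + \cdots + c_n[\mathbf{v}_n]_C
  \;=\; \big[\,c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n\,\big]_C
  \;=\; [\mathbf{v}]_C.
\]
```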
For part (b) of the theorem — the one that says the matrix is invertible and its inverse is the change of basis matrix in the other order — the way to see it is to ask: what happens if I multiply the two matrices, P_{B←C} and its purported inverse, back to back by some coordinate vector? So what happens in the product P_{B←C} P_{C←B} [v]_B? Well, the first change of basis matrix to act, P_{C←B}, turns [v]_B into [v]_C — that's what we just proved in part (a). But then the second change of basis matrix converts basis C back into basis B, so it just converts [v]_C right back into [v]_B; again, we just use part (a) there. So in other words, the product P_{B←C} P_{C←B} leaves every single coordinate vector alone; it doesn't change anything. A matrix that leaves every vector alone is the identity matrix, so P_{B←C} P_{C←B} = I. And there's a theorem from introductory linear algebra, about one-sided inverses, that says whenever you have a product of two square matrices equal to the identity matrix, they must be inverses of each other: if A times B equals the identity, then A and B are inverses. So that's all part (b) is; we're done with part (b) now as well.

Finally, the last part of this theorem that we have to prove is uniqueness of change of basis matrices. We have to show that there's only one: if any matrix P satisfies property (a) — in other words, if P[v]_B = [v]_C for all vectors v in the vector space — then P must actually be the change of basis matrix P_{C←B}. The way you do that is to notice that we're allowed to choose v to be whatever we want in this equation. So choose v to be one of the vectors vj from the old basis B. I'm going to compute P[vj]_B in two different ways, and that's going to give me what I want.

First way: what is the coordinate vector [vj]_B? Remember that vj is in the basis B, so the way to write vj as a linear combination of the members of B is just 0 times the first basis vector, plus 0 times the second basis vector, and so on, plus 1 times the j-th basis vector (because it *is* the j-th basis vector), plus 0 times all the other ones. In other words, [vj]_B has all zeros except a single 1 in the j-th spot — it's exactly ej, the standard basis vector with a 1 in the j-th spot and zeros everywhere else. So one way of computing the product is P[vj]_B = P ej, which, if you remember how matrix multiplication works, is exactly the j-th column of P.

On the other hand, we're assuming that P converts coordinate vectors with respect to B into coordinate vectors with respect to C, so P[vj]_B must also equal [vj]_C. Combining those two calculations tells us that [vj]_C has got to be the j-th column of P. But wait a second — a matrix whose j-th column is [vj]_C for all j is exactly the definition of P_{C←B}: its first column is [v1]_C, its second column is [v2]_C, and so on. So that finishes the theorem: P really is equal to the change of basis matrix from B to C. There's only one of them; it's unique.
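In symbols, the two ways of computing that one product are:

```latex
\[
  P\,[\mathbf{v}_j]_B \;=\; P\,\mathbf{e}_j \;=\; \big(j\text{-th column of } P\big)
  \qquad\text{and}\qquad
  P\,[\mathbf{v}_j]_B \;=\; [\mathbf{v}_j]_C,
\]
```

so every column of P is forced to equal the corresponding column of P_{C←B}.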
All right, so let's do an example; let's actually compute something. Let's find the change of basis matrix from B to C, where now we're working with the vector space P2 of polynomials of degree at most two, B = {1, x, x²} is the standard basis, and C is this weird basis that we've worked with once before: C = {1 + x, 1 + x², x + x²}. And this time, once we find that matrix, we're going to use it to find the coordinate vector of the polynomial p(x) = 4 - x + 3x² with respect to C.

The reason we're going to use a change of basis matrix to find this coordinate vector is that finding the coordinate vector with respect to B is really easy, because B is the standard basis. We talked about this before: to find coordinate vectors with respect to a standard basis, you just read off the coefficients. How many 1's are there? Four. How many x's are there? Minus one. How many x²'s are there? Three. So [p]_B = (4, -1, 3) — the coordinate vector is easy in that case. We want [p]_C though, so the way we're going to get it is to multiply this vector by the change of basis matrix P_{C←B}, which by definition is the matrix whose columns are the old basis vectors represented in the new basis: [1]_C is the first column, [x]_C is the second column, and [x²]_C is the third column.

That would be enough — we know how to compute those coordinate vectors, so we could just go ahead and compute them, plug them in, and do the multiplication, and it would all work out. But a slightly easier way to get this change of basis matrix is to use part (b) of the previous theorem: we're going to compute the change of basis matrix from C into B, P_{B←C}, and then invert it, and that gives us P_{C←B}. The reason we do it this way instead is, look what happens. To compute P_{B←C}, you represent the old basis vectors in the new basis, and this time the old basis vectors are from C — 1 + x, 1 + x², and x + x² — while the new basis B is the standard basis. So, just like we said, these coordinate vectors are easy to come up with. [1 + x]_B: there's one 1, one x, and zero x²'s, so the first column is (1, 1, 0). [1 + x²]_B: one 1, zero x's, one x², so the second column is (1, 0, 1). [x + x²]_B: zero 1's, one x, one x², so the third column is (0, 1, 1). Now all you have to do is take this matrix and invert it — something you learned how to do in the previous course — and you get the change of basis matrix that's actually the proper way around, the one that goes into the basis we want, basis C:

P_{B←C} = [ 1 1 0 ; 1 0 1 ; 0 1 1 ],    P_{C←B} = (P_{B←C})^{-1} = (1/2) [ 1 1 -1 ; 1 -1 1 ; -1 1 1 ].

That's a lot easier than computing P_{C←B} directly, because directly we would have to solve a linear system for each of its columns. So the inverting route is a little bit easier in this case. Anyway, now that we have our change of basis matrix, we just do the matrix multiplication: [p]_C = P_{C←B}[p]_B. Take the easy-to-compute [p]_B = (4, -1, 3), multiply it by the change of basis matrix, and work it out, and we get [p]_C = (0, 4, -1).
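Here's that whole computation as a short Python sketch, using sympy for exact arithmetic (the library choice is mine; the matrices and vectors are exactly the ones from the example):

```python
import sympy as sp

# P_{B<-C}: the basis C = {1+x, 1+x^2, x+x^2} written in the standard
# basis B = {1, x, x^2}, one coordinate vector per column (just read off
# the coefficients of each polynomial).
P_B_from_C = sp.Matrix([[1, 1, 0],
                        [1, 0, 1],
                        [0, 1, 1]])

# Part (b) of the theorem: invert to get the matrix we actually want.
P_C_from_B = P_B_from_C.inv()

# [p]_B for p(x) = 4 - x + 3x^2, read off from the standard basis.
p_B = sp.Matrix([4, -1, 3])

# Part (a): multiplying converts B-coordinates into C-coordinates.
p_C = P_C_from_B * p_B
print(p_C.T)   # Matrix([[0, 4, -1]]), matching the lecture's answer
```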
And we actually did this calculation in a completely different way a couple of lectures ago, back when we first learned about coordinate vectors. You can go back and check: we computed [p]_C a different way and got the exact same coordinate vector. Of course it all works out the same; it's just a different way of doing it.

Now, in that example, the reason we inverted to find the change of basis matrix is that we were converting from the standard basis to an uglier basis, and it turns out that computing change of basis matrices *into* the standard basis is much easier — again, because those columns you can just read off from the coefficients. So computing change of basis matrices into the standard basis is easy. But what if you want to compute a change of basis matrix from some ugly basis to some other ugly basis, with no standard basis in sight? This next theorem tells us probably the easiest way to do it. What you can do is get away with converting one of the ugly bases into the standard basis, converting the other ugly basis into the standard basis, and then doing a routine calculation on those two matrices to get the change of basis matrix that you want.

Here's the setup. Suppose you've got some finite-dimensional vector space, as always, and this time three bases. Bases B and C are the ones we actually want to convert between, and basis E is just going to be some helper basis that makes the calculation easier to do — oftentimes it's the standard basis, but it doesn't have to be. The theorem says that if you compute the two change of basis matrices P_{E←C} and P_{E←B} (again, the idea being that these are the easy ones to compute), and then compute the reduced row echelon form of the augmented matrix [ P_{E←C} | P_{E←B} ], you're going to get an identity matrix on the left-hand side — just because P_{E←C} is invertible; we have a theorem saying that change of basis matrices are invertible — and on the right-hand side you're going to be left with P_{C←B}, the change of basis matrix from B into C. In other words, the change of basis matrix you actually want, going between the two ugly bases.

To help you remember this theorem, it's useful to note that on the left, both of those change of basis matrices convert into E, the nice basis, and the order of the other bases is the same on the left-hand and right-hand sides: we've got a C on the left and a B on the right inside the augmented matrix, and the C is on the left and the B is on the right in P_{C←B} too. Things go in the same order in both cases — just to help you remember.

All right, so again, let's prove this; let's see where this theorem comes from, and then we'll do an example of actually computing with it. To prove it, the idea is: let's think about what it would take to compute just [vj]_C, where again we're giving names to the vectors in B, so B = {v1, v2, ..., vn}, and we're asking about the j-th one of those, represented in the basis C.
Well, one way we could do that is to use the fact that change of basis matrices change bases. We could compute [vj]_E, the coordinate vector with respect to the nice basis E — and because we can just read off coefficients, that's easy to do; standard bases are nice. Then we could set up the equation P_{E←C} [vj]_C = [vj]_E, where P_{E←C} is also easy to compute, because converting into the standard basis is easy, and that matrix converts basis C coordinates into basis E coordinates.

Notice, though, that this is a linear system. It doesn't tell us what the answer is right away, because we want to find the piece [vj]_C knowing the pieces P_{E←C} and [vj]_E. That's exactly a linear system: something of the form Ax = b, where we know A and we know b and we want to know x. And the way to solve it is to row reduce: you throw everything into an augmented matrix, A on the left, b on the right, and when you find its reduced row echelon form, you get I on the left and x, the solution of the linear system, on the right. So in other words, to solve this linear system, I put P_{E←C} on the left and [vj]_E on the right, find the reduced row echelon form of that, and I get the identity on the left — again, because this change of basis matrix is invertible — and on the right I get the solution of the linear system, [vj]_C.

Now, that's true no matter what j I pick; there's nothing special about j, so this works for every basis vector from the basis B. In other words, I can actually find all of these vectors at the same time if I'm clever: instead of augmenting one column on the right-hand side, I augment all of them. I take the same left-hand side, P_{E←C}, and stick [v1]_E, then [v2]_E, all the way up to [vn]_E, on the right-hand side, and I find the reduced row echelon form of this big long matrix — the one with an n×n piece on the left and now an n×n piece on the right. The left-hand side is the exact same as before, and the left-hand side is what determines which row operations we actually use, so I'm still going to get an identity on the left-hand side when I find that reduced row echelon form, and it's still going to convert each [vj]_E into [vj]_C: it converts [v1]_E into [v1]_C, [v2]_E into [v2]_C, and so on, all the way up to [vn]_E, which gets converted into [vn]_C.

But now look really carefully at what I just said. The matrix with [v1]_E, [v2]_E, up to [vn]_E as its columns — just by definition, that's exactly P_{E←B}. Whereas on the right-hand side after row reducing, I'm converting the basis vectors from B into C, so that's exactly P_{C←B}. So another way of writing all of this is: the reduced row echelon form of [ P_{E←C} | P_{E←B} ] is exactly [ I | P_{C←B} ]. And that's the theorem; that's what we wanted to show. The reduced row echelon form of one matrix that's easy to compute, but big, is the matrix made of the identity and the thing we actually want. All right.
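To make the procedure concrete, here's a minimal Python sketch using sympy, with a made-up pair of "ugly" bases of P2 (my own illustration — the lecture's example below uses different bases). It row reduces [ P_{E←C} | P_{E←B} ] and peels P_{C←B} off the right-hand block:

```python
import sympy as sp

# Hypothetical bases, written in the helper basis E = {1, x, x^2},
# one basis vector per column:
P_E_from_B = sp.Matrix([[1, 0, 1],
                        [1, 1, 0],
                        [0, 1, 1]])   # B = {1 + x, x + x^2, 1 + x^2}
P_E_from_C = sp.Matrix([[1, 1, 1],
                        [0, 1, 1],
                        [0, 0, 1]])   # C = {1, 1 + x, 1 + x + x^2}

# Row reduce [ P_{E<-C} | P_{E<-B} ]; the theorem says this is [ I | P_{C<-B} ].
augmented = P_E_from_C.row_join(P_E_from_B)
rref_matrix, _pivots = augmented.rref()
P_C_from_B = rref_matrix[:, 3:]       # right-hand 3x3 block
print(P_C_from_B)

# Sanity check against the direct route: the rref of [A | B] with A
# invertible is [ I | A^{-1} B ].
assert P_C_from_B == P_E_from_C.inv() * P_E_from_B
```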
So, just as a reminder again: both of those change of basis matrices on the left, as long as E is chosen to be the standard basis, are easy to compute — you can just read off coefficients from the basis vectors in B and C. And also, this method of computing a change of basis matrix is almost identical to the method you learned in the previous course for computing the inverse of a matrix: you take a matrix, you augment it with something, you find the reduced row echelon form, and you end up with the identity on the left and the thing you actually want on the right. In this case the thing we actually want is the change of basis matrix; in introductory linear algebra, the thing we actually wanted on the right was the inverse of the matrix. But it's the same basic procedure.

All right, so let's use this procedure to actually do a calculation; let's convert between two ugly bases and see how this method works. Let's find the change of basis matrix P_{C←B}, where B and C are two ugly bases of the degree-two polynomials P2 again, and then let's use that change of basis matrix to compute [v]_C if we already know that [v]_B = (1, 2, 3). And one thing you're going to notice that's really nice about this method is that we don't have to go back and compute v itself. We could, but it's just extra work that we don't have to do: we can go straight from [v]_B to [v]_C without ever knowing, at any point, what v itself is.

So here's how it works. Again, you could compute this change of basis matrix P_{C←B} directly from the definition, but it's easier to use that theorem. What I'm going to do is introduce a helper basis, the standard basis E = {1, x, x²} again. Then it's easy to spot the change of basis matrices into the standard basis. For P_{E←B}, you look at B, represent its vectors in the standard basis, and those become your columns. Look at the first vector in B: how many 1's are there? Zero. How many x's are there? Minus one. How many x²'s are there? One. That becomes our first column, (0, -1, 1). Look at the next vector in B: how many 1's? Three. How many x's? Zero. How many x²'s? Two. That's our second column, (3, 0, 2). And our third column comes from the third vector of B (the one with the 5 in it) in the same way. Similarly for the other change of basis matrix, P_{E←C}, into E: the first basis vector of C (the one with the 3x in it) gives the first column, the next basis vector gives the second column, and the last basis vector gives the final column.

And now, from those, to find the change of basis matrix that we actually want, what we do is row reduce. We stick the first change of basis matrix, P_{E←C}, on the left, augment with the other change of basis matrix, P_{E←B}, on the right — both of them going into the standard basis — and then row reduce until the left block is the identity matrix. Think of this as solving a linear system: it's just Gauss-Jordan elimination from the first linear algebra course. After you take this matrix, which is just the two change of basis matrices side by side, and row reduce, you get the identity on the left — we knew we were going to get that, because these change of basis matrices are invertible — and on the right-hand side we get what we actually want: the change of basis matrix from B into C.
So P_{C←B} is exactly that right-hand block of the row-reduced augmented matrix. Great — that finds us the change of basis matrix. And now, to find the coordinate vector with respect to C, all we do is take the coordinate vector with respect to B and multiply it by P_{C←B}, the change of basis matrix. Just plop those down next to each other — the change of basis matrix we just computed, and the vector [v]_B = (1, 2, 3) that the question itself gave us — do the matrix multiplication, and you get your answer: that's [v]_C. And we didn't even have to go back and find the original vector v itself, which is a polynomial; we didn't have to find v at all to do that calculation.

All right, so that's it for lecture 8, and actually for week 2's lecture notes, so I will see you in lecture 9, when we start talking about linear transformations between arbitrary vector spaces.