Transcript for:
Understanding Projections in Linear Algebra

Okay, guys, we're almost ready to make this lecture immortal. Okay. Are we on? All right. This is an important lecture. It's about projection. And let me start by saying that I'm going to be talking about by just projecting a vector b down on a vector a. So just so you see what the geometry looks like in just two dimensions, I'd like to find the point point along this line, so that line through A is a one-dimensional subspace, so I'm starting with one dimension. I'd like to find the point on that line closest to B. Can I just take that problem first, and then I'll explain why I want to do it and why I want to project on other subspaces. So where is the point closest to B that's on that line? It's somewhere there. there. And let me connect that. And what's the whole point of my picture now? Where does orthogonality come into this picture? The whole point is that this best point, that's the projection, P, of B onto the line, where's orthogonality? It's the fact that that's a right angle. That this, the error, this is like how much I'm wrong by. This is the difference between B and P. The whole point is that that's perpendicular to A. That's got to give us the equation. That's got to tell us that's the one fact we know. That's got to tell us where that projection is. Let me also say. Look. I've drawn a triangle there. So if we were doing trigonometry, we would do like, we would have angles theta and distances that would involve sine theta and cos theta. That leads to lousy formulas compared to linear algebra. The formula that we want comes out nicely. And what's the, what do we know? We know that p. this projection, is some multiple of A, right? It's on that line. So we know it's in that one-dimensional subspace. It's some multiple, let me call that multiple x, of A. So really it's that number x I'd like to find. So this is going to be simple in 1-D, so let's just carry it through and then see how it goes in high dimension. OK. The key fact is, so the key- the key to everything is that perpendicular, the fact that that A is perpendicular to A is perpendicular to E, which is B minus A x, xA. I don't care if that's xA. That that equals zero. You see that as the central equation? That's saying that this A is perpendicular to this correction. That's going to tell us what x is. Let me just raise the board and simplify that. and out will come x. OK. So if I simplify that, let's see, I'll move one term to one side, the other term will be on the other side. It looks to me like x times A transpose A is equal to A transpose B. Right? I have A transpose B as one term, A transpose A as the other, so right away, here's my A transpose A, but it's just a number now, and I divide by it, and I get the answer. x is A transpose B over A transpose A. And p, the projection I wanted, is that's the right multiple. That's got a cosine theta built in, but we don't need to look at angles. We've just got vectors here. And the projection is p. Is A times that x, or x times that A. But I'm really going to, eventually I'm going to want that x coming on the right-hand side. So you see that I've got two of the three formulas already. Right here, I've got the, the, the equation, that's the equation that, that, that leads me to the answer. Here's the answer for x, and here's the projection. OK. Can I do, add just one more thing to this one-dimensional problem? One more, like, lift it up into linear algebra, into matrices. Here's the last thing I want to do with those, but don't forget those formulas. A transpose B over A transpose A. Actually, let's look at that for a moment first. Suppose And A. Well then, I'll let me take this next step. So P is A times X. So can I write that then? P is A times this neat number, A transpose B over A transpose A. That's our projection. Can I ask a couple of questions about it, just while we look, get that, digest that formula? Suppose b is doubled. Suppose I change b to 2b. What happens to the projection? So suppose I, instead of that vector b that I drew on the board, make it 2b, twice as long. What's the projection now? It's double two, right? It's going to be twice as far. If b goes twice as far, the projection will go twice as far. And you see it there, if I put in an extra factor two, then p's got that factor two. Now what about if I double a? What if I double the vector a that I'm projecting onto? What changes? No, the projection doesn't change at all. Right? Because I'm just, the line didn't change. If I double A or I take minus A, it's still that same line. The projection's still in the same place. And of course, if I double A, I get a four up above, and I get a four, an extra four below, they cancel out, and the projection is the same. OK. So really, this, I want to look at this. As the projection is, there's a matrix here. The projection is being, is carried out by some matrix that I'm going to call the projection matrix. And in other words, the projection is some matrix. that acts on this guy B and produces the projection. The projection P is the projection matrix acting on whatever the input is. The input is B, the projection matrix is P. OK, actually you can tell me right away what this projection matrix is. So this is a pretty interesting matrix. What matrix is multiplying B? I'm just, just from my formula, I see what P is. P, this projection matrix, is, what is it? I see A A transpose above, and I see A transpose A below. And those don't cancel. That's not one. Right? That's a matrix, because down here, the A transpose A, that's just a number. A transpose A, that's the length of A squared. And up above is a column times A squared. a row. Column times a row is a matrix. So this is a full-scale n by n matrix if I'm in n dimensions. And it's kind of an interesting one. And it's the one which, if I multiply by b, then I get this. You see, once again I'm putting parentheses in different places. I'm putting the parentheses like there. I'm saying, OK, that's really the matrix that produces this projection. OK. Now, tell me. All right. What are the properties of that matrix? I'm just using letters here, A and B. I could put in numbers. But I think it's, for once, it's clearer with letters, because all formulas are simple. A transpose B over A transpose A. That's the number that multiplies the A. And then I see, wait a minute, there's a matrix here. And what's the rank of that matrix, by the way? What's the rank of that matrix? Yeah, let me just ask you about that matrix, which looks a little strange, A A transpose over this number. But, well, I could ask you its column space. Yeah, let me ask you its column space. So what's the column space of a matrix? It's if you multiply that matrix by anything, you always get in the column space, right? The column space of a matrix is when you multiply any vector by that matrix, any vector b, by the matrix, you always land in the column space. That's what column spaces were. Now, what space do we always land in? What's the column space of, what's the result when I multiply this any vector b by my matrix? So I have p times b. Where am I? I'm on that line, right? The column space, so here are facts about this matrix. The column space of P of this projection matrix is the line through A. And the rank of this matrix is, I'll say it at once, One. Right. The rank is one. This is a rank one matrix. Actually, it's exactly the form that we're familiar with, with rank one matrix, with a rank one matrix. A column times a row. That's a rank one matrix. That column is the basis for the column space. Just one dimension. OK. So I know that much about the matrix. But now there are two more facts about the matrix that I want to notice. Is, first of all, is the matrix symmetric? That's a natural question for matrices. And the answer is yes. If I take the transpose of this, there's a number down there. The transpose of AA transpose is AA transpose. So P is symmetric. P transpose equals P. So this is a key property, that the projection matrix is symmetric. One more property now, and this is is the real one. What happens if I do the projection twice? So I'm looking for something, some information about p squared. But just give me in terms of that picture, in terms of my picture. I take any vector b, I multiply it by my projection matrix, and that puts me there. So this is pb. And now I project again. What happens now? What happens when I apply the projection matrix a second time to this? So I'm applying it once, brings me here, and the second time brings me here. I stay put. Right? The projection for a point on this line, the projection is right where it is. The projection is the same point. So that means that if I project twice, I get the same answer as I did in the first projection. So those are the two properties that tell me I'm looking at a projection matrix. It's symmetric. And its square is itself, because if I project a second time, it's the same result as the first projection. OK. So that's, and then here's the exact formula. When I know what I'm projecting onto that line through A, then I know what P is. So do you see that I have all the pieces here for projection on a line? Now, and those, please remember them. So there are three formulas to remember. The formula for x, the formula for p, which is just Ax, and then the formula for capital P, which is the matrix. Good. Good. OK. Now I want to move to more dimensions. So we're going to have three formulas again, but you'll have to, they'll be a little different, because we won't have a single line, but a plane or three-dimensional or a n-dimensional subset. OK. So now I'm moved to the next question. Maybe I'll say first. Let me say first why I want this projection, and then we'll figure out what it is. We'll go completely in parallel there, and then we'll use it. OK. Why do I want this projection? Well, so why project? It's because I'm As I mentioned last time, this new chapter deals with equations A x equal b may have no solution. So that's really my problem, that I'm given a bunch of equations, probably too many equations, more equations than unknowns, and I can't solve them. OK. So what do I do? I solve the closest problem that I can solve. And what's the closest one? Well, Ax will always be in the column space of A. That's my problem. My problem is Ax has to be in the column space, and B is probably not in the column space. So I change B to what? I choose the closest vector in the column space, so I'll solve Ax equal p instead. That one I can do. Where p is, this is the projection of B onto the column space. That's why I want to be able to do this, because I have to find a solution here, and I'm going to put a little hat there to indicate that it's not the x, it's not the x that doesn't exist, it's the x hat that's best possible. So I must be able to figure out what's the good projection there. What's the good right-hand side that is in the column space, that's as close as possible to b, and then I'm and then I know what to do. OK. So now I've got that problem. So that's why I have the problem again. But now let me say I'm in three dimensions. So I have a plane, maybe, for example, and I have a vector b that's not in the plane. And I want to project B down into the planet. Okay. So there's my question. How do I project a vector? And I'm really, what I'm looking for is a nice formula, and I'm counting on linear algebra to just come out right. A nice formula for the projection of B into the plane, the nearest point. So this, again, a right angle is going to be crucial. OK. Now, so what's, first of all, I've got to say, what is that plane? To get a formula, I have to tell you what the plane is. How am I going to tell you a plane? I'll tell you a basis for the plane. I'll tell you two vectors. A1 and A2 that give you a basis for the plane. So let's say there's an A1 and here's a vector A2. They don't have to be perpendicular, but they better be independent, because then that tells me the plane. The plane is the plane of A1 and A2. And actually, going back to my, to this, to this connection, this plane is a column space. It's the, it's the column space of what matrix? What matrix, so how do I connect the two questions? I'm thinking, how do I project onto a plane? And I want to get a matrix in here. Everything's cleaner if I write it in terms of a matrix. So what matrix has these, has that column space? Well, of course, it's just the matrix that has A1 in the first column and A2 in the second column. Right? Just, just, let's be sure we've got the question before we get to the answer. So I'm looking for, again, I'm given a matrix A with two columns. And really, I'm ready, once I get to two, I'm ready for n. So it could be two columns, it could be n columns. I'll write the answer in terms of the matrix A. And the point will be, those two columns describe the plane, they describe the column space, and I want to project. OK. And I'm given a vector b that's probably not in the column space. Of course if b is in the column space, my projection is simple, it's just b. But most likely I have an error e, this b minus p part, which is probably not zero. OK. But the beauty is that I know the, from geometry, or I could get it from calculus, or I could get it from linear algebra, that this vector, this is the part of B that's perpendicular to the plane. That E is perpendicular is perpendicular to the plane. Your intuition is saying that that's the crucial fact. That that's going to give us the answer. OK. So let me, that's the problem. Now for the answer. So this is a lecture that's really like moving along. Because I'm just popping that problem up there and asking you what combination, now, yeah, so what is p? What is this projection p? This is projection p is some multiple of these. Basis guys, right? Some multiple of the columns. But I don't like writing out x1 a1 plus x2 a2. I would rather write that as ax. Well, actually I should put, if I'm really doing everything right, I should put a little hat on it. Remember that this x, that those are the numbers, and I could have put a hat way back there. Is, right? So this is the projection, P. P is ax bar. And I'm looking for x bar. So that's what I want an equation for. So now I've got hold of the problem. The problem is find the right combination of the columns so that. The error vector is perpendicular to the plane. Now, let me turn that into an equation. So I'll raise the board and just turn that, turn what we've just done into an equation. So let me, I'll write again. The main point. The projection is A x hat, and our problem is find x hat. And the key is that B minus A x hat, that's the error, this is the E, is perpendicular. To the plane. That's got to give me, well what am I looking for? I'm looking for two equations now, because I've got an x1 and an x2. And I'll get two equations, because, so this thing, E, is perpendicular to the plane. So what does that mean? I guess it means it's perpendicular to A1 and also to A2, right? Those are two vectors in the plane, and the things that are perpendicular to the plane are perpendicular to A1 and A2. Let me just repeat. This guy then is perpendicular to the plane, so it's perpendicular to that vector and that vector. It's perpendicular to that, of course, but it's perpendicular to everything in the plane. plane, and the plane is really told me by A1 and A2. So really I have the equations A1 transpose B minus A x is zero, and also A2 transpose B minus A x is zero. Those are my two equations. But I want those in matrix form. I want to put those two equations together as a matrix equation. And it just comes out right. Look at the matrix A transpose. Put A1, A1 transpose is its first row. A2 transpose is its second row. That multiplies this B minus Ax. and gives me the zero and the zero. You see that I'm just, this is one way to come up with this equation. So the equation I'm coming up with is A transpose B minus A x hat is zero. OK. That's my equation. All right. Now I want to stop for a moment before I solve it and just think about it. First of all, do you see that that equation Back in the very first problem I solved on a line, what was the-on a line, the matrix A only had one column. It was just little a. So in the first problem I solved, projecting on a line, this, for capital A, you just change that to little a, and you have the same equation that we solved before. A transpose E equals zero. OK. Now, a second thing, second comment. I would like to, since I know about these four subspaces, I would like to get them into this picture. So let me ask the question. What subspace is this thing in? Which of the four subspaces is that error vector E? This is nothing but E. This is this guy coming down the, in, down perpendicular to the plane. What subspace is E in from this equation? Well, the equation is saying A transpose E is zero. So I'm learning here that E is in the null space of A transpose. Right? That's my equation. And now I just want to see, hey, of course, that was right. Because things that are in the null space of A transpose, What do we know about the null space of A-train? So that last lecture gave us the sort of the geometry of these subspaces and the orthogonality of them. And do you remember what it was? What on the right side of our big figure, we always have the null space of A transpose and the column space of A and their orthogonal. So E in the null space of A transpose Is saying E is perpendicular to the column space of A. Yes. I just feel, okay, the damn thing came out right. The equation for, the equation that I struggled to find for E really said What I want. That the error e is perpendicular to the column space of it. Just right. And from our four fundamental subspaces, we knew that that is the same as that. To say e's in the column space, we're going to use the E's in the column space. And we're going to use the E's in the column space. And we're going to use the E's in the column space. And we're going to use the E's in the column space. And we're going to use the E's in the column space. And we're going to use the E's in the column space. the null space of A transpose as E's perpendicular to the column. OK. So we've got this equation, now let's just solve it. All right, let me just rewrite it as A transpose A x hat equals A transpose B. That's our equation. That gives us x. And, oh, allow me to keep remembering the one-dimensional case. The one-dimensional case, this was little a. So this was just a number, little a, little a transpose, a transpose a was just a vector, rho times a column of number. And this was a number. And x was the ratio of those numbers. But now we've got matrices. This one is n by n. A transpose A is an n by n matrix. OK, so can I move to the next board for the solution? OK. This is the key equation. Now I'm ready for the formulas that we have to remember. What's x hat? What's the projection? What's the projection matrix? Those are my three questions that we answered in the 1-D case and now we're ready for in the n-dimensional case. So what is x hat? Well, what can I say but A transpose A inverse, A transpose B. That's the solution to our question. What's the projection? That's more interesting. What's the projection? The projection is A x hat. That's how x hat got into the picture in the first place. x hat was the combination of columns, and then I had to look for those numbers, and now I found them, was the combination of columns. of the columns of A that gave me the projection. Okay? So now I know what this guy is, so it's just, I multiply by A. A A transpose A inverse A transpose B. That's looking a little messy, but it's not bad. That combination is our, like, magic combination. This is the thing which is, which use, which what, which is like, what's it like? What was it in one dimension? What was it? We had this, we must have had this thing way back at the beginning of the lecture. What did we, oh then A was just a column, so it was little a, little a transpose over A transpose A, right? That's what it was in 1D. You see what's happened in more dimensions. I'm not allowed to just divide because I don't have a number, I have to put inverse, because I have an n by n matrix. But same formula. And now Tell me what's the projection matrix. What matrix is multiplying B to give the projection? Right there. There it is. I even already underlined it by action. The projection matrix. Which I use capital P is this. It's that thing. Shall I write it again? A times A transpose A inverse times A transpose. Now if If you'll bear with me, I'll think about what I've done. I've got this formula. Now the first thing that occurs to me is, Is something bad. Look, why don't I just, you know, here's a product of two matrices and I want its inverse. Why don't I just use the formula I know for the inverse of a product and say, okay, that's A inverse times A transpose inverse. What will happen if I do that? What will happen if I say, hey, this is A inverse times A transpose inverse, then shall I do it? It's going to go on videotape if I do it. And I don't, all right, I'll put it there, but just like. Don't take the videotape quite so carefully. OK. So if I put that thing, it would be A A inverse, A transpose inverse, A transpose, and what's that? That's the identity. But what's going on? So, right, you see, my question is, somehow, I did something wrong. That wasn't allowed. And why is that? Because A is not a square matrix. A is not a square matrix. It doesn't have an inverse. So I have to leave it that way. This is not OK. If A was a square invertible matrix, then I couldn't complain. Yeah, I'll think, let me think about that case. But my main case, the whole reason I'm doing all this, is that A is this matrix that has too many rows. It's just got a couple of columns, like A1 and A2. but lots of rows. Not square. And if it's not square, this A transpose A is square, but I can't pull it apart like this. So I'm not allowed to do this pull apart, except if A was square. Now if A is square, what's going on if A is a square maker? A nice, square, invertible maker. Thank you. What's up with that case? So this is all the formula ought to work then too. If A is a nice square invertible matrix, what's its column space? So it's a nice n by n invertible everything great matrix. What's its column space? The whole of Rn. So what's the projection matrix if I'm projecting under the whole space? It's the identity, right? If I'm projecting B onto the whole space, not just onto a plane but onto all of 3D, then B is already in the column space. The projection is the identity, and this is, gives me the correct formula, P is up. But if I'm projecting onto a subspace, then I can't split those apart and I have to stay with that formula. OK. And what can I say, so I remember this formula for 1-D, and that's what it looks like in n dimensions. And what are the properties that I expected for any projection matrix, and I still expect for this one? That matrix should be symmetric, and it is. P transpose is P. Because if I transpose this, this guy's symmetric, and its inverse is symmetric, and if I transpose this, this guy's symmetric, and if I transpose this, this guy's symmetric, Transpose this one, when I transpose it, will pop up there and become A. The A transpose will pop up here and I'm back to P again. And do we dare try the other property, which is P squared equal P? It's got to be right. Close. We know, geometrically, that the first projection pops us onto the column space, and the second one leaves us where we are. So I expect that if I multiply by, let me do it, if I multiply by another p, so there's another a, another a transpose a inverse a transpose, and you. God, eight A's in a row is like obscene, but do you see that it works? So I'm squaring that, so what do I do? How do I see that multiplication? Well, yeah, I just want to put parentheses in good places so I see what's happening. Yeah, here's an A transpose A sitting together. So when that A transpose A multiplies its inverse, all that stuff goes, right? And leaves just the A transpose at the end, which is just what we want. So p squared equals p. So sure enough, those two properties hold. OK. That's, that, OK, we really have got now all the formulas. X hat, the projection p, and the projection matrix capital P. And now my job is to use them. OK. So when would I? When would I have a bunch of equations, too many equations, and yet I want the best things? And the most important example, the most common example, is if I have points. Here's the application. These squares. Fitting by a Lung. Okay. So I'll start this application today, and there's more in it than I can do in this same lecture. So that'll give me a chance to recap the formulas, and there they are, and recap the ideas. So let me start the problem today. I'm given a bunch of data points. And they lie close to a line but not on a line. Let me take that. Say at t equal to one, two, and three, I have one and two and two again. So my data points are, this is like the time direction, and this is like, well, let me call that v or y or something. I'm given these three points, and I want to fit them by a line. By the best straight line. So the problem is, fit the points one, One is the first point. The second point is t equals two, b equals one. And the third point is t equals three, b equals two. So those are my three points. t equals, sorry, that's two. So this is the point one-one, this is the point two-two, and that's the point three-two. And of course there isn't a line that goes through this. So I'm looking for the best line. I'm looking for a line that probably goes somewhere, you think it goes somewhere like that? I didn't mean to make it go through that point. It won't. It'll kind of, it'll go between, so the error there and the error there and the error there are as small as I can get them. OK. What I'd like to do is find the matrix A. Because once I've found the matrix A, the formula is takeover. So what I'm looking for this line, B is C plus DT. So this is in the homework that I sent out for today. Find the best line. So I'm looking for these numbers, C and D, that tell me the line. And I want them to be as close to going through those three points as I can get. I can't get exactly, so there are three equations to go through the three points. It would, it would go exactly through that point if, let's see, that first point has t equal to one, so that would say c plus d equals one. This is the one-one. The second point, t is two, so c plus two d should come out to equal two. But I also want to get the third equation in. And at that third equation, t is three, so c plus three d equals only two. That's the key, is to write down what equations we would like to solve but can't. If we could solve them, that would mean that we could put a line through all three points. And of course, if these numbers, 1, 2, 2, were different, we could do it. But with those numbers, one, two, two, we can't. So what is our equation A x equal B that we can't solve? I just want to say, what's the matrix here, what's the unknown x, and what's the right-hand side? So this is, the matrix is one, one, one, one, two, three. The unknown is c and d. And the right-hand side is one, two, two. Right? I've just taken my equations and I've said what is Ax and what is B. Then- There's no solution. This is the typical case where I have three equations, two unknowns, no solution. But I'm still looking for the best solution. And the best solution is taken is to solve not this equation, Ax equal b, which has no solution. But the equation that does have a solution, which was this one. So that's the equation to solve. That's the central equation of the subject. I can't solve Ax equal b. But magically, when I multiply both sides by A transpose, I get an equation that I can solve, and its solution gives me x, the best x, the best projection, and. and I discover what's the matrix that's behind it. OK. So next time I'll complete an example, numerically, then. Today was all letters, numbers next time. Thank you.