Understanding Eigenvectors and Eigenvalues

Eigenvectors and eigenvalues is one of those topics that a lot of students find particularly unintuitive. Questions like, why are we doing this and what does this actually mean, are too often left just floating away in an unanswered sea of computations. And as I've put out the videos of this series, a lot of you have commented about looking forward to visualizing this topic in particular. I suspect that the reason for this is not so much that eigenthings are particularly complicated or poorly explained. In fact, it's comparatively straightforward, and I think most books do a fine job explaining it. The issue is that it only really makes sense if you have a solid visual understanding for many of the topics that precede it. Most important here is that you know how to think about matrices as linear transformations, but you also need to be comfortable with things like determinants, linear systems of equations, and change of basis. Confusion about eigenstuffs usually has more to do with a shaky foundation in one of these topics than it does with eigenvectors and eigenvalues themselves. To start, consider some linear transformation in two dimensions, like the one shown here. It moves the basis vector i-hat to the coordinates 3, 0, and j-hat to 1, 2. So it's represented with a matrix whose columns are 3, 0, and 1, 2. Focus in on what it does to one particular vector, and think about the span of that vector, the line passing through its origin and its tip. Most vectors are going to get knocked off their span during the transformation. I mean, it would seem pretty coincidental if the place where the vector landed also happened to be somewhere on that line. But some special vectors do remain on their own span, meaning the effect that the matrix has on such a vector is just to stretch it or squish it, like a scalar. For this specific example, the basis vector i-hat is one such special vector. The span of i-hat is the x-axis, and from the first column of the matrix, we can see that i-hat moves over to 3 times itself, still on that x-axis. What's more, because of the way linear transformations work, any other vector on the x-axis is also just stretched by a factor of 3, and hence remains on its own span. A slightly sneakier vector that remains on its own span during this transformation is negative 1, 1. It ends up getting stretched by a factor of 2. And again, linearity is going to imply that any other vector on the diagonal line spanned by this guy is just going to get stretched out by a factor of 2. And for this transformation, those are all the vectors with this special property of staying on their span. Those on the x-axis getting stretched out by a factor of 3, and those on this diagonal line getting stretched by a factor of 2. Any other vector is going to get rotated somewhat during the transformation, knocked off the line that it spans. As you might have guessed by now, these special vectors are called the eigenvectors of the transformation, and each eigenvector has associated with it what's called an eigenvalue, which is just the factor by which it's stretched or squished during the transformation. Of course, there's nothing special about stretching versus squishing, or the fact that these eigenvalues happen to be positive. In another example, you could have an eigenvector with eigenvalue negative 1 half, meaning that the vector gets flipped and squished by a factor of 1 half. But the important part here is that it stays on the line that it spans out without getting rotated off of it. For a glimpse of why this might be a useful thing to think about, consider some three-dimensional rotation. If you can find an eigenvector for that rotation, a vector that remains on its own span, what you have found is the axis of rotation. And it's much easier to think about a 3D rotation in terms of some axis of rotation and an angle by which it's rotating, rather than thinking about the full 3x3 matrix associated with that transformation. In this case, by the way, the corresponding eigenvalue would have to be 1, since rotations never stretch or squish anything, so the length of the vector would remain the same. This pattern shows up a lot in linear algebra. With any linear transformation described by a matrix, you could understand what it's doing by reading off the columns of this matrix as the landing spots for basis vectors. But often, a better way to get at the heart of what the linear transformation actually does, less dependent on your particular coordinate system, is to find the eigenvectors and eigenvalues. I won't cover the full details on methods for computing eigenvectors and eigenvalues here, but I'll try to give an overview of the computational ideas that are most important for a conceptual understanding. Symbolically, here's what the idea of an eigenvector looks like. A is the matrix representing some transformation, with v as the eigenvector, and lambda is a number, namely the corresponding eigenvalue. What this expression is saying is that the matrix-vector product, A times v, gives the same result as just scaling the eigenvector v by some value lambda. So finding the eigenvectors and their eigenvalues of a matrix A comes down to finding the values of v and lambda that make this expression true. It's a little awkward to work with at first, because that left-hand side represents matrix-vector multiplication, but the right-hand side here is scalar-vector multiplication. So let's start by rewriting that right-hand side as some kind of matrix-vector multiplication, using a matrix which has the effect of scaling any vector by a factor of lambda. The columns of such a matrix will represent what happens to each basis vector, and each basis vector is simply multiplied by lambda, so this matrix will have the number lambda down the diagonal, with zeros everywhere else. The common way to write this guy is to factor that lambda out and write it as lambda times i, where i is the identity matrix with 1s down the diagonal. With both sides looking like matrix-vector multiplication, we can subtract off that right-hand side and factor out the v. So what we now have is a new matrix, A minus lambda times the identity, and we're looking for a vector v such that this new matrix times v gives the zero vector. Now, this will always be true if v itself is the zero vector, but that's boring. What we want is a non-zero eigenvector. And if you watch chapter 5 and 6, you'll know that the only way it's possible for the product of a matrix with a non-zero vector to become zero is if the transformation associated with that matrix squishes space into a lower dimension. And that squishification corresponds to a zero determinant for the matrix. To be concrete, let's say your matrix A has columns 2, 1 and 2, 3, and think about subtracting off a variable amount, lambda, from each diagonal entry. Now imagine tweaking lambda, turning a knob to change its value. As that value of lambda changes, the matrix itself changes, and so the determinant of the matrix changes. The goal here is to find a value of lambda that will make this determinant zero, meaning the tweaked transformation squishes space into a lower dimension. In this case, the sweet spot comes when lambda equals 1. Of course, if we had chosen some other matrix, the eigenvalue might not necessarily be 1. The sweet spot might be hit at some other value of lambda. So this is kind of a lot, but let's unravel what this is saying. When lambda equals 1, the matrix A minus lambda times the identity squishes space onto a line. That means there's a non-zero vector v such that A minus lambda times the identity times v equals the zero vector. And remember, the reason we care about that is because it means A times v equals lambda times v, which you can read off as saying that the vector v is an eigenvector of A, staying on its own span during the transformation A. In this example, the corresponding eigenvalue is 1, so v would actually just stay fixed in place. Pause and ponder if you need to make sure that that line of reasoning feels good. This is the kind of thing I mentioned in the introduction. If you didn't have a solid grasp of determinants and why they relate to linear systems of equations having non-zero solutions, an expression like this would feel completely out of the blue. To see this in action, let's revisit the example from the start, with a matrix whose columns are 3, 0 and 1, 2. To find if a value lambda is an eigenvalue, subtract it from the diagonals of this matrix and compute the determinant. Doing this, we get a certain quadratic polynomial in lambda, 3 minus lambda times 2 minus lambda. Since lambda can only be an eigenvalue if this determinant happens to be zero, you can conclude that the only possible eigenvalues are lambda equals 2 and lambda equals 3. To figure out what the eigenvectors are that actually have one of these eigenvalues, say lambda equals 2, plug in that value of lambda to the matrix and then solve for which vectors this diagonally altered matrix sends to zero. If you computed this the way you would any other linear system, you'd see that the solutions are all the vectors on the diagonal line spanned by negative 1, 1. This corresponds to the fact that the unaltered matrix, 3, 0, 1, 2, has the effect of stretching all those vectors by a factor of 2. Now, a 2D transformation doesn't have to have eigenvectors. For example, consider a rotation by 90 degrees. This doesn't have any eigenvectors since it rotates every vector off of its own span. If you actually try computing the eigenvalues of a rotation like this, notice what happens. Its matrix has columns 0, 1 and negative 1, 0. Subtract off lambda from the diagonal elements and look for when the determinant is zero. In this case, you get the polynomial lambda squared plus 1. The only roots of that polynomial are the imaginary numbers, i and negative i. The fact that there are no real number solutions indicates that there are no eigenvectors. Another pretty interesting example worth holding in the back of your mind is a shear. This fixes i-hat in place and moves j-hat 1 over, so its matrix has columns 1, 0 and 1, 1. All of the vectors on the x-axis are eigenvectors with eigenvalue 1 since they remain fixed in place. In fact, these are the only eigenvectors. When you subtract off lambda from the diagonals and compute the determinant, what you get is 1 minus lambda squared. And the only root of this expression is lambda equals 1. This lines up with what we see geometrically, that all of the eigenvectors have eigenvalue 1. Keep in mind though, it's also possible to have just one eigenvalue, but with more than just a line full of eigenvectors. A simple example is a matrix that scales everything by 2. The only eigenvalue is 2, but every vector in the plane gets to be an eigenvector with that eigenvalue. Now is another good time to pause and ponder some of this before I move on to the last topic. I want to finish off here with the idea of an eigenbasis, which relies heavily on ideas from the last video. Take a look at what happens if our basis vectors just so happen to be eigenvectors. For example, maybe i-hat is scaled by negative 1 and j-hat is scaled by 2. Writing their new coordinates as the columns of a matrix, notice that those scalar multiples, negative 1 and 2, which are the eigenvalues of i-hat and j-hat, sit on the diagonal of our matrix, and every other entry is a 0. Any time a matrix has zeros everywhere other than the diagonal, it's called, reasonably enough, a diagonal matrix. And the way to interpret this is that all the basis vectors are eigenvectors, with the diagonal entries of this matrix being their eigenvalues. There are a lot of things that make diagonal matrices much nicer to work with. One big one is that it's easier to compute what will happen if you multiply this matrix by itself a whole bunch of times. Since all one of these matrices does is scale each basis vector by some eigenvalue, applying that matrix many times, say 100 times, is just going to correspond to scaling each basis vector by the 100th power of the corresponding eigenvalue. In contrast, try computing the 100th power of a non-diagonal matrix. Really, try it for a moment. It's a nightmare. Of course, you'll rarely be so lucky as to have your basis vectors also be eigenvectors. But if your transformation has a lot of eigenvectors, like the one from the start of this video, enough so that you can choose a set that spans the full space, then you could change your coordinate system so that these eigenvectors are your basis vectors. I talked about change of basis last video, but I'll go through a super quick reminder here of how to express a transformation currently written in our coordinate system into a different system. Take the coordinates of the vectors that you want to use as a new basis, which in this case means our two eigenvectors, then make those coordinates the columns of a matrix, known as the change of basis matrix. When you sandwich the original transformation, putting the change of basis matrix on its right and the inverse of the change of basis matrix on its left, the result will be a matrix representing that same transformation, but from the perspective of the new basis vectors coordinate system. The whole point of doing this with eigenvectors is that this new matrix is guaranteed to be diagonal with its corresponding eigenvalues down that diagonal. This is because it represents working in a coordinate system where what happens to the basis vectors is that they get scaled during the transformation. A set of basis vectors which are also eigenvectors is called, again, reasonably enough, an eigenbasis. So if, for example, you needed to compute the 100th power of this matrix, it would be much easier to change to an eigenbasis, compute the 100th power in that system, then convert back to our standard system. You can't do this with all transformations. A shear, for example, doesn't have enough eigenvectors to span the full space. But if you can find an eigenbasis, it makes matrix operations really lovely. For those of you willing to work through a pretty neat puzzle to see what this looks like in action and how it can be used to produce some surprising results, I'll leave up a prompt here on the screen. It takes a bit of work, but I think you'll enjoy it. The next and final video of this series is going to be on abstract vector spaces. See you then!

Transcript for:Understanding Eigenvectors and Eigenvalues

Transcript for:
Understanding Eigenvectors and Eigenvalues