Transcript for:
Understanding Gradient Vectors and Applications

Hello everyone, welcome to week three of MTH 201. Last week we learned about multivariable functions and partial derivatives. To expand on that, we're going to learn about the gradient vector and its applications, like directional derivatives and optimization. In this lecture we're just going to define what that gradient vector is, and to do that we'll put the definition up on screen. The gradient vector can be expressed either way shown here.

We're going to use the nabla operator, this upside-down triangle, quite commonly in this course. Just as a refresher: for the standard two-input multivariable functions we've been looking at, the notations f_x and f_y represent the first-order partial derivatives in each direction. So with respect to x is this partial derivative, and with respect to y is this partial derivative. All the gradient does is collect those two into a vector with two elements, because we have two inputs. The first element is just our partial derivative with respect to x,

and the second element is the other partial derivative. The gradient doesn't have to be defined only for functions with two inputs; we can have as many inputs as we like. To define the gradient in those cases, say with n inputs, we'd have n first-order partial derivatives we could calculate, and we just collect them all in a vector with n elements. But we'll mostly be sticking to these multivariable functions with two inputs.
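For reference, the definition shown on screen can be written out like this (reconstructing the notation from the narration):

```latex
% Gradient of a two-input function f(x, y):
\nabla f(x, y) =
\begin{pmatrix} \dfrac{\partial f}{\partial x} \\[8pt] \dfrac{\partial f}{\partial y} \end{pmatrix}
= \begin{pmatrix} f_x \\ f_y \end{pmatrix}

% General case with n inputs x_1, \dots, x_n:
\nabla f(x_1, \dots, x_n) =
\begin{pmatrix} \dfrac{\partial f}{\partial x_1} \\ \vdots \\ \dfrac{\partial f}{\partial x_n} \end{pmatrix}
```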

So let's go ahead and do a quick example where we calculate the gradient of one of these multivariable functions. We have this function up here, and what we're looking to do is calculate the gradient, which is just the collection of the first-order partial derivatives. In this case we have just two.

Before we differentiate and find these partial derivatives, let's expand out this middle term to make that a bit easier. So (y minus one) squared expands to y squared minus two y plus one. Then we still have to be careful distributing the negative three out the front. Negative three times y squared is easy, but negative three times negative two y is going to become plus six y.

And then the constants actually cancel out at the end. Sorry, this term should be a y squared, I didn't carry that over correctly, so the expanded function is just going to be this.

And now we can go ahead and calculate those partial derivatives. With respect to x, it's just going to be 2x, because the first term is the only one with x in it; the rest are just constants as far as x is concerned. Then with respect to y, the first term goes away, and we're left with this.

Hopefully last week gave us enough practice with partial differentiation that finding these partial derivatives isn't too bad. We can collect them together to get our gradient: the first partial derivative here, and then the other one. That's all you have to do. And you might be wondering: if it's that easy to find the gradient, what is it useful for?
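Since the board isn't visible in the transcript, here's a quick sketch that checks the worked example numerically. The function itself is an assumption, reconstructed from the narration (the negative three out the front, the minus three times minus two y giving plus six y, and the constants cancelling) as f(x, y) = x² − 3(y − 1)² + 3:

```python
# Numerical check of the worked example. The function below is an assumption
# reconstructed from the lecture narration, not shown verbatim on screen:
# f(x, y) = x^2 - 3*(y - 1)^2 + 3, whose gradient is (2x, -6y + 6).

def f(x, y):
    return x**2 - 3*(y - 1)**2 + 3

def grad_f(x, y):
    # the partial derivatives worked out by hand, as in the lecture
    return (2*x, -6*y + 6)

def numeric_grad(func, x, y, h=1e-6):
    # central finite differences as a sanity check on the hand calculation
    fx = (func(x + h, y) - func(x - h, y)) / (2*h)
    fy = (func(x, y + h) - func(x, y - h)) / (2*h)
    return (fx, fy)

print(grad_f(1.0, 2.0))          # (2.0, -6.0)
print(numeric_grad(f, 1.0, 2.0)) # should agree to ~6 decimal places
```

If the finite-difference estimate agrees with the hand-derived partials at a few points, the algebraic expansion was done correctly.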

Well, let's take a look at a function we're pretty familiar with at this point: this Pringle-looking function, which I also call a saddle because it looks like a horse's saddle.

To find the gradient here, it's pretty easy. The partial derivative with respect to x is just 2x, and the partial derivative with respect to y is negative 2y.

So from this point we've got this vector defined, but we could have any inputs x and y. Let's visualize a whole range of inputs by looking at a contour plot and putting these gradient vectors over the top of it. Something like this.
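Here's a small sketch of what's behind that picture: evaluating the saddle's gradient, grad f = (2x, −2y), on a grid of (x, y) points. In the lecture these arrows are drawn over the contour plot (something like matplotlib's quiver would do that); here we just tabulate them:

```python
# Tabulate the gradient field of the saddle f(x, y) = x^2 - y^2 on a small
# grid. Each entry is the arrow you'd draw at that point on the contour plot.

def grad_saddle(x, y):
    return (2*x, -2*y)

xs = [-1.0, 0.0, 1.0]
ys = [-1.5, 0.0, 1.5]
field = {(x, y): grad_saddle(x, y) for y in ys for x in xs}

for (x, y), (gx, gy) in sorted(field.items()):
    print(f"({x:5.1f}, {y:5.1f}) -> ({gx:5.1f}, {gy:5.1f})")
```

Notice, for example, that the arrow at (0, −1.5) is (0, 3): no x component, pointing straight up, which is exactly the point we look at next.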

Now we can just choose a point. Let's take x equals 0 and y equals negative 1.5, which would sit somewhere around here on the plot.

You can see we've got arrows all over the place, but the arrows here are pointing upwards, like this. If you had a point but didn't have the arrows drawn, you'd just substitute the point's values into the gradient.

So putting x equals 0 and y equals negative 1.5 in, and writing this up the top here, the gradient at point P has first component zero, because two times zero is just zero, and second component positive three, because negative two times negative 1.5 gives us three. So this vector points three units up in the y direction and zero in the x direction. It has no say in the x direction; it just points straight up along the y-axis, vertically. That's why we've got an arrow pointing up here.
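That substitution, written out as a tiny calculation:

```python
# Plugging P = (0, -1.5) into the saddle's gradient, grad f = (2x, -2y),
# component by component.
gx = 2 * 0          # 2 times 0 is just 0
gy = -2 * (-1.5)    # -2 times -1.5 gives positive 3
print((gx, gy))     # (0, 3.0)
```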

You can do the same to calculate all these vectors we've got over the contour plot. So why is this gradient vector pointing this way? Well (and I'll move my head for this), the gradient vector points in the direction of the maximum rate of change of f.

Another way to put it is that it points in the direction of steepest ascent. It's telling you which way to move from a particular point to increase the value of the function as quickly as possible. If we look over at the visualization on the left, the surface plot, and find that same point, it'll be between negative one and negative two on y and at zero on x, so somewhere around here. It's pointing upwards along y; it looks a bit different on the surface, but it's saying we have to go this way to find the greatest increase of our function.

That might not seem intuitive, because you could also go this way to increase the function, or maybe this way, but those two sides cancel out a little bit, and the arrow heads towards the central apex of the saddle. We could do the same with another point: if we had a point over here, the gradient points this way to find the steepest ascent, and you can see the same sort of thing at this point over here, roughly.

All the arrows over here are pointing this way, to the right, going up the function. So that's roughly what the gradient represents.

So when we calculate it at specific points, it's telling us where we can go for the steepest ascent in the function value.
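We can sanity-check that steepest-ascent claim numerically for the saddle f(x, y) = x² − y² at our point P = (0, −1.5): step a small fixed distance in many different unit directions and see which direction raises f the most. It should come out as the gradient direction (0, 1), straight up along y:

```python
# Steepest-ascent check: at P = (0, -1.5) on the saddle f(x, y) = x^2 - y^2,
# the gradient is (0, 3), so the unit direction (0, 1) should give the
# largest increase of f among all directions for a small step.
import math

def f(x, y):
    return x**2 - y**2

px, py = 0.0, -1.5
h = 0.01  # small step length

best = None
for k in range(36):  # try unit directions every 10 degrees
    th = 2 * math.pi * k / 36
    dx, dy = math.cos(th), math.sin(th)
    gain = f(px + h*dx, py + h*dy) - f(px, py)
    if best is None or gain > best[0]:
        best = (gain, (round(dx, 3), round(dy, 3)))

print(best[1])  # (0.0, 1.0) -- the gradient direction wins
```

Repeating this at other points gives the same result: among all the directions you could step in, the gradient direction produces the biggest increase in the function value.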