Okay, so last time we were talking about the dot product. So let me remind you: if you fix some vector v in whatever vector space we're talking about, a vector space V, then we can think about the function f with v fixed which, given some w, just gives you v dot w, right? So what is this function? This function f is a function that eats one vector w, so it eats something from your vector space, and spits out a scalar value.
So, in particular, this is defined for real numbers. But we can extend this definition if we want by just saying: take the transpose of one times the other, and then you can work over whatever field you want. Okay, so we said this is a linear function, right? This is linear, so let me write that down here.
This is linear; in particular, we call this a linear form. And it's like, well, what does it mean for it to be linear? It means this function is going to respect scaling and addition. That is, f of w1 plus w2 will come out to be f of w1 plus f of w2. And f of some constant times w is just that constant times f of w. And sometimes we don't like to do two steps.
So instead of doing that in two steps, we just verify it in one: you append a constant to one of these terms, say f of a constant times w1, plus w2, and you just check that that comes out to be the constant times f of w1, plus f of w2.
It's like, well, why is doing that one step the same as doing these two? Because if you choose the scalar to be 1, you recover the addition property; and if you take w2 to be the zero vector, you recover the scaling property (using that f of the zero vector is 0, which follows from the addition property). Well, since this is over some arbitrary field F (I mean, we're doing it for R, but take some arbitrary field F), you just have to convince yourself that you always have a 1 in your field and a zero vector in your space, and that's true. That was baked into our definitions of field and vector space. So, here we go.
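(Just a quick numerical sanity check of that one-step condition; a minimal sketch with numpy, using arbitrarily chosen vectors, not something from the board.)

```python
import numpy as np

# Check the one-step linearity condition f_v(c*w1 + w2) == c*f_v(w1) + f_v(w2)
# for the linear form f_v(w) = v . w, with arbitrarily chosen vectors.
v = np.array([2.0, -1.0, 3.0])   # the fixed vector v
f_v = lambda w: v @ w            # f_v(w) = v . w

w1 = np.array([1.0, 4.0, 0.0])
w2 = np.array([-2.0, 5.0, 1.0])
c = 7.0

print(np.isclose(f_v(c * w1 + w2), c * f_v(w1) + f_v(w2)))   # True
```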
That's what it means to be linear. But what I want to emphasize is that this function is quite nice, because it's not just linear with respect to w. We could instead define this the opposite way.
We could instead have fixed our w to be some vector in our vector space, and then we could have defined f with w fixed, for any v, to be just v dot w, or, more generally, v transpose times w. Right?
We could define it this way. So here, the first value is fixed, and whatever I'm plugging in becomes the second value in my dot product, or in this matrix multiplication. Here, it's the first guy I'm plugging in, and I'm holding the second guy fixed.
And I want to argue that this is still linear, right? Like, well, let's think about it, you know? Like, what is, so I'm going to claim that this is also linear.
So let's think about it. What would f be of some constant times v1, plus v2? Well, by definition, that's just (c times v1, plus v2) dot w, because w is fixed; f has some fixed w.
And then you just go back to your first course in linear algebra and remind yourself that the dot product distributes across addition. So then you get (c times v1) dot w, plus v2 dot w. And then you should remind yourself that scaling one vector in a dot product is the same as taking the dot product and then scaling. Because if I'm scaling this vector, I'm multiplying all the values in the vector by some term, but then when I dot it, I can pull out that term, I can factor out the c, which would have been the same as first dotting and then scaling by c. So this is exactly the same as your c times f of v1, plus f of v2.
So it's also linear in this first term. Which means, instead of thinking of f as a function from a vector space to R, we could instead define some function g from a vector space cross itself (in this instance we're really thinking about the vector space being something like Rn, so Rn cross itself) to, let's say, the real numbers. So g has two values coming into it, both vectors in Rn, and g of v and w is just v dot w. If we think about this function, it's linear in each argument.
It's linear in the first argument, and it's also linear in the second argument. Is it clear what I mean by that? If you hold one argument constant, the function is linear in the other.
If you hold the other one constant, it's linear in the remaining one, right? So I'm just saying here exactly what I was saying right here, but now thinking of this as a function that has two inputs. So we want to give a name for these kinds of things.
Here we thought about them, here they were linear forms, because they were just functions from a vector space to R. But more generally, if we have some function taking in two vectors that is linear in each argument, I'm going to call it a bilinear form. So here's my definition.
If we're given some vector spaces (I'll do this in complete generality), V and W over the same ground field F, typically for us it's R, but whatever your field is: given some vector spaces V and W over F, we call a function f that goes from V cross W to F bilinear, a bilinear form, if f is linear in both arguments. That is, if I fix some v in my vector space V, then the function g with v fixed, eating input w, which I'll define to be f of v, w, should be linear. And if instead I fix some w in my second vector space W, then the function h with w fixed, eating inputs from V, which I'll define to be just f of v, w, should also be linear.
So when you hold one constant, it's linear in the other term. I can break this down even more. I mean, is it clear what I'm saying about this? Yeah? Okay.
I mean, let's just be pedantic. Let's just really emphasize this. What I'm saying is that f of some constant times v1, plus v2, holding w constant, will just be c times f of v1, w, plus f of v2, w.
That's what it means to be linear in the first argument. And here's the other half: f of v with some constant times w1, plus w2, will come out to just be that constant times f of v, w1, plus f of v, w2.
Okay, I think we kind of beat this horse dead at this point. So let's see an example of this. I mean, we have dot product, right?
So if these are both guys inside of Rn and Rn, we know how that plays out. But let's try and think, what if we want different kinds of vector spaces? These don't have to be the same vector space. What if I want different vector spaces?
Can we think of a map that will take in values from Rm and Rn and kick out a single real value?
How are we going to define a map like that? It seems kind of strange. Well, any direction we might try?
I mean, rather than trying to do Rm and Rn in general, let's just make it really simple. What if I just want to do, like, R2 and R3?
What kind of a map might do this? Yeah. Okay, so you think, okay, 2 and 3, let's try a 2 by 3 matrix and see what that gives us.
So, 2 rows, 3 columns; just give me such a matrix, I don't know, like 1, negative 3, 2 across the top and 0, 4, negative 1 across the bottom. Okay, there's a matrix. But now let's remember what we want to do. I want to start out with vectors in here and in here and get out a real number. So my first vector, call it v, should be something that has two values; maybe we'll say it has values v1 and v2.
That vector should live inside of R2. And my second vector, w, let's say it's some vector with values w1, w2, w3, should live inside of R3. And somehow we want to take these two vectors, have them play with this matrix, and get out something inside of R. Okay, so this is a 2 by 3 matrix.
We want to end up with a real number, which you can just think of as a 1 by 1 matrix. This w here is a 3 by 1 matrix, and this v here is a 2 by 1 matrix. So how can we line stuff up so it comes out to be a 1 by 1?
Well, do you remember the trick of matrix multiplication? It's like the inside numbers have to agree, right? So if this is a 2 by 3, what should go next to it? The 3 by 1. So I should probably put right here my vector w1, w2, w3. Now that's going to kick out a 2 by 1 matrix.
So then what should I do with that 2 by 1 matrix? Well, here we have a 2 by 1. If I take the transpose of v, so v1, v2 as a row, it's now a 1 by 2. And so when I do matrix multiplication, I'll end up with a 1 by 1.
So that's one way I could do it. And let's just go ahead and see what we come out with. So I'm going to continue this on to the next line. So what do we end up with in this case? It's like...
Well, first do A times the column w1, w2, w3: row times column, so the top row gives one copy of w1 minus three copies of w2 plus two copies of w3, and the bottom row gives zero copies of w1 plus four copies of w2 minus one copy of w3. That's a 2 by 1. Then I multiply by the transpose of my v (and careful, these v1 and v2 should not be vectors; they're real values, the entries of the vector v with values v1 and v2). Row times column again: v1 times the stuff on top, v2 times the stuff on bottom, and you add them together.
So v1 times the top gives me v1 w1 minus 3 v1 w2 plus 2 v1 w3. And v2 times the bottom, remember there's a 0 there, it was 0 times w1 and we can still write that 0, gives 0 copies of v2 w1 plus 4 copies of v2 w2 and then minus a copy of v2 w3. And this is all being added together.
So that would just be a real number, right? Just adding up products of real numbers. And if you look closely at this, notice the coefficients here, 1, minus 3, 2, 0, 4, minus 1, are exactly the entries of our matrix. The coefficient of v1 w1 is the guy in the 1, 1 spot; the coefficient of v1 w2 is the guy in the 1, 2 spot; and so on, the terms line up with the 1,1; 1,2; 1,3; 2,1; 2,2; and 2,3 entries. Right? So really this matrix is giving you the coefficients of this linear combination of products: things from the vector v times things from the vector w.
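(Here's a quick numerical sketch of exactly this example; the particular v and w are arbitrary, chosen just to check that the expansion above matches v transpose A w.)

```python
import numpy as np

# The 2x3 matrix from the example.
A = np.array([[1.0, -3.0,  2.0],
              [0.0,  4.0, -1.0]])

v = np.array([2.0, 5.0])          # some v in R^2
w = np.array([1.0, -1.0, 3.0])    # some w in R^3

# v^T A w, computed as a matrix product...
via_matrix = v @ A @ w

# ...versus the expansion written out above:
# 1*v1*w1 - 3*v1*w2 + 2*v1*w3 + 0*v2*w1 + 4*v2*w2 - 1*v2*w3
via_expansion = (1*v[0]*w[0] - 3*v[0]*w[1] + 2*v[0]*w[2]
                 + 0*v[1]*w[0] + 4*v[1]*w[1] - 1*v[1]*w[2])

print(via_matrix, via_expansion)   # both print -15.0
```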
That is, in general, I can define a function from Rm cross Rn to the real numbers by taking some vector v in Rm, transposing it, multiplying it by some matrix (and, well, I don't want to use M for the matrix because it'll get confused with the m in Rm, so let me call it A), and then multiplying by your vector inside of Rn. That's going to end up looking like some v1 through vm, as a row, times whatever your matrix A is. It's an m by n matrix, so it has values a11 through a1n down to am1 through amn,
times the vector w, which is a vector in Rn. So that just has values w1 through wn. And what happens when you multiply?
Well, just like before, your a11 becomes your coefficient of v1 w1, up until a1n, which is your coefficient of v1 wn; and then you keep going until you get to the bottom, where your am1 is your coefficient of vm w1, through amn, which is your coefficient of vm wn. We happy?
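(In symbols, the claim is that v transpose A w equals the double sum of a_ij times v_i times w_j. Here's a small sketch checking that identity numerically for an arbitrary matrix and arbitrary vectors; just a sanity check, nothing specific to the lecture.)

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 3
A = rng.normal(size=(m, n))   # an arbitrary m x n matrix of coefficients
v = rng.normal(size=m)        # arbitrary v in R^m
w = rng.normal(size=n)        # arbitrary w in R^n

# v^T A w as a matrix product...
matrix_form = v @ A @ w

# ...versus the double sum  sum_{i,j} a_ij * v_i * w_j
sum_form = sum(A[i, j] * v[i] * w[j] for i in range(m) for j in range(n))

print(np.isclose(matrix_form, sum_form))   # True
```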
So you can kind of think of it as a kind of polynomial, where each term is something from v times something from w. I guess in general I could write this as, you know, I'm kind of out of space now, I don't want to cram it in, so I'll just do it up here. The shorthand way of writing it is v transpose A w; that's already pretty brief, but it's just going to come out to be a sum of whatever your coefficients, the entries of your matrix, are, a i j, times the i-th value from the vector v, times the j-th value from the vector w, where i is going to range over the dimension of the first vector, so from 1 to m, and your j's are going to vary from 1 to n. Right? That's all we're saying here. Okay, I spent a lot of time on this very specific example of a bilinear form. I mean, we actually didn't even prove it was bilinear, did we?
Okay, is this bilinear? Is it? Man, like, we're really going to try and shove a lot in. So let's show this is bilinear.
Let's just do it. Give me a new color so it pops out. Let's just show it's bilinear up here. Why not?
So, like, is this bilinear? Well, let's check that it's linear in the first term, v. So I'm defining f of v, w to just be v transpose A w, where A is some m by n matrix, right? Just like here.
Now, instead of having a single v here, what is f of some constant times v1, plus v2, with w? You're like, well, that's just going to be (c v1 plus v2) transpose, times A, times w. And now it's like, okay, we have to think a little bit. What is this saying?
Well, these are vectors, but we're thinking of them as matrices, right? We're thinking about this as an m by 1 matrix plus an m by 1 matrix, all transposed. And so now you need to remember that taking the transpose of a sum of matrices is the same thing as taking the sum of the transposes.
So this is the exact same thing as first transposing each term and then summing them up. And when you have a transpose, you can pull out a constant. That's fine. So we just get this.
It's like, okay, well, now what does that give you? Then you have to remember that matrix multiplication distributes. So everything is going to come out of the fact that matrices are nice and linear, right? The linearity here follows from the fact that matrices behave in nice linear ways.
You distribute the A, you pull out the constant, and you get c times v1 transpose A w, plus v2 transpose A w, which is exactly c times f of v1, w, plus f of v2, w, with the constant out front.
So it's linear in the first term. And you do the exact same argument, and you show it's also linear in the second argument.
Right, exact same, exact same style of proof. So it's linear in both the first and the second term. So yes, this is bilinear. But it's not just that this is bilinear.
This is going to be, for us, like the canonical example of a bilinear form. Anything bilinear is going to look something like this. So let me try and convince you of that. First let's go to the dot product and try and convince ourselves that the dot product looks something like that.
And then I'll give you a proof in general that any bilinear form can be thought of as something like that; I'll make that more precise in a moment. Okay, so we have this guy we defined, the dot product.
You're like, what is that? Well, if you think of your v as just comprising v1 down through vn, and your w as comprising w1 down through wn, then this is just v1 w1 plus v2 w2, up until vn wn.
Can that be achieved by thinking about some v transpose aw? Like, what is our matrix A going to be? What is this matrix A? So we're going to have V1 through Vn. We're going to have W1 through Wn.
What should this matrix A be to give us that? It's just the identity matrix: 1s down the diagonal, 0s everywhere else. It's like, why is it just the identity matrix?
Well, we've already said that the dot product is the same thing as just v transpose w, and v transpose times the identity times w is just v transpose w. So this is also secretly of that form.
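(A one-line numerical check of that claim, with arbitrarily chosen vectors.)

```python
import numpy as np

v = np.array([1.0, 2.0, -1.0, 4.0])
w = np.array([3.0, 0.0, 5.0, -2.0])

# The dot product is v^T A w with A = I, the identity matrix.
print(np.isclose(v @ np.eye(4) @ w, v @ w))   # True
```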
Okay. We can do lots more examples where we come up with a bilinear form that looks nothing like that, and then we convince ourselves it's secretly something like that, right? This is kind of the game we've been playing. But let me just prove it. Let me just prove that, in general, any bilinear form will look something like this.
So here's my theorem. So we want to be given some vector spaces, v and w, over some field. Let's say v has some basis b, with basis vectors v1 up through vm. So this is an m-dimensional vector space.
And we'll say that w has some basis c with basis vectors w1 up through wn. Okay. So it is important that these are finite-dimensional vector spaces.
Otherwise we'd need an infinite matrix, and what's that going to be? So we do need to say we're limiting our attention to finite-dimensional vector spaces.
But if we do that, then I want to say: if f is a map from V cross W to F, and this map f is a bilinear form, then there exists a unique m by n matrix (we'll call it A again, an m by n matrix A with values in your field) such that f of v, w comes out to be the same thing as... I want to say v transpose A w, but hold on: this vector v lives in the vector space V, which isn't necessarily Rm. It could be some crazy vector space. This v could be a polynomial, right? And this w could also be a polynomial. It could be a matrix.
It could be whatever. It could be some combinations of sines and cosines, right? So, like, how are we going to do a polynomial times a matrix, right?
Like, this is going to get absurd. What do we need to do with this V and this W? Yeah, let's write them. Let's write the coordinate vectors in terms of the basis. So we're going to write the coordinate vectors in terms of their basis.
Now this coordinate vector is something that lives inside of Rm, and this coordinate vector is something that lives inside of Rn, right? So now we can do the argument just like before. This will come out to be a 1 by 1 matrix, which we'll treat as a scalar in our field. That's what we want to prove.
Are we all clear on the statement? I want you to be quite convinced that this is really quite a profound theorem. So maybe what I'll do really quick is give you an example of a bilinear form that's very different from the ones we've talked about so far. You could take your vector space V to be something like the space of polynomials of degree at most 3, P3. And you could take your vector space W to be, let's say, the dual space of polynomials of degree at most 3. And so it's like, what is the dual space of polynomials of degree at most 3?
Well, that's just the set of all linear forms on P3. And then you could define your function f from P3 cross its dual space to the real numbers by: just take some polynomial in P3. So take some polynomial, I don't know, some g of x.
You know, take something like, you know, whatever your g of x is. And take whatever your dual form is. So the dual form itself is a linear map.
So I should give it some name. I don't know, maybe I'll call it lambda, for linear.
And we'll define f to be the map that sends that pair to whatever the linear form lambda does to g of x. So for example, an instance of this would be taking some polynomial, like, you know, 3x squared plus 1, and taking some linear map on polynomials. What's a linear map on polynomials?
Give me an example of one. What's a linear form on polynomials? A linear kind of thing you can do to polynomials.
We talked about many last time. Derivative. We can just, what was that? The derivative at x equals seven.
The derivative at x equals seven. Sure, or, I mean, what if we just do the evaluation map? Let's just keep it simple.
You could do a derivative and then the evaluation; that's a nice one. But let me just take the evaluation map you said, at seven; call it e7. And so what this f does is it eats in a polynomial and it eats in a linear form, and it spits out whatever e7 does to 3x squared plus 1, which is just going to be 3 times 49, plus 1, so 147 plus 1, which is 148. Is that how math works?
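(Here's a small sketch of that pairing in code; this is my own illustration, not from the lecture: polynomials in P3 are represented as coefficient lists, and a linear form is just a function from polynomials to numbers.)

```python
# A sketch of the bilinear pairing f(g, lambda) = lambda(g) on P3 and its dual.
# A polynomial a0 + a1*x + a2*x^2 + a3*x^3 is the coefficient list [a0, a1, a2, a3].

def evaluate(poly, x):
    """Evaluate a coefficient-list polynomial at the point x."""
    return sum(a * x**k for k, a in enumerate(poly))

def e7(poly):
    """The 'evaluation at 7' linear form on P3."""
    return evaluate(poly, 7)

def f(poly, linear_form):
    """The bilinear pairing: feed the polynomial to the linear form."""
    return linear_form(poly)

g = [1, 0, 3, 0]        # 3x^2 + 1
print(f(g, e7))         # 3*49 + 1 = 148
```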
And you're like, this is a very strange kind of map. It's a bilinear map, though. You have to convince yourself it's bilinear.
So it's like, well, if you hold the second guy constant and you change the polynomial in the first slot, you'd say that's going to be linear, because the second guy is a linear form. Then it's like, what if you hold the first guy constant and you change the second guy, the linear form? Well, that's also linear in that term, because of how we add and scale linear forms: (c lambda1 plus lambda2) applied to g is, by definition, c times lambda1 of g, plus lambda2 of g. So this comes out to be a bilinear form. So this is a bilinear form.
A bilinear form. But secretly this bilinear form can be achieved by this matrix picture. You know, it's like, that's quite strange. That you can represent this bilinear form with some matrix.
So maybe after I prove this, you know, you should try and think about what that matrix is. This might be a good exercise. Well, I guess that requires you to have a basis for both of these guys. So you would have to think a little bit about what your bases are.
But I think you could do that. That would be a fun little activity. Okay. That would be a neat exam question, too.
Okay. But point being, these bilinear forms can be kind of wild things, but they all just look like multiplication by some matrix A. So let's try and prove this. Here's my proof.
Are we doing okay on time? We'll get through this. Okay.
So here's the proof. Like how can we possibly prove this? Well, the way we're going to prove it is we're going to remind ourselves a bilinear form is something that's just linear in each argument.
So in particular, if we fix some vector v, let's fix it to be one of the elements of our basis. Let's fix v1. The first basis vector, the first element of our basis for the vector space V. Let's fix V1.
Then we get a linear form, defined by: g with v1 fixed, of w, eats some vector w and spits out whatever f of v1, w is.
That's what it means to be bilinear: if you fix the first guy, fix the first argument, it will be linear in the second. But then we said last time that linear forms have a special look to them. What do linear forms look like? How can we write this?
What did we say last time? Linear forms just look like dot products. That was the big takeaway from last time: this is just going to look like some vector a, I'll call it some vector a1, transpose, times w,
for some vector a1. Okay, let's take a second to think about where a1 lives. w is living inside of...
our vector space big W, so it has dimension n. So, written in coordinates, w is an n by 1. So I guess a1 needs to be, well, up here a1 transpose is a 1 by n,
so a1 is also an n by 1. That is, you can think of a1 as living inside the space Fn. Oh, gotta be a little bit careful: w is in some arbitrary vector space W, so I need to first write it in terms of its basis to make sure I get something where this multiplication makes sense.
So it's an n by 1 matrix. We good now? I think we're good. Do you remember this from last time? If you have a linear form, you can write it as like a dot product like this.
That was our big takeaway from last lecture. And so this isn't just true for holding V1 constant. We can do likewise.
You know, g holding v2 constant, of w, we can just define to be f of v2, w. This is now a linear form in w, and so there's going to be some a2 transpose times w that gives you this. And so on, all the way down to holding the last guy constant: g holding vm constant, of w, will just be f of vm, w, which is just some vector am, transpose, times w. Because these are all linear forms.
Because, when you hold the first argument fixed, f is linear in the second. Happy? Well then, we have all these vectors now, so let's just put them all together.
So what I'm going to do is define my big matrix A to be the guy whose rows are a1, a2, down through am, written as rows. Notice there are m rows on this guy. And how many columns are there?
Well, each of these lives inside of Fn, so there are n entries in each row, so there are n columns. So now I have an m by n matrix.
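(In symbols, the matrix being assembled here is just the ai stacked as rows.)

```latex
A \;=\;
\begin{pmatrix}
  a_1^{\mathsf T} \\ a_2^{\mathsf T} \\ \vdots \\ a_m^{\mathsf T}
\end{pmatrix}
\in F^{m\times n},
\qquad\text{so that}\qquad
f(v_i, w) \;=\; a_i^{\mathsf T}\,[w]_{\mathcal C}
\quad\text{for } i = 1, \dots, m.
```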
But, like, check this out. How would I get a1 transpose, that first row, back out of A? Like, how do I get that guy?
Well, the first row of this matrix is a1, written as a row (you can think of these rows as the transposes of the ai). So if I take the row vector 1, 0, 0, ..., 0 and multiply it by my matrix A, that row times my matrix just recovers a copy of a1. And my a2 transpose is just the row vector 0, 1, with 0s everywhere else, times A, and so forth. But what is 1, 0, 0, ..., 0? That's the same thing as if I took v1 and wrote it in terms of my basis B. Because v1, being my first basis vector of B, would be written as one copy of v1 plus zero copies of everything else.
So this is just v1 written in terms of B, transposed, times A. And this next guy is just your v2 written in terms of your basis B, transposed, times A. So what do we really have here? We really have that this first guy, f of v1, w, is just v1 in terms of your basis B, transposed, times A, times w in terms of the basis C.
And this guy down here is just, well, a2 transpose, we said, is v2 in terms of your basis B, transposed, times the matrix A, so f of v2, w is that times w in terms of your basis C, and so on. This last guy is secretly just the same thing as vm written in terms of your basis, transposed, so that's just the row 0, ..., 0, 1, times your matrix A, which recovers the last row of A, and then times w in terms of your basis C. Okay, so then what do we end up with? Well, this tells you what happens if you do f of v1 with w, we know what happens if you do f of v2 with w, all the way down to f of vm with w. But in general, what will f of some arbitrary v with w look like? Well, v can be written in terms of our basis vectors. v is something in V, so just write it as some linear combination of your basis vectors.
It's just some number of copies of v1, plus some number of copies of v2, all the way through some number of copies of vm; say c1 copies, c2 copies, up through cm copies. So this is just f of c1 v1 plus ... plus cm vm, with w.
And all we know about f is that it's bilinear. So far, we've been using the fact that it's linear in the second component. Now it's time to use the fact that it's linear in the first component. Since it's linear in the first component, this is the same thing as c1 copies of f of v1 with w,
plus, all the way through, cm copies of f of vm with w. So we're just using linearity; this is by the fact that f is linear in the first component. We can break it down like this. But we just said...
what f of v1, w looks like. It looks like these guys: v1 written in terms of your basis, transposed, times A, times w written in terms of its basis,
all the way through vm written in terms of your basis, transposed, times A, times w written in terms of its basis. And now what? Well, now we can use the fact that expressing things in terms of a basis is a linear operation.
So this is the same thing as just c1 v1 plus ... plus cm vm, written in terms of its basis, transposed, times A, times w written in terms of its basis, which is just the original vector v written in terms of its basis, transposed, times A, times the vector w written in terms of its basis. You're like, that's a bit of a hot mess, but what did we just show? We just proved that any bilinear form has this form you were looking at before, right? You take v in terms of its basis, that's an m by 1; you transpose, so it's a 1 by m; you multiply it by your m by n matrix; then you multiply by w in terms of its basis, which is an n by 1; and you get out your scalar. So we've just shown that any bilinear map looks like the guys we were looking at before.
That is, it just looks like a nice linear combination: a sum of some coefficients, whatever the entries of A are, the a i j, times the coordinate values in terms of your basis B and your basis C, so your coordinate b i times your coordinate c j, where v is written as b1 v1 plus ... plus bm vm, and the c's are the coordinates of your w, so w is c1 w1 plus ... plus cn wn. It just looks like this, where you're summing over your i's and your j's.
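(To make that recipe concrete, here's a small numerical sketch; it's my own illustration: take any bilinear form on Rm cross Rn, build the matrix by plugging in pairs of standard basis vectors, so the i, j entry is f of e_i, e_j, and check that f of v, w equals v transpose A w.)

```python
import numpy as np

# Any bilinear form on R^3 x R^2; this one is written directly as a sum of
# products of coordinates, not as a matrix, just to make the point.
def f(v, w):
    return 2*v[0]*w[0] - v[0]*w[1] + 5*v[1]*w[1] + 3*v[2]*w[0]

m, n = 3, 2
# Build the matrix the theorem promises: its (i, j) entry is f(e_i, e_j).
A = np.array([[f(np.eye(m)[i], np.eye(n)[j]) for j in range(n)]
              for i in range(m)])

rng = np.random.default_rng(1)
v, w = rng.normal(size=m), rng.normal(size=n)
print(np.isclose(f(v, w), v @ A @ w))   # True
```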
So, exactly like we saw before. Even for this strange guy, you'll be able to find some matrix like this and think about it in this way. Okay, we'll stop there for today.