Optimization on Quaternions

Let's see. Last time we talked about deterministic optimal control algorithms, kind of a summary, and in particular we talked about LQR versus MPC, and then DDP or indirect methods versus DIRCOL. And then we did this quaternion crash-course intro. Any questions about any of that stuff? Cool.

Okay, so today we're basically going to double down on our quaternion conversation. We'll do a little recap, and then we're going to dig into optimization with quaternions. If you want to be fancy, this is optimization on Lie groups; technically this is optimization on SU(2). Anyone know what that is? Yeah, the special unitary group, which is what quaternions are. That's the double cover of SO(3). SO(3) is rotations, right? We said the quaternions are this weird double-cover thing: there are two quaternions for every rotation. They're actually a different group, called SU(2); that's the unit quaternions. Fun fact. As an aside, you don't need to take notes on this; it's sort of random extra information for fun. I feel like I should say something more about this now: SU(2) is 2-by-2 complex matrices (unitary, determinant one), and it turns out they're the unit quaternions. Okay, cool.

So, just a quick recap from last time; we're going to dig into this stuff a bit more and hopefully try to give you some geometric insight today. Think of these guys as 4D unit vectors, and they have a multiplication rule that basically acts like matrix multiplication. So anytime you're wondering what to do with these things, just think about them as rotation matrices: if I want to combine two in some order, think about them as if they were matrices, and that kind of works out. We're going to use the asterisk, or star, for this product, and this looks like
some weird generalization of dot and cross products. I've got scalar and vector parts, and in terms of those guys the product looks like this: it includes both the classic dot and cross products in it. I mentioned as an aside last time that this was actually invented first and it freaked people out, so Gibbs kind of split this apart and invented dot and cross products.

Then we talked about how we can rewrite this thing as a couple of different matrices. We can write down the left-multiply matrix, L(q); think about this as the left Jacobian of that quaternion multiply. There's a right version as well, R(q). The whole idea here is just so we can write the product out as L(q1) q2 or R(q2) q1. The reason is that this is the Jacobian: if I want to take the derivative of an expression like this, now I have a way of computing that Jacobian matrix, which we're going to use today to start doing optimization with these guys.

Okay, so that's multiplication. Then we talked about this conjugate idea, which is the equivalent of the transpose for a rotation matrix: it gives you the opposite rotation, so it's the rotation about the same axis but in the opposite direction.

Yeah? Oh yeah, that's the skew-symmetric cross-product matrix, so v1-hat times v2 equals v1 cross v2. It's a 3x3 skew-symmetric matrix that's literally just what you get if I write down v1 cross v2 in components and pull out the v1 stuff into a matrix, such that this matrix times v2 gives me v1 cross v2. That's how you figure it out; that is the definition. It's the cross-product matrix. It's in the notes from last time; go check it out. It shows up all over the place, and people call it different things. I think it's fairly standard in a group-theory context to call it the hat map or the hat matrix. I kind of like that definition.
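To make the recap concrete, here is a minimal NumPy sketch of the hat map and the left- and right-multiply matrices. It assumes the scalar-first stacking q = [s; v]; the function names `hat`, `L`, `R`, and `conj`, and the example values, are illustrative choices rather than anything from a particular library.

```python
import numpy as np

def hat(v):
    """Skew-symmetric cross-product matrix: hat(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def L(q):
    """Left-multiply matrix: L(q1) @ q2 is the quaternion product q1 * q2."""
    s, v = q[0], q[1:]
    M = np.zeros((4, 4))
    M[0, 0] = s
    M[0, 1:] = -v
    M[1:, 0] = v
    M[1:, 1:] = s * np.eye(3) + hat(v)
    return M

def R(q):
    """Right-multiply matrix: R(q2) @ q1 is the same product q1 * q2."""
    s, v = q[0], q[1:]
    M = np.zeros((4, 4))
    M[0, 0] = s
    M[0, 1:] = -v
    M[1:, 0] = v
    M[1:, 1:] = s * np.eye(3) - hat(v)
    return M

def conj(q):
    """Conjugate: same axis, opposite rotation direction."""
    return np.concatenate(([q[0]], -q[1:]))

# Two example unit quaternions (arbitrary values, normalized here).
q1 = np.array([1.0, 2.0, -1.0, 0.5])
q1 /= np.linalg.norm(q1)
q2 = np.array([0.3, -0.7, 0.2, 1.0])
q2 /= np.linalg.norm(q2)

product = L(q1) @ q2         # q1 * q2 via the left-multiply matrix
same_product = R(q2) @ q1    # the same product via the right-multiply matrix
identity = L(q1) @ conj(q1)  # q * conj(q) is the identity quaternion
```

Because `L(q1)` does not depend on `q2`, it is also the Jacobian of the product with respect to `q2`, and likewise `R(q2)` with respect to `q1`, which is the reason for writing the product this way.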
The reason I like calling it the hat map is that you can then also cleanly define the unhat, which goes the other way: take the matrix and turn it back into a three-vector. So hat and unhat; there's a nice sort of symmetry there. As for the other ways of writing it, a lot of people will write it as brackets with a little cross-product superscript or something, which I think is less clean. Anyway, that's how we're going to write it; it shows up in a lot of places. Thought of another way, it is the Jacobian matrix of the cross product: if I need to take a derivative of v1 cross v2, that's how I get that Jacobian. So for most of this stuff where I make up these weird matrices, the whole reason is taking Jacobians.

Okay, cool. So this conjugate thing, I write it like this. We talked about this last time: I want to just flip the direction of the rotation around. If you look at the original definition of the quaternion in terms of the axis-angle vector, the top is a cosine and the bottom is a sine. So if I flip the sign of theta, nothing happens to the top, because cosine is an even function, but it flips the sign on the bottom. That's the definition. And then last time we made up this matrix called T, for transpose, that's just got a one and then a minus identity down here, so that T q gives you q conjugate. Again, we're just making up matrices all over the place here so that we can take derivatives, so that I know how to get the Jacobian of these things.

Similarly, we have an identity quaternion, which corresponds to a rotation of zero. If I plug that into the definition with the cosine and the sine, and I guess I'll write this q_I or something like that, it's just one on top and zero on the bottom.

Okay, cool. This is all review stuff. There's also a sort of hat map for quaternions. Remember, the hat map gives us the cross product. If I remember the definition of the quaternion product, and I just wanted a cross product and I wanted to stick it in this quaternion
multiplication, what would I do? If I just want v1 cross v2, how can I get that out of this guy? Well, if the scalar parts are zero, the vector part of the product is just the cross product; see that? So similarly, I'm going to define this sort of hat map for quaternions that acts the same way, which is why I'm calling it that. If I have some omega, we're going to make it a quaternion with zero scalar part and just the vector part, and we're going to similarly make up a matrix for that, called H, such that H omega gives us that. H is just a zero row and then an identity block. And again, all this stuff is just so we can take derivatives.

Yeah? Yeah, what we're going to do in practice here is, for all the quaternion stuff, we're going to do everything in matrix-vector form, and I'm going to make sure that it's clear when we do it. The goal here is to define a bunch of matrices so that I can just live in matrix-vector land. I learned this all the hard way: having weird notation for quaternion products and stuff messes you up as soon as you start wanting to mix and match vectors and quaternions and matrices. So in my experience the right way to do this is to basically always use matrix-vector notation. What I'm doing right now is setting us up so that we have all the matrix-vector stuff we need to always write this down in matrix-vector notation. Then you know how to code it up, it's super clear, and you don't have any weird mix-and-match stuff. To be super clear, I made all this notation up, all the matrix notation that's in here, because we were doing a lot of this stuff and mixing and matching, and it gets super confusing and super buggy and messy. It's just way cleaner to double down on matrix-vector stuff and live there all the time.

Yeah? Yes, so this is another very, very annoying thing. We mentioned there are conventions for body-to-world versus world-to-body, and this is
annoying, and there are also mixed conventions on how you stack the quaternion. The convention in robotics is scalar on top, vector on the bottom; the convention in a lot of older aerospace literature is the opposite. And really, really frustratingly, there's also a convention in a lot of old aerospace papers that writes the rotation matrix as body-to-world but the quaternion as world-to-body. It is infuriating. So, roughly speaking, there are a half-dozen different conventions for all this floating around in the literature, and you can never trust what you read in a paper. If you're going to implement a paper that has anything with rotations in it, any quaternion stuff, basically go read it, but then rederive all the specific details yourself. You cannot trust anything, I'm sorry to tell you. Yeah, life is hard. This is very, very annoying. There's something like 50 years' worth of conflicting stuff from multiple fields; roughly speaking, aerospace, robotics, and physics all do different things, and mix and match in frustrating ways. So you basically can't code up anything straight out of a paper; you need to go run down the math yourself.

Having done this and experienced much pain, I'm trying to show you a coherent, consistent way to write it down and do it all such that you don't have these issues. So I highly recommend sticking with these conventions. I'm basically showing you, roughly speaking, the robotics conventions, which are consistent, and then also this matrix-vector way of writing everything, which I highly recommend sticking with because it makes it clean to write code in MATLAB, Python, whatever, and it avoids a lot of weird issues. End of rant, sorry. Yeah, it's super frustrating; there's a lot of terrible stuff out there.

All right, cool. I think that's all we need. So that's the recap from last time. Now what I want to do is try to give you some geometric insight into
what's going on here a little bit, and in particular to try to get us towards what we need to compute gradients and start doing optimization on these things. The first thing I want to talk about is a little bit of geometry. I already kind of said this at the beginning of class, but it should be somewhat obvious to us that these things are unit vectors in R^4, and so the set of all unit quaternions is a sphere in R^4. This is called the three-sphere, and the reason is that the surface of the sphere is three-dimensional: it's embedded in R^4, but it has three degrees of freedom on the surface. It's just like the classic sphere that we know, like the globe; that's actually called the two-sphere, because there are two degrees of freedom on the surface of the sphere.

Okay, so let's say I've got this sphere and I have this quaternion q over here. So write that down: q lives on a sphere in R^4. How about q-dot? Where does q-dot live? If I have this quaternion and I'm moving around, my rigid body is tumbling or whatever, and I'm thinking about the derivative of q, q-dot, what space does that live in? Well, in this picture, where does it live? Yeah, exactly: it lives in the tangent plane to this sphere. That's exactly right. So let's draw a little cartoon of that. This kind of guy, that's what q-dot looks like, and if I imagine drawing a little tangent plane centered here, that's where it lives. So yes, it is three-dimensional, but if I were to draw it in here, at any given point it lives in this 3D tangent plane. Does that make sense? No, you're not wrong: it is three-dimensional, it has three degrees of freedom. In spirit that is absolutely right; it only has three DOF. Okay, cool. So that is the right picture: q-dot lives in the tangent plane at q. Okay, so that's clear.
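One way to see why q-dot has to lie in the tangent plane, written out as a short derivation. This is a sketch that assumes only the unit-norm constraint on q; the set notation T_q S^3 for the tangent plane is introduced here for convenience.

```latex
q^\top q = 1
\;\;\Longrightarrow\;\;
\frac{d}{dt}\left(q^\top q\right) = 2\, q^\top \dot{q} = 0
\;\;\Longrightarrow\;\;
\dot{q} \in T_q S^3 = \{\, x \in \mathbb{R}^4 : q^\top x = 0 \,\}
```

So q-dot is a four-vector subject to one linear constraint (it is orthogonal to q), which leaves exactly three degrees of freedom.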
So, everybody, we've got this weird sphere situation: at any given q there's some tangent plane, and that's where the derivative lives. Okay, cool. I think where we're headed with this is that, while q itself might be 4D in some sense, or similarly the rotation matrix is this weird higher-dimensional thing, the derivatives are always in this 3D tangent space. Okay, so that's important.

So let's now think about kinematics a little bit. We wrote this down last time, so now I'm hopefully going to give you a little more geometric insight into what's going on. We know that q-dot is this tangent-plane situation with three DOF, but if I'm just going to write q-dot down as a vector, that is a four-vector, so it lives in R^4 somehow. But then we all know that the angular velocity, omega, which is kind of the time derivative here, is 3D. It's a three-vector; it's what you measure on your gyro. So how do we reconcile these things? This is kind of funky.

Last time we wrote down this kinematic equation that I didn't really derive in any detail, and we're going to circle back and try to derive it a little bit now. So we have this guy: q-dot is this 1/2 q times omega-hat sort of thing, and specifically, in our matrix notation, which we're going to try to stick with, it's q-dot = 1/2 L(q) H omega. Cool.

Okay, so here's how you get there from this q-dot picture. I've got my sphere situation, I've got my q over here, and let's say my q-dot is this guy. Look at this kinematic equation and think about it for a second. By the way, that equation looks roughly the same for the rotation matrix case; we did that too, same deal. What this is actually doing is defining omega, if that's clear. If you want to think about it geometrically a little bit, it's actually saying I'm going to choose to define omega a certain way. So omega is defined
in the body frame, right? It's my gyro that's bolted to my robot or whatever. So the geometric picture is that omega is always defined in the body frame, and in this context, because q is the body-to-inertial rotation, that corresponds to being defined at the identity quaternion, which, roughly speaking, I would visually think about as being at the North Pole. Based on the stacking of that thing it can be weird, and I also think about it as the North Pole because I came up in aerospace, where they write it the other way. Whatever; not that important, but I think you can kind of intuit this as the North Pole.

So what's going on here? Omega is always defined in this one particular tangent plane, say at the North Pole, and what this kinematic equation is doing is rotating it from the North Pole down to our current q. Does that make sense? I'm taking this North Pole vector and hitting it with q to rotate it down to the current tangent plane, where the current q is. Does that make sense, everybody?

Yeah? Yeah, that is a good question; that is not yet obviously clear. It's coming from the definition in terms of the axis-angle that we originally wrote down, with the cosine of theta over two and sine of theta over two. That's an algebraic answer. There's also a geometric answer that's a little weird and subtle that maybe we'll get to in a little bit, so bear with me.

But this is kind of the geometric picture, at least, for where this is coming from. Let me move this over here and write this down. Omega is always written, let's say, in the tangent plane at the identity. Then the kinematics equation up here rotates it to the tangent plane at q so it can be added on to that q, if that makes any sense. That's some intuition for what's going on there, and I want to give a
little bit more intuition for this stuff by digging in on it. All this weird quaternion stuff is actually the 3D generalization of complex numbers in the plane. If anyone's seen complex phasors, you can think of those as rotations in R^2. So I want to write that down real quick and hopefully give you some flavor for where this is coming from.

Hopefully everyone took some circuits or EE class and saw this kind of picture. We've got axes, which I'll write as x and y here, and say we've got some vector v at some angle theta. This guy I can write as cos(theta) + i sin(theta); it's a unit vector. So I can think of these as the real and imaginary components of a complex number, or as x and y components. If I stack these guys, I get v = [cos(theta); sin(theta)], and from there v-transpose v equals one. So this is the complex plane; everyone's seen this before.

Now what I want to do is write down v-dot. I just chain-rule this guy: it's partial v, partial theta, times theta-dot. If I write down partial v, partial theta from that expression right there, that's my kinematic Jacobian: it's [-sin(theta); cos(theta)] times theta-dot. Cool. I can also think about this another way. I can slightly expand this if I write my theta-dot down not as a scalar but as a 2D vector, [0; theta-dot], and then what multiplies it is the standard 2D rotation matrix. Now it should hopefully be pretty clear what the picture is doing. This is a rotation matrix, obviously, and this is the complex-number equivalent of that hat map, where I take things and throw them in the vector part, which is the imaginary part here. So if I draw the picture, this is my kinematic equation for this complex phasor situation, analogous to that quaternion situation.
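The two ways of writing the 2D kinematics can be checked against each other numerically. This is a minimal NumPy sketch; the function names and the values of `theta` and `theta_dot` are arbitrary choices for illustration.

```python
import numpy as np

def v_of(theta):
    """Unit vector (phasor) at angle theta: [cos(theta); sin(theta)]."""
    return np.array([np.cos(theta), np.sin(theta)])

def rot2(theta):
    """Standard 2D rotation matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

def v_dot_chain_rule(theta, theta_dot):
    """v-dot from the chain rule: (dv/dtheta) * theta_dot."""
    return np.array([-np.sin(theta), np.cos(theta)]) * theta_dot

def v_dot_rotated(theta, theta_dot):
    """v-dot as the rotation matrix applied to [0; theta_dot].

    [0; theta_dot] is the rate written in the tangent line at the identity
    (theta = 0); rot2(theta) rotates it to the tangent at the current v.
    """
    return rot2(theta) @ np.array([0.0, theta_dot])

theta, theta_dot = 0.7, 2.0
a = v_dot_chain_rule(theta, theta_dot)
b = v_dot_rotated(theta, theta_dot)
tangency = v_of(theta) @ a  # v-dot is orthogonal to v, so this is zero
```

Both forms give the same vector, and it is orthogonal to v, which is the 2D version of "the derivative lives in the tangent space".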
I've got my v over here, and this is theta. Theta-dot is defined along the unit y direction, [0; 1], so that's writing it at the identity over here. So this is theta-dot, and then this is v-dot over here, at wherever my current vector is. Okay, hopefully that made some sense or clarified the situation somewhat. So this kinematics equation up here rotates theta-dot from the tangent plane at the identity, which here is a tangent line at theta equals zero, to the tangent at the current v. Cool, that make sense? The quaternion case is this picture generalized to 3D. Hopefully that gives you a little geometric insight as well. Any questions about this stuff so far?

Okay, now we're going to talk about differentiating quaternions. There are two key things to keep in your head while we do this. The first one is that the derivatives of rotations live in this tangent space, even if our base representation is not 3D, like a quaternion or a rotation matrix. The derivatives are really 3D: they're 3D tangent vectors living in this particular tangent space, and we want to write them that way. The second big idea is that rotations compose by multiplication, not addition. The key idea here is that rotations are not vectors. They're a weirder, different thing: matrices, quaternions, whatever you like. They do not add. In particular, the reason they do not add is the physical reason that they don't commute. If I do a 90-degree pitch then a 90-degree roll, I end up here; if I flip the order and do a 90-degree roll then a 90-degree pitch, I'm
over here. That non-commutative nature is reflected in matrix multiplication: if I flip the order of two matrices, I get a different answer. It's not reflected in vector addition: if I flip the order of addition, I get the same answer. So it can't be addition; it has to be something else, and in particular the math that captures that fact is multiplication. So: compose by multiplication, not addition. Okay, so we agree on those things.

Now, to get us into derivative land, we're going to start talking about infinitesimal rotations, really tiny rotations, and look at what that does for us. So let's think about a little tiny quaternion, one that corresponds to a tiny rotation. It's still a unit quaternion; it's a small-angle unit quaternion. Let's write that as delta q. Remember the definition in terms of axis and angle over here, which is what we had from last time: delta q = [cos(theta/2); a sin(theta/2)]. What I'm going to do now is just plug in a tiny theta, and it doesn't matter what the axis is. What happens to the cosine in a small-angle situation? It's approximately one. What happens to the sine? It's approximately its argument, and in this case there's a two in there, so it's theta over two. So the bottom of this turns into 1/2 a theta. And remember what we called a times theta before: that's our axis-angle vector, phi. So this thing is approximately a one on top, and remember [1; 0] is the identity, so really what this is saying is that it's the identity plus half of the axis-angle vector down here in the vector part. That's how I'm going to write that, and phi is a small axis-angle vector.

Okay, cool. So this is how we're going to bootstrap our small-angle stuff. Let's keep chugging here. This is equal to, as I just said, an identity quaternion plus 1/2 times the axis-angle put in the vector part, which is our hat map. So this is q_I + 1/2 H phi, based on our definitions. Cool. Okay, so the next thing we're going to do is look at a small perturbation. Yeah?
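The small-angle approximation can be checked numerically. This is a minimal NumPy sketch assuming the scalar-first stacking; `axis_angle_to_quat`, `small_angle_quat`, and the test vector `phi` are illustrative names and values.

```python
import numpy as np

# H places a 3-vector in the vector part of a quaternion: H @ phi = [0; phi].
H = np.vstack([np.zeros((1, 3)), np.eye(3)])
q_I = np.array([1.0, 0.0, 0.0, 0.0])  # identity quaternion

def axis_angle_to_quat(phi):
    """Exact unit quaternion for the axis-angle vector phi = a * theta."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return q_I.copy()
    a = phi / theta
    return np.concatenate(([np.cos(theta / 2)], a * np.sin(theta / 2)))

def small_angle_quat(phi):
    """First-order approximation: delta_q ~ q_I + 1/2 H phi."""
    return q_I + 0.5 * H @ phi

def small_angle_error(phi):
    """Norm of the difference between the exact and first-order quaternions."""
    return np.linalg.norm(axis_angle_to_quat(phi) - small_angle_quat(phi))

phi = 1e-3 * np.array([1.0, -2.0, 0.5])    # a small rotation, in radians
err = small_angle_error(phi)
err_smaller = small_angle_error(phi / 10)  # shrink the rotation tenfold
```

Shrinking phi by a factor of ten shrinks the error by roughly a factor of a hundred, which is what a first-order approximation should do: the neglected terms are second order in the angle.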
As far as this goes, really, right here it's just a definition: I take the vector and shove it in the vector part, with no scalar part. Algebraically it also does the same thing as the cross-product matrix. So in quaternion land it means take the vector and make a zero-scalar-part quaternion; in rotation-matrix land it means the skew-symmetric cross-product matrix. They do the same stuff. There is a deeper thing there, which I feel like is maybe a long digression into some group theory. If you care and you want to go look up some stuff: the unit quaternions, or the rotation matrices, are the Lie group, and this hat-mappy stuff, these 3D vector things, are elements of the Lie algebra. If you want to Google that, cool. We can talk about it later, and I can point you to some stuff. It's probably too much of a tangent for right now, but I'm happy to nerd out on that.

Okay, so let's now try composing this little rotation with a finite rotation q and see what happens. The idea here is I want to compute some perturbed q-prime, which is going to be q times a delta q. Using our notation, I should write it like this: q-prime = L(q) (q_I + 1/2 H phi), using the delta q that we just figured out. If I now distribute that out, it's q, that's the identity term, plus 1/2 L(q) H phi. Cool. So again, this is what I get if I take a tiny infinitesimal rotation and perturb some arbitrary quaternion by it. If this were a normal vector x and I were perturbing it by a little delta x, it would just be x plus delta x. The idea here is that this is the only difference between quaternions and vectors as far as these small perturbations, a.k.a. derivatives, go. So in q plus delta q, effectively the price you pay is that you pick up
an extra Jacobian when you're trying to represent the perturbation. So instead of x plus a tiny delta x, it's q plus this Jacobian times an itty-bitty little phi. This Jacobian thing shows up a lot, and we're going to give it its own name: we're going to call L(q) H this G(q). To be clear, this guy is 4x3, and it's mapping our little 3D tangent-plane perturbation into a 4D thing that I can tack onto a quaternion. We're going to call this guy the attitude Jacobian.

A quick note on this. We derived all of this with this quaternion picture, and we weren't super explicit about what the phi guy is. Is that an axis-angle? Is that the vector part of a quaternion? Is it something else? I guess we kind of said it was a small axis-angle. It turns out that for tiny rotations, for infinitesimal rotations, it doesn't matter. All three-parameter attitude representations, whether it's Euler angles, axis-angle, the vector part of your quaternion, Gibbs vectors, Rodrigues parameters, whatever, there's a bunch of these things, all linearize the same way for infinitesimal small-angle stuff. So for this attitude Jacobian, it doesn't matter which parameterization you use to derive it; it comes out the same at the end of the day, which is kind of interesting.

Yeah? So with that you would pick up a permutation, and literally that just corresponds to flipping the ordering. So I guess it's the same up to a permutation, if you want to be weird and flip your axes around. Usually we define roll, pitch, yaw as rotations about the x, y, z axes, so under tiny rotations that corresponds to an axis-angle vector. Make sense? So for infinitesimally small magnitudes, if you define the things the same way in terms of the coordinate axes, it carries over. If you want to be weird and order your Euler angles in some arbitrary way, fine; then you just need a permutation matrix that flips the ordering of the vector.
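Here is a minimal NumPy sketch of the attitude Jacobian, assuming G(q) = L(q) H with the scalar-first stacking and phi taken as a small axis-angle vector. The helper names and example values are illustrative, and the helpers are repeated so the block runs on its own.

```python
import numpy as np

def hat(v):
    """Skew-symmetric cross-product matrix."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def L(q):
    """Left-multiply matrix: L(q1) @ q2 is the quaternion product q1 * q2."""
    s, v = q[0], q[1:]
    M = np.zeros((4, 4))
    M[0, 0] = s
    M[0, 1:] = -v
    M[1:, 0] = v
    M[1:, 1:] = s * np.eye(3) + hat(v)
    return M

# H @ phi = [0; phi]: the quaternion "hat map" as a 4x3 matrix.
H = np.vstack([np.zeros((1, 3)), np.eye(3)])

def G(q):
    """Attitude Jacobian (4x3): maps a 3D tangent perturbation into R^4 at q."""
    return L(q) @ H

def axis_angle_to_quat(phi):
    """Exact unit quaternion for the axis-angle vector phi."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(theta / 2)],
                           (phi / theta) * np.sin(theta / 2)))

q = np.array([0.5, -1.0, 2.0, 0.3])
q /= np.linalg.norm(q)                    # an arbitrary unit quaternion
phi = 1e-3 * np.array([1.0, -2.0, 0.5])   # small axis-angle perturbation

q_exact = L(q) @ axis_angle_to_quat(phi)  # true perturbed quaternion q * delta_q
q_linear = q + 0.5 * G(q) @ phi           # first-order version: q + 1/2 G(q) phi
```

For a unit q, the columns of `G(q)` are orthonormal and orthogonal to q, so they form a basis for the tangent plane at q. That is one way to read "G maps the 3D perturbation into the 4D space the quaternion lives in".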
Does that make sense? So yeah, roll-pitch-yaw usually is roll about x, pitch about y, and then yaw about z, and so for very, very small rotations the axis-angle vector with x, y, z components is the same thing. Okay, that was a little weird aside; hopefully that made some sense. I'm going to just write that down real quick. Note: we can use any three-parameter rotation representation we want for phi in this stuff. It turns out they all linearize the same way, which is kind of interesting, and there's some deeper group-theory thing happening there. But in principle you can use any one you want when we're doing this stuff, and that attitude Jacobian doesn't really change, at least not beyond some weird permutation or scaling. I guess you could also decide to not use radians or something weird, and then there would be a scale factor that you would have to impedance-match. But it is the same, with the caveat of up to a permutation/scaling. Okay, good point.

So basically, just to illustrate that: the axis-angle looks like this. It would be cos(norm(phi)/2) on top, and then over here, to get the axis, you do phi over norm(phi), and then similarly times sin(norm(phi)/2). You could also use the vector part of the quaternion: I just take the vector part out and use that as my representation, in which case the bottom of the quaternion is just phi, and to get the top I compute the square root of 1 minus phi-transpose phi. It's just whatever it needs to be to normalize this guy. That's perfectly valid. I'm just showing you how to reconstruct q from any one of these three-parameter things. The other one that's really common is called the Gibbs vector, or the Rodrigues parameters. Has anyone heard of this before? Yes? So the way you write the mapping back from that one is this: there's a normalization out front, and then it's
just this guy. It's almost like using, um... So this is just showing you that, if I stare at this for a second and squint my eyes, they all give you the same thing. Any one of these you want to use, it doesn't matter. If you wanted to be sadistic and use Euler angles for this, same story: you'll get some bizarro mapping in here with a bunch of trig functions, but if you linearize it for small angles you get the same attitude Jacobian. Roughly speaking, the differences are all higher order, so under linearization you throw out all the higher-order terms and you just get this at the end, which is clear here. Cool.

Okay, so let's see what these are. This is axis-angle, which we already talked about; this is the vector part of q; and this one is the Gibbs vector, or Rodrigues parameters. I always get that wrong; it's a French name, Rodrigues, and it ends with an s instead of a z. For us, just for consistency, we're going to use the vector part of q. It's easy and less work, and it doesn't really matter, but these Gibbs vectors also have some nice things about them.

Okay, cool. So now that we did all this, there's sort of a meta-message here, which is that the strategy for doing calculus with these things is literally to just take the derivative as if q were a normal vector, get your Jacobian, and then slap a G on it, slap an attitude Jacobian on. That maps it from whatever weirdo 4D quaternion derivative it would have been into a 3D derivative with respect to this axis-angle guy, which is nice and works the way you expect. I will say it again: take the derivative as if q were a regular 4D vector, get your Jacobian or your gradient, and then everywhere there would be a quaternion, just slap an attitude Jacobian on it. Then everything works: chain rule works, all the usual stuff. I'm going to write that out with a few examples right now just to illustrate it. This whole thing lets us differentiate with respect to quaternions by inserting this attitude Jacobian guy
in the right places, and I will make very explicit right now what that means with a bunch of examples. Literally, anywhere there's a quaternion you put an attitude Jacobian, and you can think of this as impedance-matching you from quaternion land back into 3D tangent-space land, which is where we want to live.

Okay, so let's take an f(q), and let's think about a cost function now. We have f(q), which is a function that eats a quaternion and spits out a scalar; think cost function. This guy, we're going to say, is from H to R. H here is the blackboard-bold H, which is the quaternions; it's from Hamilton, I guess, who invented them. So this is a function from unit quaternions to R. Let me just note that this is H for Hamilton.

So for this kind of thing, which to be clear is the gradient of a scalar-valued function with a quaternion argument, I have del f, and I'm going to slightly abuse notation: if I write del f, I mean the impedance-matched 3D gradient, whereas if I write the partial f, partial q notation, I mean the 4D derivative. The other way to think about this is that the attitude Jacobian is basically dq/dphi evaluated at q, so it's just chain-ruling there. That's another way to think about it, but there's a rich geometric picture there too. So this guy is just partial f, partial q, times G(q). It's a function that goes from a quaternion to a scalar; I take the regular Jacobian and just slap on the G. Now this multiplies an axis-angle over here, a little phi, and I get out on the other side the delta scalar that I expect. Okay, so that's if it's a cost-function-y thing, a scalar-valued function.

Now what we're going to talk about is, let's say we've got an f(q) that maps us from a quaternion input to a quaternion output. The scenario here is a dynamics function: I've got rigid-body dynamics, my discrete-time dynamics, where I have q_k coming in
and q_{k+1} going out, right? So it's mapping from a quaternion to a quaternion, and this is a Jacobian now. The recipe here is: I want to put in a little φ on the right, a little infinitesimal rotation perturbation, and I want to get out a φ', an infinitesimal rotation perturbation on the output as well. So I need to put G's on both sides of this guy. The idea is that everywhere there's a quaternion I put a G: if I've got a quaternion on the input, I put a G on the right; if I've got a quaternion on the output, I put a G on the left. So I've got φ' = [G(f(q))^T (∂f/∂q) G(q)] φ. Keep in mind the output G needs to be evaluated at the original output, so this is not G(q) but G(f(q)), whatever the original output is, and it shows up as the transpose on the left, which is the normal way this kind of stuff works. This thing in the brackets, the whole thing, is what we're going to call ∇f here. In the quaternion-to-quaternion case this is going to be a 3x3 thing, because it maps from a perturbation on the input to a perturbation on the output. The idea is that I'm transforming the input on the right and the output on the left.

Cool. The last one we need is the Hessian of a scalar-valued function; the Hessian of a cost function is where this comes up. It's similar: it's a scalar-valued function, so it eats a quaternion and spits out a scalar, a cost function. This we're just going to write as ∇²f, and it's pretty much what you expect. Remember, the Hessian is something that I want to sandwich with Δx's on both sides, the quadratic form, so I've got to transform the inputs on both sides, and I pick up G's on both sides of this guy. But it turns out I also pick up an extra term that's coming from ∂G/∂q, which shows up when I take this thing
and take a second derivative: I get a chain rule situation, so I pick up another term. Another way of thinking about this is that this is like the full Newton Hessian versus the Gauss-Newton one: there's an extra second-order curvature term that comes out. It looks like this: this is the 3x3 identity, this whole thing is a scalar, and this term is coming from the ∂G/∂q, the next derivative of that attitude Jacobian thing.

Okay, that's all the things you need. Now you know how to do gradients, Jacobians, and Hessians, so you have all the ingredients for applying this to all the stuff we've done so far. You can do Newton's method, you can do LQR, you can do DDP, all this stuff, and you just bolt this G matrix on wherever you need it. Anywhere a quaternion shows up on the input or output of the function, take the Jacobians or the Hessians like normal and stick the G's in the right spots. That's the message. Now we can do Newton's method and DDP and SQP, all the things. Any questions about this so far?

Now we're going to do this on a real problem, and I'll give you some code. We're going to do the simplest example that does this stuff that I could think of. We're not going to go to a full-up trajectory optimization problem; instead we're going to do a problem called Wahba's problem. Has anyone heard of this before? It's a very classical aerospace problem. It's attitude estimation, I should say, not pose estimation: we're not doing the position part. We can abstract this a little bit, but the way to think about it is that I have a camera in the body frame, on the robot, or on the spacecraft, which is classically where this comes from, and I'm observing a
bunch of vectors to known features in the environment, known landmarks, say, in my camera. Let's forget camera details right now and just assume I've calibrated my camera, so I know what the body-frame stuff is. Say I observe the vector to that clock, the vector to that exit sign, whatever, in my camera, so I know what those vectors are in the body frame. For this I just need unit vectors: I need to know the unit-vector directions to those things in my body frame. If I have a map of the world, I know where those things are in the world frame already, from the map. Now I have pairs of vectors: the vectors I observe in the camera frame, and the vectors I know from the map. Make sense? I'm now going to set up and solve a least-squares problem for the attitude given a bunch of these vector pairs. That is Wahba's problem, and I'll write down the math for it right now: given a bunch of vectors to known landmarks in the environment, determine the robot's attitude.

This has an interesting history also. The problem comes from the early 60s, the birth of the space program and stuff, and it was originally posed as an open challenge in a journal. The name Wahba is from Grace Wahba, who's a statistician at, I think, the University of Wisconsin. I think she's still alive; she retired some years ago. It is known as Wahba's problem, it is associated with her, and it was this most random aside that she didn't care about at all, a random thing that she did as a grad student or something. She went on to have a long career in statistics and never cared about this, and she became famous in the aerospace community and
has this thing named after her. Kind of funny.

Okay, so here it is written out in math. It's a least-squares problem for the attitude, the q, and it has this cost that we're going to minimize, which is the least-squares loss. Let's say we have m of these vectors. I have unit vector k in the N frame, the inertial or world frame, whatever you want to call it, and then I have the same vector observed from my camera in the body frame, and I take the least-squares loss over all these vectors. To be super clear, this is assuming I know my location already and am just solving for the attitude. If I have a map of the room, I know where I'm standing in the map, and I know all these features, then I say: okay, I see that thing in my camera, I know where I am, so I know what this vector is in the inertial frame and I know what it is in the camera frame. I have a few of those. Is that clear to everybody? That's the setup.

We can abstract this slightly. A classic thing in least squares is to define a residual function. It's just all the errors stacked into one giant vector, which we're going to call the residual r. Then the sum of the squares of the residual, a.k.a. r^T r, is the thing we're minimizing. Let's write some notes on here: these are known vectors in the world frame, a.k.a. from the map; these are observed vectors in the body frame, from the camera; and this r thing is the residual vector. Who has never heard of residuals in least squares before? Hopefully you've seen it a little bit. So this is a pretty classic nonlinear least-squares problem with just the one twist that the thing we're solving for isn't a vector, it's a rotation. So there's just a little extra twist on the classic least
squares setup. Let's write this thing down in a little more detail, the residual. Remember, these x's are unit vectors. We don't care about their magnitude, we're just trying to estimate attitude, so directions is the way to think about them. This residual function is just me stacking up all the individual observations: there's the first feature vector that I observe, the second one, dot dot dot, up to the m-th one. That's all the x's, and it's the same q in each one. Okay, so this is my residual vector.

The way we're going to solve this is the standard Gauss-Newton thing for least squares, and to get there we need the Jacobian, so let's do that. We want what we're going to call ∇r. We're going to do the stuff we just said: diff it with respect to q as if q were just a vector, and then tack on this attitude Jacobian guy. To clarify some sizes: we want ∇r to be 3m by 3; ∂r/∂q is going to be 3m by 4; and G(q) is going to be 4 by 3. That's how the dimensions work out, since we have m observations.

As an aside, we've kind of touched on this, but to make sure we're super solid we're going to recap Gauss-Newton for least squares. A generic least-squares problem is min over some x of some cost function J, which is the two-norm of this residual vector squared, which is where "least squares" comes from, and which is equivalent to r^T r. The way you solve this is you take the partial of your loss function with respect to x, a.k.a. the gradient of the loss, which for least squares has a very particular form. I'm just going to do my usual chain rule stuff: that's r^T ∂r/∂x (up to a constant factor of 2, which cancels out of the step anyway). And then, so
that's the gradient. I'm going to write down the Hessian next. That looks like ∂²J/∂x². I'm going to get (∂r/∂x)^T (∂r/∂x), and then, if I do this using all that Kronecker stuff that we talked about before, a second term that turns out to be (I ⊗ r^T) times the second derivative of r. That's the stuff we talked about last time, right?

Remember, we talked about this with DDP versus iLQR: if I kept this term, it would be the full Newton method. But this term is annoying, I don't like it, so I'm just going to throw it out, and that's the Gauss-Newton method. By the way, this is the original context where Gauss-Newton comes from: it's a statistics least-squares fitting thing, classically.

So I get the following Newton-step thing, where I approximate the Hessian with this Gauss-Newton stuff. It's going to be Δx = -((∂r/∂x)^T (∂r/∂x))^{-1} (∂r/∂x)^T r(x). That's the whole thing, where ∂r/∂x is the Jacobian of our residual function. This also looks like the normal equations for linear least squares, if it were a linear least-squares problem. Questions about that? Hopefully you've seen this before someplace. Everyone good? That's classic Gauss-Newton; it looks like the Newton's-method stuff that we've been doing.

Okay, so now we're going to do it for Wahba's problem with the quaternion tricks. It's literally just going to be this, but we're going to swap the x for a q and plug in our attitude Jacobian business. I think this basically sums up all the weird things there are to know about optimizing over rotations in the simplest setting possible. From here you can go do it on all the stuff we've been doing, and you will, on your homework. We're going to initialize this with some guess q_0, and I'll write out the full pseudocode. Then we're going to do the following stuff. We're going to compute our gradient, which to be clear is this ∇r thing. Then we're gonna take
our Gauss-Newton step. Or sorry, that first one is, yeah, the Jacobian of the residual. Then we're going to compute our step, which is going to be one of these little axis-angle guys, φ. It's going to be this kind of deal: φ = -(∇r^T ∇r)^{-1} ∇r^T r. Cool, that's our little three-parameter thing. Then we're going to update q with a multiplicative update, and the way we're going to do that is to multiply it on from the right, like with all the tricks that we've been playing. I'll write it both ways here: it's q times the unit quaternion we get from φ, where we use the vector-part-of-the-quaternion parameterization to build it, which is equal to L(q) times this guy.

The key thing is that it's a multiplicative update. The reason we do this is that if q is a unit quaternion when I start, and this guy is always guaranteed to be a unit quaternion by construction, then this multiplication is always guaranteed to give a unit quaternion by construction. There are lots of fancy math ways of talking about this, but the gist is that this algorithm guarantees that every iterate is a unit quaternion. Another way of saying this is that I stay on the manifold: I'm always staying on the unit sphere. I never have to worry about extra constraints, I never have to worry about the unit-norm constraint. It's all built in.

In general you would also do a line search on this, using your Armijo stuff, the usual stuff. But it turns out this problem is benign enough that we don't really need one, so I'm not going to do it here; it's super straightforward to do with the standard stuff we've been doing. Then you repeat this whole thing until your convergence criterion is satisfied. In this case the loop runs while something like the norm of the residual is bigger than some tolerance. Okay, questions about this? It's classic Gauss-Newton. Literally, this could have been
the original Gauss-Newton algorithm, with two exceptions: slap on this attitude Jacobian here, and then, when I apply the update, if this were normal vector land I would get a Δx and I'd add it on, but instead I'm going to take this three-parameter guy, turn it into a unit quaternion, and multiply it on. Cool, everyone good with that?

Okay, code. I swear this stuff's cool; this is how you make backflips happen and stuff. You'll do that on the homework, and I'll flash maybe a little bit next time. So this is all the random functions we derived before: you've got your hat map, the cross-producty thing, you've got your L, and so on. This is all just stuff to get the attitude Jacobian and the rotation matrix as a function of q.

The way I'm going to do this is I'm going to generate a random quaternion. The simple way to do that is to generate a random four-vector (with normally distributed entries) and then normalize it. It turns out that's actually a good way to generate uniformly distributed quaternions; it's statistically legit. Then we're going to generate a bunch of random vectors, normalize them so they're unit vectors, and rotate them to generate our observations. I arbitrarily elected to generate 10 vectors here. Here's my residual, which is exactly what we just wrote down: it's the inertial-frame one minus the body-frame one rotated by our q, and I return that whole thing stacked up. I'm going to make a random initial guess, again just generating another random unit quaternion.

Then here's my Gauss-Newton method. By the way, another good reason for this full attitude Jacobian thing is that it plays nice with autodiff. So here's the algorithm: I make a random guess, I compute my residual, and then to get my Jacobian I just diff the residual function as if q were a normal vector. So I can just use ForwardDiff on this to get the Jacobian, and then after I do ForwardDiff I just slap
on the attitude Jacobian. There's the Jacobian I actually want to use. Here's my Gauss-Newton step, and then I take this guy, turn it back into a unit quaternion here, and multiply it on. Then I'm just going to loop this until the residual is small. Six iterations.

Okay, here's a fun one, check this out. This is a huge error. What about this one? Look at that, it's exactly right. So what happened there? Yeah, exactly. Remember, we're doing Newton, so it's going to converge to whatever is closest to the initial guess, and since q and -q correspond to the same rotation, it literally just depends on what my initial guess was. If I'm closer to this one I'll go here; if I'm closer to that one I'll go there. They're equivalent. If I were to repeat this a handful more times with different random guesses, I'd get it to go both ways in general. Yeah, so this time it went the other way. That's kind of interesting.

In practice there are some annoying things with the double-cover stuff. This is all stuff that you need to be aware of and careful about, and we'll talk about it a little bit next time. Mainly, what can happen is this: if you're doing a controller that's trying to track a desired attitude, and you're naive about writing your cost function down and you're trying to track a particular q, you can have a situation where your state estimator spits out -q, and then your controller freaks out and tries to literally wrap you around by 2π. This has a name in the control lingo: it's called the unwinding problem. If you're not careful, you can have a controller that thinks you have a 2π error and tries to flip you around when your error is actually, effectively, zero. But you just have to be a little careful and then it's not a problem. Cool, anybody have questions about this?
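
As a rough illustration of the Gauss-Newton loop for Wahba's problem walked through above, here is a minimal pure-Python sketch. It is not the code shown in lecture, and it makes several assumptions: quaternions are stored scalar part first; central finite differences stand in for the automatic differentiation (ForwardDiff) mentioned in the lecture; the step φ is turned into a unit quaternion by normalizing [1; φ] rather than by the vector-part parameterization, so that large steps stay well-defined; there is no line search; and every function name here is hypothetical.

```python
import math

def qmul(a, b):
    """Hamilton product a * b, scalar part first."""
    s1, x1, y1, z1 = a
    s2, x2, y2, z2 = b
    return [s1*s2 - x1*x2 - y1*y2 - z1*z2,
            s1*x2 + x1*s2 + y1*z2 - z1*y2,
            s1*y2 - x1*z2 + y1*s2 + z1*x2,
            s1*z2 + x1*y2 - y1*x2 + z1*s2]

def rotate(q, x):
    """Vector part of q * [0; x] * conj(q). For unit q this is Q(q) x; it is
    quadratic in q, so it can also be differentiated as a plain 4-vector map."""
    s, a, b, c = q
    return qmul(qmul(q, [0.0] + list(x)), [s, -a, -b, -c])[1:]

def G(q):
    """Attitude Jacobian G(q) = L(q) H, a 4x3 matrix (list of rows)."""
    s, a, b, c = q
    return [[-a, -b, -c],
            [ s, -c,  b],
            [ c,  s, -a],
            [-b,  a,  s]]

def residual(q, xN, xB):
    """Stacked residual: world-frame vector minus rotated body-frame vector."""
    r = []
    for n, b in zip(xN, xB):
        Qb = rotate(q, b)
        r += [n[i] - Qb[i] for i in range(3)]
    return r

def residual_jacobian(q, xN, xB, h=1e-6):
    """3m x 3 Jacobian: (dr/dq, with q treated as a plain 4-vector) times G(q).
    Central differences are an assumption standing in for autodiff."""
    cols = []
    for j in range(4):
        qp, qm = list(q), list(q)
        qp[j] += h
        qm[j] -= h
        rp, rm = residual(qp, xN, xB), residual(qm, xN, xB)
        cols.append([(p - m) / (2.0 * h) for p, m in zip(rp, rm)])
    Gq = G(q)
    return [[sum(cols[j][i] * Gq[j][k] for j in range(4)) for k in range(3)]
            for i in range(3 * len(xB))]

def det3(M):
    return (M[0][0] * (M[1][1]*M[2][2] - M[1][2]*M[2][1])
          - M[0][1] * (M[1][0]*M[2][2] - M[1][2]*M[2][0])
          + M[0][2] * (M[1][0]*M[2][1] - M[1][1]*M[2][0]))

def solve3(A, b):
    """Solve the 3x3 system A x = b by Cramer's rule."""
    d = det3(A)
    return [det3([[b[i] if j == k else A[i][j] for j in range(3)]
                  for i in range(3)]) / d for k in range(3)]

def wahba_gauss_newton(q, xN, xB, tol=1e-9, max_iters=50):
    """Gauss-Newton with a multiplicative update; every iterate stays unit norm."""
    for _ in range(max_iters):
        r = residual(q, xN, xB)
        if max(abs(v) for v in r) < tol:
            break
        J = residual_jacobian(q, xN, xB)
        A = [[sum(row[k] * row[l] for row in J) for l in range(3)] for k in range(3)]
        g = [sum(row[k] * ri for row, ri in zip(J, r)) for k in range(3)]
        phi = solve3(A, [-gk for gk in g])         # phi = -(J^T J)^{-1} J^T r
        n = math.sqrt(1.0 + sum(p * p for p in phi))
        q = qmul(q, [1.0 / n] + [p / n for p in phi])  # multiply on the right
    return q
```

To try it, pass a unit-quaternion initial guess together with matched lists of world-frame and body-frame unit vectors. Because q and -q encode the same rotation, a result is best compared against a reference quaternion through the absolute value of their dot product rather than entry by entry.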
