Hello, and welcome to this lecture. In the previous lecture we saw very important characteristics of estimators: bias, variance, and squared-error risk, or mean squared error. We saw how, in simple cases, one can calculate them using basic probability, the properties of i.i.d. samples, and the properties of the underlying distributions, and how we can get very interesting answers for such estimators. In this lecture we are going to start looking at the design of estimators. How do you approach the design of an estimator? A very popular and simple method is the method of moments, and that is the first method we are going to see.

First, let me remind you what parameters and moments are. Suppose you have an X distributed according to some function f_X(x); this could be a PMF, a PDF, whatever. There are parameters θ1, θ2, and so on in this PMF or PDF which we do not know. Given the PMF or PDF, one can always compute the moments E[X], E[X²], maybe E[X³], E[X⁴], and so on, and you will get answers in terms of the parameters. The distribution is expressed in terms of the parameters, so when you compute moments, you again get answers in terms of the parameters. This is not very difficult, as the examples show. If you look at Bernoulli(p), E[X] = p; if you want E[X²], you will get something else, but that is the idea. The Poisson distribution has a parameter λ, and E[X] = λ. The exponential distribution is again parameterized by a λ, and E[X] = 1/λ. The normal distribution Normal(μ, σ²) has two parameters, so you can look at the two moments: E[X] = μ and E[X²] = μ² + σ². Then there are some other distributions we saw briefly earlier, such as the gamma distribution, whose PDF is proportional to x^(α−1) e^(−βx). If you compute the first and second moments of the gamma distribution, here is what you get: E[X] = α/β and E[X²] = α²/β² + α/β².

So what is the point here? These PMFs and PDFs have parameters, you can find the moments, and the moments will be functions of those parameters. Here is another one, in case you are wondering. Suppose X is Binomial(n, p). Then E[X] = np. What is E[X²]? You may remember that the variance is np(1 − p), and E[X²] is the second moment, so it is n²p² + np(1 − p). These are all various distributions; hopefully you can refresh your memory about them. If you look at the beta distribution, there are two parameters a and b, and E[X] and E[X²] can be expressed in terms of a and b. So this is moments and parameters: your PMF or PDF has unknown parameters, and the distribution moments E[X], E[X²], and so on, computed by summation or integration, can be expressed in terms of those parameters. That is the first thing to understand in the method of moments.
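As a quick sanity check on this parameters-to-moments link, here is a minimal simulation sketch (mine, not the lecturer's), assuming numpy is available; the values α = 3, β = 2 are arbitrary choices. One caveat: numpy's gamma sampler takes a shape and a scale, where scale = 1/β.

```python
import numpy as np

# Arbitrary (assumed) parameter values, purely for illustration.
alpha, beta = 3.0, 2.0

rng = np.random.default_rng(0)
# numpy parameterizes the gamma by shape and scale, with scale = 1/beta (the rate).
x = rng.gamma(shape=alpha, scale=1.0 / beta, size=1_000_000)

print(np.mean(x), alpha / beta)                      # E[X]   = alpha/beta
print(np.mean(x**2), (alpha**2 + alpha) / beta**2)   # E[X^2] = alpha^2/beta^2 + alpha/beta^2
```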
The next thing is moments of samples. Suppose I give you n i.i.d. samples from some distribution. You can always compute the sample moments. The k-th sample moment, which I will denote by capital Mk, is a function of the n samples: it is simply Mk = (1/n)(X1^k + X2^k + ... + Xn^k), the average of the k-th powers of the samples. If you put k = 1, you get the sample mean; k = 2 gives the second sample moment; k = 3 the third sample moment; and so on. It is just the average of suitable powers of the samples that you observe.

The important thing to note is that the sample moment as written here is a random variable, not a constant. For one sampling instance x1, ..., xn, the random variable M1 takes a particular value m1; for that particular instance you get a particular first sample moment, k-th sample moment, and so on. If you get another sampling instance, you get another value. So sample moments are random variables: they take different values for different samplings, they have a distribution, and maybe there is some concentration. Sample moments and distribution moments are therefore different things. A distribution moment is a fixed constant, a function of the parameters (μ² + σ², or μ, or α²/β² + α/β², whatever), whereas a sample moment is a random variable whose value keeps varying.

The last line here is something for you to think about: we expect Mk to take values around E[X^k]. We have seen various results justifying this, particularly in the earlier lectures on the central limit theorem, the weak law of large numbers, and concentration. All of that was designed to convince you that, for larger and larger n, the sample moments take values close to the corresponding distribution moments. So we intuitively expect the two to be similar: we expect Mk to take values around E[X^k].

The method of moments exploits this concentration, this sense in which the distribution moments and the sample moments should be roughly equal. That is exactly the procedure: we equate the sample moments to the expressions for the distribution moments in terms of the unknown parameters, and we solve for the unknown parameters.

Suppose you have just one parameter θ. Usually you need only one moment. I say usually because I am expecting the first moment to be a function of θ. What if it is not a function of θ, what if it is a constant? For instance, you may have a normal with zero mean and variance σ²; in that case the first moment is zero, not a function of θ, so you cannot use it. You then have to keep searching until you find a moment that is a function of the θ that you want; in the normal case, the second moment will work.
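As a small illustration (again my sketch, assuming numpy), the k-th sample moment is a one-liner, and drawing two independent samples shows that Mk really is a random variable that varies from sampling to sampling:

```python
import numpy as np

def sample_moment(x, k):
    """k-th sample moment: M_k = (1/n) * sum_i x_i^k."""
    x = np.asarray(x, dtype=float)
    return np.mean(x ** k)

rng = np.random.default_rng(1)
a = rng.normal(2.0, 1.0, size=50)   # one sampling instance
b = rng.normal(2.0, 1.0, size=50)   # another sampling instance
print(sample_moment(a, 1), sample_moment(b, 1))   # two different values of M_1
print(sample_moment(a, 2), sample_moment(b, 2))   # two different values of M_2
```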
Remember, what I said was "usually": it works most of the time, but sometimes, if a moment is not a function of θ, you cannot use it. So suppose you have one unknown parameter θ. Your sample moment will be some value m1; you actually have your samples, so you can compute m1. Your distribution moment E[X] will be some function of θ, say f(θ); we will see exact examples as we go along. You solve for θ from f(θ) = m1. Once you find a solution, you simply replace the value m1 by the random variable M1, and that gives you the estimator. Remember, an estimator is also a random variable; you cannot define it for just one sampling instance, you have to say what you do in general for samples. That is the one-parameter case; it will become clear when I do a precise example, but at a high level this is what we do. If you have two parameters, you might use the first and second moments. If they end up being good functions of θ1 and θ2, you can invert those functions, find θ1 and θ2 in terms of m1 and m2, and then, for the estimators, replace m1 and m2 by M1 and M2. That is the method of moments.

So let us see some examples; that is what will make it click. Say X1, ..., Xn are i.i.d. Bernoulli(p). The unknown parameter is p, and E[X] = p. What is the method-of-moments equation? It is p = m1. Another way to think of it: you replace E[X] by the sample moment m1, and you simply get p = m1. Once again, this is a very simple example, but it is still important. Since E[X] = p, the first moment of the distribution is a function of the parameter; in this case that function is the identity, so the moment equals the parameter itself. If you compute the sample moment, you expect it to be close to p; that gives the equation p = m1, and for the estimator p̂ you replace m1 with M1, going to the random-variable version in some sense, and you get p̂ = M1. It is just a simple little equation; hopefully you see how I get to this. You could cut short all these steps and directly see that p̂ is the sample mean, but hopefully you see the method of moments working in this fashion: you write down an equation and solve it. In this case the equation was really simple; it can be a little different in other cases, but let us proceed slowly.

The next example is i.i.d. Poisson(λ). Suppose you have n samples from the Poisson distribution; how do you design a method-of-moments estimator? Here too it is very simple: E[X] is directly equal to λ. Whenever a parameter is equal to a moment, it is very easy. The method-of-moments equation is λ = m1, and the estimator λ̂ is simply M1, which is again the sample mean. If a parameter equals a distribution moment, that is the easiest case: you simply replace the distribution moment by the sample moment, and you get your method-of-moments estimator.
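In code, the Bernoulli and Poisson estimators are nothing more than the sample mean. A hedged sketch, assuming numpy; the true values 0.3 and 4.2 are made up for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Bernoulli(p): E[X] = p, so p_hat = M_1 (the sample mean).
x = rng.binomial(n=1, p=0.3, size=1000)
p_hat = np.mean(x)

# Poisson(lambda): E[X] = lambda, so lambda_hat = M_1.
y = rng.poisson(lam=4.2, size=1000)
lam_hat = np.mean(y)

print(p_hat, lam_hat)   # should be close to 0.3 and 4.2
```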
So for both Poisson and Bernoulli, finding the method-of-moments estimator is trivial: λ̂ and p̂ are simply equal to the sample moment M1. The first real difference comes with the exponential distribution. Notice that for the exponential distribution with parameter λ, E[X] = 1/λ; for the first time, things are different. If you replace E[X] with the sample moment m1, you get 1/λ = m1, and now you have to solve this equation. Solving it is quite easy: λ = 1/m1. For the estimator you replace m1 by M1, so λ̂ = 1/M1, which is n/(X1 + ... + Xn). This is the first time we really see the method of moments at work: the equation did not hand you the parameter directly, and you had to do a little inversion to get it. This is part and parcel of the skill you need to pick up for deriving method-of-moments estimators: express the moments in terms of the parameters, invert the equations to get the parameters in terms of the sample moments, and then replace the sample moments with the random variables M1, M2, and so on to get your estimator. This was a simple case; we will see slightly more complicated cases in the rest of this lecture. It is an important skill to pick up, so we will keep seeing more and more examples.

One more example; I may be beating you to death with examples here, but it is important, and hopefully it is clear. Now let us do the normal distribution. There are two parameters here, μ and σ², the first time we are seeing two parameters, so clearly we will need at least two moments. The first two moments work out nicely: E[X] = μ and E[X²] = μ² + σ². In these two equations, what do I do? I replace the distribution moments with the sample moments, giving μ = m1 and μ² + σ² = m2, and then I solve. There are two equations in two unknowns, μ and σ, while m1 and m2 are known from your samples. μ = m1 directly; that one is solved. For σ, put μ = m1, take it to the other side, and take square roots: σ = sqrt(m2 − m1²). The estimator for μ is very easy: (X1 + X2 + ... + Xn)/n. The estimator for σ is a little more involved: you need the second sample moment minus the square of the first, σ̂ = sqrt(M2 − M1²). It is possible to simplify this a little; I am not focusing on the simplification here (an n will come in, you will have X1² and so on, and you can do some algebra). I am not claiming this is the simplest form for σ̂, but it is the clearest and easiest form you can write down. Go through these steps once again and make sure you can reproduce them clearly. This is how you derive the method-of-moments estimator for a normal distribution when you do not know μ and σ: first express the distribution moments in terms of μ and σ, and once you have enough equations, replace the distribution moments with sample moments, solve in reverse to find μ and σ in terms of the sample moments, and then substitute the sample-moment random variables to get your estimators.
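Here is a sketch of the two inversions just derived, for the exponential and the normal, again assuming numpy; the true values 1.5, 2.0, and 3.0 are arbitrary. Note that numpy's exponential sampler takes scale = 1/λ:

```python
import numpy as np

rng = np.random.default_rng(3)

# Exponential(lambda): E[X] = 1/lambda, so lambda_hat = 1/M_1.
x = rng.exponential(scale=1.0 / 1.5, size=1000)
lam_hat = 1.0 / np.mean(x)

# Normal(mu, sigma^2): mu_hat = M_1, sigma_hat = sqrt(M_2 - M_1^2).
y = rng.normal(loc=2.0, scale=3.0, size=1000)
m1, m2 = np.mean(y), np.mean(y ** 2)
mu_hat, sigma_hat = m1, np.sqrt(m2 - m1 ** 2)

print(lam_hat, mu_hat, sigma_hat)   # roughly 1.5, 2.0, 3.0
```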
Hopefully that is clear; it is a simple enough procedure. The next example is a little more complicated, so let me work it out in detail for you: the gamma distribution. Now there are two parameters, α and β, and the first two moments are E[X] = α/β and E[X²] = α²/β² + α/β². So what are my method-of-moments equations? α/β = m1 and α²/β² + α/β² = m2, and from here I have to solve for α and β. How do you do it? One simple way is this: from one of the two equations, can you express one unknown in terms of the other? That is the approach, and in many cases it works. In the first equation, α/β = m1, you can write α = β·m1, and in the second equation you substitute for α in terms of β. Basically, from two equations in two variables you get one equation in one variable: you eliminate one of the variables. Of course there are fancier methods for solving simultaneous equations; we will not ask you about those, since this course is not about solving simultaneous equations, but this basic skill is good to have. So substitute back: you get β²m1²/β² + β·m1/β² = m2. The β² cancels in the first term and one β cancels in the second, so m1² + m1/β = m2. Taking m1² to the other side, m1/β = m2 − m1², or β = m1/(m2 − m1²). You see how the equation worked out; it is slightly more complicated than the previous two cases, not as easy, but nevertheless good enough. What do you do for α? Just substitute back: α = β·m1, so α becomes m1²/(m2 − m1²). Let me just make sure I did not make any mistakes here... all right, not too bad. So β is m1/(m2 − m1²) and α is m1²/(m2 − m1²). Now what are the estimators? The method-of-moments estimators are α̂ = M1²/(M2 − M1²) and β̂ = M1/(M2 − M1²).
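The gamma inversion translates directly to code. A minimal sketch under the same assumptions (numpy; true α = 3, β = 2 chosen arbitrarily; numpy's sampler uses scale = 1/β):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.gamma(shape=3.0, scale=1.0 / 2.0, size=100_000)   # true alpha = 3, beta = 2

m1, m2 = np.mean(x), np.mean(x ** 2)
alpha_hat = m1 ** 2 / (m2 - m1 ** 2)   # alpha_hat = M_1^2 / (M_2 - M_1^2)
beta_hat = m1 / (m2 - m1 ** 2)         # beta_hat  = M_1   / (M_2 - M_1^2)

print(alpha_hat, beta_hat)   # should be close to 3 and 2
```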
These are slightly more complicated estimators. The moment you move to the gamma distribution, the moments are slightly more complicated functions of the parameters, so the estimators also come out as slightly more complicated functions of the sample moments. Still, you would agree it is a simple skill: plug into the equations, express one variable in terms of the other, and solve for the remaining variable.

Let us see one more problem along the same lines. In this case we have Binomial(n, p) with both n and p unknown. You have samples from Binomial(n, p), E[X] = np, and E[X²] = n²p² + np(1 − p), and you have to solve for these. So my equations are np = m1 and n²p² + np(1 − p) = m2. Now solve for n and p from these equations. What is the strategy for solving? Express one variable in terms of the other: the first equation implies n = m1/p. Write n = m1/p in the second equation, and you get (m1²/p²)·p² + (m1/p)·p·(1 − p) = m2. The p² cancels here and the p cancels there, so m1² + m1(1 − p) = m2. Writing it out laboriously, 1 − p = (m2 − m1²)/m1, so p = 1 − (m2 − m1²)/m1, which you can also write as p = (m1² + m1 − m2)/m1; it is the same thing. And what would n be? n is just m1 divided by p, so n = m1²/(m1² + m1 − m2). So that is p and n; notice how I started with those two equations and arrived at these two expressions. This is the skill, and it is a simple one: inverting two equations in two variables, finding n and p in terms of m1 and m2; I have just shown you how to go about doing it. Once you have done this, you can write the estimators in terms of the random variables: p̂ = (M1² + M1 − M2)/M1 and n̂ = M1²/(M1² + M1 − M2). Easy enough in some sense, and it gives you a ready-made, simple method. Otherwise, how would you even start thinking about what the estimator should be? If you have a distribution with unknown parameters and samples from that distribution, the method of moments is a plug-and-play recipe for finding the parameters from the samples, and this skill is important and easy to pick up. A sketch of this binomial case in code follows below.
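This is my own sketch, not from the lecture, assuming numpy; the true values n = 19 and p = 0.37 are chosen to mirror the estimates quoted in the practice example below. Keep in mind that these binomial method-of-moments estimates can be quite noisy for small samples:

```python
import numpy as np

def binomial_mom(samples):
    """Method-of-moments estimates for Binomial(n, p), with n and p both unknown."""
    x = np.asarray(samples, dtype=float)
    m1, m2 = np.mean(x), np.mean(x ** 2)
    p_hat = (m1 ** 2 + m1 - m2) / m1   # p_hat = (M_1^2 + M_1 - M_2) / M_1
    n_hat = m1 / p_hat                 # n_hat = M_1^2 / (M_1^2 + M_1 - M_2)
    return round(n_hat), p_hat         # n must be an integer, so round

rng = np.random.default_rng(5)
x = rng.binomial(n=19, p=0.37, size=10_000)
print(binomial_mom(x))   # roughly (19, 0.37)
```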
How do you use these things in practice? Sometimes you get lost in the equations and do not know where the samples come in or how to go about it, so let us see. Bernoulli(p), and here are the samples: 1, 0, 0, 1, 0, 1, 1, 1, 0, 0. How do you find p̂? In this case it was just M1, so you compute m1: add up all the values and divide by 10, and you get 5/10. Straightforward, isn't it?

We looked at alpha-particle emission earlier. If you look at the number of alpha particles emitted in a 10-second interval, we know it is Poisson(λ). The number of particles emitted per second is usually given as an average, 0.8392, and from that you can find the average number of particles emitted in 10 seconds: 8.392. Sometimes this viewpoint is also important. When you count particles, you will be counting different numbers of particles over different 10-second intervals, but λ is just the average number of particles emitted in 10 seconds. So you can just as well count the total number of particles emitted, divide by the total time to get the average number of particles per second, and multiply by 10; that will also give you the same number, 8.392. It is a nice, simple calculation. Notice how you are computing m1 in different ways; in some cases the average is easy to compute in this fashion.

Next, let us go to Normal(μ, σ²) with ten samples. How do you find μ̂? It is just m1. And what is σ̂? It is sqrt(m2 − m1²); you calculate it and you get it. This is how you do it: find the equations in terms of the moments, compute the sample moments, plug them in, and you get the answer. The estimate of μ is, once again, just m1: the whole average is just m1.

Finally, Binomial(n, p). Somebody gives you samples from a Binomial(n, p); you do not know n, you do not know p, but you know it is binomial, and the samples are 8, 7, 6, 11, 8, 5, 3, 7, 4, 6, 9. You can plug into the formulas we derived for n̂ and p̂; it is a somewhat complicated fraction in m1 and m2, but you write it down. The n̂ you get will not be exactly an integer; it might be 18.8 or 19-point-something, but since n has to be an integer, you can round to the closest possible integer if you like. So n̂ works out to 19 in this case, and p̂ is 0.37. I do not know if you like it or not: looking at these samples, would you say Binomial with n = 19 and p = 0.37? Where does that come from? Maybe it is correct, maybe it is wrong. One needs to look at methods like this and be convinced that they work out correctly, but in this case the method of moments gives you an answer, and it is not unreasonable for a method like this to work.

Hopefully this lecture was interesting to you. You saw how to build a point estimator using the method-of-moments idea: a very simple method where you just write a few equations and solve for the unknowns. In most cases it works out very cleanly and gives you a very nice estimator; in some cases it may not work, but nevertheless it is a method that works for us, and we have seen how to use it in actual scenarios where we have samples. That is the end of this lecture. In the next lecture we will see another very nice design idea, maximum likelihood, which is also a very popular idea. We will see that in the next lecture. Thank you very much.