Transcript for:
Overview of Difference and System GMM

Hello, and thanks for joining me on this EViews presentation on difference GMM and system GMM. This video shows how to estimate difference GMM, choose between difference GMM and system GMM, and interpret the results of the selected model.

The features of the two estimators are as follows: large N and small T (the number of groups, N, must exceed the number of time periods, T); a linear functional form; an autoregressive dependent variable; endogenous regressors; group-specific fixed effects, which is the heterogeneity factor; heteroscedasticity, which is unequal error variance; and serial correlation within groups.

Difference GMM, credited to Arellano and Bond (1991), uses first differences of the variables to remove fixed effects and corrects endogeneity with the use of instrumental variables. Its weaknesses, though, include the fact that differencing eliminates previous observations and erases time-constant variables. This is exemplified in this illustrative table right here, where X1, as you can see, is time-invariant, and so if you took the differences of these observations you would wind up with zero.

System GMM, which is credited to Arellano and Bover (1995), with methodological improvements, if you will, in 1998 by Blundell and Bond, corrects endogeneity by introducing additional instruments to dramatically improve the efficiency of the model. What it does is transform the instruments rather than the regressors, differencing the instruments to get rid of the fixed effects, and it uses what's called orthogonal deviations instead of first differences, subtracting the average of future observations from the current value, the contemporaneous value if you like.

Here's an example of how that works. From this original series, if we were to use difference GMM, in the first time period we obviously have no observation, since there's nothing preceding it. In the second time period it is going to be 21 minus 10 to get us 11; in the third,
34 minus 21 to get us 13, and so on.

Now, with orthogonal deviations in system GMM, in the first time period we first calculate the average of the future observations, meaning from 21 down to 95. That average is 57.7, and so 57.7 minus 10 gets us 47.7, as you see in this definition. In the second time period we again first calculate the average of the future observations, meaning from 34 down to 95, and that comes out to 64.40, so 64.40 minus 21 gets us 43.4. And so you work your way down the series. You can already see that this manner of differencing is more robust when it comes to unbalanced panels and missing observations.

Endogeneity in both estimators is resolved with the use of instrumental variables, as noted earlier. But because these two estimators are designed for general use, they do not assume that good instruments are available outside of the data set, and so the assumption is that the only available instruments are internal instruments, specifically lags of the endogenous regressors. To be sure, the estimators do allow the inclusion of external instruments.

For that matter, there are three main types of variables we encounter in this type of estimation. Exogenous variables are those that are uncorrelated with the error term. Predetermined variables are uncorrelated with the present error term but correlated with a past error; an example is the lagged dependent variable, which is included as a regressor in the dynamic model. Endogenous variables are correlated with the error term.

In difference GMM the original model is differenced, as you see here. While differencing does in fact remove fixed effects, as you can see in the expansion of the error term right down here and the differencing of it, we are still left with, among other things, the lagged error term, which naturally is correlated with the lagged dependent variable, again
which is included as a regressor in the model. So even if the other regressors are strictly exogenous, there is still going to be an element of endogeneity left in the model.

Note that difference GMM produces biased and inefficient estimates, in particular when the model is persistent, in that it is close to a random walk; when the instruments are weak, in that they are weakly correlated with the underlying regressors; and when the time period is short. System GMM uses a two-equation approach with additional moment conditions, that is, additional instruments, to deal with those issues. The additional instruments have been found to yield more efficient parameter estimates.

So the question becomes: how do we choose between difference GMM and system GMM? The rule of thumb recommended by Bond in 2001 is to first estimate this original model. In this example I have a three-variable model with two original regressors, X1 and X2, plus the lagged dependent variable included as a regressor. So we first estimate this original model using pooled OLS and make a note of the coefficient of the lagged dependent variable. We repeat with fixed effects estimation, using either the least squares dummy variable or the within-group approach, and again make a note of the coefficient of the lagged dependent variable. Then, finally, we estimate difference GMM and make a note of its coefficient.

What happens is that the estimated coefficient with OLS is the upper bound, while that with fixed effects is considered the lower bound. If we find that the coefficient estimate with difference GMM is greater than that of fixed effects, then difference GMM is the way to go, meaning that difference GMM is correctly instrumented. However, if it comes out to be less than the fixed effects coefficient estimate, then we roll with system GMM, because what that means is that difference GMM in this case has a downward bias due to
weak instrumentation. So that's the rule of thumb.

Let's go ahead and demo that real quick right here in EViews. In this example I have 247 groups and six years of annual data, so this definitely fits the bill in that my number of groups far exceeds, as you can see, the number of time periods. With generic notation for ease of understanding, Y is my dependent variable, X1 is my first independent variable, and X2 is the second one.

What we're going to do is first estimate pooled OLS, in the order I mentioned. Right-click on any of these, Open as Equation, and right here let's include the lagged dependent variable, Y(-1), and we're done. The method is least squares, because it's pooled OLS, so we just click OK. This is what we're looking for: we make a note of this coefficient, which is approximately 0.69.

Then X out of it and do the same, this time with the fixed effects estimate. Right here again, before we go ahead, include the lagged dependent variable like so, then go to Panel Options and for cross-section switch this to Fixed. That's all we need to do, so click OK, and that's our output right here. Once again we make a note of this: this 0.44 is going to serve as the lower bound. We X out of it.

Finally, we estimate difference GMM. Right-click, Open as Equation, then come to Method and switch this to generalized method of moments, dynamic panel data. Click on it, and now let's go through the tabs. On Panel Options, for cross-section down here, choose Difference, because difference GMM is what we're getting ready to estimate. We can leave everything else as is, including the GMM weights. Then click on Instruments and type in our internal instruments, which are going to be X1 lagged one period and X2 lagged one period, so the first lags of the regressors. Then click on Options, and for options we're going to
check this "always keep GLS and instrumental variable GMM weights" box. All right, that's it. Don't click OK; go back to Specification, and now go to the lower left corner and click on Dynamic Panel Wizard to make sure we're set to estimate difference GMM.

Here the dependent variable is recognized as Y. Next, in step two, the independent variables are recognized as X1 and X2 and the constant. If you have period dummy variables you can check this box, but we don't in this example, so I'm going to uncheck it. Next, for step three, Difference is already selected, so we're good to roll with that. Next up, the instrument for the first lag of the dependent variable, specified as a regressor, is what you see right here, which the system has already identified. Click Next to step five, where our other internal instruments are identified. Next is the final step, where we stay with the default option, so we're done: click Finish.

Now, as you can see, our dynamic model is correctly specified, and it now includes the lagged dependent variable as a regressor right there. If you want to confirm the instruments, click here on the Instruments tab and you'll see all three instruments: the instrument for the first lag of the dependent variable, and those for the first and second original regressors. So we're good to go. Just click OK, and that's what we get right here. This is the highly coveted, if I may say, coefficient of the lagged dependent variable, which we now have to compare to the upper bound from OLS and the lower bound from fixed effects.

Here is a summary on my PowerPoint. As you can see, the coefficient from difference GMM is greater than that from fixed effects, and so we conclude that difference GMM is correctly instrumented, and with
that we're going to base our interpretation of the dynamic model specification on the results of difference GMM.

However, before we give this a good wrap, let's go ahead and perform the Arellano-Bond test of serial correlation. We go right back here, and on this difference GMM output we go to View, Residual Diagnostics, Arellano-Bond Serial Correlation Test. This is what we're looking for, for reasons explained in the last video: the p-value corresponding to AR(2). At the 5% level of significance we cannot reject the null hypothesis of no second-order serial correlation, so we're quite happy with this.

Now we summarize and conclude, and for that I head back to my PowerPoint. Here's a summary of the difference GMM output. The J-statistic of 7.73 has a p-value of 0.56, which as you can see is greater than the threshold of 0.25, and so we say that the null hypothesis of valid over-identifying restrictions is not rejected, which supports the validity of the dynamic panel model specification.

Next, we find that the coefficient of the lagged dependent variable, which captures persistence in the model, has a value of 0.47. Remembering that this coefficient lies between 0 and 1, we can say that this reflects a moderate degree of persistence, and that there is a bit of history, or memory if you like, in the behavior or performance of the dependent variable. We have also already determined that there is no evidence of second-order serial correlation. Finally, we note that the two original regressors, X1 and X2, are statistically significant, X1 at the 1% level and X2 at the 5% level, and both have positive coefficients, meaning that individually they have a positive impact on the behavior of Y.

This concludes this presentation. What I'm going to do next is show an example of system generalized method of moments, and that's going to be the case where the coefficient of the lagged dependent variable with difference GMM estimation is
less than that with fixed effects estimation. I hope you enjoyed it.
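
As a rough illustration of the arithmetic walked through in the talk, the Python sketch below computes first differences, the talk's "average of future observations minus the current value" calculation, and the textbook forward orthogonal deviation (current value minus the mean of the later values, scaled by sqrt(n/(n+1))), and then applies the coefficient comparison rule of thumb. The function names are hypothetical. The series uses only the first three values quoted in the talk plus a made-up fourth value, so it does not reproduce the slide's figures. The 0.44 and 0.47 coefficients are the fixed effects and difference GMM values mentioned in the talk. None of this is EViews code or output.

```python
import math


def first_differences(x):
    """x[t] - x[t-1] for t >= 1; the first period is lost."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]


def future_mean_gaps(x):
    """Calculation as narrated in the talk: mean of all later
    observations minus the current one. The last period is lost
    because nothing follows it."""
    gaps = []
    for t in range(len(x) - 1):
        future = x[t + 1:]
        gaps.append(sum(future) / len(future) - x[t])
    return gaps


def forward_orthogonal_deviations(x):
    """Textbook form: sqrt(n / (n + 1)) * (x[t] - mean of the n later
    observations). Note the opposite sign and the scaling factor
    compared with future_mean_gaps."""
    devs = []
    for t in range(len(x) - 1):
        future = x[t + 1:]
        n = len(future)
        devs.append(math.sqrt(n / (n + 1)) * (x[t] - sum(future) / n))
    return devs


def choose_estimator(fe_coef, diff_gmm_coef):
    """Rule of thumb as stated in the talk: keep difference GMM if its
    lagged-dependent-variable coefficient exceeds the fixed effects
    (lower bound) estimate; otherwise move to system GMM."""
    return "difference GMM" if diff_gmm_coef > fe_coef else "system GMM"


# First three values are from the talk; the fourth (50) is hypothetical.
series = [10, 21, 34, 50]

print(first_differences(series))
print(future_mean_gaps(series))
print(forward_orthogonal_deviations(series))
print(choose_estimator(fe_coef=0.44, diff_gmm_coef=0.47))
```

The sketch works on a single group's series. A real panel would apply the same transformation within each group, and an unbalanced panel would need missing periods handled before averaging.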