Transcript for:
Understanding Statistical Marginal Effects

Okay, hi everyone. This is a recorded run of the talk I'm going to give at useR! 2024, which happens in Salzburg in a week. I'm doing this on video in case you can't make it, or in case you want to attend another talk at the same time; that's fine, just listen to this instead. The talk is about how to interpret statistical results with marginaleffects, a package that I've been maintaining and developing over the last few years, and I'm really excited to present it.

So, who am I? I'm a professor at the Université de Montréal in Canada. I've been publishing packages on CRAN for about 15 years, so it's been a while, and I plan to keep doing it; I like it, it's fun. I've also recently been named an associate editor of the R Journal, so send us your papers. It's a nice journal. Okay, so that's me.

Today I'm going to discuss two main problems that certainly affect me and almost certainly affect you; I think most people who deal with data have to deal with them.

The first problem is that our outputs, and software outputs in general, are pretty inconsistent when it comes to statistics. Here's a simple example. Take the predict() function in R, which almost everyone at this conference has used. predict() computes fitted values for your model: you call predict(mod), and oftentimes you'll add se.fit = TRUE to get standard errors around your predictions. What's interesting is that, depending on the type of model you fit, this function returns wildly different results. For a glm model, you typically get a list with two vectors and a scalar: the fitted values, the standard errors around those fitted values, and something called the residual scale. But if you estimate an ordinal logit model with the polr() function and run the exact same command, predict() with se.fit = TRUE, you get a matrix of estimates and no standard errors at all. That's weird and annoying, and it's a problem everyone runs into: different modeling packages often behave slightly differently. It's not so bad, R is great and the packages are great, but the inconsistencies can really slow you down.

The second problem is more basic: stats is just really hard. Take this example. Imagine you estimate a logit regression model with the standard glm() function. The dependent variable is bin, a binary variable, and there are two predictors: num, a continuous variable, maybe normally distributed or something like that, and cat, a categorical variable with three levels, a, b, and c. So you estimate this super simple, basic logit model, the kind people see in their second stats class or so, and you get a table of estimated coefficients.
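Here is a minimal sketch of that running example in code. The data are simulated, so the exact numbers will not match my slides; the variable names (bin, num, cat) and the model formula are the ones from the talk, but the sample size, seed, and true coefficients are arbitrary choices for illustration:

```r
# Simulate a toy data set that mirrors the example: a binary outcome,
# one numeric predictor, and one categorical predictor with levels a/b/c.
set.seed(2024)
n <- 200
dat <- data.frame(
  num = rnorm(n),
  cat = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# Generate the binary outcome from a logit model (arbitrary coefficients)
p <- plogis(-1.5 + 0.3 * dat$num + 1.0 * (dat$cat == "b") + 3.0 * (dat$cat == "c"))
dat$bin <- rbinom(n, size = 1, prob = p)

# The simple logit model from the slide
mod <- glm(bin ~ num + cat, data = dat, family = binomial)
summary(mod)
```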
Even though this model is pretty simple, I would argue that it is surprisingly hard to interpret. First of all, people are just terrible at interpreting probabilities in general. If you look at the literature in psychology and behavioral economics, there is a lot of evidence that people cannot do this properly: when probabilities are very high or very low, they tend to be biased and misjudge them. And if probabilities are hard, expressions like the log odds ratio, which is what a logit coefficient corresponds to, are much harder still, at least for most non-specialists. When you look at an individual estimate in a table like this one, it is possible to give it a precise, classical definition, but getting a gut understanding of it is much harder, I think.

So what we do, in stats classes and in practice, is develop a bunch of model-specific tips and tricks: we exponentiate the coefficients, for example, or do something like that. But what happens when you move beyond this simple logit model? You fit a model with interactions, or with splines, or maybe a multinomial model, or even a machine learning algorithm like XGBoost. What do you do then, how do you interpret your model? You would have to rely on yet another set of model-specific tips and tricks, and oftentimes people just don't know what to do. So the question is: stats is hard, and once we've fitted a model, how do we interpret its parameters?

The solution I recommend is post hoc transformation. There's a quote I like from Philip Dawid: a parameter is just a resting stone on the road to prediction. Parameter estimates by themselves, I would argue, are usually very difficult, and sometimes basically impossible, to interpret as is. In our practice as statisticians, social scientists, and data scientists, we often need to transform parameter estimates into quantities that stakeholders will understand and care about. Basically nobody cares about the parameters of your model, so you had better make sure those parameters are represented in a way that makes sense, that is intuitive, and that yields understanding and actionable insight. Post hoc transformation, where you fit your model, get your estimates, and then transform those estimates, can help you do better, I think.

Okay, so how do you do these post hoc transformations? I'm proposing the marginaleffects package, which I believe is easy, consistent, and flexible. It supports over 100 different types of models: glm, GAM, Bayesian models, mixed effects multilevel models, and even some machine learning models, since it is compatible with the tidymodels and mlr3 ecosystems. So almost regardless of the model type you use, you get a single workflow with one set of commands that do a lot of what you want to do. All of this is documented in a free online book at marginaleffects.com, which I encourage you to visit: it has 30-plus chapters with really detailed explanations and case studies. I spent a lot of time on it, so please look it up; I think it's a good website, and it has cute illustrations of animals and such.

All right, so what does this package do? It offers basically three quantities, or sets of quantities, that you can compute. The first is straightforward predictions, what we often call fitted values, which can be computed for different predictor values; we use the predictions(), avg_predictions(), and plot_predictions() functions for this. The second is counterfactual comparisons, which are basically just functions of two predictions; you'll see that a lot of different quantities of interest can be expressed this way, like contrasts and risk differences, risk ratios, odds ratios, and lift. The third is slopes, essentially partial derivatives; in some literatures and disciplines these are called marginal effects, and they are estimable with the slopes() family of functions. The key to this package, what's really great, is that once you've estimated these quantities, once you've transformed the parameters of your model, you can run hypothesis tests on all of them, and you can compare different quantities very easily.

All right, here's the demo. We start with a function called hypotheses(), which I haven't spoken about yet, but which has a hypothesis argument that carries over to all the other functions in the package. Imagine we look at the same logit model as before, with an intercept and some coefficients, and I have a null hypothesis that I want to try to reject: that beta 3 is equal to beta 4. This is a hypothesis test of coefficient equality. So I call hypotheses(), that's the plural, on my model, and specify my hypothesis with the string "b3 = b4". The output shows the estimated difference under the null hypothesis that these two coefficients are equal, and I see that the standard error is pretty small, the z value is big, and the p value is pretty small, so I can reject the null hypothesis that these two coefficients are equal. That's a very simple linear test, but you could do much more complicated ones: almost any function that R can evaluate, with exponential or logarithmic transformations for example, can go into this argument. I'm just showing you that this is a really, really flexible argument.
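In code, the test I just described looks something like this, continuing with the mod object from the sketch above, where b3 and b4 refer to the third and fourth entries of coef(mod), the two category coefficients:

```r
library(marginaleffects)

# Null hypothesis of coefficient equality: b3 = b4
hypotheses(mod, hypothesis = "b3 = b4")

# The argument accepts (almost) arbitrary R expressions, so you can
# also test non-linear functions of the coefficients, for example:
hypotheses(mod, hypothesis = "exp(b3) - exp(b4) = 0")
```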
The killer feature here is that the hypothesis argument is available in all the functions of this package. If you compute predictions, you can run hypothesis tests on those predictions, and the same goes for counterfactual comparisons and slopes. This lets you answer really important scientific questions. For example, if you compute the predicted probability of survival for men and for women, you can run a formal statistical test to determine whether the two predicted probabilities are equal, that is, whether men survive at the same rate as women. Or if you estimate two treatment effect sizes, say one for the old and one for the young, maybe a pill works better for young people or something, you can run a formal statistical test to compare them: is treatment effect one bigger than treatment effect two? That, I think, is the killer feature.

All right, let's move on to predictions. As I said before, predictions are essentially fitted values. In base R, you would first construct a data frame with the predictor values you care about; for example, say I want the predicted value for an individual who is at level "a" on the cat variable and at zero on the num variable. Then I would feed this data frame to the predict() function, and use the type argument to specify the scale of the prediction. In glm models you typically have the response scale and the link scale; here I pick the response scale, which in a logit model is just a probability. This is base R syntax; it's been there for years, nothing has changed, there's nothing new here. What is new is the predictions() function, which has the exact same syntax, plus some other arguments that we'll see later, but gives you much richer results: you get the estimate, the p value, and the confidence intervals. That's great stuff already. Now, if you run predictions() with no arguments at all, you get fitted values for all the rows of the original data set. The data set I'm using here has about 200 rows, so you see the top few rows, the bottom few rows, and some omissions in between, but these are the fitted values for every row, with confidence intervals. And the result is just a basic data frame with a pretty print method, so you can extract rows and columns in a very tidyverse-friendly workflow.

That was one prediction per row of the original data, but you can also compute an average prediction: make predictions for every row of the data set, then take their average. I just add the avg_ prefix and call avg_predictions(), and the estimate here is that the average predicted outcome in this data set is 0.5. Remember, this is a binary outcome in a logit model, so 0.5 is the average prediction in the sample.
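Roughly what those commands look like, again continuing with mod from the sketch above:

```r
# Base R: fitted value for a specific profile, on the response scale
nd <- data.frame(cat = "a", num = 0)
predict(mod, newdata = nd, type = "response")

# marginaleffects: same syntax, richer output (estimate, SE, z, p, CI)
predictions(mod, newdata = nd, type = "response")

# No arguments: one fitted value per row of the original data
predictions(mod)

# Average of the fitted values across all rows
avg_predictions(mod)
```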
Now, what is pretty cool is that we've added a by argument, so you can compute average predictions by subgroup. We have a predictor called cat, so we call avg_predictions(mod, by = "cat"), and we see that the predicted probability that the outcome equals one is really different across these subgroups: in group a it's really low, about an 18.6% chance that the outcome is one, but in group c it's an 84% chance. These are very different, and you get the average prediction for each subgroup, again with confidence intervals and all the goodies.

Now let's go back to the hypothesis argument that I've already told you a lot about. We start with the same command as on the previous slide, average predictions for each subgroup; nothing new there. But imagine I want to compare these subgroups: are the average predictions in category a and category b equal? This is the killer feature that I think is really important. I use the same syntax but add one argument, hypothesis, and ask whether the first row equals the second row, which compares 0.186 to 0.344. What I see is that these two values are in fact different, about 0.16 apart, and that difference is statistically significant, so I can reject the hypothesis that these two average predicted values are equal. I do know that these groups differ in terms of the predicted outcome.

Just to show you very quickly: it is very easy to plot predictions with the plot_predictions() function. Almost all of its arguments are the same as in the other functions, so there's a high level of consistency across the interfaces, which is pretty nice. The result is a standard ggplot object, so you can always customize it with whatever themes and labels you want.
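Those three steps in code; the condition = "num" choice in the plotting call is just one plausible way to draw the plot, not necessarily the one on my slide:

```r
# Average predictions by subgroup
avg_predictions(mod, by = "cat")

# Killer feature: test whether the 1st and 2nd rows of that output
# (groups "a" and "b") are equal
avg_predictions(mod, by = "cat", hypothesis = "b1 = b2")

# Plot predicted probabilities across the range of num
plot_predictions(mod, condition = "num")
```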
All right, now let's move beyond predictions, beyond fitted values, to something closer to causal inference: counterfactual comparisons, that is, contrasts, risk differences, risk ratios, odds ratios, lift, these things. Imagine you're interested in this kind of counterfactual question: how much does the predicted probability that bin equals one change when the predictor num increases by one unit? I increase my treatment by one unit; how much does the outcome change in terms of predicted probability? You'll probably recognize that expression as an estimated risk difference. This is what we get when we run the comparisons() function with the variables argument set to "num", which says: I'm interested, essentially, in the effect of increasing num. The output shows the estimated effect of increasing num by one unit on the predicted probability that the outcome equals one. And since this is a logit model, the actual effect of num changes with all the covariates in the model, so every single row of the data set gets a different estimate of this risk difference.

Since that is kind of unwieldy, we don't want one estimate per row of the data set, we can compute an average comparison instead: the average difference in the predicted probability that bin equals one associated with an increase in num. The number here says that when I increase num by one unit, on average the probability that the outcome is one goes down by one percentage point. We compute the estimate for every single row of the data set and then take the average. Alternatively, I could be interested in a bigger shift. One unit is the default, but maybe I care about the effect of five units of num, or a standard deviation of num, or a move across the interquartile range, or from the minimum to the maximum. All of these different treatment effect sizes, or treatment strengths, I guess, are very easy to specify. In this case I'm showing the effect of a change of five units of num on the predicted probability that the outcome equals one.

The avg_comparisons() function is super flexible. So far I've only shown you risk differences, but there's an argument called comparison where you can specify a lot of different comparisons. You can do ratios, for example, how does the ratio of predicted probabilities change when num increases by one unit, or lift, or odds, and you can even specify custom quantities. That's a really powerful feature that I can tell you more about later if you can find me; it's kind of crazy, I stumbled on this feature by accident and I'm really, really happy with it.

Okay, so what happens if you have different predictor types? If you have binary, categorical, or numeric variables, avg_comparisons() is pretty smart about it. num is a numeric variable, so by default it shows you the effect of increasing it by one unit. But for a categorical variable it shows you all the relevant contrasts, the effect of moving from a to b, or from a to c, and you can ask for sequential contrasts, or reference contrasts, all these different schemes for categories. That's all done automatically for you, or you can customize it as you wish.

And now we're back to the killer feature: the hypothesis test on these quantities. Look at the first two rows of this output. The first is the effect of moving from a to b on the predicted probability that the outcome is one; the second is the effect of moving not from a to b but from a to c. These are essentially two treatment effects, and they look different: the first is 16 percentage points, the second is 65 percentage points. But I want to know, are they different statistically? I can use the hypothesis argument, which I've shown you already, to ask: is the average effect of b versus a equal to the average effect of c versus a? The estimate we get is the difference in effect sizes; it's not an effect size itself, it's a difference in risk differences, so essentially a difference in differences. And we see that this difference is indeed statistically significant, so we can rule out the idea that b has the same effect as c, relative to a. Okay, very nice.
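Here is roughly what that comparisons workflow looks like in code, again with the mod object from the earlier sketch:

```r
# Risk difference for a one-unit increase in num: one estimate per row
comparisons(mod, variables = "num")

# Average those row-wise estimates
avg_comparisons(mod, variables = "num")

# Bigger or different "treatment sizes"
avg_comparisons(mod, variables = list(num = 5))        # +5 units
avg_comparisons(mod, variables = list(num = "sd"))     # one standard deviation
avg_comparisons(mod, variables = list(num = "iqr"))    # interquartile range
avg_comparisons(mod, variables = list(num = "minmax")) # minimum to maximum

# Other comparison scales: a ratio instead of a difference
avg_comparisons(mod, variables = "num", comparison = "ratio")

# Categorical predictor: contrasts vs. the reference level by default,
# plus a hypothesis test that the two contrasts (rows 1 and 2) are equal
avg_comparisons(mod, variables = "cat")
avg_comparisons(mod, variables = "cat", hypothesis = "b1 = b2")
```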
All right, just very quickly: we can also compute slopes. Slopes are essentially partial derivatives, or marginal effects in some disciplines. Here, very quickly, is a GAM model with some splines. I've always found these things pretty hard to interpret, but this makes it much easier, I think: we can evaluate the partial derivative of y, our dependent variable, with respect to time, at every point in the observed sample, and a single command does all of that. Then we can plot predictions versus slopes: I've shown you the plot_predictions() function already, but there's also a plot_slopes() function, which gives you... ah, I don't have it. My slides are broken. These two plots are beautiful next to each other, believe me; I'll try to paste them in and post the slides.
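Since that slide is missing from the recording, here is a small self-contained sketch of a GAM example of that kind instead; the data and the variable names (y, time) are simulated stand-ins, not the model from the talk:

```r
library(mgcv)
library(marginaleffects)

# Simulated smooth relationship between y and time
d <- data.frame(time = seq(0, 10, length.out = 200))
d$y <- sin(d$time) + rnorm(200, sd = 0.3)
gam_mod <- gam(y ~ s(time), data = d)

# Partial derivative of y with respect to time, at every observed point
slopes(gam_mod, variables = "time")

# The two companion plots: the fitted curve, and its slope, over time
plot_predictions(gam_mod, condition = "time")
plot_slopes(gam_mod, variables = "time", condition = "time")
```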
All right, so that's it. If you want more, there is more. I've told you about the website, and it has 30 different chapters of stuff: it covers all the quantities I've shown you, it explains how to run these hypothesis tests, and it covers a bunch of more advanced topics, like how to do machine learning, causal inference, robust standard errors and bootstrapping, conformal prediction, categorical outcomes such as a multinomial logit, equivalence tests, interactions, survey experiments, inverse probability weighting, matching, MRP (this one is fun; some people call it poststratification), and multiple imputation. There are detailed case studies with code in both R and Python, actually, so look that up.

Then, just to finish, I want to plug two things that I've also been working on in parallel. The first is called tinytable, a dependency-free, very simple to use, super minimalist package that, despite all the simplicity pitch, is very flexible: it takes data frames and converts them to tables in HTML, LaTeX, Typst, Word, and Markdown, and you can do all sorts of fun stuff with it. And if you're interested, I also maintain a package called modelsummary, which lets you create regression tables, regression plots, and data summaries. If you want table ones or table twos, that is, descriptive statistics, it has super flexible and powerful capabilities for this, and it can save your tables in a bunch of formats. So check those out, and if you have any questions about these or about marginaleffects, I'm always super happy to chat: send me an email, open an issue on GitHub, or ping me on Twitter, and we'll engage. All right, thank you!
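For readers of this transcript who want a quick taste of those last two packages, a minimal sketch with built-in data; tt() and modelsummary() are the packages' main entry points, and the data sets here are just R's built-in examples:

```r
# tinytable: data frame in, formatted table out (HTML, LaTeX, Typst, ...)
library(tinytable)
tt(head(mtcars[, 1:4]))

# modelsummary: regression tables and descriptive statistics
library(modelsummary)
modelsummary(mod)        # regression table for the logit model above
datasummary_skim(mtcars) # quick descriptive statistics ("table one" style)
```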