Transcript for:
Fundamentals of Structural Equation Modelling

What is structural equation modelling? Well, I think one of the first useful things to understand about SEM, as I'll refer to it, is that it isn't a single technique as such. We wouldn't want to compare it to, say, learning ordinary least squares regression, logistic regression or log-linear modelling. Although these techniques have a number of different aspects, we can think of them as, if you like, single approaches to addressing research questions. I think SEM is much better thought of as a general modelling framework that integrates a number of different multivariate techniques into one overall framework.
It's a framework which draws on a number of different disciplines. It brings together measurement theory from psychology, factor analysis from psychology and statistics, path analysis from epidemiology and biology, regression modelling from statistics, and simultaneous equations from econometrics. All of these different techniques come together to form structural equation modelling as a general modelling environment, and it's also an environment which is somewhat dynamic. It is not set in stone at this point in time; it is actually a very complex environment, often integrating new ways of fitting models as the technique develops over time.
What sort of research questions would SEM be particularly suitable for addressing? Well, being a general model-fitting environment, it can address many different kinds of research questions, but I think it's particularly suitable in situations where the key constructs, the key concepts that a researcher is interested in, are complex and multifaceted, often relating to psychological or social psychological concepts. These kinds of concepts can be quite difficult to measure and are often measured with error. And one of the useful aspects of SEM, as we'll see, is its ability to make corrections for errors of measurement.
Other kinds of research questions that SEM is well suited to are ones which specify systems of relationships, rather than, as we may be used to if we're fitting regression models, a single dependent variable and a set of predictors or independent variables. Structural equation models may have numerous different outcomes or dependent variables, each of which is affecting other dependent variables in a more complex system. So if a researcher is interested in modelling a causal system, then structural equation models are particularly suitable.
Another kind of research question that structural equation models are often used to address is where the researcher is interested in indirect or mediated effects. In many research questions, we're interested in the effect of variable X on variable Y. That would be thought of as the direct effect of X on Y. But in many research contexts, we're interested in more complex kinds of relationships, where the first variable X perhaps influences a second variable Z, which then has an effect on Y. That would be seen as an indirect effect. And SEMs are very well suited to addressing those kinds of mediated research questions.
Now SEMs are known by a number of different names in the existing literature, and this can be somewhat confusing. Sometimes they are referred to as covariance structure analysis models. This relates to the fact that with SEMs we're actually analysing covariance matrices, not the variables directly. We'll come on to that in later films. They're also known as analysis of moment structures. This is what gives the SEM software Amos its name, in recognition of the fact that more modern SEMs analyse not just covariances but also means, so moments of more than one order. It's also known sometimes as a LISREL model, which again takes its name from possibly the most well-known software, certainly the first software for fitting SEMs, which is LISREL.
More controversially, SEMs have been referred to as causal modelling, and they certainly have historically been associated with analyses which get at causal effects. But I think that is probably a more controversial name to give to any modelling technique, because the claims for causal inference will come from the research design rather than from the statistical model that we apply to analyse the data.
There are many different software packages available for fitting SEMs, and this is a list that's changing and growing all the time. As I mentioned, probably the best known is LISREL, which was developed by Karl Jöreskog and Dag Sörbom, and was one of the first available packages. Now there are many more software packages available: Mplus, EQS, Amos, R, which is a free package, and Stata. And many of these packages have more limited versions that are available for free for students to download and try, to see which one is most suitable. I wouldn't want to make a recommendation for any particular software package. Each one has its own particular advantages and disadvantages.
So what is structural equation modelling? Well, there are many possible answers to that question. The one that I'm going to propose in this film is that SEM can be thought of as path analysis using latent variables. Now this definition may not be very helpful to you if you aren't very familiar with either path analysis or latent variables, so for the remainder of the module I'm going to run through what path analysis is and what latent variables are.
So, what are latent variables? Well, most of the concepts that we're interested in in social science are not directly observable. Things like intelligence, social capital, trust. It's impossible to go and put some kind of meter into people and get a direct reading of their level of social capital or trust. So this makes these concepts hypothetical, or latent, as we refer to them.
We believe that they are latent within people at some level and that they drive attitudes and behaviour, but we can't actually directly observe them. So we're in a bit of a difficult position if we can't measure these concepts that we're interested in, but fortunately we can use approaches which measure these latent variables using observable indicators, using variables that we can measure directly and that we believe to be caused by the underlying latent constructs.
So if we think of a questionnaire item, a question in a questionnaire that's been administered to a sample of people, this would be a good example of an observable indicator of a latent construct. Let's imagine that this question asks people how happy they are with their lives on a scale of one to 10. Now some people will give higher answers and some lower answers. There will be variability, variance in this variable across the individuals in the sample. Now we don't think that all of that variability is only to do with people's level of happiness. Some of it will be, so some of the variability will be caused by variability in the true level of happiness across people, but there will be other factors that also cause variability, possibly to do with the questionnaire design, the temperature in the room, whether the question is administered by an interviewer or completed on a computer. These are all other factors that we're not really interested in, given what we're trying to measure, which is happiness. So some of the variability will be to do with happiness, the latent construct, but some of the variability will be due to other factors: error and unique variance.
We can summarize these ideas quite simply in this formula, the true score equation, where x equals t plus e. Here the measured variable, the observed indicator, is x, and as I said, the variability in x is comprised of both true score and error.
The true score is simply where the individual is on the true happiness dimension, their true underlying level of happiness. The error comprises two components. The first is what we could think of as systematic error. This is a bias, where perhaps the question is phrased in a way which makes people give higher happiness ratings than their actual level of happiness. Maybe it's because it's a question administered by an interviewer and they don't want to seem unhappy because that's socially undesirable. This would be a systematic error. A random error would be one where you're just as likely to overrate as to underrate your happiness. So we can think of the systematic error as being one where the individual errors don't cancel out: their mean doesn't equal zero. Whereas with a random error, you're as likely to give a higher as a lower score, so the expectation would be that the errors cancel out and their mean is zero.
This is all by way of saying that when we measure a variable, when we measure x, ideally what we would be able to do is isolate the t part of the variance, the true score, and remove the error variance when we're trying to either predict t or use t as a predictor in a model.
We can now translate this true score equation into a very simple path diagram, which is key to representing structural equation models. Here we can see that the x reads over to being the observed item in the rectangle, the t reads over to being the latent variable, the true score, in the ellipse, and the e reads over to being the circle at the top of the diagram, the error. And the arrows indicate that the observed score is caused by both the true score, the latent variable, and by other factors, the error. So we can encapsulate those ideas in this simple path diagram. It would be nice if we could implement this as a statistical model.
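To make the distinction between systematic and random error concrete before moving on, here is a minimal Python sketch of the true score equation, x equals t plus e. Everything numeric in it is an assumption made up for illustration and not taken from the film: the true scores are drawn with a mean of 5 and a standard deviation of 1.5, the systematic bias is 0.8 scale points, and the helper name simulate_observed_scores is hypothetical.

```python
import random


def simulate_observed_scores(n=10_000, bias=0.0, noise_sd=1.0, seed=0):
    """Simulate x = t + e for n hypothetical respondents.

    bias is the systematic part of the error (the same for everyone);
    noise_sd is the standard deviation of the random part.
    """
    rng = random.Random(seed)
    # t: true happiness scores (assumed distribution, for illustration only)
    true_scores = [rng.gauss(5.0, 1.5) for _ in range(n)]
    # e: error = systematic bias + random noise centred on zero
    errors = [bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    # x: what the questionnaire item actually records
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, errors, observed


def mean(values):
    return sum(values) / len(values)


# Purely random error: over- and under-reporting are equally likely.
_, random_errors, _ = simulate_observed_scores(bias=0.0)
# Random error plus an assumed systematic over-reporting of 0.8 points.
_, biased_errors, _ = simulate_observed_scores(bias=0.8)

mean_random_error = mean(random_errors)
mean_biased_error = mean(biased_errors)

print(f"mean error, random only:     {mean_random_error:.3f}")
print(f"mean error, with systematic: {mean_biased_error:.3f}")
```

In a simulation like this the random errors should average out close to zero, while the systematic component should show up as a mean error close to the assumed bias. With real data, of course, neither t nor e is observed separately.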
Unfortunately, when we only have one indicator of the latent variable, if this is happiness, then this equation is what we would call unidentified. We have more unknown quantities that we're trying to estimate, the t and the e, which we don't know and would like to estimate, than we have known pieces of information, the x. We've measured x in our sample. We have two unknowns and one known, so we can't solve that equation uniquely. The equation is unidentified. So we can't separate the true score from the error when we only have one measure of the underlying concept.
What this then tells us is that we need to have multiple indicators of our latent constructs. When we have multiple indicators, then we can start to over-identify the true score equation and estimate the quantities of t and e for each indicator. So we can apply many different kinds of latent variable models. We can use principal components analysis, factor analysis or latent class models, depending on the metrics of the observed indicators that we have in our data set. But what these are all going to do is provide us with a summary score, a reduced set of factors or components relative to the full set of indicators that we start out with. And in doing that, they will correct for the error in each of the individual indicators and give us a better measure of the true score of the concept.
We can represent this simply here with a common factor model. Here we have four measured variables. Let's think of these as questionnaire items. Again, they might be measuring happiness, different aspects of happiness. Are you happy at home, with your work, with your friends, and so on. So we've got four indicators of the same underlying latent variable, happiness. Now, because they measure the same thing, we would generally expect these variables to be correlated in our population, and that's what these double-headed arrows indicate.
The curved double-headed arrows indicate that the x's are all correlated with one another. That's one way of representing what's going on here. Another way would be to do away with these correlations and add in the underlying latent variable, someone's true level of happiness, which we've here denoted as eta. In this model we now have the happiness latent variable having a causal effect on each of the indicators, and that causal effect is what we can think of as the true score, the t part in our x equals t plus e equation. Now if that's the case, then we also need to have error terms for each of these equations, and that's what we show in the diagram there.
So with these multiple indicators we can apply a latent variable model, in this case a factor model, and we can get empirical estimates of these key quantities. The lambda coefficients in this model we'll refer to as factor loadings, and these are the correlations between the factor, the eta, and each of the x variables. Now if these indicators are good indicators of happiness, we would expect these correlations to be high. We would expect the correlation between a good indicator of the latent construct and the latent construct itself to be close to or approaching one.
So, if we are able to measure our constructs with multiple indicators, we can apply latent variable models, and this brings a number of benefits. First, the kinds of things that we're interested in modelling in social science are generally complex and multifaceted. If we think of happiness, for example, it's difficult to come up with a single question which covers all aspects of a person's individual well-being. So we probably need to have multiple indicators to get a good coverage of the concept. As I mentioned, it also enables us to remove, or at least reduce, random error in the construct that we're measuring.
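As a rough illustration of those two points, that a loading can be read as the correlation between an indicator and the factor, and that several indicators together carry less random error than any one of them, here is a small Python simulation. It is only a sketch under stated assumptions: the four lambda values are invented, all variables are standardized so that a loading equals a correlation, and the unweighted average of the indicators is used as a crude stand-in for a factor score. It does not fit a factor model.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(42)
n = 20_000
loadings = [0.8, 0.7, 0.6, 0.5]  # hypothetical lambda values, not from the film

# eta: the latent variable (e.g. true happiness), standardized.
eta = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Each indicator follows x_i = lambda_i * eta + e_i. The error variance is
# set to 1 - lambda_i**2 so that every indicator has unit variance.
indicators = [
    [lam * h + rng.gauss(0.0, math.sqrt(1.0 - lam ** 2)) for h in eta]
    for lam in loadings
]

# Correlation of each indicator with the latent variable: close to its loading.
indicator_correlations = [pearson(x, eta) for x in indicators]

# Simple average of the four indicators for each simulated person.
composite = [sum(values) / len(values) for values in zip(*indicators)]
composite_correlation = pearson(composite, eta)

for lam, r in zip(loadings, indicator_correlations):
    print(f"loading {lam:.1f} -> correlation with eta {r:.3f}")
print(f"composite of four indicators -> correlation with eta {composite_correlation:.3f}")
```

In the simulation the latent variable is known, so these correlations can be computed directly. With real data eta is unobserved, which is exactly why the loadings have to be estimated from the covariances among the indicators.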
I think we can convince ourselves that removing error seems to be a good thing to do, but more formally we can demonstrate that if we have random error in a dependent variable, although it leaves the estimates in a model unbiased, these will be less precisely estimated. They'll be noisier, with wider confidence intervals. More seriously perhaps, if we have random error in independent variables, then regression coefficients that we estimate using those independent variables will be attenuated. They will be smaller than they are in the population, systematically smaller, tending towards zero. So we will underestimate effect sizes and we will be more likely to falsely fail to reject the null hypothesis.
So what is path analysis? Well, again there are many ways that we could answer this question, but I think a key feature of path analysis, and one that makes it very appealing as part of structural equation modelling for social scientists, is that the model that you're wanting to fit to the data is represented diagrammatically rather than in the form of equations. Of course we can represent the structural equation model as a system of equations, but we can also represent it as a diagram, and this visual aspect is very appealing for social scientists who are perhaps less comfortable and less intuitive in their reading of equations. So the standardized notation of path analysis is a very important feature.
Path analysis represents regression equations between our measured variables. So we're interested again in systems of relationships between multiple observed variables. Now it's important that I'm saying observed variables there, because in a standard path analysis we would not be using latent variables but variables which are directly observed, again perhaps single questionnaire items or other kinds of measures.
A third key feature of path analysis is its focus not just on direct effects but also, as I was talking about earlier, on indirect effects and total effects. So it suits research questions where we don't have a simple linear model in which we're estimating the effects of some set of predictor variables on an outcome, a dependent or criterion variable, but where we're interested in the pathways between multiple independent variables and possibly multiple dependent variables.
In this slide I'm presenting some of the standardized notation, the way that we represent different parts of the model using diagrammatic notation. We can see at the top a measured latent variable, so a latent variable would be presented as an ellipse. An observed or manifest variable, such as a questionnaire item that we might use as an indicator of a measured latent variable, would be a rectangle. An error variance, or a disturbance term, is a small circle. There's a similarity with the measured latent variable there, in that they're both circular in shape, because an error variance is also a latent variable. It's just that we are not specifying it as measuring anything in particular. It is what's left over, the residual or disturbance term.
A covariance path, where we're specifying that two variables in the model are related, are correlated with one another, would be represented as a curved double-headed arrow. This is a non-directional association. We're not specifying that there is any causal link from one variable to another, but we want to indicate that they are correlated. And finally, the single-headed straight arrow represents a directional path, what we would generally think of as implying causality in the model: a regression path from one variable to another. So here are some examples of some simple path diagrams that we could represent in equation form or using standardized path notation.
In this simple diagram we can see that the variable x has a causal effect on y, and the d term there is the disturbance term, so the error term in this model. This is essentially a bivariate regression model. We could also write this in standard equation notation. This second path diagram is somewhat more complicated, but really is just adding in a second independent variable, x2. So again, this is equivalent to a multiple linear regression with two independent variables, a dependent variable y and an error term, which in this path diagram is labelled d for the disturbance term.
Now as I mentioned, one of the things that path analysis is particularly useful for is studying not just direct effects but also indirect effects. We can see now that we've introduced a more complex relationship between these variables, where x1 has a direct effect on x2 but x2 also has a direct effect on y, so we now have an indirect effect of x1 on y through x2. And we can use standard formulae to decompose these regression coefficients, indicated by beta 1 to beta 3, into the direct, indirect and total components. So here, beta 1 represents the direct effect of x1 on y, beta 2 is the direct effect of x1 on x2, beta 3 is the direct effect of x2 on y, and beta 2 times beta 3 will give us the indirect effect of x1 on y. We can also compute from this path diagram the total effect, which is the sum of the indirect and the direct effects between one variable and another. So if we take the sum of beta 1 and the product of beta 2 and beta 3, this will give us the total effect of x1 on y.
So that's given a very brief overview of both latent variables and path analysis. And what I'm encouraging you to think about, to understand what we're doing with structural equation models, is that when we have a path diagram that includes latent variables rather than just observed variables, as we can see in this diagram, then we're representing a structural equation model.
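The decomposition into direct, indirect and total effects described above can be sketched with a short Python simulation. This is an illustration under assumptions, not the film's own example: the population values chosen for beta 1, beta 2 and beta 3 are invented, all variables are observed (so it is a plain path analysis with no latent variables), and the coefficients are estimated by ordinary least squares written out by hand rather than with SEM software.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Covariance between two equal-length lists (dividing by n)."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


rng = random.Random(1)
n = 20_000
beta1, beta2, beta3 = 0.3, 0.5, 0.4  # hypothetical population path coefficients

# Path model: x1 -> x2 (beta2), x1 -> y (beta1), x2 -> y (beta3).
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [beta2 * a + rng.gauss(0.0, 1.0) for a in x1]
y = [beta1 * a + beta3 * b + rng.gauss(0.0, 1.0) for a, b in zip(x1, x2)]

s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
s1y, s2y = cov(x1, y), cov(x2, y)

# Regression of x2 on x1 gives the estimate of beta2.
b2_hat = s12 / s11

# Regression of y on x1 and x2: solve the two normal equations directly.
det = s11 * s22 - s12 ** 2
b1_hat = (s1y * s22 - s2y * s12) / det  # direct effect of x1 on y
b3_hat = (s2y * s11 - s1y * s12) / det  # direct effect of x2 on y

indirect_effect = b2_hat * b3_hat          # x1 -> x2 -> y
total_effect = b1_hat + indirect_effect    # direct + indirect

# Check: regressing y on x1 alone recovers the total effect.
slope_y_on_x1 = s1y / s11

print(f"direct effect of x1 on y:   {b1_hat:.3f}")
print(f"indirect effect via x2:     {indirect_effect:.3f}")
print(f"total effect of x1 on y:    {total_effect:.3f}")
print(f"slope of y on x1 alone:     {slope_y_on_x1:.3f}")
```

The last line is a useful sanity check on the idea of a total effect: in this simple recursive model, the coefficient from regressing y on x1 alone, leaving x2 out, equals the direct effect plus the product of the two paths through x2.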