Transcript for:
Understanding Data Entry and Reliability

Hi everyone, thanks for watching the week 6 lecture. Today we're going to focus on data entry and calculating inter-rater reliability. First we'll introduce the different types of data, and then we'll do some demonstrations of data entry and of calculating formal inter-rater reliability.

Types of data. When we're collecting data, there are different kinds: some are categorical and some are continuous. Within categorical data there are two subcategories: one is called nominal and the other is called ordinal. Nominal data is basically named categories — the value just gives a label to a category. For example, the presence or absence of a behavior: present is 1, absent is 0. You can assign a number to each category, but the number doesn't really mean anything; it's just a label telling you the name of the category. Gender is nominal — male, female, non-binary, and so on. There's no order, so it's not ordinal; it's just naming the category. Race and ethnicity are also nominal data. Ordinal data, on the other hand, has a natural ordering: the number represents a position on a scale. For example, the letter grades on an exam — A, B, C, D, F — have an order: A is the highest, then B, C, D, and F. Education level is another example — high school, college, graduate school — where each level represents a higher stage than the previous one. So letter grade and education level are ordinal data, because the order matters.
With nominal data the order doesn't matter: you can list absent first and present second, or the other way around. With ordinal data you cannot mix up the order — the first stays first and the second stays second — and that's why it's called ordinal data. Then we have continuous data, and within it there are two kinds, which I'll call continuous-discrete and continuous-continuous. Continuous-discrete data consists of integers — whole numbers that cannot be broken into decimal or fraction values. The total number of students in a class is discrete: you cannot have 30 and a half students. The frequency of a behavior is discrete too: either a behavior occurred or it didn't, so you will never record four and a half occurrences of a behavior in a time interval — you can only use whole numbers to represent behaviors. Days of the week are discrete as well: there are seven days, and each day is a whole unit. Continuous-continuous data can be further subdivided into fractional values: temperature can be a whole number or a fraction, and so can height and time. Whenever a variable can take fractional values, it's continuous-continuous data. So those are the different types of data. When we're collecting data for our observation report, we will mainly have nominal data.
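The four data types just described can be sketched in a few lines of Python. The variable names and codes here are made up for illustration, not the class coding scheme:

```python
# Nominal: the numbers are only labels -- no order is implied.
gender_codes = {"male": 0, "female": 1, "non_binary": 2}

# Ordinal: the position on the scale carries meaning.
letter_grades = ["F", "D", "C", "B", "A"]      # ordered low -> high
grade_rank = {g: i for i, g in enumerate(letter_grades)}

# Continuous-discrete: whole-number counts only.
behavior_frequency = 4                          # never 4.5 occurrences

# Continuous-continuous: fractional values are meaningful.
temperature_c = 21.7
height_cm = 102.5

print(grade_rank["A"] > grade_rank["B"])        # True: order matters for ordinal data
print(isinstance(behavior_frequency, int))      # True: counts stay whole numbers
```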
So our nominal data includes things like gender, age group, and setting. And if you're using tally marks for the frequency of a behavior in each interval, that's discrete data — continuous-discrete data. Both of those kinds fall under continuous data, but I distinguish continuous-discrete from continuous-continuous just to make clear what the difference between discrete and continuous data is. In general, when you're reporting in your paper, you can simply say categorical data or continuous data; but for your own information you need to know the subcategories within each type.

All right, now let's move on to data entry. I'm going to use Excel first to demonstrate — it's going to be very straightforward. We want to make each column a variable and each row a subject, participant, or time sample, with the variable names in the first row. Here you can see that each column — A, B, C, D, E, F, G — represents one variable: the time interval (1, 2, 3, 4, 5, 6); gender, where I usually code 1 as female and 0 as male; age group, where 1 is junior preschool and 0 is kindergarten; and setting, indoor or outdoor. In this example I'm looking at three behaviors: children's use of small number words, spatial words, and large number words. This is how we input data into SPSS or Excel. When you're creating a variable name, there are a couple of very important rules to remember. Each variable name can only be one word, which means it cannot contain a space.
If you want a separation inside the name, use something like an underscore, so the label is easier to see and the variable name is easier to read. A variable name can have underscores within it, but it cannot end with an underscore. You can also mix upper- and lowercase letters to make your variable names distinguishable — for example, I use lowercase versus uppercase so I can tell apart the small number, spatial, and large number word labels at a glance. Make sure you are very familiar with these rules when you're creating a variable name; otherwise SPSS won't allow it. You cannot have a stray space, you cannot use special characters in SPSS, and the variable name has to be one whole word.

This is a sample data sheet for categorical data, or categorical variables. Your sample data is similar to your observation sheet — the only difference is that you translate it into an Excel file. Column A is the time interval (the time-sample column), then come your grouping-variable columns — gender, age, and setting — and then your behavioral variables. Assuming you have three behaviors, each column holds one variable. For a categorical variable you record absent or present: when the behavior is absent you enter 0, and when it's present you enter 1. So I'm going to use Excel to demonstrate. Let's say this is the Excel file you're entering data into — for example, I'm using a two-minute time sample, or time interval.
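The naming rules just described — one word, no spaces, underscores allowed inside but not at the end, no special characters — can be sketched as a quick checker. This is only an illustration of the rules as stated in the lecture, not SPSS's full validation logic:

```python
import re

def valid_variable_name(name: str) -> bool:
    """Rough check of the lecture's naming rules: starts with a letter,
    contains only letters, digits, and underscores, and does not end
    with an underscore."""
    ok = bool(re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name))
    return ok and not name.endswith("_")

print(valid_variable_name("small_number"))   # True  -- underscore inside is fine
print(valid_variable_name("small number"))   # False -- spaces are not allowed
print(valid_variable_name("spatial_"))       # False -- cannot end with an underscore
print(valid_variable_name("large#number"))   # False -- no special characters
```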
You just enter the variable names — I'll make the columns wider so you can see the whole thing. This is categorical data for one observer. In this example I have two-minute time samples, and in total I will eventually have 60 time samples, because remember, we said we need 120 minutes of observation time. Depending on your time-sample length you will end up with a different number of time samples; with a two-minute time sample, you're going to have a total of 60 intervals at the end. You will usually use ones and zeros, so it's easy to see how many ones and how many zeros there are. Then gender, age, setting, and the behavior columns, like large number. Your data file will look somewhat like this when you're doing your formal data collection: after you're done collecting, you open Excel and transfer what you recorded on the physical paper sheet into the file. That's how we make it into a data file. This is data entry — pretty straightforward.

Next, we're going to talk about how to calculate inter-rater reliability. We have three kinds of inter-rater reliability statistics, which cover all of the configurations we have in our class. For those of you who have two raters and are recording categorical data — absent or present of a behavior — you will use Cohen's kappa. For those of you who have two or more raters and are tallying the behavior, which means you have continuous data, you're going to use the intraclass correlation coefficient, abbreviated ICC.
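The layout just described — one column per variable, one row per time sample, 0/1 codes — can be sketched with Python's csv module. The file name and the example rows here are invented for illustration:

```python
import csv

# One row per two-minute time sample; 0/1 codes for the grouping
# variables and for absent/present of each behavior.
header = ["interval", "gender", "age", "setting",
          "small_number", "spatial_words", "large_number"]
rows = [
    [1, 1, 0, 1, 0, 1, 0],   # interval 1: spatial words present
    [2, 1, 0, 1, 1, 0, 0],   # interval 2: small number words present
    [3, 1, 0, 1, 0, 0, 0],   # interval 3: no target behaviors observed
]

with open("observation_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(rows)   # both Excel and SPSS can open a CSV like this
```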
And if you have categorical data — absent or present — but three raters, which some of you do, you will use Fleiss' kappa. We could also use Krippendorff's alpha, but that requires an extra SPSS add-on or separate software to download, so we're not going to use it; we'll use Fleiss' kappa when you have categorical data and three raters. I'll explain these one by one.

For Cohen's kappa, the assumptions are: first, the ratings or observations are made by exactly two raters — Cohen's kappa only works for two raters. Second, the variable is measured as categorical data, like ones and zeros, yes or no; you can only have categorical data, not continuous data. Third, both raters assess the same observations: as I mentioned, you and your partner will observe the same setting and the same kids at the same time for the inter-rater reliability calculation. Fourth, each rating must have the same number of categories, which we do — for example, each behavior has two rating categories, present or absent, so all of your behavior categories have two categories, ones or zeros. And fifth, the two raters are independent: you cannot rely on each other's rating while you are observing. You have to be independent, each doing your own observation, so that you and your partner can calculate your inter-rater reliability. Cohen's kappa ranges from 0 to 1, where 0 means there is no agreement at all
and 1 means perfect agreement — you agree with each other 100% of the time. Both extremes are pretty rare. Here is how you interpret the kappa you see in the SPSS output: from 0 to 0.20 is poor agreement, 0.21 to 0.40 is fair agreement, 0.41 to 0.60 is moderate agreement, 0.61 to 0.80 is good agreement, and 0.81 to 1.00 is very good agreement.

Now I'm going to run a sample Cohen's kappa so you know how to use SPSS for this analysis. It's just a couple of clicks away, so it's pretty simple; the thing is that you need to know how to input the data and interpret the output. Let's look at the SPSS file. This is a sample file in Data View. You can see rater 1's small number, rater 2's small number, rater 1's spatial words, and rater 2's spatial words. If you have a third variable — a third behavior — you keep adding columns from here: rater 1 large number and rater 2 large number. We can set the decimals to zero; that's just easier to read, but it doesn't really matter. You can type the data in after you create the variable names, or — and I'll show this during our SPSS refresher — you can transfer the data from an Excel file into SPSS. Today I'll assume you already have the different raters' ratings in SPSS, and I'll show you how to calculate inter-rater reliability. First we want to calculate Cohen's kappa for small number word use: we want to see how rater 1 agrees with rater 2 in their observations of kids' use of small number words.
To run Cohen's kappa, click Analyze — hopefully you can see the top bar right here — then go to Descriptive Statistics, and from there click Crosstabs. Let me reset that. This is the window you will see when you click Crosstabs. We want to examine how rater 1 and rater 2 agree with each other on the small number observation — basically, of the times they observed small-number talk in children, how often do they agree with each other? So we put one rating into Rows and the other into Columns — basically cross-tabulate them — and then go to Statistics. When you click Statistics, this window pops up, and you check Kappa, right here. I've already checked it; when the window pops up for the first time you'll see a screen like this, you check Kappa, click Continue, and then OK.

This is the output you will get. This table tells you how the raters' codes line up: seven times both rater 1 and rater 2 coded the behavior absent, so they agreed. One time rater 1 coded the behavior present but rater 2 coded it absent. Two times rater 1 thought there was no behavior present but rater 2 thought there was. And five times both raters agreed that the behavior was present. A total of 15 time samples were entered, because 25% of 60 time samples is 15. Depending on how many time samples you collect in the end, that 25% will be different; since I have 60 time samples here, 25% of that is 15 time samples.
So among the 60, 15 of them need to be double-coded. And this is the value of kappa: 0.595. You can refer back to the interpretation table I showed you earlier — this is moderate agreement. Then this is the confidence interval — you will always report the confidence interval in your report — and this is the significance level.

That's how you run Cohen's kappa; now let's look at how you report it. You write something like: "Cohen's kappa was run to determine the inter-rater agreement on [behavior] between two independent raters" — we just did small number use in children, and you and your partner are the two independent raters — "Cohen's kappa showed moderate agreement between the raters' observations." The word "moderate" is what you change depending on your result. Kappa is written with the κ symbol, and you fill in the number you just saw: κ = .595. Then you report the confidence interval — let me put it here so it's easier to see: in this output it's .04 to .048, so you report those two numbers. And the significance level is right here, so you directly report it: p = .044. So for this example I write: small number, between two independent raters, and Cohen's kappa showed moderate agreement — this is indeed moderate, because the kappa is about 0.6.
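For anyone who wants to check what SPSS is doing here, Cohen's kappa can be computed by hand from the crosstab counts above (7 agreed-absent, 1 present/absent, 2 absent/present, 5 agreed-present). This is a plain-Python sketch of the standard formula, not the SPSS implementation itself, and the interpretation bands are the ones from the table earlier:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n   # observed agreement
    c1, c2 = Counter(rater1), Counter(rater2)
    # Chance agreement: summed products of each rater's marginal proportions.
    p_e = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (p_o - p_e) / (1 - p_e)

def interpret_kappa(k):
    """Interpretation bands from the lecture's table."""
    if k <= 0.20: return "poor"
    if k <= 0.40: return "fair"
    if k <= 0.60: return "moderate"
    if k <= 0.80: return "good"
    return "very good"

# Reconstruct the 15 double-coded intervals from the crosstab in the demo:
# 7x both absent, 1x only rater 1 present, 2x only rater 2 present, 5x both present.
pairs = [(0, 0)] * 7 + [(1, 0)] * 1 + [(0, 1)] * 2 + [(1, 1)] * 5
r1 = [a for a, b in pairs]
r2 = [b for a, b in pairs]

kappa = cohens_kappa(r1, r2)
print(round(kappa, 3))          # 0.595, matching the SPSS output
print(interpret_kappa(kappa))   # moderate
```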
And 0.6 is moderate agreement between the raters. So that is how you report inter-rater reliability.

Now let's move on to Fleiss' kappa. Remember, Fleiss' kappa is used for how many raters? Three or more. And on what type of data? Categorical data. So when you have absent/present coding for your behaviors and you have three people in your group, you will use Fleiss' kappa to run your analysis. Even if you don't have three people in your group, you still need to know how to run this and what it means, because it's going to be tested on the exam. So I'm going to show you the SPSS file. Let's move on — right now I'm going to use this data file to show what happens if there are three raters. For this behavior we have three raters, and we're just going to enter rater 3's data; I'm going to fill it in randomly for the demonstration. By default there are decimal places; that doesn't matter, but I usually set the decimals to zero, since we're using whole numbers and there's no need for decimal places — it's up to you, but the data just looks cleaner. So now we have three raters in this data file. We're not going to run the spatial words one right now per se, but just to make the file more complete, I'll add a third rater for it too. Notice that SPSS won't allow a duplicate variable name: I typed rater2spatial again, so SPSS refused it — that was a typo, and I should have put rater3spatial.
This is nominal data, so you change the measure to Nominal. It's the same idea if you want to add a third rater's rating for another variable — you insert the third rater's column right there. And you don't have to hand-calculate anything about your agreement; SPSS will do that for you.

Let's do the same thing for the small number behavior, now with three raters and categorical data. We're going to go to Analyze, but instead of Descriptive Statistics, go to Scale, and under Scale choose Reliability Analysis — this is what we're going to use. For Fleiss' kappa you put the three raters' ratings into the Ratings box, since basically you want to see how they agree with each other. Next, click Statistics, and this window pops up; by default it looks like this, and you check the box right here: Interrater Agreement: Fleiss' Kappa. That's the box you want to check. Then check Display agreement on individual categories, and that's it — click Continue, don't change anything else, and click OK.

Now I'll show you the output file. In the output you will see the overall agreement: kappa is right here, 0.550; this is the 95% confidence interval, with the lower bound and the upper bound; and this is the significance level. So that's how you calculate inter-rater reliability — kappa — when you have three raters and categorical data for your behavior, that is, absent or present coding.

Now let's move on to reporting this kappa. For Fleiss' kappa I basically copied the previous table: the kappa interpretation is the same. And how do you report it? Right here we say small number: "Fleiss' kappa was run to determine the inter-rater agreement on small number among three independent raters. Fleiss' kappa showed moderate agreement between the raters' observations," and we fill in the numbers. It uses the same symbol as Cohen's kappa — it's still κ, just a different type of kappa. The overall agreement is 0.550, so we put that in: κ = .550 — this is how you report it in your paper. And what is the confidence interval? Let's look here — I'll put it right here so it's easier to see: it's .258 to .842. When you are writing your paper, you need to highlight these numbers — after you export the output file, you'll be able to highlight the numbers you're reporting — and you need to report them in APA style. You can also save your SPSS output file as a PDF so you can attach it as an appendix at the end of your paper; make sure that after you run your analysis you save the files. And this is the p-value — the significance level right here is less than .001.

All right, so that's how you report Cohen's kappa or Fleiss' kappa in APA format. Now let's keep moving to the next section: what if you have continuous data? Then you will use the intraclass correlation coefficient.
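As with Cohen's kappa, Fleiss' kappa can be cross-checked outside SPSS. Below is a short pure-Python sketch of the standard Fleiss formula; the three raters' 0/1 ratings are invented for illustration (the demo's rater 3 data was entered randomly, so the 0.550 above can't be reproduced exactly):

```python
def fleiss_kappa(ratings, categories=(0, 1)):
    """Fleiss' kappa for N subjects each rated by the same number of raters.
    `ratings` is a list of per-subject rating tuples, e.g. (0, 1, 1)."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    # How many raters chose each category, per subject.
    counts = [[subj.count(c) for c in categories] for subj in ratings]
    # Mean per-subject agreement P-bar.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects
    # Chance agreement P_e from the overall category proportions.
    totals = [sum(row[j] for row in counts) for j in range(len(categories))]
    p_e = sum((t / (n_subjects * n_raters)) ** 2 for t in totals)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical absent/present codes from three raters over 10 intervals.
ratings = [(0, 0, 0), (1, 1, 1), (1, 1, 0), (0, 0, 0), (1, 1, 1),
           (0, 1, 0), (0, 0, 0), (1, 1, 1), (0, 0, 1), (1, 1, 1)]
print(round(fleiss_kappa(ratings), 3))   # 0.598 for this made-up data
```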
If you have continuous data, the intraclass correlation coefficient is what fits, and it's suitable for two or more raters: if you have two raters and continuous data, you use the intraclass correlation coefficient, and if you have three raters using tally marks — still continuous data — you use the intraclass correlation coefficient as well. So it's very straightforward. Unlike categorical data, where two raters need one SPSS analysis (Cohen's kappa) and three raters need another kappa (Fleiss'), with continuous data you use the intraclass correlation analysis no matter how many raters you have.

The intraclass correlation coefficient describes how strongly ratings in the same group correlate with each other. It assesses the consistency of ratings made by observers or raters measuring the same continuous variable — in our case, the frequency of a behavior in each time sample. All of the raters are independent of each other: you are not interdependent, you are coding your behaviors independently. Next I want to show you how to input data for a continuous variable.
It's pretty simple, and it's just like what you did for your pilot testing: you tally-mark the behaviors, and when you're calculating inter-rater reliability you enter the frequency — in each time sample you will have a frequency for each type of behavior you observed. So it's pretty straightforward. The interpretation of the intraclass correlation coefficient runs from 0.5 up to 1: from 0.5 to 0.74 we would say it is poor to moderate, from 0.75 to 0.86 it's good agreement, and above 0.86 it's an excellent level of agreement. You can refer to this table as you interpret your inter-rater reliability.

Now we're going to do an analysis. Here is our data file: this is continuous data, and we have two raters for it — as I mentioned, with three raters it's still the same procedure. Rater 1's small number rating is right here — the frequency of small number words — and rater 2's is next to it; then rater 1's spatial words and rater 2's spatial words. First we're going to analyze the small number behavior and see how rater 1 and rater 2 agree with each other. Click Analyze, then — same as before — go to Scale, Reliability Analysis. The window will pop up, and you put those two ratings into the Items box.
OK, so this is how you set up the intraclass reliability analysis. After you put the two ratings — rater 1 and rater 2 — into the Items box, click Statistics, and this pops up. Scroll all the way down and you'll see Intraclass Correlation Coefficient. Check it, and then we want Absolute Agreement: we want to see whether rater 1 absolutely agrees with rater 2. The Consistency option would basically test whether rater 1's ratings are linearly correlated with rater 2's ratings, but we're not interested in whether they merely have a linear correlation — we want to see whether they have absolute agreement, so you will select Absolute Agreement. That's the first step; then you select the model. Two-Way Mixed is what we're actually going to choose, but I want you to know what these models mean. One-Way Random means only one factor is random: either the raters are randomly chosen, or the people you're observing are randomly chosen — which is very rare, since we always have both raters and kids participating. Two-Way Random means both are random: the raters are randomly chosen, and the kids you are observing are also randomly chosen. But you were not randomly chosen as raters.
You are fixed raters — basically, we chose you. You're not being randomly swapped with other people or randomly assigned to observe with another partner; you are pretty much preset, fixed observers, while the kids you're observing are randomly chosen. So Two-Way Mixed is the model we're going to choose. Once you check Intraclass Correlation Coefficient, leave the model as Two-Way Mixed, select Absolute Agreement right here, click Continue, and then click OK.

All right, now let me show you the output. This is what you see when you run the intraclass correlation analysis. We want the Average Measures row, not Single Measures. The average-measures intraclass correlation here is actually pretty high: 0.929. You also report the confidence interval, and this is the significance level. So that's how we run the analysis for the intraclass correlation coefficient.

And this is how you report this behavior: based on what we saw in SPSS, we update the report. So, small number: "The intraclass correlation coefficient was run to determine the inter-rater agreement on small number among two independent raters. The intraclass correlation coefficient showed..." — not moderate anymore, right? It's very high, so it showed excellent agreement, because we have 0.929.
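For the curious, the average-measures, absolute-agreement ICC that SPSS reports for a two-way model can be computed from the two-way ANOVA mean squares. This is a hedged sketch of that textbook formula in plain Python — the tally counts below are invented, not the class data, so don't expect the 0.929 from the demo:

```python
def icc_absolute_average(table):
    """Average-measures, absolute-agreement ICC for a two-way design:
    (MSR - MSE) / (MSR + (MSC - MSE) / n), where rows are subjects
    (time samples) and columns are raters."""
    n = len(table)          # number of time samples
    k = len(table[0])       # number of raters
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(table[i][j] for i in range(n)) / n for j in range(k)]
    # Two-way ANOVA mean squares: rows (subjects), columns (raters), error.
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    sse = sum((table[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(n) for j in range(k))
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (msc - mse) / n)

# Hypothetical behavior frequencies per time sample from two raters.
tallies = [
    [4, 5], [2, 2], [0, 1], [6, 6], [3, 4],
    [1, 1], [5, 4], [2, 3], [0, 0], [7, 6],
]
print(round(icc_absolute_average(tallies), 3))   # 0.971 for this made-up data
```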
So it showed excellent agreement between the raters' observations: ICC = .929, and the 95% confidence interval is .754 to .977. The significance value — the p-value — is right here, and it's less than .001. Basically, this is how you're going to report inter-rater reliability for each of your behaviors in your paper: for each behavior you will have a little paragraph like this when you're describing your measures.

So that's it for running the inter-rater reliability analyses. Just to recap: when you have two raters and categorical data — absent or present of a behavior — you use Cohen's kappa. When you have two or more raters and a continuous variable, you use the intraclass correlation coefficient. And if you have three raters with categorical data, you're going to use Fleiss' kappa. That's what we just showed you — how to run the analyses and how to do the data entry. Please bring any questions to the lab, and we can answer them there.

A couple of reminders for this week. Our formal data collection starts this week, and next week you're going to carry out another formal data collection — in total, you're going to have 120 minutes of observation time. Make sure you're on time and attending class, because there is no makeup available for data collection. Also make sure you have Excel ready on your computer for this week's lab class, as you will enter your data into an Excel file. Then next week, after you're done with all your data collection, you're going to finish all of the data entry and also calculate inter-rater reliability using SPSS.
One more reminder: this week we have quiz number one, and it's going to be 25 points. It's open book and you will do it on your own time. I will make it available on the day of the lab class, and you have until the end of the week to finish the quiz. Let me know if you have any questions, and that's it for today's lecture.