Transcript for:
Categorical Data Analysis Basics

Hi everyone, this is Matt Teachout with Intro Stats, and today we are starting a discussion about basic categorical data analysis. We've learned that there are two types of data: categorical data and quantitative data. We said categorical data is data that describes. Usually it's words that describe people or objects, like what city you live in, what state, what school you go to, all those kinds of categorical questions. So today we're going to look at how to start analyzing that data a little bit.

When you're talking about categorical data analysis, one of the key things is this idea of percentages and proportions. This is probably a review for a lot of you, but think of a percentage as a part or an amount out of 100. Even the word percent: "per" means divided by or out of, and "cent" comes from the Latin word for hundred. So the word percent literally means out of 100, or divided by 100. When we say 35%, we're saying 35 out of 100. That's a good way to think about it.

In stats, though, what you're often looking for is a proportion. When somebody in statistics says proportion, or a computer program asks you to type in a proportion, they're looking for the decimal equivalent of a percentage. If you convert the percentage into its decimal equivalent by dividing by 100 and dropping the percent sign, that's what we refer to as a proportion. I know some of you who have been in algebra classes hear the word proportion and think cross multiply and solve for x, right? But that's really not what we mean by proportion in statistics. When we say proportion in stats, we mean the decimal equivalent of a percentage.

So one of the first things you should be comfortable with is converting proportions into percentages and percentages back into proportions. These are basic conversions, but they're really important, and you really want to make sure you're able to do this. A lot of times,
when we read an article that gives us a percentage and we want to do some more advanced analysis of that percentage, we have to convert the percentage back into a proportion before we can put it in the computer. Or, if the computer does some kind of analysis, a lot of times it's going to calculate the proportions for each categorical variable, so the result is going to be in decimal form, but when I want to explain that to someone, I might want to turn it back into a percentage. So you really want to be comfortable going back and forth: converting a decimal proportion into a percentage, or converting a percentage back into a proportion.

We'll start with converting a percentage into a proportion. Let's start with 33.7%. How would I convert that into its decimal equivalent? Well, remember the word percent means per 100, or divided by 100. That's what the percent symbol means: divided by 100, or out of 100. So really all you have to do is divide the number by 100. 33.7% is really the same as 33.7 divided by 100. You could do that on a calculator or your cell phone. Another way to do it: dividing by 100 moves the decimal two places to the left. Some of you may have learned a rule in the past where you say move the decimal two places to the left, and you can totally do that. If you move the decimal two places to the left, you get 0.337. That is the decimal proportion equivalent of 33.7%.

Now suppose we have 100%. 100% is a very famous percentage, right? What would be the decimal equivalent of 100%? Some people have trouble with these because the decimal point is not written.
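
As a side note, the divide-by-100 rule described so far can be sketched in a few lines of Python. This is only an illustration and not something shown in the lecture; the function name percent_to_proportion is made up for the example.

```python
def percent_to_proportion(percent):
    """Convert a percentage (e.g. 33.7 for 33.7%) into a decimal proportion."""
    # "Percent" means "out of 100", so the conversion is a division by 100.
    return percent / 100


# round() is used only to tidy up floating-point noise for display.
print(round(percent_to_proportion(33.7), 6))  # 0.337
```

Dividing by 100 in code is the same operation as moving the decimal two places to the left by hand.
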
But if you're using that rule about moving the decimal two places to the left, remember that when you have a whole-number percentage the decimal is still there; it's just after the ones place. So even though it's not written, it's really right after that last zero. Now I can move it two places to the left, or I can simply do 100 divided by 100, and you get 1. The decimal equivalent of 100% is 1. That's why, in density curves, the total area under the curve is equal to 1.

What about 6%? Again, same thing: you can write that as 6 divided by 100 in your calculator. By the way, don't do 100 divided by 6; it has to be 6 divided by 100. If you do that on your calculator, you get 0.06. I always have some students ask me, can I write this as .06? Yeah, sure, it's the same thing, it's equivalent, though most of the time you'll see a zero written before the decimal point just to show that there is a zero in the ones place. Now, what if I'm using the move-the-decimal-two-places-to-the-left rule? Again, I don't see the decimal there, right? It's just 6. But remember the decimal is there, just after the ones place. The decimal always comes after the ones place in our number system. So if we move it two places to the left, I get .06. So if you see the percent sign, divide by 100.

Now what if I want to convert a decimal proportion into a percentage? What if I want to put a percent sign on a number? Maybe these were numbers that were calculated by a computer program, and now I want to explain them to somebody, so I want to turn them into percentages. What could we do? Well, to convert a percentage into a proportion you divide by 100, so to go the other direction you just have to multiply by 100. So I kind of
think of it as multiplying by 100 and sticking on the percent sign, or taking 100% of the number. You're basically always just multiplying the decimal by 100%. So take 0.018: that's 0.018 times 100, and then stick on the percent sign. I could do that in my calculator, 0.018 times 100, or remember that multiplying by 100 moves the decimal two places to the right. When you multiply by 100 the number is getting bigger, so the decimal has to go to the right. When you divide by 100 the number is getting smaller, so the decimal has to go to the left. So you could also just move the decimal two places to the right. Either way, you're going to get 1.8%.

How about this one: 0.873? Again, same thing. I can either multiply the number by 100 or move the decimal two places to the right. So 0.873 times 100, stick on the percent sign, which again moves the decimal two places to the right, and I get 87.3%.

How about 1? Again, I don't see the decimal, right? People always have trouble with whole numbers, especially if you've learned the rules about moving the decimal two places to the left or two places to the right. Again, remember the decimal is there; it's just after the ones place. So if you have a whole number like 1, the decimal is just after the 1. Or you can just think of it as 1 times 100 and stick on the percent sign. Well, 1 times 100 is just 100%. By the way, if I was using the rule of moving the decimal two places to the right, I would just go one, two, and then I would have to add a couple of zeros as placeholders.

Now this last one is a little interesting. A lot of times you're dealing with proportions that are very close to zero. Later in the class we'll get into something called a p-value, which is a very famous proportion, and you'll often see numbers that are super, super low and super
close to zero. When that's the case, computer programs oftentimes will write the answer in scientific notation. What this means is 2.3 times 10 to the negative 5. By the way, you can get this on a calculator too: if you were calculating something and the answer gets really close to zero, sometimes you'll see the calculator write it this way. Times 10 to the negative 5 just means move the decimal five places to the left. If it was 10 to the negative 4, you'd move the decimal four places to the left. If it was 10 to the negative 3, you'd move the decimal three places to the left. So this means move the decimal five places to the left.

Think of it this way. Here's my 2.3, right? I'm going to go one, two, three, four, five, and that's where my decimal really is. That's what times 10 to the negative 5 means. Now, if you notice, I'm going to have four placeholders, so I'm going to need four zeros there. That's really the decimal number this is representing. You could put a zero in front of the decimal if you want, but there should be four zeros after the decimal and then 23, so 0.000023. That would be the decimal proportion if I was typing that number into a computer program or something.

But the question was, how do you convert that into a percentage? The first thing I had to do was convert the scientific notation into regular decimal notation, and now I can go ahead and convert it. I can multiply by 100, or move the decimal back two places to the right. So if I do times 100%, which again moves the decimal, it's kind of like knocking off two of the zeros, and I'm going to get 0.0023%.

All right, so you should be pretty comfortable converting back and forth. By the way, don't forget to put the percent symbol on there: 0.0023
percent is very different from saying 0.0023. They're not the same thing.

All right, so usually when you get categorical data, one of the first things you want to know is the counts, or the frequencies. Sometimes we call that the amount, or sometimes you'll hear stat books say the number of successes. That's sometimes denoted by the letter x: the number of people or objects that have a certain characteristic in your categorical data. And then you have n, which is the total sample size. Sometimes you'll hear that called the total frequency or the total number of trials. Sample size is another name for the total number of people or objects in your