Hello, students. This is a continuation, part two of chapter three. We're going to start here with the slide on types of frequency distributions, and we will begin our discussion of the normal distribution. You may have heard the phrase "bell-shaped curve" or "bell-shaped distribution." That terminology isn't really used in statistics any longer; it should be referred to as the normal curve or the normal distribution. But you may still hear "bell-shaped curve" or "bell curve." Why? Because it looks like a bell.

So let's take a look at the distribution I'm referring to here. This is an ideal normal distribution, and "ideal" is important here, because it is actually very infrequent that any distribution of scores looks exactly like this. But this is, in theory, what a normal distribution should look like.

So what are some of the characteristics? You can see that frequency is along this axis, so this indicates how many people obtained a particular score. You could be looking at any number of variables, but this particular example is for test scores. For test scores, you can see in this situation that most individuals scored around 30. If you just go over to the side here, to this axis, you can see that this again indicates the frequency, and that the frequency of scores is greatest for 30: a score of 30 occurs more frequently than any other score. As you get to the tails (these ends are called tails, the tails of the distribution), you see that the frequency drops. So a score of 5 occurs with much less frequency, and a score of 55 similarly occurs with a lower level of frequency. So most of the folks in this particular situation are scoring right around the middle here, and the middle in this case is about 30.
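To make the shape concrete, here is a minimal sketch of an idealized normal curve for test scores centered at 30. The lecture gives the center but not the spread, so the standard deviation of 10 is an assumption chosen only for illustration, and the printed bars are a rough text stand-in for the slide.

```python
from statistics import NormalDist

# Assumed parameters: the center (30) comes from the lecture; the standard
# deviation (10) is a hypothetical value, since the lecture does not state it.
scores = NormalDist(mu=30, sigma=10)

# Print the height of the curve at a few scores, with a crude text bar.
for score in (5, 20, 30, 40, 55):
    density = scores.pdf(score)
    bar = "#" * round(density * 1000)
    print(f"score {score:>2}: density {density:.4f} {bar}")

peak = scores.pdf(30)       # height at the middle of the distribution
low_tail = scores.pdf(5)    # height far out in the left tail
high_tail = scores.pdf(55)  # height equally far out in the right tail
```

Under these assumptions the curve is tallest at 30 and equally low at 5 and 55, which is the tail behavior described above.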
You may also notice that it is, in theory, symmetrical. If you imagine a line (I'm drawing a line down here through the middle), this half of the distribution is the same as this half of the distribution. So, going back up to the previous slide, we see that the characteristics are that it is symmetrical, and that the far left and right portions, containing the low frequencies, are referred to as the tails of the distribution. So this is, again, the ideal, theoretical normal distribution.

We will use the normal distribution quite frequently. For many statistical tests, the assumption is that scores are distributed normally: they look like this, or sit close to this. It's an assumption, and it's actually a requirement for many statistics; the statistic will not work unless this assumption is met and the scores are indeed distributed normally.

Some distributions don't look quite that symmetrical, and they are referred to as skewed distributions. A skewed distribution looks similar to a normal distribution except that it is, again, not normal: one of the tails is pronounced and the other is not. A distribution can be either negatively or positively skewed, and whether it is negative or positive depends on the shape of the distribution.

Let's look at some examples. With a negatively skewed distribution, you can see that there are some extreme low scores here that occur with very low frequency, but at the upper end of the scores there really aren't any low-frequency extreme high scores. So the tail looks something like this, at the end of the distribution with the lower scores. A positive skew is just the opposite: you have some extreme high scores that have low frequency here, but really not many low-frequency extreme low scores. It's not always perfect-looking like this, but this is the general pattern for a positively and a negatively skewed distribution. Let's take a look at some examples. So here are
some examples of negatively skewed distributions. In the first one, we have the 2012 men's long jump qualifying rounds, where they made three attempts, and then the data were plotted in terms of the length they achieved in their long jump (perhaps the best of the three, I'm assuming). You can see that most of the men scored right around here, in this 7.5 to 8 range. Again, this peak here indicates a high frequency, but there are some low-frequency, very low scores.

Here's another example of a negatively skewed distribution: the age at death of Australian males in 2012, another 2012 study. You can see that there are some low-frequency occurrences down here at this end of the scale, where there are some deaths at a very young age, and you can even see this elevated bar here right around birth, it looks like. Now, for a negatively skewed or positively skewed distribution, or for any distribution really, it's not going to be perfect, so you can see some little blips, in this case right around childbirth, presumably. But in general you see a negatively skewed pattern, where you see some low frequencies here at the lower end and then most of the scores up at the higher end, between 70 and 90.
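The direction of skew can also be checked numerically. The sketch below uses the third standardized moment, one common skewness measure that the lecture itself does not define, and the distances are made-up values that only imitate the long jump pattern (most results between 7.5 and 8, a few very low ones). They are not the actual 2012 data.

```python
from statistics import fmean

def skewness(values):
    """Third standardized moment: negative when the long tail is on the
    low side, positive when the long tail is on the high side."""
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])  # average squared deviation
    m3 = fmean([(v - mean) ** 3 for v in values])  # average cubed deviation
    return m3 / m2 ** 1.5

# Hypothetical distances: clustered near 7.5 to 8 with a few very low results.
jumps = [5.1, 6.4, 7.2, 7.5, 7.6, 7.7, 7.8, 7.8, 7.9, 8.0, 8.0, 8.1]

# Flipping the sign of every value mirrors the shape, so the long tail
# moves to the high side and the skew becomes positive.
mirrored = [-x for x in jumps]

negative_skew = skewness(jumps)
positive_skew = skewness(mirrored)
print(f"low-side tail:  {negative_skew:+.2f}")
print(f"high-side tail: {positive_skew:+.2f}")
```

The sign is the useful part here: the few very low distances pull the measure below zero, and the mirrored data give the same magnitude with the opposite sign.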
Makes sense. So now let's look at some positively skewed distributions. What do we have? The first one is length of stay in hospital after surgery. You have most people staying a few days, maybe up to a week, but then you have some people staying for a long time. So you have a low frequency of the values here at this end of the scale, but you don't have that same pattern here at the lower end. So you have some low frequencies at the higher end here. And the classic is household income, where you have most people with a household income right around here in this range, but then you have some low-frequency but very extreme high incomes. Notice here that you have, again, not a perfectly positively skewed distribution: you have some elevations of the bars here at the very extreme, and again, you will sometimes find that. But the basic pattern is a positively skewed distribution.

What about a bimodal distribution? A bimodal distribution is as it sounds: "bi," two modes. What this kind of distribution shows us is that there are actually two places in the distribution of scores with high frequency, so bimodal. Let's say these are scores on an exam. Bimodal distributions are sometimes seen in classrooms, in academia, on exams, where you'll have quite a few students performing, let's say this corresponds to a score of 40, right around 40, and then another high-frequency score right around 80.
Another popular score, if you will. So you have two modes, two humps in the distribution, if you will. It drops down around the middle scores, but it's higher at some value of a lower score and some value of a higher score. So, two distinct humps: a bimodal distribution. You could also have a rectangular distribution, which is where the frequency is the same across all the scores. Not terribly interesting statistically.

Okay, so what you're going to see moving forward is that we're going to talk about the normal distribution, and we're going to be able to make statements about how likely a particular score is in the distribution: how many people scored right around that score, or at that score or less. There are all sorts of pieces of information you can get if a group of scores is normally distributed.

For instance, let's talk about finding relative frequency. We talked about relative frequency earlier in the chapter, so you know that it is how frequently a score occurs relative to other scores. The proportion of the total area under the normal curve that is occupied by a group of scores corresponds to the combined relative frequency of those scores. So, for instance, if we're looking at all the scores below 30, it's going to be the combination of the likelihood of all of these scores.

Percentiles. We can use a normal distribution to find percentiles. The percentile for a given score corresponds to the percent of the total area under the curve that is to the left of the score. It's always to the left of the score. Let's take a look at that. So 0.50: this particular percentile refers to this midpoint in the distribution, and so what you would say is that 30 represents the 50th, or 0.50, percentile in this distribution. That also means that 50 percent of the scores are below 30.
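The link between area under the curve and relative frequency can be sketched in a few lines. This assumes the same idealized curve centered at 30 with a standard deviation of 10 (the spread is not stated in the lecture), and it compares the area to the left of 30 with the relative frequency of simulated scores below 30. The sample size and seed are arbitrary choices.

```python
from statistics import NormalDist

# Center from the lecture; the standard deviation of 10 is an assumption.
scores = NormalDist(mu=30, sigma=10)

# Area under the curve to the left of 30, i.e. the percentile of a score of 30.
area_left_of_30 = scores.cdf(30)

# Draw simulated scores and count how often a score falls below 30.
simulated = scores.samples(10_000, seed=1)
relative_frequency = sum(s < 30 for s in simulated) / len(simulated)

print(f"area to the left of 30:        {area_left_of_30:.3f}")
print(f"relative frequency below 30:   {relative_frequency:.3f}")
```

The area to the left of the midpoint is one half, and the simulated relative frequency lands close to it, which is the sense in which 30 sits at the 50th percentile.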
Now, you can look at the percentile for any given score with a normal distribution, because there are assumptions about what percentage of scores falls in specific areas of this distribution, and again, we'll see more on this later. A score of 20 in this case is at the 15th, or 0.15, percentile of the distribution, so that means 15 percent of the scores are below 20. And 85 percent of the scores are below 45 with this particular distribution.

So with this normal distribution we can talk about percentiles, areas that are to the left of a specific score, and you can look at how likely that particular score is, or the percentage of scores that are below that score, or in any given area of the normal distribution. You will see that we'll be able to talk about the percentage of scores that are between 20 and 30 given the normal distribution, or the percentage that are actually above 40. But whenever the term percentile is used, that refers to the area under the curve to the left of the score that we are referring to.

So that is about it for now with percentiles and relative frequency. If it seems a little bit out of context right now, that's understandable, but know that we will use those concepts quite frequently in future chapters, so they will make much more sense in context. All right, that's it for the lecture. See you at the next one.
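As a follow-up to the percentile statements in this lecture, here is a rough sketch of how areas to the left of, between, and above scores could be computed. It assumes a normal curve centered at 30 with a standard deviation of 10; that spread is a hypothetical choice that roughly reproduces the "about 15 percent below 20" figure, so the results will not necessarily match every number on the slide.

```python
from statistics import NormalDist

# Center from the lecture; the standard deviation of 10 is an assumption.
scores = NormalDist(mu=30, sigma=10)

# Percentile of 20: the area under the curve to the left of 20.
below_20 = scores.cdf(20)

# Area between 20 and 30: area left of 30 minus area left of 20.
between_20_and_30 = scores.cdf(30) - scores.cdf(20)

# Area above 40: everything that is not to the left of 40.
above_40 = 1 - scores.cdf(40)

print(f"below 20:          {below_20:.3f}")
print(f"between 20 and 30: {between_20_and_30:.3f}")
print(f"above 40:          {above_40:.3f}")
```

Every quantity is built from areas to the left of a score, which is why the percentile definition is enough to answer "between" and "above" questions as well.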