Transcript for:
AP Statistics Unit One Overview

All right, so let's tackle everything you need to know for AP Statistics Unit One, starting off with the difference between categorical and quantitative data. Quantitative data just deals with numbers, so think quantity, numbers. That's stuff like heights, class size, population size, anything you can put a number on. Categorical data is stuff like names and labels, so think eye color, hair color. You can't put a number on that.

So we're going to branch off here and talk about categorical data, which you want to represent with a two-way table. You've probably seen these before, but basically you have two variables, one on either side, and it just shows you the intersections between them. So as you can see here, we have math students. Sure, we have 10 math students, but then we also have the internal and external categories that split up the number of math students. So we have three math students who are internal and seven who are external, and the same goes for English.

So the important thing you need to understand is a couple of vocab terms that the AP exam might use, or your teachers might use on your midterm. The first is marginal relative frequency. That's the percentage of the data in a single row or column compared to the total. If you look at this little chart here, that's going to be b over d or c over d. So our column total is c over d, and our row total is b over d.

The next thing we're going to look at is joint relative frequency. That's the percentage of data in a single group compared to the total. "Group" here might be a little confusing; we're just going to look at a, right? That's a single cell, the intersection between two variables. So that's a over d, and if we look in our two-way table with actual variables, that's something like math internal, so that would be 3 over 23. Or maybe we want to look at external English, so we look down English, external, so that's 9
over 23.

Okay, so next we're going to talk about conditional relative frequency. That's the percentage of data in a single category when you're given a specific group. So now we're going to be looking at a over b or a over c. That's just a, right, over the total in its row or column, so a over c or a over b. In our little two-way table, let's say we're given that the student is in math. So our total would be 10, and then we want to know what percentage of that is internal. Well, we have three, so that would be 3 over 10, as an example for that part.

Okay, so now let's go on to quantitative data. For quantitative data, you want to know how to describe it. A big part of AP Stat is just interpreting, because a lot of the calculations are pretty straightforward: you have a reference table, you can use your calculator, all that. But you need to know how to interpret. So for quantitative data, use the acronym CSOCS. The C stands for context. S stands for shape, so is it symmetrical, is it skewed, the number of peaks, is it unimodal, bimodal, all that. Look at the outliers, right, any data points that are way out wide compared to the rest of the data. You have your center, right, so look at your mean or your median. You have spread, right, that's just your range, your standard deviation, and your IQR. And here's a little tip: when you are describing the data, use descriptive language like "strongly" or "roughly," like "roughly symmetrical," and also use comparative language to maximize the number of points you can get on the exam.

Okay, now let's go on to some basic terms you must know. You've probably seen these before. Mean, that's just the sum of all the values divided by the number of values, or the average. Standard deviation is just a measure of variation, and here's an important part when you're describing it in context. Okay, so you want to say like
the [value in context] typically varies by [standard deviation value] from the mean of [mean]. That sounds kind of weird, so let's look at an example: the average prep subscriber's IQ typically varies by five IQ points from the mean of 169 IQ points. So the subscriber's IQ is the value in context (always put it in context), five IQ points is the standard deviation, and 169 IQ points is the mean. The median is just the 50th percentile. So if you have a data set, you organize it from least to greatest and then look at the value in the middle, right? You've probably heard of that. And range is just the value, not the interval, the value of max minus min in your data set.

Okay, so the last thing for exploring data in this part is how to make box plots. You need to know the five-number summary. The first number in the summary is your minimum, that's just the smallest value you have. Then you have the 25th percentile, or Q1. That sits between your minimum and your median, right? It's pretty much the median of the values below your 50th percentile, if that makes sense. The next number is your median, or 50th percentile. Then there's Q3, or your 75th percentile, and then your maximum value.

Okay, so the next thing is IQR. IQR is just Q3 minus Q1. That's important because you also need to be able to describe specific outliers and their characteristics, that being low-end outliers or high-end outliers. A low-end outlier, like the name suggests, is just a super, super low outlier, so that's going to be any value less than Q1 minus 1.5 times the IQR. And the same thing for high-end outliers: a super high outlier is any value greater than Q3 plus 1.5 times the IQR. So make sure you know those two equations. And here we just have a visual diagram, because box plots are cool to draw.

All right, so to round off AP Stat Unit One, we just have a couple more terms and then we'll get into normal distributions. The first is officially defining percentile. We did talk about it, but how I
think you should think about it is that it is the percentage of values that are less than or equal to a specific value. Then we also have cumulative relative frequency, which shows the cumulative percentages from each interval up through all the data. Here we have a visual. You can see that where we have data, the graph goes up, but where we don't have any data, it's just a flat plateau. And then when we get data again, it goes up again, because it's pretty much, I would say, a running total of the relative frequencies. And relative frequency is just the proportion of the time something occurs, so that's the frequency over the total.

Okay, so now let's talk about z-scores. Z-scores tie back to the idea of standard deviations. A z-score is simply the number of standard deviations a value is away from the mean, and this is the official equation for it: it's whatever value you're finding the z-score for, minus the mean, over the standard deviation.

All right, so now we're going to talk about what happens when you transform data. Let's say you have a data set. What if you added a certain constant, like you added five to every single value, or maybe you multiplied every value by 20? What would happen to the shape, the variability, and also the center? Well, here is a summary. If you add or subtract the same amount from all the data values, the shape and variability will always stay the same. The center, so that's your mean or median, will move up or down by that amount. Now if you multiply or divide, that's a different story. The shape will still stay the same, but now your center and your variability will be multiplied or divided by that amount.

All right, so now let's talk about density curves and normal distributions. A density curve is on or above the horizontal axis, has an area of one, and just shows a probability
distribution. Note that a normal distribution is a type of density curve. But know that when you talk about a uniform density curve, it's quite rare. You might see questions on this, though, and you can see pretty clearly that it has a total area of one. The more common density curve you'll see is the normal distribution. You'll 100% see this, and you've probably seen this before. You need to know pretty much the standard deviation and the 68-95-99.7 rule: 68% of the values are within one standard deviation of the mean, 95% of the values are within two standard deviations of the mean, and 99.7% of the values are within three standard deviations of the mean.

To solve normal distribution problems, you're just going to use your calculator. I mean, that is the simplest way. Know your calculator commands; go and study those. normalpdf gives the height of the normal density curve at a specific value, which is not a probability, since the probability at any single exact value of a continuous variable is zero. normalcdf gives the probability that the normally distributed variable falls within a set interval. And then we have inverse normal (invNorm), which pretty much does the reverse of normalcdf: it finds the value that corresponds to a given percentile, or on your calculator it might be denoted as, quote unquote, "area."

Okay, all right, so the final thing we're going to talk about is super niche, but I'm still going to include it because it's on the CED. That's a normal probability plot. What it does is plot the actual values versus the theoretical z-values. These theoretical z-values are the z-values you would get if the actual data points were normally distributed. So it pretty much just shows you how well the data fits a normal distribution, and that's something you might see. So basically all you need to know is that if it's roughly linear when you plot this, or they might just give you the graph, the data is roughly normally distributed, and if it's not linear, it's, you know, not roughly a normal
distribution. And yeah, that does it for everything you need to know for AP Statistics Unit One.