2.3 Graphs that Enlighten or Deceive

In this lesson we're going to look at some different types of tables and graphs, some of which will tell us very meaningful information about the data, while others will tend to be more misleading.
We'll begin with what's called a frequency polygon. A frequency polygon uses line segments connected to points that are located directly above class midpoint values. Unlike our histograms, where we created bars with the heights for each class, with a frequency polygon we simply plot a point for each class. It's important that we use class midpoints and that the height corresponds to frequency. A variation of this is the relative frequency polygon, where we change the vertical axis from frequency to relative frequency (proportions or percentages).
Here are some important features of frequency polygons. Heights correspond to class frequencies. Line segments are extended to begin and end on the horizontal axis. That means that after we've plotted a point corresponding to each class midpoint, we must connect our polygon down to the horizontal axis, one class width before the first midpoint and one class width after the last midpoint. Also, putting two or more polygons on the same graph can really help us compare two sets of data.
Let's try out an example. Given this frequency distribution, we're going to construct a frequency polygon. We first need the class midpoints, and we will label these class midpoints on the horizontal axis. Now let's label the vertical axis. Since our frequencies simply go from 0 to 5, let's let each one of these lines represent one unit. Now we're ready to plot points for each class. The first class has a midpoint of 4.5 and a frequency of 1, so we'll first plot the point above 4.5 with a height of 1. The second class also has a frequency of 1, so above the midpoint 14.5 we will plot a point corresponding to a height of 1. Our third class has a frequency of 2, so above 24.5 we will plot a point with height 2.
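The whole construction in this example can be sketched in a few lines of Python. This is only a rough sketch: the midpoints and frequencies are the ones read out during this walkthrough, the helper name `polygon_vertices` is made up for illustration, and it assumes the class midpoints are equally spaced. It lists the points to plot, including the two extra points on the horizontal axis, one class width before the first midpoint and one class width after the last.

```python
def polygon_vertices(midpoints, frequencies):
    """Return the (x, y) vertices of a frequency polygon, left to right.

    Hypothetical helper for illustration; assumes equally spaced midpoints.
    """
    if len(midpoints) != len(frequencies) or len(midpoints) < 2:
        raise ValueError("need matching lists with at least two classes")
    width = midpoints[1] - midpoints[0]  # distance between consecutive midpoints
    start = (midpoints[0] - width, 0)    # extra point on the axis, left of the first class
    end = (midpoints[-1] + width, 0)     # extra point on the axis, right of the last class
    return [start] + list(zip(midpoints, frequencies)) + [end]


# Midpoints 4.5, 14.5, ..., 94.5 and the class frequencies from the example.
midpoints = [4.5 + 10 * i for i in range(10)]
frequencies = [1, 1, 2, 0, 0, 4, 2, 4, 5, 4]

vertices = polygon_vertices(midpoints, frequencies)
for x, y in vertices:
    print(x, y)
```

Connecting these vertices in order with straight segments gives the polygon. Classes with a frequency of 0 simply contribute a point on the horizontal axis.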
Our next two classes each have a frequency of 0, so above 34.5 and 44.5 we need to plot points that correspond to a height of zero, which is on the horizontal axis. The next class has a frequency of 4. Next we have a frequency of 2; following that we have a frequency of 4, then 5, and lastly 4. Now that we have our points located directly above our class midpoint values, we can connect these points with line segments. Working from left to right, we connect consecutive points with straight lines.
The last thing we need to do is connect the endpoints back down to the horizontal axis. It's important not to simply draw a line straight down to the horizontal axis, but rather to do it at a regular increment as set by the class midpoints. On this graph our class midpoints line up with the grid, so I'll go one unit, or one line, on the grid to the left and place a point with a height of 0. I'll do the same to the right. I'll also label these two points by using the width found between any two consecutive midpoints. Between 4.5 and 14.5 is a width of 10. So to label this first point I need to go 10 below 4.5, to -5.5. To the right I need to go 10 above 94.5, so this last point corresponds to an x value of 104.5. The last thing we'll do is connect these two endpoints to the rest of the polygon. Now we have our frequency polygon, with our horizontal axis corresponding to our x values and our vertical axis corresponding to class frequencies.
Unfortunately, there are many ways to be deceptive with our graphs. One very common way to deceive the reader is to have a non-zero axis. This is where an axis starts at a value other than 0, which can greatly exaggerate the differences between the values being graphed. Take a look at this graph below. The horizontal axis corresponds to years and the vertical axis corresponds to interest rates. Looking at this graph, it appears that interest rates increased rapidly from 2008 to 2012.
It also appears that from 2009 to 2010 interest rates doubled, because the bar corresponding to 2010 is twice as tall as the bar corresponding to 2009. Now look at the same data on a different graph. These are the exact same values. Looks a little different, doesn't it? On the first graph, notice that the vertical axis starts at 3.14 instead of 0. On the second graph it starts at zero. Now you can see that from 2008 to 2012 interest rates barely changed. By having a non-zero axis, the difference between two years is grossly exaggerated.
Another type of graph that can be very deceptive is a pictograph. Whenever we have drawings that are given in two or three dimensions, we can really distort one-dimensional data. Take a look at this pictograph that compares how much Halloween candy was collected by Shayna and Michael. Looking more closely, it appears that Shayna collected about 18 pieces of candy while Michael collected 36. So Michael did collect twice as much candy as Shayna, but this pictograph, where the candy corn looks three-dimensional, really exaggerates the difference between the two.
Another type of graph is a scatterplot. A scatterplot (also called a scatter diagram) is a plot of paired (x, y) quantitative data with a horizontal x-axis and a vertical y-axis. The horizontal axis is used for the first variable (the x) and the vertical axis is used for the second variable (the y). Let's create a quick scatterplot using the following data. Our first point needs to correspond to the (x, y) pair (1, 2). So beginning at (0, 0) we will move one unit to the right and two units up. The second point needs to correspond to the ordered pair (3, 5), so we will move three units to the right and five units up. Next we need a point that corresponds with the ordered pair (5, 9), so we move to the right 5 on the x-axis and up 9 in the y direction. Next we have the point (7, 10), then (9, 14), and last (11, 15).
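Going back to the two deceptive graphs for a moment, a quick calculation shows how much each one can exaggerate. The sketch below is illustrative only. The 2009 and 2010 interest rates are hypothetical values chosen so the bars match the description (the lesson gives only the 3.14 starting value of the axis), and the pictograph calculation assumes the larger drawing is scaled up by the same factor in every dimension. Both function names are made up for this example.

```python
def apparent_ratio(first, second, baseline=0.0):
    """Ratio of bar heights when the vertical axis starts at `baseline`."""
    return (second - baseline) / (first - baseline)


def pictograph_ratio(data_ratio, dimensions):
    """Visual size ratio when a drawing is scaled by `data_ratio` in every dimension."""
    return data_ratio ** dimensions


# Hypothetical interest rates (percent); only the 3.14 baseline comes from the lesson.
rate_2009, rate_2010 = 3.15, 3.16

truncated = apparent_ratio(rate_2009, rate_2010, baseline=3.14)  # bars drawn from 3.14
honest = apparent_ratio(rate_2009, rate_2010)                    # bars drawn from 0
print(f"axis from 3.14: second bar looks {truncated:.2f} times as tall")
print(f"axis from 0:    second bar looks {honest:.3f} times as tall")

# Candy counts from the pictograph: Shayna 18, Michael 36.
data_ratio = 36 / 18
print("true ratio:", data_ratio)
print("area ratio if drawn in 2D:", pictograph_ratio(data_ratio, 2))
print("volume ratio if drawn in 3D:", pictograph_ratio(data_ratio, 3))
```

With these made-up rates, starting the axis at 3.14 makes an increase of about 0.3% look like a doubling. Under the equal-scaling assumption, a drawing that is twice as large in every direction has 4 times the area and 8 times the volume, even though the data only doubled.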
Unlike a frequency polygon, with a scatterplot we do not connect the points with line segments.
With a time-series graph we are interested in data that is quantitative and has been collected at different points in time. Let's look at this table of values and construct a time-series graph. The first step is to label the horizontal axis. Our x-values in this case correspond to time given in days, so I'll label the horizontal axis accordingly. The corresponding y-values are the amount of rainfall given in millimeters, so I'll use that information to label the vertical axis. The second step is to determine an appropriate value for each unit on the vertical scale. Looking at the rainfall amounts, the highest value we have is 45, so let's label the vertical scale in increments of 5. Now that the days are labeled on the horizontal axis, we are ready to plot points on the grid that correspond to the data values in the table. On day 1 there were 45 millimeters of rainfall, so we'll plot a point above the 1 with a height of 45. For day 2, we need a point that corresponds to a height of 20. For day 3 we need a height of 40; day 4, a height of 38; day 5, 42; day 6, 15; day 7, 10; and finally day 8, 22. The final step is to connect these dots with line segments in the correct order. Notice that unlike a frequency polygon, we do not connect this back down to the horizontal axis.
Next, let's look at what's called a stemplot. With a stemplot we are again representing quantitative data, but we are going to separate each value into two parts: the stem and the leaf. For example, take a look at this data set. We can represent this data set in a stemplot by having the stem be the tens place and the leaf be the ones place for each data value. For example, the data value 13 is represented on our stemplot in the row corresponding to the stem 1 and leaf 3. Since there are two 13s in this data set, you see that there are two 3s on this row.
Similarly, the data value 9 is represented on the row corresponding to a stem 0 and leaf 9, because the tens place for the number 9 is a 0 and the ones place is a 9.
A variation of the stemplot is a back-to-back stemplot, or back-to-back stem-and-leaf plot. Let's create a back-to-back stemplot using these two sets of data: weight of dogs and weight of cats. We'll begin with our weights of dogs. The first data value in this row is 48, so in the row corresponding to a stem 4 (or tens place 4) I will list an 8. Next we have a 13, so I'll put a 3 in the row corresponding to the stem 1. Next we have a 15, so I'll put a 5 in that same row. The tens place is 1, the ones place is 5. For the 22, I'll go to the row that corresponds to a tens place of 2 and list a 2 for the ones place. We have another 48, so I need another 8 in the row corresponding to the 4. For 56 I'll put a 6 in the row with the 5. For 62 I'll put a 2 in the row with the 6. For 73 I need a 3 in the row corresponding to the 7. For 52 we need a 2 in the row with the 5. For 71 we need a 1 on the row of the 7, and for 66 we need a 6 on the row with the 6. To clean up this left side just a little, I'm going to write the leaves in each row in order from least to greatest. Now we can move on to the other side. For the cats, I'll repeat the same process, but this time I'll put all the leaves on the right side. Now I will clean this up just slightly by replacing this 3, 2, 3 with 2, 3, 3. Having it in increasing order is slightly easier to read.
The benefits of a stemplot are that we can see the shape of the distribution of the data; we get to retain all of the original data values rather than clumping them into classes; the sample data are sorted or arranged in order; and when we put two sets of data side by side we can easily compare them.
Now let's talk about bar graphs. This is the first graph that we've looked at for qualitative (or categorical) data. A bar graph uses bars of equal width to show frequencies of categorical or qualitative data.
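Before going further with bar graphs, the stemplot procedure described above (tens place as the stem, ones place as the leaf, leaves sorted within each row) can be sketched roughly as follows. The function name `stemplot` is hypothetical, the sketch assumes non-negative whole numbers, and showing stems that have no leaves is a choice made here rather than something stated in the lesson. The data are the dog weights read out in the example.

```python
from collections import defaultdict


def stemplot(values):
    """Build stemplot rows for non-negative whole numbers: stem = tens, leaf = ones."""
    rows = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)  # e.g. 48 -> stem 4, leaf 8
        rows[stem].append(leaf)
    lines = []
    # Choice made in this sketch: list every stem from smallest to largest,
    # including stems with no leaves, so gaps in the data stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in sorted(rows.get(stem, [])))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


# Dog weights from the back-to-back stemplot example.
dog_weights = [48, 13, 15, 22, 48, 56, 62, 73, 52, 71, 66]
dog_rows = stemplot(dog_weights)
for line in dog_rows:
    print(line)
```

Every original value can be read back from a row, which is the property the lesson highlights: the stem 4 row with leaves 8 and 8 stands for the two 48s.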
The bars may or may not be separated by small gaps. Let's try an example. In a survey, 1004 adults were asked to identify the most frustrating sound that they hear in a day. In response, 279 chose jackhammers, 388 chose car alarms, 128 chose barking dogs, and 209 chose crying babies. In order to construct this bar graph, we first need to label the vertical axis. The vertical axis corresponds to the frequency for each of these categories, so we need to make sure that our vertical scale goes high enough. Let's make it go up to 400. We need a bar above jackhammers with a height corresponding to 279. Since 388 chose car alarms, we need a bar above car alarms with a height corresponding to 388. And similarly for barking dogs and crying babies. Now we have a complete bar graph. Notice that the major difference between a bar graph and a histogram is that a bar graph shows qualitative data while a histogram shows quantitative data.
Let's end this lesson by looking at a pie chart. A pie chart is a very common graph that depicts categorical data as slices of a circle. The size of each slice is proportional to the frequency count for that category. Let's create a pie chart using the data that we just had in the last example. Here are the same values from that last example. There are two different ways to construct a pie chart. We could look at relative frequency in the form of percentages, and we can also turn that into the corresponding number of degrees of the circle. I'll show you both of those here. With the total frequency, or total number of people, being 1004, we can change each of these values to a relative frequency. We do this by dividing each frequency by the total. In order to find the corresponding degrees of the circle, consider that all the way around is 360 degrees, so we can take 360 degrees and multiply it by the relative frequency for each category. In other words, for jackhammers we can find the appropriate number of degrees by multiplying 0.2779 by 360 degrees. This gives us about 100 degrees.
For car alarms we can find the appropriate number of degrees by multiplying 0.3865 by 360 degrees. This gives us approximately 139 degrees. Doing the same for barking dogs and crying babies, we have approximately 45.9 degrees and approximately 75.0 degrees.
Now let's think about how we could use relative frequency or degrees to construct this pie chart. Let's start with the category that has the largest frequency, car alarms. We need a slice of the pie that corresponds to about 39% of the circle, or 139 degrees. Speaking in percentages, this would be 50% of the circle and this would be 25%, so we can place about 38.6% appropriately between 25% and 50%. In a similar way, we can think of half of a circle as 180 degrees and a quarter of a circle as 90 degrees, and we can approximate the location of 140 degrees between them. Halfway between 90 and 180 is 135 degrees, which is pretty close to where we want to be, so we can approximate 140 degrees with something like that. And we have the first slice of our pie.
Our next largest wedge needs to correspond to the 279 people who said that jackhammers were the most frustrating sound. This wedge needs to correspond to 27.8% of the circle, or approximately 100 degrees. In terms of degrees, consider that three quarters of the way around the circle would be 270 degrees. Our last wedge ended right around 140 degrees, and we need to go another 100 degrees around the circle to 240 degrees. That's 30 degrees less than 270, which I'll approximate to be right about here. Now we have the second slice. Continuing in this way, we can sketch the last two slices, which correspond to crying babies and barking dogs. Now we have our completed pie chart, the last graph for this lesson.
That concludes this lecture video on various tables and graphs, some of which were very enlightening while others were a little deceptive.
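As a recap of the pie chart arithmetic, here is a minimal sketch that turns the survey counts into relative frequencies and degrees. The counts come from the example; the helper name `slice_for` is made up for illustration.

```python
counts = {
    "jackhammers": 279,
    "car alarms": 388,
    "barking dogs": 128,
    "crying babies": 209,
}

total = sum(counts.values())  # total number of adults surveyed


def slice_for(count, total):
    """Return (relative frequency, degrees of the circle) for one category."""
    relative = count / total
    return relative, relative * 360


slices = {name: slice_for(count, total) for name, count in counts.items()}
for name, (relative, degrees) in slices.items():
    print(f"{name:14s} {relative:6.1%} {degrees:6.1f} degrees")
```

Because this sketch multiplies the unrounded relative frequency by 360, a result can differ in the last digit from one computed with a relative frequency that was first rounded to four decimal places. Either way, the degrees for all four slices add up to 360.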

