In this lesson we're going to look at
some different types of tables and graphs. Some of these will tell us very meaningful
information about the data, while others will tend to be more misleading. We'll begin
with what's called a frequency polygon. A frequency polygon uses line segments connected
to points that are located directly above class midpoint values. Unlike our histograms where
we created bars with the heights for each class, with a frequency polygon we simply plot a point
for each class. It's important that we use class midpoints and that the height corresponds to
frequency. A variation of this is the relative frequency polygon, where we change the vertical
axis from frequency to relative frequency (proportions or percentages). Here
are some important features of frequency polygons. Heights correspond to class frequencies. Line
segments are extended to begin and end on the horizontal axis. That means that after we've
plotted a point corresponding to each class midpoint, we must connect our polygon down to the
horizontal axis using the appropriate width from the first and last midpoint. Also, if we want
to put two or more polygons on the same graph, this can really help us compare two sets of data.
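As a rough sketch of these features, the Python below builds the list of vertices for a frequency polygon: one point above each class midpoint, plus a closing point on the horizontal axis one class width beyond each end. The function names `polygon_vertices` and `relative_frequencies` are made up for illustration, equal class widths are assumed, and the sample midpoints and frequencies are the ones used in the example that follows.

```python
def polygon_vertices(midpoints, frequencies):
    """Return the (x, y) vertices of a frequency polygon, closed on the axis."""
    if len(midpoints) != len(frequencies) or len(midpoints) < 2:
        raise ValueError("need matching lists with at least two classes")
    # Assumption: classes have equal width, so any two consecutive
    # midpoints give the class width.
    width = midpoints[1] - midpoints[0]
    # One point directly above each class midpoint, at the class frequency.
    points = list(zip(midpoints, frequencies))
    # Extend the polygon to the horizontal axis one class width past each end.
    start = (midpoints[0] - width, 0)
    end = (midpoints[-1] + width, 0)
    return [start] + points + [end]


def relative_frequencies(frequencies):
    """Convert frequencies to proportions for a relative frequency polygon."""
    total = sum(frequencies)
    return [f / total for f in frequencies]


# Midpoints 4.5, 14.5, ..., 94.5 and the class frequencies from the example.
midpoints = [4.5 + 10 * i for i in range(10)]
frequencies = [1, 1, 2, 0, 0, 4, 2, 4, 5, 4]

vertices = polygon_vertices(midpoints, frequencies)
print(vertices[0], vertices[-1])
```

Swapping `frequencies` for `relative_frequencies(frequencies)` changes only the heights, which is all that separates a relative frequency polygon from an ordinary one.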
Let's try out an example. Given this frequency distribution we're going to construct a frequency
polygon. We first need the class midpoints and we will label these class midpoints on the
horizontal axis. Now let's label the vertical axis. Since our frequencies simply go from 0 to
5 let's let each one of these lines represent one unit. Now we're ready to plot points for each
class. The first class has a midpoint of 4.5 and a frequency of 1. So we'll first plot the point
above 4.5 with a height of 1. The second class also has a frequency of 1 so above the midpoint
14.5 we will plot a point corresponding to a height of 1. Our third class has a frequency of
2 so above 24.5 we will plot a point with height 2. Our next two classes each have a frequency of 0
so above 34.5 and 44.5 we need to plot points that correspond to a height of zero, which is on the
horizontal axis. The next class has a frequency of 4. Next we have a frequency of 2, following that
we have a frequency of 4, then 5 and lastly 4. Now that we have our points located directly above our
class midpoint values, we can connect these points with line segments. Working from left to right we
connect consecutive points with straight lines. The last thing we need to do is connect the
endpoints back down to the horizontal axis. It's important not to simply draw
a line straight down to the horizontal axis, but rather to meet the axis one class width beyond
the first and last midpoints. On this graph our class midpoints line up with the grid and so I'll go
one unit, or one line, on the grid to the left and place a point with a height of 0. I'll do
the same to the right. I'll also label these two points by using the width found between any
two consecutive midpoints. Between 4.5 and 14.5 is a width of 10. So to label this first point
I need to go 10 below 4.5 to -5.5. To the right I need to go 10 above 94.5, so this last point
corresponds to an x value of 104.5. The last thing we'll do is connect these two endpoints. Now
we have our frequency polygon with our horizontal axis corresponding to our x values and our
vertical axis corresponding to class frequencies.

Unfortunately there are many ways to be deceptive
with our graphs. One very common way to deceive the reader is to use a non-zero axis. This is
where the vertical axis starts at a value other than 0, which can greatly exaggerate the difference
between the values being compared. Take a look at this graph below. The horizontal axis corresponds to years
and the vertical axis corresponds to interest rates. Looking at this graph it appears that
interest rates increased rapidly from 2008 to 2012. It also appears that from 2009 to
2010 interest rates doubled because the bar corresponding to 2010 is twice as tall as the bar
corresponding to 2009. Now look at the same data on a different graph. These are the exact same
values. Looks a little different, doesn't it? On the first graph notice that the vertical axis
starts at 3.14 instead of 0. On the second graph it starts at zero. Now you can see that from 2008
to 2012 interest rates barely changed. By having a non-zero axis, the difference between two years
is grossly exaggerated.

Another type of graph that can be very deceptive is a pictograph. Whenever
we have drawings that are given in two or three dimensions, we can really distort one-dimensional
data. Take a look at this pictograph that compares how much Halloween candy was collected by
Shayna and Michael. Looking more closely, it appears that Shayna collected about 18
pieces of candy while Michael collected 36. So Michael did collect twice as much candy as
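Shayna, but this pictograph, where the candy corn looks three-dimensional, really
exaggerates the difference between the two. When a drawing is made twice as tall and twice as wide,
it covers four times the area, and if it looks three-dimensional it suggests eight times the volume.

Another type of graph is a scatterplot. A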
scatterplot (also called a scatter diagram) is a plot of paired (x, y) quantitative data
with a horizontal x-axis and a vertical y-axis. The horizontal axis is used for the first
variable (the x) and the vertical axis is used for the second variable (the y). Let's create
a quick scatterplot using the following data. Our first point needs to correspond to the (x, y)
pair (1, 2). So beginning at (0, 0) we will move one unit to the right and two units up. The second
point needs to correspond to the ordered pair (3, 5). So we will move three units to the right
and five units up. Next we need a point that corresponds with the ordered pair (5, 9), so
we move to the right 5 on the x-axis and up 9 in the y direction. Next we have the point (7,
10), then (9, 14) and last (11, 15). Unlike a frequency polygon, with a scatterplot we do
not connect the points with line segments.

With a time-series graph we are interested in
data that is quantitative and has been collected at different points in time. Let's look at this
table of values and construct a time-series graph. The first step is to label the horizontal axis.
Our x-values in this case correspond to time given in days, so I'll label the horizontal axis
accordingly. The corresponding y-values are the amount of rainfall given in millimeters, so I'll
use that information to label the vertical axis. The second step is to determine an appropriate
value for each unit on the vertical scale. Looking at the rainfall amounts,
the highest value we have is 45 so let's label each unit on
the vertical scale by 5s. After labeling the days on the horizontal axis, we
are ready to plot points on the grid that correspond to the data values in the table. On
day 1 there were 45 millimeters of rainfall, so we'll plot a point above the 1 with a
height of 45. For day 2, we need a point that corresponds to a height of 20. For day 3 we
need a height of 40, day 4 a height of 38, day 5, 42, day 6, 15, day 7, 10, and finally day 8, 22.
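As a rough sketch of the steps so far, the Python below picks the tick marks for the vertical scale and pairs each day with its rainfall. The helper name `axis_ticks` and the variable names are made up for illustration. The step of 5 matches the choice made above.

```python
import math

# Rainfall in millimeters for days 1 through 8, as read from the table.
days = [1, 2, 3, 4, 5, 6, 7, 8]
rainfall_mm = [45, 20, 40, 38, 42, 15, 10, 22]


def axis_ticks(values, step):
    """Tick marks from 0 up to the first multiple of step that covers the max."""
    top = math.ceil(max(values) / step) * step
    return list(range(0, top + step, step))


ticks = axis_ticks(rainfall_mm, step=5)

# Keep the (day, rainfall) points in time order.
points = sorted(zip(days, rainfall_mm))

# Each segment joins two consecutive points. Nothing is tied back to the
# horizontal axis.
segments = list(zip(points, points[1:]))

print(ticks)
print(points)
```

Eight points give seven segments, with the first segment running from day 1 to day 2.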
The final step is to connect these dots with line segments in the correct order. Notice that unlike
a frequency polygon, we do not connect this back down to the horizontal axis.

Next, let's look
at what's called a stemplot. With a stemplot we are again representing quantitative data, but we
are going to separate each value into two parts: the stem and the leaf. For example, take a look at
this data set. We can represent this data set in a stemplot by having the stem be the tens place and
the leaf be the ones place for each data value. For example the data value 13 is represented on
our stemplot in the row corresponding to the stem 1 and leaf 3. Since there are two 13s in
this data set you see that there are two 3s on this row. Similarly, the data value 9 is
represented on the row corresponding to a stem 0 and leaf 9 because the tens place for the
number 9 is a 0 and the ones place is a 9. A variation of the stemplot is a back-to-back
stemplot or back-to-back stem and leaf plot. Let's create a back-to-back stemplot using these
two sets of data: weight of dogs and weight of cats. We'll begin with our weights of dogs.
The first data value in this row is 48 so in the row corresponding to a stem
4 (or tens place 4) I will list an 8. Next, we have a 13, so I'll put a 3 in the row
corresponding to the stem 1. Next we have a 15, so I'll put a 5 in that same row. The tens place
is 1, the ones place is 5. For the 22, I'll go to the row that corresponds to a tens place of 2 and
list a 2 for the ones place. We have another 48 so I need another 8 in the row corresponding
to the 4. For 56 I'll put a 6 in the row with the 5. For 62 I'll put a 2 in the row with the 6.
For 73 I need a 3 in the row corresponding to the 7. For 52 we need a 2 in the row with the 5.
For 71 we need a 1 in the row with the 7, and for 66 we need a 6 in the row with the 6.
To clean up this left side just a little, I'm going to write these values from least to
greatest. Now we can move on to the other side. For the cats, I'll repeat the same process but
this time I'll put all leaves on the right side. Now I will clean this up just slightly by
replacing this 3, 2, 3 with 2, 3, 3. Having it in increasing order is slightly easier to read.
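The stem-and-leaf split can be sketched in a few lines of Python. This is only an illustration: the function name `stemplot` is made up, it assumes non-negative whole numbers with the tens place as the stem, and it shows a one-sided plot of the dog weights read out above, since the cat weights are not listed here.

```python
from collections import defaultdict


def stemplot(values):
    """Map each tens-place stem to its sorted list of ones-place leaves."""
    rows = defaultdict(list)
    for value in values:
        # divmod splits 48 into stem 4 and leaf 8, and 9 into stem 0 and leaf 9.
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    # Keep every stem between the smallest and largest, even if it has no
    # leaves, and list the leaves in each row from least to greatest.
    return {stem: sorted(rows[stem]) for stem in range(min(rows), max(rows) + 1)}


dog_weights = [48, 13, 15, 22, 48, 56, 62, 73, 52, 71, 66]
plot = stemplot(dog_weights)

for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

The row for stem 3 prints with no leaves because none of the dog weights is in the 30s. Keeping that empty row preserves the shape of the distribution.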
The benefits of a stemplot are that we can see the shape of the distribution of the data; we
get to retain all of the original data values rather than clumping them into classes; the
sample data are sorted or arranged in order; and when we put two sets of data side-by-side
we can easily compare them.

Now let's talk about bar graphs. This is the first graph we've
looked at for qualitative (or categorical) data. A bar graph uses bars of equal width to show
frequencies of categorical or qualitative data. The bars may or may not be separated by
small gaps. Let's try an example. In a survey, 1004 adults were asked to identify the
most frustrating sound that they hear in a day. In response, 279 chose jackhammers, 388 chose
car alarms, 128 chose barking dogs, and 209 chose crying babies. In order to construct
this bar graph we need to first label the vertical axis. The vertical axis corresponds
to the frequency for each of these categories so we need to make sure that our vertical scale
goes high enough. Let's make it go up to 400. We need a bar above jackhammers
with a height corresponding to 279. Since 388 chose car alarms, we need a bar
above car alarms with an appropriate height. And similarly for barking dogs and crying babies.
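As a rough text-only sketch of this bar graph, the Python below picks the top of the vertical scale and draws one equal-width bar per category. The names `axis_top` and `text_bars`, the step of 100, and the choice of one `#` per 20 responses are all assumptions made for illustration.

```python
import math

# Survey responses: most frustrating sound, by category.
responses = {
    "jackhammers": 279,
    "car alarms": 388,
    "barking dogs": 128,
    "crying babies": 209,
}


def axis_top(frequencies, step=100):
    """Round the largest frequency up to the next multiple of step."""
    return math.ceil(max(frequencies) / step) * step


def text_bars(data, per_symbol=20):
    """Draw each category as a row of '#' whose length tracks its frequency."""
    lines = []
    for category, count in data.items():
        bar = "#" * round(count / per_symbol)
        lines.append(f"{category:<14}{bar} {count}")
    return lines


top = axis_top(responses.values())
print("vertical scale runs from 0 to", top)
for line in text_bars(responses):
    print(line)
```

Every bar is one row thick, so only the length changes from category to category, which is the equal-width requirement in text form.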
Now we have a complete bar graph. Notice that the major difference between a bar graph and a histogram is
the type of data: a bar graph shows qualitative data, while a histogram shows quantitative data.

Let's end this lesson by looking at a pie chart. A pie chart
is a very common graph that depicts categorical data as slices of a circle. The size of each
slice is proportional to the frequency count for that category. Let's create a pie chart using
the data that we just had in the last example. Here are the same values from that last example.
There are two different ways to construct a pie chart. We could look at relative frequency in the
form of percentages and we can also turn that into an appropriate amount of degrees of the
circle. I'll show you both of those here. With the total frequency or total number
of people being 1004, we can change each of these values to a relative frequency. We do
this by dividing each frequency by the total. In order to find the corresponding degrees of the
circle, consider that all the way around is 360 degrees, so we can take 360 degrees and multiply
it by the relative frequency for each category. In other words, for jackhammers we can
find the appropriate amount of degrees by multiplying 0.2779 by 360 degrees.
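The same two conversions can be sketched for every category at once. The function name `pie_slices` is made up for illustration, and the sketch keeps the relative frequency unrounded before multiplying by 360.

```python
# Survey responses: most frustrating sound, by category.
responses = {
    "jackhammers": 279,
    "car alarms": 388,
    "barking dogs": 128,
    "crying babies": 209,
}


def pie_slices(data):
    """Return {category: (relative frequency, degrees of the circle)}."""
    total = sum(data.values())
    slices = {}
    for category, count in data.items():
        relative = count / total           # frequency divided by the total
        slices[category] = (relative, relative * 360)  # share of 360 degrees
    return slices


slices = pie_slices(responses)
for category, (relative, degrees) in slices.items():
    print(f"{category}: {relative:.4f} of the circle, {degrees:.1f} degrees")
```

Because the relative frequency is not rounded before multiplying, the last digit can differ slightly from a hand calculation that rounds to four decimal places first. For jackhammers, the sketch multiplies a relative frequency of about 0.2779 by 360 degrees.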
This gives us about 100 degrees. For car alarms we can find the appropriate amount of
degrees by multiplying 0.3865 by 360 degrees. This gives us approximately 139 degrees. And
doing the same for barking dogs and crying babies, we have approximately 45.9 degrees
and approximately 75.0 degrees. Now let's think about how we could use relative
frequency or degrees to construct this pie chart. Let's start with the category that has the
largest frequency, car alarms. We need a slice of the pie that corresponds to about 39% of the
circle or 139 degrees. Speaking in percentages, half of the circle is 50% and a quarter of the circle
is 25%, so we can approximate the location of about 39% between 25% and 50%. In a similar way
we can think of half of a circle as 180 degrees, and a quarter of a circle as 90 degrees, and
we can approximate the location of 140 degrees between those two. Halfway between 90 and 180 is
135 degrees which is pretty close to where we want to be, so we can approximate
140 degrees with something like that. And we have the first slice of our pie. Our
next largest wedge needs to correspond to the 279 people that said that jackhammers
were the most frustrating sound. This wedge needs to correspond to 27.8% of the
circle or approximately 100 degrees. In terms of degrees, consider that three quarters
of the way around the circle would be 270 degrees. Our last wedge ended right around 140 degrees
and we need to go another 100 degrees around the circle to 240 degrees. That's 30 degrees less than
270 which I'll approximate to be right about here. Now, we have the second slice. Continuing in
this way we can sketch the last two slices that correspond to crying babies and barking dogs. Now we have
our completed pie chart, the last graph for this lesson. That concludes this lecture video
on various tables and graphs, some of which were very enlightening
while others were a little deceptive.