Foundations of Data and Graph Representation

Hi, I'm Rob. Welcome to Math Antics. In this video, we're going to learn about data and graphs.

Because those are such big topics, we'll only be able to cover a few key concepts in this video. But they should give you a good foundation to build on. Data, simply put, is information about the world. The type of data that we usually deal with in math is called quantitative data, because it involves quantities that we represent with numbers.

Quantitative data comes in two basic flavors. Chocolate and Vanilla? Uh, no. Continuous and Discrete.

Oh yuck, those sound disgusting! I would never order those flavors! Well, what I really mean is that there's two basic types of data.

Continuous data is data that can have any value in a range. For continuous data, there's an infinite number of other possible data values in between any two actual data values. For example, if you ask an ice cream shop how much ice cream they sold on a hot Saturday afternoon, they might tell you 14 or 15 kilograms.

But they could also tell you that it was any number in between, like 14.6 kilograms, or even 14.625 kilograms if they can measure that precisely. On the other hand, discrete data is data that can only have specific values. With discrete data, there aren't any other possible data values in between any two consecutive values.

For example, if you ask an ice cream shop how many flavors they sell, they're going to respond with a whole number, like 1, 2, 3, or even 31. They wouldn't give you an in-between number because they couldn't have a fractional amount, like 31.4159 flavors. Ah, but you could mix two flavors together to get a brand new one. I call this flavor chanilla.

Well, anyway, you get the idea. Continuous data typically comes from a process of measurement, and the values you get are only limited by the precision of the measurement device you're using. However, discrete data typically comes from the process of counting, which is often limited to specific values like whole numbers or integers. To see the difference, think about two types of data associated with a baseball player.

The number of home runs that a player hits each season is an example of discrete data. It can only be a whole number. But if you keep track of the average distance a player hit the ball, that would be an example of continuous data, because it could be any decimal value that you could measure.
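Here is a small Python sketch of that difference. The numbers and the names (`home_runs_per_season`, `avg_hit_distance_m`, `looks_discrete`) are made up for illustration and are not from the video. The whole-number check is only a rough stand-in for "was this counted?", since a measurement can land on a whole number too, like exactly 14 kilograms of ice cream.

```python
# Hypothetical values, chosen only to illustrate the two kinds of data.
home_runs_per_season = [12, 30, 25]          # discrete: comes from counting
avg_hit_distance_m = [68.4, 71.25, 70.903]   # continuous: comes from measuring


def looks_discrete(values):
    """Return True if every value is a whole number (a rough hint that it was counted)."""
    return all(float(v).is_integer() for v in values)


print(looks_discrete(home_runs_per_season))  # True
print(looks_discrete(avg_hit_distance_m))    # False
```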

Now that you know a little bit about quantitative data, let's talk about ways data can be organized or formatted. Naturally, the more data that's collected, the more important it is to organize it so it doesn't become a chaotic jumble of numbers. Organizing data is typically done with a data table.

Most data tables are made from the intersection of vertical columns and horizontal rows. Each box that's formed by these intersections is called a cell, and it can hold a number or other data in it. The columns and rows have labels to help you interpret the data in the cells.

For example, this data table shows the results of a class survey about students' favorite foods. It has five columns that are each labeled with a choice from the survey, like pizza or tacos, and it has a single row that's labeled Number of Students.

These labels make it easy to tell what the numbers in the cells mean. For instance, this 9 is at the intersection of the column for hamburgers and the row for number of students, so we know that 9 students answered hamburgers in the survey. This table happens to be in a horizontal format because there's more columns than rows, but there's no reason you couldn't switch the columns and rows to get a vertical format instead.
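If you wanted to try this on a computer, one possible way to store a table like this in Python is a dictionary of rows, where each row is a dictionary of cells. This is just a sketch: only the 9 for hamburgers comes from the video, the other counts are made up, only three of the five foods are shown, and `transpose` is a helper name invented here.

```python
# Horizontal format: one row ("Number of Students") and one column per food.
# Only the 9 for hamburgers is from the video; the other counts are made up.
horizontal = {
    "Number of Students": {"pizza": 12, "tacos": 7, "hamburgers": 9},
}


def transpose(table):
    """Swap the rows and columns of a nested-dictionary table."""
    flipped = {}
    for row_label, cells in table.items():
        for col_label, value in cells.items():
            flipped.setdefault(col_label, {})[row_label] = value
    return flipped


# Vertical format: one row per food and a single column.
vertical = transpose(horizontal)

# Finding a cell means naming its row and its column.
print(horizontal["Number of Students"]["hamburgers"])  # 9
print(vertical["hamburgers"]["Number of Students"])    # 9
```

Either layout gives back the same number, because switching rows and columns changes the arrangement and not the data.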

These two tables contain the exact same set of data, they're just laid out differently. Of course, the more data a table holds, the more complicated it gets. For example, this data table organizes a lot of quantitative data about the climate in Yellowstone National Park.

It's got a column for each month of the year and several rows corresponding to weather measurements, like average high and low temperature, and average precipitation and snowfall. A data table like this can be very useful because it can help answer questions like, which month is typically the warmest in Yellowstone? If you look at all the numbers in the rows about temperature, you'll see that the highest numbers are in the column labeled July.

But there's something that would be even more useful than a data table for answering questions like that. And that something is called a graph. Graphs are visual representations of data. Instead of using rows and columns of numbers, graphs use a variety of visual elements like points, lines, rectangles, or other graphics to make a picture of the data. To see how helpful graphs can be, let's focus on the row of data that records average precipitation and see a graph of that data only.

This particular type of graph is called a bar graph or a bar chart because it uses rectangular bars to represent numeric values. Just like the data table, the graph has labels on two of its sides. These sides are called the axes of the graph.

The horizontal axis on the bottom lists the months of the year, while the vertical axis on the left side has a sequence of numbers to show the precipitation in inches. As you might expect, the heights of the bars in a bar graph correspond exactly with the numbers in the table. You can see that by looking at the scale on the vertical axis.

The data table tells us that the average rainfall for the month of May is 2 inches, and the height of the bar for that month on the graph exactly matches the 2 inch mark on the vertical axis. You can almost imagine that a rain cloud came along and filled up the bar to that level. Notice how easily the graph lets us see how the rainfall changes throughout the year. We can quickly find the highest and lowest months because our brains are able to compare the size of the bars much faster than they can compare the numbers in the table. That's why bar graphs are such useful tools for representing quantitative data.
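To get a feel for how a bar's size follows the number behind it, here is a rough text-only sketch in Python. It draws the bars sideways with `#` characters instead of upright rectangles. Only May's 2 inches comes from the video; the other months' values are placeholders, and `text_bar_graph` is a name invented for this example.

```python
# Only May's 2.0 inches is from the video; the other values are placeholders.
rainfall_in = {"Mar": 1.0, "Apr": 1.5, "May": 2.0, "Jun": 1.5}


def text_bar_graph(data, units_per_block=0.5):
    """Return one line per label, with a bar whose length is proportional to the value."""
    lines = []
    for label, value in data.items():
        blocks = round(value / units_per_block)  # one '#' for every 0.5 inches
        lines.append(f"{label} | {'#' * blocks} {value}")
    return lines


for line in text_bar_graph(rainfall_in):
    print(line)
# Mar | ## 1.0
# Apr | ### 1.5
# May | #### 2.0
# Jun | ### 1.5
```

Even in this crude form, the longest bar jumps out faster than the largest number does.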

Here's another bar graph that's made from the row of data showing average snowfall. Can you tell which month typically gets the most snow? Yep, it's January. And by looking at the scale on the left side of the graph, you can see that bar represents 14.5 inches of snow. Notice that these two different bar graphs have different number sequences, or scales, on their vertical axes.

The rainfall graph starts at a minimum value of 0 inches and goes up to a maximum value of 2.5 inches, while the snowfall graph starts at 0 and goes up to a max of 16. Also notice that each graph uses a different size subdivision on the vertical axis. The rainfall graph has a division every 0.5 inches, while the snowfall graph has a division every 2 inches. The size of these subdivisions is called the interval of that axis.
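The numbers along each vertical axis can be worked out from just three values: the minimum, the maximum, and the interval. Here is a minimal sketch of that in Python, using the two sets of values mentioned above. The function name `axis_ticks` is made up for this example.

```python
def axis_ticks(minimum, maximum, interval):
    """List the axis values from minimum to maximum, one every `interval`."""
    steps = round((maximum - minimum) / interval)
    # Rounding tidies up tiny floating-point errors in decimal intervals.
    return [round(minimum + i * interval, 10) for i in range(steps + 1)]


print(axis_ticks(0, 2.5, 0.5))  # [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
print(axis_ticks(0, 16, 2))     # [0, 2, 4, 6, 8, 10, 12, 14, 16]
```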

The combination of minimum value, maximum value, and interval forms the scale of an axis. When making a graph, it's very important to choose a scale that does a good job of displaying the data in a clear and understandable way. For example, if we used the same scale for the rainfall data as we did for the snowfall data, our graph would look like this. That's not good! Now you can hardly tell the difference between the heights of the bars because they're all squished down below the 2-inch mark.

Likewise, if we use the scale of the rainfall graph for the snowfall data, a lot of the values would be off the chart. So keep that in mind if you ever need to draw a graph yourself. You'll want to pay close attention to the range of values in the data set so that you can choose a minimum, maximum, and interval that are a good match for the data you need to graph. For example, what if you have a data table that contains negative values? Like this one that tracks how much money an investor made during each month of a year.

The highest monthly gain was $800. But there are some months with negative numbers, like negative $400, which means the investor lost money. Fortunately, bar graphs can be adapted to handle negative values too.

All you need to do is extend the vertical axis scale using negative values that go below the horizontal axis. That way you can display negative values as bars that go down instead of up. Pretty cool, huh? Of course, bar graphs aren't the only way to turn data into a graph. There are actually many different kinds of graphs, some of which are better than others for certain types of data.
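Going back to the idea of extending the scale below zero, here is one possible way to pick a minimum and maximum that fit the data, sketched in Python. It is only one reasonable rule and not the only one: it rounds outward to the nearest multiple of the interval and always keeps zero on the axis. The $800 and negative $400 come from the video; the other amounts, the interval of 200, and the name `choose_scale` are assumptions made for this example.

```python
import math


def choose_scale(values, interval):
    """Pick an axis minimum and maximum: multiples of `interval` that enclose the data and zero."""
    low = min(min(values), 0)    # keep zero on the axis even if all values are positive
    high = max(max(values), 0)   # keep zero on the axis even if all values are negative
    minimum = math.floor(low / interval) * interval
    maximum = math.ceil(high / interval) * interval
    return minimum, maximum


# 800 and -400 are from the video; the other monthly amounts are placeholders.
monthly_gain = [300, 800, -400, 150, -250, 600]
print(choose_scale(monthly_gain, 200))  # (-400, 800)
```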

Unfortunately, we don't have time to cover all these different graphs in this video. Instead, we'll just focus on one more very common option called a line graph. Line graphs are great for many types of data, especially data about how things change over time. The Yellowstone data table we saw previously also had data showing how the average high and low temperatures change throughout a typical year. Here's a line graph of just the high temperature data.

If we compare it to the bar graph we made for the precipitation data, you'll notice that the bar graph only had horizontal interval lines for tracking values on the vertical axis, while the line graph also has vertical interval lines for tracking values on the horizontal axis. These two sets of lines form a grid that makes it easier to locate individual data points on the graph. For instance, notice that for the month of June, there's a dot exactly at the intersection of the vertical line for June and the horizontal line for 70 degrees. That matches perfectly with the data in the table, which shows an average temperature of 70 degrees in June. And if you count them all, you'll see that there's 12 dots total, one for each value in that row of the data table.

After all these dots were plotted on the graph, they were connected with line segments between each adjacent pair. Why was this done? To help us identify any trends in the data. A trend is simply a pattern in the data, like increasing, decreasing, or staying basically the same.
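What your eye does with those line segments can be roughly sketched in Python: look at each pair of neighboring points and check whether the value went up, went down, or stayed the same. The temperatures below are placeholders (only the 70 for June matches the video's table), and `segment_trends` is a name invented here.

```python
def segment_trends(values, tolerance=0):
    """Label each step between adjacent data points as increasing, decreasing, or flat."""
    labels = []
    for previous, current in zip(values, values[1:]):
        change = current - previous
        if change > tolerance:
            labels.append("increasing")
        elif change < -tolerance:
            labels.append("decreasing")
        else:
            labels.append("flat")
    return labels


# Placeholder average highs for May through September; only June's 70 is from the video.
highs = [60, 70, 80, 78, 67]
print(segment_trends(highs))
# ['increasing', 'increasing', 'decreasing', 'decreasing']
```

Setting `tolerance` above zero treats very small changes as "staying basically the same".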

Identifying these sorts of trends in data can be helpful in a couple of different ways. First, trends can help you understand data on a higher or more general level. For example, the trend in this Yellowstone data is that the temperature increases from January to July and then decreases from August to December. And second, trends can sometimes help you make predictions about what might happen in the future.

For example, if you have a line graph showing how the population of a bison herd was changing over time, it might help you predict how many bison you'd expect to see the next year. So line graphs are great for displaying data in a way that lets you quickly identify trends. Another strength of line graphs is that they make it fairly easy to plot and compare multiple sets of data on them using different lines.

Like, we could easily add the data for the average low temperature from our table using another line like so. Multi-line graphs like this will usually use different colors or line styles to help you tell which line is which. But like all types of graphs, line graphs have some limitations too.

For example, do you remember our first set of data about students' favorite foods? Well, here's a bar graph and a line graph of that data side by side. Can you think of a reason why the line graph wouldn't be a good way to represent this data? Yep, it's because the options on the horizontal axis aren't related by any natural sequence.

It doesn't really matter which order the different foods are listed in. And because the order isn't important, any trend you might see by using a line graph doesn't have any real meaning. In fact, it could actually be misleading to use a line graph for data like this, because it also implies that there could be choices in between the survey options like taco pizza or pizza hamburgers. So in cases like this, it's better to just stick with a bar graph so people don't get confused. That's why line graphs are often used for data that occurs over time.

Time forms a natural sequence where trend lines make sense, and there are always possible values in between whatever time interval is used on the graph. Alright, so now you have a basic idea of how data and graphs work. You've learned about continuous and discrete data, and how it's organized using data tables.

You've also learned how data can be displayed on graphs to make it even easier to understand. As I said, there are many different types of graphs, but for now, just make sure that you understand how to interpret bar graphs and line graphs. The best way to do that is to look at a lot of different types of those graphs and answer any available practice problems on your own. As always, thanks for watching Math Antics and I'll see you next time!

Oh, so beautiful! Oh man! I should have checked the climate data before I got here.

Oh hey, buffaloes! Oh, what a cute little buffalo c- Coming right at me!

Learn more at www.mathantics.com
