Hi there! Welcome to Introduction to Data Visualization and Design. So we are going to kick off this class with a brief history of data visualization. Now, that title might make you chuckle a little bit.
It makes me chuckle a little bit. It is incredibly ambitious to think that we can give any kind of a history of data visualization. We're still going to try!
We're going to talk about some of the leaders. We're going to talk about some of the moments. We're going to talk about some of the evolutions starting in, you know, 300 BC. And, you know, go every few hundred years.
But also, quick caveat, this is a hugely Eurocentric retelling. It would be a dissertation. It would be a life's work to collect this history from non-European sources. If anyone is super motivated and interested in doing that, I so encourage you to take that on. Right now, the history that we have, the resources that we have, they're very Eurocentric. So bear with me as we walk through this very classic canon.
So we are going to start with the very emergence of data, the very moment where data as an object, as a unit, came into being. And it was in 300 BC in Euclid's book. You might know Euclid from his proofs, but he also had a book called Data.
Somebody who speaks Greek probably knows how to say this better than I do, but Dedomena, which translates literally to that which is given, that which is assumed before the fact. The thing which we do not interrogate, we just take to be some sort of truth. Whether that's a real truth or whether that's a pretend truth is neither here nor there.
When we're talking about data, we're talking about that collection of facts. The establishment of the idea of data, this collection of observations, is this moment where we can start to say, oh, we can collect this, treat it as a unit, and visualize it. Before treating it as a unit, we couldn't visualize it, right? So this is this moment in the era of the Library of Alexandria, and what this comes out of is this moment of trying to understand how we know what we know, right? So we are saying, okay, I have to have some starting point, and Euclid was like, that starting point is the data.
Okay, so then fast forward about, I don't know, a little less than 2,000 years, and we come into this moment of measurement, okay? So measurement is the thing that allowed data to become the standardized thing, and I use standardized with giant air quotes because no measurement is standardized, right? Like, every measurement is going to be ever so slightly larger or smaller than the last one.
Okay. In statistics, because of that, we assume that everything kind of centers around this midpoint, and that's what we talk about as normal distributions. That is far beyond the scope of this class.
You don't have to worry about any of that. All that I encourage you to take away from this moment of measurement is the fact that measurement had to be defined, right? So in the early 1600s, early 17th century, all of these tables of measurement start appearing across Western Europe, primarily around Germany, right? And what they're doing is arranging observations into a table.
And that table structure was itself an invention, right? We still use that table structure today. It informs the entire way that we conceptualize a data set, this like unit.
I say data set to you and you might conjure up ideas of an Excel table. And that's exactly what we're talking about here. You apply a color scale to an Excel table.
We can call that a heat map, right? We can say, oh, the like big numbers are dark and the small numbers are light. That is as old as a data visualization gets, right?
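To make the "color scale on a table" idea concrete, here is a minimal sketch in Python. It is only an illustration, not how Excel or Tableau do it: the function names `shade` and `heat_map` and the numbers in the table are made up for this example, and the color scale is a plain linear gray ramp where big numbers come out dark and small numbers come out light.

```python
def shade(value, lo, hi):
    """Map a value to a gray level: 255 (light) at lo, 0 (dark) at hi."""
    if hi == lo:
        return 255  # every cell is equal, so there is nothing to contrast
    frac = (value - lo) / (hi - lo)  # 0.0 for the smallest value, 1.0 for the largest
    return round(255 * (1 - frac))


def heat_map(table):
    """Replace every number in a table (a list of rows) with its gray level."""
    flat = [v for row in table for v in row]
    lo, hi = min(flat), max(flat)
    return [[shade(v, lo, hi) for v in row] for row in table]


# Hypothetical observations arranged as a table: rows by columns.
table = [
    [1, 5, 9],
    [2, 6, 10],
]
shades = heat_map(table)
```

The table structure stays exactly the same. Only the cell contents change from numbers to shades, which is why a heat map is about as direct a visualization of a table as you can get.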
So that arrangement of a table is really this one. Okay. So Gabriel Mouton in France, 1670, he proposes this method of measuring length that's at the intersection of nature and science, right? So it's nature because he's basing it on the size of the earth.
And it's science because it's breaking it down into a hundred units. You can see where we're going with this. This gives us the meter. And like, you can look into like old Italian art and you see these like lengths inscribed into the cathedrals. And this is a moment where like religion, science, and politics are all coming together to create this concept of this unit of measurement.
And then Mouton comes in and he's like, oh, let's break this up into a hundred and base it on the circumference of the earth. Ultimately it's not his idea that lasts, but the ideas that last are inspired by his ideas.
Okay, so the 17th century also is all about, let's call it exploration. But it's really Western Europeans going outwards and getting in boats and being like, okay, what can I find? All right, and they had a problem. Their problem was longitude. So they knew latitude. They knew how far north or south they were from the equator. But they could never tell how far east or west they had traveled, right? Because there's no zero point to measure from. The Earth is a sphere, and it's continuously going around on the axis that is parallel to longitude, right? So, like, latitude, you'd say, okay, the middle point, the belt of the earth, that's your zero, and you measure from that. But if you have no zero point, you can't measure. So longitude, you're really stuck, right?
So okay, so there's no zero point. Fine, we establish a zero point. Like everybody who's living everywhere is like, I'm the zero point, measure from me. Okay, so we could work with the zero point, but you can't tell longitude from celestial bodies. So you can't use the sun or the moon or anything like that. You have to be able to tell time, right? So as with many, many, many hard problems, it took a long time to figure that out. But along the way, all kinds of incredible inventions come up. Right.
So the first one of those is this moment where Michael, probably Michel, Michael Florent van Langren makes this chart. Right. This chart.
It's fine. It's the difference in measurements from Toledo to Rome, right? And this measurement here, you know, it was really short.
It said it was like, I don't know, 18 units. This one said it's like 19 and a half units. These ones say it's 30 units, right? It does not matter what the units are. Okay.
What matters is that this visualization is quantifying uncertainty. And Van Langren was making the argument that you couldn't really tell the distance between Toledo and Rome. Okay, so the king of Spain held this competition. He wanted to know how far apart these two cities were, right, because he wanted somebody to solve the longitude problem. So Van Langren enters the competition. Like, nobody really wins. The king of Spain takes him on as, uh, whatever they call it. He pays for him. I forget what they're called. Okay. So he takes him on as part of his court. He's paying for him to work on this really hard problem.
So one of the things that Van Langren does is he brings this visualization to the king of Spain. And he's like, look, this is a hard problem. You should take me on. Right.
So this visualization doesn't actually tell you how far apart these cities are. But it tells you the fact that like nobody knows how far apart they are. That had to be absolutely wild when he made this visualization because it's like, right, he's not showing anything real.
He's just showing the error in measurement. Cool. Okay.
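Here is a rough sketch of what "showing the error in measurement" can look like, written in Python. Everything in it is an assumption for illustration: the three numbers are just the loose "18, 19 and a half, 30 units" figures mentioned above, not Van Langren's actual data, and the helper names `spread` and `dot_strip` are invented. The idea is only that you place every competing estimate on one shared line, so the eye sees the disagreement rather than a single answer.

```python
def spread(values):
    """Return the smallest estimate, the largest estimate, and the gap between them."""
    lo, hi = min(values), max(values)
    return lo, hi, hi - lo


def dot_strip(values, width=41):
    """Draw the estimates as '*' marks along one shared text axis."""
    lo, hi, rng = spread(values)
    cells = ["-"] * width
    for v in values:
        # Scale each estimate to a column between 0 and width - 1.
        pos = 0 if rng == 0 else round((v - lo) / rng * (width - 1))
        cells[pos] = "*"
    return "".join(cells)


# Placeholder estimates of one distance, in arbitrary units (illustrative only).
estimates = [18.0, 19.5, 30.0]
lo, hi, rng = spread(estimates)
strip = dot_strip(estimates)
print(strip)
print(f"estimates range from {lo} to {hi}, a gap of {rng} units")
```

Notice that nothing in the output says what the true distance is. The gap between the marks is the whole message.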
So then that's about all that happens in the 17th century. The 18th century is all about representation. So one of the first, like, visualizations that we think of, like, our suite of visualizations is a line chart, right? You use a line chart for literally everything. And the 18th century gave us line charts.
So Francis Hauksbee was not a well-respected scientist. He was this incredible instrument maker. He was a fellow in the Royal Society of London, but he was like the lowest level fellow. Right.
But he's doing these experiments, looking at different types of water between two plates of glass and how far up or how far down they go. And to represent this and to publish this finding in a journal, he made a literal line chart of how far up and how far down each experiment was, right? So this is that literal line chart.
12 years later, there's an abstract line chart and it's barometric observations. So it's this literal line chart that gives us what we think of as being a line chart today. Right. Like usually when we're looking at a line chart, we're looking at like, I don't know, death rates of COVID.
Right. Or we're looking at like gross national product. And we have on the x-axis years, and on the y-axis some value, and it goes up and it goes down over this time. That comes about in the 18th century. So we're working on this representation, okay?
The other thing happening in the 18th century was all about color, right? So there were three different color systems that emerged at this time. And the fact that we get these color scales, basically trying to order how the colors are supposed to be related to each other, this is what allows us color scales like we think of today. Choropleth maps, which are like, it's really dark or it's really light.
It's this moment of being like, okay, I want to order the colors that allows us to have all the coloring of every visualization. Color to mean anything that is scalar, right? Okay.
The other thing happening in the 18th century is this first, like, timeline. And it's Barbeu-Dubourg. He makes this Carte chronographique.
And this is important because what he's doing is visualizing multiple events at the same time. So you can have multiple things layered on each other on the y-axis. Never before had anyone thought to stack these things on the y-axis.
We're going to see that come back into play in just a few centuries. Okay. So end of the 18th century, what we start to see is this visualization and storytelling. Right. So we have all kinds of visualizations come up.
We have proportional squares and its counterpart, layered squares. So this is the use of area to represent different volumes, which you're going to be using a ton of when you make tree maps in Tableau. We had fever maps of dots and circles, which is really the basis of demographic maps. There's choropleths, which are like light versus dark of literacy.
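One practical detail about using area to represent a quantity, going back to the proportional squares: if the area is supposed to carry the value, the side length has to scale with the square root of the value, not the value itself. The sketch below is a small Python illustration under that assumption. The function name `square_side` and the `max_side` parameter are hypothetical, and it says nothing about how Tableau lays out a tree map internally.

```python
import math


def square_side(value, max_value, max_side=100.0):
    """Side length of a square whose AREA is proportional to value.

    The largest value gets a square of side max_side; every other side is
    scaled by the square root of the value's share of the maximum.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be non-negative and max_value positive")
    return max_side * math.sqrt(value / max_value)


# A quantity that is a quarter of the maximum gets half the side length,
# which works out to a quarter of the area.
side = square_side(25, 100)
area_share = side ** 2 / 100.0 ** 2
```

If you scaled the side directly by the value instead, a quantity a quarter of the size would get one sixteenth of the area and look far smaller than it really is.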
Faraday, who was like the electricity guy, includes visualizations and diagrams in all of his publications, right? The use of these diagrams really takes us away from this basic line chart and gets us thinking about how to literally think visually. And Faraday was the leader of that.
There's polar area charts, which you're going to see come up again. And there's this visualization explosion, but there's two people that really stand out. There's Priestley and then there's Playfair, and they are on either side of what's often called the golden age of data visualization.
So let's talk about Priestley first. A little bit of a disclaimer. I've like idolized Priestley on so many levels. One of the many things he did in his career was A Specimen of a Chart of Biography, or the Chart of Biography.
So he shaped how we think about data and visualization by abstracting data from its measurement. So here he's using length of the line to represent how long someone was alive or theoretically alive for, right? And our friend Euclid is right here. He's using the dots to represent uncertainty.
That had to be crazy because, okay, so Van Langren did a really good job of visualizing uncertainty, but he didn't encode it in this way, using some other system to encode what he wasn't quite sure of, right? So it's crazy that when Priestley came in, not only did he represent time as length, but he also visually encodes uncertainty. Okay.
So this is 1765. A couple of years later, 1769, he makes A New Chart of History, right? We might come to this today and be like, it is so biased, right? Like it's this Western European, very hyper-Eurocentric view. And it looks biased today, but it was crazy revolutionary at the time.
Because what he's saying in this chart is that you have to understand interaction. And that the boundaries between places are artificial. So the way to read this is that region is on the y-axis.
So here's all these places like the Americas, Africa, China, lots of different places that I can't read with my eyes right now. And time is on the x-axis, right? And he's saying, read this span of time, but also read vertically how these things interact, which was just transformational.
So Priestley ushers in this golden age of data visualization. Playfair kicks it off. So William Playfair gave us the bar chart, the line chart, the area chart, the pie chart, and the circle graph.
He was like a visualization machine, right? So Playfair was, in his day, quite unknown. But he did a lot of these time series line charts.
He was really interested in change over time and social effects of change over time. So here are these line charts and area charts. And then here is breaking up a whole into discrete units. And this is the United States broken into region sizes. So Louisiana is obviously the vast majority of the U.S. It was the largest territory at the time. One of his most famous charts is this 1821 price of wheat chart.
It's the price of wheat. Here's time on the x-axis. And who the reigning monarch is, that's these breakdowns here. And he's just showing that quality of life is decreasing because the price of wheat is wildly increasing, disproportionate to weekly wages.
Okay, so we get Priestley, we get Playfair, and then it just takes off. So we're going to kind of cherry pick a couple of people who are super interesting. But the 19th century really is this moment of evolution into statistics and making a mathematical argument.
There's a huge focus on morality and mortality. And I argue that a lot of that focus actually comes down to industrialization and the clustering of many different types of people into cities. I'm not the first person to argue it. Lots of other people argue it. There are different takes on this.
So people get clustered into cities. Right. And they're interacting with each other in ways that they never really had to before.
And we see this formalization of social class, and it's kind of centered around literacy and work. So, for various other orthogonal reasons, literacy takes off at this time because the printing presses are taking off. The more print that there is, the greater transfer of ideas there's going to be, and the quicker evolution of those ideas, because there's all of this communication happening.
So throughout this period, we see all of these tiny inventions that have huge effects, and lots of people are inspired by this invention here, that invention there. This phenomenon is not at all isolated to data visualization, but occurs across fields and across domains. So let's look at a couple of examples of just moments of inspiration from this time. So we have 1854, John Snow's cholera maps.
So 1854, there's this huge cholera outbreak in London. John Snow comes in, he's a doctor, and he maps the instances of cholera using a bar chart. So he takes the street map and he's like, there's a case here, there's a case here, there's a case here. By stacking them up, he can see where the cases are localized. By placing that bar chart on top of the street map, he gives us the first epidemiological map, but he also identifies the source of cholera, finds the Broad Street well, takes off the handle, the cholera epidemic ends, right?
Incredible. So it's this moment where we're like, oh, if we look at people in relationship to place and use some sort of systematic way to visualize that, we can learn a lot about people and places. Sparks a whole like revolution.
Okay. 1858, there's Florence Nightingale. This is called a polar rose. It's called a Nightingale rose. Um, she invented this method of visualization.
There were already polar area charts, so it's like, uh, invented might be a strong term. Um, but she used this visualization to argue that the most important thing that we could do for our military personnel was to improve their sanitation. So at the end of the Crimean War, she's a nurse, and she's like, I am seeing these people die not of gunshot wounds, but of some infection that they get when they're in the hospital. And she's like heartbroken and frustrated and like all of these things. So she creates this visualization to show how many people died from wounds and how many people died from infections, right?
So what you see is this big drop off in deaths when they improve sanitation. So here the blue deaths are what she considers to be preventable, deaths of disease, mostly from poor sanitation, and then you see this big drop-off. Now, this has some critique.
She's not wrong, but the visualization as presented is not necessarily clear. But she still made her point, and we still got hand-washing in places like hospitals. Go Florence Nightingale, thank you.
Right? Okay. Let's move forward just a couple of years to Charles Minard.
So Charles Minard made this beautiful flow chart, right? So this is the first flow map. He has six variables in this plane.
We're going to look at it in more detail later in the semester. We're also going to look at other artists who've done a really great job of cramming a ton of variables, giving you a key, and letting you explore a complex story. But what we're looking at is the number of Napoleon's troops.
Napoleon's name is never mentioned on this entire chart, right? So we see a lot of troops, a lot of troops, very, very few troops, almost no troops.
And he's, I mean, making the point that he is, but he's also showing like location. He's showing latitude and longitude and how, where they traveled to. He's showing the distance traveled.
He's showing the temperature, right? Like down here. He's showing the number of people. He's showing the direction of travel.
Brown is going out. Blue is coming back, right? So he's cramming all of this information into here. And it's based on a line chart, but it's this huge deviation that also uses color and shape and volume.
Fast forward about 100 years to 1977. Last real moment of, like, visualization invention until we get to computers. And we have John Tukey. So Tukey is super important in statistics.
He also gave us the box plot, right? And this is a visual representation of a ton of statistical information, right? So it has the distribution of a variable, the median of a variable, the quartiles, and the outliers. The way you read a box plot is that this is 50% of the instances of a variable. This gets you to 95 or 99%, however it's set up. And then these dots are your outliers. They're the really extreme values, right?
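To see what actually goes into a box plot, here is a minimal Python sketch that computes the pieces rather than drawing them. It assumes the common convention, usually attributed to Tukey, that the whiskers reach to the most extreme points within 1.5 times the interquartile range of the box. That multiplier is the "however it's set up" part and tools let you change it. The function name `box_plot_summary` and the sample numbers are made up for illustration.

```python
import statistics


def box_plot_summary(data, whisker_factor=1.5):
    """Compute the numbers a box plot draws: box, median line, whiskers, outliers."""
    values = sorted(data)
    # The quartiles split the sorted data into four equal parts.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # the box spans q1 to q3, the middle 50% of the values
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),   # whiskers stop at the last point inside the fences
        "whisker_high": max(inside),
        "outliers": outliers,         # drawn as individual dots
    }


# Made-up sample with one extreme value.
summary = box_plot_summary([2, 4, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

For this sample the box runs from 4 to 8 with the median line at 6, the whiskers reach 2 and 9, and 30 is left over as a single outlier dot.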
And this was revolutionary and it's pretty much an invaluable tool in any data analyst's toolkit. We are now sitting in the rebirth of data visualization. So there are tons and tons and tons of artists. This does not even begin to scratch the surface of people who are doing this, creating incredible visualizations, telling incredible stories with computer-based tools. And even when it's not computer-based tools, when it's colored pencils, you can still see how it's informed by this entire ecosystem of data visualization artists and specialists. So Tableau is part of this outgrowth. Tableau is, you know, really a database with visualization tools on the front, but what it's done is it's given primacy to the dashboard.
It's made making visualizations, experimenting with your data, telling different stories, taking a data set and viewing it from all of these different ways incredibly accessible. And that is the strength of this class. Rather than learning D3, which gives you something that's more interactive, but you have to hard code from the beginning, through the course of this class we are going to focus on the data, on working with data responsibly, telling an accurate story based on the data that you have, and design principles around data visualization, how all of your design choices help to tell the story that you're trying to communicate. So with that, I look forward to the rest of the semester together and seeing you sometime soon. All right, take care.
Bye-bye.