Transcript for:
Exploring Football Statistics and Measurements

Imagine you're very, very interested in football. You know, that sport some of us like to call soccer. You are the person who wants to know all the details, like how many goals were scored by some player, how many games were won by a particular team, or how many penalties were stopped in a certain football competition. In this video, I will explain how improving your knowledge of statistics could make you a real expert on football, or any other kind of sport.
The number of goals scored, games won and penalties stopped are all pieces of information that can be thought of in terms of variables and cases. Variables are features of something or someone, and cases are that something or someone. Let me be a little bit more specific. Imagine you are interested in some characteristics of the football players belonging to your favorite team. For every single player, you want to know his or her body weight, hair color, age and the total number of goals scored during the most recent competition. All these player characteristics are variables. The players themselves are cases.
Another example. It could be that you are not so much interested in the features of individual players, but in the features of the teams these individuals play for. For instance, for every Spanish team you might want to know in which city it is based, what the main colors of the shirts are, and how many goals the team scored in the last year. These features are variables again. However, the cases here are not individual football players, but the teams these individuals play for. In a study, cases can thus be many different things. They can be individual football players and football teams. But they can also be, for instance, companies, schools, or even countries.
Every characteristic of a case can be called a variable, as long as it meets one essential criterion. It needs to vary. What does that mean? Let's go back to the example with teams as cases and look at the variable city where the team is based. You focus on every Spanish team, so there will be many different cities. One team comes from Barcelona, and other teams come from, for instance, Madrid, Valencia or Sevilla. We have, in other words, variation. Let's now focus on another characteristic, not the city but the country where the team is based. For every single team this will be Spain. The teams are, after all, Spanish teams. This means that there is no variation here. Not a single team will be from a country other than Spain. For this reason, we call this characteristic not a variable, but a constant.
You can probably imagine that we can have many, many different kinds of variables, representing widely differing characteristics. For this reason, and also for other reasons that I will discuss later, it is essential to distinguish different levels of measurement. The simplest level of measurement is the nominal level. A nominal variable is made up of various categories that differ from each other. There is no order, however. This means that it's not possible to argue that one category is better, or worse, or more, or less than another. An example is the nationality of the football players. The various categories, for instance Spanish, French, or Mexican, differ from each other, but there is no ranking order. Another example is the gender of the football players, or the city the football teams come from.
The second level of measurement is the ordinal level. There is not only a difference between the categories of a variable, there is also an order. An example is the order in a football competition. You know who is the winner, you know who came second, and third, and so on. However, by looking at the order, you don't know anything about the size of the differences between the categories. You don't know, for example, how much better number 1 was than number 2. Variables at both the nominal and the ordinal level are called categorical variables.
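To make the difference between a variable and a constant a bit more concrete, here is a minimal Python sketch. The team names and the helper functions `is_variable` and `classify` are made up for illustration; only the cities and the country come from the example above. Each dictionary is one case, and each key is a characteristic of that case.

```python
# Hypothetical data for illustration only: each dict is one case (a team),
# and each key is a characteristic recorded for that case.
teams = [
    {"name": "Team A", "city": "Barcelona", "country": "Spain"},
    {"name": "Team B", "city": "Madrid", "country": "Spain"},
    {"name": "Team C", "city": "Valencia", "country": "Spain"},
    {"name": "Team D", "city": "Sevilla", "country": "Spain"},
]


def is_variable(cases, characteristic):
    """Return True if the characteristic takes more than one value across the cases."""
    distinct_values = {case[characteristic] for case in cases}
    return len(distinct_values) > 1


def classify(cases, characteristic):
    """Label a characteristic as a 'variable' or a 'constant' for these cases."""
    return "variable" if is_variable(cases, characteristic) else "constant"


for characteristic in ("city", "country"):
    print(characteristic, "is a", classify(teams, characteristic))
```

Note that the label depends on the set of cases you study. If you added teams from other countries, country would start to vary and would become a variable too.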
The next level of measurement is the interval level. With interval variables, we have different categories and an order, but also similar intervals between the categories. An example is the age of a football player. We can say that a player of 18 years old differs from a player of 16 years old in terms of his or her age. In addition, we can say that this player is older. But we can also say that, in terms of age, the difference between an 18-year-old player and a 16-year-old player is similar to the difference between a 14-year-old player and a 12-year-old player.
The final level of measurement is the ratio level. It is similar to the interval level, but has, in addition, a meaningful zero point. An example is a player's body height measured in centimeters. There are differences between the categories, there is an order, there are similar intervals, and we have a meaningful zero point. A height of 0 cm means that there is no height at all. Note that we cannot say that age has a meaningful zero point, because an age of 0 does not mean that there is no age. Age therefore is an interval variable.
Interval and ratio variables are what we call quantitative variables, because the categories are represented by numerical values. Quantitative variables can also be divided into discrete and continuous variables. A variable is discrete if its possible categories form a set of separate numbers. For instance, the number of goals scored by a football player. A player can score, for instance, one goal or two goals, but not 1.21 goals. A variable is continuous if the possible values of the variable form an interval. An example is, again, the height of a player. Someone can be 170 cm tall, 171 cm tall, but also, for instance, 170.2461 cm tall. We don't have a set of separate numbers, but an infinite range of values.
Why is it so important to distinguish these various levels of measurement? Well, because the methods we employ to analyze data depend on the level at which your variables are measured. However, in practice the distinctions sometimes get blurred. For instance, for many statistical analyses, the differences between the interval and ratio level are not that important. Moreover, many statisticians argue that if you have an ordinal variable measured on a scale with 10 categories or even more, you are allowed to analyze this variable as if it were quantitative. An example is a survey question that asks, "On a scale from 0 to 10, how good would you say player X is?" Formally, this is an ordinal variable, but in practice you are allowed to cheat and to treat it as if it were a quantitative one.
To conclude, how does all this information make you a better expert in football? Well, thinking about players, teams and competitions in terms of cases, variables and the levels of measurement of these variables makes your knowledge about football more structured. To become even more of an expert, do not hesitate to watch the next videos too.
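As a rough illustration of why the level of measurement matters for analysis, here is a small Python sketch. It assumes one common textbook convention that is not spelled out in this video: the mode is meaningful at every level, the median needs at least an ordinal level, and the mean needs an interval or ratio level. The data values and the names `SUMMARIES_BY_LEVEL` and `summarize` are hypothetical.

```python
from statistics import mean, median, mode

# Assumed convention (not stated in the video): which summary statistics
# are meaningful at each level of measurement.
SUMMARIES_BY_LEVEL = {
    "nominal": ("mode",),
    "ordinal": ("mode", "median"),
    "interval": ("mode", "median", "mean"),
    "ratio": ("mode", "median", "mean"),
}

SUMMARY_FUNCTIONS = {"mode": mode, "median": median, "mean": mean}


def summarize(values, level):
    """Compute only the summaries that make sense for the given level."""
    return {name: SUMMARY_FUNCTIONS[name](values) for name in SUMMARIES_BY_LEVEL[level]}


# Hypothetical values, for illustration only.
nationalities = ["Spanish", "French", "Spanish", "Mexican"]  # nominal
ratings = [7, 8, 8, 9, 6]                                    # ordinal, 0 to 10 scale
ages = [16, 18, 18, 24]                                      # interval, as in the video

print(summarize(nationalities, "nominal"))
print(summarize(ratings, "ordinal"))
print(summarize(ages, "interval"))
```

Calling `summarize(ratings, "interval")` instead would also return a mean. That corresponds to the "cheat" of treating a 0 to 10 rating as if it were quantitative.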