Overview
This lecture introduces the concept of data, explains the importance of context, and describes different types of variables commonly found in structured data sets.
Context and Meaning of Data
- Data consists of information presented with context.
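- Without context, numbers or words alone do not convey meaningful information. For example, the number 2003 is ambiguous on its own; identified as the year a particular movie was released, it becomes data.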
- Context transforms information into useful data.
Structure of Data Sets
- A structured data set arranges variables as columns and subjects (cases) as rows.
- Each variable is an attribute measured for every subject. In a movie data set, for example, the variables might be title, genre, year, and average rating.
- When sorting by one variable, entire rows must move together. Sorting a single column on its own separates values from the cases they belong to.
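The sorting rule can be illustrated with a minimal Python sketch. The movie records below are hypothetical placeholders, not data from the lecture; the point is only the contrast between sorting whole rows and sorting one column in isolation.

```python
# Hypothetical movie records: each dict is one case (row),
# and each key is a variable (column).
movies = [
    {"title": "Movie A", "genre": "Drama", "year": 2010},
    {"title": "Movie B", "genre": "Comedy", "year": 1995},
    {"title": "Movie C", "genre": "Action", "year": 2003},
]

# Correct: sort whole rows by one variable, so every value
# travels with the case it belongs to.
sorted_rows = sorted(movies, key=lambda row: row["year"])

# Incorrect: sorting one column by itself and pairing it back
# with the unsorted titles attaches years to the wrong movies.
titles = [row["title"] for row in movies]
years_sorted_alone = sorted(row["year"] for row in movies)
mismatched = list(zip(titles, years_sorted_alone))
```

In `sorted_rows`, "Movie B" still carries its own year and genre. In `mismatched`, "Movie A" is paired with a year that belongs to a different movie.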
Types of Variables
- Variables are attributes or characteristics measured in the data set.
Categorical (Qualitative) Variables
- Categorical variables represent labels or categories.
- Ordinal variables have categories with a meaningful order, although the gaps between categories are not necessarily equal (e.g., movie content rating: G, PG, PG-13, R; Likert scales; school year).
- Nominal variables have categories with no inherent order (e.g., genre, hair color, ethnicity).
- Binary (two-category) variables are usually treated as nominal (e.g., animated: yes/no).
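As a rough illustration of the ordinal/nominal distinction, the Python sketch below uses school year as the ordinal variable and genre as the nominal one. The category order and the sample values are assumptions made for this example. An ordinal variable needs its order stated explicitly, because alphabetical order is generally not the meaningful one.

```python
from collections import Counter

# Assumed category order for the ordinal variable "school year".
SCHOOL_YEAR_ORDER = ["Freshman", "Sophomore", "Junior", "Senior"]
rank = {label: position for position, label in enumerate(SCHOOL_YEAR_ORDER)}

# Hypothetical observations.
students = ["Senior", "Freshman", "Junior", "Sophomore", "Freshman"]

# Sorting by rank respects the meaningful order of the categories.
by_order = sorted(students, key=lambda label: rank[label])

# Plain string sorting is alphabetical and puts Junior before Sophomore.
alphabetical = sorted(students)

# Nominal categories have no order, so counting is the natural operation.
genres = ["Drama", "Comedy", "Drama", "Action"]
genre_counts = Counter(genres)
```

Comparisons such as `rank["Junior"] > rank["Sophomore"]` make sense for school year. There is no comparable ranking for genre, only counts per category.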
Numerical (Quantitative) Variables
- Numerical variables represent measurable quantities.
- Discrete numerical variables take separate, countable values, usually whole numbers (e.g., year released, number of votes).
- Continuous numerical variables can take any value within a range, including decimals, and are measured rather than counted (e.g., average rating, budget, temperature).
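The short Python sketch below shows one reason an average rating is treated as continuous even when the individual scores are whole numbers. All values are hypothetical.

```python
from statistics import mean

# Discrete: vote counts are whole numbers obtained by counting.
votes_per_movie = [1520, 87, 23004]

# Hypothetical whole-number scores given to one movie by four viewers.
viewer_scores = [8, 7, 9, 6]

# The average of discrete scores can fall between whole numbers,
# which is why average rating behaves as a continuous variable.
avg_rating = mean(viewer_scores)

all_counts_are_whole = all(isinstance(v, int) for v in votes_per_movie)
```

Here `avg_rating` is 7.5, a value that no single viewer could have given.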
Importance of Variable Types
- The type of a variable determines which summaries, graphs, and statistical tests are appropriate. For example, a mean is meaningful for a numerical variable but not for a nominal one.
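One way to picture this is a small helper that picks a summary based on the variable type. This is a simplified sketch, not a standard API. The function name `summarize`, the type labels, and the choice of mode, median category, and mean are assumptions made for this example.

```python
from collections import Counter
from statistics import mean, median_low


def summarize(values, kind, order=None):
    """Return one simple summary suited to the variable type (illustrative only)."""
    if kind == "nominal":
        # No order exists, so report the most frequent category (the mode).
        return Counter(values).most_common(1)[0][0]
    if kind == "ordinal":
        # Order exists but spacing is unknown, so report a median category.
        if order is None:
            raise ValueError("ordinal variables need an explicit category order")
        ranks = [order.index(value) for value in values]
        return order[median_low(ranks)]
    if kind == "numerical":
        # Arithmetic is meaningful, so the mean can be used.
        return mean(values)
    raise ValueError(f"unknown variable type: {kind}")


# Hypothetical usage with made-up values.
top_genre = summarize(["Drama", "Comedy", "Drama"], "nominal")
typical_rating = summarize(
    ["G", "R", "PG", "PG"], "ordinal", order=["G", "PG", "PG-13", "R"]
)
mean_score = summarize([7.0, 8.0, 9.0], "numerical")
```

Calling `summarize` with `"numerical"` on genre labels would fail, which mirrors the statistical point: the analysis has to match the variable type.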
Key Terms & Definitions
- Data — information provided with context.
- Structured Data — organized with variables as columns and cases as rows.
- Variable — a measured attribute in a data set.
- Categorical Variable — variable with values as labels or categories.
- Ordinal Variable — categorical variable with a meaningful order.
- Nominal Variable — categorical variable with no intrinsic order.
- Numerical Variable — variable with measurable, quantitative values.
- Discrete Variable — numerical variable with separate, countable values, usually whole numbers.
- Continuous Variable — numerical variable that can take any value within a range, including decimals.
Action Items / Next Steps
- Review your own data sets for examples of each variable type.
- Prepare to identify variable types in upcoming assignments.