Histograms and Data Visualization
Jul 6, 2024
Histograms: A Method for Displaying Continuous Data
Introduction
Why histograms?
Means, medians, and standard deviations don't tell the whole story.
Distribution shapes are important and not captured by single summaries.
What is a Histogram?
Definition
: Displays distribution of data by charting the number or percentage of observations within predefined numerical ranges.
Similarity to Bar Charts
: Histograms look like bar charts, but the bars cover numeric ranges of a continuous variable rather than separate categories, so they show how the data are distributed.
Example: Age Data from 1995 Statistical Abstract (US)
Dataset
: Proportion of individuals over 65 years old in each of the 50 states.
Interesting Findings
:
Smallest percentage: Alaska (4.6%)
Largest percentage: Florida (18.4%)
Steps to Create a Histogram
Step 1
: Break data into mutually exclusive, equally sized bins.
Step 2
: Count observations in each bin.
Notation
: Use brackets and parentheses to define precise ranges; for example, [4, 5) includes 4 but excludes 5, so a value on a boundary falls in exactly one bin.
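The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own procedure: the function name `count_in_bins` is hypothetical, the sample values are made up, and the choice that each bin includes its left edge and excludes its right edge is an assumption.

```python
def count_in_bins(values, start, width, n_bins):
    """Count how many values fall in each bin [left, left + width).

    Assumption (not from the notes): each bin includes its left edge and
    excludes its right edge, so a boundary value is counted only once.
    """
    counts = [0] * n_bins
    for v in values:
        for i in range(n_bins):
            left = start + i * width
            if left <= v < left + width:
                counts[i] += 1
                break  # bins are mutually exclusive, so stop at the first match
    return counts


# Made-up values, chosen only to show how boundary values are handled.
sample = [4.6, 5.0, 5.9, 6.0, 7.2]
print(count_in_bins(sample, start=4, width=1, n_bins=4))  # [1, 2, 1, 1]
```

Here 5.0 lands in [5, 6) and 6.0 lands in [6, 7), which is what the bracket and parenthesis notation is meant to pin down.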
Example Breakdown
Bin Setup
: Bins from 4% to 19% with 1% width.
Observation Counts
:
4-5%: 1 observation
5-6%: 0 observations
6-7%: 0 observations
8-9%: 1 observation
Counts are higher within the 10-16% range.
Graphical Summary
: Histogram visualizes where values are centered and spread out.
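As a rough check on the bin setup, the sketch below computes the number of 1%-wide bins between 4% and 19% and finds the bins that hold the two values quoted in the notes (Alaska at 4.6%, Florida at 18.4%). The helper `bin_label` is hypothetical, and the left-closed, right-open convention is an assumption.

```python
import math

LOW, HIGH, WIDTH = 4, 19, 1  # bin range and width from the notes, in percent

n_bins = (HIGH - LOW) // WIDTH  # number of equally sized bins


def bin_label(value):
    """Return the bin containing value, written as '[left, right)'."""
    i = math.floor((value - LOW) / WIDTH)  # zero-based bin index
    left = LOW + i * WIDTH
    return f"[{left}, {left + WIDTH})"


print(n_bins)           # 15
print(bin_label(4.6))   # [4, 5)    Alaska
print(bin_label(18.4))  # [18, 19)  Florida
```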
Blood Pressure Data Example
Dataset
: Blood pressure data from 113 men.
Statistics
: Mean = 123.6 mmHg, Standard deviation = 12.9 mmHg.
Histogram Properties
:
Bin Width
: 5 mmHg; the height of each bar represents the number of men in that bin.
Shape
: Symmetric and bell-shaped, centered on the mean and median.
Arbitrary Bin Width
:
20 mmHg width: Too crude.
1 mmHg width: Too fine.
Percentage Representation
: Vertical axis can represent percentages instead of counts.
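The effect of bin width, and the switch from counts to percentages, can be tried out with a short sketch. The original 113 measurements are not listed in the notes, so the data here are simulated from a normal distribution using the reported mean and standard deviation; that is a stand-in only, and the function names are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable
# Simulated stand-in for the real measurements, which the notes do not list.
pressures = [random.gauss(123.6, 12.9) for _ in range(113)]


def histogram(values, width):
    """Return {left_edge: count} for bins [left, left + width)."""
    counts = {}
    for v in values:
        left = int(v // width) * width  # left edge of the bin holding v
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


def to_percent(counts):
    """Convert bin counts to percentages of all observations."""
    total = sum(counts.values())
    return {left: 100 * c / total for left, c in counts.items()}


for width in (20, 5, 1):
    counts = histogram(pressures, width)
    print(f"width {width:>2} mmHg: {len(counts)} non-empty bins")

percent = to_percent(histogram(pressures, 5))
for left, pct in percent.items():
    print(f"[{left}, {left + 5}): {pct:.1f}%")
```

A 20 mmHg width leaves only a handful of bars, while a 1 mmHg width spreads the 113 values over many sparse bins. Converting to percentages rescales the bar heights without changing the shape.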
Choosing Bin Width and Number
General Guidance
:
Dependent on sample size and data spread.
Rough rule: Number of intervals ≈ √(sample size).
Example: 10 observations → 3 bins, 50 observations → 7 bins, 100 observations → 10 bins.
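The square-root rule is simple enough to check directly. In this sketch the function name is hypothetical, and rounding to the nearest whole number is an assumption that happens to reproduce the three examples in the notes.

```python
import math


def sqrt_rule_bins(n):
    """Rough number of bins: the square root of the sample size, rounded."""
    return round(math.sqrt(n))


for n in (10, 50, 100):
    print(n, sqrt_rule_bins(n))  # 10 3, then 50 7, then 100 10
```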
Computer Selection
: Software typically picks a reasonable default number of bins for you, which you can override if the result looks too crude or too fine.
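The notes do not say which rule software applies. As one illustration only, Sturges' rule is a widely used default (it is the default in R's `hist` function, for instance). The sketch below shows the bin counts it gives, so they can be set against the square-root rule; the function name is hypothetical.

```python
import math


def sturges_bins(n):
    """Sturges' rule: ceil(log2(n)) + 1 bins for n observations."""
    return math.ceil(math.log2(n)) + 1


for n in (10, 50, 100):
    print(n, sturges_bins(n))  # 10 5, then 50 7, then 100 8
```

Different rules give different answers for the same sample size, which is one reason to treat a software default as a starting point rather than a final choice.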
Conclusion
Histograms are useful for summarizing continuous data and understanding distribution shapes.
Other visualization options will be explored in the next section.