ECON1006 INTRODUCTION TO ECONOMIC METHODS
SUMMARY NOTES - WEEK 4
Required Reading:
Ref. File 4: Sections 4.7 to 4.9
Ref. File 5: Introduction and Sections 5.1 to 5.4
4. PROBABILITY THEORY CONTINUED
4.9 Sampling With and Without Replacement
Definition (Random Sample from a Statistical Population)
A random sample of ‘n’ elements from a statistical population is such that every possible combination of ‘n’ elements from the population has an equal probability of being in the sample.
Many experiments involve taking a random sample from a finite population. If we sample with replacement, we effectively return each observation to the population before making the next selection. In this way the population from which we are sampling remains the same from one selection to the next; provided sampling is random, the successive outcomes will be independent.
If we sample without replacement from a finite population, the outcome of any one selection will depend on the outcomes of all previous selections; the population is reduced with each selection.
Example 4.16:
Suppose that in a given street 50 residents voted in the last election. Of these, 15 voted for party ‘A’, 30 voted for party ‘B’ and 5 voted for neither party ‘A’ nor ‘B’. Suppose that one evening a candidate for the next election visits the residents of the street to introduce herself. What is the probability that the first two eligible voters she meets voted for party ‘A’ at the last election? (3/35 ≈ 0.0857)
Here sampling is without replacement. Define the following events:
A1: first person voted for party ‘A’
A2: second person voted for party ‘A’
We require P(A1 ∩ A2) [Intersection = AND condition]
P(A1 ∩ A2) = P(A1)P(A2|A1) = (15/50)(14/49) = 3/35 ≈ 0.0857
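This arithmetic can be verified with a short Python sketch (Python is not part of these notes; the snippet simply checks the calculation using exact fractions):

```python
from fractions import Fraction

# Sampling without replacement: the population shrinks after each draw.
p_first_A = Fraction(15, 50)               # P(A1): 15 of 50 residents voted 'A'
p_second_A_given_first = Fraction(14, 49)  # P(A2|A1): 14 'A' voters left among 49

# P(A1 and A2) = P(A1) * P(A2|A1)
p_both = p_first_A * p_second_A_given_first
print(p_both, float(p_both))  # 3/35, about 0.0857
```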
Example 4.17:
Consider the experiment of successively drawing 2 cards from a deck of 52 playing cards. Define the following events:
A1: ace on first draw
A2: ace on second draw
What is the probability of selecting 2 aces if sampling (drawing) is (i) without replacement, and (ii) with replacement? (1/221 ≈ 0.0045; 1/169 ≈ 0.0059)
Without replacement: P(A1 ∩ A2) = P(A1)P(A2|A1) = (4/52)(3/51) = 1/221 ≈ 0.0045
With replacement: P(A1 ∩ A2) = P(A1)P(A2) = (4/52)(4/52) = 1/169 ≈ 0.0059
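Both cases can be checked with a brief Python sketch (illustrative only, using exact fractions):

```python
from fractions import Fraction

# (i) Without replacement: the second draw depends on the first,
#     since one ace and one card have been removed from the deck.
p_without = Fraction(4, 52) * Fraction(3, 51)  # P(A1) * P(A2|A1)

# (ii) With replacement: the deck is restored, so the draws are independent.
p_with = Fraction(4, 52) * Fraction(4, 52)     # P(A1) * P(A2)

print(p_without, float(p_without))  # 1/221, about 0.0045
print(p_with, float(p_with))        # 1/169, about 0.0059
```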
Note: If we simultaneously select a sample of ‘n’ elements, we are effectively sampling without replacement.
4.10 Probability Trees
Tree diagrams can be a useful aid in calculating the probabilities of intersections of events (i.e. joint probabilities).
________________
Example 4.18:
Greasy Mo’s take-away food store offers special $10 meal deals consisting of a small pizza or a kebab, together with a can of soft drink, a milkshake or a cup of fruit juice. Past experience has shown that 60% of meal deal buyers choose a pizza (‘P’), 40% choose kebabs (‘K’), 75% choose soft drink (‘S’), 20% choose a milkshake (‘M’) and 5% choose fruit juice (‘J’). Assume the events ‘P’ and ‘K’ are independent of the events ‘S’, ‘M’ and ‘J’. What is the probability that a meal deal customer (chosen at random) will choose a pizza and fruit juice? (0.03)
The tree diagram for this example can be drawn as below.

P: 0.6  --  S: 0.75
            M: 0.20
            J: 0.05

K: 0.4  --  S: 0.75
            M: 0.20
            J: 0.05

Thus P(P ∩ J) = P(P)P(J) = (0.6)(0.05) = 0.03, etc.
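The same path-multiplication logic can be written as a short Python sketch (not part of the original notes; the dictionaries simply encode the two independent branches of the tree):

```python
# Each path through the tree multiplies its branch probabilities,
# since the meal choice (P/K) is independent of the drink choice (S/M/J).
meal = {'P': 0.60, 'K': 0.40}
drink = {'S': 0.75, 'M': 0.20, 'J': 0.05}

joint = {(m, d): pm * pd for m, pm in meal.items() for d, pd in drink.items()}
print(round(joint[('P', 'J')], 4))  # 0.03 -- pizza and fruit juice

# The six path probabilities sum to 1, as they must.
print(round(sum(joint.values()), 10))  # 1.0
```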
________________
5. PROBABILITY DISTRIBUTIONS OF DISCRETE RANDOM VARIABLES
5.1 Probability Distributions and Random Variables
A probability distribution can be considered a theoretical model for a relative frequency distribution of data from a real life population.
For example, the probability distribution normally used for the experiment of tossing a fair coin once and noting whether a head (‘H’) or tail (‘T’) results can be written

Outcome:      H    T
Probability: 1/2  1/2

This can be interpreted as saying that if the coin tossing experiment were repeated many times, we would expect the relative frequency of each outcome to be a half.
A probability distribution thus specifies the probabilities associated with the various outcomes of a statistical experiment. It can take the form of a table, a graph or some formula.
From now on we shall be concerned with the characteristics of probability distributions. However, to facilitate our study we shall now represent simple events and events associated with statistical experiments by values of random variables.
Definition (Random Variable)
A random variable X is a rule that assigns to each simple event of a statistical experiment a unique numerical value.
The above definition can also be expressed in the following slightly more mathematical way.
Alternative Definition (Random Variable)
A random variable X is a real valued function for which the domain is the sample space of a statistical experiment.
Remember that by the term random experiment we mean an experiment which gives rise to random outcomes.
In most statistical experiments of interest, outcomes give rise to quantitative data that can be considered values of the random variable being studied.
For example, if the experiment consists of selecting a household at random and noting the number of children in the household, we would naturally define
X = random variable representing the number of children in a household.
X could thus take the values 0, 1, 2, 3, ... corresponding to possible outcomes of the experiment.
In experiments which give rise to categorical or qualitative data, a random variable can normally also be defined.
Example 5.1:
Consider the experiment of selecting a person at random and noting their hair colour.
Here we could define X to be the random variable representing hair colour, where
X = 1 if the person’s hair colour is blonde
X = 2 if it is brown
X = 3 if it is grey
X = 4 if it is black
X = 5 if it is white
X = 6 if it is red
There are two basic types of random variables.
Definition (Discrete Random Variable)
A discrete random variable can only assume a finite or countably infinite number of values.
(By countably infinite we mean that the values can be listed in order, although the list is infinitely long.)
Definition (Continuous Random Variable)
A continuous random variable can assume any value in an interval (finite or infinite).
________________
Some examples of discrete random variables:
the number of errors on a typed page
the number of cars owned by a household
Some examples of continuous random variables:
the length of time between bus arrivals at a bus stop
the weight of an individual
At this stage we will concentrate on discrete random variables.
Definition (Discrete Probability Distribution)
A discrete probability distribution lists a probability for, or provides a means (e.g. a rule or formula) of assigning a probability to, each value a discrete random variable can take.
Suppose our random variable is called X. Then p(x) = P(X = x) represents the probability that the random variable takes on the particular value ‘x’ (as a result of the outcome of an experiment).
Properties of the Discrete Probability Distribution of a Random Variable X:
* 0 ≤ p(x) ≤ 1 for all values of ‘x’
* Σ p(x) = 1, where the sum is taken over all values ‘x’ can take
________________
Example 5.2:
Consider again the experiment of tossing a fair die once and noting the number of dots on the upward facing side (X).
We have
p(x) = 1/6 for x = 1, 2, 3, 4, 5, 6
and
Σ p(x) = p(1) + p(2) + ... + p(6) = 1
At this point we can also introduce the concept of a cumulative distribution function (or simply distribution function) of a random variable (discrete or continuous).
Definition (Cumulative Distribution Function)
The cumulative distribution function of a random variable X, denoted F(x), is defined as
F(x) = P(X ≤ x)
where ‘x’ is any real number.
(In the above definition, ‘x’ represents any real number, not just the values that the random variable can take.)
Thus a cumulative distribution function shows the probability of the random variable taking on values less than or equal to some value ‘x’.
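The idea can be illustrated with a small Python sketch for one roll of a fair die (illustrative only; exact fractions are used to avoid rounding):

```python
from fractions import Fraction

# Probability distribution of X = number of dots on a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def F(x):
    # F(x) = P(X <= x); 'x' may be any real number, not just 1..6.
    return sum(p for value, p in pmf.items() if value <= x)

print(F(3))        # 1/2
print(F(2.5))      # 1/3 -- F steps only at values the die can actually show
print(F(0), F(6))  # 0 below the support, 1 at or above the maximum value
```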
5.2 Expected Values of Random Variables
It is of interest to have a measure of the centre of the probability distribution of a random variable X. This role is filled by the expected value of X.
Definition (Expected Value of a Discrete Random Variable)
The expected value of a discrete random variable X is defined as
E(X) = Σ x p(x) = μ
(a weighted average of all the values X can take, with the probabilities as weights)
If the statistical experiment considered generates values of the random variable that coincide with values in the population considered, and the theoretical probability distribution of the random variable and the population relative frequency distribution are the same, the mean of the theoretical distribution of X will be the same as the population mean μ. That is, E(X) = μ.
We will generally assume that our model (i.e. the probability distribution) is correct, so the above holds.
________________
Example 5.3:
Suppose you buy a lottery ticket for $10. The sole prize in the lottery is $100,000 and 100,000 tickets are sold. If the lottery is fair (i.e. each ticket sold has an equal chance of winning), what will be your expected net gain (or loss) from buying the lottery ticket? (-9)
Let X be the net gain in dollars. Then

x:       99,990        -10
p(x):   1/100,000   99,999/100,000

E(X) = Σ x p(x) = 99,990(1/100,000) + (-10)(99,999/100,000) = -9
[Expectation = Weighted average of all values X can take]
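The expected net gain can be checked with a brief Python sketch (illustrative only; the two outcomes are winning $100,000 less the $10 ticket price, or losing the $10):

```python
# Net gain from one $10 ticket: $99,990 with probability 1/100,000,
# otherwise a loss of $10.
outcomes = {99_990: 1 / 100_000, -10: 99_999 / 100_000}

# E(X) = sum of x * p(x) over all outcomes
expected_gain = sum(x * p for x, p in outcomes.items())
print(round(expected_gain, 6))  # -9.0
```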
Theorem (Expected Value of a Function of a Discrete Random Variable)
Suppose g(X) is a function of a discrete random variable X. The expected value of this function, if it exists, is given by
E[g(X)] = Σ g(x) p(x)
There are several important properties related to expected values.
________________
Theorem 5.2 (Various Properties of Expected Values)
* If ‘c’ is any constant, then E(c) = c
* If ‘c’ is any constant and g(X) is any function of a discrete or continuous random variable X, then E[c g(X)] = c E[g(X)]
* If g1(X), g2(X), ..., gk(X) are ‘k’ functions of a discrete or continuous random variable X, then E[g1(X) + g2(X) + ... + gk(X)] = E[g1(X)] + E[g2(X)] + ... + E[gk(X)]
* If g1(X) and g2(X) are two functions of a discrete or continuous random variable X such that g1(X) ≤ g2(X) for all X, then E[g1(X)] ≤ E[g2(X)]
For example, E(aX + b) = aE(X) + b.
Note:
Two discrete random variables X and Y are independent if
p(x, y) = p(x) p(y) for all values of x and y
(or equivalently, P(X = x and Y = y) = P(X = x) P(Y = y) for all values of x and y).
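A quick Python sketch of the factorization check (the joint distribution below is a hypothetical example, two independent fair-coin indicator variables, not taken from the notes):

```python
# Independence check: the joint pmf p(x, y) must factor as p(x) * p(y).
# Hypothetical joint distribution of two independent fair-coin indicators.
joint = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

# Marginal distributions obtained by summing the joint pmf.
px = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
py = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

independent = all(abs(joint[x, y] - px[x] * py[y]) < 1e-12
                  for x in (0, 1) for y in (0, 1))
print(independent)  # True
```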
5.3 The Variance of a Random Variable
To gauge the dispersion of a random variable X about its expected value or mean we can calculate the expected value of its squared distance from the mean. This is called the variance of the random variable X, denoted Var(X) or σ².
Definition (Variance of a Random Variable)
The variance of any random variable X (discrete or continuous) is given by
Var(X) = E[(X − μ)²]
If X is a discrete random variable that can take ‘n’ different values x1, x2, ..., xn, the above definition specializes to
Var(X) = Σ (xi − μ)² p(xi), summing over i = 1, ..., n
Definition (Standard Deviation of a Random Variable)
The standard deviation of any random variable X (discrete or continuous) is given by
σ = √Var(X)
Again assuming the probability distribution of X is an accurate representation of the population relative frequency distribution of X, we can write Var(X) = σ², where σ² is the population variance.
An alternative way of writing (and calculating) Var(X) is
Var(X) = E(X²) − μ²
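That the definition and the shortcut formula agree can be verified numerically for a fair die (a sketch, not part of the original notes; exact fractions avoid rounding error):

```python
from fractions import Fraction

# Fair die: check Var(X) = E[(X - mu)^2] against the shortcut E(X^2) - mu^2.
values = range(1, 7)
p = Fraction(1, 6)

mu = sum(x * p for x in values)                          # E(X) = 7/2
var_definition = sum((x - mu) ** 2 * p for x in values)  # E[(X - mu)^2]
var_shortcut = sum(x ** 2 * p for x in values) - mu ** 2  # E(X^2) - mu^2

print(mu, var_definition, var_shortcut)  # 7/2 35/12 35/12
```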
Example 5.4:
Suppose a lottery offers 3 prizes: $1,000, $2,000 and $3,000. 10,000 tickets are sold and each ticket has an equal chance of winning a prize. Calculate the variance and standard deviation of the random variable X representing the value of the prize won by a ticket. (1399.64, 37.4118)
Since only 3 of the 10,000 tickets win a prize, the probability distribution of X and the working are:

x        p(x)     x·p(x)    x²           x²·p(x)
0        0.9997   0         0            0
1,000    0.0001   0.1       1,000,000    100
2,000    0.0001   0.2       4,000,000    400
3,000    0.0001   0.3       9,000,000    900
Total    1.0000   0.6                    1400

Thus μ = E(X) = Σ x p(x) = 0.6
Var(X) = E(X²) − μ² = 1400 − (0.6)² = 1399.64
σ = √1399.64 ≈ 37.4118
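The calculation can be confirmed with a short Python sketch (illustrative only; exact fractions are used so the variance comes out exactly):

```python
from fractions import Fraction

# Prize lottery: 10,000 tickets, one prize each of $1,000, $2,000, $3,000;
# all remaining 9,997 tickets win nothing.
pmf = {0: Fraction(9_997, 10_000),
       1_000: Fraction(1, 10_000),
       2_000: Fraction(1, 10_000),
       3_000: Fraction(1, 10_000)}

mu = sum(x * p for x, p in pmf.items())                  # E(X) = 0.6
var = sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2  # E(X^2) - mu^2
sd = float(var) ** 0.5

print(float(mu), float(var), round(sd, 4))  # 0.6 1399.64 37.4118
```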
If we wish to determine the variance of a linear function of a random variable X, the following rule can be used:
Var(a + bX) = b² Var(X)
5.4 The Binomial Distribution
The binomial distribution is a discrete probability distribution based on ‘n’ repetitions of an experiment whose outcomes are represented by a Bernoulli random variable.
(a) Bernoulli Experiments
A Bernoulli experiment (or trial) is such that only 2 outcomes are possible. These outcomes can be denoted success (‘S’) and failure (‘F’), with probabilities ‘p’ and q = 1 − p, respectively.
A Bernoulli random variable Y is usually defined so that it takes the value 1 if the outcome of a Bernoulli experiment is a success, and the value 0 if the outcome is a failure.
________________
Thus
P(Y = 1) = p and P(Y = 0) = 1 − p
The mean and variance of a Bernoulli random variable defined in the above way are
E(Y) = p and Var(Y) = p(1 − p)
An example of a Bernoulli experiment is the tossing of a fair coin, denoting a head as a success and a tail as a failure, with p = q = 1/2.
(b) Binomial Experiments
Definition (Binomial Experiment)
A binomial experiment fulfils the following requirements:
(i) There are ‘n’ repetitions or ‘trials’ of a Bernoulli experiment for which there are only two outcomes, ‘success’ or ‘failure’.
(ii) All trials are performed under identical conditions.
(iii) The trials are independent.
(iv) The probability of success ‘p’ is the same for each trial.
(v) The random variable of interest, say X, is the number of successes observed in the ‘n’ trials.
Theorem (The Binomial Probability Function)
Let X represent the number of successes in a binomial experiment consisting of ‘n’ trials and with a probability ‘p’ of success on each trial. The probability of ‘x’ successes in such an experiment is given by
p(x) = C(n, x) p^x (1 − p)^(n−x)
for x = 0, 1, 2, ..., n, where C(n, x) = n! / [x!(n − x)!] is the number of combinations of ‘n’ items taken ‘x’ at a time.
(See reference file for proof if interested)
Example 5.5:
A company that supplies reverse-cycle air conditioning units has found from experience that 70% of the units it installs require servicing within the first 6 weeks of operation. In a given week the firm installs 10 air conditioning units. Calculate the probability that, within 6 weeks
* 5 of the units require servicing (0.1029 approx.)
* none of the units require servicing (0 approx.)
* all of the units require servicing (0.0282 approx.)
Tables are also available to calculate these simple binomial probabilities.
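The three probabilities above can be computed directly from the binomial formula with a short Python sketch (illustrative only; `math.comb` supplies the C(n, x) term):

```python
from math import comb

def binom_pmf(x, n, p):
    # P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.7  # 10 units installed; 70% require servicing within 6 weeks

print(round(binom_pmf(5, n, p), 4))   # 0.1029
print(round(binom_pmf(0, n, p), 4))   # 0.0 (about 0.0000059)
print(round(binom_pmf(10, n, p), 4))  # 0.0282
```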
________________
(c) Cumulative Binomial Probabilities
The calculation of cumulative binomial probabilities of the form P(X ≤ x) is often tedious, even using a calculator. However, tables to determine such probabilities are available. (See Reference files appendix Table 3.)
(Extract of Appendix 3)
CUMULATIVE BINOMIAL PROBABILITIES: P(X ≤ x)

 n   x    p = 0.05  0.10    0.15    0.20    0.25    0.30    0.35    0.40   ....  0.70
 1   0    0.9500    0.9000  0.8500  0.8000  0.7500  0.7000  0.6500  0.6000 ....  0.3000
     1    1.0000    1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000 ....  1.0000
 2   0    0.9025    0.8100  0.7225  0.6400  0.5625  0.4900  0.4225  0.3600 ....  0.0900
     1    0.9975    0.9900  0.9775  0.9600  0.9375  0.9100  0.8775  0.8400 ....  0.5100
     2    1.0000    1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000 ....  1.0000
 3   0    0.8574    0.7290  0.6141  0.5120  0.4219  0.3430  0.2746  0.2160 ....  0.0270
     1    0.9928    0.9720  0.9393  0.8960  0.8438  0.7840  0.7183  0.6480 ....  0.2160
     2    0.9999    0.9990  0.9966  0.9920  0.9844  0.9730  0.9571  0.9360 ....  0.6570
     3    1.0000    1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000 ....  1.0000
10   0    0.5987    0.3487  0.1969  0.1074  0.0563  0.0282  0.0135  0.0060 ....  0.0000
     1    0.9139    0.7361  0.5443  0.3758  0.2440  0.1493  0.0860  0.0464 ....  0.0001
     2    0.9885    0.9298  0.8202  0.6778  0.5256  0.3828  0.2616  0.1673 ....  0.0016
     3    0.9990    0.9872  0.9500  0.8791  0.7759  0.6496  0.5138  0.3823 ....  0.0106
     4    0.9999    0.9984  0.9901  0.9672  0.9219  0.8497  0.7515  0.6331 ....  0.0473
     5    1.0000    0.9999  0.9986  0.9936  0.9803  0.9527  0.9051  0.8338 ....  0.1503
     6    1.0000    1.0000  0.9999  0.9991  0.9965  0.9894  0.9740  0.9452 ....  0.3504
     7    1.0000    1.0000  1.0000  0.9999  0.9996  0.9984  0.9952  0.9877 ....  0.6172
     8    1.0000    1.0000  1.0000  1.0000  1.0000  0.9999  0.9995  0.9983 ....  0.8507
     9    1.0000    1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  0.9999 ....  0.9718
    10    1.0000    1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000 ....  1.0000
Example 5.6:
Referring to the previous air conditioning unit example, calculate the probability that within 6 weeks of installation
* less than 8 of the air conditioners require servicing. (0.6172 approx.)
* 4 or more of the air conditioners require servicing. (0.9894 approx.)
P(X < 8) = P(X ≤ 7) = 0.6172 from the tables
P(X ≥ 4) = 1 − P(X ≤ 3) = 1 − 0.0106 = 0.9894 from the tables
Example 5.7:
Referring to the previous air conditioning unit example, use the cumulative binomial tables to calculate the probability that within 6 weeks of installation
* 5 units require servicing (0.103)
P(X = 5) = P(X ≤ 5) − P(X ≤ 4) = 0.1503 − 0.0473 = 0.1030
* 10 units require servicing (0.0282)
P(X = 10) = P(X ≤ 10) − P(X ≤ 9) = 1 − 0.9718 = 0.0282
The slight difference from the answer calculated previously is due to rounding in the tables.
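Computing the cumulative probabilities exactly, rather than reading them from the rounded tables, can be done with a short Python sketch (illustrative only):

```python
from math import comb

def binom_cdf(x, n, p):
    # P(X <= x) = sum of the binomial pmf from 0 up to x
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

n, p = 10, 0.7

# Example 5.6: P(X < 8) = P(X <= 7) and P(X >= 4) = 1 - P(X <= 3)
print(round(binom_cdf(7, n, p), 4))      # 0.6172
print(round(1 - binom_cdf(3, n, p), 4))  # 0.9894

# Example 5.7: P(X = 5) = P(X <= 5) - P(X <= 4)
print(round(binom_cdf(5, n, p) - binom_cdf(4, n, p), 4))  # 0.1029
```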
________________
(d) Characteristics of the Binomial Distribution
Theorem (Mean and Variance of a Binomial Random Variable)
Let X represent the number of successes in a binomial experiment consisting of ‘n’ trials, and where the probability of success on each trial is ‘p’. Then
E(X) = np and Var(X) = np(1 − p)
For example, the mean and variance of the binomial distribution of the previous air conditioning unit example are μ = np = 10(0.7) = 7 and σ² = np(1 − p) = 10(0.7)(0.3) = 2.1, respectively.
Each combination of ‘n’ and ‘p’ gives a particular binomial distribution. We say ‘n’ and ‘p’ are the parameters of the binomial distribution.
If p = 0.5, the binomial distribution is symmetric.
________________
Example 5.8:
Suppose n = 5 and p = 0.5. The probability distribution of X is then

x      0       1       2       3       4       5
p(x)   0.0313  0.1563  0.3125  0.3125  0.1563  0.0313

(The probability histogram of this distribution is symmetric about x = 2.5.)
The binomial distribution will be skewed to the left (i.e. ‘negatively skewed’) if p > 0.5, and skewed to the right (i.e. ‘positively skewed’) if p < 0.5. In either case the tendency to be skewed diminishes as ‘n’ increases.
(See the diagrams in reference file). This is a characteristic which is useful in approximating binomial probabilities as we shall see later.
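This diminishing skewness can be illustrated numerically. The sketch below uses the standard skewness coefficient of the binomial distribution, (1 − 2p) / √(np(1 − p)), a formula not stated in these notes but standard in reference texts:

```python
from math import sqrt

def binom_skewness(n, p):
    # Skewness coefficient of Binomial(n, p): (1 - 2p) / sqrt(np(1 - p)).
    # Positive for p < 0.5 (right skew), negative for p > 0.5, zero at p = 0.5.
    return (1 - 2 * p) / sqrt(n * p * (1 - p))

# For p = 0.3 the skew is positive but shrinks toward 0 as 'n' grows.
for n in (5, 20, 100):
    print(n, round(binom_skewness(n, 0.3), 4))  # 0.3904, 0.1952, 0.0873

print(binom_skewness(10, 0.5))  # 0.0 -- symmetric when p = 0.5
```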
Binomial probability table for n > 100:
https://www.statisticshowto.com/tables/binomial-distribution-table/
________________
MAIN POINTS
* If we sample without replacement from a finite population, the outcome on any draw will depend on the outcomes of all previous draws.
* Sampling with replacement from a finite population is ‘equivalent’ to sampling from an infinite population.
* Tree diagrams can facilitate the calculation of joint probabilities (i.e. the probabilities of intersections of events).
* A probability distribution can be interpreted as a model for the relative frequency distribution of some real statistical population. In any given situation, the model may or may not represent the relative frequency distribution exactly.
* It is convenient to associate the outcomes of a statistical experiment with values of a random variable (e.g. X). We can then think in terms of the probability distribution of the random variable.
* The mean (expected value) and variance of a discrete random variable are given by E(X) = Σ x p(x) = μ and Var(X) = Σ (x − μ)² p(x) = σ², respectively.
* The binomial distribution is a model for the relative frequency (probability) distribution of numbers of successes in ‘n’ trials of a Bernoulli experiment.
* The binomial distribution can be represented by the probability function
p(x) = C(n, x) p^x (1 − p)^(n−x), x = 0, 1, ..., n
where ‘n’ is the number of trials, ‘x’ the number of successes and ‘p’ the probability of success at each trial.