Contrast of frequentist vs. Bayesian approach
Suppose we have a salesperson who has the following track record over a 3-week period:

          Number of sales calls    Number of sales
Week 1              5                     1
Week 2              5                     0
Week 3              5                     5
Total              15                     6
If the goal is to have the sales rate exceed 20%, can we conclude at a 5% level of significance that the goal is being achieved?
The null and alternative hypotheses are:
Ho: p ≤ 20%
Ha: p > 20%
From a frequentist point of view, the p-value is the probability of having at least 6 sales in the 15 trials when p = 20%: P(X ≥ 6) = 1 – P(X ≤ 5) = 1 – 0.939 = 0.061 = 6.1%. Since 6.1% > 5%, we do not reject the null hypothesis; there is not enough evidence to conclude that the goal is being achieved.
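The tail probability above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `binom_upper_tail` is an illustrative choice and does not come from the original text.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# At least 6 sales in 15 calls when the true sales rate is 20%
p_value = binom_upper_tail(6, 15, 0.2)
print(f"p-value = {p_value:.3f}")  # about 0.061
```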
From a Bayesian point of view, the p-value can be viewed as the probability of the null hypothesis being true.
For the prior distribution of p, we can assume that it can uniformly take any value between 0 and 1. Thus, f(p) = 1 for 0 ≤ p ≤ 1.
Since we have 6 successes and 9 failures, the posterior distribution of p is proportional to p^{6}(1 – p)^{9}. This indicates that p follows a Beta distribution with α = 7 and β = 10. Thus, when we solve for P(p ≤ 0.2), we get:
P(p \le 0.2) = \int_{0}^{0.2} \frac{16!}{6!\,9!}\, p^{6}(1 - p)^{9}\, dp
Once you work through the math (repeated integration by parts), P(p ≤ 0.2) is equivalent to the probability of having at least 7 successes in 16 trials given p = 0.2. This works out to 0.0267 = 2.67%. Since 2.67% < 5%, we reject the null hypothesis and conclude the goal is being achieved.
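The equivalence between the Beta posterior probability and the binomial tail can be checked numerically. The sketch below uses only the standard library and integrates the Beta(7, 10) density with a hand-written composite Simpson's rule. The helper names and the step count are assumptions made for illustration.

```python
from math import comb

def beta_pdf_7_10(p):
    """Beta(7, 10) density: 16!/(6! 9!) * p^6 * (1 - p)^9 (16!/(6! 9!) = 16 * C(15, 6))."""
    return 16 * comb(15, 6) * p**6 * (1 - p)**9

def simpson(f, a, b, steps=1000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

posterior_prob = simpson(beta_pdf_7_10, 0.0, 0.2)  # P(p <= 0.2 | data)
tail_prob = binom_upper_tail(7, 16, 0.2)           # at least 7 successes in 16 trials
print(f"{posterior_prob:.4f} {tail_prob:.4f}")     # both about 0.0267
```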
Asymptotic results
Suppose the null and alternative hypotheses are changed to:
Ho: p ≤ 40%
Ha: p > 40%
From the frequentist point of view, the p-value = P(X ≥ 6) = 1 – P(X ≤ 5) = 1 – 0.403 = 0.597 = 59.7%. From the Bayesian point of view, the p-value = P(p ≤ 0.4), which is equivalent to the probability of at least 7 successes in 16 trials given p = 0.4. This works out to 0.4728 = 47.28%.
Now, the sample proportion is \hat{p} = 6/15 = 0.4. Holding \hat{p} fixed at 0.4, we want to find the probability of the null hypothesis being true for various values of the sample size n and the hypothesized proportion p.
n \ p     0.2       0.4       0.5       0.6       0.7       0.8       0.9
15        2.67%     47.28%    77.28%    94.17%    99.29%    99.98%    100%
50        0.04%     48.48%    91.96%    99.78%    100%      100%      100%
100       0%        48.92%    97.7%     100%      100%      100%      100%
500       0%        49.51%    100%      100%      100%      100%      100%
1000      0%        49.66%    100%      100%      100%      100%      100%
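A table of this kind can be generated with a short script. The sketch below assumes the uniform prior used earlier, so the posterior is Beta(x + 1, n – x + 1) and P(p ≤ p_{o}) equals the probability of at least x + 1 successes in n + 1 trials. The function name `prob_null` and the log-space summation (used to avoid overflow for large n) are choices made here for illustration.

```python
from math import exp, lgamma, log

def prob_null(n, x, p0):
    """P(p <= p0 | x successes in n trials) under a uniform prior,
    computed as P(Y >= x + 1) for Y ~ Binomial(n + 1, p0)."""
    m = n + 1
    total = 0.0
    for j in range(x + 1, m + 1):
        # log of C(m, j) * p0^j * (1 - p0)^(m - j)
        log_term = (lgamma(m + 1) - lgamma(j + 1) - lgamma(m - j + 1)
                    + j * log(p0) + (m - j) * log(1 - p0))
        total += exp(log_term)
    return min(total, 1.0)

for n in (15, 50, 100, 500, 1000):
    x = 2 * n // 5  # keep the sample proportion fixed at 0.4
    row = [prob_null(n, x, p0) for p0 in (0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)]
    print(f"{n:5d}", " ".join(f"{100 * v:7.2f}%" for v in row))
```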
For p < 40%, as n increases, P(Ho being true) decreases and eventually reaches a probability of zero for all intents and purposes.
However, if p = 40%, as n increases, P(Ho being true) approaches 50%.
Finally, for p > 40% (the columns from 50% upward), as n increases, P(Ho being true) increases and eventually reaches a probability of 100% for all intents and purposes.
These probabilities are roughly comparable to p-values from the frequentist school. For example, if p = 0.2, n = 100 and \hat{p} = 0.4, the value of the test statistic would be:
Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} = \frac{0.4 - 0.2}{\sqrt{(0.2)(0.8)/100}} = 5
The p-value = P(Z > 5) ≈ 0 for all intents and purposes.
If p = 0.4 and \hat{p} = 0.4, Z = 0 regardless of the sample size and P(Z > 0) = 0.5.
Finally, if p = 0.5, n = 100 and \hat{p} = 0.4, the value of the test statistic would be:
Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} = \frac{0.4 - 0.5}{\sqrt{(0.5)(0.5)/100}} = -2
The p-value = P(Z > –2) = 97.72%.
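The three normal-approximation cases above can be reproduced as follows. This is a minimal sketch using the standard library's `math.erfc` for the normal tail; the function names are illustrative.

```python
from math import erfc, sqrt

def z_statistic(p_hat, p0, n):
    """One-sample z statistic for a proportion under the hypothesized value p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))

for p0 in (0.2, 0.4, 0.5):
    z = z_statistic(0.4, p0, 100)
    print(f"p0 = {p0}: Z = {z:.2f}, P(Z > z) = {upper_tail(z):.4f}")
```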
Jeffreys' noninformative prior with test for p
If Jeffreys' noninformative prior, f(p) ∝ 1/[p(1 – p)], is used, the posterior distribution of p is a Beta distribution with α = x and β = n – x. In the first example with n = 15 and x = 6, the posterior distribution of p is proportional to p^{5}(1 – p)^{8}.
Given the null and alternative hypotheses:
Ho: p ≤ 20%
Ha: p > 20%
P(p ≤ 20%) is equivalent to the probability of having at least 6 successes in 14 trials given p = 0.2. This works out to 0.0439 = 4.39%.
In general, P(Ho being true) = P(p ≤ p_{o}) = 1 – P(X ≤ x – 1 | n – 1, p_{o}), in which p_{o} represents the hypothesized proportion.
If x = n, P(Ho being true) = 1 – P(X ≤ n – 1 | n – 1, p_{o}) = 1 – 1 = 0, regardless of the value of p_{o}.
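The general formula and the degenerate case x = n can be illustrated with a few lines of code. This is a sketch under the Beta(x, n – x) posterior described above; the function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(0, k + 1))

def prob_null_improper(n, x, p0):
    """P(p <= p0) = 1 - P(X <= x - 1 | n - 1, p0) for a Beta(x, n - x) posterior."""
    return 1.0 - binom_cdf(x - 1, n - 1, p0)

# First example: n = 15, x = 6, p0 = 0.2
print(f"{prob_null_improper(15, 6, 0.2):.4f}")  # about 0.0439

# Every call is a sale (x = n): the result is zero up to rounding for any p0
for p0 in (0.2, 0.5, 0.99):
    print(p0, abs(prob_null_improper(15, 15, p0)) < 1e-12)
```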
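Thus, Jeffreys' noninformative prior cannot be used for hypothesis testing. This is also due to the fact that P(p) = 1/[p(1 – p)] is not a proper PDF, as illustrated here:
\int_{0}^{1} \frac{dp}{p(1 - p)} = \int_{0}^{1} \frac{dp}{p} + \int_{0}^{1} \frac{dp}{1 - p}
Let u = 1 – p. Then du = –dp, so the second integral becomes \int_{0}^{1} \frac{du}{u} and
\int_{0}^{1} \frac{dp}{p(1 - p)} = 2 \int_{0}^{1} \frac{dp}{p} = 2[\ln(1) - \ln(0)]
The result is undefined since ln(0) is negative infinity.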
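The divergence can also be seen numerically by integrating over [ε, 1 – ε] and letting ε shrink. The sketch below uses the antiderivative ln(p/(1 – p)); the function name and the chosen values of ε are assumptions for illustration.

```python
from math import log

def truncated_integral(eps):
    """Integral of 1/[p(1 - p)] over [eps, 1 - eps], from the antiderivative ln(p/(1 - p))."""
    return 2 * log((1 - eps) / eps)

# The truncated integral keeps growing as eps shrinks, so no finite
# normalizing constant exists for this prior.
for eps in (1e-2, 1e-4, 1e-8, 1e-16):
    print(f"eps = {eps:g}: integral = {truncated_integral(eps):.2f}")
```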
Reference:
Zellner, Arnold. An Introduction to Bayesian Inference in
Econometrics. New York: John Wiley & Sons, 1970.