https://d5nxst8fruw4z.cloudfront.net/atrk.gif?account=xQyRm1a4KM+2l9

Bayes Test for One Proportion


Home | Academic Articles


Contrast of frequentist vs. Bayesian approach

Suppose we have a salesperson who has the following track record over a 3-week period:

 

Number of sales calls

Number of sales

Week 1

5

1

Week 2

5

0

Week 3

5

5

Total

15

6

 

If the goal is to have the sales rate exceed 20%, can we conclude at a 5% level of significance that the goal is being achieved?

The null and alternative hypotheses are:

Ho: p < 20%

Ha: p > 20%

From a frequentist point of view, the p-value is the probability of having at least 6 sales in the 15 trials. If p = 20%, P(X > 6) = 1 – P(X < 5) = 1 – 0.939 = 0.061 = 6.1%. Since 6.1% > 5%, we do not reject the null hypothesis and conclude that the goal is not being achieved.

From a Bayesian point of view, the p-value can be viewed as the probability of the null hypothesis being true.

For the prior distribution of p, we can assume that it can uniformly take any value between 0 and 1. Thus, f(p) = 1.

Since we have 6 successes and 9 failures, the posterior distribution of p is proportional to p6(1 – p)9. This indicates that p follows a Beta distribution with α = 7 and β = 10. Thus, when we solve for P(p < 0.2), we get:

Once you work through the math, P(p < 0.2) is equivalent to the probability of having at least 7 successes in 16 trials given p = 0.2. This works out to 0.0267 = 2.67%. Since 2.67% < 5%, we reject the null hypothesis and conclude the goal is being achieved.

Asymptotic results

Suppose the null and alternative hypotheses are changed to:

Ho: p < 40%

Ha: p > 40%

From the frequentist point of view, the p-value = P(X > 6) = 1 – P(X < 5) = 1 – 0.403 = 0.597 = 59.7%. From the Bayesian point of view, the p-value = P(p < 0.4) which is equivalent to the probability of at least 7 successes in 16 trials given p = 0.4. This works out to 0.4728 = 47.28%.

Now, the sample proportion ( = 6/15 = 0.4. For various values of n and p, we want to find the probability of the null hypothesis being true.

n | p

0.2

0.4

0.5

0.6

0.7

0.8

0.9

15

2.67%

47.28%

77.28%

94.17%

99.29%

99.98%

100%

50

0.04%

48.48%

91.96%

99.78%

100%

100%

100%

100

0

48.92%

97.7%

100%

100%

100%

100%

500

0

49.51%

100%

100%

100%

100%

100%

1000

0

49.66%

100%

100%

100%

100%

100%

For p < 40%, as n increases, P(Ho being true) decreases and eventually reaches a probability of zero for all intents and purposes.

However, if p = 40%, as n increases, P(Ho being true) approaches 50%.

Finally, for p > 50%, as n increases, P(Ho being true) increases and eventually reaches a probability of 100% for all intents and purposes.

These probabilities are roughly comparable to p-values from the frequentist school. For example, if p = 0.2, n = 100 and = 0.4, the value of the test statistic would be:

The p-value = P(Z > 5) = 0 for all intents and purposes.

If p = 0.4 and = 0.4, Z = 0 regardless of the sample size and P(Z > 0) = 0.5.

Finally, if p = 0.5, n = 100 and = 0.4, the value of the test statistic would be:

The p-value = P(Z > -2) = 97.73%.

Jeffrey’s non-informative prior with test for p

If Jeffrey’s non-informative prior is used, α = x and β = n – x. In the first example with n = 15 and x = 6, the posterior distribution of p is proportional to p5(1 – p)8.

Given the null and alternative hypotheses:

Ho: p < 20%

Ha: p > 20%

P(p < 20%) is equivalent to the probability of having at least 6 successes in 14 trials given p = 0.2. This works out to 0.0439 = 4.39%.

In general, P(Ho being true) = P(p < po) = 1 – P(X < x-1 | n-1, po) in which po represents the hypothesis proportion.

If x = n, P(Ho being true) = 1 – P(X < n-1 | n-1, po) = 1 – 1 = 0.

Thus, Jeffrey’s non-informative prior cannot be used for hypothesis testing. This is also due to the fact that P(p) = 1/[p(1-p)] is not a proper PDF as illustrated here:

Let u = 1 – p. Then du = -dp.

The result is undefined since ln(0) is negative infinity.

Reference:

Zellner, Arnold. An Introduction to Bayesian Inference in Econometrics. New York: John Wiley & Sons, 1970.