Chapter 3 A Basic Introduction to Statistical Inference
Key Definitions:
A population is the complete set of all individuals, items, or data of interest in a particular study. This is the entire group about which we want to draw conclusions.
A sample is a subset of the population that is actually observed or measured. We use samples that are randomly selected to make inferences about populations.
A parameter is a numerical characteristic that describes some aspect of a population (e.g., population proportion \(p\), population mean \(\mu\)).
A statistic is a numerical value calculated from sample data (e.g., sample proportion \(\hat{p}\), sample mean \(\bar{x}\)) used to estimate a population parameter.
Statistical inference is the process of constructing confidence intervals or testing hypotheses about a parameter.
3.1 The Two Major Tasks in Statistics
3.1.1 Estimating a Parameter
Involves using sample statistics to:
- Calculate point estimates (single best guess)
- Construct confidence intervals (range of plausible values)
A level \(1-\alpha\) confidence interval for a population proportion is given by: \[\hat{p} \pm z\cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\] where \(\hat{p}\) is the sample proportion, \(n\) is the sample size, and \(z\) is the critical value dependent on \(\alpha\). If \(\alpha=0.05\), \(z=1.96\).
3.1.2 Testing Hypotheses About a Parameter
Involves assessing evidence against a null hypothesis (\(H_0\)) using:
Test statistics
P-values (probability of observed or more extreme results if \(H_0\) is true)
When testing a population proportion \(p\), the null hypothesis is always written as: \[H_0: p = p_0\] and the alternative hypothesis can take one of the following three forms:
\[H_a: p < p_0\]
\[H_a: p > p_0\]
\[H_a: p \neq p_0\]
These are called the left-sided, right-sided, and two-sided alternatives, respectively.
Note: Sometimes, \(H_a\) might be written as \(H_1\).
3.4 Examples of Constructing a Confidence Interval for a Single Population Proportion
A confidence interval provides a range of plausible values for the true population proportion based on sample data. The general form is:
\[
\hat{p} \pm z \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
\]
Where:
- \(\hat{p}\) = sample proportion
- \(z\) = critical value from the standard normal distribution: \(z = 1.96\) for the 95% confidence level and \(z = 1.645\) for the 90% confidence level
- \(n\) = sample size
- The part \(\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\) is called the standard error of the sample proportion \(\hat{p}\), usually denoted \(se\)
- The part after \(\pm\) is called the margin of error (\(m\)), which is the product of the critical value and the standard error
3.4.1 Example 1: Political Poll
Scenario: In a survey of 500 voters, 280 support Candidate A. Construct a 95% confidence interval for the true proportion of supporters.
prop.test(x=280, n=500, correct =FALSE)
1-sample proportions test without continuity correction
data: 280 out of 500, null probability 0.5
X-squared = 7.2, df = 1, p-value = 0.00729
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.5161969 0.6028882
sample estimates:
p
0.56
The R output shows that the 95% confidence interval for the population proportion is between 0.5162 and 0.6029.
Note: ignore the p-value output, since it is for testing whether the population proportion is 0.5, and thus is irrelevant.
If we do it by hand using the formula in section 3.4, we have \(n = 500\), \(\hat{p}=280/500=0.56\), \(z=1.96\), and \(se=0.0222\) (try to keep 4 decimal places), so \(m=(1.96)\cdot (0.0222)=0.0435\). The 95% confidence interval therefore runs from \(0.56-0.0435\) to \(0.56+0.0435\), i.e., from 0.5165 to 0.6035. The results are almost the same as those of R. The small difference arises because prop.test reports a score (Wilson) interval rather than the hand formula above; the two nearly coincide for large samples with \(\hat{p}\) near 0.5.
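As a cross-check, the hand calculation can be reproduced programmatically. Below is a minimal Python sketch of the same formula (the helper name `prop_ci` is ours, not part of any library):

```python
import math

def prop_ci(x, n, z=1.96):
    """Confidence interval p_hat +/- z * se for a single proportion."""
    p_hat = x / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error
    m = z * se                               # margin of error
    return p_hat - m, p_hat + m

lo, hi = prop_ci(280, 500)                   # 280 supporters out of 500
print(round(lo, 4), round(hi, 4))            # 0.5165 0.6035
```

This reproduces the by-hand interval exactly; as noted above, prop.test's interval differs slightly because it uses a score interval.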
Interpretation: We are 95% confident that the true proportion of voters supporting Candidate A is between 51.62% and 60.29%.
3.4.2 Example 2: Quality Control
Scenario: A factory tests 200 products and finds 12 defective. Construct a 90% CI for the defect rate.
prop.test(x=12, n=200, conf.level =0.90, correct =FALSE)
1-sample proportions test without continuity correction
data: 12 out of 200, null probability 0.5
X-squared = 154.88, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
90 percent confidence interval:
0.03781444 0.09393107
sample estimates:
p
0.06
Interpretation: We are 90% confident the true defect proportion is between 3.78% and 9.39%.
3.4.3 Key Considerations:
Sample Size Requirements:
For accurate confidence intervals, both the number of successes \(x\) and the number of failures \(n-x\) should be at least 10.
Margin of Error depends on:
Confidence level (higher level → wider interval)
Sample size (larger n → narrower interval)
Assumptions:
Random sampling: observations are selected at random
Independence: observations are independent of each other
3.5 Examples of Testing Hypotheses about a Single Population Proportion
Hypothesis testing evaluates whether sample data provides sufficient evidence to reject a claim about a population proportion.
The general procedure:
State hypotheses:
Null (\(H_0: p = p_0\))
Alternative (\(H_1: p < p_0\) or \(H_1: p > p_0\) or \(H_1: p \ne p_0\) )
Calculate test statistic: \[
z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}
\]
Determine p-value based on alternative hypothesis. We will use R code to find the p-value, as the next section shows.
Compare the p-value to the significance level \(\alpha\) (typically 0.05). If the p-value is no greater than the significance level, reject the null hypothesis; otherwise, fail to reject it.
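The procedure above can be sketched in code. Here is a small Python version (the function names `phi` and `prop_z_test` are ours; the standard normal CDF is built from the error function):

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def prop_z_test(x, n, p0, alternative="two.sided"):
    """Large-sample z test of H0: p = p0 for a single proportion."""
    p_hat = x / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)  # test statistic
    if alternative == "less":        # left-sided alternative
        p_value = phi(z)
    elif alternative == "greater":   # right-sided alternative
        p_value = 1 - phi(z)
    else:                            # two-sided alternative
        p_value = 2 * (1 - phi(abs(z)))
    return z, p_value

z, p = prop_z_test(36, 150, 0.30)    # 36 of 150, H0: p = 0.30
print(round(z, 3), round(p, 4))      # -1.604 0.1088
```

Note that prop.test with correct = FALSE reports X-squared, which for a single proportion equals \(z^2\) (here \((-1.604)^2 \approx 2.571\)), so the p-values agree.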
3.5.1 Example 1: Two-Sided Test (Market Research)
Scenario: A company claims 30% of customers prefer their product. In a survey of 150 customers, 36 prefer it. Test if the true proportion differs from 30% (\(\alpha = 0.05\)).
prop.test(x=36, n=150, p =0.30, alternative ="two.sided", correct =FALSE)
1-sample proportions test without continuity correction
data: 36 out of 150, null probability 0.3
X-squared = 2.5714, df = 1, p-value = 0.1088
alternative hypothesis: true p is not equal to 0.3
95 percent confidence interval:
0.1786931 0.3142914
sample estimates:
p
0.24
Results:
Conclusion: Since p-value 0.1088 > 0.05, we fail to reject \(H_0\). There is insufficient evidence to conclude that the true proportion differs from 30%.
Note: In this chapter, always use the R function prop.test with correct = FALSE, as in the code above.
3.5.2 Example 2: Right-Sided Test (Quality Control)
Scenario: A factory claims at most 5% of products are defective. In a batch of 300, 22 are defective. Test if the defect rate exceeds the claim (\(\alpha = 0.05\)).
prop.test(x=22, n=300, p =0.05, alternative ="greater", correct =FALSE)
1-sample proportions test without continuity correction
data: 22 out of 300, null probability 0.05
X-squared = 3.4386, df = 1, p-value = 0.03184
alternative hypothesis: true p is greater than 0.05
95 percent confidence interval:
0.05220849 1.00000000
sample estimates:
p
0.07333333
Results:
Conclusion: Since p-value 0.0318 < 0.05, we reject \(H_0\). There is significant evidence that the defect rate exceeds 5%.
3.5.3 Example 3: Left-Sided Test (Quality Control)
Scenario: A manufacturer claims that more than 95% of its parts are defect-free. A sample of 50 parts yields 46 defect-free parts. Test if the defect-free rate is below the claim (\(\alpha = 0.05\)).
prop.test(x=46, n=50, p =0.95, alternative ="less", correct =FALSE)
1-sample proportions test without continuity correction
data: 46 out of 50, null probability 0.95
X-squared = 0.94737, df = 1, p-value = 0.1652
alternative hypothesis: true p is less than 0.95
95 percent confidence interval:
0.000000 0.963578
sample estimates:
p
0.92
Results:
Conclusion: Since p-value 0.1652 > 0.05, we fail to reject \(H_0\). There is NOT significant evidence that the defect-free rate is below 95%.
3.5.4 Key Considerations:
Assumptions:
Random sampling
Type I vs. Type II errors:
Type I error: Rejecting \(H_0\) when it is actually true.
Type II error: Failing to reject \(H_0\) when it is actually false.
If \(H_0\) is rejected (p-value \(\le\)\(\alpha\)), a type I error might have been committed.
If \(H_0\) is not rejected (p-value > \(\alpha\)), a type II error might have been committed.
Power of a hypothesis test:
Statistical power is the probability that a test correctly detects a real effect, i.e., rejects \(H_0\) when it is false. High power (e.g., 80%) means the study is likely to detect the effect; low power means the effect may be missed. Power depends on the sample size, the effect size, and the significance level: larger samples and stronger effects both increase power.
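To illustrate how power grows with sample size, here is a rough Python calculation for a right-sided test of \(H_0: p = 0.5\) when the true proportion is 0.6 (the scenario numbers and the function name are ours, purely illustrative; normal approximation throughout):

```python
from statistics import NormalDist

def power_right_sided(p0, p_true, n, alpha=0.05):
    """Approximate power of the right-sided z test of H0: p = p0
    when the true proportion is p_true (normal approximation)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha)  # e.g. about 1.645 for alpha = 0.05
    # H0 is rejected when p_hat exceeds this cutoff:
    cutoff = p0 + z_alpha * (p0 * (1 - p0) / n) ** 0.5
    # chance that p_hat lands above the cutoff under the true proportion:
    se_true = (p_true * (1 - p_true) / n) ** 0.5
    return 1 - nd.cdf((cutoff - p_true) / se_true)

print(round(power_right_sided(0.5, 0.6, 50), 2))   # about 0.41
print(round(power_right_sided(0.5, 0.6, 200), 2))  # about 0.89
```

With the same true effect, quadrupling the sample size roughly doubles the power here, which is why underpowered studies so often miss real effects.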
3.5.5 Important Notes
Always report exact p-values, not just “p < 0.05”
Consider practical significance in addition to statistical significance
3.6 Sample Size Determination
When estimating a population proportion with a 95% confidence interval, the sample size \(n\) needed to guarantee a margin of error of at most \(m\) is approximately \((\frac{1}{m})^2\). This follows from the worst case \(\hat{p}=0.5\), which maximizes \(\hat{p}(1-\hat{p})\), together with \(z=1.96\approx 2\): \(m = z\sqrt{\hat{p}(1-\hat{p})/n} \le 1/\sqrt{n}\).
Example: The sample size needed to achieve margin of error 0.12 at confidence level 95% is \((\frac{1}{0.12})^2=69.44\), or 70 (always round up).
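The rule is a one-liner in code. A Python sketch (the function name is ours; it hardwires the conservative 95%-level approximation \(n \approx (1/m)^2\)):

```python
import math

def sample_size_95(m):
    """Conservative sample size for a 95% CI with margin of error <= m.
    Worst case p = 0.5 maximizes p(1-p), and z = 1.96 is rounded to 2,
    giving n ~ (1/m)^2, rounded up to the next whole number."""
    return math.ceil((1 / m) ** 2)

print(sample_size_95(0.12))  # 70
```

Halving the desired margin of error quadruples the required sample size, which is the main practical cost of precision.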
In practice, cost and time constrain how large a sample can be.
3.7 Exercises: Statistical Inference for a Single Proportion
Definitions
Match each term to its correct definition:
| Term       | Definition                                     |
|------------|------------------------------------------------|
| Population | A) A numerical characteristic of a sample      |
| Sample     | B) The complete set of items of interest       |
| Parameter  | C) A subset of the population that is observed |
| Statistic  | D) A numerical characteristic of a population  |
True or False:
A 95% confidence interval means there’s a 95% probability the true parameter is in the interval
For the same data, a 99% CI will be wider than a 95% CI
The p-value is the probability that the null hypothesis is true
When p-value < α, we reject the null hypothesis
Calculation Practice
In a survey of 400 students, 160 reported using public transportation daily.
Calculate the sample proportion
Construct a 90% confidence interval manually
Verify using R’s prop.test()
# Your R code here
Interpretation
For the above CI (0.360, 0.440), explain what “90% confident” means in context.
Test Setup
A medication claims to be effective for 70% of patients. In a trial of 50 patients, 32 found it effective.
Formulate appropriate null and alternative hypotheses
Explain whether this should be one-tailed or two-tailed
R Analysis
Conduct the test in R at α = 0.05 and interpret:
# Your test code here
Case Study
A website claims its conversion rate is 8%. In a sample of 200 visitors, 22 converted.
Construct a 95% CI for the true conversion rate
Test whether the actual rate differs from 8%
Discuss any discrepancies between CI and test results
Error Analysis
If you reject H₀ when α = 0.05:
What type of error might you have made?
How could you reduce the chance of this error?
Sample Size Impact
Holding everything else constant, what happens to: