The ASTA team
The \(p\)-value is the probability, under the assumption that \(H_0\) is true, of observing a value of \(T\) at least as extreme as \(t_{obs}\) if we were to repeat the experiment.
“Extremity” is measured relative to the alternative hypothesis; a value is considered extreme if it is “far from” \(H_0\) and “closer to” \(H_a\).
If the \(p\)-value is small, then a value at least as extreme as \(t_{obs}\) would rarely be observed if \(H_0\) were true. The data are thus hard to reconcile with \(H_0\) and we put more support in \(H_a\), so:
\[ \textbf{The smaller the $p$-value, the less we trust $H_0$.} \]
## np.float64(0.0775772517893365)
The book also discusses one-sided \(t\)-tests for the mean, but we will not use those in the course.
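To make the definition of the \(p\)-value concrete, here is a minimal sketch of a two-sided \(p\)-value computed directly from the definition. It assumes that the test statistic is approximately standard normal under \(H_0\); the value `t_obs = 1.5` is hypothetical and chosen only for illustration.

```python
from scipy.stats import norm

# Hypothetical observed test statistic (an assumption, not taken from the notes)
t_obs = 1.5

# Two-sided p-value: probability under H0 of a value at least as far
# from 0 as t_obs, assuming T is standard normal when H0 is true
p_two_sided = 2 * norm.sf(abs(t_obs))
print(f"two-sided p-value: {p_two_sided:.4f}")
```

A larger `abs(t_obs)` gives a smaller \(p\)-value, in line with the rule that the smaller the \(p\)-value, the less we trust \(H_0\).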
from statsmodels.stats.proportion import proportions_ztest, proportion_confint
nobs = 1200
count = nobs * 0.52 # number of individuals preferring tax increase
stat, p_value = proportions_ztest(count = count, nobs = nobs, value = 0.5)
ci_low, ci_high = proportion_confint(count, nobs, alpha = 0.05, method = 'normal')
print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f})")
print(f"sample estimate: {count / nobs:.4f}")
print(f"p-value: {p_value:.4f}")
## 95% CI: (0.4917, 0.5483)
## sample estimate: 0.5200
## p-value: 0.1655
## np.float64(0.04005254768213129)
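The \(p\)-value 0.1655 reported for the tax example can be reproduced by hand. The sketch below assumes that the standard error is estimated from the sample proportion, which is the default behaviour of `proportions_ztest` (`prop_var=False`). The variable names `p_hat`, `p0`, `se` and `z` are chosen here for illustration.

```python
from math import sqrt
from scipy.stats import norm

nobs = 1200   # sample size from the example above
p_hat = 0.52  # sample proportion preferring a tax increase
p0 = 0.5      # proportion under H0

# Standard error estimated from the sample proportion
se = sqrt(p_hat * (1 - p_hat) / nobs)

# Observed z statistic and two-sided p-value
z = (p_hat - p0) / se
p_value_manual = 2 * norm.sf(abs(z))
print(f"z = {z:.4f}, p-value = {p_value_manual:.4f}")
```

If the standard error were instead computed under \(H_0\), that is with `p0` in place of `p_hat`, the statistic and the \(p\)-value would differ slightly.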
import pandas as pd
from scipy.stats import binomtest
chile = pd.read_csv("https://asta.math.aau.dk/datasets?file=Chile.txt", sep="\t")
counts = chile['sex'].value_counts()
counts
## sex
## F 1379
## M 1321
## Name: count, dtype: int64
successes = counts['F'] # number of females, treated as "successes"
n = counts.sum()
result = binomtest(successes, n, p = 0.5, alternative='two-sided')
print("Estimated probability of success:", result.statistic)
print("p-value:", result.pvalue)
print("95% CI:", result.proportion_ci(confidence_level = 0.95))
## Estimated probability of success: 0.5107407407407407
## p-value: 0.27265346580284056
## 95% CI: ConfidenceInterval(low=0.49169713495924583, high=0.5297610330103562)
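The exact two-sided \(p\)-value above can also be obtained from the binomial distribution directly. The following is a sketch that relies on the symmetry of the binomial distribution when \(p = 0.5\): the two tails are then equally large, so it is enough to double the upper tail. For other null values this shortcut is not valid in general. The counts are the ones shown above, and the name `p_exact` is chosen for illustration.

```python
from scipy.stats import binom, binomtest

successes = 1379  # number of females in the Chile data
n = 2700          # total number of respondents

# Upper tail P(X >= successes) under H0: p = 0.5
upper_tail = binom.sf(successes - 1, n, 0.5)

# By symmetry around n / 2 the lower tail is equally large
p_exact = min(1.0, 2 * upper_tail)
print("exact two-sided p-value:", p_exact)

# Cross-check against binomtest
reference = binomtest(successes, n, p = 0.5, alternative = 'two-sided').pvalue
print("binomtest p-value:", reference)
```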
from statsmodels.stats.proportion import proportions_ztest, proportion_confint
stat, pval = proportions_ztest(count = successes, nobs = n, value=0.5, alternative = 'two-sided')
ci_low, ci_high = proportion_confint(count = successes, nobs = n, alpha = 0.05, method = 'normal')
print("Z statistic:", stat)
print("p-value:", pval)
print(f"95% CI: ({ci_low}, {ci_high})")
## Z statistic: 1.1164681495304731
## p-value: 0.26422179636401866
## 95% CI: (0.49188533046505395, 0.5295961510164274)
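The interval from `proportion_confint` with `method = 'normal'` is the usual normal-approximation (Wald) interval, \(\hat{p} \pm z_{0.975}\sqrt{\hat{p}(1-\hat{p})/n}\). A minimal sketch of the same computation by hand, using the counts from the Chile data above (the names `p_hat`, `se` and `z_crit` are chosen for illustration):

```python
from math import sqrt
from scipy.stats import norm

successes = 1379  # number of females in the Chile data
n = 2700          # total number of respondents

p_hat = successes / n                # sample proportion
se = sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
z_crit = norm.ppf(0.975)             # critical value for a 95% interval

wald_low = p_hat - z_crit * se
wald_high = p_hat + z_crit * se
print(f"95% CI: ({wald_low:.4f}, {wald_high:.4f})")
```

With a sample this large, the normal approximation and the exact interval from `binomtest` agree to about three decimals.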