Statistical Significance Tests in Python Using SciPy


Tags: Python, SciPy

In the world of data science and research, statistical significance tests are essential for validating hypotheses, comparing datasets, and drawing conclusions from data.

Python’s SciPy library provides a robust set of tools for performing various significance tests via the scipy.stats module.

In this article, you will learn:

  • What significance tests are

  • The most common types of significance tests

  • How to use scipy.stats for t-tests, chi-squared tests, ANOVA, and more

  • Code examples, tips, and common pitfalls


What Are Significance Tests?

Significance tests help determine whether the observed results in your data could have occurred by random chance. These tests typically return:

  • A test statistic: a numerical value summarizing the difference between groups

  • A p-value: the probability of obtaining results at least as extreme as the observed ones, assuming the null hypothesis is true

If the p-value is less than a chosen significance level (commonly 0.05), the result is considered statistically significant.
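As a quick illustration, assuming a significance level of 0.05 and a hypothetical p-value, the decision rule looks like this:

alpha = 0.05       # chosen significance level (assumption)
p_value = 0.03     # hypothetical p-value returned by one of the tests below

# Compare the p-value to the significance level
if p_value < alpha:
    print("Statistically significant: reject the null hypothesis")
else:
    print("Not statistically significant: fail to reject the null hypothesis")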


Prerequisites

Install SciPy and NumPy if not already installed:

pip install scipy numpy

Import the necessary modules:

from scipy import stats
import numpy as np

1. One-Sample t-Test

Tests if the mean of a single sample is significantly different from a known value.

data = np.array([2.9, 3.0, 2.5, 2.6, 3.2])
result = stats.ttest_1samp(data, popmean=3)
print("t-statistic:", result.statistic)
print("p-value:", result.pvalue)

Use When: Comparing a sample mean to a population mean.


2. Two-Sample (Independent) t-Test

Tests if two independent groups have significantly different means.

group1 = np.array([20, 22, 19, 24])
group2 = np.array([30, 29, 31, 33])

result = stats.ttest_ind(group1, group2)
print("t-statistic:", result.statistic)
print("p-value:", result.pvalue)

Use When: Comparing means from two independent samples.
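Note that ttest_ind() assumes equal variances by default. If that assumption is questionable, SciPy's equal_var=False option runs Welch's t-test instead; here is a minimal sketch reusing group1 and group2 from above:

# Welch's t-test: does not assume equal variances (sketch, reusing the arrays above)
result = stats.ttest_ind(group1, group2, equal_var=False)
print("t-statistic:", result.statistic)
print("p-value:", result.pvalue)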


3. Paired t-Test

Used for related samples (e.g., before and after measurements).

before = np.array([100, 102, 98, 105])
after = np.array([110, 108, 103, 107])

result = stats.ttest_rel(before, after)
print("t-statistic:", result.statistic)
print("p-value:", result.pvalue)

Use When: Comparing means from the same group at different times.


4. Chi-Square Test (Goodness-of-Fit)

Checks if observed frequencies match expected frequencies.

observed = [20, 30, 50]
expected = [25, 25, 50]

chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print("Chi-square:", chi2)
print("p-value:", p)

Use When: You want to test whether the observed distribution of a categorical variable matches an expected distribution.


5. Chi-Square Test of Independence

Used to determine if two categorical variables are related.

# Contingency table
data = np.array([[10, 20], [20, 40]])
chi2, p, dof, expected = stats.chi2_contingency(data)

print("Chi-square:", chi2)
print("p-value:", p)

Use When: You have a contingency table and want to test independence.


6. One-Way ANOVA

Tests whether the means of three or more independent groups differ significantly.

group1 = [23, 21, 19]
group2 = [30, 32, 29]
group3 = [25, 28, 24]

f_stat, p_val = stats.f_oneway(group1, group2, group3)
print("F-statistic:", f_stat)
print("p-value:", p_val)

Use When: Testing differences between more than two group means.


7. Mann-Whitney U Test

A non-parametric test for comparing two independent samples.

x = [14, 15, 16]
y = [20, 21, 22]

stat, p = stats.mannwhitneyu(x, y)
print("U statistic:", stat)
print("p-value:", p)

Use When: Your data is not normally distributed and the two samples are independent.


Interpreting the p-value

p-value          Interpretation
< 0.01           Strong evidence against the null hypothesis
0.01 to 0.05     Moderate evidence against the null hypothesis
> 0.05           Weak evidence; fail to reject the null hypothesis

Always define your null hypothesis and alternative hypothesis before testing.


✅ Tips for Performing Significance Tests

  • Check assumptions: some tests require normality or equal variances (see the sketch after this list).

  • Visualize data: use histograms or box plots to check distributions.

  • Use non-parametric tests when necessary: e.g., Mann-Whitney, Wilcoxon.

  • Report effect size: p-values don't tell you the size of the effect.

  • Beware of multiple testing: use corrections like Bonferroni when running many tests (also shown below).
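A minimal sketch of how the assumption checks and a simple Bonferroni correction might look with scipy.stats (the sample arrays and the number of tests are illustrative assumptions):

from scipy import stats

# Illustrative samples (assumed data)
a = [20, 22, 19, 24, 21]
b = [30, 29, 31, 33, 28]

# Shapiro-Wilk test for normality of each group
print("Shapiro p (a):", stats.shapiro(a).pvalue)
print("Shapiro p (b):", stats.shapiro(b).pvalue)

# Levene's test for equality of variances
print("Levene p:", stats.levene(a, b).pvalue)

# Simple Bonferroni correction: divide alpha by the number of tests performed
alpha = 0.05
num_tests = 3                      # assumed number of comparisons
print("Bonferroni-adjusted alpha:", alpha / num_tests)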

⚠️ Common Pitfalls

  • Blindly trusting p-values: always combine them with domain knowledge and effect size (see the Cohen's d sketch after this list).

  • Not checking test assumptions: validate normality and equal-variance conditions first.

  • Using the wrong test: choose based on the data type, its distribution, and whether the samples are related.

  • Misinterpreting non-significance: lack of significance ≠ no effect.
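As the first pitfall notes, a p-value alone says nothing about how large an effect is. Here is a minimal sketch of Cohen's d for two independent samples (the data and the pooled-standard-deviation formulation are used purely for illustration):

import numpy as np

# Illustrative samples (assumed data)
group1 = np.array([20, 22, 19, 24])
group2 = np.array([30, 29, 31, 33])

# Cohen's d: mean difference divided by the pooled standard deviation
mean_diff = group1.mean() - group2.mean()
pooled_var = ((len(group1) - 1) * group1.var(ddof=1) +
              (len(group2) - 1) * group2.var(ddof=1)) / (len(group1) + len(group2) - 2)
cohens_d = mean_diff / np.sqrt(pooled_var)
print("Cohen's d:", cohens_d)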

Summary Table

Test                         Use Case                                     SciPy Function
One-sample t-test            Compare a sample mean to a population mean   ttest_1samp()
Two-sample t-test            Compare means of two independent samples     ttest_ind()
Paired t-test                Compare means of related samples             ttest_rel()
Chi-square goodness-of-fit   Compare observed vs. expected frequencies    chisquare()
Chi-square independence      Test relationship in a contingency table     chi2_contingency()
One-way ANOVA                Compare means of three or more groups        f_oneway()
Mann-Whitney U               Non-parametric two-sample comparison         mannwhitneyu()

Final Thoughts

Understanding and correctly applying significance tests is critical for statistical analysis and scientific reporting. SciPy's scipy.stats module makes it straightforward to conduct these tests in Python.

Whether you're testing hypotheses in academic research, validating business metrics, or building data-driven applications, SciPy helps you stay statistically sound.