
Anderson-Darling Test

Last Updated : 18 Jun, 2025

The Anderson-Darling test is a statistical method used to check whether a dataset follows a specific distribution, such as the normal distribution. It measures how well the data fits the chosen distribution by comparing the actual data to what would be expected if the data perfectly followed that distribution. The test works by calculating a number (called the test statistic) that shows the difference between the two. A smaller value means the data fits well, while a larger value suggests it does not.

The two hypotheses for the test are:

  • Null hypothesis (H₀): The data follows the specified distribution.
  • Alternative hypothesis (H₁): The data does not follow the specified distribution.

Anderson-Darling Statistic Formula

A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \left[ \ln F(X_{(i)}) + \ln \left( 1 - F(X_{(n+1-i)}) \right) \right]

Where:

  • n: Sample size (total number of observations).
  • X(i): Ordered sample values (sorted data points in ascending order).
  • F: CDF of the specified distribution (theoretical cumulative distribution function).
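The formula above translates into a few lines of NumPy almost verbatim. Below is a minimal illustrative sketch for the normal case: it standardizes the sorted data using the sample mean and the (n-1)-denominator standard deviation (the convention scipy.stats.anderson follows for dist='norm'), evaluates F at the ordered values, and sums the terms. The function name anderson_darling_normal is our own, not a SciPy API.

Python
import numpy as np
from scipy.stats import norm, anderson

def anderson_darling_normal(data):
    # A² against a normal distribution whose mean and standard deviation
    # are estimated from the sample itself (illustrative sketch).
    x = np.sort(np.asarray(data))
    n = len(x)
    z = (x - x.mean()) / x.std(ddof=1)        # standardize with the fitted parameters
    cdf = norm.cdf(z)                         # F(X(i)) under the fitted normal
    i = np.arange(1, n + 1)
    # Direct translation of the formula above
    return -n - np.sum((2 * i - 1) * (np.log(cdf) + np.log(1 - cdf[::-1]))) / n

rng = np.random.default_rng(0)
sample = rng.normal(loc=5, scale=2, size=200)
print(anderson_darling_normal(sample))            # manual A²
print(anderson(sample, dist='norm').statistic)    # SciPy's A² (should agree closely)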

Performing Anderson-Darling Test in Python

We first generate a random dataset from a normal distribution with a given mean and standard deviation. The anderson() function from scipy.stats is then used to test the data against the normal distribution. After computing the test statistic, we compare it with the critical values corresponding to different significance levels and print a formatted table showing whether the null hypothesis is rejected at each level, making the results easy to interpret.

Python
import numpy as np
from scipy.stats import anderson

# Simulate a sample from a normal distribution with a known mean and std dev
sample_size = 200
mean = 5
std_dev = 2
data = np.random.normal(loc=mean, scale=std_dev, size=sample_size)

# Run the Anderson-Darling test against the normal distribution
ad_result = anderson(data, dist='norm')

print(f"Anderson-Darling Statistic (A²): {ad_result.statistic:.4f}")

# Compare the statistic with the critical value at each significance level
print("\nSignificance Level (%) | Critical Value | Test Decision")
print("-" * 55)
for significance, critical_value in zip(ad_result.significance_level, ad_result.critical_values):
    decision = "Reject H₀" if ad_result.statistic > critical_value else "Fail to Reject H₀"
    print(f"{significance:>21}% | {critical_value:>14.4f} | {decision}")

Output:

[Output: the A² statistic followed by a table of significance levels, critical values, and test decisions; exact values vary between runs because the data is generated randomly.]

Interpretation of the Output

After running the code, we receive the Anderson-Darling statistic (A²) followed by a table displaying the significance levels, their corresponding critical values, and the test decisions.

  • The A² statistic tells us how much our sample data deviates from the specified distribution (in this case, the normal distribution).
  • For each significance level (15%, 10%, 5%, 2.5%, 1%), we compare the A² statistic with its corresponding critical value; a small helper that automates this comparison at a single level is sketched after this list.
  • If the A² statistic is less than the critical value, we conclude: "Fail to Reject H₀", which means there is not enough evidence to say that the data is different from the normal distribution.
  • If the A² statistic is greater than the critical value, we conclude: "Reject H₀", indicating that the data likely does not follow the normal distribution.
  • As we move to stricter significance levels (lower percentages), the test becomes more conservative, making it harder to reject the null hypothesis.
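In many workflows we only care about a single significance level, typically 5%. The helper below (the name normality_decision and the default level are our own choices, not part of SciPy) wraps the comparison from the table above into one boolean answer:

Python
import numpy as np
from scipy.stats import anderson

def normality_decision(data, level=5.0):
    # True means H0 (the data is normal) is rejected at the given level (%)
    result = anderson(data, dist='norm')
    # Pick the critical value closest to the requested significance level
    idx = int(np.argmin(np.abs(result.significance_level - level)))
    return result.statistic > result.critical_values[idx]

rng = np.random.default_rng(42)
print(normality_decision(rng.normal(size=300)))       # usually False: normal data
print(normality_decision(rng.exponential(size=300)))  # usually True: heavily skewed data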

When to Use the Anderson-Darling Test

We use the Anderson-Darling (A-D) test when we want to check how well our sample data fits a specific theoretical distribution. Unlike some other goodness-of-fit tests, the A-D test gives more weight to the tails of the distribution, making it especially useful when we care about extreme values.

We should consider using the Anderson-Darling test in the following situations:

  • When we need to verify if data follows a normal distribution, especially before performing statistical tests that assume normality (e.g., t-tests, ANOVA).
  • When we want to test goodness-of-fit for other continuous distributions such as exponential, logistic, or Gumbel (a short sketch follows this list).
  • When we want a test that is more sensitive to deviations in the tails compared to tests like the Kolmogorov-Smirnov Test.
  • When working with small to medium-sized samples, as the A-D test tends to perform well across different sample sizes.
  • When distributional assumptions are important for model building, hypothesis testing, or data simulation.
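As noted above, anderson() is not limited to normality: passing dist='expon', dist='logistic', or dist='gumbel' tests the same pair of hypotheses against those distributions. The sketch below (the simulated waiting-time data and the 5% cut-off are illustrative assumptions) checks one exponential sample against both an exponential and a normal fit; with data like this the exponential fit is usually not rejected while the normal fit is.

Python
import numpy as np
from scipy.stats import anderson

rng = np.random.default_rng(7)
waiting_times = rng.exponential(scale=3.0, size=250)   # hypothetical skewed data

for dist in ('expon', 'norm'):
    res = anderson(waiting_times, dist=dist)
    # Compare against the critical value closest to the 5% level
    idx = int(np.argmin(np.abs(res.significance_level - 5.0)))
    decision = "Reject H0" if res.statistic > res.critical_values[idx] else "Fail to reject H0"
    print(f"{dist:>6}: A² = {res.statistic:.3f} -> {decision} at the 5% level")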

Limitations of the Anderson-Darling Test

  • The test assumes that the distribution parameters are known or accurately estimated; incorrect parameter estimation may affect the results.
  • It is primarily designed for continuous distributions and may not be appropriate for discrete data.
  • In very large sample sizes, the test can become too sensitive and may detect small, practically insignificant deviations from the distribution.

Related Articles:

How to Conduct an Anderson-Darling Test in R

Kolmogorov-Smirnov Test (KS Test)
