Performs the Pearson chi-square test for normality to assess whether a given sample comes from a normal distribution. This test divides the data into classes and compares observed frequencies with expected frequencies under the normal distribution.

Usage

pearson.test(x, n.classes = ceiling(2 * (n^(2/5))), adjust = TRUE)

Arguments

x

A numeric vector of data values. Missing values are removed before the test is computed.

n.classes

Number of classes to use for the chi-square test. Defaults to ceiling(2 * (n^(2 / 5))), where n is the sample size, so the number of classes grows slowly with n (for example, 13 classes when n = 100).

adjust

Logical indicating whether to adjust the degrees of freedom. If TRUE (default), 2 degrees of freedom are subtracted to account for estimated parameters (mean and standard deviation).
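To illustrate what adjust changes, the following minimal sketch evaluates the chi-square tail probability for one statistic under both settings. The values of P and n.classes are made up for illustration and are not output from pearson.test(); only stats::pchisq() is used.

```r
# Illustrative inputs only: P and n.classes are made-up values,
# not results returned by pearson.test().
P <- 15
n.classes <- 13

# Degrees of freedom under each setting of `adjust`
df_adjusted   <- n.classes - 1 - 2   # adjust = TRUE: subtract 2 for mean and sd
df_unadjusted <- n.classes - 1       # adjust = FALSE

# Upper-tail chi-square probabilities for the same statistic
p_adjusted   <- stats::pchisq(P, df = df_adjusted, lower.tail = FALSE)
p_unadjusted <- stats::pchisq(P, df = df_unadjusted, lower.tail = FALSE)

c(adjusted = p_adjusted, unadjusted = p_unadjusted)
```

For the same statistic, the adjusted setting uses fewer degrees of freedom and therefore gives the smaller p-value.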

Value

A list with class "htest" containing the following components:

  • statistic - The Pearson chi-square test statistic (P)

  • p.value - The p-value for the test

  • method - The name of the method ("Pearson chi-square normality test")

  • data.name - The name of the data used in the test

  • n.classes - The number of classes used in the test

  • df - The degrees of freedom used for the chi-square distribution

Details

The Pearson chi-square normality test is a classical goodness-of-fit test that compares the observed frequency distribution with the expected frequency distribution under the normal distribution assumption.

The test procedure involves:

  1. Estimating the mean and standard deviation from the sample

  2. Dividing the range of the normal distribution into n.classes equal-probability intervals

  3. Counting the number of observations falling into each interval

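  4. Calculating the chi-square statistic: $$P = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$ where \(k\) is n.classes, \(O_i\) is the observed frequency in class \(i\), and \(E_i = n / k\) is the expected frequency, which is the same for every class because the classes have equal probability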

  5. Comparing the test statistic to a chi-square distribution with n.classes - 1 - dfd degrees of freedom, where dfd is 2 if adjust = TRUE (accounting for estimated parameters) or 0 otherwise
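The five steps above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the package's actual source: pearson_sketch is a hypothetical helper name, and it returns a plain list rather than an "htest" object.

```r
# Hypothetical helper (not part of the package): a sketch of steps 1-5.
pearson_sketch <- function(x, n.classes = NULL, adjust = TRUE) {
  x <- x[!is.na(x)]                       # drop missing values
  n <- length(x)
  if (is.null(n.classes)) {
    n.classes <- ceiling(2 * (n^(2 / 5))) # default number of classes
  }

  # Step 1: estimate the mean and standard deviation
  m <- mean(x)
  s <- stats::sd(x)

  # Steps 2-3: the fitted normal CDF maps each observation into (0, 1);
  # cutting (0, 1) into n.classes equal pieces yields equal-probability
  # classes, and tabulate() counts the observations in each one
  class_id <- floor(1 + n.classes * stats::pnorm(x, mean = m, sd = s))
  observed <- tabulate(class_id, nbins = n.classes)

  # Step 4: every class has the same expected count n / n.classes
  expected <- rep(n / n.classes, n.classes)
  P <- sum((observed - expected)^2 / expected)

  # Step 5: compare with a chi-square distribution
  dfd <- if (adjust) 2 else 0
  df <- n.classes - 1 - dfd
  p.value <- stats::pchisq(P, df = df, lower.tail = FALSE)

  list(statistic = P, df = df, p.value = p.value, observed = observed)
}

set.seed(1)
res <- pearson_sketch(stats::rnorm(100))
res$df        # 13 classes by default, so 13 - 1 - 2 = 10
```

Using the fitted CDF to assign classes avoids computing the class boundaries explicitly; the boundaries are implied by the quantiles of the fitted normal distribution.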

The default number of classes, ceiling(2 * (n^(2 / 5))), follows the recommendation of Moore (1986) and increases with the sample size, so larger samples are split into more classes.
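As a quick check of how the default behaves, the sketch below evaluates the formula for a few sample sizes. default_classes is a hypothetical helper name introduced only for this illustration.

```r
# Hypothetical helper: the default class-count formula as a function of n.
default_classes <- function(n) ceiling(2 * (n^(2 / 5)))

# Arbitrary example sample sizes
sizes <- c(20, 50, 100, 500, 1000)
data.frame(n = sizes, n.classes = default_classes(sizes))

default_classes(100)  # 13
```

Because the exponent is 2/5, a tenfold increase in the sample size only multiplies the number of classes by about 2.5.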

References

This implementation is based on the nortest package. The test itself is described in:

Pearson, K. (1900). On the Criterion that a Given System of Deviations from the Probable in the Case of a Correlated System of Variables is such that it can be Reasonably Supposed to have Arisen from Random Sampling. Philosophical Magazine, 50(302), 157-175.

Moore, D.S. (1986). Tests of the chi-squared type. In: D'Agostino, R.B. and Stephens, M.A. (eds.), Goodness-of-Fit Techniques, Marcel Dekker, New York.

See also

chisq.test for the general chi-square goodness-of-fit and contingency-table tests. ad.test for the Anderson-Darling normality test. cvm.test for the Cramer-von Mises normality test. shapiro.test for the Shapiro-Wilk normality test.

Other normality_test: ad.test(), cvm.test(), dagostino.test(), jb.test.modified(), sf.test()

Examples

if (FALSE) { # \dontrun{
# Test a sample from normal distribution
set.seed(123)
normal_data <- stats::rnorm(100)
pearson.test(normal_data)

# Test with custom number of classes
pearson.test(normal_data, n.classes = 10)

# Test without degrees of freedom adjustment
pearson.test(normal_data, adjust = FALSE)

# Test a sample from non-normal distribution
exponential_data <- stats::rexp(50)
pearson.test(exponential_data)
} # }