Pearson Chi-Square Normality Test

Performs the Pearson chi-square test for normality to assess whether a given sample comes from a normal distribution. This test divides the data into classes and compares observed frequencies with expected frequencies under the normal distribution.

Usage

pearson.test(x, n.classes = ceiling(2 * (n^(2/5))), adjust = TRUE)

Arguments

x: A numeric vector of data values. Missing values will be automatically removed.
n.classes: Number of classes to use for the chi-square test. Defaults to ceiling(2 * (n^(2 / 5))), where n is the sample size, so the number of classes grows slowly with the sample size (for example, 13 classes for n = 100).
adjust: Logical indicating whether to adjust the degrees of freedom. If TRUE (default), 2 degrees of freedom are subtracted to account for estimated parameters (mean and standard deviation).
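
To illustrate how adjust changes the reference distribution, the sketch below compares the two p-values for one fixed statistic. The statistic value 12 is hypothetical and chosen only for illustration; the class count 13 is the default for n = 100.

```r
n_classes <- 13   # default class count for n = 100
P <- 12           # hypothetical statistic value, for illustration only

df_adjusted   <- n_classes - 1 - 2   # adjust = TRUE: mean and sd estimated
df_unadjusted <- n_classes - 1       # adjust = FALSE

# Upper-tail probabilities of the chi-square distribution
p_adjusted   <- stats::pchisq(P, df = df_adjusted, lower.tail = FALSE)
p_unadjusted <- stats::pchisq(P, df = df_unadjusted, lower.tail = FALSE)

c(adjusted = p_adjusted, unadjusted = p_unadjusted)
```

For the same statistic, the adjusted p-value is the smaller of the two, because fewer degrees of freedom shift the reference distribution toward smaller values.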

Value

A list with class "htest" containing the following components:

statistic - The Pearson chi-square test statistic (P)
p.value - The p-value for the test
method - The name of the method ("Pearson chi-square normality test")
data.name - The name of the data used in the test
n.classes - The number of classes used in the test
df - The degrees of freedom used for the chi-square distribution

Details

The Pearson chi-square normality test is a classical goodness-of-fit test that compares the observed frequency distribution with the expected frequency distribution under the normal distribution assumption.

The test procedure involves:

1. Estimating the mean and standard deviation from the sample
2. Dividing the range of the fitted normal distribution into n.classes equal-probability intervals
3. Counting the number of observations falling into each interval
4. Calculating the chi-square statistic: $$P = \sum_i \frac{(O_i - E_i)^2}{E_i}$$ where $O_i$ is the observed frequency and $E_i$ is the expected frequency in class $i$
5. Comparing the test statistic to a chi-square distribution with n.classes - 1 - dfd degrees of freedom, where dfd is 2 if adjust = TRUE (accounting for the two estimated parameters) or 0 otherwise
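
The steps above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the package's source: the name pearson_sketch and its argument n_classes are hypothetical, and the returned list is a plain list rather than an "htest" object.

```r
# Hypothetical helper; illustrates the procedure, not the exported function.
pearson_sketch <- function(x, n_classes = NULL, adjust = TRUE) {
  x <- x[!is.na(x)]                       # drop missing values
  n <- length(x)
  if (is.null(n_classes)) n_classes <- ceiling(2 * n^(2 / 5))

  # Step 1: estimate mean and standard deviation from the sample
  m <- mean(x)
  s <- stats::sd(x)

  # Steps 2-3: the fitted normal CDF maps each value into (0, 1); cutting
  # (0, 1) into n_classes equal pieces gives equal-probability classes
  u <- stats::pnorm(x, mean = m, sd = s)
  class_id <- pmin(floor(u * n_classes) + 1, n_classes)
  observed <- tabulate(class_id, nbins = n_classes)

  # Step 4: every class has the same expected count, n / n_classes
  expected <- rep(n / n_classes, n_classes)
  P <- sum((observed - expected)^2 / expected)

  # Step 5: upper-tail p-value from the chi-square distribution
  dfd <- if (adjust) 2 else 0
  df <- n_classes - 1 - dfd
  p_value <- stats::pchisq(P, df = df, lower.tail = FALSE)

  list(statistic = P, p.value = p_value, n.classes = n_classes, df = df)
}

set.seed(123)
res <- pearson_sketch(stats::rnorm(100))
res$n.classes   # 13 classes for n = 100
res$df          # 13 - 1 - 2 = 10
```

Because the classes have equal probability under the fitted normal distribution, the expected count is identical in every class, which keeps the statistic simple to compute.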

The default number of classes, ceiling(2 * (n^(2 / 5))), follows the recommendation of Moore (1986) and grows slowly with the sample size.
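
As a quick check of how the default scales, the following sketch evaluates the formula for a few sample sizes. The helper name default_classes is hypothetical and exists only for this illustration.

```r
# Hypothetical helper wrapping the default formula
default_classes <- function(n) ceiling(2 * n^(2 / 5))

sizes <- c(20, 50, 100, 1000)
classes <- default_classes(sizes)
data.frame(n = sizes, n.classes = classes)
# n.classes is 7, 10, 13 and 32 for these sample sizes
```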

References

This implementation is based on the nortest package. The underlying test is described in:

Pearson, K. (1900). On the Criterion that a Given System of Deviations from the Probable in the Case of a Correlated System of Variables is such that it can be Reasonably Supposed to have Arisen from Random Sampling. Philosophical Magazine, 50(302), 157-175.

Moore, D.S. (1986). Tests of the chi-squared type. In: D'Agostino, R.B. and Stephens, M.A. (eds.), Goodness-of-Fit Techniques, Marcel Dekker, New York.

Examples

if (FALSE) { # \dontrun{
# Test a sample from normal distribution
set.seed(123)
normal_data <- stats::rnorm(100)
pearson.test(normal_data)

# Test with custom number of classes
pearson.test(normal_data, n.classes = 10)

# Test without degrees of freedom adjustment
pearson.test(normal_data, adjust = FALSE)

# Test a sample from non-normal distribution
exponential_data <- stats::rexp(50)
pearson.test(exponential_data)
} # }