Performs the Pearson chi-square test for normality to assess whether a given sample comes from a normal distribution. This test divides the data into classes and compares observed frequencies with expected frequencies under the normal distribution.
Usage
pearson.test(x, n.classes = ceiling(2 * (n^(2/5))), adjust = TRUE)
Arguments
- x
A numeric vector of data values. Missing values will be automatically removed.
- n.classes
Number of classes to use for the chi-square test. Defaults to ceiling(2 * (n^(2 / 5))), where n is the sample size, so the number of classes grows with the sample size.
- adjust
Logical indicating whether to adjust the degrees of freedom. If TRUE (default), 2 degrees of freedom are subtracted to account for the estimated parameters (mean and standard deviation).
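The effect of adjust on the reference distribution can be illustrated with base R alone. The sketch below is not the package code: the number of classes k and the statistic P are hypothetical values chosen only for illustration.

```r
k <- 13   # hypothetical number of classes
P <- 12   # hypothetical value of the test statistic

# adjust = TRUE: subtract 2 degrees of freedom for the estimated mean and sd
p_adjusted <- stats::pchisq(P, df = k - 1 - 2, lower.tail = FALSE)

# adjust = FALSE: no subtraction for estimated parameters
p_unadjusted <- stats::pchisq(P, df = k - 1, lower.tail = FALSE)

c(adjusted = p_adjusted, unadjusted = p_unadjusted)
```

For the same statistic, the adjusted p-value is the smaller of the two, because the reference chi-square distribution has fewer degrees of freedom.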
Value
A list with class "htest" containing the following components:
- statistic
The Pearson chi-square test statistic (P)
- p.value
The p-value for the test
- method
The name of the method ("Pearson chi-square normality test")
- data.name
The name of the data used in the test
- n.classes
The number of classes used in the test
- df
The degrees of freedom used for the chi-square distribution
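Because the result has class "htest", its components can be read with `$` in the same way as those of any base R test. The sketch below uses stats::shapiro.test as a stand-in so that it runs without this package. That choice is an assumption made for illustration, and the n.classes and df components described here are not present on the stand-in object.

```r
set.seed(123)
x <- stats::rnorm(100)

# Stand-in "htest" object (assumption: a pearson.test() result is read the same way)
res <- stats::shapiro.test(x)

class(res)       # "htest"
res$statistic    # named numeric test statistic
res$p.value      # p-value of the test
res$method       # character string naming the test
res$data.name    # character string naming the data
```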
Details
The Pearson chi-square normality test is a classical goodness-of-fit test that compares the observed frequency distribution with the expected frequency distribution under the normal distribution assumption.
The test procedure involves:
- Estimating the mean and standard deviation from the sample
- Dividing the range of the normal distribution into n.classes equal-probability intervals
- Counting the number of observations falling into each interval
- Calculating the chi-square statistic: $$P = \sum \frac{(O_i - E_i)^2}{E_i}$$ where \(O_i\) is the observed frequency and \(E_i\) is the expected frequency in interval \(i\)
- Comparing the test statistic to a chi-square distribution with n.classes - 1 - dfd degrees of freedom, where dfd is 2 if adjust = TRUE (accounting for estimated parameters) or 0 otherwise
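The procedure above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: pearson_sketch is a hypothetical helper name, and the class boundaries are obtained by applying the fitted normal CDF to each observation and cutting the unit interval into n.classes equal parts.

```r
# Hypothetical helper for illustration only; not the package's pearson.test()
pearson_sketch <- function(x,
                           n.classes = ceiling(2 * (length(x)^(2 / 5))),
                           adjust = TRUE) {
  x <- x[!is.na(x)]                 # drop missing values
  n <- length(x)

  # Estimate the mean and standard deviation from the sample
  m <- mean(x)
  s <- stats::sd(x)

  # Equal-probability classes: the fitted CDF maps each point to (0, 1),
  # and that interval is split into n.classes cells of equal width
  u <- stats::pnorm(x, mean = m, sd = s)
  cell <- pmin(floor(u * n.classes) + 1, n.classes)

  # Observed counts per class; every class has the same expected count
  observed <- tabulate(cell, nbins = n.classes)
  expected <- n / n.classes

  # Chi-square statistic P = sum((O_i - E_i)^2 / E_i)
  P <- sum((observed - expected)^2 / expected)

  # Degrees of freedom: n.classes - 1 - dfd
  dfd <- if (adjust) 2 else 0
  df <- n.classes - 1 - dfd

  list(statistic = P, df = df, observed = observed,
       p.value = stats::pchisq(P, df = df, lower.tail = FALSE))
}

set.seed(123)
res <- pearson_sketch(stats::rnorm(100))
res$df        # 13 classes - 1 - 2 = 10
res$p.value
```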
The default number of classes follows the recommendation ceiling(2 * (n^(2 / 5))), which adapts to the sample size.
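To show how this default grows with the sample size, the formula can be evaluated directly. The sample sizes below are arbitrary values chosen for illustration, and default_classes is a hypothetical helper name.

```r
# Hypothetical helper wrapping the default formula from this page
default_classes <- function(n) ceiling(2 * (n^(2 / 5)))

n <- c(50, 100, 1000)
default_classes(n)   # 10 13 32
```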
References
This implementation is based on the nortest package:
Pearson, K. (1900). On the Criterion that a Given System of Deviations from the Probable in the Case of a Correlated System of Variables is such that it can be Reasonably Supposed to have Arisen from Random Sampling. Philosophical Magazine, 50(302), 157-175.
Moore, D.S. (1986). Tests of the chi-squared type. In: D'Agostino, R.B. and Stephens, M.A. (eds.), Goodness-of-Fit Techniques, Marcel Dekker, New York.
See also
chisq.test for the general chi-square tests of independence and goodness of fit.
ad.test for the Anderson-Darling normality test.
cvm.test for the Cramér-von Mises normality test.
shapiro.test for the Shapiro-Wilk normality test.
Other normality_test:
ad.test(),
cvm.test(),
dagostino.test(),
jb.test.modified(),
sf.test()
Examples
if (FALSE) { # \dontrun{
# Test a sample from normal distribution
set.seed(123)
normal_data <- stats::rnorm(100)
pearson.test(normal_data)
# Test with custom number of classes
pearson.test(normal_data, n.classes = 10)
# Test without degrees of freedom adjustment
pearson.test(normal_data, adjust = FALSE)
# Test a sample from non-normal distribution
exponential_data <- stats::rexp(50)
pearson.test(exponential_data)
} # }