Non-Parametric Statistics
Statistical inference often relies on parametric assumptions, specifically that the population from which the sample is drawn follows a known probability distribution, typically the normal distribution, characterized by a set of parameters (e.g., mean $\mu$ and variance $\sigma^2$). Non-parametric statistics, in contrast, provide procedures for inferring properties of populations without relying on restrictive assumptions about the form of the underlying probability distribution.
These methods are essential when sample sizes are small, data are ordinal or nominal, or severe departures from normality are evident. While non-parametric tests are more robust to distributional violations, they generally possess less statistical power compared to their parametric counterparts when the parametric assumptions are actually met.
The Sign Test
The sign test is one of the simplest non-parametric tests, used to assess whether the median of a continuous distribution equals a hypothesized value $M_0$. It is a non-parametric alternative to the one-sample t-test.
Let $X_1, X_2, \dots, X_n$ be a random sample from a continuous distribution with median $M$. We wish to test the null hypothesis $H_0: M = M_0$.
The test statistic $S$ is defined as the number of sample observations strictly greater than $M_0$. Under $H_0$, each observation has probability 0.5 of being greater than $M_0$, assuming continuity. Thus, $S$ follows a binomial distribution:
$$S \sim \mathrm{Binomial}(n', 0.5)$$
where $n'$ is the effective sample size, discarding any ties where $X_i = M_0$.
For large $n'$, a normal approximation can be used:
$$Z = \frac{S - n'/2}{\sqrt{n'/4}}$$
A continuity correction of $\pm 0.5$ is often applied to $S$ for greater accuracy.
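The procedure above can be sketched in a few lines of Python. This is a minimal illustration using the exact binomial distribution rather than the normal approximation; the function name `sign_test`, its arguments, and the data are hypothetical, and the two-sided p-value is computed by doubling the smaller tail, which is one common convention.

```python
from math import comb

def sign_test(sample, m0):
    """Two-sided exact sign test of H0: median == m0 (illustrative sketch)."""
    # Drop observations equal to the hypothesized median (ties).
    kept = [x for x in sample if x != m0]
    n_eff = len(kept)
    # Count observations strictly greater than m0.
    s = sum(1 for x in kept if x > m0)
    # Under H0 the count is Binomial(n_eff, 0.5); double the smaller tail.
    k = min(s, n_eff - s)
    tail = sum(comb(n_eff, i) for i in range(k + 1)) / 2 ** n_eff
    p_value = min(1.0, 2 * tail)
    return s, n_eff, p_value

# Made-up data, chosen only for illustration.
data = [12.1, 9.8, 11.4, 10.0, 13.2, 12.7, 11.9, 10.6, 12.3, 11.1]
s, n_eff, p = sign_test(data, m0=10.0)
print(s, n_eff, p)  # 8 9 0.0390625
```

Here one observation equals the hypothesized median and is dropped, leaving nine observations, eight of which lie above it.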
Why might the sign test discard observations equal to the hypothesized median $M_0$?
Wilcoxon Signed-Rank Test
The sign test ignores the magnitude of the differences between the observations and the hypothesized median. The Wilcoxon signed-rank test incorporates this magnitude, at the cost of an additional assumption: the underlying continuous distribution must be symmetric about its median. It is generally more powerful than the sign test and serves as a non-parametric alternative to the paired Student’s t-test or the one-sample t-test.
Given $n$ pairs of observations $(X_i, Y_i)$ for $i = 1, \dots, n$, compute the differences $D_i = X_i - Y_i$.
- Discard pairs where $D_i = 0$. Let $n'$ be the reduced sample size.
- Rank the absolute differences $|D_i|$ from smallest to largest. Ties are assigned the average of the ranks they would have received. Let $R_i$ be the rank of $|D_i|$.
- Calculate the test statistic $W$, which is the sum of the signed ranks: $W = \sum_{i=1}^{n'} \operatorname{sgn}(D_i)\, R_i$. Alternatively, calculate the sum of ranks for positive differences ($W^+$) and negative differences ($W^-$). The test statistic is often defined as $T = \min(W^+, W^-)$.
Under $H_0$ (symmetric distribution about 0), the expected value and variance of $W$ are:
$$E[W] = 0, \qquad \operatorname{Var}(W) = \frac{n'(n'+1)(2n'+1)}{6}$$
For large $n'$, $W$ is approximately normally distributed, permitting the use of a $z$-test.
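As a rough sketch of the steps above, the following Python code computes the signed-rank sums and a large-sample z-score. The helper `average_ranks`, the function `wilcoxon_signed_rank`, and the paired data are all hypothetical names and values introduced for illustration. The variance used is the no-ties formula $n'(n'+1)(2n'+1)/6$ for the signed-rank sum, so with tied ranks the z-score is only approximate, and nine pairs is small for a normal approximation.

```python
from math import sqrt, erfc

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def wilcoxon_signed_rank(x, y):
    # Differences, with zero differences discarded.
    d = [a - b for a, b in zip(x, y)]
    d = [v for v in d if v != 0]
    n = len(d)
    ranks = average_ranks([abs(v) for v in d])
    w_plus = sum(r for v, r in zip(d, ranks) if v > 0)
    w_minus = sum(r for v, r in zip(d, ranks) if v < 0)
    w = w_plus - w_minus  # sum of signed ranks
    var_w = n * (n + 1) * (2 * n + 1) / 6  # no tie correction applied
    z = w / sqrt(var_w)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal p-value
    return {"n": n, "w_plus": w_plus, "w_minus": w_minus, "w": w, "z": z, "p": p_value}

# Made-up paired measurements, for illustration only.
before = [125, 115, 130, 140, 140, 115, 140, 125, 140, 135]
after = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]
result = wilcoxon_signed_rank(before, after)
print(result["n"], result["w_plus"], result["w_minus"], result["w"])  # 9 27.0 18.0 9.0
```

The positive and negative rank sums always add up to $n'(n'+1)/2$ (here 45), which is a quick sanity check on the ranking step.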
Mann-Whitney U Test (Wilcoxon Rank-Sum Test)
When comparing two independent samples to determine if they originate from the same population, the Mann-Whitney U test (or Wilcoxon rank-sum test) offers a non-parametric alternative to the independent two-sample t-test. It assumes the two distributions are identical in shape but potentially shifted in location.
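Let $X_1, \dots, X_{n_1}$ and $Y_1, \dots, Y_{n_2}$ be independent samples.
- Combine all observations and rank them from $1$ to $N = n_1 + n_2$.
- Compute the sum of the ranks for sample 1 ($R_1$) and sample 2 ($R_2$).
- The statistics are calculated as: $U_1 = R_1 - \frac{n_1(n_1+1)}{2}$ and $U_2 = R_2 - \frac{n_2(n_2+1)}{2}$. Note that $U_1 + U_2 = n_1 n_2$. The test statistic is $U = \min(U_1, U_2)$.
Under the null hypothesis that $X$ and $Y$ have the same distribution, the expectation and variance of $U$ are:
$$E[U] = \frac{n_1 n_2}{2}, \qquad \operatorname{Var}(U) = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$$
Ties in the data require an adjustment to the variance formula:
$$\operatorname{Var}(U) = \frac{n_1 n_2}{12} \left[ (N + 1) - \sum_{j=1}^{k} \frac{t_j^3 - t_j}{N(N-1)} \right]$$
where $N = n_1 + n_2$, $k$ is the number of tied groups and $t_j$ is the number of observations in the $j$-th tied group.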
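The ranking steps and the tie-adjusted variance can be sketched as follows. This is an illustrative implementation, not a reference one: `average_ranks`, `mann_whitney_u`, and the sample values are hypothetical, the U statistics use the convention "rank sum minus $n(n+1)/2$", and the p-value comes from the normal approximation, which is rough for samples this small.

```python
from collections import Counter
from math import sqrt, erfc

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)          # joint ranking of both samples
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = r2 - n2 * (n2 + 1) / 2
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    # Tie adjustment: sum of (t^3 - t) over groups of equal values.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values()) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    z = (u - mean_u) / sqrt(var_u)
    p_value = erfc(abs(z) / sqrt(2))       # two-sided normal approximation
    return {"u1": u1, "u2": u2, "u": u, "var": var_u, "z": z, "p": p_value}

# Made-up samples, for illustration only.
x = [19, 22, 16, 29, 24]
y = [20, 11, 17, 12, 14]
res = mann_whitney_u(x, y)
print(res["u1"], res["u2"], res["u"])  # 22.0 3.0 3.0
```

Groups of size one contribute zero to the tie term, so the adjusted variance reduces to the unadjusted formula when all values are distinct.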
What condition reduces the power of the Mann-Whitney U test relative to an independent two-sample t-test?
Kruskal-Wallis One-Way Analysis of Variance
The Kruskal-Wallis H test extends the Mann-Whitney U test to more than two independent groups. It is the non-parametric equivalent of the one-way ANOVA, testing whether independent samples originate from the same distribution.
Given $k$ groups with sample sizes $n_1, \dots, n_k$ and total observations $N = \sum_{i=1}^{k} n_i$:
- Rank all observations jointly from $1$ to $N$.
- Compute the sum of ranks $R_i$ for each group $i$.
- The test statistic is: $H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$
If the null hypothesis is true (all samples come from the same population) and the sample sizes are sufficiently large (typically $n_i \geq 5$), $H$ is approximately distributed as a chi-square distribution with $k - 1$ degrees of freedom:
$$H \sim \chi^2_{k-1}$$
If the null hypothesis is rejected, post-hoc procedures like Dunn’s test are used for pairwise comparisons to identify which groups differ.
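A minimal sketch of the H statistic follows. The names `average_ranks` and `kruskal_wallis_h` and the three groups are hypothetical, and no tie correction is applied to H. To stay within the standard library, the p-value is only computed for the three-group case, where the chi-square distribution with 2 degrees of freedom has the closed-form survival function $e^{-H/2}$; other group counts would need a general chi-square routine.

```python
from math import exp

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)  # joint ranking across all groups
    total, start = 0.0, 0
    for g in groups:
        r_sum = sum(ranks[start:start + len(g)])  # rank sum of this group
        total += r_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, len(groups) - 1  # statistic and degrees of freedom

# Made-up groups, for illustration only.
a = [68, 72, 77, 83, 84]
b = [55, 61, 64, 70, 75]
c = [50, 52, 58, 60, 66]
h, df = kruskal_wallis_h([a, b, c])
# Closed form valid only because df == 2 here.
p_value = exp(-h / 2)
print(h, df, p_value)  # H is about 8.88 with 2 degrees of freedom
```

For these groups the rank sums are 62, 38, and 20, which add up to $N(N+1)/2 = 120$.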
Spearman’s Rank Correlation Coefficient
Spearman’s rank correlation coefficient ($\rho$ or $r_s$) measures the strength and direction of association between two continuous or ordinal variables without assuming linearity. It evaluates the monotonic relationship between two variables, in contrast with Pearson’s correlation, which evaluates linear relationships.
For $n$ pairs of observations $(X_i, Y_i)$, convert the raw scores to ranks $R(X_i)$ and $R(Y_i)$. Spearman’s $\rho$ is computed analogously to Pearson’s correlation coefficient, but applied to the ranks, which, when all ranks are distinct, simplifies to:
$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
where $d_i = R(X_i) - R(Y_i)$ is the difference between the ranks of corresponding variables.
If there are identical values (ties), the simplified formula utilizing $d_i$ becomes inaccurate, and the standard Pearson correlation formula must be applied directly to the ranked variables.
Values of $\rho$ vary from $-1$ to $+1$, indicating perfect negative or positive monotonic associations, respectively.
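Both ways of computing the coefficient can be sketched side by side. The function names `spearman_rho` (Pearson's formula applied to average ranks, usable with ties) and `spearman_rho_no_ties` (the simplified rank-difference formula) are hypothetical, as are the data; the two should agree whenever there are no ties.

```python
from math import sqrt

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the ranks; valid with or without ties."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def spearman_rho_no_ties(x, y):
    """Simplified rank-difference formula; assumes all values are distinct."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Made-up data, for illustration only.
x = [10, 20, 30, 40, 50]
y = [3, 1, 4, 2, 5]
print(spearman_rho(x, y), spearman_rho_no_ties(x, y))

# A monotonic but non-linear relationship still has rank correlation close to 1.
cubes = [v ** 3 for v in x]
print(spearman_rho(x, cubes))
```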
Bootstrap and Resampling Methods
Modern computational power enables simulation-based non-parametric approaches, most notably bootstrapping. Introduced by Bradley Efron, bootstrapping relies on random sampling with replacement from the original dataset.
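If we possess a sample $X = (x_1, \dots, x_n)$ drawn from an unknown distribution $F$, we construct the empirical distribution function $\hat{F}$. By drawing repeated samples of size $n$, with replacement, from $\hat{F}$, we generate $B$ bootstrap samples $X^{*1}, X^{*2}, \dots, X^{*B}$.
For a sample statistic $\hat{\theta} = s(X)$ estimating a parameter $\theta$, we compute the statistic for each bootstrap sample: $\hat{\theta}^{*b} = s(X^{*b})$. The distribution of the $\hat{\theta}^{*b}$ approximates the sampling distribution of $\hat{\theta}$, enabling the construction of confidence intervals and hypothesis tests without assuming a parametric form.
The bootstrap standard error is the standard deviation of the bootstrap replicates:
$$\widehat{\mathrm{se}}_B = \sqrt{\frac{1}{B-1} \sum_{b=1}^{B} \left( \hat{\theta}^{*b} - \bar{\theta}^{*} \right)^2}$$
where $\bar{\theta}^{*} = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^{*b}$ is the mean of the bootstrap estimates. Resampling procedures reduce reliance on asymptotic normality assumptions, providing robust inferences that are particularly suitable for complex estimators or for situations where small sample sizes limit conventional asymptotic theory.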
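The resampling loop can be sketched with the standard library alone. The function names, the default of 2000 replicates, the fixed seed, and the data are assumptions made for illustration. The interval shown is the simple percentile interval, which is only one of several bootstrap interval constructions, and the index arithmetic is deliberately crude.

```python
import random
import statistics

def bootstrap_replicates(sample, stat, n_boot=2000, seed=0):
    """Evaluate `stat` on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    return [stat(rng.choices(sample, k=n)) for _ in range(n_boot)]

def bootstrap_se(replicates):
    """Sample standard deviation of the replicates (divisor B - 1)."""
    return statistics.stdev(replicates)

def percentile_interval(replicates, alpha=0.05):
    """Simple percentile interval read off the sorted replicates."""
    ordered = sorted(replicates)
    b = len(ordered)
    lo = ordered[int((alpha / 2) * b)]
    hi = ordered[min(b - 1, int((1 - alpha / 2) * b))]
    return lo, hi

# Made-up data, for illustration only.
sample = [3.1, 4.7, 2.8, 5.9, 4.2, 3.6, 6.4, 4.0, 3.3, 5.1]
reps = bootstrap_replicates(sample, statistics.median)
se = bootstrap_se(reps)
lo, hi = percentile_interval(reps)
print(statistics.median(sample), se, (lo, hi))
```

The median is used here because its sampling distribution has no convenient closed form, which is the kind of estimator where resampling is most useful.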
Kernel Density Estimation
Kernel density estimation (KDE) is a non-parametric method for estimating the probability density function of a continuous random variable. Parametric estimation fits a predetermined family of shapes (e.g., normal, gamma) indexed by a small number of parameters. KDE instead estimates the density directly from the data.
Let $x_1, x_2, \dots, x_n$ be independent and identically distributed samples drawn from some distribution with an unknown density $f$. The kernel density estimator is:
$$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left( \frac{x - x_i}{h} \right)$$
where $K$ is the kernel (a non-negative function integrating to one) and $h > 0$ is a smoothing parameter known as the bandwidth. The bandwidth heavily influences the estimator. Small $h$ induces undersmoothing, yielding high variance (spurious fluctuations), whereas large $h$ induces oversmoothing, yielding high bias (obscuring structural features of the distribution). Standard choices for $K$ include the Gaussian, Epanechnikov, and uniform kernels.
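A direct transcription of the estimator with a Gaussian kernel might look like the sketch below. The function names, the data, the two bandwidths, and the evaluation grid are all hypothetical choices made to illustrate the effect of the smoothing parameter; the integral is checked with a plain Riemann sum.

```python
from math import exp, pi, sqrt

def gaussian_kernel(u):
    """Standard normal density: non-negative and integrates to one."""
    return exp(-0.5 * u * u) / sqrt(2 * pi)

def kde(x, data, h):
    """Kernel density estimate at point x with bandwidth h."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)

# Made-up data with two tight clusters, for illustration only.
data = [1.0, 1.2, 1.1, 3.9, 4.1, 4.0]

# Evaluate on a grid from -10 to 15 in steps of 0.01.
step = 0.01
grid = [-10 + step * i for i in range(2501)]

for h in (0.1, 2.0):
    density = [kde(x, data, h) for x in grid]
    area = step * sum(density)     # should be close to 1
    gap = kde(2.5, data, h)        # density between the two clusters
    print(h, area, gap)
```

With the small bandwidth the estimate is essentially zero between the clusters, while the large bandwidth fills the gap and blurs the two clusters together.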
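Histograms and KDEs both model data density non-parametrically. Consider a dataset of highly clustered continuous physical measurements. A histogram forces boundaries at arbitrary bin edges, so a single cluster can be split across two bins or merged with a neighbor depending on where the edges fall. A KDE instead centers a smooth kernel on each observation, so the estimate has no fixed bins and varies continuously, although it still depends on the choice of bandwidth.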
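The sensitivity of a histogram to bin placement can be seen in a small sketch. The helper `histogram_counts`, its parameters, and the data are hypothetical; the same six points are binned twice with the same bin width, changing only the origin of the bins.

```python
def histogram_counts(data, origin, width, n_bins):
    """Count points in bins [origin + i*width, origin + (i+1)*width)."""
    counts = [0] * n_bins
    for x in data:
        idx = int((x - origin) // width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

# Made-up measurements clustered around 1 and around 2.
data = [0.9, 1.0, 1.1, 1.9, 2.0, 2.1]

# Edges at 0, 1, 2, 3 cut through both clusters.
print(histogram_counts(data, origin=0.0, width=1.0, n_bins=3))  # [1, 3, 2]

# Edges at 0.5, 1.5, 2.5, 3.5 keep each cluster in a single bin.
print(histogram_counts(data, origin=0.5, width=1.0, n_bins=3))  # [3, 3, 0]
```

The data are identical in both calls, yet the first histogram suggests one central peak and the second shows two equal clusters.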