Time Series Analysis - Probability & Statistics

A time series is a sequence of data points indexed in time order. Formally, a time series is a stochastic process $(X_t)$ for $t \in T$ , where $T$ is an index set, typically $\mathbb{Z}$ or $\mathbb{N}$ for discrete-time time series. Analysis of time series involves understanding the underlying structure and function that produced the data, often for the purpose of forecasting future values.

The foundational assumption in many time series models is stationarity. A time series $(X_t)$ is strictly stationary if, for every $k \geq 1$, the joint distribution of $(X_{t_1}, X_{t_2}, \dots, X_{t_k})$ is identical to that of $(X_{t_1+\tau}, X_{t_2+\tau}, \dots, X_{t_k+\tau})$ for all $t_1, \dots, t_k, \tau \in T$.

In practice, strict stationarity is often too restrictive. Weak stationarity (or wide-sense stationarity) requires only that the variance is finite and that the first two moments are invariant with respect to time translation:

$\mathbb{E}[X_t] = \mu$ for all $t \in T$.
$\text{Cov}(X_t, X_{t+\tau}) = \gamma(\tau)$ for all $t, \tau \in T$, that is, the covariance depends only on the lag $\tau$ and not on $t$.

The function $\gamma(\tau)$ is the autocovariance function at lag $\tau$. The autocorrelation function (ACF) is defined as $\rho(\tau) = \frac{\gamma(\tau)}{\gamma(0)}$.
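In practice $\gamma(\tau)$ and $\rho(\tau)$ are estimated from a single observed series. The following is a minimal sketch of the usual sample ACF, assuming the biased estimator that divides by $n$ at every lag; the function name `sample_acf` and the toy series are illustrative choices, not part of the text above.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation rho_hat(0..max_lag) of a 1-D series."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()  # deviations from the sample mean
    # Biased autocovariance estimator: divide by n at every lag.
    gamma = np.array([np.dot(d[: n - k], d[k:]) / n for k in range(max_lag + 1)])
    return gamma / gamma[0]

# Toy series, chosen only so the arithmetic is easy to follow by hand.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
rho = sample_acf(x, 2)
print(rho)
```

Dividing by $n$ rather than $n - k$ keeps the estimated autocovariance sequence non-negative definite, which is one common reason this estimator is preferred.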

Foundational Time Series Processes

White Noise

A sequence of uncorrelated random variables $(w_t)$ with mean zero and finite, constant variance $\sigma_w^2$ is termed a white noise process, denoted $w_t \sim WN(0, \sigma_w^2)$. The autocovariance function for white noise is $\gamma(\tau) = \sigma_w^2$ if $\tau = 0$, and $0$ otherwise. When the $w_t$ are independent and identically distributed (i.i.d.), the process is termed strict white noise. Gaussian white noise additionally assumes that the $w_t$ are i.i.d. with $w_t \sim \mathcal{N}(0, \sigma_w^2)$.

Random Walk

A random walk is defined by the process $X_t = X_{t-1} + w_t$, where $w_t \sim WN(0, \sigma_w^2)$. Expanding this recursion yields $X_t = \sum_{j=1}^t w_j$ (assuming $X_0 = 0$). The expected value is $\mathbb{E}[X_t] = 0$, but the variance is $\text{Var}(X_t) = t \sigma_w^2$. Because the variance grows linearly with $t$, a random walk is non-stationary. The covariance between $X_t$ and $X_s$ (where $t > s$) is $s \sigma_w^2$, or $\min(s, t)\, \sigma_w^2$ in general.
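The variance and covariance formulas for the random walk can be checked by simulation. The sketch below assumes Gaussian increments with $\sigma_w = 1$, a fixed seed, and arbitrary path counts; all of these are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps, sigma_w = 20000, 50, 1.0

# Each row is one path; column t-1 holds X_t = w_1 + ... + w_t (X_0 = 0).
w = rng.normal(0.0, sigma_w, size=(n_paths, n_steps))
X = np.cumsum(w, axis=1)

var_t = X.var(axis=0)                    # Monte Carlo estimate of Var(X_t)
cov_10_40 = np.mean(X[:, 9] * X[:, 39])  # estimate of Cov(X_10, X_40); the mean is zero

print(var_t[9], var_t[49])  # should be close to 10 and 50
print(cov_10_40)            # should be close to min(10, 40) = 10
```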


Linear Models: AR, MA, and ARMA

Linear time series models capture the linear dependencies between observations.

Autoregressive (AR) Models

An autoregressive model of order $p$, denoted AR($p$), models the current value $X_t$ as a linear combination of its $p$ previous values plus a white noise term: $X_t = c + \phi_1 X_{t-1} + \phi_2 X_{t-2} + \dots + \phi_p X_{t-p} + w_t$. Using the backshift operator $B$, where $B^k X_t = X_{t-k}$, the AR($p$) model can be written as $\Phi(B) X_t = c + w_t$, where $\Phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p$ is the autoregressive polynomial. For an AR($p$) process to be stationary (and causal), all roots of the characteristic equation $\Phi(z) = 0$ must lie outside the unit circle in the complex plane ($|z| > 1$). For an AR(1) process $X_t = \phi_1 X_{t-1} + w_t$, the only root is $z = 1/\phi_1$, so the condition simplifies to $|\phi_1| < 1$, yielding the ACF $\rho(\tau) = \phi_1^{|\tau|}$.
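The root condition can be checked numerically. The sketch below builds the coefficients of $\Phi(z)$ and passes them to `numpy.roots`; the helper name `ar_is_stationary` and the example coefficients are assumptions made for illustration.

```python
import numpy as np

def ar_is_stationary(phi):
    """True if all roots of Phi(z) = 1 - phi_1 z - ... - phi_p z^p satisfy |z| > 1."""
    phi = np.asarray(phi, dtype=float)
    # np.roots expects coefficients ordered from the highest power to the constant.
    coeffs = np.concatenate((-phi[::-1], [1.0]))
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

print(ar_is_stationary([0.5]))       # AR(1) with |phi_1| < 1
print(ar_is_stationary([1.0]))       # random walk: root on the unit circle
print(ar_is_stationary([0.5, 0.3]))  # AR(2), both roots outside the unit circle
print(ar_is_stationary([0.5, 0.6]))  # AR(2), one root inside the unit circle
```

The same check applied to the coefficients of $\Theta(z)$ (with the signs adjusted) tests invertibility of an MA polynomial.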

Moving Average (MA) Models

A moving average model of order $q$, denoted MA($q$), expresses $X_t$ as a linear combination of the current and $q$ previous white noise terms: $X_t = \mu + w_t + \theta_1 w_{t-1} + \dots + \theta_q w_{t-q}$. Using the moving average polynomial $\Theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q$, this is written as $X_t = \mu + \Theta(B) w_t$. Every finite-order MA process is stationary because it is a finite linear combination of white noise terms with time-invariant coefficients. The autocovariance satisfies $\gamma(\tau) = 0$ for $|\tau| > q$, so the ACF cuts off after lag $q$.
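The cut-off of the MA($q$) autocovariance follows from $\gamma(\tau) = \sigma_w^2 \sum_{j=0}^{q-\tau} \theta_j \theta_{j+\tau}$ with $\theta_0 = 1$ for $0 \le \tau \le q$. A minimal sketch of that formula, with the function name `ma_autocovariance` and the MA(2) coefficients chosen purely for illustration:

```python
import numpy as np

def ma_autocovariance(theta, sigma2, max_lag):
    """gamma(0..max_lag) for X_t = mu + w_t + theta_1 w_{t-1} + ... + theta_q w_{t-q}."""
    psi = np.concatenate(([1.0], np.asarray(theta, dtype=float)))  # theta_0 = 1
    q = psi.size - 1
    gamma = np.zeros(max_lag + 1)  # lags beyond q stay at zero
    for k in range(min(q, max_lag) + 1):
        # Sum of theta_j * theta_{j+k} over the overlapping coefficients.
        gamma[k] = sigma2 * np.dot(psi[: q + 1 - k], psi[k:])
    return gamma

gamma_ma2 = ma_autocovariance([0.5, 0.25], sigma2=1.0, max_lag=4)
print(gamma_ma2)  # nonzero at lags 0, 1, 2 and zero at lags 3, 4
```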

Invertibility of an MA process ensures that it can be uniquely expressed as an infinite-order AR process. An MA( $q$ ) model is invertible if all roots of $\Theta(z) = 0$ lie outside the unit circle.

ARMA and ARIMA Models

Combining AR and MA concepts forms the Autoregressive Moving Average model, ARMA( $p, q$ ): $\Phi(B) X_t = c + \Theta(B) w_t$

Stationarity of the ARMA process depends on the roots of $\Phi(z)$, and invertibility on the roots of $\Theta(z)$. Time series exhibiting non-stationarity in the mean, such as trends, require differencing. First-order differencing $\nabla X_t = X_t - X_{t-1} = (1-B)X_t$ removes linear trends; second-order differencing removes quadratic trends. Applying $d$ differences produces an Autoregressive Integrated Moving Average model, ARIMA($p, d, q$): $\Phi(B) (1-B)^d X_t = c + \Theta(B) w_t$.
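The effect of differencing on polynomial trends can be seen on noise-free sequences. The sketch below uses `numpy.diff`; the particular trend coefficients are arbitrary illustrative values.

```python
import numpy as np

t = np.arange(10, dtype=float)
linear = 3.0 + 2.0 * t          # linear trend
quadratic = 1.0 + 0.5 * t**2    # quadratic trend

d1_linear = np.diff(linear, n=1)        # first difference of a linear trend: constant
d1_quadratic = np.diff(quadratic, n=1)  # first difference of a quadratic: still linear in t
d2_quadratic = np.diff(quadratic, n=2)  # second difference of a quadratic: constant

print(d1_linear)
print(d1_quadratic)
print(d2_quadratic)
```

Each application of `np.diff` shortens the series by one observation, which is why the differenced arrays have 9 and 8 entries.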

Modeling Exchange Rate Fluctuations

You are building a time series model for daily foreign exchange rates between USD and EUR. Let P_t denote the daily rate. The log prices log(P_t) exhibit a wandering behavior resembling a random walk. When you plot the differences X_t = log(P_t) - log(P_{t-1}), the resulting series fluctuates around zero. The ACF of X_t shows significant spikes at lags 1 and 2 and is negligible afterwards. The Partial Autocorrelation Function (PACF) decays gradually toward zero.

Based on the properties of X_t, what ARIMA model structure best represents the log price process log(P_t)?

Partial Autocorrelation Function (PACF)

While the ACF measures the linear dependence between $X_t$ and $X_{t+\tau}$ inclusive of intermediate effects, the Partial Autocorrelation Function (PACF) isolates the direct correlation. The PACF at lag $\tau$ , denoted $\phi_{\tau\tau}$ , represents the correlation between $X_t$ and $X_{t+\tau}$ after removing the linear dependence of both variables on the intermediate values $X_{t+1}, \dots, X_{t+\tau-1}$ .

For an AR($p$) process, the PACF cuts off after lag $p$ ($\phi_{\tau\tau} = 0$ for $\tau > p$). Conversely, for an MA($q$) process, the PACF tails off gradually. This complementary behavior of the ACF and PACF provides the foundation for the Box-Jenkins model identification methodology.
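One standard way to obtain the PACF from the ACF is the Durbin-Levinson recursion. The sketch below applies it to the theoretical ACF of an AR(1) with $\phi_1 = 0.6$ and of an MA(1) with $\theta_1 = 0.5$; the function name `pacf_from_acf` and both parameter values are illustrative assumptions.

```python
import numpy as np

def pacf_from_acf(rho):
    """Durbin-Levinson recursion: rho[0..m] with rho[0] = 1 -> phi_kk for k = 1..m."""
    rho = np.asarray(rho, dtype=float)
    m = rho.size - 1
    pacf = np.zeros(m)
    phi_prev = np.zeros(0)  # holds phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, m + 1):
        num = rho[k] - np.dot(phi_prev, rho[k - 1:0:-1])
        den = 1.0 - np.dot(phi_prev, rho[1:k])
        phi_kk = num / den
        phi_new = np.empty(k)
        # phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}
        phi_new[: k - 1] = phi_prev - phi_kk * phi_prev[::-1]
        phi_new[k - 1] = phi_kk
        pacf[k - 1] = phi_kk
        phi_prev = phi_new
    return pacf

rho_ar1 = 0.6 ** np.arange(5)                    # AR(1): rho(tau) = phi^tau
rho_ma1 = np.array([1.0, 0.4, 0.0, 0.0, 0.0])    # MA(1): rho(1) = theta / (1 + theta^2)

pacf_ar1 = pacf_from_acf(rho_ar1)
pacf_ma1 = pacf_from_acf(rho_ma1)
print(pacf_ar1)  # nonzero at lag 1 only
print(pacf_ma1)  # nonzero at every lag, shrinking in magnitude
```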

Spectral Analysis

Time domain analysis emphasizes serial correlations over time lags. Spectral analysis (frequency domain analysis) decomposes the variance of a time series over a continuous spectrum of angular frequencies $\omega \in [-\pi, \pi]$. For a stationary process with absolutely summable autocovariance function $\gamma(\tau)$, the spectral density function $f(\omega)$ is the Fourier transform of the autocovariance sequence: $f(\omega) = \frac{1}{2\pi} \sum_{\tau=-\infty}^\infty \gamma(\tau) e^{-i \omega \tau}$. The total variance of the process is the integral of the spectral density over the frequency band: $\gamma(0) = \int_{-\pi}^\pi f(\omega) d\omega$. A peak at a specific frequency $\omega_0$ in the spectral density plot indicates periodic behavior with cycle length $\frac{2\pi}{\omega_0}$. For white noise, $\gamma(\tau) = 0$ at all $\tau \neq 0$, so the spectral density is flat: $f(\omega) = \frac{\sigma_w^2}{2\pi}$.
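The relation between the autocovariance sequence and the spectral density can be verified numerically. The sketch below evaluates $f(\omega)$ for a finite autocovariance sequence and integrates it with a hand-written trapezoid rule; the function name `spectral_density`, the grid size, and the example autocovariances (white noise with $\sigma_w^2 = 2$, and an MA(1) with $\theta_1 = 0.5$, $\sigma_w^2 = 1$) are illustrative assumptions.

```python
import numpy as np

def spectral_density(gamma, omega):
    """f(omega) from gamma(0..m), assuming gamma(tau) = 0 for |tau| > m."""
    gamma = np.asarray(gamma, dtype=float)
    lags = np.arange(1, gamma.size)
    # gamma is even in tau, so the two-sided sum collapses to a cosine series.
    cos_terms = np.cos(np.outer(omega, lags)) @ gamma[1:]
    return (gamma[0] + 2.0 * cos_terms) / (2.0 * np.pi)

omega = np.linspace(-np.pi, np.pi, 2001)

f_wn = spectral_density([2.0], omega)         # white noise: flat spectrum
f_ma1 = spectral_density([1.25, 0.5], omega)  # MA(1): gamma(0) = 1.25, gamma(1) = 0.5

# Trapezoid rule for the integral of f over [-pi, pi]; it should recover gamma(0).
total_var = np.sum(0.5 * (f_ma1[1:] + f_ma1[:-1]) * np.diff(omega))
print(f_wn.min(), f_wn.max())
print(total_var)
```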

Filtering operations are straightforward to describe in the frequency domain. A linear time-invariant (LTI) filter defined by the sequence $(a_j)$ applies the convolution $Y_t = \sum_j a_j X_{t-j}$. The frequency response function of the filter is $A(\omega) = \sum_j a_j e^{-i \omega j}$. The spectral density of the filtered output is $f_Y(\omega) = |A(\omega)|^2 f_X(\omega)$.
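The squared gain $|A(\omega)|^2$ shows which frequencies a filter passes or suppresses. The sketch below evaluates it for two simple causal filters, the first difference $(1, -1)$ and a two-point average $(0.5, 0.5)$; the function name `squared_gain` and the choice of filters are illustrative.

```python
import numpy as np

def squared_gain(a, omega):
    """|A(omega)|^2 for a causal filter with taps a_0, a_1, ..., a_m."""
    a = np.asarray(a, dtype=float)
    j = np.arange(a.size)
    A = np.exp(-1j * np.outer(omega, j)) @ a  # A(omega) = sum_j a_j exp(-i omega j)
    return np.abs(A) ** 2

omega = np.linspace(0.0, np.pi, 5)
gain_diff = squared_gain([1.0, -1.0], omega)  # equals 2 - 2 cos(omega): high-pass
gain_avg = squared_gain([0.5, 0.5], omega)    # equals (1 + cos(omega)) / 2: low-pass

print(gain_diff)
print(gain_avg)
```

Multiplying either gain by an input spectral density $f_X(\omega)$ gives the output density $f_Y(\omega)$. Differencing removes power near $\omega = 0$, which is consistent with its use for removing trends.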

Multivariate Time Series and Vector Autoregression (VAR)

When the joint dynamics of several interrelated time series $\mathbf{X}_t = (X_{1t}, X_{2t}, \dots, X_{kt})^\top$ are of interest, univariate ARIMA models are insufficient because they ignore cross-series dependence. The Vector Autoregressive model of order $p$, VAR($p$), generalizes the AR structure to dimension $k$: $\mathbf{X}_t = \mathbf{c} + \mathbf{\Phi}_1 \mathbf{X}_{t-1} + \dots + \mathbf{\Phi}_p \mathbf{X}_{t-p} + \mathbf{w}_t$, where the $\mathbf{\Phi}_i$ are $k \times k$ coefficient matrices and $\mathbf{w}_t$ is $k$-dimensional zero-mean multivariate white noise with covariance matrix $\mathbf{\Sigma}$.

Stationarity of a VAR system requires that all roots of the determinantal equation $|\mathbf{I}_k - \mathbf{\Phi}_1 z - \dots - \mathbf{\Phi}_p z^p| = 0$ lie strictly outside the unit circle. VAR models provide a natural framework for Granger causality: $X_1$ Granger-causes $X_2$ if past observations of $X_1$ improve the prediction of $X_2$ beyond what is achievable from the past of $X_2$ alone.
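An equivalent way to check the VAR stationarity condition is to stack the coefficient matrices into the $kp \times kp$ companion matrix and require all of its eigenvalues to lie strictly inside the unit circle. The sketch below takes that route; the function name `var_is_stationary` and the example matrices are illustrative assumptions.

```python
import numpy as np

def var_is_stationary(Phi):
    """Phi: list of p coefficient matrices [Phi_1, ..., Phi_p], each k x k."""
    Phi = [np.asarray(P, dtype=float) for P in Phi]
    p, k = len(Phi), Phi[0].shape[0]
    companion = np.zeros((k * p, k * p))
    companion[:k, :] = np.hstack(Phi)            # top block row: Phi_1 ... Phi_p
    companion[k:, :-k] = np.eye(k * (p - 1))     # identity blocks below the top row
    eigvals = np.linalg.eigvals(companion)
    # Eigenvalues inside the unit circle <=> determinant roots outside it.
    return bool(np.all(np.abs(eigvals) < 1.0))

stable_var1 = var_is_stationary([[[0.5, 0.1], [0.2, 0.3]]])
unstable_var1 = var_is_stationary([[[1.1, 0.0], [0.0, 0.5]]])
stable_var2 = var_is_stationary([0.5 * np.eye(2), 0.3 * np.eye(2)])
unstable_var2 = var_is_stationary([0.5 * np.eye(2), 0.6 * np.eye(2)])
print(stable_var1, unstable_var1, stable_var2, unstable_var2)
```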

State-Space Models and the Kalman Filter

State-space modeling provides a more general framework. A state-space model describes the observations through an underlying, unobserved sequence of state vectors $\mathbf{\alpha}_t$. The model consists of two stochastic equations:

Measurement Equation: Links observed data $\mathbf{y}_t$ to the unobserved state. $\mathbf{y}_t = \mathbf{Z}_t \mathbf{\alpha}_t + \mathbf{\epsilon}_t, \quad \mathbf{\epsilon}_t \sim \mathcal{N}(0, \mathbf{H}_t)$
State Equation (Transition Equation): Governs Markovian state evolution over sequence steps. $\mathbf{\alpha}_{t+1} = \mathbf{T}_t \mathbf{\alpha}_t + \mathbf{\eta}_t, \quad \mathbf{\eta}_t \sim \mathcal{N}(0, \mathbf{Q}_t)$

Here, $\mathbf{\epsilon}_t$ is the observation (measurement) noise and $\mathbf{\eta}_t$ is the state (transition) disturbance. The system matrices $\mathbf{Z}_t$ and $\mathbf{T}_t$ and the covariance matrices $\mathbf{H}_t$ and $\mathbf{Q}_t$ determine the dynamics of the model and may vary with $t$.

The Kalman filter is a recursive algorithm for computing the minimum mean-squared error (MMSE) estimator of the state vector $\mathbf{\alpha}_t$ given the observations up to time $t$, $Y_t = \{y_1, \dots, y_t\}$. Each iteration alternates between a prediction step, which propagates the state estimate and its covariance through the state equation, and an update (correction) step, in which the Kalman gain weights the innovation (the difference between the new observation and its one-step prediction) to adjust the prediction.
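The recursion can be sketched in a few lines. The code below assumes time-invariant matrices $\mathbf{Z}, \mathbf{T}, \mathbf{H}, \mathbf{Q}$ and a user-supplied prior mean `a0` and covariance `P0`; the function name `kalman_filter` and the test case (a constant level observed with unit-variance noise under a vague prior) are illustrative choices rather than a definitive implementation.

```python
import numpy as np

def kalman_filter(y, Z, T, H, Q, a0, P0):
    """Filtered state means E[alpha_t | y_1..y_t] for time-invariant Z, T, H, Q."""
    a = np.asarray(a0, dtype=float)  # predicted state mean
    P = np.asarray(P0, dtype=float)  # predicted state covariance
    filtered = []
    for y_t in y:
        # Update step: correct the prediction using the innovation.
        v = y_t - Z @ a                  # innovation
        F = Z @ P @ Z.T + H              # innovation covariance
        K = P @ Z.T @ np.linalg.inv(F)   # Kalman gain
        a_filt = a + K @ v
        P_filt = P - K @ Z @ P
        filtered.append(a_filt)
        # Prediction step: propagate through the state equation.
        a = T @ a_filt
        P = T @ P_filt @ T.T + Q
    return np.array(filtered)

# Constant level (Q = 0) observed with unit noise variance and a vague prior.
y = np.array([[1.0], [2.0], [3.0], [4.0]])
Z = np.array([[1.0]]); T = np.array([[1.0]])
H = np.array([[1.0]]); Q = np.array([[0.0]])
filtered = kalman_filter(y, Z, T, H, Q, a0=[0.0], P0=[[1e6]])
print(filtered[:, 0])  # close to the running mean of the observations
```

With no state noise and a nearly flat prior, the filtered level reduces to the running sample mean, which gives a simple sanity check on the update equations.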

In a linear state-space model estimated with the Kalman filter, which step of the recursion incorporates the information contained in the new observation y_t?

Structural Breakpoints and Non-Linearities

Standard parametric models often fit long macroeconomic series poorly because the data-generating mechanism changes over time. A structural break is a point at which the parameters governing an otherwise stationary process shift. Assessing structural breaks involves partitioning the series at candidate break dates, typically those corresponding to systemic shocks, and allowing each segment its own ARMA polynomials.

Alternatively, ARCH/GARCH models directly describe conditional heteroskedasticity, that is, variance that changes over time depending on the recent past. The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model specifies the conditional variance $\sigma_t^2$ dynamically: $X_t = \sigma_t z_t \quad (z_t \sim WN(0, 1))$, $\sigma_t^2 = \omega + \sum_{i=1}^q \alpha_i X_{t-i}^2 + \sum_{j=1}^p \beta_j \sigma_{t-j}^2$. The GARCH formulation captures volatility clustering, in which large shocks tend to be followed by further large shocks, a feature central to financial risk modeling.
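Volatility clustering is easy to see in a simulated GARCH(1,1) path. The sketch below assumes i.i.d. standard normal $z_t$, starts the recursion at the unconditional variance $\omega / (1 - \alpha_1 - \beta_1)$, and uses arbitrary illustrative parameter values; the function name `simulate_garch11` is hypothetical.

```python
import numpy as np

def simulate_garch11(n, omega, alpha, beta, rng):
    """Simulate X_t = sigma_t z_t with Gaussian z_t (an assumption of this sketch)."""
    x = np.zeros(n)
    sigma2 = np.zeros(n)
    sigma2[0] = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    x[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, n):
        # Conditional variance recursion for p = q = 1.
        sigma2[t] = omega + alpha * x[t - 1] ** 2 + beta * sigma2[t - 1]
        x[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return x, sigma2

rng = np.random.default_rng(1)
x, sigma2 = simulate_garch11(50000, omega=0.1, alpha=0.1, beta=0.8, rng=rng)

# Lag-1 sample autocorrelation of the squared series: positive under clustering.
sq = x**2 - np.mean(x**2)
acf1_squared = np.dot(sq[:-1], sq[1:]) / np.dot(sq, sq)
print(x.var(), acf1_squared)
```

The series $x_t$ itself is serially uncorrelated, while $x_t^2$ is positively autocorrelated; that contrast is the usual diagnostic for conditional heteroskedasticity.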

More advanced approaches include threshold autoregressive (TAR) models, which address non-linear dynamics, and fractionally integrated models (ARFIMA), which are designed for processes with long-range dependence, characterized by an ACF that decays hyperbolically rather than exponentially.