Kalman Filter

What is the Kalman Filter

The Kalman filter is an algorithm that optimally estimates the internal state of a system from noisy observations. It was introduced in a 1960 paper by Rudolf E. Kalman.

[Original paper] R. E. Kalman, "A New Approach to Linear Filtering and Prediction Problems", ASME Journal of Basic Engineering, 82(1): 35–45 (1960).
[Continuous-time extension] R. E. Kalman, R. S. Bucy, "New Results in Linear Filtering and Prediction Theory", ASME Journal of Basic Engineering, 83(1): 95–108 (1961).

The Kalman filter has the following characteristics:

  • Sequential processing: updates estimates in real time as each new observation arrives
  • Optimality: provides the minimum variance estimate for linear Gaussian systems
  • Computational efficiency: does not need to store all past data, making it memory-efficient
  • Prediction capability: can predict future states in addition to estimating the current state

Historical background

The Kalman filter was published by Rudolf E. Kalman in 1960 and became widely known after being adopted for orbit estimation in the Apollo program during the 1960s. Since then, it has been used in virtually every modern control and estimation system, including GPS, autonomous driving, and robotics.

State-Space Model

[Figure 1: State-space model — block diagram relating $\boldsymbol{x}_{k-1}$, the transition matrix $\boldsymbol{F}_k$, the unit delay $z^{-1}$, process noise $\boldsymbol{w}_k$, the state $\boldsymbol{x}_k$, the observation matrix $\boldsymbol{H}_k$, observation noise $\boldsymbol{v}_k$, and the observation $\boldsymbol{y}_k$]

The Kalman filter assumes the following state-space model.

\begin{align} \text{State equation} : \quad \boldsymbol{x}_k &= \boldsymbol{F}_k \boldsymbol{x}_{k-1} + \boldsymbol{w}_k \\ \text{Observation equation} : \quad \boldsymbol{y}_k &= \boldsymbol{H}_k \boldsymbol{x}_k + \boldsymbol{v}_k \end{align}

where:

  • $\boldsymbol{x}_k$: state vector at time $k$ (the unknown quantity to be estimated)
  • $\boldsymbol{y}_k$: observation vector at time $k$ (data obtained from sensors)
  • $\boldsymbol{F}_k$: state transition matrix (system dynamics)
  • $\boldsymbol{H}_k$: observation matrix (mapping from state to observation)
  • $\boldsymbol{w}_k$: process noise (unmodeled disturbances, $\boldsymbol{Q}_k = E[\boldsymbol{w}_k\boldsymbol{w}_k^T]$)
  • $\boldsymbol{v}_k$: observation noise (sensor measurement error, $\boldsymbol{R}_k = E[\boldsymbol{v}_k\boldsymbol{v}_k^T]$)
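
To make these quantities concrete, here is a minimal sketch for a hypothetical 1-D constant-velocity tracking problem (position and velocity in the state, only position observed). The sampling interval `dt`, the process-noise intensity `q`, and the sensor variance in `R` are illustrative values, not prescriptions:

```python
import numpy as np

# Hypothetical 1-D constant-velocity model, sampled every dt seconds.
# State x = [position, velocity]; only position is observed.
dt = 0.1

F = np.array([[1.0, dt],
              [0.0, 1.0]])              # state transition matrix F
H = np.array([[1.0, 0.0]])              # observation matrix H (position only)

q = 0.05                                 # process-noise intensity (tuning parameter)
Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                  [dt**2 / 2, dt]])      # discretized white-noise-acceleration Q

R = np.array([[0.5]])                    # observation noise covariance (sensor variance)
```

Since only position is observed, velocity must be inferred indirectly through the model — exactly the kind of hidden-state estimation the filter is designed for.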

Structure of the Kalman Filter

The Kalman filter alternates between two steps: prediction and update.

  • Predict: forecasts the current state from the previous estimate using the state equation
  • Update: corrects the prediction using the new observation

Role of the Kalman Gain $\boldsymbol{K}_k$

The quantity $\boldsymbol{e}_k = \boldsymbol{y}_k - \hat{\boldsymbol{y}}_k$ output by the subtractor in Figure 2 is called the innovation, representing "how much the prediction was off." The Kalman gain $\boldsymbol{K}_k$ determines how much to trust this innovation when correcting the estimate.

The objective to minimize is not the innovation $\boldsymbol{e}_k$ itself, but rather the trace of the estimation error covariance (the total variance of estimation errors across all state variables):

$$\text{tr}(\boldsymbol{P}_{k|k}) \to \min, \quad \boldsymbol{P}_{k|k} = E\!\left[(\boldsymbol{x}_k - \hat{\boldsymbol{x}}_{k|k})(\boldsymbol{x}_k - \hat{\boldsymbol{x}}_{k|k})^T\right]$$

The optimal gain derived from this minimization is (see Derivation for details):

$$\boldsymbol{K}_k = \boldsymbol{P}_{k|k-1}\boldsymbol{H}_k^T(\boldsymbol{H}_k\boldsymbol{P}_{k|k-1}\boldsymbol{H}_k^T + \boldsymbol{R}_k)^{-1}$$

Intuitively:

  • Large observation noise (large $\boldsymbol{R}_k$) → small $\boldsymbol{K}_k$ → trust the prediction more, rely less on the observation
  • Large prediction uncertainty (large $\boldsymbol{P}_{k|k-1}$) → large $\boldsymbol{K}_k$ → trust the observation more and make a larger correction
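
The predict/update cycle and the gain formula above can be sketched directly in NumPy. The model matrices are passed in as arguments, so this works for any linear model the user supplies; the scalar random-walk at the end is purely illustrative:

```python
import numpy as np

def predict(x, P, F, Q):
    """Prediction step: propagate the estimate and covariance through the model."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

def update(x_pred, P_pred, y, H, R):
    """Update step: correct the prediction using the innovation e = y - H x."""
    e = y - H @ x_pred                       # innovation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ e
    P_new = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_new, P_new

# Illustrative scalar random-walk model (F = H = 1):
x, P = np.array([0.0]), np.array([[1.0]])
F1, H1 = np.array([[1.0]]), np.array([[1.0]])
Q1, R1 = np.array([[0.01]]), np.array([[0.25]])
x, P = predict(x, P, F1, Q1)
x, P = update(x, P, np.array([1.0]), H1, R1)
```

The simple covariance update `(I - K H) P` is used here for clarity; production code often prefers the numerically more robust Joseph form.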

Comparison with the Wiener filter

Comparing with the Wiener filter's optimal solution $G(\omega) = H^*P_S(|H|^2P_S + P_N)^{-1}$, the Kalman gain is based on the same principle of weighting by the ratio of "signal uncertainty" to "noise uncertainty." While the Wiener filter uses the signal-to-noise ratio in the frequency domain, the Kalman filter uses the ratio of covariance matrices in the time domain.

For details, see Algorithm and Derivation.

Prerequisites and Limitations

Required Known Parameters

To apply the basic Kalman filter, the following parameters must be known (or set in advance):

| Parameter | Meaning | Practical handling |
|---|---|---|
| $\boldsymbol{F}_k$ | State transition matrix | Derived from physical laws (e.g., equations of motion) |
| $\boldsymbol{H}_k$ | Observation matrix | Determined from sensor characteristics |
| $\boldsymbol{Q}_k$ | Process noise covariance | Magnitude of model error (requires tuning) |
| $\boldsymbol{R}_k$ | Observation noise covariance | Determined from sensor specifications or measurements |
| $\hat{\boldsymbol{x}}_0,\, \boldsymbol{P}_0$ | Initial estimate and covariance | From prior knowledge, or initialized with large $\boldsymbol{P}_0$ |

About the subscript $k$

The subscript $k$ in the table is included for generality, indicating that the filter can handle time-varying systems. In practice, many systems are time-invariant (LTI), where $\boldsymbol{F}, \boldsymbol{H}, \boldsymbol{Q}, \boldsymbol{R}$ are constant matrices determined once and used at all time steps (e.g., GPS, inertial navigation, most sensor fusion applications). However, when the system dynamics change over time — such as during a rocket's flight (mass decreases as fuel is consumed) or an aircraft's turn (equations of motion change) — $\boldsymbol{F}_k$ must be computed from the physical model at each time step $k$.

In practice, it is rare for all of these to be known exactly. In particular, $\boldsymbol{Q}_k$ and $\boldsymbol{R}_k$ are difficult to derive theoretically and often require trial-and-error tuning.

What happens when parameters are inaccurate

  • $\boldsymbol{Q}_k$ too small → the filter overtrusts the model and ignores observations (risk of divergence)
  • $\boldsymbol{Q}_k$ too large → the filter follows observations too closely and fails to remove noise
  • $\boldsymbol{R}_k$ too small → overtrusts noisy observations
  • $\boldsymbol{R}_k$ too large → ignores observations and relies too much on prediction
  • $\boldsymbol{F}_k$ inaccurate → predictions themselves become off target

What happens with non-Gaussian noise

The Kalman filter still works even when the noise is non-Gaussian. However, the optimality guarantee changes:

  • Gaussian noise: the Kalman filter is optimal among all estimators (minimum variance estimate)
  • Non-Gaussian noise: the Kalman filter is optimal among linear estimators (BLUE: Best Linear Unbiased Estimator), but may be outperformed by nonlinear estimators

| Noise type | Kalman filter performance | More suitable methods |
|---|---|---|
| Gaussian | Optimal (cannot be improved) | None |
| Heavy-tailed (many outliers) | Pulled by outliers | Particle filter, robust KF |
| Multimodal (multiple hypotheses) | Breaks down due to single-Gaussian approximation | Particle filter, mixture KF |
| Uniform | Works but inefficient | Particle filter |

In other words, it does not "stop working" but rather "loses optimality." In practice, if the noise is approximately Gaussian, the filter performs well enough. When non-Gaussianity is significant, nonlinear methods such as the particle filter are effective.

The Leap from the Wiener Filter

With the above prerequisites and limitations in mind, let us revisit the historical significance of the Kalman filter. Before the Kalman filter, optimal filtering was dominated by the Wiener filter, developed by Norbert Wiener in the 1940s. The Wiener filter is formulated in the frequency domain and provides the optimal linear filter in the minimum mean square error (MMSE) sense for stationary processes. However, it had several fundamental limitations in practice.

| Characteristic | Wiener filter | Kalman filter |
|---|---|---|
| Processing mode | Typically batch (requires all data) | Sequential (real-time update with each observation) |
| Applicable systems | Stationary processes only | Non-stationary and time-varying systems |
| Formulation | Frequency domain (power spectra) | State space (time domain; extends naturally to multivariable) |
| Memory | Entire observation history (or autocorrelation function) | Only the current estimate and covariance |
| Multivariable | Difficult to extend (MIMO Wiener–Hopf equation) | Handled naturally via matrix operations |
| Uncertainty quantification | Residual error power only | Covariance matrix tracks estimation confidence over time |

The essential innovation of the Kalman filter was reformulating the optimal filtering problem from "spectral decomposition" to "sequential state estimation update." While the Wiener filter "designs the optimal frequency response after seeing all data," the Kalman filter "optimally corrects the estimate each time a new observation arrives." This paradigm shift made it possible to address real-world demands: real-time control, non-stationary systems, and multivariable problems.

Relationship between the two

For stationary processes as $k \to \infty$, the Kalman gain $\boldsymbol{K}_k$ converges to a constant value, and the steady-state Kalman filter becomes equivalent to the Wiener filter. In other words, the Wiener filter can be viewed as the steady-state limit of the Kalman filter.
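
This convergence is easy to observe numerically: iterating the covariance recursion for a fixed (time-invariant) model drives the gain to a constant. A minimal sketch, using an illustrative constant-velocity model:

```python
import numpy as np

# Time-invariant model (illustrative values).
F = np.array([[1.0, 0.1], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.5]])

P = 10.0 * np.eye(2)            # large initial uncertainty
K_prev = np.zeros((2, 1))
for k in range(200):
    # One pass of the covariance recursion (Riccati iteration).
    P_pred = F @ P @ F.T + Q
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    P = (np.eye(2) - K @ H) @ P_pred
    if np.max(np.abs(K - K_prev)) < 1e-10:
        break                    # gain has reached its steady-state value
    K_prev = K.copy()

print(K)                         # constant steady-state Kalman gain
```

The steady-state covariance can also be obtained directly by solving the discrete algebraic Riccati equation instead of iterating.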

How one researcher achieved so many advantages at once

What Kalman did in his 1960 paper "A New Approach to Linear Filtering and Prediction Problems" was not to design each advantage one by one, but to fundamentally change the formulation of the problem. Specifically, he reformulated Wiener's problem setting (frequency domain, stationary, infinite time) using the state-space representation from control theory (time domain, finite-dimensional, sequential update).

As a result, all the advantages listed in the table above follow automatically:

| Advantage | Why it follows automatically from the state-space representation |
|---|---|
| Sequential processing | The state equation $\boldsymbol{x}_k = \boldsymbol{F}_k\boldsymbol{x}_{k-1} + \boldsymbol{w}_k$ has a one-step-at-a-time update structure |
| Non-stationary support | $\boldsymbol{F}_k, \boldsymbol{H}_k$ carry the time subscript $k$; time variation is built in from the start, not a later generalization |
| Time-domain formulation | The state-space representation is inherently a time-domain framework |
| Memory efficiency | The state vector $\boldsymbol{x}_k$ is a sufficient statistic that compresses all past observations |
| Multivariable support | A multidimensional state vector is assumed from the outset |
| Covariance tracking | The update equation for the error covariance $\boldsymbol{P}_k$ is naturally derived within the state-space framework |

In essence, Kalman's fundamental contribution was not inventing multiple advantages individually, but rather that "a change of perspective brought everything along." It is a prime example of the power of formulation in mathematics, where the right abstraction solves many problems at once.

Intellectual context

  • The development of control theory in the 1950s, especially R.E. Bellman's dynamic programming (a sequential formulation of optimal control)
  • The limitations of Wiener-Hopf theory (difficulty extending to non-stationary and multivariable cases) were widely recognized
  • Kalman himself had deep expertise in both control theory and probability/optimization

Extensions for Imperfect Models

Many extension methods have been developed to relax the constraints of the basic Kalman filter (linearity, Gaussianity, known model parameters):

| Method | What it estimates/handles | Approach |
|---|---|---|
| Adaptive Kalman filter | $\boldsymbol{Q}_k$, $\boldsymbol{R}_k$ | Monitors innovation $\boldsymbol{e}_k$ statistics to adjust covariances online |
| Extended Kalman filter (EKF) | Nonlinear state transition/observation | Linearizes via the Jacobian (partial derivative matrix) at each time step |
| Unscented Kalman filter (UKF) | Nonlinear state transition/observation | Propagates sigma points through the nonlinear functions without linearization |
| State augmentation | Unknown parameters in $\boldsymbol{F}$, $\boldsymbol{H}$ | Adds the unknown parameters to the state vector for joint estimation |
| EM algorithm | $\boldsymbol{F}$, $\boldsymbol{H}$, $\boldsymbol{Q}$, $\boldsymbol{R}$ | Batch maximum-likelihood estimation of the model parameters |
| Particle filter | Arbitrary nonlinear/non-Gaussian models | Removes the Gaussian assumption via Monte Carlo sampling |

Practical guidelines

  • $\boldsymbol{F}_k$, $\boldsymbol{H}_k$ are linear and known → basic Kalman filter
  • $\boldsymbol{Q}_k$, $\boldsymbol{R}_k$ are uncertain → adaptive Kalman filter for dynamic adjustment
  • System is nonlinear → EKF (mild nonlinearity) or UKF (strong nonlinearity)
  • Model structure itself is unknown → state augmentation or EM
  • Non-Gaussian noise → particle filter
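
As an illustration of the adaptive approach in the table above, here is a minimal sketch of one well-known scheme, innovation-based covariance matching: $\boldsymbol{R}$ is estimated online from a sliding window of innovations using the relation $E[\boldsymbol{e}_k\boldsymbol{e}_k^T] \approx \boldsymbol{H}\boldsymbol{P}_{k|k-1}\boldsymbol{H}^T + \boldsymbol{R}$. The class name and window size are invented for the example:

```python
import numpy as np
from collections import deque

class InnovationRAdapter:
    """Hypothetical helper: estimate R from a sliding window of innovations."""

    def __init__(self, H, window=50):
        self.H = H
        self.window = deque(maxlen=window)

    def update(self, e, P_pred):
        """Store innovation e and return an updated estimate of R."""
        self.window.append(np.outer(e, e))
        C = np.mean(list(self.window), axis=0)    # sample innovation covariance
        R_hat = C - self.H @ P_pred @ self.H.T    # matching: C ≈ H P H^T + R
        # Guard against negative variances from small sample sizes.
        d = np.clip(np.diag(R_hat), 1e-8, None)
        R_hat[np.diag_indices_from(R_hat)] = d
        return R_hat

# Illustration: with H = 1 and zero prediction covariance, R_hat recovers
# the sample variance of the innovations.
adapter = InnovationRAdapter(np.array([[1.0]]), window=10)
for i in range(10):
    e = np.array([1.0 if i % 2 == 0 else -1.0])
    R_hat = adapter.update(e, np.zeros((1, 1)))
```

Short windows react quickly but estimate noisily; long windows are stable but slow to track changes — the window length is itself a tuning knob.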


Applications

  • GPS/GNSS: high-precision estimation of position and velocity from satellite signals
  • Robotics: SLAM (Simultaneous Localization and Mapping)
  • Autonomous driving: vehicle position and attitude estimation via sensor fusion
  • Aerospace: orbit estimation and attitude control for aircraft and spacecraft
  • Finance: trend estimation and forecasting of time-series data
  • Image processing: object tracking and video stabilization