Exploring Kalman Filters

Measurement Update

Exercise 3-2 from Probabilistic Robotics*

This article develops the measurement update for a simple Kalman filter per exercise 3-2 (page 81) from Probabilistic Robotics by Thrun, Bergard, and Fox. Exercise 3-2 is a continuation of exercise 3-1. Read the article on exercise 3-1 before reading this article.

1. Scenario Description

Exercise 3-2

  1. We will now add measurements to our Kalman filter. Suppose at time $t$, we can receive a noisy observation of $x$. In expectation, our sensor measures the true location. However this measurement is corrupted by Gaussian noise with covariance $\sigma^2 = 10$.

Discussion

Remember that $x$ refers to the position of our robot on it's axis. In this example, our measurement does not include the robot's acceleration. The phrase in expectation, or sensor measures the true location means the average measurement (i.e., expected value) will approach the true location as the number of measurements increase. In other words, there is no systematic bias to the measurement.

2. Measurement Model

Exercise 3-2(a)

(a) Define the measurement model. Hint: You need to define a matrix $\mathbf{C}$ and another matrix $\mathbf{Q}$ (c.f., Equation (3.6) and Table 3.1).


As determined in exercise 3-1, the state vector consists of position and acceleration:

$$ \tag{1} \vec x_t = \left\lgroup \begin{matrix}x_t \cr \dot x_t \end{matrix} \right\rgroup $$

The measurement at time $t$ is represented by the vector $\vec z_t$. In this exercise, $\vec z_t$ has length 1 and consists only of the measured position $x$ at time $t$:

$$ \tag{2} \vec z_t = \lgroup \begin{matrix} x_t \end{matrix} \rgroup $$

The matrix $\mathbf{C_t}$ defines the linear transformation between $\vec z_t$ and $\vec x_t$:

$$ \tag{3} \vec z_t = \mathbf{C_t} \vec x_t + \vec \delta_t$$

$\mathbf{C_t}$ is the matrix that converts $\vec x_t$ to $\vec z_t$. Since $\mathbf{C_t}$ does not vary with time $t$, I will drop the $t$ subscript:

$$ \tag{4} \mathbf{C} = \lgroup \begin{matrix}1 & 0 \end{matrix} \rgroup $$

Since the measurement vector is a scalar value, the covariance matrix simplifies to a scalar consisting of the variance $\delta^2 = 10$.

3. Measurement Update

Exercise 3-2(b)

(b) Implement the measurement update. Suppose at time $t = 5$, we observe a measurement $z = 5$. State the parameters of the Gaussian estimate before and after updating the KF. Plot the uncertainty ellipse before and after incorporating the measurement (see above [exercise 3-1] for instruction as to how to plot an uncertainty ellipse).


Kalman Gain

First we must calculate the Kalman gain, $\mathbf{K_t}$, which specifies to what extent the measurement will be incorporated into the estimate of position and acceleration.

$$ \tag{6} \mathbf{K_t} = \mathbf{\overline{\Sigma}}_t \mathbf{{C_t}^T} (\mathbf{C_t} \mathbf{\overline{\Sigma}}_t \mathbf{{C_t}^T} + \mathbf{Q_t})^{-1} $$

We'll use the Python sympy package to calculate the Kalman gain matrix. First, we'll create the matrices $\mathbf{Q}$, $\mathbf{C}$, and $\mathbf{\overline{\Sigma}_5}$ in Python.

In [2]:
# Import the Sympy Python package for symbolic mathematics
import sympy
sympy.init_printing()  # Attractively format matrices

# Create Q scalar
Q = sympy.Matrix([[10]])

# Create matrix C
C = sympy.Matrix([[1, 0]])
C
Out[2]:
$$\left[\begin{matrix}1 & 0\end{matrix}\right]$$

From exercise 3-1 we have the prediction at time $t=5$: $$ \tag{7} \mathbf{\overline{\Sigma}}_5 = \left\lgroup \begin{matrix} 41.25 & 12.5 \cr 12.5 & 5 \end{matrix} \right\rgroup $$

In [3]:
# Create Sigma_5
Sigma_5 = sympy.Matrix([[41.25, 12.5], [12.5, 5]])
Sigma_5
Out[3]:
$$\left[\begin{matrix}41.25 & 12.5\\12.5 & 5\end{matrix}\right]$$

Now we'll calculate $\mathbf{\Sigma_5}$. I'll perform the calculation in small steps to help visualize the matrix calculations.

Here is the first term, $\mathbf{\overline{\Sigma}}_5 \mathbf{C}^T$:

In [4]:
Sigma_5 * C.T
Out[4]:
$$\left[\begin{matrix}41.25\\12.5\end{matrix}\right]$$

Now for the beginning of the second term, $\mathbf{C} \mathbf{\overline{\Sigma}}_5$:

In [5]:
C * Sigma_5
Out[5]:
$$\left[\begin{matrix}41.25 & 12.5\end{matrix}\right]$$

$\mathbf{C} \mathbf{\overline{\Sigma}}_5 \mathbf{C}^T$:

In [6]:
C * Sigma_5 * C.T
Out[6]:
$$\left[\begin{matrix}41.25\end{matrix}\right]$$

$\mathbf{C} \mathbf{\overline{\Sigma}}_5 \mathbf{C}^T + \mathbf{Q} $:

In [7]:
C * Sigma_5 * C.T + Q
Out[7]:
$$\left[\begin{matrix}51.25\end{matrix}\right]$$

Now take the inverse, $(\mathbf{C} \mathbf{\overline{\Sigma}}_5 \mathbf{C}^T + \mathbf{Q})^{-1} $:

In [8]:
(C * Sigma_5 * C.T + Q).inv()
Out[8]:
$$\left[\begin{matrix}0.0195121951219512\end{matrix}\right]$$

Put it all together: $\mathbf{K_5} = \mathbf{\overline{\Sigma}}_5 \mathbf{{C}^T} (\mathbf{C} \mathbf{\overline{\Sigma}}_5 \mathbf{{C}^T} + \mathbf{Q})^{-1}$

In [9]:
K5 = Sigma_5 * C.T * (C * Sigma_5 * C.T + Q).inv()
K5
Out[9]:
$$\left[\begin{matrix}0.804878048780488\\0.24390243902439\end{matrix}\right]$$

Updating the Mean

The mean is updated using the following formula:

$$ \tag{8} \vec \mu_t = \overline{\vec \mu}_t + \mathbf{K}_t (\vec z_t - \mathbf{C}_t \overline{\vec \mu}_t) $$

$\overline{\vec \mu}_t$ is the robot's estimated state at time $t$, prior to taking a measurement. We are given $z_t$ = 5. Per the results of exercise 3-2:

$$ \tag{9} \overline{\vec \mu}_5 = \left\lgroup \begin{matrix} 0 \cr 0 \end{matrix} \right\rgroup $$

In [10]:
z_5 = sympy.Matrix([[5]])
mubar_5 = sympy.Matrix([[0], [0]])
mubar_5
Out[10]:
$$\left[\begin{matrix}0\\0\end{matrix}\right]$$
In [11]:
z_5 + C * mubar_5
Out[11]:
$$\left[\begin{matrix}5\end{matrix}\right]$$
In [12]:
mu_5 = mubar_5 + K5 * (z_5 + C * mubar_5)
mu_5
Out[12]:
$$\left[\begin{matrix}4.02439024390244\\1.21951219512195\end{matrix}\right]$$

The Kalman filter has decided that the robot is probably a bit closer to the center of the track than 5, the result of our measurement. Also note that the filter has picked a non-zero value of acceleration, even though no estimate of acceleration was provided by the measurement.

Updating the Covariance Matrix

The updated covariance matrix is provided by the formula:

$$ \tag{10} \mathbf{\Sigma}_t = (\mathbf{I} - \mathbf{K}_t \mathbf{C}) \mathbf{\overline\Sigma}_t $$

$\mathbf{I}$ is the identity matrix, for which all diagonal elements are equal to 1 and all other elements are equal to 0.

In [13]:
K5
Out[13]:
$$\left[\begin{matrix}0.804878048780488\\0.24390243902439\end{matrix}\right]$$
In [14]:
C
Out[14]:
$$\left[\begin{matrix}1 & 0\end{matrix}\right]$$
In [15]:
K5 * C
Out[15]:
$$\left[\begin{matrix}0.804878048780488 & 0\\0.24390243902439 & 0\end{matrix}\right]$$
In [16]:
I = sympy.Matrix([[1, 0], [0, 1]])
I
Out[16]:
$$\left[\begin{matrix}1 & 0\\0 & 1\end{matrix}\right]$$
In [17]:
I - K5 * C
Out[17]:
$$\left[\begin{matrix}0.195121951219512 & 0\\-0.24390243902439 & 1\end{matrix}\right]$$

Finally, the updated covariance matrix:

In [18]:
sigma_5p = (I - K5 * C) * Sigma_5
sigma_5p
Out[18]:
$$\left[\begin{matrix}8.04878048780488 & 2.4390243902439\\2.4390243902439 & 1.95121951219512\end{matrix}\right]$$

Let's compare the updated covariance matrix to the matrix before the measurement:

In [19]:
Sigma_5
Out[19]:
$$\left[\begin{matrix}41.25 & 12.5\\12.5 & 5\end{matrix}\right]$$

All variances and covariances have dropped significantly, meaning we have much less uncertainty in our position and acceleration after the measurement.

Plotting the 95% Uncertainty Ellipse

We'll use the following Python functions to plot the ellipse:

In [20]:
import math
import matplotlib
matplotlib.use('nbAgg')
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import matplotlib.lines as lines
# import numpy as np

def get_eigen_data(matrix):
    """Extracts eigenvalues and eigenvectors from covariance matrix.
    """
    eigenvects = matrix.eigenvects()
    if eigenvects[0][0] > eigenvects[1][0]:
        idx_max = 0
        idx_min = 1
    else:
        idx_max = 1
        idx_min = 0
        
    return {"eigvec_major": list(eigenvects[idx_max][2][0]),
            "eigvec_minor": list(eigenvects[idx_min][2][0]),
            "eigval_major": eigenvects[idx_max][0],
            "eigval_minor": eigenvects[idx_min][0],
            "theta_major": math.atan(eigenvects[idx_max][2][0][1]/
                               eigenvects[idx_max][2][0][0])*180/math.pi,
            "theta_minor": math.atan(eigenvects[idx_min][2][0][1]/
                               eigenvects[idx_min][2][0][0])*180/math.pi,
           }


def plot_ellipse(mu_pre, cov_pre, mu_post, cov_post):
    plt.close()
    
    # Setup plot
    fig = plt.figure()
    ax = fig.add_subplot(111)
  
    # Set plot size
    hor_cov = float(cov_pre.row(0).col(0)[0])
    ver_cov = float(cov_pre.row(1).col(1)[0])
    scale = 3
    ax.set_xlim(-hor_cov*scale, hor_cov*scale)
    ax.set_ylim(-ver_cov*scale, ver_cov*scale)
    ax.set_aspect('equal', adjustable='datalim')
    
    # Add pre-measurement ellipse
    eig_pre = get_eigen_data(cov_pre)
    ell_pre = mpatches.Ellipse((mu_pre[0], mu_pre[1]),
                               math.sqrt(eig_pre["eigval_major"]*5.992)*2,
                               math.sqrt(eig_pre["eigval_minor"]*5.992)*2,
                               eig_pre["theta_major"], fill=False,
                               edgecolor="blue")

    # Add posterior ellipse
    eig_post = get_eigen_data(cov_post)
    ell_post = mpatches.Ellipse((mu_post[0], mu_post[1]),
                                math.sqrt(eig_post["eigval_major"]*5.992)*2,
                                math.sqrt(eig_post["eigval_minor"]*5.992)*2,
                                eig_post["theta_major"], fill=False,
                                edgecolor="red")
    ax.add_patch(ell_pre)
    ax.add_patch(ell_post)
    
    # Add labels and title
    title = ("Robot Position at Time t = 5\nPre and Post Measurement")
    plt.title(title)
    plt.xlabel("Position")
    plt.ylabel("Velocity")
    plt.show()
    
plot_ellipse([0, 0], Sigma_5, mu_5, sigma_5p)

The blue ellipse represents the prediction, or in other words, the robot's belief prior to incorporating the measurement. The red ellipse is the belief after the measurement. Both ellipses are 95% error ellipses. Note that the belief has moved to the right and shrunk. Also note that even though the measurement had no information on acceleration, the Kalman filter algorithm was able to incorporate knowledge of the covariance to provide a reasonable estimate of acceleration.