When diving into the field of attitude detection from sensor readings a couple of answers seem omnipresent:
- You need an accelerometer and a magnetometer.
- You need to tilt compensate.
- You need to low-pass filter.
- Or use the complimentary filter.
- Or better still, use the mystical Kalman filter (but it’s too complicated to explain).
Starting from the above and having an accelerometer and a magnetometer at hand, the old question a wise man once asked is:
To tilt compensate, or not to tilt compensate, that is the question—
Whether ’tis Nobler in the mind to suffer
The Slings and Arrows of outrageous StackOverflow answers,
Or to take Arms against a Sea of troubles,
And by using the attitude matrix, end them? To tilt: to roll.
In other words: Is tilt compensation required?
The Tilt Compensation
To make a long story short, the common way to calculate yaw angle \(\psi\) (read: psi) by tilt-compensating a 3-axis magnetometer using pitch angle \(\theta\) (read: theta) and roll angle \(\phi\) (read: phi) is by re-mapping the sensor readings to the reference X-Y plane by calculating
\(\begin{align}
\begin{bmatrix}
\hat{m}_x \\ \hat{m}_y
\end{bmatrix} =
\begin{bmatrix}
cos(\theta) && sin(\phi) \cdot sin(\theta) && cos(\phi) \cdot sin(\theta) \\
0 && cos(\phi) && -sin(\theta)
\end{bmatrix} \cdot
\begin{bmatrix}
m_x \\ m_y \\ m_z
\end{bmatrix}
\end{align}
\)
In other words:
\(\begin{align}
\hat{m}_x &= m_x \cdot cos(\theta) + m_y \cdot sin(\phi) sin(\theta) + m_z \cdot cos(\phi) sin(\theta) \\
\hat{m}_y &= m_y \cdot cos(\phi) - m_z \cdot sin(\phi)
\end{align}
\)
And then taking the arctangent to get the yaw angle:
\(\begin{align}
\psi &= tan^{-1}(-\frac{\hat{m}_y}{\hat{m}_x}) \\
&= atan2(-\hat{m}_y, \hat{m}_x)
\end{align}
\)
Obviously, \(\theta\) and \(\phi\) would be needed before doing so, and they may be obtained from the accelerometer — assuming aeronautic reference system, i.e. Tait-Bryan XYZ — by calculating
\(\begin{align}
\phi &= tan^{-1}(\frac{a_y}{a_z}) \\
\theta &= tan^{-1}(-\frac{a_x}{\sqrt{a_x^2 + a_z^2}})
\end{align}
\)
Since the denominator in the \(\theta\) calculation is never negative, the angle is constrained to a range of \(\pm 90^\circ\) , whereas \(\phi\) and \(\psi\) may contain values in the range of \(\pm 180^\circ\) .
The question is, since we do have two axes pointing down and somewhat north or forward — just don’t be fooled to assume that a magnetometer will magically do that, or you will be badly surprised by the readings it’ll give you — so, since we have these axes, why the need to tilt compensate?
A good Attitude
The short answer is: We don’t need to, and the solution smells a lot like the TRIAD algorithm. (That might be due to the fact that it is designed to sort of do what we want.)
The solution takes advantage out of two facts:
- The cross product of two vectors is a vector orthogonal to them (in layman’s terms: It is in a \(90^\circ\) angle to both of them)
- The dot product of two vectors is the cosine of the angle between them.
Assuming the accelerometer measures positive up, i.e. gives a negative reading for the gravity when the top side is actually on the top, and that both sensors are somewhat calibrated to not totally bork up our calculations, then here’s the general procedure:
First, we invert the accelerometer vector so that it indeed points up. After normalization, it then forms our new local \(\vec{z}\) axis:
\(\begin{align}
\vec{z} = -\frac{\vec{a}}{|\vec{a}|} = \frac{-\vec{a}}{\sqrt{a_x^2 + a_y^2 + a_z^2}}
\end{align}
\)
We then cross \(\vec{z}\) with the magnetometer vector \(\vec{m}\) and normalize in order to get our local \(\vec{y}\) axis:
\(\begin{align}
\vec{y} = \frac{\vec{z} \times \vec{m}}{|\vec{z} \times \vec{m}|}
\end{align}
\)
Since \(\vec{y}\) is orthogonal to \(\vec{m}\) and \(\vec{z}\) , the actual pitch of the magnetometer vector does not matter as long as \(\vec{m}\) is not collinear with \(\vec{z}\) . In that case you are standing either on the north or the south pole and there really is no defined north. In addition, you are probably freezing and have other problems.
Since \(\vec{y}\) and \(\vec{z}\) are now well defined (that is, orthonormal and pointing in the general direction of success), we cross them again to get an orthogonal \(\vec{x}\) axis:
\(\begin{align}
\vec{x} = \vec{y} \times \vec{z}
\end{align}
\)
Renormalization is suggested for practical reasons, but not required mathematically.
We now need to define a set of reference vectors, say \(\vec{X}\) , \(\vec{Y}\) and \(\vec{Z}\) in upper case notation. To make life easy, we’ll simply assume them to be unit vectors, that is
\(\begin{align}
\vec{X} =
\begin{bmatrix}
1 \\ 0 \\ 0
\end{bmatrix} \quad
\vec{Y} =
\begin{bmatrix}
0 \\ 1 \\ 0
\end{bmatrix} \quad
\vec{Z} =
\begin{bmatrix}
0 \\ 0 \\ 1
\end{bmatrix} \\
\end{align}
\)
Once this is done, we can build the attitude (or direction cosine) matrix \(\underline{A}\) from the dot products of the vectors (remember that the dot product of two vectors is the cosine of the angle between them), as follows:
\(\begin{align}
\underline{A} &= \begin{bmatrix}
\vec{x} \cdot \vec{X} && \vec{y} \cdot \vec{X} && \vec{z} \cdot \vec{X} \\
\vec{x} \cdot \vec{Y} && \vec{y} \cdot \vec{Y} && \vec{z} \cdot \vec{Y} \\
\vec{x} \cdot \vec{Z} && \vec{y} \cdot \vec{Z} && \vec{z} \cdot \vec{Z}
\end{bmatrix}
\end{align}
\)
TRIAD formulates this as
\(\begin{align}
\underline{A} &=
\begin{bmatrix}
\vec{x} \; \vdots \; \vec{y} \; \vdots \; \vec{z}\end{bmatrix} \cdot
\begin{bmatrix}
\vec{X} \; \vdots \; \vec{Y} \; \vdots \; \vec{Z}\end{bmatrix} \\
&=
\begin{bmatrix}
x_x && y_x && z_x \\
x_y && y_y && z_y \\
x_z && y_z && z_z
\end{bmatrix} \cdot
\begin{bmatrix}
X_x && Y_x && Z_x \\
X_y && Y_y && Z_y \\
X_z && Y_z && Z_z
\end{bmatrix}
\end{align}
\)
Now since we just defined \(\vec{X}\) , \(\vec{Y}\) and \(\vec{Z}\) to be unit vectors, we have
\(\begin{align}
\underline{A} &=
\begin{bmatrix}
x_x && y_x && z_x \\
x_y && y_y && z_y \\
x_z && y_z && z_z
\end{bmatrix} \cdot
\begin{bmatrix}
1 \; && 0 \; && 0 \\
0 \; && 1 \; && 0 \\
0 \; && 0 \; && 1
\end{bmatrix} = \begin{bmatrix}
x_x && y_x && z_x \\
x_y && y_y && z_y \\
x_z && y_z && z_z
\end{bmatrix}
\end{align}
\)
It can’t get more convenient than that. No matter how we get there, this matrix \(\underline{A}\) contains all angular relationships between all axes, so it is just a matter of extracting them. Similar to the method described in the tilt compensation algorithm, we can get our angles from the matrix as follows:
\(\begin{align}
\theta &= -sin^{-1}(\underline{A}_{1, 3}) \\
\phi &= tan^{-1}(\underline{A}_{2, 3}, \underline{A}_{3, 3}) = atan2(\underline{A}_{2, 3}, \underline{A}_{3, 3}) \\
\psi &= tan^{-1}(\underline{A}_{1, 2}, \underline{A}_{1, 1}) = atan2(\underline{A}_{1, 2}, \underline{A}_{1, 1})
\end{align}
\)
This gives us
\(\begin{align}
\theta &= -sin^{-1}(\vec{z} \cdot \vec{X}) \\
\phi &= atan2(\vec{z} \cdot \vec{Y}, \vec{z} \cdot \vec{Z}) \\
\psi &= atan2(\vec{y} \cdot \vec{X}, \vec{x} \cdot \vec{X})
\end{align}
\)
And, again, since our reference axes are unit vectors, that reduces to:
\(\begin{align}
\theta &= -sin^{-1}(z_x) \\
\phi &= atan2(z_y, z_z) \\
\psi &= atan2(y_x, x_x)
\end{align}
\)
Again, because of the (arc)sine in the \(\theta\) calculation, pitch angle cannot exceed \(\pm 90^\circ\) .
So, did we tilt compensate? Sort of, by encoding roll and pitch information in \(\vec{y}\) and \(\vec{x}\) (mind you, we hat to calculate these). But then again we didn’t. In a way.
Final Words
One interesting (read: nasty) behaviour is that heading and roll angle seem to jump about \(180^\circ\) when the device’s actual pitch angle exceeds \(\pm 90^\circ\) , that is, rolls over. In this case, the calculated pitch angle will go down or up again, respectively, since it’s, by definition, constrained to a \(\pm 90^\circ\) range. The heading and roll angles, on the other hand, will be flipped by \(180^\circ\) , since the device is now — essentially — in a reverse position. If you never rotate a device continously for \(180^\circ\) and watch the the output, this will not be a problem. But if you do, say, if you intend to use your derived angles to update a Kalman filter, then simply take care to handle this case in order to not mess up it’s state estimate with seemingly bogus readings.