Sadly, running
conda install nb_conda
in the environment of your choice. After that, start up your notebook and select the Kernel you want either when creating a new notebook or from the notebook’s Kernel menu:
There we go.
Du betrachtest das Archiv der Kategorie Data Science.
Sadly, running
conda install nb_conda
in the environment of your choice. After that, start up your notebook and select the Kernel you want either when creating a new notebook or from the notebook’s Kernel menu:
There we go.
The Baum-Welch algorithm determines the (locally) optimal parameters for a Hidden Markov Model by essentially using three equations.
One for the initial probabilities:
$$
\begin{align}
\pi_i &= \frac{E\left(\text{Number of times a sequence started with state}\, s_i\right)}{E\left(\text{Number of times a sequence started with any state}\right)}
\end{align}
$$
Another for the transition probabilities:
$$
\begin{align}
a_{ij} &= \frac{E\left(\text{Number of times the state changed from}\, s_i \, \text{to}\,s_j\right)}{E\left(\text{Number of times the state changed from}\, s_i \, \text{to any state}\right)}
\end{align}
$$
And the last one for the emission probabilities:
$$
\begin{align}
b_{ik} &= \frac{E\left(\text{Number of times the state was}\, s_i \, \text{and the observation was}\,v_k\right)}{E\left(\text{Number of times the state was}\, s_i\right)}
\end{align}
$$
If one had a fully labeled training corpus representing all possible outcomes, this would be exactly the optimal solution: Count each occurrence, normalize and you’re good. If, however, no such labeled training corpus is available – i.e. only observations are given, no according state sequences – the expected values \(E(c)\) of these counts would have to be estimated. This can be done (and is done) using the forward and backward probabilities \(\alpha_t(i)\) and \(\beta_t(i)\), as described below.
Weiterlesen »
While planning an eleven-day trip through the Hardangervidda in Norway, I came across the age old problem of estimating the walking time for a given path on the map. While one is easily able to determine the times for the main west-east and north-south routes from a travel guide, there sadly is no information about those self-made problems (i.e. custom routes). Obviously, a simple and correct solution needs to be found.
Of course, there is no such thing. When searching for hiking time rules, two candidates pop up regularly: Naismith’s rule (including Tranter’s corrections), as well as Tobler’s hiking function.
William W. Naismith’s rule – and I couldn’t find a single scientific source – is more a rule of thumb than it is exact. It states:
For every 5 kilometres, allow one hour. For every 600 metres of ascend, add another hour.
which reads as
$$
\begin{align}
\theta &= \tan^{-1}(\frac{\Delta a}{\Delta s}) \\
t &= \Delta s \left( \frac{1\mathrm{h}}{5\mathrm{km}} \right) + \Delta a \left( \frac{1 \mathrm{h}}{0.6 \mathrm{km}} \right) \\
|\vec{w}| &= \frac{\Delta s}{t}
\end{align}
$$
where \(|\vec{w}|\) is the walking speed, \(\Delta s\) the length on the horizontal plane (i.e. “forward”), \(\Delta a\) the ascend (i.e. the difference in height) and \(\theta\) the slope.
function [w, t, slope] = naismith(length, ascend) slope = ascend/length; t = length*(1/5) + ascend*(1/0.6); w = length./t; end
That looks like
Interestingly, this implies that if you climb a 3 km mountain straight up, it will take you 5 hours. By recognising that \(5 \textrm{km} / 0.6 \textrm{km} \approx 8.3 \approx 8\), the 8 to 1 rule can be employed, which allows the transformation of any (Naismith-ish) track to a flat track by calculating
$$
\begin{align}
\Delta s_{flat} &= \Delta s + \frac{5 \mathrm{km}}{0.6 \mathrm{km}} \cdot \Delta a\\
&\approx \Delta s + 8 \cdot \Delta a
\end{align}
$$
So a track of \(20 \textrm{km}\) in length with \(1 \textrm{km}\) of ascend would make for \(20 \mathrm{km} + 8 \cdot 1 \mathrm{km} = 28 \mathrm{km}\) of total track length. Assuming an average walking speed of \(5 \mathrm{km/h}\), that route will take \(28 \mathrm{km} / 5 \mathrm{km/h} = 5.6 \mathrm{h}\), or 5 hours and 36 minutes. Although quite inaccurate, somebody found this rule to be accurate enough when comparing it against times of men running down hills in Norway. Don’t quote me on that.
Robert Aitken assumed that 5 km/h might be too much and settled for 4 km/h on all off-track surfaces. Unfortunately the Naismith rule still didn’t state anything about descent or slopes in general, so Eric Langmuir added some refinements:
When walking off-track, allow one hour for every 4 kilometres (instead of 5 km). When on a small decline of 5 to 12°, subtract 10 minutes per 300 metres (1000 feet). For any steeper decline (i.e. over 12°), add 10 minutes per 300 metres of descent.
Now that’s the stuff wonderfully non-differentiable functions are made of:
That is:
$$
\begin{align}
\theta &= \tan^{-1}(\frac{\Delta a}{\Delta s}) \\
t &= \Delta s \left( \frac{1\mathrm{h}}{5\mathrm{km}} \right) + \begin{cases}
+\Delta a \left( \frac{1 \mathrm{h}}{0.6 \mathrm{km}} \right) , & \text{if $\theta > -5^\circ$} \\
-|\Delta a| \left( \frac{\left(10/60\right) \mathrm{h}}{0.3 \mathrm{km}} \right) , & \text{if $-12^\circ \le \theta \le -5^\circ$} \\
+|\Delta a| \left( \frac{\left(10/60\right) \mathrm{h}}{0.3 \mathrm{km}} \right) , & \text{if $\theta < -12^\circ$}
\end{cases} \\
|\vec{w}| &= \frac{\Delta s}{t}
\end{align}
$$
It should be clear that 12 km/h is an highly unlikely speed, even on roads.
function [w, t, slope] = naismith_al(length, ascend, base_speed) if ~exist('base_speed', 'var') base_speed = 4; % km/h end slope = ascend/length; t = length*(1/base_speed); if slope >= 0 t = t + ascend*(1/0.6); elseif atand(slope) <= -5 && atand(slope) >= -12 t = t - abs(ascend)*((10/60)/0.3); elseif atand(slope) < -12 t = t + abs(ascend)*((10/60)/0.3); end w = length./t; end
So Waldo Tobler came along and developed his “hiking function”, an equation that assumes a top speed of 6 km/h with an interesting feature: It – though still indifferentiable – adapts gracefully to the slope of the ground. That function can be found in his 1993 report “Three presentations on geographical analysis and modeling: Non-isotropic geographic modeling speculations on the geometry of geography global spatial analysis” and looks like the following:
It boils down to the following equation of the walking speed \(|\vec{w}|\) “on footpaths in hilly terrain” (with \(s=1\)) and “off-path travel” (with \(s=0.6\)):
$$
\begin{align}
|\vec{w}| = s \cdot 6e^{-3.5 \cdot | \tan(\theta) + 0.05 |}
\end{align}
$$
where \(\tan(\theta)\) is the tangent of the slope (i.e. vertical distance over horizontal distance). By taking into account the exact slope of the terrain, this function is superior to Naismith’s rule and a much better alternative to the Langmuir bugfix, especially when used on GIS data.
function [w] = tobler(slope, scaling) w = scaling*6*exp(-3.5 * abs(slope+0.05)); end
It however lacks the one thing that makes the Naismith rule stand out: Tranter’s corrections for fatigue and fitness. (Yes, I know it gets weird.) Sadly these corrections seem to only exists in the form of a mystical table that looks, basically, like that:
Fitness in minutes | Time in hours according to Naismith’s rule | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 12 | 14 | 16 | 18 | 20 | 22 | 24 | |
15 (very fit) | 1 | 1½ | 2 | 2¾ | 3½ | 4½ | 5½ | 6¾ | 7¾ | 10 | 12½ | 14½ | 17 | 19½ | 22 | 24 |
20 | 1¼ | 2¼ | 3¼ | 4½ | 5½ | 6½ | 7¾ | 8¾ | 10 | 12½ | 15 | 17½ | 20 | 23 | ||
25 | 1½ | 3 | 4¼ | 5½ | 7 | 8½ | 10 | 11½ | 13¼ | 15 | 17½ | |||||
30 | 2 | 3½ | 5 | 6¾ | 8½ | 10½ | 12½ | 14½ | ||||||||
40 | 2¾ | 4¼ | 5¾ | 7½ | 9½ | 11½ | ||||||||||
50 (unfit) | 3¼ | 4¾ | 6½ | 8½ |
where the minutes are a rather obscure measure of how fast somebody is able to hike up 300 metres over a distance of 800 metres ($20^\circ$). With that table the rule is: If you get into nastier terrain, drop one fitness level. If you suck at walking, drop a fitness level. If you use a 20 kg backpack, drop one level. Sadly, there’s no equation to be found, so I had to make up one myself.
hours = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 22, 24]; fitness = [15, 20, 25, 30, 40, 50]; table = [1, 1.5, 2, 2.75, 3.5, 4.5, 5.5, 6.75, 7.75, 10, 12.5, 14.5, 17, 19.5, 22, 24; 1.25, 2.25, 3.25, 4.5, 5.5, 6.7, 7.75, 8.75, 10, 12.5, 15, 17.5, 20, 23, NaN, NaN; 1.5, 3, 4.25, 5.5, 7, 8.5, 10, 11.5, 13.25, 15, 17.5, NaN, NaN, NaN, NaN, NaN; 2, 3.5, 5, 6.75, 8.5, 10.5, 12.5, 14.5, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN; 2.75, 4.25, 5.75, 7.5, 9.5, 11.5, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN; 3.25, 4.75, 6.5, 8.5, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN];
By looking at the table and the mesh plot it seems that each time axis for a given fitness is logarithmic.
I did a log-log plot and it turns out that the series not only appear to be logarithmic in time, but also in fitness. By deriving the (log-log-)linear regression for each series, the following equations can be found:
$$
\begin{align}
t_{15}(t) &= e^{1.35 \,ln(t) – 1.08} \\
t_{20}(t) &= e^{1.24 \,ln(t) – 0.55} \\
t_{25}(t) &= e^{1.25 \,ln(t) – 0.33} \\
t_{30}(t) &= e^{1.31 \,ln(t) – 0.21} \\
t_{40}(t) &= e^{1.14 \,ln(t) + 0.20} \\
t_{50}(t) &= e^{1.05 \,ln(t) + 0.44} \\
\end{align}
$$
These early approximations appear to be quite good, as can be seen in the following linear plot. The last three lines \(t_{30}\), \(t_{40}\) and \(t_{50}\) however begin to drift away. That’s expected for the last two ones due to the small number of samples, but the \(t_{30}\) line was irritating.
My first assumption was that the \(t_{40}\) and \(t_{50}\) lines simply are outliers and that the real coefficient for the time variable is the (outlier corrected) mean of $1.2215 \pm 0.11207$. This would imply, that the intersect coefficient is the variable for fitness.
Unfortunately, this only seems to make things better in the log-log plot, but makes them a little bit worse in the linear world.
Equi-distant intersect coefficients also did not do the trick. Well, well. In the end, I decided to give the brute force method a chance and defined several fitting functions for the use with genetic algorithm and pattern search solvers, including exponential, third-order and sigmoidal forms. The best version I could come up with was
$$
\begin{align}
t_{corrected}(t, f) &= 0.31381 \,e^{1.2097 \,ln(t) + 0.81328 \,ln(f) – 1.7307}
\end{align}
$$
This function results in a least squared error of about 21.35 hours over all data points. The following shows the original surface from the table and the synthetic surface from the function.
A maximum deviation of about 1 hour can be seen clearly in the following error plot for the $t_{30}$ line, which really seems to be an outlier.
For comparison (here’s the original table), this is the synthetic correction table:
Fitness in minutes | Time in hours according to Naismith’s rule | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 12 | 14 | 16 | 18 | 20 | 22 | 24 | |
15 (very fit) | 1¼ | 2 | 2¾ | 3½ | 4½ | 5¼ | 6¼ | 7¼ | 8¼ | 10¼ | 12¼ | 14½ | 16½ | 18¾ | 21¼ | 23½ |
20 | 1½ | 2½ | 3½ | 4½ | 5½ | 6¾ | 7¾ | 9 | 10¼ | 12¾ | 15½ | 18¼ | 21 | 23¾ | ||
25 | 1¾ | 3 | 4 | 5¼ | 6¾ | 8 | 9½ | 10¾ | 12¼ | 15½ | 18½ | |||||
30 | 2 | 3¼ | 4¾ | 6¼ | 7¾ | 9¼ | 11 | 12½ | ||||||||
40 | 2½ | 4¼ | 6 | 7¾ | 9¾ | 11¾ | ||||||||||
50 (unfit) | 3 | 5 | 7¼ | 9½ |
Fundstück aus der MATLAB-Doku: Korrelierte normalverteilte Zufallsprozesse. Gegeben seien die Erwartungswerte \(\mu\) der Variablen, sowie deren Kovarianzen \(P\).
Mittels R = mvnrnd(µ, P, N);
können dann N Werte gezogen werden.
Zwei unkorrelierte Prozesse mit \(\mu_1 = E(X_1) = 0\) und \(\mu_2 = E(X_2) = 10\), sowie Varianzen \(\sigma^2_1 = 1\) und \(\sigma^2_2 = 2\) ließen sich wie folgt erzeugen:
N = 5000; mu = [0 10]; P = [1 0; 0 2]; r = mvnrnd(mu, P, N);
Der entsprechende Plot ergibt das typische Bild:
plot(r(:,1), r(:, 2), 'r+'); axis square;
Zwei korrelierte Prozesse mit denselben Erwartungswerten und Varianzen, aber negativer Kovarianz (\(cov(X_1, X_2) = -1\)) werden analog wie folgt erstellt:
P = [1 -1; -1 2]; r = mvnrnd(mu, P, N);
Dies liefert die erwartete “schräge” Verteilung.