## Keyword archive

You are viewing the archive for the tag Optimization.

## Quadratic interpolation given two points and one derivative

While reading up on line search algorithms in nonlinear optimization for neural network training, I came across this problem: Given a function $f(x)$, find a quadratic interpolant $q(x) = ax^2 + bx + c$ that fulfills the conditions $f(x_0) = q(x_0)$, $f(x_1) = q(x_1)$ and $f'(x_0) = q'(x_0)$. Basically this:

So I took out my scribbling pad, wrote down some equations and then, after two pages of nonsense, decided it really wasn't worth the hassle. It turns out that the simple system

\begin{align}
f(x_0) &= ax_0^2 + bx_0 + c\\
f(x_1) &= ax_1^2 + bx_1 + c\\
f'(x_0) &= 2ax_0 + b
\end{align}

for

\begin{align}
q(x) &= ax^2 + bx + c
\end{align}

has the solution

\begin{align}
a &= - \frac{f(x_0) - f(x_1) - x_0 f'(x_0) + x_1 f'(x_0)}{(x_0 - x_1)^2} \\
b &= - \frac{x_0^2 f'(x_0) - x_1^2 f'(x_0) - 2x_0 f(x_0) + 2x_0 f(x_1)}{(x_0 - x_1)^2} \\
c &= \frac{x_0^2 f(x_1) + x_1^2 f(x_0) - 2x_0 x_1 f(x_0) - x_0 x_1^2 f'(x_0) + x_0^2 x_1 f'(x_0)}{(x_0 - x_1)^2}
\end{align}

Instead of wasting your time on paper, the solution can be obtained more easily in Matlab using

```matlab
% Unknowns a, b, c; the sample data are treated as symbolic scalars:
% f_0 = f(x_0), f_1 = f(x_1), df_0 = f'(x_0)
syms a b c x_0 x_1 f_0 f_1 df_0
eqns = [f_0  == a*x_0^2 + b*x_0 + c, ...
        f_1  == a*x_1^2 + b*x_1 + c, ...
        df_0 == 2*a*x_0 + b];
[a, b, c] = solve(eqns, [a, b, c]);

syms q(x)
q(x) = simplify(a*x^2 + b*x + c);
```


Obviously, the whole purpose of this operation is to find an approximation to the local minimizer of $f(x)$, which we obtain by setting the interpolant's derivative to zero. This gives

\begin{align}
0 &\overset{!}{=} q'(x_{min}) \\
x_{min} &= -\frac{1}{2} \frac{x_0^2 f'(x_0) -x_1^2 f'(x_0) - 2 x_0 f(x_0) + 2 x_0 f(x_1)} {f(x_0) - f(x_1) - x_0 f'(x_0) + x_1 f'(x_0)}
\end{align}

We would also need to check the interpolant's second derivative $q''(x_{min})$ to ensure that the approximated minimizer is indeed a minimum of $q(x)$ by requiring $q''(x_{min}) > 0$, with the second derivative given as:

\begin{align}
q''(x) &= - 2 \frac{f(x_0) - f(x_1) - x_0 f'(x_0) + x_1 f'(x_0)}{\left( x_0 - x_1 \right)^2 }
\end{align}

The premise of the line search in minimization problems usually is that the search direction is already a direction of descent. With $f'(x_0) < 0$ and $f'(x_1) > 0$ (as would typically be the case when bracketing the local minimizer of $f(x)$), the interpolant should always be (strictly) convex. If these conditions do not hold, there might be no solution at all: one obviously won't be able to find a quadratic interpolant given the initial conditions for a function that is linear to machine precision. In that case, watch out for divisions by zero.
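To make this concrete, here is a small numerical sketch (not from the original post; the sample values and the variable names f0, f1, df0 are made up) that evaluates the minimizer formula and guards against the degenerate cases just mentioned:

```matlab
% Quadratic interpolation through (x0, f0) and (x1, f1) with slope df0 at x0.
% All values below are illustrative placeholders.
x0 = 0;  f0 = 1.0;  df0 = -2.0;   % descent direction: df0 < 0
x1 = 1;  f1 = 0.5;

denom = f0 - f1 - x0*df0 + x1*df0;          % a = -denom / (x0 - x1)^2
if abs(denom) < eps
    warning('f is (numerically) linear; no quadratic interpolant exists.');
else
    a = -denom / (x0 - x1)^2;
    if a <= 0
        warning('Interpolant is not convex; the stationary point is not a minimum.');
    end
    x_min = -0.5 * (x0^2*df0 - x1^2*df0 - 2*x0*f0 + 2*x0*f1) / denom;
end
```

With these placeholder values the interpolant is $q(x) = 1.5x^2 - 2x + 1$ and the formula returns $x_{min} = 2/3$.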

Last but not least, if the objective is to minimize $\varphi(\alpha) = f(\vec{x}_k + \alpha \vec{d}_k)$ using $q(\alpha)$, where $\vec{d}_k$ is the search direction and $\vec{x}_k$ the current starting point, such that

\begin{align}
\varphi(0) &= f(\vec{x}_k) \\
\varphi'(0) &= \nabla f(\vec{x}_k)' \vec{d}_k
\end{align}

then the above formulas simplify (substituting $x_0 = 0$ and $x_1 = \alpha$) to

\begin{align}
a &= - \frac{\varphi(0) - \varphi(\alpha) + \alpha \varphi'(0)}{\alpha^2} \\
b &= \frac{\alpha^2 \varphi'(0)}{\alpha^2} = \varphi'(0) \\
c &= \frac{\alpha^2 \varphi(0)}{\alpha^2} = \varphi(0)
\end{align}

and, more importantly, the local (approximated) minimizer at $\alpha_{min}$ simplifies to

\begin{align}
\alpha_{min} &= \frac{1}{2} \frac{\alpha^2 \varphi'(0)}{\varphi(0)-\varphi(\alpha)+\alpha\varphi'(0)}
\end{align}
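In a line-search routine this collapses to a few lines. The following helper is my own sketch (the name quad_step and its argument names are assumptions, not part of the post); it also rejects the step when the curvature $2a$ is not positive:

```matlab
function alpha_min = quad_step(alpha, phi0, dphi0, phi_a)
% Minimizer of the quadratic interpolant built from
% phi(0) = phi0, phi'(0) = dphi0 and phi(alpha) = phi_a.
    denom = phi0 - phi_a + alpha*dphi0;
    a = -denom / alpha^2;                 % highest-order coefficient, q'' = 2*a
    if abs(denom) < eps || a <= 0
        alpha_min = NaN;                  % interpolant not strictly convex: no useful step
        return;
    end
    alpha_min = 0.5 * alpha^2 * dphi0 / denom;
end
```

For a descent direction ($\varphi'(0) < 0$) with $\varphi(\alpha)$ above the tangent line at $0$, the denominator is negative and $\alpha_{min}$ comes out positive, as expected.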

If $q(\alpha)$ is required to be strongly convex, then we'll observe that

\begin{align}
q''(\alpha) &= 2a \overset{!}{\geq} m
\end{align}

for an $m > 0$, giving us that $a$ must be greater than zero (or $\epsilon$, for that matter), which is a trivial check. The following picture visualizes that this is indeed the case:

Figure: Convexity of a parabola for different highest-order coefficients $a$, with positive $b$ (top), zero $b$ (middle) and negative $b$ (bottom); the lowest-order coefficient $c$ is left out for brevity.

## Linear (binary) integer programming in Matlab

So, suppose you're in university, it's that time of the year again (i.e. exams) and you have already written some of them. Some are still left though, and you wonder: How hard can you fail (or: how good do you have to be) in the following exams, given that you do not want your mean grade to be worse than a given value?

## Linear programming

Say you're in Germany and the possible grades are [1, 1.3, 1.7, 2, .. 4] (a discrete set of steps) with 1 being the best grade and 4 being only a minor step away from a major fuckup. Given that you've already written four exams with the grades 1, 1, 1.3 and 1, and that you do not want a mean grade worse than 1.49 in the end (because 1.5 would be rounded to 2 on your diploma), but there are still 9 more exams to write, the question is: Which worst-case grades can you get in the following exams, and what would that imply for the others?
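To make the mean-grade requirement concrete (writing $g_1, \dots, g_9$ for the nine outstanding grades, which is my own notation), the constraint reads

\begin{align}
\frac{1 + 1 + 1.3 + 1 + \sum_{i=1}^{9} g_i}{13} \leq 1.49
\quad\Longleftrightarrow\quad
\sum_{i=1}^{9} g_i \leq 13 \cdot 1.49 - 4.3 = 15.07 \,.
\end{align}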

This is what's known as a linear programming or linear optimization problem, and since the values (here: the number of grades) are constrained to be discrete, it's integer programming.

The goal of linear programming is to find the arguments $x$ of the objective function $f(x)$ such that $f(x)$ is maximized, given some constraints on $x$. In Matlab, all linear programming functions try to minimize the cost function, so the problem is formulated as

\begin{align}
\underset{\vec{x}}{\min} \, f(\vec{x}) \quad \text{such that} \quad \left\{\begin{matrix} \underline{A}\cdot \vec{x} \leq \vec{b} \\ \underline{A}_{eq} \cdot \vec{x} = \vec{b}_{eq} \end{matrix}\right.
\end{align}

Obviously, maximizing an objective function is the same as minimizing its negative, so $\mathrm{max} \, f(\vec{x}) = -\mathrm{min} \left( -f(\vec{x}) \right)$. In Matlab, these kinds of problems can be solved with the linprog function.
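As a minimal, self-contained illustration of that sign flip (the toy numbers here are made up and unrelated to the grade problem), maximizing $x_1 + 2x_2$ subject to $x_1 + x_2 \leq 4$ and $x_1, x_2 \geq 0$ with linprog could look like this:

```matlab
% linprog minimizes f'*x, so negate the objective to maximize it.
f  = -[1; 2];          % maximize x1 + 2*x2  ==  minimize -(x1 + 2*x2)
A  = [1 1];  b = 4;    % x1 + x2 <= 4
lb = [0; 0];           % x1, x2 >= 0

x    = linprog(f, A, b, [], [], lb, []);
fmax = -f' * x;        % undo the sign flip to recover the maximum
```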