Notes
$$ \xdef\scal#1#2{\langle #1, #2 \rangle} \xdef\norm#1{\left\lVert #1 \right\rVert} \xdef\dist{\rho} \xdef\and{\&}\xdef\AND{\quad \and \quad}\xdef\brackets#1{\left\{ #1 \right\}} \xdef\parc#1#2{\frac {\partial #1}{\partial #2}} \xdef\mtr#1{\begin{pmatrix}#1\end{pmatrix}} \xdef\bm#1{\boldsymbol{#1}} \xdef\mcal#1{\mathcal{#1}} \xdef\vv#1{\mathbf{#1}}\xdef\vvp#1{\pmb{#1}} \xdef\ve{\varepsilon} \xdef\l{\lambda} \xdef\th{\vartheta} \xdef\a{\alpha} \xdef\vf{\varphi} \xdef\Tagged#1{(\text{#1})} \xdef\tagged*#1{\text{#1}} \xdef\tagEqHere#1#2{\href{#2\#eq-#1}{(\text{#1})}} \xdef\tagDeHere#1#2{\href{#2\#de-#1}{\text{#1}}} \xdef\tagEq#1{\href{\#eq-#1}{(\text{#1})}} \xdef\tagDe#1{\href{\#de-#1}{\text{#1}}} \xdef\T#1{\htmlId{eq-#1}{#1}} \xdef\D#1{\htmlId{de-#1}{\vv{#1}}} \xdef\conv#1{\mathrm{conv}\, #1} \xdef\cone#1{\mathrm{cone}\, #1} \xdef\aff#1{\mathrm{aff}\, #1} \xdef\lin#1{\mathrm{Lin}\, #1} \xdef\span#1{\mathrm{span}\, #1} \xdef\O{\mathcal O} \xdef\ri#1{\mathrm{ri}\, #1} \xdef\rd#1{\mathrm{r}\partial\, #1} \xdef\interior#1{\mathrm{int}\, #1} \xdef\proj{\Pi} \xdef\epi#1{\mathrm{epi}\, #1} \xdef\grad#1{\mathrm{grad}\, #1} \xdef\gradT#1{\mathrm{grad}^T #1} \xdef\gradx#1{\mathrm{grad}_x #1} \xdef\hess#1{\nabla^2\, #1} \xdef\hessx#1{\nabla^2_x #1} \xdef\jacobx#1{D_x #1} \xdef\jacob#1{D #1} \xdef\grads#1#2{\mathrm{grad}_{#1} #2} \xdef\subdif#1{\partial #1} \xdef\co#1{\mathrm{co}\, #1} \xdef\iter#1{^{[#1]}} \xdef\str{^*} \xdef\spv{\mcal V} \xdef\civ{\mcal U} \xdef\other#1{\hat{#1}} \xdef\prox{\mathrm{prox}} \xdef\sign#1{\mathrm{sign}\, #1} \xdef\brackets#1{\left( #1 \right)} $$
Let us compute $$ (Y - X \xi)^T (Y - X \xi) = \\ (Y^T - \xi^T X^T) (Y - X \xi) = \\ Y^T Y - Y^T X \xi - \xi^T X^T Y + \xi^T X^T X \xi $$
and the gradient of this expression with respect to $\xi$ is
$$ 0 - (Y^T X)^T - X^T Y + (X^T X + (X^T X)^T) \xi = \\ 0 - X^T Y - X^T Y + (X^T X + X^T X) \xi = \\ -2 X^T Y + 2 X^T X \xi $$
See the lecture on linear statistical models.
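A minimal numerical sketch of this gradient formula, assuming NumPy and arbitrary made-up data: it compares the analytic expression $-2 X^T Y + 2 X^T X \xi$ with a finite-difference gradient of $(Y - X\xi)^T(Y - X\xi)$, and then solves the normal equations obtained by setting the gradient to zero.

```python
import numpy as np

# Sanity check of the gradient formula above:
# grad_xi (Y - X xi)^T (Y - X xi) = -2 X^T Y + 2 X^T X xi.
# All data below are arbitrary, made up just for the check.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
Y = rng.normal(size=20)
xi = rng.normal(size=3)

def objective(xi):
    r = Y - X @ xi
    return r @ r

analytic = -2 * X.T @ Y + 2 * X.T @ X @ xi

# Central finite differences, one coordinate at a time.
eps = 1e-6
numeric = np.array([
    (objective(xi + eps * e) - objective(xi - eps * e)) / (2 * eps)
    for e in np.eye(3)
])
print(np.allclose(analytic, numeric, atol=1e-4))  # True

# Setting the gradient to zero gives the normal equations X^T X xi = X^T Y.
xi_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(np.allclose(xi_hat, np.linalg.lstsq(X, Y, rcond=None)[0]))  # True
```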
Proximal operator of the $l_1$-norm
We know that $\prox_ {\l \norm{\cdot}_ 1}(\vv v)$ is the solution of the minimization problem $$ \min_{\vv x} \brackets{ \lambda \norm{\vv x}_1 + \frac 1 {2} \norm{\vv x - \vv v}_2^2 } $$
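As an illustration of this definition, the minimizer can be approximated numerically for a concrete (made-up) $\lambda$ and $\vv v$; a minimal sketch assuming NumPy and SciPy:

```python
import numpy as np
from scipy.optimize import minimize

# prox_{lambda ||.||_1}(v) is the minimizer of
#   lambda * ||x||_1 + 1/2 * ||x - v||_2^2.
# lambda and v below are arbitrary example values.
lam = 0.5
v = np.array([2.0, -0.3, 0.0, 1.2])

def prox_objective(x):
    return lam * np.abs(x).sum() + 0.5 * np.sum((x - v) ** 2)

# Nelder-Mead avoids relying on gradients of the nonsmooth |.| term.
result = minimize(prox_objective, x0=np.zeros_like(v), method="Nelder-Mead",
                  options={"xatol": 1e-8, "fatol": 1e-8})
print(result.x)  # approximately [1.5, 0.0, 0.0, 0.7]
```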
Observe that $$ \grads{\vv x}{\norm{\vv x}_1} = \mtr{ \parc {|x_1|} {x_1} \\ \vdots \\ \parc {|x_n|} {x_n} }, $$ so $$ \brackets{\grads{\vv x}{\norm{\vv x}_1}}_i = \begin{cases} \sign x_i & x_i \neq 0 \end{cases} $$ and also $$ \grads{\vv x}{\norm{\vv x - \vv v}_2^2} = \grads{\vv x}{\brackets{ \sum_{i = 1}^n (x_i - v_i)^2 }} = 2 \mtr{ x_1 - v_1 \\ \vdots \\ x_n - v_n }. $$ Thus a stationary point $\hat{\vv x}$ of our problem must satisfy $$ \lambda \mtr{ \parc {|\hat x_1|} {\hat x_1} \\ \vdots \\ \parc {|\hat x_n|} {\hat x_n} } + \mtr{ \hat x_1 - v_1 \\ \vdots \\ \hat x_n - v_n } = \vv 0, $$ i.e. $\lambda \sign {\hat x_i} + \hat x_i - v_i = 0$ for every component with $\hat x_i \neq 0$.
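Solving this condition componentwise (and handling $\hat x_i = 0$ via the subdifferential of $|\cdot|$ at zero) yields the soft-thresholding formula $\hat x_i = \sign {v_i} \, \max(|v_i| - \lambda,\, 0)$; a minimal sketch, reusing the made-up example vector from above:

```python
import numpy as np

# Componentwise solution of the stationarity condition above:
# for x_i != 0, lambda * sign(x_i) + x_i - v_i = 0 gives
# x_i = v_i - lambda * sign(v_i) whenever |v_i| > lambda, and x_i = 0 otherwise
# (the zero case follows from the subdifferential of |.| at 0).
def soft_threshold(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

v = np.array([2.0, -0.3, 0.0, 1.2])  # same example vector as above
print(soft_threshold(v, lam=0.5))    # [ 1.5 -0.   0.   0.7]
```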