# De-sparsified lasso

The de-sparsified lasso is a method for constructing confidence intervals and statistical tests for single or low-dimensional components of a large parameter vector in a high-dimensional model.[1]

## 1 High-dimensional linear model

Consider the linear model ${\displaystyle Y=X\beta ^{0}+\epsilon }$ with ${\displaystyle n\times p}$ design matrix ${\displaystyle X=:[X_{1},...,X_{p}]}$ (with ${\displaystyle n\times 1}$ column vectors ${\displaystyle X_{j}}$), noise ${\displaystyle \epsilon \sim N_{n}(0,\sigma _{\epsilon }^{2}I)}$ independent of ${\displaystyle X}$, and unknown ${\displaystyle p\times 1}$ regression vector ${\displaystyle \beta ^{0}}$.

The usual estimator of ${\displaystyle \beta ^{0}}$ in this setting is the Lasso: ${\displaystyle {\hat {\beta }}^{n}(\lambda )={\underset {\beta \in \mathbb {R} ^{p}}{\operatorname {arg\,min} }}\ {\frac {1}{2n}}\left\|Y-X\beta \right\|_{2}^{2}+\lambda \left\|\beta \right\|_{1}}$
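To make the objective concrete, the following is a minimal sketch of a Lasso solver using cyclic coordinate descent with soft-thresholding. Coordinate descent is one common choice of solver, not something prescribed by the source, and the names `soft_threshold`, `lasso_cd`, and `n_iter` are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, Y, lam, n_iter=500):
    """Minimise (1/(2n)) * ||Y - X b||_2^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n) * ||X_j||^2 for each column
    resid = Y - X @ beta                   # current residual Y - X beta
    for _ in range(n_iter):                # fixed number of sweeps (assumption)
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                   # an all-zero column carries no signal
            # (1/n) * X_j^T (partial residual with coordinate j removed)
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)   # keep residual in sync
            beta[j] = new_bj
    return beta
```

With ${\displaystyle \lambda =0}$ and a full-rank design the sketch should agree with ordinary least squares, and for ${\displaystyle \lambda \geq \max _{j}|X_{j}^{T}Y|/n}$ it returns the zero vector.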

The de-sparsified lasso modifies the Lasso estimator, which fulfills the Karush-Kuhn-Tucker conditions,[2] by adding a correction term:

${\displaystyle {\hat {\beta }}^{n}(\lambda ,M)={\hat {\beta }}^{n}(\lambda )+{\frac {1}{n}}MX^{T}(Y-X{\hat {\beta }}^{n}(\lambda ))}$

where ${\displaystyle M\in \mathbb {R} ^{p\times p}}$ is, in principle, an arbitrary matrix. In practice, ${\displaystyle M}$ is chosen as a surrogate inverse covariance matrix.
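The correction step can be sketched in a few lines of NumPy. As a sanity check, the example below uses a low-dimensional design (${\displaystyle p<n}$), where the sample covariance ${\displaystyle X^{T}X/n}$ is invertible and can serve directly as ${\displaystyle M}$; in that case the correction turns any initial estimate into the ordinary least squares solution. The function name `desparsify`, the simulated data, and the stand-in initial estimate `beta_init` are assumptions made for illustration.

```python
import numpy as np

def desparsify(X, Y, beta_hat, M):
    """Return beta_hat + (1/n) * M X^T (Y - X beta_hat)."""
    n = X.shape[0]
    return beta_hat + M @ X.T @ (Y - X @ beta_hat) / n

rng = np.random.default_rng(0)
n, p = 40, 3
X = rng.standard_normal((n, p))
Y = X @ np.array([1.0, 0.0, -2.0]) + 0.1 * rng.standard_normal(n)

beta_init = np.array([0.8, 0.0, -1.7])   # stand-in for a Lasso fit (assumption)
Sigma_hat = X.T @ X / n                  # sample covariance of the design
M = np.linalg.inv(Sigma_hat)             # exact inverse, only possible when p < n
beta_desp = desparsify(X, Y, beta_init, M)
beta_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
```

When ${\displaystyle p>n}$ the matrix ${\displaystyle X^{T}X/n}$ is singular, so an exact inverse is unavailable and ${\displaystyle M}$ has to be a surrogate.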

## 2 Generalized linear model

De-sparsified ${\displaystyle l_{1}}$-norm penalized estimators and the corresponding theory also apply to models with convex loss functions, such as generalized linear models.

Consider ${\displaystyle 1\times p}$ vectors of covariates ${\displaystyle x_{i}\in {\mathcal {X}}\subset \mathbb {R} ^{p}}$ and univariate responses ${\displaystyle y_{i}\in {\mathcal {Y}}\subset \mathbb {R} }$ for ${\displaystyle i=1,...,n}$, together with a loss function ${\displaystyle \rho _{\beta }(y,x)=\rho (y,x\beta )}$, ${\displaystyle \beta \in \mathbb {R} ^{p}}$, which is assumed to be strictly convex in ${\displaystyle \beta }$.

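The ${\displaystyle l_{1}}$-norm regularized estimator is ${\displaystyle {\hat {\beta }}={\underset {\beta }{\operatorname {arg\,min} }}(P_{n}\rho _{\beta }+\lambda \left\|\beta \right\|_{1})}$, where ${\displaystyle P_{n}\rho _{\beta }:={\frac {1}{n}}\sum _{i=1}^{n}\rho _{\beta }(y_{i},x_{i})}$ denotes the empirical average of the loss.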
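As an illustration of this objective, the sketch below evaluates the penalized empirical risk for one particular convex loss, the logistic loss ${\displaystyle \rho (y,x\beta )=-y\,x\beta +\log(1+e^{x\beta })}$ with ${\displaystyle y\in \{0,1\}}$. The choice of logistic loss is an example rather than something fixed by the text, ${\displaystyle P_{n}}$ is assumed to denote the average over the ${\displaystyle n}$ observations, and the function name `penalized_logistic_risk` is hypothetical.

```python
import numpy as np

def penalized_logistic_risk(beta, X, y, lam):
    """Evaluate P_n rho_beta + lam * ||beta||_1 for the logistic loss, y in {0, 1}."""
    eta = X @ beta                                   # linear predictors x_i beta
    # rho(y, eta) = -y * eta + log(1 + exp(eta)); logaddexp avoids overflow
    loss = -y * eta + np.logaddexp(0.0, eta)
    return loss.mean() + lam * np.abs(beta).sum()    # empirical average + penalty
```

Minimising this function over ${\displaystyle \beta }$, for example with a proximal gradient method, would give the regularized estimator for logistic regression.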

Similarly, the nodewise Lasso with matrix input is defined as follows. Denote by ${\displaystyle {\hat {\Sigma }}}$ a matrix that we want to approximately invert using the nodewise Lasso.

The nodewise Lasso regression for the ${\displaystyle j}$th node, which is used to build the de-sparsified ${\displaystyle l_{1}}$-norm regularized estimator, is: ${\displaystyle {\hat {\gamma }}_{j}:={\underset {\gamma \in \mathbb {R} ^{p-1}}{\operatorname {arg\,min} }}\left({\hat {\Sigma }}_{j,j}-2{\hat {\Sigma }}_{j,/j}\gamma +\gamma ^{T}{\hat {\Sigma }}_{/j,/j}\gamma +2\lambda _{j}\left\|\gamma \right\|_{1}\right)}$

where ${\displaystyle {\hat {\Sigma }}_{j,/j}}$ denotes the ${\displaystyle j}$th row of ${\displaystyle {\hat {\Sigma }}}$ without the diagonal element ${\displaystyle (j,j)}$, and ${\displaystyle {\hat {\Sigma }}_{/j,/j}}$ is the submatrix of ${\displaystyle {\hat {\Sigma }}}$ without the ${\displaystyle j}$th row and ${\displaystyle j}$th column.
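The nodewise problem for a single index ${\displaystyle j}$ can be sketched with the same coordinate descent idea, working directly on the matrix ${\displaystyle {\hat {\Sigma }}}$. This is a minimal illustration that assumes ${\displaystyle {\hat {\Sigma }}}$ is symmetric with a positive diagonal; the solver choice and the names `nodewise_gamma` and `n_iter` are hypothetical, and `j` is zero-based as usual in Python.

```python
import numpy as np

def nodewise_gamma(Sigma_hat, j, lam_j, n_iter=500):
    """Coordinate descent for the nodewise Lasso objective with matrix input."""
    p = Sigma_hat.shape[0]
    rest = np.array([k for k in range(p) if k != j])   # all indices except j
    A = Sigma_hat[np.ix_(rest, rest)]                  # Sigma_hat_{/j,/j}
    b = Sigma_hat[j, rest]                             # Sigma_hat_{j,/j}
    gamma = np.zeros(p - 1)
    for _ in range(n_iter):                            # fixed number of sweeps
        for k in range(p - 1):
            if A[k, k] <= 0.0:
                continue                               # skip degenerate coordinates
            # b_k minus the contribution of all coordinates other than k
            z = b[k] - A[k] @ gamma + A[k, k] * gamma[k]
            # soft-threshold at lam_j, then rescale by the diagonal entry
            gamma[k] = np.sign(z) * max(abs(z) - lam_j, 0.0) / A[k, k]
    return gamma
```

Setting ${\displaystyle \lambda _{j}=0}$ with an invertible ${\displaystyle {\hat {\Sigma }}_{/j,/j}}$ recovers the unpenalized solution ${\displaystyle {\hat {\Sigma }}_{/j,/j}^{-1}{\hat {\Sigma }}_{/j,j}}$, which is a convenient check of the update rule.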

## References

1. ^ van de Geer, Sara; Bühlmann, Peter; Ritov, Ya'acov; Dezeure, Ruben (2014). "On asymptotically optimal confidence regions and tests for high-dimensional models". The Annals of Statistics. 42: 1162–1202. doi:10.1214/14-AOS1221.
2. ^ Tibshirani, Ryan; Gordon, Geoff. "Karush-Kuhn-Tucker conditions" (PDF).