
The context is ordinary multivariate regression with $k > 1$ regressors, i.e. $Y = X\beta + \epsilon$, where $Y \in \mathbb{R}^{n \times 1}$ is the vector of the predicted variable, $X \in \mathbb{R}^{n \times (k+1)}$ is the matrix of regressor variables (with a column of ones in the first column), and $\beta \in \mathbb{R}^{(k+1) \times 1}$ is the vector of coefficients, including the intercept.

Say, I have already estimated $\beta$ as $\hat{\beta} = (X'X)^{-1} X'Y.$

I have to solve the following program:

minimize $f(\beta) = L\beta$ (where $L$ is a fixed $1 \times (k+1)$ vector) subject to
$$\frac{(\beta-\hat{\beta})' \, X'X \, (\beta-\hat{\beta})}{(Y - X\hat{\beta})' (Y - X\hat{\beta})} \leq c$$
for a given value $c$.

Note that this is a linear optimization program in $\beta$ with a single quadratic constraint.

I don't understand how to solve this optimization problem. The online resources I have gone through all involve manually computing gradients of the objective and constraint functions, which I want to avoid (at least doing so by hand).

Can you please help solve this optimization problem in R? The inputs would be $X$, $Y$, $\hat{\beta}$, $L$ and $c$.

Please let me know if any further information is required - the set-up is pretty general.

2 Answers


We have the quadratically constrained linear program (QCLP) in $\beta \in \mathbb R^p$

$$\begin{array}{ll} \text{minimize} & \langle \mathrm w, \beta \rangle\\ \text{subject to} & \| \mathrm X (\beta - \hat{\beta}) \|_2^2 \leq \gamma^2 \| \mathrm y - \mathrm X \hat{\beta} \|_2^2\end{array}$$

where $\gamma > 0$. In the question's notation, $\mathrm w = L^{\top}$ and $\gamma^2 = c$. Let

$$r := \gamma \| \mathrm y - \mathrm X \hat{\beta} \|_2$$

and

$$\mathrm z := \mathrm X (\beta - \hat{\beta})$$

If $\mathrm X$ has full column rank, then

$$\beta = \hat{\beta} + (\mathrm X^{\top} \mathrm X)^{-1} \mathrm X^{\top} \mathrm z$$

Hence, we have the following QCLP in $\mathrm z \in \mathbb R^n$

$$\begin{array}{ll} \text{minimize} & \langle \mathrm w, (\mathrm X^{\top} \mathrm X)^{-1} \mathrm X^{\top} \mathrm z \rangle\\ \text{subject to} & \| \mathrm z \|_2^2 \leq r^2\end{array}$$

which can be rewritten, using $\langle \mathrm w, \mathrm A \mathrm z \rangle = \langle \mathrm A^{\top} \mathrm w, \mathrm z \rangle$ with $\mathrm A := (\mathrm X^{\top} \mathrm X)^{-1} \mathrm X^{\top}$, as

$$\begin{array}{ll} \text{minimize} & \langle \tilde{\mathrm w}, \mathrm z \rangle\\ \text{subject to} & \| \mathrm z \|_2^2 \leq r^2\end{array}$$

where $\tilde{\mathrm w} := \mathrm X (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w$. Since the objective function is linear (and nonconstant whenever $\tilde{\mathrm w} \neq 0$) and the feasible region is a closed Euclidean ball, the minimum is attained on the boundary of the ball of radius $r$ centered at the origin. Hence, let us solve the following equality-constrained (non-convex) QCLP instead

$$\begin{array}{ll} \text{minimize} & \langle \tilde{\mathrm w}, \mathrm z \rangle\\ \text{subject to} & \| \mathrm z \|_2^2 = r^2\end{array}$$

We define the Lagrangian

$$\mathcal L (\mathrm z, \lambda) := \langle \tilde{\mathrm w}, \mathrm z \rangle - \frac{\lambda}{2} (\mathrm z^{\top} \mathrm z - r^2)$$

Stationarity in $\mathrm z$ gives $\tilde{\mathrm w} = \lambda \mathrm z$, so $\mathrm z$ is parallel to $\tilde{\mathrm w}$; combined with $\| \mathrm z \|_2 = r$ and choosing the sign that minimizes the objective, this produces the minimizer

$$\mathrm z_{\min} := - r \, \frac{\tilde{\mathrm w} \,\,}{\| \tilde{\mathrm w} \|_2} = - r \, \frac{\mathrm X (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \,\,}{\| \mathrm X (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \|_2}$$

Thus, the minimizer is

$$\boxed{\beta_{\min} := \hat{\beta} + (\mathrm X^{\top} \mathrm X)^{-1} \mathrm X^{\top} \mathrm z_{\min} = \hat{\beta} - r \, \frac{(\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \,\,}{\| \mathrm X (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \|_2}}$$

and the minimum is

$$\begin{array}{rl} \langle \mathrm w, \beta_{\min} \rangle &= \langle \mathrm w, \hat{\beta} \rangle - r \, \dfrac{\mathrm w^{\top} (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \,\,}{\| \mathrm X (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w \|_2}\\\\ &= \langle \mathrm w, \hat{\beta} \rangle - r \, \dfrac{\mathrm w^{\top} (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w}{\sqrt{\mathrm w^{\top} (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w}}\\\\ &= \langle \mathrm w, \hat{\beta} \rangle - r \sqrt{\mathrm w^{\top} (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w}\end{array}$$
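A direct R implementation of this closed-form solution might look as follows. This is a minimal sketch in the question's notation, with $\mathrm w = L^{\top}$ and $r = \sqrt{c \cdot \mathrm{RSS}}$; the function name qclp_closed_form is arbitrary, and solve() is used for clarity (a QR-based solve would be more numerically stable):

# Closed-form solution: minimize L %*% beta subject to
# (beta - beta_hat)' X'X (beta - beta_hat) <= c * RSS
qclp_closed_form <- function(X, Y, beta_hat, L, c) {
  w         <- as.vector(L)                  # w = L'
  RSS       <- sum((Y - X %*% beta_hat)^2)   # residual sum of squares
  r         <- sqrt(c * RSS)                 # radius, with gamma = sqrt(c)
  XtX_inv_w <- solve(crossprod(X), w)        # (X'X)^{-1} w
  denom     <- sqrt(sum(w * XtX_inv_w))      # ||X (X'X)^{-1} w||_2 = sqrt(w' (X'X)^{-1} w)
  beta_min  <- as.vector(beta_hat) - r * XtX_inv_w / denom
  list(beta_min = beta_min, value = sum(w * beta_min))
}

The returned value should match $\langle \mathrm w, \hat{\beta} \rangle - r \sqrt{\mathrm w^{\top} (\mathrm X^{\top} \mathrm X)^{-1} \mathrm w}$ above.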


The nloptr package lets you handle such problems. You can specify an algorithm there. Many of the algorithms require gradients. However, the COBYLA algorithm (Constrained Optimization BY Linear Approximations) does not. It approximates the solution until it is reasonably close to the optimum (you can specify the tolerance) or until it reaches the maximum number of iterations you allow. An R code skeleton for this could look like this:

library(nloptr)

eval_f <- # here you specify the objective function

x0 <- # specify a starting point; you can use c(0, ..., 0), for example

# These options select the algorithm, allow it a maximum of 200 iterations,
# and make it stop once the relative change falls below 1.0e-8. You can also
# add the option xtol_abs. Feel free to change the numbers as needed:
opts <- list("algorithm" = "NLOPT_LN_COBYLA", "xtol_rel" = 1.0e-8, "maxeval" = 200)

eval_g_ineq <- # here you add your inequality constraint, in the form g(beta) <= 0

# Specify upper and lower bounds of the variables
# (if beta is unbounded, remove these two and drop lb/ub from the call):
lb <- c(0, 0, ..., 0)
ub <- c( ... )

# Finally: use nloptr() to solve the problem
results <- nloptr(x0 = x0, eval_f = eval_f, eval_g_ineq = eval_g_ineq,
                  lb = lb, ub = ub, opts = opts)

As you did not provide a lot of information about the data, I could not give any more detailed information concerning the constraints and the objective.
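That said, since the question does spell out the objective $L\beta$ and the quadratic constraint, here is one hedged way to fill in the skeleton for that specific problem. This is a sketch only; it assumes X, Y, beta_hat, L and c are already defined as in the question, and drops lb/ub since $\beta$ is unbounded there:

library(nloptr)

eval_f <- function(beta) sum(L * beta)          # objective: L %*% beta

RSS <- sum((Y - X %*% beta_hat)^2)              # residual sum of squares
eval_g_ineq <- function(beta) {                 # constraint in the form g(beta) <= 0
  d <- beta - beta_hat
  as.numeric(crossprod(X %*% d)) / RSS - c
}

x0 <- as.vector(beta_hat)                       # beta_hat is always feasible

opts <- list("algorithm" = "NLOPT_LN_COBYLA", "xtol_rel" = 1.0e-8, "maxeval" = 1000)

results <- nloptr(x0 = x0, eval_f = eval_f, eval_g_ineq = eval_g_ineq, opts = opts)
results$solution                                # approximate minimizer

Starting from beta_hat guarantees a feasible initial point; the COBYLA iterates should then move toward the boundary of the constraint set, and the result can be checked against the closed-form answer above.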
