10

While working through PhD entrance exams, I came across the following problem:

Minimize the function $f(x)=- \sum_{i=1}^n \ln(\alpha_i +x_i)$ for fixed $\alpha_i >0$ under the conditions: $\sum_{i=1}^n x_i =1$ and $x_i \ge0$.

I tried to use KKT multipliers (generalized Lagrange multipliers), but I ran into some difficulties. Namely, we get the following system of conditions:

$$\frac{1}{\alpha_i + x_i}=-\mu_i + \lambda$$ $$\sum_{i=1}^n x_i = 1$$ $$\mu_i x_i =0$$ $$\mu_i \ge 0$$ $$x_i \ge 0.$$ I cannot even show that this system always has a solution, let alone give an explicit solution in terms of the $\alpha_i$ (one can determine the $x_i$ from the first condition and plug them into the second and third, but then one has to guess which $\mu_i$ are $0$, and that depends heavily on the $\alpha_i$). My second doubt is that this looks too complicated for an entrance exam, so perhaps someone sees an easier, more clever solution?

J.E.M.S
  • 2,718

4 Answers

6

This is the standard water-filling problem (see, for example, http://www.comm.utoronto.ca/~weiyu/loading_icc.pdf ). You will find this problem in any communications textbook, including https://people.eecs.berkeley.edu/~dtse/book.html (Chapter 5, equations (5.39)-(5.40)).

Rather than repeating the steps, I would direct you to http://www.net.in.tum.de/fileadmin/TUM/NET/NET-2011-07-2/NET-2011-07-2_20.pdf : Section 5.3 treats exactly the problem you asked here, and the details of the solution are well explained there.

The solution is $x_i = \left(\frac{1}{\lambda} - \alpha_i\right)^+$, where $d^+\triangleq \max(d,0)$ and $\lambda$ is chosen so that $\sum_i x_i = 1$. Thus $\lambda$ solves $\sum_i \left(\frac{1}{\lambda} - \alpha_i\right)^+ = 1$, which is an equation in a single variable. Once $\lambda$ is found, the $x_i$ are determined.
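
For concreteness, here is a minimal numerical sketch of this recipe (a sketch only, assuming Python with NumPy; the name `water_fill` is just an illustrative label): it locates the water level $\nu = 1/\lambda$ by bisection, using that the left-hand side is nondecreasing in $\nu$, and then reads off the $x_i$.

```python
import numpy as np

def water_fill(alpha, total=1.0, tol=1e-12):
    """Find the water level nu = 1/lambda solving
    sum_i max(nu - alpha_i, 0) = total by bisection,
    then return x_i = max(nu - alpha_i, 0)."""
    alpha = np.asarray(alpha, dtype=float)
    lo = alpha.min()              # here the left-hand side is 0 < total
    hi = alpha.max() + total      # here it is at least total
    while hi - lo > tol:
        nu = 0.5 * (lo + hi)
        if np.maximum(nu - alpha, 0.0).sum() < total:
            lo = nu
        else:
            hi = nu
    return np.maximum(0.5 * (lo + hi) - alpha, 0.0)

x = water_fill([0.2, 0.5, 1.0])
print(x, x.sum())   # [0.65 0.35 0.  ] 1.0 -- the largest alpha gets nothing
```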

Vaneet
  • 1,503
3

Let's do it in a few steps.

Firstly, there is a solution: you minimize a continuous function over a nonempty compact domain, so a solution exists by the Extreme Value Theorem.

Secondly, the solution is unique: the objective function is strictly convex and the feasible set is convex.

Thirdly, it must satisfy your KKT (or Fritz John) conditions by a standard application of the constraint qualification: for any feasible point, the gradients of the binding constraints are linearly independent.

Fourthly, how can you think about solving the KKT conditions?

Your first collection of KKT conditions looks much the same across the different coordinates, and this helps one realize that large $\alpha_i$ lead to small $x_i$. Formally, if $\alpha_i \geq \alpha_j$, then $x_i \leq x_j$!

Why? Well, the inequality $x_i \leq x_j$ is surely true if $x_i$ is zero, so suppose it is positive. Then its multiplier $\mu_i$ must be zero and $$ \frac{1}{\alpha_i + x_i} = \lambda - \mu_i = \lambda \geq \lambda - \mu_j = \frac{1}{\alpha_j + x_j}. $$ So $\alpha_j + x_j \geq \alpha_i + x_i$. Rewriting $x_i - x_j \leq \alpha_j - \alpha_i \leq 0$, so $x_i \leq x_j$.

If we assume without loss of generality that $$ \alpha_1 \leq \alpha_2 \leq \cdots \leq \alpha_n, $$ it follows that $$ x_1 \geq x_2 \geq \cdots \geq x_n. $$ So the first couple of coordinates are positive, the remaining ones might be zero. Formally, there is some $k$ in $\{1, \ldots, n\}$ such that $x_1, \ldots, x_k$ are positive and $x_{k+1}, \ldots, x_n$ are zero. But which $k$?

This is a bit of simple linear algebra. For each possible $k$, you know exactly what the coordinates $x_1, \ldots, x_k$ are (and the others are zero!): by complementary slackness, $\mu_1 = \cdots = \mu_k = 0$, so $$ \frac{1}{\alpha_1 + x_1} = \frac{1}{\alpha_i + x_i} $$ implies that $x_i = x_1 + (\alpha_1 - \alpha_i)$ for $i = 1, \ldots, k$. And $$ 1 = x_1 + \cdots + x_k = k x_1 + \sum_{i=1}^k (\alpha_1 - \alpha_i). $$ Rewriting gives $$ x_1 = \frac{1 + \sum_{i=1}^k (\alpha_i - \alpha_1)}{k}. $$ Since we expressed the other $x_i$ in terms of $x_1$, we now know all coordinate values.

Also remember that $x_k$ was the smallest of the positive coordinates, so we must have $$ x_k = x_1 + \alpha_1 - \alpha_k = \frac{1+\sum_{i=1}^k (\alpha_i - \alpha_k)}{k} > 0, $$ or, equivalently, since $k$ is positive, $$ 1+\sum_{i=1}^k (\alpha_i - \alpha_k) > 0, $$ which tells you pretty precisely what the candidates for $k$ are. Substituting these very few candidates into the goal function will show you that you need to choose this $k$ as large as possible.

I will leave those computations to you: the main insights are that large $\alpha_i$ lead to small $x_i$ and that you can easily express your vector $(x_1, \ldots, x_n)$ in terms of these $\alpha_i$ and $k$, which is the main thing: you don't need to worry overly about the exact value of the Lagrange multipliers.
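
For concreteness, here is a short sketch of this procedure (a sketch only, assuming Python with NumPy; the helper name `solve_by_k` is illustrative): sort the $\alpha_i$, try $k = n, n-1, \ldots$ until $x_k > 0$, and read off the coordinates.

```python
import numpy as np

def solve_by_k(alpha, total=1.0):
    """Sort the alphas ascending, then take the largest k for which
    the common value nu = alpha_i + x_i of the k active coordinates
    exceeds alpha_k, i.e. x_k = nu - alpha_k > 0."""
    a = np.sort(np.asarray(alpha, dtype=float))
    for k in range(len(a), 0, -1):
        nu = (total + a[:k].sum()) / k   # candidate common value alpha_i + x_i
        if nu > a[k - 1]:                # then x_k > 0: this k is feasible
            x_sorted = np.maximum(nu - a, 0.0)
            break
    x = np.empty_like(x_sorted)          # undo the sort so x matches
    x[np.argsort(alpha)] = x_sorted      # the original alpha order
    return x

print(solve_by_k([1.0, 0.2, 0.5]))       # [0.   0.65 0.35]
```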

Mark
  • 532
1

The Lagrangian for the problem is $$L(x,\lambda) = - \sum_{i=1}^n \ln (\alpha_i + x_i) - \lambda \left(1- \sum_{i=1}^n x_i\right) = - \lambda + \sum_{i=1}^n \left( \lambda x_i - \ln(\alpha_i + x_i) \right).$$ According to the Lagrangian sufficiency theorem (see Theorem 2.1 in [1]), if there exist $\lambda^{\ast}$ and $x^{\ast}$ such that

  1. $x^{\ast}_i \geq 0$
  2. $L(x^{\ast}, \lambda^{\ast}) \leq L(x, \lambda^{\ast} )~\forall x \geq 0$, and
  3. $\sum_i x^{\ast}_i = 1$,

then $x^{\ast}$ is optimal for the original constrained minimisation problem. For the specific problem, we can see this because $$f(x^{\ast}) = L(x^{\ast},\lambda^{\ast}) \leq \inf_{x \geq 0 } L(x,\lambda^{\ast}) \leq \inf_{x \geq 0, \sum_i x_i = 1 } L(x,\lambda^{\ast}) = \inf_{x \geq 0, \sum_i x_i = 1 } f(x)$$ where the first equality is by (3) and the first inequality is by (2). The proof of the general LST is similar.

To satisfy (2), $x^{\ast}$ must minimize the Lagrangian over $x \geq 0$. Looking at the rightmost expression for the Lagrangian, we see that it does so if and only if, for each $i$, $x^{\ast}_i$ minimizes $\lambda x_i - \ln(\alpha_i + x_i)$ over $x_i \geq 0$. Now $$\frac{d}{d x_i} \left( \lambda x_i - \ln(\alpha_i + x_i) \right) = \lambda - \frac{1}{\alpha_i+x_i},$$ which is zero when $x_i = \frac{1}{\lambda} - \alpha_i$, positive when $x_i$ is greater than this value, and negative otherwise. Therefore, for any $\lambda^{\ast}$, (1) and (2) are satisfied if we set $x_i^{\ast} = \max \left\{ \frac{1}{\lambda^{\ast}} - \alpha_i , 0 \right\}$, and then (3) is also satisfied if we choose $\lambda^{\ast}$ such that $$ \sum_i \max \left\{ \frac{1}{\lambda^{\ast}} - \alpha_i , 0 \right\} = 1$$ (as stated in Vaneet's answer).

The left-hand side is a piecewise linear function of $\frac{1}{\lambda^{\ast}}$, which is zero for $0 < \frac{1}{\lambda^{\ast}} \leq \min_i \alpha_i $ and strictly increasing for all larger values of $\frac{1}{\lambda^{\ast}}$, so the equation always has a unique solution. Figure 6 in Vaneet's third link shows the water-filling interpretation of the solution.
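
As a quick numerical sanity check of the sufficiency argument (a sketch only, assuming Python with NumPy): compute $x^{\ast}$ from the formula above and verify that no randomly drawn feasible point does better.

```python
import numpy as np

a = np.array([0.2, 0.5, 1.0])
f = lambda x: -np.log(a + x).sum()

# water level nu = 1/lambda*: bisect  sum_i max(nu - a_i, 0) = 1
lo, hi = a.min(), a.max() + 1.0
for _ in range(100):
    nu = 0.5 * (lo + hi)
    if np.maximum(nu - a, 0.0).sum() < 1.0:
        lo = nu
    else:
        hi = nu
x_star = np.maximum(nu - a, 0.0)

# sufficiency in action: no random feasible point beats x_star
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.dirichlet(np.ones(len(a)))   # uniform draw from the simplex
    assert f(x_star) <= f(x) + 1e-9
print(x_star, f(x_star))                 # [0.65 0.35 0.  ]  ~0.325
```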

[1] Richard Weber's lecture notes on Optimization: http://www.statslab.cam.ac.uk/~rrw1/opt/O.pdf

wdm81
  • 441
0

Let

$$f(x) := - \sum_{i=1}^n \ln(\alpha_i +x_i) = - \ln\left(\prod_{i=1}^n (\alpha_i + x_i)\right)$$

be the objective function. Using the AM-GM inequality,

$$\frac{1}{n}\sum_{i=1}^n (\alpha_i + x_i) = \frac{1}{n}\sum_{i=1}^n \alpha_i + \frac{1}{n}\underbrace{\sum_{i=1}^n x_i}_{=1} \geq \left(\prod_{i=1}^n (\alpha_i + x_i)\right)^{\frac{1}{n}}$$

Thus,

$$\left[ \frac{1}{n} \left(1 + \sum_{i=1}^n \alpha_i \right)\right]^n \geq \prod_{i=1}^n (\alpha_i + x_i)$$

As the logarithm is monotonically increasing,

$$n \cdot \ln\left( \dfrac{1 + \sum_{i=1}^n \alpha_i}{n}\right) \geq \ln \left(\prod_{i=1}^n (\alpha_i + x_i)\right)$$

Reversing the sign,

$$-n \cdot \ln\left( \dfrac{1 + \sum_{i=1}^n \alpha_i}{n}\right) \leq -\ln \left(\prod_{i=1}^n (\alpha_i + x_i)\right) = f (x)$$

Hence,

$$f (x) \geq n \cdot \ln (n) -n \cdot \ln\left( 1 + \sum_{i=1}^n \alpha_i\right)$$
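
A quick numerical check (a sketch, assuming NumPy; the minimizer for the example $\alpha = (0.2, 0.5, 1.0)$ is taken from the water-filling answers above) shows the bound holds but need not be attained:

```python
import numpy as np

a = np.array([0.2, 0.5, 1.0])
n = len(a)

bound = n * np.log(n) - n * np.log(1.0 + a.sum())        # ~0.3161
# actual minimum for this alpha (water-filling): x = (0.65, 0.35, 0)
f_min = -np.log(a + np.array([0.65, 0.35, 0.0])).sum()   # ~0.3250

# bound < f_min: equality in AM-GM would force x_3 = 0.9 - 1.0 < 0
print(bound, f_min)
```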

  • 2
    Yes, you have given an estimate, but you have not found the minimum: equality in AM-GM holds iff all components are equal, and in our case they may have no such opportunity, since the differences between the $\alpha_i$ might be huge and the $x_i$, which lie in $[0,1]$, cannot compensate for that. – J.E.M.S Jun 04 '16 at 10:18