
Fix $n > 1$ and let $[n] := \{ 1, 2, \dots, n \}$. Which probability mass function (PMF) over $[n]$ has the largest $2$-norm?


Doodling for the cases $n \in \{2,3\}$ suggests that

  • the maximal $2$-norm is attained when the support of the PMF is a singleton, i.e., at the "corners" of the probability simplex

  • the minimal $2$-norm is attained when the PMF is uniform over $[n]$

In essence, we have the following (non-convex) quadratic program (QP)

$$ \begin{array}{ll} \underset {{\bf x} \in \Bbb R^n} {\text{maximize}} & \| {\bf x} \|_2^2 \\ \text{subject to} & {\bf 1}_n^\top {\bf x} = 1 \\ & {\bf x} \geq {\bf 0}_n \end{array} $$
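As a quick numerical sanity check of this QP (not part of the derivation; a stdlib-only Python sketch, with the sampling helper `random_pmf` being my own name), one can sample feasible points from the probability simplex and confirm that the objective $\| {\bf x} \|_2^2$ stays between $1/n$ and $1$:

```python
import random

random.seed(0)
n = 5

def random_pmf(n):
    # Draw a point on the probability simplex via normalized exponentials
    # (this is the standard Dirichlet(1, ..., 1) construction).
    e = [random.expovariate(1.0) for _ in range(n)]
    s = sum(e)
    return [v / s for v in e]

# ||x||_2^2 for 10,000 random feasible points of the QP.
vals = [sum(p * p for p in random_pmf(n)) for _ in range(10_000)]

# The vertex e_1 attains ||x||_2^2 = 1; the uniform PMF attains 1/n.
assert max(vals) <= 1.0
assert min(vals) >= 1.0 / n
```

Every sampled value falls in $[1/n, 1]$, consistent with the conjectured extremizers being the vertices and the uniform PMF.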

One way of getting rid of ${\bf x} \geq {\bf 0}_n$ is to introduce $n$ new variables $y_i^2 := x_i$ and rewrite the QP above as follows

$$ \begin{array}{ll} \underset {{\bf y} \in \Bbb R^n} {\text{maximize}} & \sum\limits_{i=1}^n y_i^4\\ \text{subject to} & \sum\limits_{i=1}^n y_i^2 = 1 \end{array} $$

where there is a single equality constraint. We define the Lagrangian

$$ \mathcal L ({\bf y}, \mu) := \frac14 \sum\limits_{i=1}^n y_i^4 - \frac{\mu}{2} \left( \sum\limits_{i=1}^n y_i^2 - 1 \right) $$

Differentiating and finding where the partial derivatives vanish, we obtain

$$ \begin{aligned} y_1 \left( y_1^2 - \mu \right) &= 0 \\ &\vdots \\ y_n \left( y_n^2 - \mu \right) &= 0 \\ \sum\limits_{i=1}^n y_i^2 &= 1 \end{aligned} $$

Note that $y_i = 0$ or $y_i^2 = \mu$. Let $\operatorname{card} ({\bf y})$ denote the cardinality of the support, i.e., the number of non-zero entries of $\bf y$. Hence,

$$\sum\limits_{i=1}^n y_i^2 = \mu \operatorname{card} ({\bf y}) = 1$$

and, thus, $\mu = \dfrac{1}{\operatorname{card} ({\bf y})}$. Since $x_i = y_i^2$ and $\operatorname{card} ({\bf x}) = \operatorname{card} ({\bf y})$, it follows that $\color{blue}{x_i \in \left\{ 0, \dfrac{1}{\operatorname{card} ({\bf x})} \right\}}$. Note that $\| {\bf x} \|_2^2 = \dfrac{1}{\operatorname{card} ({\bf x})}$, which is maximal when $\color{blue}{\operatorname{card} ({\bf x}) = 1}$.
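The stationary points found above can also be checked mechanically (a small Python sketch of my own, not part of the argument): for each support size $k$, the candidate ${\bf x}$ puts mass $1/k$ on $k$ coordinates, and one verifies stationarity, feasibility, and the objective value $1/k$ directly.

```python
import math

n = 6
norms = {}
for k in range(1, n + 1):
    # Candidate stationary point: mass 1/k on each of k coordinates.
    x = [1.0 / k] * k + [0.0] * (n - k)
    y = [math.sqrt(v) for v in x]
    mu = 1.0 / k
    # Stationarity: y_i (y_i^2 - mu) = 0 for every i.
    assert all(abs(yi * (yi ** 2 - mu)) < 1e-12 for yi in y)
    # Feasibility: sum_i y_i^2 = 1.
    assert abs(sum(yi ** 2 for yi in y) - 1.0) < 1e-12
    norms[k] = sum(v * v for v in x)  # ||x||_2^2 = 1/k

# The objective is largest when the support is a singleton (k = 1).
assert max(norms, key=norms.get) == 1
```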


Is this correct? If so, is there a more elegant way of showing that ${\bf x}_{\max} \in \{ {\bf e}_1, {\bf e}_2, \dots, {\bf e}_n \}$?



1 Answer


Let $p_1,\dots,p_n$ be the probabilities of the elements $\{x_1,\dots,x_n\}$. Since the $p_i$ are nonnegative, expanding the square gives $$\sum_i p_i^2 \leq \bigg(\sum_i p_i\bigg)^2=1$$ and hence the maximal $2$-norm the distribution can have is $1$. This is achieved by a singleton.

On the other hand, by the Cauchy–Schwarz inequality, $$\frac{1}{n}=\sum_{i}p_i\frac{1}{n} \leq \bigg(\sum_ip_i^2\bigg)^{1/2}\cdot \bigg(\sum_i\frac{1}{n^2}\bigg)^{1/2} = \frac{1}{\sqrt{n}}\bigg(\sum_ip_i^2\bigg)^{1/2}$$ which gives the lower bound $$\frac{1}{n} \leq \sum_ip_i^2$$ which is achieved by the uniform distribution $p_1=\dots=p_n=\frac{1}{n}$.
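Both equality cases are easy to confirm directly (a short stdlib-only Python check; `norm_sq` is my own helper name):

```python
n = 7

def norm_sq(p):
    # Squared 2-norm of a PMF given as a list of probabilities.
    return sum(v * v for v in p)

singleton = [1.0] + [0.0] * (n - 1)  # all mass on one element
uniform = [1.0 / n] * n              # uniform PMF

assert norm_sq(singleton) == 1.0                # upper bound attained
assert abs(norm_sq(uniform) - 1.0 / n) < 1e-12  # lower bound attained
```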

Small Deviation