1

Be a set of numbers $v=(a_1, a_2, \ldots, a_n)$

I want to form the following average vector $\mu = (\frac{\sum a_i}{n}, \frac{\sum a_i}{n}, \ldots, \frac{\sum a_i}{n})$.

If I do it iteratively step by step, in each step we pick three nonequal components, $a_i,a_j$ and $a_k$, and we replace them by their mean, $s_1=\frac{a_i+a_j+a_k}{3}$, to obtain $\mu_1 = (a_1, \ldots, a_{i-1}, s_1, a_{i+1}, \ldots, a_{j-1}, s_1, a_{j+1}, \ldots, a_{k-1}, s_1, a_{k+1}, \ldots, a_n)$.

In the next step, we select three other nonequal compounents and compute $\mu_2$

By iterating, how to pick up the three elements in each step to insure $\mu_n \rightarrow \mu$?

Is this "partial averaging" has a particular name/theorem in number theory?

John
  • 251
  • 1
  • 8

3 Answers3

1

As it is, it is not really well-defined: how do you pick the set of three numbers? Depending on this, I gather any deterministic choice can be tricked into non-convergence to the "right" $\mu$. As a simple illustration, consider the choice of picking the first ("leftmost") triple of non-equal values, and look at the following, for any small constant $\epsilon > 0$: $$(1-\epsilon,1,1,1,1+\epsilon)$$ Clearly, we hope to get $\mu=(1,1,1,1,1)$; yet, the algorithm will never get towards the desired vector, as it will always stay within the first 4 components: $$(1-\frac{\epsilon}{3},1-\frac{\epsilon}{3},1-\frac{\epsilon}{3},1,1+\epsilon)$$ $$(1-\frac{2\epsilon}{9},1-\frac{2\epsilon}{9},1-\frac{\epsilon}{3},1-\frac{2\epsilon}{9},1+\epsilon)$$ $$(1-\frac{7\epsilon}{9},1-\frac{7\epsilon}{9},1-\frac{7\epsilon}{9},1-\frac{2\epsilon}{9},1+\epsilon)$$ $$[...]$$ and with this rule the last component will never be changed at all.

Clement C.
  • 68,437
  • Ok, my question wasn't clear enough. I edited the question. – John Dec 26 '14 at 13:33
  • OK. Basically, the point is that you either pick a triple of non-equal values at random, or do it deterministically according to some rule defined by your algorithm. The (not quite proven) claim/conjecture illustrated above (here, for a specific deterministic rule) is that any deterministic rule will fail on some specific instance of your initial vector $v$; that is, the algorithm can be forced to converge towards something which is not the desired vector $\mu$. I may be mistaken, but I'd be willing to bet that a working strategy should be randomized. – Clement C. Dec 26 '14 at 13:36
  • What if we consider a sliding windows of size 3, that moves from the left to the right, by sliding by one position in each iteration... – John Dec 26 '14 at 13:43
1

Your final result looks like it is $A_n =\frac1{n}\sum_{k=1}^n a_k $.

To compute this iteratively, let $A_1 =a_1 $ and $A_i =\frac1{i}\sum_{k=1}^i a_k $.

Then, for $i=2$ to $n$, $iA_i =(i-1)A_{i-1}+a_i $ or $A_i =(1-\frac1{i})A_{i-1}+\frac{a_i}{i} $.

At every step, this will have the average of this many elements.

However, if the elements are of widely differing values, this may compute an inaccurate value. Look up the work of Kahan on accurately computing means and variances.

marty cohen
  • 110,450
0

You have not specified how you choose the three nonequal entries. Suppose the initial vector is $$v = (0,1,2,3,4,100,101,102,103,104).$$ You alternately pick three entries all from the first half of the vector and then all from the second half. Then $$\mu_n\to(2,2,2,2,2,102,102,102,102,102).$$


I believe (without proof so far) that $\mu_n\to\mu$ if and only if the operations are chosen to "mix" all of the entries, that is, for any subset $S$ of the entries of the vector, there are infinitely many operations that average at least one entry in $S$ with at least one entry not in $S$.

A simple strategy that can be proved to work is to first average $a_1$, $a_2$, and $a_3$, then average $a_2$, $a_3$, and $a_4$, then $a_3$, $a_4$, and $a_5$, and so on.


The exact rate of convergence will be determined by the second-largest singular value of the matrix $$M = \begin{bmatrix} \tfrac13 & \tfrac13 & \tfrac13 & & & \\ \tfrac13 & \tfrac13 & \tfrac13 & & & \\ \tfrac13 & \tfrac13 & \tfrac13 & & & \\ & & & 1 & & \\ & & & & 1 & \\ & & & & & \ddots \end{bmatrix} \begin{bmatrix} 1 & & & & & \\ & \tfrac13 & \tfrac13 & \tfrac13 & & \\ & \tfrac13 & \tfrac13 & \tfrac13 & & \\ & \tfrac13 & \tfrac13 & \tfrac13 & & \\ & & & & 1 & \\ & & & & & \ddots \end{bmatrix} \begin{bmatrix} 1 & & & & & \\ & 1 & & & & \\ & & \tfrac13 & \tfrac13 & \tfrac13 & \\ & & \tfrac13 & \tfrac13 & \tfrac13 & \\ & & \tfrac13 & \tfrac13 & \tfrac13 & \\ & & & & & \ddots \end{bmatrix} \cdots$$ but I don't know how to compute that. Instead, here is a very crude proof of convergence. I will consider the slightly simpler case of averaging only two entries at a time.

Define the mean $m=\sum_i a_i/n$ and the variance $s^2=\sum_i(a_i-m)^2/n$. One can verify that each averaging operation preserves $m$ and does not increase $s^2$. Then, we only need to show that $s^2\to0$.

The range of the data is at least $2s$. Therefore, there is at least one pair of consecutive entries $a_j$, $a_{j+1}$ with $|a_j-a_{j+1}|\ge 2s/(n-1)$. Replacing them by two copies of $(a_j+a_{j+1})/2$ decreases the variance from $s^2$ to $s^2 - |a_j-a_{j+1}|^2/(2n) \le s^2 - 2s^2/\big(n(n-1)^2\big) = \Big(1 - 2/\big(n(n-1)^2\big)\Big)s^2$. Denote this factor by $c = 1 - 2/\big(n(n-1)^2\big)$. Each pass over the vector decreases the variance to at most $c$ times its previous value. After $k$ passes, it is at most $c^ks^2$. Now $c<1$ so $c^k\to0$ as $k\to\infty$, and we are done.

  • Yeah that "sliding window" is what I tought about, but couldn't prove it, and couldn't estimate how many iterations are needed... – John Dec 26 '14 at 13:45
  • @John: See my update for a very crude convergence proof. –  Dec 27 '14 at 04:07