For what values does the geothmetic meandian converge?

Question

The geothmetic meandian, $G_{MDN}$ is defined in this XKCD as

$$F(x_1, x_2, ..., x_n) = \left(\frac{x_1 +x_2+\cdots+x_n}{n}, \sqrt[n]{x_1 x_2 \cdots x_n}, x_{\frac{n+1}{2}} \right)$$

$$G_{MDN}(x_1, x_2, \ldots, x_n) = F(F(F(\ldots F(x_1, x_2, \ldots, x_n)\ldots)))$$

The comic also (correctly) claims that $G_{MDN}(1, 1, 2, 3, 5) \approx 2.089$

There are two convergence questions I'm interested in:

For what values of $(x_1, x_2, \ldots, x_n)$ does $G_{MDN}$ converge to a single number?
For what values of $(x_1, x_2, \ldots, x_n)$ does $G_{MDN}$ converge, but not to a single number?

I've written up Python 3 code so that you can test out numbers yourself by changing the values at the bottom of the code. If the code never seems to stop running, then $G_{MDN}$ does not converge for your input. (In these situations, you can set verbose=True to see what's happening.)

# assumes Python 3 because I assume that "/" always means float division
from typing import Iterable, Tuple, Union
from decimal import Decimal
import math
from functools import reduce  # Required in Python 3
import operator
from https://stackoverflow.com/a/48648756
def prod(iterable):
    return reduce(operator.mul, iterable, 1)
def geothmetic_meandian(nums: Iterable[float], verbose=False) -> Tuple[bool, Tuple[float, float, float]]:
    def inner_geothmetic_meandian(nums: Iterable[float]) -> Tuple[float]:
        arithmetic_mean = sum(nums)/len(nums)
        geometric_mean = prod(nums)**(1 / len(nums))
        sorted_nums = sorted(list(nums))
        if len(nums) % 2 == 0:
            # even number of numbers
            higher_median_index = int(len(nums) / 2)
            lower_median_index = higher_median_index - 1
            median = (sorted_nums[higher_median_index] + sorted_nums[lower_median_index]) / 2
        else:
            # odd number of numbers
            median = sorted_nums[int((len(nums) - 1) / 2)]
    return (arithmetic_mean, geometric_mean, median)

last_ans = None
ans = inner_geothmetic_meandian(nums)
converged = True
while not (ans[0] == ans[1] == ans[2]):
    if ans == last_ans:
        converged = False
        break
    last_ans = ans
    ans = inner_geothmetic_meandian(ans)

    if verbose:
        print(ans)

return converged, ans



if name == "main":
    verbose = False
    values = (1, 1, 3, 2, 5)
    converged, results = geothmetic_meandian(values, verbose=verbose)
    if converged:
        print(f"The geothmetic meandian of {values} converged to: {results[0]}")
    else:
        print(f"The geothmetic meandian of {values} did not converge to a single value:\nArithmetic Mean: {results[0]}\nGeometric Mean:  {results[1]}\nMedian:          {results[2]}")

I've tested this code to verify that $G_{MDN}(1, 1, 2, 3, 5) \approx 2.089$.

I have also not found any inputs that cause the program to not converge at all.

However, I have found that $G_{MDN}(1, 2, 3, 4, 5) = (2.8993696858822964, 2.899369685882296, 2.8993696858822964)$ which would mean that $(1, 2, 3, 4, 5)$ is in the second convergence class, where it converged, just not all to the same number. But I'm worried that this result is due to a rounding error in Python itself. (I quickly tried and failed to use the Decimal class due to the nth root operation.)

Thus, my hunch is that all inputs converge to a single number, but I have not been able to prove this yet. It looks like an epsilon-delta proof.

It seems likely at least at first glance that convergence for all $x_i$ is a straightforward corollary of the convergence of the arithmetic-geometric mean. — Steven Stadnicki, Mar 11 '21 at 03:29
The hover text on the comic says "Pythagorean means are nice and all, but throwing the median in the pot is really what turns this into random forest statistics: applying every function you can think of, and then gradually dropping the ones that make the result worse" which makes me think that it might not be completely straightforward, but I'm not entirely sure. — Pro Q, Mar 11 '21 at 03:35
You have two values. The arithmetic and geometric means should be somewhere in between, and the median should be the repeated value. So only one of the three values should survive to the next round--but all three do. Counting the digits, it seems you're at about the limit of precision for 64-bit IEEE-754 numbers, so you're looking at a rounding effect. — David K, Mar 11 '21 at 04:42
Don't compare float using strict equality, use epsilon ball test instead. — jlandercy, Mar 11 '21 at 05:38
Regarding the numeric experiment, the Julia code using Statistics, StatsBase; F(x) = (mean(x), geomean(x), median(x)); Fn(x, n) = ∘(fill(F, n)...)(x); Fn(BigFloat[1,2,3,4,5], 100) shows convergence of the three values to the same number up to the 48th decimal after 100 iterations (using the default BigFloat precision of 256 bits). — sijo, Mar 12 '21 at 13:18

Mike Earnest · Accepted Answer · 2021-03-12T21:03:16.557

Let $a_n,g_n$ and $m_n$ denote the three means computed at stage $n$, for $n\in \{1,2,3,\dots\}$. Let $$\begin{align}l_n&=\min(a_n,g_n,m_n)\\u_n&=\max(a_n,g_n,m_n)\end{align}$$ Since $u_n$ is a non-increasing sequence, and $l_n$ is a non-decreasing sequence, in order to prove convergence, it suffices to prove that $u_n-l_n\to 0$. We do this by showing $(u_{n+2}-l_{n+2})\le \frac79 (u_n-l_n).$

Note that $$a_{n+1}=\tfrac13(a_n+g_n+m_n)\le \tfrac{2}3u_n+\tfrac13 l_n,$$which follows by replacing the smallest of $\{a_n,g_n,m_n\}$ with $l_n$, and upper bounding the other two with $u_n$. Then, by the AM-GM inequality, $g_{n+1}\le \frac{2}3u_n+\frac13 l_n$ as well. The last two inequalities imply $$ m_{n+2}\le \tfrac23u_n+\tfrac13l_n. $$ We have given an upper bound for one of the means in the $(n+2)^{nd}$ stage. For the other two, note $$ \begin{align} g_{n+2}\le a_{n+2} &=\frac{a_{n+1}+g_{n+1}+m_{n+1}}3 \\ &\le \frac{2a_{n+1}+m_{n+1}}3\tag{AM-GM} \\ &=\frac{2a_n+2g_n+2m_n+3m_{n+1}}{9} \\ &\le\frac{2a_n+2g_n+2m_n+3u_{n}}{9}\tag{$m_{n+1}\le u_n$} \\ &\le\frac{7u_n+2l_{n}}{9} \end{align} $$ For the last step, we are replacing the smallest of $a_n,g_n,m_n$ with $l_n$, and upper bounding the other two with $u_n$.

We have shown that all three of $m_{n+2},g_{n+2}$, and $a_{n+2}$ are at most $\tfrac79u_n+\tfrac29u_n$, so we conclude that $$ u_{n+2}\le \tfrac79u_n+\tfrac29l_n$$ which implies $$ u_{n+2}-l_{n+2}\le u_{n+2}-l_n\le \tfrac79(u_n-l_n) $$

Note that this shows that $(u_n-l_n)\in O\left(c^n\right)$, where $c=\frac{\sqrt7}3$. It is actually possible to prove that for any $\epsilon>0$ that $$ u_n-l_n\in O\left(\frac1{3^n} \right). $$

Nice! A lot of answers seem to rely on the range changing at each iteration, which is really unreliable b/c (epislon,x,x) goes to (epsilon, 2x/3, x), this counterexample has broken every proof except yours. :) — Ryan Yang-Liu, Jul 20 '21 at 20:19

score 7 · Answer 2 · answered Mar 11 '21 at 05:17

Let $$a_k=\min(F^{(k)}(x_1, x_2, ..., x_n))$$ and $$b_k=\max(F^{(k)}(x_1, x_2, ..., x_n))$$ It follows from their definitions that the arithmetic and geometric means and the median of $F^{(k)}(x_1, x_2, ..., x_n)$ are all $\ge a_k$ and $\le b_k$, so $a_{k+1}\ge a_k$ and $b_{k+1}\le b_k$. We also know that $a_k$ and $b_k$ are bounded, so they must be convergent.

Let $\lim_{k\to\infty}a_k=a$ and $\lim_{k\to\infty}b_k=b$. Suppose for contradiction that $a\neq b$. Using the definition of limits, it’s fairly easy to show that you can pick $k$ such that the arithmetic and geometric means of $F^{(k)}(x_1, x_2, ..., x_n)$ are both greater than $a$ and less than $b$, which means that at least one of $a<a_{k+1}<b$ or $a<b_{k+1}<b$ is true, which contradicts $a_k$ is non-decreasing, $b_k$ is non-increasing, and their limits are $a$ and $b$. Thus $a=b$, and $G_{MDN}$ converges to a single number for all possible values.

I don't get how you can find $k$ such that AM and GM are both in $(a,b)$ — Exodd, Mar 11 '21 at 08:38
For any $\epsilon\ge 0$, we can choose $N$ such that for all $k\ge N$, $a-\epsilon<a_k\le a$ and $b\le b_k<b+\epsilon$. Let $a_k, v_k, b_k$ be the three elements of $F^{(k)}(x_1, x_2,...,x_n)$ for some $k\ge N$. We want to pick $\epsilon$ such that $\frac{a_k+b_k+v_k}{3}\in(a,b)$. By definition of $a_k$ and $b_k$ we know $a_k\le v_k\le b_k$, so $\frac{a_k+b_k+v_k}{3}\le\frac{a_k+2b_k}{3}\le\frac{a_k+2(b+\epsilon)}{3}$. Rearranging this gets an $\epsilon$ such that the arithmetic mean is less than $b$, and we can use a similar process to show it’s greater than $a$, and for the geometric mean. — iosce, Mar 11 '21 at 11:08

score -2 · Answer 3 · edited May 06 '21 at 07:05

-2

I mean, trivially, a list of an odd amount of elements all greater or equal then 1 should not encounter any problems, however if it contains a $0$ or has an even number of elements (for which the median isn't really defined), then you have a problem. If it contains a $0$, then the geometric part will always be $0$ so the whole thing will converge toward $0$. Otherwise, like others already stated, you shouldn't run into any significant problems.

edited May 06 '21 at 07:05

Siong Thye Goh

153,832

answered May 06 '21 at 06:41

JoDa

1

1

If you could explain why it is trivial, this would constitute a much better answer. – Pro Q May 07 '21 at 07:07
And the median is generally defined for an even number of elements (see my code in the question, which works for even $n$ inputs) – Pro Q May 07 '21 at 07:08

For what values does the geothmetic meandian converge?

from https://stackoverflow.com/a/48648756

3 Answers3