I have a question about the proof of Lemma 1 in this paper by Josephy, which discusses necessary and sufficient conditions on $g$ such that $f\circ g$ (respectively $g\circ f$) is a function of bounded variation for all $f$ of bounded variation. I have copied the statement of the lemma and a part of the proof here for convenience.
Definition: A function $f : [0, 1]\rightarrow\mathbb{R}$ is said to be of $N$-bounded variation if $f^{-1}([a, b])$ can be written as a union of $N$ or fewer intervals (where we allow singletons as degenerate closed intervals) for all $[a, b]\subseteq\mathbb{R}$.
Lemma 1: Let $h$ be a function of $N$-bounded variation such that $|h(x)|\leq M$ for all $x\in[0, 1]$. Then $\text{Var}\:h\leq 4M(N+1)$.
Proof: If $\text{Var}\:h > 4M(N+1)$, then there exists a partition $\{x_{0},\dots, x_{n}\}$ of $[0, 1]$ with $$\sum_{i = 1}^{n}|h(x_{i})-h(x_{i-1})| > 4M(N+1).$$ Let $J_{i}$ denote the closed interval with endpoints $h(x_{i-1})$ and $h(x_{i})$. Some $[a', b']\subseteq [-M, M]$ with $a' < b'$ is covered more than $2(N+1)$ times by intervals of the form $J_{i}$.
I am having trouble proving the last statement above. It amounts to showing that there exist indices $i_{1} < \cdots < i_{\ell}$ with $\ell > 2(N+1)$ such that $\bigcap_{k = 1}^{\ell}J_{i_{k}}$ is a non-degenerate closed interval. (Note that the displayed inequality above implies that $n > 2(N+1)$, since each $J_{i}\subseteq[-M, M]$ has length at most $2M$.)
My thoughts:
I have tried using the inclusion-exclusion principle, but it does not seem to help. Note that consecutive intervals $J_{i}$ and $J_{i+1}$ share at least one endpoint, namely $h(x_{i})$. With this in mind, I have tried examining the level sets of the piecewise-linear function $f : [0, 1]\rightarrow \mathbb{R}$ with $f(x_{i}) = h(x_{i})$ for each $i$. Intuitively, since $\sum_{i =1}^{n}|J_{i}|$ is large (where $|J_{i}|$ denotes the length of $J_{i}$), if the function is initially monotonically increasing (say), then it must eventually decrease, and so on. (Another way of thinking about this is that we are trying to pack a long string of length $\sum_{i = 1}^{n} |J_{i}|$ into a "box" of length $2M$ by folding it, and we need the string to overlap itself sufficiently many times.) However, I have not been able to translate these ideas into a rigorous proof.
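The folding picture can at least be checked numerically on small examples. The following is a minimal sketch, not part of Josephy's argument: the function names `variation_sum` and `max_coverage` and the sample values are hypothetical, chosen only for illustration. It uses the fact that an interval $J_{i}$ containing an interior point of the gap between two consecutive distinct values $h(x_{j})$ must contain the whole closed gap, so it suffices to count coverage gap by gap.

```python
def variation_sum(values):
    """Sum of |h(x_i) - h(x_{i-1})| for the sampled values h(x_0), ..., h(x_n)."""
    return sum(abs(v - u) for u, v in zip(values, values[1:]))


def max_coverage(values):
    """Return (count, (a, b)): the largest number of intervals J_i that all
    contain a common non-degenerate interval [a, b], together with one such
    [a, b]. Here J_i is the closed interval with endpoints values[i-1] and
    values[i]. Returns (0, None) if all values coincide.
    """
    # Candidate intervals: gaps between consecutive distinct sampled values.
    points = sorted(set(values))
    best = (0, None)
    for a, b in zip(points, points[1:]):
        # J_i contains [a, b] exactly when min(J_i) <= a and b <= max(J_i).
        count = sum(
            1
            for u, v in zip(values, values[1:])
            if min(u, v) <= a and b <= max(u, v)
        )
        if count > best[0]:
            best = (count, (a, b))
    return best


# Hypothetical sample values h(x_0), ..., h(x_5), chosen only for illustration.
sample = [0, 3, 1, 4, 2, 5]
print(variation_sum(sample))
print(max_coverage(sample))
```

For these sample values the sum is $13$, and the interval $[2, 3]$ is contained in all five of the $J_{i}$. This only illustrates the claim on one example and proves nothing in general.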
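Edit: Here's another observation that might help: we may assume without loss of generality that $h(x_{0}) < h(x_{1})$, $h(x_{1}) > h(x_{2})$, $h(x_{2}) < h(x_{3})$, and so on. Indeed, deleting an interior partition point $x_{i}$ with $h(x_{i-1})\leq h(x_{i})\leq h(x_{i+1})$ or $h(x_{i-1})\geq h(x_{i})\geq h(x_{i+1})$ does not change the sum, and we may replace $h$ by $-h$ if necessary.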
Any help is appreciated. Thank you.