Suppose you have a video file whose pixel order has been shuffled. That is, a single random permutation was chosen once and applied to every frame.

Is there a known approach for recovering the original order of the pixels?

I have some ideas about recovering the original topology by placing pixels whose values are correlated in space and time closer together. I wonder whether this has been studied and whether efficient algorithms have been published.

This problem can also be thought of as a way to project a set of time-varying values onto a 2D matrix so that computer vision techniques (such as CNNs) can be applied, under the assumption that these values are indeed somehow correlated.

Ethan

2 Answers

A general solution to this does not exist, even if we add some assumptions about the distribution of e.g. colours and shapes in the images or temporal coupling such as consecutive frames being similar.

Problem

Let $F_1,\dots,F_n$ be the $n$ original frames, each with $m$ pixels. Let $P$ be the permutation that is applied to the pixels of each frame before we get them. You can think of $P$ as the enemy's code-book.

As input we receive $P(F_1),\dots,P(F_n)$. The goal is to find the inverse permutation $Q$ that restores the images. Thus $QP=I$ is the identity map and, for example, $Q(P(F_1))=F_1$. Note that we do not know any of the correct frames $F_i$.

Let $Q_1,\dots,Q_{m!}$ be the $m!$ possible permutation functions of the $m$ pixels.

The goal is to select the unique $j\in\{1,\dots,m!\}$ so that $Q_jP=I$.
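
The setup above can be made concrete with a minimal sketch. It assumes each frame is stored as a flat list of $m$ pixel values; the helper names `apply_permutation` and `invert_permutation` are hypothetical and only illustrate the notation. Here we know $P$, so computing $Q$ is trivial; in the actual problem $P$ is hidden.

```python
import random

def apply_permutation(frame, perm):
    """Return a shuffled frame: output position k holds frame[perm[k]]."""
    return [frame[src] for src in perm]

def invert_permutation(perm):
    """Return the permutation that undoes apply_permutation(..., perm)."""
    inverse = [0] * len(perm)
    for k, src in enumerate(perm):
        inverse[src] = k
    return inverse

rng = random.Random(0)          # fixed seed, only for reproducibility
m = 12                          # pixels per frame (toy size)
frames = [[rng.randrange(256) for _ in range(m)] for _ in range(3)]

P = list(range(m))
rng.shuffle(P)                  # the "code-book", chosen once

# The same P is applied to every frame.
scrambled = [apply_permutation(f, P) for f in frames]

# With P known, Q is just its inverse and Q(P(F_i)) = F_i.
Q = invert_permutation(P)
restored = [apply_permutation(g, Q) for g in scrambled]
```
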

No General Solution

Under our statistical model, this means selecting the $Q_j$ that maximises the likelihood that the frames $Q_j(P(F_i))$ are drawn from the reference distribution for images, and that consecutive frames $Q_j(P(F_i))$ and $Q_j(P(F_{i+1}))$ match the reference temporal statistics. Together, these statistics are our prior knowledge.
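
One way to write this selection rule down, purely as an illustration: assume (this is not stated above) that the prior factorises as a first-order Markov chain over frames, and write $G_i^{(j)} = Q_j(P(F_i))$ for the candidate reconstruction of frame $i$. Then the maximum-likelihood index would be

$$\hat{\jmath} = \arg\max_{j \in \{1,\dots,m!\}} \; p\big(G_1^{(j)}\big) \prod_{i=1}^{n-1} p\big(G_{i+1}^{(j)} \mid G_i^{(j)}\big),$$

where $p(\cdot)$ stands for the in-frame image statistics and $p(\cdot \mid \cdot)$ for the inter-frame statistics.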

There is a canonical counter-example: the enemy gives you a scrambled movie with two frames in which all the pixels are the same colour, so $n=2$, $F_1=F_2$, and $Q_j(P(F_1))=Q_j(P(F_2))=F_1=F_2$ for every $j$. The in-frame and inter-frame statistics are therefore identical for every $j$ and give us no information for selecting the maximum-likelihood permutation $Q_j$ (except in the degenerate case where $m!=1$).

Thus, we cannot guarantee uniqueness and the problem is unsolvable without further assumptions.
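
The constant-colour counter-example can be checked directly on a toy case. This is a small sketch with $m=4$ pixels and an arbitrary grey value of 128 (both chosen only for illustration): every one of the $4! = 24$ candidate permutations produces exactly the same two-frame movie, so no statistic can tell them apart.

```python
from itertools import permutations

m = 4
frame = [128] * m           # every pixel has the same colour
video = [frame, frame]      # n = 2 and F_1 = F_2

# Apply every candidate permutation and collect the distinct results.
candidates = list(permutations(range(m)))
outputs = {
    tuple(tuple(f[src] for src in q) for f in video)
    for q in candidates
}

print(len(candidates), "candidate permutations,", len(outputs), "distinct output")
```
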

Further Assumptions

It is interesting to see if we can solve the problem by adding more constraints.

If we restrict the enemy to sending us only "real" movies, and assume there are enough distinct pixels and frames that a unique $Q_j$ with maximum likelihood exists, we would still have to calculate statistics for $O(m! \times n)$ permuted frames to find the maximum.

This is brute force code-breaking.
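
A rough sketch of what that brute force looks like on a toy $2 \times 3$ grid ($m=6$, so $720$ candidates). As a stand-in for the likelihood it uses a simple smoothness cost (the sum of absolute differences between adjacent pixels), which is an assumption made for illustration; the ramp-like frames, the permutation `P`, and all function names are hypothetical.

```python
from itertools import permutations

ROWS, COLS = 2, 3
M = ROWS * COLS

def smoothness_cost(frame):
    """Sum of absolute differences between 4-connected neighbours (row-major)."""
    cost = 0
    for r in range(ROWS):
        for c in range(COLS):
            i = r * COLS + c
            if c + 1 < COLS:
                cost += abs(frame[i] - frame[i + 1])
            if r + 1 < ROWS:
                cost += abs(frame[i] - frame[i + COLS])
    return cost

def total_cost(frames, q):
    """Cost of the whole movie after applying candidate permutation q."""
    return sum(smoothness_cost([f[src] for src in q]) for f in frames)

def invert(perm):
    inverse = [0] * len(perm)
    for k, src in enumerate(perm):
        inverse[src] = k
    return inverse

# Hypothetical smooth "movie": intensity ramps that change over time.
originals = [[10 * c + 3 * r + t * (c - r)
              for r in range(ROWS) for c in range(COLS)]
             for t in range(3)]

P = [4, 0, 5, 2, 1, 3]                      # the hidden permutation
scrambled = [[f[src] for src in P] for f in originals]

# Exhaustive search: evaluate all m! candidates on all n frames.
best_cost, best_q = min((total_cost(scrambled, q), q)
                        for q in permutations(range(M)))

# The true inverse, and the same reconstruction mirrored left-to-right.
true_q = invert(P)
flipped_q = [true_q[r * COLS + (COLS - 1 - c)]
             for r in range(ROWS) for c in range(COLS)]
```

Note that a mirrored reconstruction has exactly the same smoothness cost as the true one, because flipping the grid preserves which pixels are adjacent. With a cost of this kind, the answer can therefore be unique only up to such symmetries, and the search already needs $6! = 720$ evaluations per frame for six pixels.
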

In order to benefit from neural networks, and back-propagation in particular, we would need a loss function that is differentiable with respect to the input (which is an encoding of $j$, or of our permutation $Q_j$). The question, then, is whether such a function can be found.

Otherwise the problem is more similar to cryptanalysis in the special case where we know that the enemy's code book is a permutation of the clear-text (or clear-image).

mjul

This is a fascinating combinatorial problem. I would featurize each pixel by its full temporal trajectory, then embed the pixels in a grid using their k nearest neighbors. The real goal is to maximize the likelihood of the video being a sequence of natural (real-life) images, which you can test with a classifier, but you might be able to get away with just a smoothness cost, say the sum of differences between adjacent pixels. Once you have started filling in the grid, smoothness constraints will reduce the search space (since a pixel will have to be close to multiple other pixels), thus speeding things up, assuming you use an efficient data structure for querying the nearest neighbors; see for example http://www.itu.dk/people/pagh/SSS/ann-benchmarks/
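
A minimal sketch of the first step only (trajectory features plus nearest neighbors), not of the grid embedding. It uses a deliberately artificial toy movie in which even frames are vertical ramps and odd frames are horizontal ramps, so that trajectory distance grows with spatial distance; real video would be noisier. The brute-force search stands in for the approximate nearest-neighbor structures mentioned above, and all names are hypothetical.

```python
import random

ROWS, COLS, T = 4, 5, 6

def make_video():
    """Toy movie: even frames are vertical ramps, odd frames horizontal ramps."""
    return [[(r if t % 2 == 0 else c)
             for r in range(ROWS) for c in range(COLS)]
            for t in range(T)]

def trajectories(frames):
    """One feature vector per pixel position: its value in every frame."""
    return [list(values) for values in zip(*frames)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_nearest(features, k):
    """Brute-force k nearest neighbours of every pixel (itself excluded)."""
    neighbours = []
    for i, fi in enumerate(features):
        others = sorted((squared_distance(fi, fj), j)
                        for j, fj in enumerate(features) if j != i)
        neighbours.append([j for _, j in others[:k]])
    return neighbours

rng = random.Random(1)          # fixed seed, only for reproducibility
P = list(range(ROWS * COLS))
rng.shuffle(P)                  # hidden permutation
scrambled = [[f[src] for src in P] for f in make_video()]

knn = k_nearest(trajectories(scrambled), k=4)

# Inspect one interior pixel: original position (row 1, col 2) = index 7.
where = P.index(7)                              # its scrambled position
recovered = sorted(P[j] for j in knn[where])    # neighbours, in original indices
print(recovered)
```

In this toy case the four nearest trajectories of an interior pixel are exactly its four grid neighbors, which is the property the grid-filling step would then exploit. For border pixels, or for real footage, ties and noise make the neighbor sets less clean.
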

Emre