
In the pair-sum problem we are given a multiset $A$ and a number $\alpha$, and we are asked whether there is a pair of elements of $A$ whose sum is $\alpha$. Here all numbers are small: each takes $O(1)$ bits to represent, and adding two of them takes $O(1)$ steps.

I'd like to analyse an efficient algorithm for this. The algorithm sorts $A$ and then looks at the two endpoints, the smallest and the largest element. If their sum is less than $\alpha$, it moves on to the second smallest element and the largest. If their sum is larger than $\alpha$, it moves on to the smallest and the second largest. It continues this way until a pair with sum $\alpha$ is found or the two positions meet.
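As an illustration of the sort-then-scan idea described above, here is a minimal Python sketch on an ordinary random-access machine (not a TM). The function name `has_pair_sum` and the list representation of $A$ are assumptions made for the example.

```python
def has_pair_sum(numbers, alpha):
    """Return True if two entries at distinct positions of `numbers` sum to `alpha`."""
    values = sorted(numbers)          # O(n log n) comparisons
    lo, hi = 0, len(values) - 1       # smallest and largest remaining elements
    while lo < hi:
        s = values[lo] + values[hi]
        if s == alpha:
            return True
        if s < alpha:
            lo += 1                   # sum too small: drop the current smallest
        else:
            hi -= 1                   # sum too large: drop the current largest
    return False
```

The scan after sorting is linear here only because both positions can be read in constant time, which is exactly what a single work tape does not give for free.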

This algorithm takes $O(n\log n)$ time due to sorting. However, I'm having trouble writing it as a TM with a single work tape. Assume we have $2$ tapes: a read-only tape for the input and a read/write tape for the working process.

Sorting the array and writing it on the work tape takes $O(n\log n)$ by merge sort, using a single tape. However, what about the comparison process itself?

If we had $2$ work tapes, or a single tape with $2$ access heads, we could have done it trivially in $O(n)$. But a single tape seems problematic, as I might have to run back and forth many times, and each run might take $O(n)$.

My question follows: is there a way to implement this algorithm, or any other algorithm for pair-sum, so that it runs in $O(n\log n)$ time on a TM with $2$ tapes, the first read-only and the second read-write?

1 Answer


Summary: There is no need to sort the given numbers, since whether there are two numbers in $A$ whose sum is $\alpha$ depends only on which numbers occur in $A$ (and which of them occur at least twice). Since the number of possible values in $A$ is $O(1)$, there is an $O(n)$-time algorithm/TM that uses only one read-only tape.


Assume that the multiset $A$ and the number $\alpha$ are given as input on the tape in the form $c_{a_1}\square c_{a_2}\square\cdots\square c_{a_m}\square' c_{\alpha}$, where

  • $c_{a_i}$ stands for the cells that represent $a_i$, the $i$-th number in $A$ as a binary number.
  • $c_{\alpha}$ stands for the cells that represent $\alpha$ as a binary number.
  • $\square$ and $\square'$ are two field separators (neither of them appears in $c_*$).

Since "their size is $O(1)$ bits for representation", there is a constant $c\in \mathbb N$ such that each number in $A$ uses at most $c$ cells, i.e., $a_i\in [2^c] = \{0,\ldots,2^c-1\}$.

Let us specify Turing machine (TM) $M$ as follows.

Given the input as described above, TM $M$ will,
    for each number $x\in[2^c]$:
        check whether $x$ is in $A$. If yes, for each number $y\in \{x+1, x+2, \cdots, 2^c-1\}$:
            check whether $y$ is in $A$. If yes, check whether $x+y=\alpha$. If still yes, halt and accept.
    for each number $x\in[2^c]$:
        check whether $x$ appears in $A$ at least twice. If yes, check whether $2x=\alpha$. If still yes, halt and accept.
    Halt and reject.
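The procedure above can be mimicked in Python as a rough sketch. It is a simulation under simplifying assumptions, not a TM: the input is modelled as a list of integers `tape` rather than binary cells with separators, and the names `count_occurrences` and `machine_m` are hypothetical. Each call to `count_occurrences` plays the role of one "check", that is, one full pass over the read-only input.

```python
def count_occurrences(tape, x):
    """One pass over the read-only input; count occurrences of x, capped at 2."""
    count = 0
    for value in tape:               # the input is only read, never modified
        if value == x and count < 2:
            count += 1
    return count


def machine_m(tape, alpha, c):
    """Mimic M: a constant number of passes (for fixed c), no writes to the tape."""
    for x in range(2 ** c):
        if count_occurrences(tape, x) >= 1:
            for y in range(x + 1, 2 ** c):
                if count_occurrences(tape, y) >= 1 and x + y == alpha:
                    return True      # halt and accept
    for x in range(2 ** c):
        if count_occurrences(tape, x) >= 2 and 2 * x == alpha:
            return True              # halt and accept
    return False                     # halt and reject
```

In the sketch the loop variables live in Python variables; in $M$ they would be encoded in the finite control, which is possible because they range over constantly many values.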

Since $c$ is a constant, we can hardcode all "for" loops, $x$, $y$, the result of each check, $x+y$, $2x$, etc. using states and state transitions of $M$. There is no need to alter any tape cell.

Each "check" above involves moving the head of $M$ from the start of the input to the end of the input, and then back to the start of the input, which takes $O(n)$ time. The total running time of $M$ is no more than $2^c\cdot 2^c\cdot 2\cdot O(n) + 2\cdot 2^c\cdot O(n)$, which is still $O(n)$ because $c$ is a constant.

We can improve the algorithm/$M$ so that it runs faster with fewer states. However, that is another task.
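One possible direction, given here only as a hedged sketch and not necessarily the improvement meant above, is to read the input once and remember for each value in $[2^c]$ whether it occurred zero times, once, or at least twice. That record has $3^{2^c}$ possible contents, a constant, so a TM could keep it in its finite control. This reduces the number of passes to one, but it does not by itself reduce the number of states. The name `single_pass` and the list-of-integers input are assumptions of the example.

```python
def single_pass(tape, alpha, c):
    """Decide pair-sum with one pass, keeping capped counts for each value in [2^c]."""
    size = 2 ** c
    seen = [0] * size                # constant-size record: 0, 1, or 2 (meaning "2 or more")
    for value in tape:               # the only pass over the input
        seen[value] = min(2, seen[value] + 1)
    for x in range(size):            # constant work after the pass
        if seen[x] == 0:
            continue
        y = alpha - x
        if y == x:
            if seen[x] >= 2:         # the same value is needed twice
                return True
        elif 0 <= y < size and seen[y] >= 1:
            return True
    return False
```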

John L.