
As an academic exercise, I have to write a parallel algorithm that, given a sorted array of $n$ integers, efficiently computes the mode (i.e. the item with the highest frequency) using $p$ processors, where $p \le n$ is a constant.

The model we use to describe these algorithms allows lock/unlock primitives to synchronize concurrent access to shared variables.

We cannot use hash tables. Could anyone share a hint toward an optimal algorithm for this problem?

Edit: rephrased question according to comments.

Edit 2: adding my solution for feedback. While trying to solve the exercise I thought of an algorithm along the lines of

  1. Declare two global variables, mode and modeFrequency and initialize them appropriately;
  2. For i where $1 \le i \le p$, invoke a concurrent process on a portion of the array.
  3. In each concurrent process:
     3.1. Find the local mode of the partition;
     3.2. Store the local mode and its frequency in two local variables;
     3.3. Compare the local mode with the global one:
          3.3.1. if the modes are the same, add the local frequency to the global one;
          3.3.2. if the local mode differs from the global one and the local frequency is higher than the global one, replace the global mode and frequency with the local ones;
  4. Return the global mode.

but I am not convinced of the correctness of the algorithm. Please note that I omitted locks/unlocks for brevity. Also, can the way I partition the array affect the correctness of the algorithm?
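To make the steps above concrete, here is a minimal sketch with the omitted locking written out. It assumes Python threads, equal-sized contiguous chunks, and hypothetical names (`local_mode`, `naive_parallel_mode`); it only mirrors the steps as listed and is not claimed to be correct for every input.

```python
import threading


def local_mode(chunk):
    """Mode of a sorted, non-empty chunk, found by scanning runs of equal values."""
    best_value, best_count = chunk[0], 0
    run_value, run_count = chunk[0], 0
    for x in chunk:
        if x == run_value:
            run_count += 1
        else:
            run_value, run_count = x, 1
        if run_count > best_count:
            best_value, best_count = run_value, run_count
    return best_value, best_count


def naive_parallel_mode(a, p):
    """Steps 1-4 as listed: p workers, one chunk each, one shared result."""
    mode, mode_frequency = None, 0      # step 1: the two global variables
    lock = threading.Lock()
    n = len(a)

    def worker(k):
        nonlocal mode, mode_frequency
        # Steps 3.1 and 3.2: local mode and frequency of chunk k.
        value, count = local_mode(a[n * k // p : n * (k + 1) // p])
        with lock:                      # step 3.3, guarded by lock/unlock
            if value == mode:           # 3.3.1
                mode_frequency += count
            elif count > mode_frequency:  # 3.3.2
                mode, mode_frequency = value, count

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(p)]
    for t in threads:                   # step 2: one process per chunk
        t.start()
    for t in threads:
        t.join()
    return mode                         # step 4
```

With `[1, 1, 1, 2, 3, 4]` and `p = 2` this returns `1` whichever thread finishes first. With `[1, 1, 2, 2, 2, 2, 3, 3]` and `p = 2`, however, each chunk sees only two copies of `2`, so the answer can depend on where the chunk boundary falls and on thread order, which is the concern raised above.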

1 Answer


Since you asked for a hint:

Partition the array.

Think about that for a while. If you need more of a hint:

If you partitioned the array into $p$ partitions and assigned each partition to a single processor, could you compute the mode of the numbers within a particular partition efficiently? Would that help get you closer to a solution to what you want to achieve?
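For readers who want to see where the hint can lead, here is one possible sketch, not necessarily the intended solution. It assumes Python's `ThreadPoolExecutor` as a stand-in for the $p$ processors and uses hypothetical names (`chunk_mode`, `parallel_mode`). Each worker considers only the runs that start inside its own chunk and, because the array is sorted, uses a binary search to find where a run that crosses the chunk boundary ends. The per-chunk results are combined after all workers finish, so this variant needs no locks.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def chunk_mode(a, lo, hi):
    """Best (value, count) among runs of equal values that START in a[lo:hi]."""
    best_value, best_count = None, 0
    i = lo
    if lo > 0:
        # Skip the tail of a run that began in an earlier chunk.
        i = min(hi, bisect_right(a, a[lo - 1], lo))
    while i < hi:
        j = i
        while j < hi and a[j] == a[i]:
            j += 1
        if j == hi:
            # The run may continue into later chunks; a is sorted,
            # so a binary search finds its true end.
            j = bisect_right(a, a[i], hi)
        if j - i > best_count:
            best_value, best_count = a[i], j - i
        i = j
    return best_value, best_count


def parallel_mode(a, p):
    """Return (mode, frequency) of the sorted list a using p workers, p <= len(a)."""
    n = len(a)
    if n == 0:
        raise ValueError("an empty array has no mode")
    bounds = [n * k // p for k in range(p + 1)]
    with ThreadPoolExecutor(max_workers=p) as pool:
        results = list(
            pool.map(lambda k: chunk_mode(a, bounds[k], bounds[k + 1]), range(p))
        )
    # Chunks in which no run starts report a count of 0 and never win.
    return max(results, key=lambda r: r[1])
```

For example, `parallel_mode([1, 1, 2, 2, 2, 2, 3, 3, 3], 3)` gives `(2, 4)` even though the run of `2`s straddles two chunks. In CPython the threads mainly illustrate the structure; the global interpreter lock means they would not give a real speedup for this kind of work.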

D.W.