26

I remember seeing a list of False Proofs when I was taking Discrete Maths and I found it to be very interesting and also helpful.

So, if anyone knows some common proof mistakes students make or some cool sneaky way to trick people in a proof send them my way! Bonus points if they're cs/algorithms related.

Thank you so much!

13 Answers

19

One of my favourites is the "brothers paradox": https://en.wikipedia.org/wiki/Boy_or_Girl_paradox

I tell it as I learned it*, as follows: in a village, every family has exactly two children, an elder and a younger. Each child is independently a boy or a girl with probability $\frac12$.

I knock on a door, and a boy opens it and says "I'm the elder". What is the probability he has a sister? (clearly $\frac12$).

I knock on another door, and a boy opens it and says "I'm the younger". What is the probability he has a sister? (clearly $\frac12$).

I knock on a third door, and a boy opens it and says nothing. What is the probability he has a sister?

Now the lies start: if I use the law of total probability, I might say something like: $$Pr[sister]=Pr[sister|elder]\cdot Pr[elder]+Pr[sister|younger]\cdot Pr[younger]=\frac{1}{2}Pr[elder]+\frac{1}{2}Pr[younger]=\frac12$$ But on the other hand, I could say that the only case where he does not have a sister is the (boy,boy) setting, and we already know this house is not (girl,girl), so the probability is $\frac23$.

Both are very convincing, if told correctly.

The fallacy is that the sample space is undefined, or put more whimsically "How does the family decide who answers the door?"

*I learned it this way from the legendary Prof. Raz Kupferman.
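To make the dependence on the sample space concrete, here is a minimal simulation sketch; the rule that a uniformly random child opens the door is my own modeling assumption, not part of the riddle:

```python
import random

# Each family: two children, each independently a boy or girl w.p. 1/2.
# Assumed rule: a uniformly random child opens the door.
trials = wins = 0
for _ in range(1_000_000):
    kids = [random.choice('BG'), random.choice('BG')]  # (elder, younger)
    answerer = random.randrange(2)
    if kids[answerer] == 'B':             # a boy opened and said nothing
        trials += 1
        wins += kids[1 - answerer] == 'G'
print(wins / trials)  # ~0.5 under this rule
```

Under this rule the answer is $\frac12$; if the rule were instead "a boy answers whenever the family has one", the same count gives $\frac23$. The two "proofs" are answering different questions.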

Shaull
16

Merge-sort can be done in linear time!

Indeed, the time complexity to sort a list or array of length $n$ satisfies$^{(1)}$: $$T(n) = T\left(\left\lfloor\frac{n}2\right\rfloor\right) + T\left(\left\lceil\frac{n}2\right\rceil\right)+ \Theta(n)$$

Let us prove by induction that $T(n) = \Theta(n)$:

  • the base case is $n = 1$, and this is the base case of the merge-sort algorithm, which is indeed done in constant time, so $T(1) = \Theta(1)$.
  • assume $T(k) = \Theta(k)$ for all $k< n$, for some $n> 1$. Then using the previous formula and induction hypotheses: $$T(n) = T\left(\left\lfloor\frac{n}2\right\rfloor\right) + T\left(\left\lceil\frac{n}2\right\rceil\right)+ \Theta(n) = \Theta\left(\left\lfloor\frac{n}2\right\rfloor\right) + \Theta\left(\left\lceil\frac{n}2\right\rceil\right)+ \Theta(n) = \Theta(n)$$

By induction, we indeed get that $T(n) = \Theta(n)$, and we have obtained a comparison sort running in linear time.$^{(2)}$

$\phantom{a}$

$^{(1)}$ I will not prove this fact; it is the well-known (and correct) recurrence.

$^{(2)}$ Actually, using the same kind of proof, one could prove that the merge operation is done in $\Theta(1)$, and so, merge-sort can be done in $\Theta(1)$ too.
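Where it breaks, for the record: $\Theta(n)$ hides a constant, and the induction silently lets that constant change with $n$. The hypothesis should fix a single constant $c$ such that $T(k) \le ck$ for all $k < n$, but then the step only yields $$T(n) \le c\left\lfloor\frac{n}2\right\rfloor + c\left\lceil\frac{n}2\right\rceil + dn = (c+d)n,$$ which is not $\le cn$: the "constant" grows by $d$ at each of the $\Theta(\log n)$ levels of the recursion, and the honest bound is the familiar $T(n) = \Theta(n \log n)$.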

Nathaniel
9

I have often seen undergraduates believe that heaps can only be constructed in $\Theta(n \log n)$ time.

The standard argument goes: insert the elements into the heap one after another. Each insertion takes $O(\log n)$ time, and at least $n/2$ of the insertions cost $\Theta(\log n)$ in the worst case. This gives an overall running time of $\Theta(n \log n)$.

However, interestingly, a heap can be constructed in $\Theta(n)$ time. The algorithm places all the numbers contiguously in an array, then calls the heapify operation on each position, starting from the last element and working back to the first. Each heapify call bubbles an element down. Note that the first element has heapify cost $\Theta(\log n)$, while the $n/2$ elements at the end of the array (the leaves) have heapify cost $\Theta(1)$. Using simple inequalities, one can show that the total heapify cost adds up to only $\Theta(n)$.
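For reference, a minimal sketch of the linear-time construction (0-indexed array; `sift_down` is my own helper name):

```python
def sift_down(a, i, n):
    """Bubble a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] > a[largest]:
                largest = child
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def build_heap(a):
    n = len(a)
    # Leaves are already one-element heaps, so it suffices to start
    # at the last internal node and walk backwards to the root.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)

a = [3, 9, 2, 7, 5, 1]
build_heap(a)
print(a[0])  # 9 -- the maximum is now at the root
```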

I find it a sneaky proof indeed!

Inuyasha Yagami
9

This one is regarding $\mathsf{FPT}$-time algorithms.

Suppose an algorithm has time complexity of: $O((\log n)^k \cdot n^{O(1)})$.

Is it an $\mathsf{FPT}$ time algorithm in parameter $k$?

Obviously not, since the time complexity is not of the form $O(f(k) \cdot n^{O(1)})$. This is what most people think, and even many researchers in the field.

However, that is incorrect!

Interestingly, $(\log n)^k = O(2^{k^2/2} \cdot n)$ (see Hint 3.18, page 74, of the book Parameterized Algorithms).
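A quick way to see this bound (a sketch, ignoring constant factors): write $(\log n)^k = 2^{k \log\log n}$ and split into cases according to whether $\log\log n \le k/2$: $$(\log n)^k = 2^{k\log\log n} \le \begin{cases} 2^{k^2/2} & \text{if } \log\log n \le k/2,\\ 2^{2(\log\log n)^2} \le 2^{\log n} = n & \text{otherwise, for large enough } n, \end{cases}$$ so in both cases $(\log n)^k \le 2^{k^2/2} + n = O(2^{k^2/2} \cdot n)$.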

Thus, the running time of the algorithm is $O(2^{k^2/2} \cdot n^{O(1)})$ which is indeed $\mathsf{FPT}$ in parameter $k$.

So, the answer is actually yes!

Inuyasha Yagami
8

This one is classic.

The $0$-$1$ Knapsack problem is a polynomial-time problem, since there is a dynamic programming solution with running time $O(nW)$.

However, it is incorrect.

Note that the input size is $\mathrm{poly}(n + \log W)$, because the capacity $W$ is given in binary. Therefore, the running time of $O(nW)$ is exponential in the input size. In fact, the $0$-$1$ Knapsack problem is $\mathsf{NP}$-hard, and the $O(nW)$ running time is pseudo-polynomial.
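For reference, a minimal sketch of the $O(nW)$ dynamic program; the table has $W+1$ columns, so the work is linear in the value of $W$ but exponential in its bit-length:

```python
def knapsack(weights, values, W):
    # best[c] = maximum value achievable with capacity c
    best = [0] * (W + 1)
    for w, v in zip(weights, values):
        for c in range(W, w - 1, -1):  # downwards: each item used at most once
            best[c] = max(best[c], best[c - w] + v)
    return best[W]

print(knapsack([3, 4, 5], [30, 50, 60], 8))  # 90 (items of weight 3 and 5)
```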


Similarly, the naive algorithm for testing whether a number $n$ is prime runs in $O(\sqrt{n})$ time. It is not a polynomial-time algorithm either, since the input size is only $O(\log n)$ bits and $\sqrt{n} = 2^{(\log n)/2}$.
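The same point in code (a sketch): the loop below runs $O(\sqrt{n})$ times, which is about $2^{b/2}$ iterations for a $b$-bit input.

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:  # O(sqrt(n)) iterations -- exponential in the bit-length
        if n % d == 0:
            return False
        d += 1
    return True

print(is_prime(97))  # True
```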

Inuyasha Yagami
6

A simple example that I can think of, which is commonly used as an introductory topic in amortized analysis, is the analysis of a dynamic table. The usual scenario is to analyze the total time needed to insert $n$ elements into a dynamic table with some initial capacity, say $1$. Each time the table is full, it is resized, say to twice the size, by creating a larger table, transferring all current elements to the new table, and then proceeding with the insertion. Clearly, an insert takes $O(n)$ time in the worst case, so if there are $n$ elements to be inserted, the total time to insert them all is at most $n \times O(n) = O(n^2)$. This of course is not tight: with some careful analysis, e.g. summing up the actual time of each insert instead of relying on the worst case, it can be shown that most of these inserts actually take $O(1)$ time, and hence the time needed to perform the whole sequence is only $O(n)$.
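A quick way to see the tight bound, assuming the capacity starts at $1$ and doubles on overflow: a copy of size $2^i$ happens at most once over the whole sequence, so the total copying cost across $n$ inserts is at most $$\sum_{i=0}^{\lceil\log_2 n\rceil} 2^i < 2^{\lceil\log_2 n\rceil + 1} \le 4n,$$ and adding the $n$ unit-cost writes gives $O(n)$ in total, i.e. amortized $O(1)$ per insert.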

Russel
6

Induction often yields great wrong proofs because there are many things which can fail:

  • The induction base $P(0)$ can be false, or it may be missing altogether, and hence the rest of the induction is based on a false assumption. If the induction base is false, everything else is an ex falso quodlibet.
  • The induction step $P(k) \Rightarrow P(k+1)$ can have an antecedent which is not true for all $k$. In this case, too, we have an ex falso quodlibet.

So, with an ex falso quodlibet, every nonsense can be "proven".

There are many cases where it is not quite obvious that either $P(0)$ or $P(k)$ is actually false. One of my favourites (involving $P(k)$ not being generally true): every non-negative integer is twice as large as itself:

$\forall n\in \mathbb{N}, n \geq 0: n = 2n$

https://math.stackexchange.com/questions/2439032/given-a-wrong-proof
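The usual shape of that argument (a reconstruction; the linked question has the original wording) is strong induction: the base case $0 = 2 \cdot 0$ holds, and for the step one writes $n+1 = a + b$ with $a, b \le n$, so that by the induction hypothesis $$n+1 = a + b = 2a + 2b = 2(n+1).$$ The step silently assumes that $n+1$ splits into two summands that are each at most $n$, which fails precisely for $n+1 = 1$; so $P(1)$ is never established, and everything built on it is an ex falso quodlibet.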

rexkogitans
4

Spaghetti sort can sort numbers in linear time.

Assume without loss of generality that the numbers to be sorted are all in the interval $(0,1)$. A whole spaghetti has length $1$.

  1. For each number cut one spaghetti to the corresponding length. This clearly takes $O(n)$.

  2. Grab all the spaghetti and hit the whole bundle on the table upright. Takes $O(1)$.

  3. Lower your hand slowly on the bundle of spaghetti. The first spaghetti you hit corresponds to the biggest number. Remove it from the bundle. Doing this once takes $O(1)$. Repeat this step $n$ times.

Total time to sort the entire stack is $O(n)$.

Edit: I first encountered this 'algorithm' in my student days, and we knew there had to be a problem, because sorting numbers should always take at least $\Omega(n\log n)$ time (for comparison sorts, at least). I thought the key problem lay in step 2: it is not actually constant effort, and takes longer with a bigger number of spaghetti. From discussions in the comments I learned that step 1 is not actually linear either: you need to cut to a certain accuracy, and this takes more effort the higher the accuracy you need.

quarague
3

Maybe not exactly what you're looking for, but the Monty Hall problem is famously counterintuitive.

There are three doors. One conceals a car, and the other two goats. You select one door. The host opens another door, and it has a goat. By changing your choice to the other closed door, you double your odds of choosing the car, compared to keeping your original choice.

Many readers of Savant's column refused to believe switching is beneficial and rejected her explanation. After the problem appeared in Parade, approximately 10,000 readers, including nearly 1,000 with PhDs, wrote to the magazine, most of them calling Savant wrong...Paul Erdős, one of the most prolific mathematicians in history, remained unconvinced until he was shown a computer simulation demonstrating Savant's predicted result.

https://en.wikipedia.org/wiki/Monty_Hall_problem
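Since a computer simulation is what finally convinced Erdős, here is a minimal one (a sketch; it assumes the standard rules: the car is placed uniformly at random, and the host, who knows where the car is, always opens a goat door you did not pick):

```python
import random

def trial(switch):
    car, pick = random.randrange(3), random.randrange(3)
    # Host opens a door that is neither your pick nor the car.
    opened = next(d for d in range(3) if d != pick and d != car)
    if switch:
        pick = next(d for d in range(3) if d != pick and d != opened)
    return pick == car

n = 100_000
print(sum(trial(True) for _ in range(n)) / n)   # ~0.667 with switching
print(sum(trial(False) for _ in range(n)) / n)  # ~0.333 without
```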

Paul Draper
3

Along the lines of rexkogitans's remark that proofs by induction are fertile ground for false proofs: one of my favorites is that every natural number can be uniquely described in fifteen English words or less.

Let the set of counterexamples be $S$ and assume, for contradiction, that $S$ is nonempty. Every nonempty set of natural numbers must have a least element; call it $x$.

But then $x$ is "the smallest natural number that cannot be uniquely described in fifteen English words or less", and therefore $x$ is not in $S$, a contradiction. So $S$ must be empty and the theorem holds.

user168715
2

As a general idea, maybe a 'proof' of a greedy algorithm as applied to a problem where the failure case is not immediately obvious?

A classic example (used when introducing Ford-Fulkerson) would be always augmenting along the $s$-$t$ path that admits the most flow (without creating the residual graph), with some hand-waving about how, in another "optimal" solution replacing paths $P_0, \cdots, P_n$ with paths $Q_0, \cdots, Q_n$, the cumulative flow of the first set must be larger than or equal to that of the second set.

For example (an argument I just came up with): not all of the $Q$ flows can be added to the current greedy solution, and thus each of them shares a bottleneck with one of the $P$ flows (it cannot be bottlenecked by a flow that is unchanged, as otherwise the solution we are considering would not be feasible). Thus, replacing a $P$ flow with the associated $Q$ flows that it bottlenecks will at most maintain the total flow, because the bottleneck is still there. This could probably seem believable if the final statement is disguised well and given some supporting "evidence".
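Here is a concrete instance of the failure (a sketch; the graph and helper names are my own illustration): widest-path augmentation without residual edges gets stuck at flow $2$ on this graph, whose maximum flow is $3$.

```python
# Edge capacities. True max flow is 3 (route s-a-t, s-b-t, s-a-b-t).
CAP = {('s', 'a'): 2, ('a', 'b'): 2, ('b', 't'): 2,
       ('a', 't'): 1, ('s', 'b'): 1}
PATHS = [['s', 'a', 't'], ['s', 'b', 't'], ['s', 'a', 'b', 't']]

def widest_path(flow):
    # Greedy choice: the s-t path with the largest remaining bottleneck,
    # computed on the original edges only (no residual graph!).
    best, best_bn = None, 0
    for p in PATHS:
        bn = min(CAP[e] - flow[e] for e in zip(p, p[1:]))
        if bn > best_bn:
            best, best_bn = p, bn
    return best, best_bn

flow = {e: 0 for e in CAP}
total = 0
while True:
    p, bn = widest_path(flow)
    if p is None:
        break
    for e in zip(p, p[1:]):
        flow[e] += bn
    total += bn

print(total)  # 2 -- the greedy saturates s-a and b-t first and gets stuck
```

The first (widest) path is $s \to a \to b \to t$ with bottleneck $2$; pushing it saturates $s \to a$ and $b \to t$, after which neither remaining path can carry any flow.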

A greedy algorithm for set cover also seems very intuitive (because of how similar it looks to other greedy problems), and it takes a bit of thinking to find a counterexample.

yanjunk
1

(Too long and formatting-heavy for a comment.)

For each of the false proofs you linked (https://imgur.com/a/SUT0cRX), here is a brief analysis. I've spoilered them in case you want to try each one yourself before checking what I said. Please let me know in the comments if I made any mistakes or you have any questions.

Theorem 0.1:

This is true given the condition that for all x there exists some y such that x ~ y.

P.S. #1: This condition is necessary, because reflexivity itself implies it (take $y = x$).

P.S. #2: https://en.wikipedia.org/wiki/Equivalence_relation#Connections_to_other_relations confirms both parts of this: "A partial equivalence relation is transitive and symmetric. Such a relation is reflexive if and only if it is total, that is, if for all a, there exists some b such that a ∼ b. [ proof 1: If: Given a, let a ∼ b hold using totality, then b ∼ a by symmetry, hence a ∼ a by transitivity. — Only if: Given a, choose b = a, then a ∼ b by reflexivity. ] Therefore, an equivalence relation may be alternatively defined as a symmetric, transitive, and total relation."

Theorem 0.6:

This works for propositions which only have 2 possible values, true or false. It does not work for predicates that are sometimes true and sometimes false.

Theorem 0.2:

"x times" depends on x, but you treated it as a constant. (Also, you divided by x without stating that it must be non-zero.)

Theorem 0.3:

Splitting the square root, $\sqrt{ab} = \sqrt{a}\sqrt{b}$, only works when both factors are non-negative.

Theorem 0.4:

The inductive step is invalid when n=1, because there is no overlap between the sets.

Theorem 0.5:

The inductive step is invalid when n=0, because 1 is not less than or equal to 0.

Solomon Ucko
0

Here's a wrong proof, though it only takes a moment's thought to dispel:

"Theorem": $\mathrm{P} \not= \mathrm{NP}$

"proof": There is some oracle $O$ such that $\mathrm{P}^O \not= \mathrm{NP}^O$. But if $\mathrm{P} = \mathrm{NP}$ then $\mathrm{P}^O = \mathrm{NP}^O$ for any oracle $O$! QED.

cody