
Sudoku is a puzzle whose objective is to fill a 9×9 grid with digits so that each column, each row, and each of the nine 3×3 sub-grids that compose the grid (also called "sudoku-blocks") contains all of the digits from 1 to 9.

Let's define a block as any 3×3 sub-grid (not necessarily one of the sudoku-blocks, but including them) containing all the digits 1 to 9.

Let's define $N$ as the number of valid blocks on the grid.

For the usual sudoku puzzle $N=9$.

The maximum theoretically possible is $N=49$ (7 block positions horizontally × 7 block positions vertically).

I found a sudoku puzzle with $N=10$, which proves that puzzles with $N>9$ exist. Here's one:

+-------+-------+-------+
| 3 9 6 | 4 1 5 | 2 7 8 |
| 1 2 5 | 7 3 8 | 4 6 9 |
| 4 7 8 | 2 6 9 | 3 1 5 |
+-------+-------+-------+
| 7 5 9 | 6 4 2 | 8 3 1 |
| 8 4 3 | 5 9 1 | 7 2 6 |
| 2 6 1 | 3 8 7 | 5 9 4 |
+-------+-------+-------+
| 5 3 4 | 9 2 6 | 1 8 7 |
| 6 8 7 | 1 5 3 | 9 4 2 |
| 9 1 2 | 8 7 4 | 6 5 3 |
+-------+-------+-------+

The 10th block is near the top right corner (rows 1 to 3, columns 6 to 8):

5 2 7
8 4 6
9 3 1

And here's another with $N=33$ ($N = 3 \cdot 7 + 3 \cdot 7 - 9$):

+-------+-------+-------+
| 1 2 3 | 4 5 6 | 7 8 9 |
| 4 5 6 | 7 8 9 | 1 2 3 |
| 7 8 9 | 1 2 3 | 4 5 6 |
+-------+-------+-------+
| 2 3 1 | 5 6 4 | 8 9 7 |
| 5 6 4 | 8 9 7 | 2 3 1 |
| 8 9 7 | 2 3 1 | 5 6 4 |
+-------+-------+-------+
| 3 1 2 | 6 4 5 | 9 7 8 |
| 6 4 5 | 9 7 8 | 3 1 2 |
| 9 7 8 | 3 1 2 | 6 4 5 |
+-------+-------+-------+
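
The count for a given grid can be checked mechanically. Below is a minimal Python sketch (not part of the original post; the names `block_anchors` and `GRID_33` are made up for illustration) that slides a 3×3 window over all 49 positions and records the top-left corner of every window containing each digit 1 to 9 exactly once.

```python
def block_anchors(grid):
    """Return the 0-indexed (row, col) top-left corners of all valid 3x3 blocks."""
    digits = set(range(1, 10))
    anchors = []
    for r in range(7):            # 7 possible vertical positions
        for c in range(7):        # 7 possible horizontal positions
            window = {grid[r + dr][c + dc] for dr in range(3) for dc in range(3)}
            if window == digits:  # 9 cells holding 9 distinct digits
                anchors.append((r, c))
    return anchors


# The N=33 grid from the post (hypothetical variable name).
GRID_33 = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
    [4, 5, 6, 7, 8, 9, 1, 2, 3],
    [7, 8, 9, 1, 2, 3, 4, 5, 6],
    [2, 3, 1, 5, 6, 4, 8, 9, 7],
    [5, 6, 4, 8, 9, 7, 2, 3, 1],
    [8, 9, 7, 2, 3, 1, 5, 6, 4],
    [3, 1, 2, 6, 4, 5, 9, 7, 8],
    [6, 4, 5, 9, 7, 8, 3, 1, 2],
    [9, 7, 8, 3, 1, 2, 6, 4, 5],
]

print(len(block_anchors(GRID_33)))  # 33
```

Every anchor found this way has its row or its column on a sudoku-block boundary, which matches the $3 \cdot 7 + 3 \cdot 7 - 9$ count.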

Questions:

  1. Does a sudoku puzzle with $N=49$ exist? No
  2. If yes, then what is it?
  3. If no, then what's the maximum possible N? Why?

Update. This update is fully based on @Emisor's answer and proof.

Assume $N=49$ is possible, and let's try generating a puzzle:

+-------+--- 
| 1 2 3 | X 
| 4 5 6 | Y 
| 7 8 9 | Z
+-------+---
| A B C | D

The block in the figure is taken as our "starting" one. Since there are no other numbers on the board, having its digits in order makes no difference for now.

Now, for $N=49$ several conditions must be met:

  1. X, Y, Z must be filled with 1, 4, 7 (so that the block 23X, 56Y, 89Z is valid)
  2. A, B, C must be filled with 1, 2, 3 (so that the block 456, 789, ABC is valid)

Since X cannot be 1 and A cannot be 1 (the first row and the first column already contain a 1), these statements are also true:

  1. Y or Z must be 1
  2. B or C must be 1

That makes block 56Y, 89Z, BCD invalid, as it would contain two 1s; therefore $N=49$ is impossible.
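
The argument is small enough to check exhaustively. The following Python sketch is an illustration only (the function name `four_block_fillings` and its `require_latin` flag are assumptions, not from the post): it tries every filling of X, Y, Z, A, B, C, D and keeps those where the block 56Y, 89Z, BCD is valid too, with or without the row/column rule on the 4×4 corner.

```python
from itertools import permutations

DIGITS = set(range(1, 10))


def four_block_fillings(require_latin=True):
    """List (X, Y, Z, A, B, C, D) for which all four overlapping blocks are valid."""
    found = []
    for X, Y, Z in permutations((1, 4, 7)):      # makes block 23X / 56Y / 89Z valid
        for A, B, C in permutations((1, 2, 3)):  # makes block 456 / 789 / ABC valid
            for D in range(1, 10):
                rows = [(1, 2, 3, X), (4, 5, 6, Y), (7, 8, 9, Z), (A, B, C, D)]
                cols = list(zip(*rows))
                # Sudoku rule restricted to the 4x4 corner: no repeat in a row or column.
                latin = all(len(set(line)) == 4 for line in rows + cols)
                # The diagonal neighbour block 56Y / 89Z / BCD.
                diagonal_ok = {5, 6, Y, 8, 9, Z, B, C, D} == DIGITS
                if diagonal_ok and (latin or not require_latin):
                    found.append((X, Y, Z, A, B, C, D))
    return found


print(four_block_fillings())                               # []
print(len(four_block_fillings(require_latin=False)) > 0)   # True
```

The second line shows that the row and column rule is what drives the contradiction: without it, fillings such as X=1, A=1, D=1 make all four blocks valid.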

That makes only one question left:

What's the maximum possible $N$? Why?

rfg
  • Could you be more specific on the definition of the number $N$? I don't see what you mean, why is $N=10$ in your first example and where is this 10th block located in your $9\times 9$-grid? – Mathematician 42 May 22 '16 at 09:08
  • I think I understood the intention of the OP. If I understood right, he counts the $3\times 3$-rectangulars ( not necessarily forming one of the sudoku-blocks ) containing all the digits. – Peter May 22 '16 at 09:10
  • The 10th block is in top right corner, I updated the question. – rfg May 22 '16 at 09:14
  • @Peter yes, any 3x3 rectangular (not necessarily forming one of the sudoku-blocks, but including them) containing all the digits (1-9) – rfg May 22 '16 at 09:15
  • You should further explain, how the number is determined in the second example. A pattern may emerge this way. – Peter May 22 '16 at 09:20
  • @Peter I include the sudoku-blocks (usual non crossing 9 blocks) in the blocks count. The second example has 33 blocks, because $3 \cdot 7 + 3 \cdot 7$ counts sudoku-blocks twice. – rfg May 22 '16 at 09:21
  • Yes, I just realized that. – Peter May 22 '16 at 09:22
  • Maybe you just invented a new sudoku-variant. You could call it "super-sudoku" and I guess you need much fewer entries to get a unique solution. Imagine how successful the extension from a Latin Square to a Sudoku was. You could be the one to do the next step :). My intuition is that $33$ will be hard to beat. – Peter May 22 '16 at 09:26
  • The N in a second example was determined semi-manually. Let's define three rows and three columns (as shown). Each row and each column contains 7 valid blocks. There are also 16 blocks which are not valid, which are placed on more than one column or row. – rfg May 22 '16 at 09:28
  • I do not understand the two close-votes. After the problem has been clarified, I do not see any reason for a closure. – Peter May 22 '16 at 09:38
  • If this question is closed, I will vote for reopen. – Peter May 22 '16 at 10:15
  • @Peter I, likewise. – Patrick Stevens May 22 '16 at 10:30
  • I actually like the idea of generalizing it a bit more, thus we achieve 9 possibilities per row/column. With wrapping around, you could "increase" the $N$ and maybe prevent misunderstandings between people who think with wrapping around and others who don't. Since there will be people claiming an $N>33$ by just allowing wrapping around ("thinking outside of the box", while actually doing nothing). – Patrick Abraham May 22 '16 at 12:03
  • May I suggest a notation for representing the overlapping solution blocks? Boldface, add a prime, or otherwise mark the top left number in each valid solution block. – alexis May 22 '16 at 12:53
  • @alexis "mark the top left number in each valid solution block" (except for base 9 blocks) sounds good. The only problem is that on stack preformatted text can't be bold. – rfg May 22 '16 at 13:28
  • I know.. fortunately there's space for a prime or star or something: *1 2 3. – alexis May 22 '16 at 13:30
  • I'd also mark the 9 basic blocks-- makes it trivial to count $N$, and easier to see patterns. – alexis May 22 '16 at 13:32
  • This question also might be suited for http://puzzling.stackexchange.com. also lots of math people there and you might have a better chance of getting the right answer – Ivo May 22 '16 at 15:09
  • I have validated N=33 is the best, but it was using brute force checking, not a nice proof! – Chris Jefferson May 22 '16 at 18:31
  • @ChrisJefferson I did not expect that brute force would work here. But a proof by brute force is better than none. The magic-knight problem also was solved in the negative sense only with brute force. Furthermore, such a proof saves a lot of time. This time, my intuition was right :) – Peter May 23 '16 at 12:23
  • So far @alexis's Lemma 2 is necessary for the brute force to finish in a reasonable amount of time. – Chris Jefferson May 23 '16 at 14:57

4 Answers

11

Proof that $N=49$ is impossible:

+-------+---   1) +-------+      2) +-------+      3) +-------+
| 1 2 3 | X       |(1)2 3 |*1       | 1 2 3 |(7)      | 1 2 3 |(4)
| 4 5 6 | Y       |(4)5 6 |*4       | 4 5 6 | 1       |(4)5 6 | 7
| 7 8 9 | Z       |(7)8 9 |*7       |(7)8 9 | 4       | 7 8 9 | 1
+-------+---      +-------+         +-------+         +-------+ 
| A B C | D       | A B C | D       | A*B*C |*D       | A*B*C |*D  

The block in the first figure is taken as our "starting" one. Since there are no other numbers on the board, having its digits in order makes no difference for now. Now, there are 3 different ways to fill tiles XYZ with 1, 4 and 7 (figures 1 to 3), but they all fail:

Case 1: Obviously, there are 3 collisions (each of 1, 4 and 7 would appear twice in its row).

Case 2: BCD must contain 2, 3 and 7 in some order. However, 7 can't be in B or C, because ABC must hold 1, 2, 3 for the block 456, 789, ABC to be valid. And it can't be in D either, because D is in the same column as the 7 just placed in X.

Case 3: Similar to before, except with 4.

That means there's no way to make both of those blocks (23X, 56Y, 89Z and 56Y, 89Z, BCD) valid at the same time.

Therefore, $N=49$ is impossible.

Dleep
  • "Now, there are 3 different ways to fill tiles XYZ with 7 8 and 9." I think XYZ must be filled with 1,4,7 and ABC must be filled with 1,2,3. – rfg May 22 '16 at 10:35
  • oops, my bad. Editing – Dleep May 22 '16 at 10:37
  • Good answer! This will drastically decrease the maximum possible value because you can apply this proof to many pairs of blocks. – Peter May 22 '16 at 11:02
  • @Emisor. There is, I think, a simpler way to express your proof (with which I do agree). Edited original question. – rfg May 22 '16 at 11:31
  • If we don't try to avoid collisions with the numbers in the starting block, there are six ways of putting $1,4,7$ in the three spaces $X,Y,Z$. Of course if we do try to avoid collisions there are only two ways (your cases $2$ and $3$): either $X=4$ or $X=7$, then there's only one place to put the number which is neither $X$ nor $1$ without collision, and finally that leaves only one place to put the $1$. – David K May 22 '16 at 14:53
  • It took me a minute to realize that '7 on B or C collides with our original grid' meant 'it will mean that a different block won't be a valid 9-block' - the block that fails isn't part of the original grid, and the original grid can still be a valid sudoku with 7 on e.g.B - just not one with 49 9-blocks. That might be worth cleaning up a little bit here. – Steven Stadnicki May 22 '16 at 16:32
3

Using @Emisor's construction, we can reduce the upper bound to 39.

First a bit of set-up:

  1. I'll use Cartesian coordinates to refer to positions on the board, with (1,1) at the top left and (9,9) bottom right.

  2. I will identify solution blocks by the coordinates of their top left cell. I'll say the block is "anchored" at these coordinates. The obligatory Sudoku blocks are anchored at (1,1), (1,4), (1,7), (4,1), (4,7), etc.

If we look at @Emisor's construction carefully (actually: at the OP's distillation of it), we can see that it rules out not just the case $N = 49$ but any configuration where four cells in a 2x2 block are all solution anchors. Emisor's construction leads to this lemma:

Lemma 1: Let (x,y) be a solution anchor. Suppose that (x+1,y) and (x,y+1) are also solution anchors. Then (x+1,y+1) cannot be a solution anchor. (I won't copy the proof: just observe that these are the only assumptions it requires.)

Lemma 2: In any 2x2 group of cells, at most three can be solution anchors. Proof: Just change orientation and apply the construction of Lemma 1.

So let's consider all of the 49 potential solution anchors. We can assign most of them to non-overlapping 2x2 groups, each of which must have at least one non-solution.

Here's a simple assignment into groups, built around the obligatory Sudoku solution blocks (which are capitalized). Each letter denotes the members of one group.

        1 2 3   4 5 6   7 8 9
      +-------+-------+-------+
    1 | A a . | B b c | C . . |
    2 | a a . | b b c | c . . |
    3 | . . . | . . . | . . . |
      +-------+-------+-------+
    4 | D d . | E e f | F . . |
    5 | d d . | e e f | f . . |
    6 | g g . | h h j | j . . |
      +-------+-------+-------+
    7 | G g . | H h j | J . . |
    8 | . . . | . . . | . . . |
    9 | . . . | . . . | . . . |
      +-------+-------+-------+

There are nine groups above, so there can be at most $49-9 = 40$ solution blocks, i.e., $N_{max} \le 40$. (Lowered to 39 below, keep reading.)

Notes:

  1. The argument is valid for any 2x2 group, but we must choose non-overlapping groups in order to be able to count the non-solutions.

  2. Note that a number of potential anchor cells (column 3 and row 3) have not been assigned to a group. Since the solution anchors must lie in a 7x7 grid, I doubt there's a way to fit in a 10th 2x2 group. (There can only be 3 groups side by side, but I wouldn't call that a complete proof).

  3. There may well be constraints expressible in different ways. Given the number of positions that don't belong to a group, I'm pretty sure the true maximum is lower than 40. Stay tuned for updates (by me or others :-)

Update 1

Based on the above block arrangement, we can shave one more off $N_{max}$. While in general we don't know which cell in a group will be the non-solution, we know that the capital letters must anchor sudoku blocks since our grid is a valid Sudoku. In particular, position (4,4), marked by E above, must anchor a solution block. This means that at least one of the positions (3,3), (3,4), or (4,3) does not anchor a block. These are marked with @ below:

        1 2 3   4 5 6   7 8 9
      +-------+-------+-------+
    1 | A a . | B b c | C . . |
    2 | a a . | b b c | c . . |
    3 | . . @ | @ . . | . . . |
      +-------+-------+-------+
    4 | D d @ | E e f | F . . |
    5 | d d . | e e f | f . . |
    6 | g g . | h h j | j . . |
      +-------+-------+-------+
    7 | G g . | H h j | J . . |
    8 | . . . | . . . | . . . |
    9 | . . . | . . . | . . . |
      +-------+-------+-------+

This means that $N_{max}$ can be at most $49 - 9 - 1 = 39$.

Update 2

In a comment, @rfg shows how to arrange 39 anchor points so that they respect the 2x2 constraint, and include the 9 basic Sudoku blocks. Here's @rfg's sketch; the X's are the positions that are not anchor points. (imgur link)

Of course this does not mean that there is an actual Sudoku with these 39 solution blocks. But it does show that 39 is the limit of this approach: If $N_{max}$ is in fact less than 39, it cannot be shown by exploiting this constraint alone.
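
Both bounds of this approach (40 without the Sudoku blocks, 39 with them) can be reproduced by a small search over anchor patterns alone. The Python sketch below is an illustration under stated assumptions: it is not the method used in this answer or in @rfg's sketch, the names `max_anchors` and `SUDOKU_ANCHORS` are made up, and it only looks at the 2x2 constraint on the 7x7 anchor grid, not at whether a Sudoku with that pattern exists. It uses a row-by-row dynamic programme over bitmasks.

```python
SIZE = 7  # block anchors form a 7x7 grid


def popcount(mask):
    return bin(mask).count("1")


def max_anchors(forced=()):
    """Most cells of a 7x7 grid that can be marked with no fully marked 2x2 group.

    `forced` lists 0-indexed (row, col) cells that must be marked.
    """
    forced = set(forced)

    def row_allowed(r, mask):
        return all((mask >> c) & 1 for fr, c in forced if fr == r)

    def no_full_2x2(upper, lower):
        both = upper & lower               # columns marked in both rows
        return (both & (both >> 1)) == 0   # no two adjacent such columns

    masks = range(1 << SIZE)
    best = {m: popcount(m) for m in masks if row_allowed(0, m)}
    for r in range(1, SIZE):
        nxt = {}
        for m in masks:
            if not row_allowed(r, m):
                continue
            totals = [t for prev, t in best.items() if no_full_2x2(prev, m)]
            if totals:
                nxt[m] = popcount(m) + max(totals)
        best = nxt
    return max(best.values())


# The nine obligatory Sudoku blocks, as 0-indexed anchors.
SUDOKU_ANCHORS = [(r, c) for r in (0, 3, 6) for c in (0, 3, 6)]

print(max_anchors())                # 40
print(max_anchors(SUDOKU_ANCHORS))  # 39
```

The search visits at most 128 × 128 row pairs per row, so it finishes almost immediately.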

alexis
  • No, Emisor’s answer does not show that a block at $(x, y)$ rules out a block at $(x + 1, y + 1)$. That is only true if there are blocks at $(x + 1, y)$ and $(x, y + 1)$ as well. Indeed, look at the example in the question for $N = 33$. (For example, there are blocks at $(1, 3)$ and $(2, 4)$.) – Eike Schulte May 22 '16 at 14:18
  • Oh rats, you are right @Eike. Chalk it under "don't use theorems without mastering their proof." – alexis May 22 '16 at 14:21
  • There, this should be correct now. Upper bound at 40. Thanks for identifying the problem, @EikeSchulte! – alexis May 22 '16 at 15:18
  • If that is true, then $N=33$ (in OP post) is the highest possible $N$ because for any of 49 blocks (x,y) in sample either of the following is true: (x,y) is a valid block OR (x±1,y±1) is a valid block => no more valid blocks are possible – rfg May 22 '16 at 15:40
  • @rfg, I don't think that's correct. It's easy to arrange more than 33 points in a 7x7 matrix so that none form a 2x2 group. The solutions in the OP's solution are arranged in a way that you couldn't add a solution without creating a 2x2 group of solutions, but that doesn't prove anything: I can minimally rearrange them and add more anchors. (While there are Sudoku solutions to go with the anchor arrangements is another matter.) – alexis May 22 '16 at 15:46
  • @alexis there can be arranged exactly 40 points in a 7x7 matrix so that none form a 2x2 group. Consider http://i.imgur.com/E3bVCPp.jpg – rfg May 22 '16 at 16:19
  • @rfg you are right, BUT this is not a valid Sudoku: The center block (marked by E in my diagram) must be a solution according to Sudoku rules. I can get 37 while respecting the 9 Sudoku blocks. Whether there is an actual solution to go with these anchors is another question altogether... – alexis May 22 '16 at 16:27
  • @alexis forgot about it on 7x7 matrix. Here's 39 respecting the 9 Sudoku blocks. Bold dots are Sudoku blocks http://i.imgur.com/BwApqkN.jpg – rfg May 22 '16 at 16:50
  • Nice! Now if you could somehow put together a Sudoku with this arrangement, you've established $N_{max} \ge 39$, which (together with my answer) means it is 39. :-). But there might well be other constraints too, that make this arrangement impossible... – alexis May 22 '16 at 17:29
2

UPDATE: After adding condition 3 below (only 3 out of every 4 locations in a 2x2 square can be a block), Savile Row + lingeling (a SAT solver) prove there is no solution with 34 or more blocks in about 10 minutes. This does not, of course, provide any kind of nice proof.

I have tried modelling this problem in the Savile Row system, which can generate input for SAT solvers and the Minion constraint solver.

This can re-create the $N=33$ solution almost instantly, but so far hasn't found anything for $N > 33$ after a couple of hours.

I plan on working on improving my model. At the moment the only non-obvious facts I am using are:

  • When counting the 3x3 squares which satisfy the condition, remember the original sub-blocks always do

  • We can, without loss of generality, assign the first row to 1,2,...9

  • In any 2x2 group of cells, at most 3 of the 3x3 squares that have one of those cells as their top left corner are blocks.
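
The second fact holds because renaming digits never changes which 3x3 squares are blocks. Here is a minimal Python sketch of that relabelling, using the $N=10$ grid from the question; the names `GRID_10`, `block_anchors` and `relabel_first_row` are hypothetical and not part of the Savile Row model.

```python
# The N=10 grid from the question (hypothetical variable name).
GRID_10 = [
    [3, 9, 6, 4, 1, 5, 2, 7, 8],
    [1, 2, 5, 7, 3, 8, 4, 6, 9],
    [4, 7, 8, 2, 6, 9, 3, 1, 5],
    [7, 5, 9, 6, 4, 2, 8, 3, 1],
    [8, 4, 3, 5, 9, 1, 7, 2, 6],
    [2, 6, 1, 3, 8, 7, 5, 9, 4],
    [5, 3, 4, 9, 2, 6, 1, 8, 7],
    [6, 8, 7, 1, 5, 3, 9, 4, 2],
    [9, 1, 2, 8, 7, 4, 6, 5, 3],
]


def block_anchors(grid):
    """0-indexed top-left corners of all 3x3 squares holding the digits 1..9."""
    digits = set(range(1, 10))
    return [(r, c) for r in range(7) for c in range(7)
            if {grid[r + dr][c + dc] for dr in range(3) for dc in range(3)} == digits]


def relabel_first_row(grid):
    """Rename the digits so that the first row reads 1, 2, ..., 9."""
    rename = {old: new for new, old in enumerate(grid[0], start=1)}
    return [[rename[d] for d in row] for row in grid]


relabelled = relabel_first_row(GRID_10)
print(relabelled[0])                                        # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(block_anchors(relabelled) == block_anchors(GRID_10))  # True
```

Fixing the first row this way removes a factor of 9! from the search without excluding any block pattern.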

Here is the Savile Row model, for the interested reader (designed to find > 33 squares).

language ESSENCE' 1.0
letting   RANGE be domain int(1..9)
letting   LONGRANGE be domain int(0..9)
letting   VALUES be domain int(0..9)

find field: matrix indexed by [RANGE, RANGE] of RANGE

find squarecheck : matrix indexed by [LONGRANGE, LONGRANGE] of bool

find squares: int(34..100)

such that
  $ all rows have to be different
  forAll row : RANGE .
       allDiff(field[row,..]),

  $ all columns have to be different
  forAll col : RANGE .
     allDiff(field[..,col]),     

  $ all nine standard 3x3 blocks have to be different (and always count as blocks)
  forAll i,j : int(0..2) . (
    allDiff([field[row1+(i*3), col1+(j*3)] | 
             row1 : int(1..3), col1 : int(1..3) ])
    /\
    squarecheck[i*3,j*3] = true
  ),

  $ in any 2x2 group of anchors, at most 3 can be blocks
  forAll i : int(0..8). forAll j : int(0..8).
    squarecheck[i,j]   + squarecheck[i+1,j]  +
    squarecheck[i,j+1] + squarecheck[i+1,j+1] <= 3,

  $ squarecheck[i,j] holds exactly when the 3x3 square offset by (i,j) has all different digits
  forAll i : int(0..9). forAll j : int(0..9) .
    squarecheck[i,j] =  allDiff([field[row1+i, col1+j] | 
             row1 : int(1..3), col1 : int(1..3) ]),

 sum(flatten(squarecheck)) = squares,


 forAll i : int(1..9). field[1,i] = i
  • Please read @alexis answer and comments to that answer. There is a possible block structure for $N=39$ (which is possibly the maximum $N$). As for an model/algorithm, consider this: 1) we fill the first block with numbers 1-9 2) We take any neighbor (horizontally or vertically) block - and we immediately know which numbers we must fill to make this block valid as neighbor blocks share six values 3) We check if sudoku is still valid, if yes, fill the next neighbor, if no, revert back a step. Theoretically this should yield a possible sudoku with $N=39$ if the blocks placement is correct. – rfg May 22 '16 at 18:11
  • I'm trying this too. Your idea for building the solution up incrementally is what systems like Savile Row do, but will I imagine do much better than we would by hand! – Chris Jefferson May 22 '16 at 18:26
1

I confirmed via integer linear programming (in 16 seconds) that the maximum is $33$. Here is SAS code:

proc optmodel;
   /* declare parameters and sets */
   num n = 9;
   set ROWS = 1..n;
   set COLS = ROWS;
   set CELLS = ROWS cross COLS;
   set DIGITS = 1..9;
   num numRegions init 0;
   set REGIONS = 1..numRegions;
   set <num,num> CELLS_r {REGIONS};
   /* rows */
   for {i in ROWS} do;
      numRegions = numRegions + 1;
      CELLS_r[numRegions] = {i} cross COLS;
   end;
   /* columns */
   for {j in COLS} do;
      numRegions = numRegions + 1;
      CELLS_r[numRegions] = ROWS cross {j};
   end;
   /* usual nine 3 by 3 blocks */
   for {fi in 0..2, fj in 0..2} do;
      numRegions = numRegions + 1;
      CELLS_r[numRegions] = {<i,j> in CELLS: floor((i-1)/3) = fi and floor((j-1)/3) = fj};
   end;

   /* X[i,j,k] = 1 if cell (i,j) contains digit k; 0 otherwise */
   var X {CELLS, DIGITS} binary;

   /* each cell contains exactly one digit */
   con OneDigitPerCell {<i,j> in CELLS}:
      sum {k in DIGITS} X[i,j,k] = 1;

   /* each region contains each digit exactly once */
   con Alldiff {r in REGIONS, k in DIGITS}:
      sum {<i,j> in CELLS_r[r]} X[i,j,k] = 1;

   /* define candidate Sudoku blocks */
   set BLOCKS = {<i,j> in CELLS: <i+2,j+2> in CELLS};
   set CELLS_block {<i,j> in BLOCKS} = i..i+2 cross j..j+2;

   /* IsBlock[b] = 1 if block b has distinct digits; 0 otherwise */
   var IsBlock {BLOCKS} binary;

   /* maximize the number of Sudoku blocks */
   max NumUsedBlocks = sum {<i,j> in BLOCKS} IsBlock[i,j];

   /* IsBlock[bi,bj] = 1 implies that each digit appears in the block */
   con AlldiffBlock {<bi,bj> in BLOCKS, k in DIGITS}:
      IsBlock[bi,bj] <= sum {<i,j> in CELLS_block[bi,bj]} X[i,j,k];

   /* call mixed integer linear programming solver */
   solve;

   /* print solution */
   num sol {CELLS};
   for {<i,j> in CELLS} do;
      for {k in DIGITS: X[i,j,k].sol > 0.5} do;
         sol[i,j] = k;
         leave;
      end;
   end;
   print sol;
   print IsBlock;
quit;

Here is one such solution:

7 3 5 4 2 6 9 8 1 
4 2 6 9 8 1 7 3 5 
9 8 1 7 3 5 4 2 6 
5 7 3 2 6 4 8 1 9 
2 6 4 8 1 9 5 7 3 
8 1 9 5 7 3 2 6 4 
3 5 7 6 4 2 1 9 8 
6 4 2 1 9 8 3 5 7 
1 9 8 3 5 7 6 4 2 
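
As a sanity check independent of the solver, the grid above can be verified directly. The Python sketch below is an illustration, not part of the SAS run (all names such as `SOLUTION`, `is_sudoku` and `respects_2x2` are assumptions): it confirms that the grid is a valid Sudoku, counts its blocks, and checks that no 2x2 group of anchors consists of four blocks.

```python
SOLUTION = [
    [7, 3, 5, 4, 2, 6, 9, 8, 1],
    [4, 2, 6, 9, 8, 1, 7, 3, 5],
    [9, 8, 1, 7, 3, 5, 4, 2, 6],
    [5, 7, 3, 2, 6, 4, 8, 1, 9],
    [2, 6, 4, 8, 1, 9, 5, 7, 3],
    [8, 1, 9, 5, 7, 3, 2, 6, 4],
    [3, 5, 7, 6, 4, 2, 1, 9, 8],
    [6, 4, 2, 1, 9, 8, 3, 5, 7],
    [1, 9, 8, 3, 5, 7, 6, 4, 2],
]
DIGITS = set(range(1, 10))


def window(grid, r, c):
    """Digits in the 3x3 square whose top-left cell is (r, c), 0-indexed."""
    return {grid[r + dr][c + dc] for dr in range(3) for dc in range(3)}


def is_sudoku(grid):
    rows_ok = all(set(row) == DIGITS for row in grid)
    cols_ok = all(set(col) == DIGITS for col in zip(*grid))
    boxes_ok = all(window(grid, r, c) == DIGITS for r in (0, 3, 6) for c in (0, 3, 6))
    return rows_ok and cols_ok and boxes_ok


def respects_2x2(anchors):
    """True if no 2x2 group of anchor positions is made of four blocks."""
    return not any({(r, c), (r + 1, c), (r, c + 1), (r + 1, c + 1)} <= anchors
                   for r in range(6) for c in range(6))


anchors = {(r, c) for r in range(7) for c in range(7)
           if window(SOLUTION, r, c) == DIGITS}

# 7x7 map of block anchors: '#' marks a block, '.' a non-block.
for r in range(7):
    print(" ".join("#" if (r, c) in anchors else "." for c in range(7)))
print(is_sudoku(SOLUTION), len(anchors), respects_2x2(anchors))  # True 33 True
```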

It turns out that the maximum is still $33$ if you relax to Latin squares, that is, if you do not force the usual nine $3\times 3$ Sudoku blocks.

RobPratt
  • Thank you! I kind of hoped that something beyond 33 does exist, but nice to know that it is an actual upper limit. – rfg Apr 27 '25 at 01:45