The best kittens, technology, and video games blog in the world.

Sunday, October 09, 2016

Solving sudoku with ruby and Z3

Solving Sudoku is a bit like FizzBuzz for constraint solvers, but since Z3 is very poorly known, and Z3 for ruby has few users other than me, I want to write a series of posts explaining various Z3 techniques, starting from very simple problems.

Parsing Sudoku

Before we get to Z3, let's read test data. A nice format for sudoku would be a text file like this:

_ 6 _ 5 _ 9 _ 4 _
9 2 _ _ _ _ _ 7 6
_ _ 1 _ _ _ 9 _ _
7 _ _ 6 _ 3 _ _ 9
_ _ _ _ _ _ _ _ _
3 _ _ 4 _ 1 _ _ 7
_ _ 6 _ _ _ 7 _ _
2 4 _ _ _ _ _ 6 5
_ 9 _ 1 _ 8 _ 3 _

Where _s represent empty cells, and numbers represent pre-filled cells.

Fairly straightforward code like this can parse it to 9x9 Array of Arrays:"\n").map do |line|{|c| c == "_" ? nil : c.to_i}

Getting us data structure like this:

[[nil, 6, nil, 5, nil, 9, nil, 4, nil],
 [9, 2, nil, nil, nil, nil, nil, 7, 6],
 [nil, nil, 1, nil, nil, nil, 9, nil, nil],
 [7, nil, nil, 6, nil, 3, nil, nil, 9],
 [nil, nil, nil, nil, nil, nil, nil, nil, nil],
 [3, nil, nil, 4, nil, 1, nil, nil, 7],
 [nil, nil, 6, nil, nil, nil, 7, nil, nil],
 [2, 4, nil, nil, nil, nil, nil, 6, 5],
 [nil, 9, nil, 1, nil, 8, nil, 3, nil]]

Z3 workflow

First, we need to get the solver:

solver =

After that we feed it with a bunch of formulas with solver.assert formula.
How would we describe sudoku problem in plain English?
  • There are 9x9 integer variables
  • Each of them is between 1 and 9
  • They correspond to prefilled data, unless that data is nil
  • Each row contains distinct values
  • Each column contains distinct values
  • Each 3x3 square contains distinct values
Once we tell Z3 about that, we check if our formulas are solvable with solver.check == :sat, and if it so, get model with solver.model - I feel those two steps should probably be refactored into one, but let's leave it for now.

Model can be then accessed with model[z3_variable] syntax to get our answers. So let's get to it!

Creating variables

There's nothing special about Z3 variables, they're sort of like ruby Symbols with associated types. So we could build our formulas with names like Z3.Int("cell[4,8]") - such name is just for our personal use, and doesn't mean cells form an array or anything.

However, since we'll be referring to the same 9x9 variables all the time it's probably useful to save them to an Array of Arrays.

cells = (0..8).map do |j|
  (0..8).map do |i|

Setting possible values

All variables are between 1 and 9, which is very easy to tell the solver about:

cells.flatten.each do |v|
  solver.assert v >= 1
  solver.assert v <= 9

Setting prefilled variables

Telling solver that Z3 Int variable is equal to some ruby Integer is completely straightforward:

cells.each_with_index do |row, i|
  row.each_with_index do |var, j|
    solver.assert var == data[i][j] if data[i][j]

All rows contain distinct values

If we relied on basic operations, we might have to give Z3 a lot of inequalities per row, fortunately there's Z3.Distinct(a,b,c,...) formula, which handles this very common case.

It makes it really easy:

cells.each do |row|
  solver.assert Z3.Distinct(*row)

All columns contain distinct values

We can use Array#transpose to flip our multidimensional array, and do the same thing we did for rows:

cells.transpose.each do |column|
  solver.assert Z3.Distinct(*column)

All square contain distinct values

This is a bit more complex rearrangement, but it's all pure ruby, with Z3 getting very similar looking formula in the end:

cells.each_slice(3) do |rows|
  rows.transpose.each_slice(3) do |square|
    solver.assert Z3.Distinct(*square.flatten)
By the way, you should really take a look at Enumerable API, it contains a lot of useful methods which will save you a ton of time.

Get the model and print it

And we're basically done:

raise "Failed to solve" unless solver.check == :sat
model = solver.model
cells.each do |row|
  puts{|v| model[v]}.join(" ")

Getting the answer we want:

8 6 3 5 7 9 2 4 1
9 2 5 3 1 4 8 7 6
4 7 1 8 2 6 9 5 3
7 1 4 6 8 3 5 2 9
6 8 9 7 5 2 3 1 4
3 5 2 4 9 1 6 8 7
1 3 6 2 4 5 7 9 8
2 4 8 9 3 7 1 6 5
5 9 7 1 6 8 4 3 2

Avoid temptation to optimize

Everything we did here was so ridiculously straightforward - we skipped even the most obvious shortcuts.

For example why the hell did we assert that cells are between 1 and 9 even for cells whose value we know perfectly well? And why did we even create variables for them if we know their value already? If we coded any kind of manual sudoku solver, we'd probably start with those.

It turns out Z3 really doesn't care about such optimizations, and will solve your problem just as efficiently either way - but by "optimizing" your code will become more complicated, and there's a good chance you'll make an error trying to "optimize".

That's not to say Z3 is made of magic, and that there are no cases where you can help it - but that's much more likely to be by restating a problem in a different way than by some microoptimizations.

Some pitfalls

One small pitfall to remember is that Z3 values - including those returned by solved models - override all operators including == so you can't check if for example model[Z3.Int("x")] == 5 - that will just create a Z3 Bool expression.

This is necessary behaviour for some more complex use cases, but for the simplest case you'll generally want to #to_s whatever you get out of the model and go on from there.

API is subject to change

Some details will probably change in future versions to simplify things - especially everything from solver.check onwards, which will probably receive a simpler API for the most common case, but existing one will still be supported as it's necessary for some more complex situations.

Integers variables constrained to specific range are another commonly used pattern which could use some shortcuts.

Z3 is crazy powerful

This was a very simple example. You could probably write a sudoku solver yourself - even if it'd take a lot more time, and would probably perform a lot worse than Z3.

This manual approach doesn't scale - doing much bigger and much more complex problems with a constraint solver is as literally fast as just writing the constraints, while doing it manually constraint list is barely a starting point, and naive approaches get exponentially slow almost right away.

Due to unfortunate accidents of programming history constraint solvers got relegated to an obscure niche, but they really ought to be a tool in every good programmer's toolkit, and once you learn them you'll find applications for them all the time.

No comments: