Paint-by-Numbers

Paint-by-Numbers

Written by

Dylan Holmes

Table of Contents

1 How can you compute numbers cheaply?

Let's say I give you a calculator with a stack-based memory: you can push a positive integer onto the stack, or clone the top element of the stack, or perform arithmetic operations on the top two elements of the stack—addition, subtraction, multiplication, integer division.

Suppose these operations each have a cost: the cost of pushing a number onto the stack is the number itself, and the cost of every other operation is 1. The game is to find, for each integer in the range 1…256, the lowest-cost program which terminates with that integer being the only item in the stack.

For example, a minimal-cost program for generating 7 is:

1,3,@,+,+

In detail, this program

  • Starts with an empty stack, as always. [ ]
  • Pushes 1 onto the stack. [ 1 ]
  • Pushes 3 onto the stack. [1, 3]
  • Clones (@) the top member of the stack [1, 3, 3]
  • Adds the top two members of the stack [1, 6]
  • Adds the top two members of the stack, yielding the desired result. [ 7 ]

Of course, there may be many equally-inexpensive programs for generating the same number.

In this article, I describe:

  1. An algorithm for finding the lowest-cost program to generate a number.
  2. An optimization strategy which speeds up the algorithm by cutting out unnecessarily repeated work.
  3. A fantastic theoretical result about how complicated the lowest-cost program can be.
  4. An interesting consequence about prime numbers.

At a high level, this article is about how hard numbers are to produce when using a specific calculation language. (This is the numbers' algorithmic complexity, i.e. their "Kolmogorov complexity"). Some numbers can be produced very cheaply, while others defy simple calculations—and large numbers are not always more costly to produce than smaller ones! Compare the cheapest programs for producing 177 and 256:

[2, @, @, 1, +, +, @, @, 2, +, *, *, +] # yields 177, costs 15
[2, @, *, @, *, @, *] # yields 256; costs 8

At a lower level, the work in this article was inspired by the Piet programming language, an esoteric programming language in which programs are bitmap images reminiscent of Piet Mondrian's paintings. It's advantageous to compute numbers using the fewest pixels possible, and this article describes how to do just that.

2 You can find optimal programs using graph search

There are many different strategies for producing a number. For example, you might try to produce its prime factors and multiply them together. Or maybe it would be cheaper to produce a convenient nearby power of two, then subtract. Or maybe you could write out the number in a particular base by repeatedly multiplying by the base and adding numbers together.

There are many possibilities—how do we find the optimal strategy? My solution is to recast the problem of finding a cheap program as a problem of searching a graph for the shortest possible path between two nodes. By recasting the program-finding problem as a path-finding problem, we can apply an off-the-shelf optimal algorithm called branch and bound (a.k.a. uniform cost search) which is guaranteed to find the shortest path through any graph.

Here is the recasting strategy:

  • The nodes of the graph are contents in the stack. For example, [1,5,7].
  • The weighted directed edges between nodes are the operators such as addition, multiplication, pushing a number onto the stack, and cloning the top of the stack. The "source" node is the state of the stack before the operator is applied, and the "target" node is the state of the stack after the operator is applied. The length/weight of the edge is the cost of an operator.
  • Paths between two nodes in the graph are therefore sequences of operators which, if you apply them, transform the start state into the goal state. The total "length" of the path is the total cost of the program.
  • In our particular problem, we want to find the lowest-cost program which transforms the initial state [] into our desired number [N]. In other words, we want a minimal-length path between [] and N in the graph.

This conversion is all we need; the branch-and-bound algorithm is guaranteed to find the shortest path through this graph and consequently to find the shortest program for generating a particular number. (Python code)

We get results like these (each line consists of a number, followed by an optimally inexpensive program for generating that number). See first results for the complete list.

1 1
2 2
3 3
4 4
5 5
6 3,@,+
7 7
8 4,@,+
9 3,@,*
10 5,@,+
11 3,@,*,2,+
12 3,@,+,@,+
13 3,@,+,@,+,1,+
14 4,@,*,2,-
15 4,@,*,1,-
16 4,@,*
--- snip ---
248 5,@,@,*,*,1,-,@,+
249 3,@,@,*,@,*,2,+,*
250 5,@,@,*,*,@,+
251 5,@,@,*,*,@,+,1,+
252 3,@,+,@,@,1,+,*,*
253 4,@,*,@,*,3,-
254 4,@,*,@,*,2,-
255 4,@,*,@,*,1,-
256 4,@,*,@,*

3 An extended set makes search much faster

The branch-and-bound algorithm, as implemented, is guaranteed to produce

4 How many programs can stop at 5

We have found that all of the numbers between 1 and 256 can be generated using optimal-cost programs that only use constants between 1 and 5. We can ask for a particular number N, what the largest constant you need to include in order to generate the lowest-cost program?

For example, you might imagine that while the numbers between 1 … 256 cost less than 16 and require only constants as high as 5, larger numbers—at least the outputs of the Ackermann function, perhaps? — would cost more and require some greater constants in order to be written as cheaply as possible; if you attempt to compute the number with fewer constants, the program will cost more than is optimal.

On the other hand, you might imagine that all integers can be written as cheaply as possible using only a fixed handful of constants. This would suggest that number-generating programs are not especially penalized for computing large numbers by using small constants and a bunch of arithmetic, rather than employing larger constants and fewer arithmetic operations.

What do you think? (You should feel free to decide what you believe before continuing.) I puzzled over this for a long time, as both views seemed intuitively appealing and a proof seemed elusive. Eventually, however, I established the answer.

What I proved was the following theorem:

Theorem. Each positive integer N can be optimally computed with a program that uses only the constants 1…5. That program is absolutely as cheap as possible, meaning that all other programs for computing N, regardless of constants they use, cost at least as much.

Moreover, I proved that this bound is tight in the sense that the theorem works for constants in [1…5], but fails for shorter intervals like [1…4].

The theorem is actually a direct consequence of the following Lemma:

Lemma Each positive integer N can be optimally computed with a program that uses only the constants 1…5 and which costs no more than N.

The lemma provides a sort of diagonalization argument. First, I'll prove that the lemma implies the theorem, then I'll prove the lemma.

The lemma implies the theorem. Suppose the lemma is true and you have an optimally-costed program \(Q\) for generating an integer N. If \(Q\) only uses the constants 1…5, the theorem holds for \(N\). Otherwise, make a new program \(Q^\prime\) by replacing each constant greater than 5 in \(Q\) with an optimal subroutine that produces that constant instead and that uses only the constants 1…5. Because of the lemma, this substitution doesn't increase the cost of the program. Hence \(Q^\prime\) costs the same as the optimal program \(Q\) and only uses the constants 1…5, which establishes the theorem for N.

5 If you can only clone and multiply, you use prime factors

6 APPENDIX: Code

6.1 Simple branch-and-bound

The following is Python code for implementing branch-and-bound for the shortest-program problem.

# https://math.stackexchange.com/questions/2355111/making-numbers-cheaply

from copy import deepcopy
from copy import copy

# a state is a tuple of numbers.
# an operator sends state -> state

#### DEFINE THE OPERATIONS THAT CAN BE USED.

# An operator transforms a state into another state.  If an operator
# is inapplicable, e.g. addition when there are fewer than two items
# in the stack, or division by zero, the operator instead returns
# None.

def op_copy(state) :
    # copy the last number
    if len(state) < 1 :
	return None
    else :
	return state + [state[-1]]


## These "Higher ops" are functions that make operators out of numbers or functions.

def higher_op_append(k):
    # return an operator that introduces a new number
    return lambda state : state + [k]

def higher_op_binary(fn):
    # return an operator that applies fn(penultimate, final)
    def op_(state):
	if len(state) < 2 :
	    return None
	else :
	    try :
		return state[:-2] + [fn(state[-2], state[-1])]
	    except :
		pass
    return op_

op_add = higher_op_binary(lambda x,y: x + y)
op_subtract = higher_op_binary(lambda x,y: x - y)
op_multiply = higher_op_binary(lambda x,y: x * y)
op_divide = higher_op_binary(lambda x,y: x / y)
op_modulo = higher_op_binary(lambda x,y: x % y)

############# SEARCH FUNCTION

# To perform branch and bound:
# 1. If these are no paths in the agenda, return No Solution Exists.
# 2. Otherwise, remove the first path from the agenda.
# 3. If the path leads to the goal, it's guaranteed to be the shortest path -- return it!
# 4. Otherwise, if you've already seen a path to this state, the
# current path is guaranteed not to be as efficient*. Ignore it and continue.
# 5. Otherwise, consider all possible operations you could perform on the current state. Each operation generates a new path.
# 6. Add the new paths to the agenda.
# 7. Sort the entire agenda in descending order with respect to cost(path) + heuristic(path).
#    As long as the heuristic is an underestimate of the remaining cost
#    to the goal, this sorting order is guaranteed to yield the
#    shortest-cost path.
# 8. Goto step 1.

def branch_and_bound(goal, ops, costs, heuristic=lambda *x : 0, agenda=None, seen=None) :

    # goal: a goal state such as [3]
    # ops: a map from symbols like '+' to functions.
    # costs: a map from symbols like '+' to their numerical cost.
    # heuristic: a function for estimating how many steps are left
    # agenda: a list of paths
    # seen: a list of states already achieved with optimal cost.

    if agenda is None :
	agenda = [{"path" : [], "state": [], "cost": 0}]

    seen = [] if seen is None else seen
    while agenda :
	current_state = agenda.pop(0)
	x = current_state["state"]

	if x is None :
	    continue
	elif x in seen :
	    continue
	elif x == goal :
	    return current_state
	else :
	    seen = seen + [x]
	    new_paths = []
	    for symbol, fn in ops.iteritems() :
		new_path = deepcopy(current_state)
		new_path["path"] += [symbol]
		new_path["state"] = fn(new_path["state"])
		new_path["cost"] += costs[symbol]

		new_paths += [new_path]

	    agenda += [y for y in new_paths if y["state"] is not None]
	    # print agenda

	    agenda = sorted(agenda, key=lambda path: path["cost"] + heuristic(path["state"]))




def find_cheaply(limit, goal) :
    # find the cheapest way to compute the goal value
    # using constants only up to the limit value.

    ops = {"+" : op_add, "-" : op_subtract,  "*" : op_multiply,  "/" : op_divide, "%" : op_modulo, "@" : op_copy}
    costs = {}

    for symbol, _ in ops.iteritems() :
	costs[symbol] = 1


    for k in range(1, limit+1) :
	ops[k] = higher_op_append(k)
	costs[k] = k


    def heuristic(state) :
	return abs(len(state)-1)

    return branch_and_bound([goal], ops, costs, heuristic)


# EXAMPLE

# Find an optimally-cheap program for generating the number 26, using
# constants only up to 5.
find_cheaply(5, 26)

6.2 Optimal-cost programs, first results

The following lists optimally-cheap programs for generating each integer between 1 and 256. The programs were allowed to use constants as high as 20, but in practice the highest constant used is 7.

The most costly number is 177, which requires a program of cost 15 to generate. All other numbers in this range have costs ranging from 1 to 14.

1 1
2 2
3 3
4 4
5 5
6 3,@,+
7 7
8 4,@,+
9 3,@,*
10 5,@,+
11 3,@,*,2,+
12 3,@,+,@,+
13 3,@,+,@,+,1,+
14 4,@,*,2,-
15 4,@,*,1,-
16 4,@,*
17 4,@,*,1,+
18 3,@,*,@,+
19 3,@,*,@,+,1,+
20 4,@,1,+,*
21 3,@,@,+,1,+,*
22 3,@,*,2,+,@,+
23 5,@,*,2,-
24 5,@,*,1,-
25 5,@,*
26 5,@,*,1,+
27 3,@,@,*,*
28 3,@,@,*,*,1,+
29 3,@,@,*,*,2,+
30 5,@,1,+,*
31 4,@,*,@,+,1,-
32 4,@,*,@,+
33 4,@,*,@,+,1,+
34 3,@,+,@,*,2,-
35 3,@,+,@,*,1,-
36 3,@,+,@,*
37 3,@,+,@,*,1,+
38 3,@,+,@,*,2,+
39 3,@,@,+,@,*,+
40 3,@,*,@,*,2,/
41 3,@,+,@,1,+,*,1,-
42 3,@,+,@,1,+,*
43 3,@,+,@,1,+,*,1,+
44 3,@,+,@,1,+,*,2,+
45 3,@,@,2,+,*,*
46 3,@,1,+,@,*,*,2,-
47 3,@,1,+,@,*,*,1,-
48 3,@,1,+,@,*,*
49 7,@,*
50 5,@,*,@,+
51 5,@,*,@,+,1,+
52 5,@,*,1,+,@,+
53 3,@,@,*,*,@,+,1,-
54 3,@,@,*,*,@,+
55 3,@,@,*,*,@,+,1,+
56 4,@,+,@,1,-,*
57 3,@,@,*,@,+,1,+,*
58 3,@,@,*,*,2,+,@,+
59 4,@,@,*,1,-,*,1,-
60 4,@,@,*,1,-,*
61 4,@,+,@,*,3,-
62 4,@,+,@,*,2,-
63 4,@,+,@,*,1,-
64 4,@,+,@,*
65 4,@,+,@,*,1,+
66 2,@,@,+,@,+,@,*,+
67 3,@,1,+,@,+,@,*,+
68 4,@,@,+,@,*,+
69 4,@,@,+,@,*,+,1,+
70 3,@,+,@,*,1,-,@,+
71 3,@,+,@,*,@,+,1,-
72 3,@,+,@,*,@,+
73 3,@,+,@,*,@,+,1,+
74 3,@,+,@,*,1,+,@,+
75 3,@,2,+,@,*,*
76 3,@,+,@,*,2,+,@,+
77 3,@,*,@,*,4,-
78 3,@,*,@,*,3,-
79 3,@,*,@,*,2,-
80 3,@,*,@,*,1,-
81 3,@,*,@,*
82 3,@,*,@,*,1,+
83 3,@,*,@,*,2,+
84 3,@,@,*,@,*,+
85 3,@,@,*,@,*,+,1,+
86 3,@,@,*,@,*,+,2,+
87 3,@,@,@,*,@,*,+,+
88 3,@,*,@,1,+,*,2,-
89 3,@,*,@,1,+,*,1,-
90 3,@,*,@,1,+,*
91 3,@,*,@,1,+,*,1,+
92 3,@,*,@,1,+,*,2,+
93 3,@,@,*,@,1,+,*,+
94 3,@,1,+,@,*,*,1,-,@,+
95 3,@,1,+,@,*,*,@,+,1,-
96 3,@,1,+,@,*,*,@,+
97 5,@,+,@,*,3,-
98 7,@,*,@,+
99 3,@,*,@,2,+,*
100 5,@,+,@,*
101 5,@,+,@,*,1,+
102 5,@,+,@,*,2,+
103 3,@,@,*,1,+,@,*,+
104 4,@,1,+,@,+,@,*,+
105 5,@,@,+,@,*,+
106 3,@,@,+,@,*,*,2,-
107 3,@,@,+,@,*,*,1,-
108 3,@,@,+,@,*,*
109 3,@,@,+,@,*,*,1,+
110 5,@,+,@,1,+,*
111 3,@,@,+,@,*,1,+,*
112 4,@,+,@,1,-,*,@,+
113 3,@,@,*,1,+,@,1,+,*,+
114 3,@,@,+,@,*,2,+,*
115 5,@,@,*,2,-,*
116 4,@,@,1,+,@,*,+,*
117 3,@,@,@,+,@,*,+,*
118 3,@,@,@,+,@,*,+,*,1,+
119 3,@,*,2,+,@,*,2,-
120 5,@,@,*,1,-,*
121 3,@,*,2,+,@,*
122 3,@,*,2,+,@,*,1,+
123 5,@,@,*,*,2,-
124 5,@,@,*,*,1,-
125 5,@,@,*,*
126 5,@,@,*,*,1,+
127 5,@,@,*,*,2,+
128 4,@,+,@,*,@,+
129 4,@,+,@,*,@,+,1,+
130 5,@,@,*,1,+,*
131 3,@,+,@,+,@,1,-,*,1,-
132 3,@,+,@,+,@,1,-,*
133 3,@,+,@,+,@,1,-,*,1,+
134 3,@,1,+,@,+,@,*,+,@,+
135 5,@,@,*,2,+,*
136 4,@,@,+,@,*,+,@,+
137 3,@,*,@,1,-,@,*,@,+,+
138 3,@,@,*,-,@,@,+,@,*,%
139 1,3,@,+,@,@,+,@,*,-,-
140 3,@,+,@,*,1,-,@,+,@,+
141 3,@,+,@,+,@,*,3,-
142 3,@,+,@,+,@,*,2,-
143 3,@,+,@,+,@,*,1,-
144 3,@,+,@,+,@,*
145 3,@,+,@,+,@,*,1,+
146 3,@,+,@,+,@,*,2,+
147 3,@,@,+,1,+,@,*,*
148 4,@,3,*,@,*,+
149 3,@,+,@,1,-,@,*,*,1,-
150 3,@,+,@,1,-,@,*,*
151 3,@,+,@,1,-,@,*,*,1,+
152 3,@,*,@,@,+,1,-,*,1,-
153 3,@,*,@,@,+,1,-,*
154 3,@,*,@,@,+,1,-,*,1,+
155 3,@,+,@,+,@,1,+,*,1,-
156 3,@,+,@,+,@,1,+,*
157 3,@,+,@,+,@,1,+,*,1,+
158 3,@,*,@,*,2,-,@,+
159 3,@,*,@,*,@,+,3,-
160 3,@,*,@,*,1,-,@,+
161 3,@,*,@,*,@,+,1,-
162 3,@,*,@,*,@,+
163 3,@,*,@,*,@,+,1,+
164 3,@,*,@,*,1,+,@,+
165 3,@,@,*,@,*,@,+,+
166 3,@,*,@,*,2,+,@,+
167 3,@,@,*,@,*,+,@,+,1,-
168 3,@,@,*,@,*,+,@,+
169 3,@,+,@,+,1,+,@,*
170 3,@,+,@,+,1,+,@,*,1,+
171 3,@,*,@,@,+,1,+,*
172 3,@,*,@,@,+,1,+,*,1,+
173 3,@,*,@,@,*,1,+,@,+,+
174 3,@,@,@,*,@,*,+,+,@,+
175 5,@,@,2,+,*,*
176 3,@,*,@,1,+,*,2,-,@,+
177 3,@,*,@,1,+,*,@,+,3,-
178 3,@,*,@,1,+,*,1,-,@,+
179 3,@,*,@,1,+,*,@,+,1,-
180 3,@,*,@,1,+,*,@,+
181 3,@,*,@,1,+,*,@,+,1,+
182 2,@,@,+,@,*,-,@,1,+,*
183 3,@,@,*,@,1,+,*,@,+,+
184 3,@,*,@,1,+,*,2,+,@,+
185 5,@,1,+,@,*,1,+,*
186 3,@,@,*,@,1,+,*,+,@,+
187 5,@,@,*,*,@,2,/,+
188 3,@,@,*,@,2,-,*,*,1,-
189 3,@,@,*,@,2,-,*,*
190 5,@,+,@,@,+,1,-,*
191 3,@,1,+,@,+,@,*,*,1,-
192 3,@,1,+,@,+,@,*,*
193 3,@,1,+,@,+,@,*,*,1,+
194 2,@,@,+,@,*,-,@,*,2,-
195 2,@,@,+,@,*,-,@,*,1,-
196 2,@,@,+,@,*,-,@,*
197 2,@,@,+,@,*,-,@,*,1,+
198 3,@,*,@,2,+,*,@,+
199 5,@,+,@,*,@,+,1,-
200 5,@,+,@,*,@,+
201 5,@,+,@,*,@,+,1,+
202 5,@,+,@,*,1,+,@,+
203 7,@,@,+,@,*,+
204 3,@,+,@,@,*,2,-,*
205 5,@,@,+,@,*,@,+,+
206 3,@,@,*,1,+,@,*,+,@,+
207 4,@,*,@,3,-,*,1,-
208 4,@,*,@,3,-,*
209 3,@,+,@,@,*,1,-,*,1,-
210 3,@,+,@,@,*,1,-,*
211 3,@,+,@,@,*,1,-,*,1,+
212 4,@,*,@,2,-,@,*,+
213 3,@,+,@,@,*,*,3,-
214 3,@,+,@,@,*,*,2,-
215 3,@,+,@,@,*,*,1,-
216 3,@,+,@,@,*,*
217 3,@,+,@,@,*,*,1,+
218 3,@,+,@,@,*,*,2,+
219 3,@,@,+,@,@,*,*,+
220 5,@,+,@,1,+,*,@,+
221 3,@,+,@,@,*,1,+,*,1,-
222 3,@,+,@,@,*,1,+,*
223 4,@,*,1,-,@,*,2,-
224 4,@,*,@,2,-,*
225 4,@,*,1,-,@,*
226 4,@,*,1,-,@,*,1,+
227 2,@,@,+,@,*,1,-,@,*,+
228 3,@,+,@,@,*,2,+,*
229 4,@,@,*,1,-,@,*,+
230 5,@,3,*,@,*,+
231 3,@,@,*,@,*,4,-,*
232 4,@,@,1,+,@,*,+,*,@,+
233 4,@,+,@,@,+,1,-,@,*,+
234 3,@,@,@,+,@,*,+,*,@,+
235 1,3,@,@,@,*,@,*,-,*,-
236 3,@,@,*,@,*,2,-,*,1,-
237 3,@,@,*,@,*,2,-,*
238 4,@,*,@,1,-,*,2,-
239 4,@,*,@,1,-,*,1,-
240 4,@,*,@,1,-,*
241 3,@,@,*,@,*,*,2,-
242 3,@,@,*,@,*,*,1,-
243 3,@,@,*,@,*,*
244 3,@,@,*,@,*,*,1,+
245 3,@,@,*,@,*,*,2,+
246 3,@,@,*,@,*,1,+,*
247 3,@,@,*,@,*,1,+,*,1,+
248 5,@,@,*,*,1,-,@,+
249 3,@,@,*,@,*,2,+,*
250 5,@,@,*,*,@,+
251 5,@,@,*,*,@,+,1,+
252 3,@,+,@,@,1,+,*,*
253 4,@,*,@,*,3,-
254 4,@,*,@,*,2,-
255 4,@,*,@,*,1,-
256 4,@,*,@,*

Date: 2017/July/13

Author: Dylan Holmes

Created: 2017-07-13 Thu 21:48

Emacs 25.2.1 (Org mode 8.2.10)

Validate