# Biko's House of Horrors

A writeup on some of my more interesting solutions to 2021's Advent of Code. The code (in Rust) is available here.

## Day 6: Lanternfish

*Fish by Spike Stitch, CC BY-NC-ND 2.0.*

A massive school of glowing lanternfish swims past. They must spawn quickly to reach such large numbers - maybe exponentially quickly? You should model their growth rate to be sure.

An observation we can make is that each lanternfish's reproduction does not depend on any other lanternfish. So, we can simplify our task by asking: "Given a lanternfish with a counter of $c$, how many fish will there be after $d$ days?" Then, we just sum over all the lanternfish we have to get the final result.

Given the rules, we can describe a lanternfish's reproduction with the following recursive function:

$$F(c,d) = \begin{cases} 1 & d \le c \\ F(6, d - c - 1) + F(8, d - c - 1) & \text{else} \\ \end{cases}$$

In essence, this is saying:

1. If the number of remaining days does not exceed the fish's counter, it won't have a chance to spawn, so we'll have just the one fish.
2. Otherwise, it will spawn a new fish after $c$ days, resulting in two fish: one with a counter of 6 and another with 8. Therefore, we need to count their resulting populations, but after $d - c - 1$ days (i.e. the remaining days after our fish spawns).

Note that in order to calculate $F$ for a given $d$ we require only values of $F$ for smaller values of $d$. This sounds like a job for dynamic programming!

We could at this point build a table over all $(c, d)$ pairs, with $c$ ranging up to 8 (the maximum counter value) and $d$ being 80 or 256, but there's actually a more memory-efficient solution.

Observe that we only actually care about days when a fish spawns, i.e. when $c=0$. In the function above, $F(c,d) = F(0,d-c)$ for any $d > c$:

\begin{align*} F(0,d-c) & = F(6,(d-c)-0-1)+F(8,(d-c)-0-1) \\ & = F(6,d-c-1)+F(8,d-c-1) \\ & = F(c,d) \end{align*}

This means we can re-write it as:

$$F(c,d) = \begin{cases} 1 & d \le c \\ F(6, d - 1) + F(8, d - 1) & c = 0 \\ F(0, d - c) & \text{else} \\ \end{cases}$$

Now, we can construct a $1 \times d$ table $T$ such that each entry $1 \le i \le d$ is $T(i) = F(0,i)$. We fill this table from $i=1$ to $i=d$ by evaluating $F(0,i)$ and using the already-computed values when possible.

All in all, this solution is $O(d)$ space and $O(n + d)$ time, where $d$ is the maximal number of days and $n$ is the initial number of fish.
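As a sketch, the table-filling described above might look like this in Rust (the function name and signature are mine, not necessarily what's in the repo):

```rust
/// Total population after `days` days, given the initial counters.
/// A sketch of the 1-D table approach; names are illustrative.
fn population(counters: &[u64], days: usize) -> u64 {
    // t[i] = F(0, i): descendants (itself included) of a single fish
    // whose counter is 0, after i days. t[0] = 1 by the base case.
    let mut t = vec![1u64; days + 1];
    for i in 1..=days {
        // F(0, i) = F(6, i-1) + F(8, i-1) = F(0, i-7) + F(0, i-9);
        // indices that would go negative mean "no spawn yet", i.e. 1.
        let first = if i >= 7 { t[i - 7] } else { 1 };
        let second = if i >= 9 { t[i - 9] } else { 1 };
        t[i] = first + second;
    }
    // F(c, d) = F(0, d - c) for d > c, and 1 otherwise.
    counters
        .iter()
        .map(|&c| if days as u64 > c { t[days - c as usize] } else { 1 })
        .sum()
}
```

On the puzzle's sample school `[3,4,3,1,2]` this yields 26 fish after 18 days and 5934 after 80.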

## Day 17: Trick Shot

*The waning gibbous Moon above the Earth's horizon by NASA Johnson, CC BY-NC-ND 2.0.*

Ahead of you is what appears to be a large ocean trench. Could the keys have fallen into it? You'd better send a probe to investigate.

### Part 1: What goes up must come down

According to the rules, the probe has a constant acceleration in the $y$ direction. This means the probe is in free-fall! The nice thing about free-fall is that if the probe has an initial velocity $v_y$ upwards, when it comes back down to "ground-level" it will have velocity $-v_y$, i.e. the same speed, but downwards.[^1]

How does this help us? Well, let's assume that our target area has a minimal $y$ coordinate of $y_\text{min}$, i.e. this is the bottom-most coordinate of the target area. And let us also assume that $y_\text{min} < 0$. This means that if we were to shoot downwards, the maximum speed we can use is precisely $|y_\text{min}|$, which will land us in the target area after 1 time unit. A larger speed will overshoot the target area.

Now, because of the symmetry of free-fall, we know that the maximum speed we can shoot at upwards is also $|y_\text{min}|$! If we shoot any faster, when the probe comes back down it will have a larger speed, and overshoot the target area.

Now is probably a good time to note that due to how the simulation works in this problem, if we shoot a probe upwards at $v_y$, it will come back down at $-(v_y + 1)$. Therefore, the maximal velocity to hit the target area should be $|y_\text{min}| - 1$ upwards.

From here, finding the highest position is just a matter of calculating the sum of an arithmetic progression.
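Spelled out: launched at $v_y = |y_\text{min}| - 1$, the probe gains $v_y$, then $v_y - 1$, and so on down to 1, so the apex is

$$y_\text{max} = v_y + (v_y - 1) + \dots + 1 = \frac{v_y(v_y + 1)}{2} = \frac{(|y_\text{min}| - 1)\,|y_\text{min}|}{2}$$

(For instance, a target bottom at $y_\text{min} = -10$ gives $\frac{9 \cdot 10}{2} = 45$.)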

And we didn't even write any code!

### Part 2: It's not that simple

Unfortunately, it's not that simple. For starters, the "algorithm" above completely ignores the horizontal velocity. If our chosen vertical velocity $|y_\text{min}| - 1$ upwards places the probe in the target area at some time $t$ after launch, we also need to choose a horizontal velocity that will place the probe in the target area at the same time $t$. The algorithm assumes that such a velocity exists, even though there is no guarantee that it does (it did exist for my input).

Anyway, that solution is irrelevant for the second part of the problem. Now we need to find all possible velocities that land the probe in the target area.

One way to do it is to enumerate all possible pairs of $x$ and $y$ velocities and simulate the flight of the probe. That's fine, but can we do better? Given initial velocities $v_x$ and $v_y$, can we determine whether the probe will hit the target area at some point in the future, without simulating the whole flight?

To answer that, we need to define what it means to "hit the target area":

- Given a velocity vector $(v_x, v_y)$ and target area $(x_0, y_0, x_1, y_1)$, the probe hits the target area if there exists an integer time $t$ such that the probe's $x$ position at time $t$, $x(t)$, satisfies $x_0 \le x(t) \le x_1$. Similarly, $y(t)$ should satisfy $y_0 \le y(t) \le y_1$.

If we can express $x(t)$ and $y(t)$ in terms of $v_x$ and $v_y$, we will be able to solve for $t$ and obtain the times at which the probe reaches the boundaries of the target area. If there's an integer time between the probe entering the target area and the probe leaving it, then it's a hit.

Let's start with $y(t)$. According to the rules of the game, the current position is the sum of all $y$ velocities up to that point. Since the velocity changes as an arithmetic progression, we can use the formula for its sum:

$$y(t) = (2v_y - (t-1)) \frac{t}{2}$$

And solving for $t$:

$$\begin{equation} t_{1,2} = -\frac{-2v_y - 1 \pm \sqrt{4v_y^2 + 4v_y + 1 - 8y}}{2} \label{eq:y-time} \end{equation}$$

Assuming that the target area resides at negative $y$ coordinates, there is always at least one solution to \eqref{eq:y-time}: the probe's $y$ velocity always decreases, so it will eventually head downward, and thus cross the target area at some point. Moreover, because the probe's $y$ velocity always decreases, it will cross any negative $y$ coordinate only once — to cross it another time would require the velocity to change sign, i.e. to increase. Therefore, if there are two solutions to \eqref{eq:y-time}, one of them will be negative, and we should pick the positive.

Time to deal with $x(t)$. According to the rules, the speed in the $x$ direction decreases until it reaches 0. From this we can conclude that shooting in the direction opposite from the target area is useless, since that way the probe will never reach it.

Let's assume now that the target area resides in positive $x$ coordinates. If we shoot with a positive $x$ velocity, then we will have a constant acceleration of $-1$ until the probe stops. This means that until the probe stops, $x(t)$ behaves exactly as $y(t)$. And again we can solve for $t$:

$$\begin{equation} t_{1,2} = -\frac{-2v_x - 1 \pm \sqrt{4v_x^2 + 4v_x + 1 - 8x}}{2} \label{eq:x-time} \end{equation}$$

In essence, we are treating the motion along the $x$ axis as ballistic motion, and the positive $x$ direction as "up". But, care must be taken when handling the solutions to this equation:

1. If there are no solutions, then the probe never reaches the given $x$ coordinate with the given starting velocity $v_x$.
2. If there is one solution, then the given $x$ coordinate is the "ballistic" trajectory's apogee. In a normal ballistic motion the probe would now start "falling" towards lower $x$ coordinates, but in our case it simply stops.
3. If there are two solutions, then the "apogee" is at a coordinate larger than the given $x$, so the probe passes our target on the way "up", and then once again on the way "down". In this case, we should pick the smaller of the two $t$ values.

Now, armed with \eqref{eq:y-time} and \eqref{eq:x-time} we can determine when the probe flies through the target area. For vertical motion this is simple: since the probe will always reach both boundaries of the target area, we can take both crossing times and round them to integers (see the note on rounding direction below). This gives the range of integer times when the probe is within the target area.

For horizontal motion, things are trickier:

1. If the probe misses the left boundary of the target area, then it's a miss.
2. If the probe misses the right boundary, then the probe gets "stuck" in the target area. We can treat this as if the exit time is $\infty$.

There's one last bit of bookkeeping to tackle: what is the range of velocities that we should check? In the $y$ direction, the maximal speed is $|y_\text{min}|$, so we can bruteforce in the range $-|y_\text{min}| \dots |y_\text{min}|$ (again, assuming the target area is in the negative $y$ coordinates).

In the $x$ direction, by the same reasoning, the maximal speed is $|x_\text{max}|$ (assuming the target area is in the positive $x$ coordinates). And since shooting in the opposite direction is useless, we can check only the range $0 \dots |x_\text{max}|$.

Finally, a note on rounding times:

• The "entry time" (i.e. the time when the probe enters the target area) should be rounded up.
• The "exit time" should be rounded down.

If, after rounding, the exit time is smaller than the entry time, then the probe misses the target area according to the rules of the game.
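Putting \eqref{eq:y-time}, \eqref{eq:x-time} and the rounding rules together, the hit test might look like the following Rust sketch (it assumes a target at positive $x$ and negative $y$ coordinates, and the function name and signature are mine):

```rust
/// Does a probe launched at (vx, vy) ever land inside the box
/// x0..=x1, y0..=y1? Assumes x0 > 0 and y1 < 0.
fn hits(vx: i64, vy: i64, x0: i64, x1: i64, y0: i64, y1: i64) -> bool {
    // Time at which the probe, on its way down, reaches height `y`
    // (the larger root of y(t) = y); it always exists for y <= 0.
    let t_down = |y: i64| -> f64 {
        let b = 2.0 * vy as f64 + 1.0;
        (b + (b * b - 8.0 * y as f64).sqrt()) / 2.0
    };
    let t_enter_y = t_down(y1).ceil(); // crossing the top edge
    let t_exit_y = t_down(y0).floor(); // crossing the bottom edge

    // Horizontal motion behaves like the vertical one until the probe
    // stalls; we always want the smaller root (the way "up").
    let b = 2.0 * vx as f64 + 1.0;
    let d_left = b * b - 8.0 * x0 as f64;
    if d_left < 0.0 {
        return false; // stalls before reaching the left edge
    }
    let t_enter_x = ((b - d_left.sqrt()) / 2.0).ceil();
    let d_right = b * b - 8.0 * x1 as f64;
    let t_exit_x = if d_right < 0.0 {
        f64::INFINITY // stalls inside: treat the exit time as infinite
    } else {
        ((b - d_right.sqrt()) / 2.0).floor()
    };

    // A hit iff some integer time lies in both windows.
    t_enter_y.max(t_enter_x) <= t_exit_y.min(t_exit_x)
}
```

Counting hits is then the double loop over the velocity ranges described above, calling this instead of a step-by-step simulation.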

## Day 21: Dirac Dice

*GenCon 2010 - SWAG and Loot 02 - Dice by heath_bar, CC BY-NC-ND 2.0.*

There's not much to do as you slowly descend to the bottom of the ocean. The submarine computer challenges you to a nice game of Dirac Dice.

### Part 1: d100

#### Where do you stand?

The die we are using is deterministic, so given a starting position $i$ we should be able to compute the final position of a player after $n$ turns.

Let's consider player 1. On the first turn, they roll 1, 2, 3. On the next turn: 7, 8, 9. And so on. Formally, the first roll of turn $t$ will be:

$$(0 + 6t \mod 100) + 1$$

(We're doing these shenanigans since $\mod 100$ gives a number in the range $[0,99]$.)

The second and third rolls will be, respectively:

$$(1 + 6t \mod 100) + 1$$

$$(2 + 6t \mod 100) + 1$$

The final position after $n$ turns is, therefore:

\begin{align*} p_1(i,n) = i + \sum_{t=0}^{n-1} & \left[ (0 + 6t \mod 100) + 1 \right. \\ & + (1 + 6t \mod 100) + 1 \\ & + \left. (2 + 6t \mod 100) + 1 \right] \mod 10 \\ \end{align*}

(Note that $i$ is 0-based here.)

This is quite ugly, but fortunately the modulo operation is distributive, and $(a \mod 100 \mod 10) = (a \mod 10)$ for any integer $a$[^2]. This means we can wave our hands and write:

$$p_1(i,n) = i + \sum_{t=0}^{n-1} [(1+6t) + (2+6t) + (3+6t)] \mod 10$$

Simplifying further:

$$p_1(i,n) = i + \sum_{t=0}^{n-1} [18t+6] \mod 10$$

We're actually summing an arithmetic progression, so we have a closed-form solution:

$$\begin{equation} p_1(i,n) = i + 9n^2 - 3n \mod 10 \label{eq:position-1} \end{equation}$$

Similarly, for the second player, we have:

\begin{align*} p_2(i,n) & = i + \sum_{t=0}^{n-1} [(4+6t) + (5+6t) + (6+6t)] \mod 10 \\ & = i + \sum_{t=0}^{n-1} [18t+15] \mod 10 \end{align*}

Which yields:

$$\begin{equation} p_2(i,n) = i + 9n^2 + 6n \mod 10 \label{eq:position-2} \end{equation}$$

#### Score

With expressions \eqref{eq:position-1} and \eqref{eq:position-2} for the position of a player $k$, we can calculate the score after $n$ turns given starting position $i$:

$$\begin{equation} s_k(i,n) = \sum_{t=1}^{n} (p_k(i,t) + 1) \label{eq:score} \end{equation}$$

This is fine, but a closed-form expression would be better. Unfortunately, since there is a modulo operation inside $p_k$, this isn't as simple as the sum of an arithmetic progression.

However, modular arithmetic is compatible with polynomial evaluation, which means $p_k(i, a) = p_k(i, b)$ for any integers $a$ and $b$ that satisfy $a = b \mod 10$. How does this help us?

Observe that for any integers $j \ge 0$ and $1 \le m \le 10$:

$$m = 10j + m \mod 10$$

($10j = 0 \mod 10$, so we are left with $m = m$.)

Therefore:

$$p_k(i, m) = p_k(i, 10j + m)$$

Which means we can write:

$$\begin{equation} \sum_{m=1}^{10} (p_k(i,m) + 1) = \sum_{m=1}^{10} (p_k(i,10j + m) + 1) \label{eq:modulo-sum} \end{equation}$$

Now, let's expand expression \eqref{eq:score}:

\begin{align*} s_k(i,n) & = \sum_{t=1}^{n} (p_k(i,t) + 1) \\ & = (p_k(i,1) + 1) + \dots + (p_k(i,n) + 1) \\ & = (p_k(i,1) + 1) + \dots + (p_k(i,10) + 1) \\ & \qquad + (p_k(i,11) + 1) + \dots + (p_k(i,20) + 1) \\ & \qquad + (p_k(i,21) + 1) + \dots + (p_k(i,30) + 1) \\ & \qquad + \dotsb \\ \end{align*}

That is, we can group the sum into "batches" of 10 elements. Then, due to \eqref{eq:modulo-sum}, we get:

\begin{align*} s_k(i,n) & = \sum_{t=1}^{n} (p_k(i,t) + 1) \\ & = \left\lfloor \frac{n}{10} \right\rfloor \sum_{m=1}^{10} (p_k(i,m) + 1) + \sum_{m=1}^{n \mod 10} (p_k(i,m) + 1) \end{align*}

How does this help us? Well, in this form the sum can be computed in $O(1)$ time, since we're summing up at most 19 elements!

#### Finding the winner

With all this prologue we can finally answer the actual question: who is the winner of the game?

Note that the score of each player increases after every turn. This means that we can use binary search! Specifically, we'll search for the first turn where the score is at least 1000.[^3]

Since calculating the score for a given number of turns is $O(1)$, the whole search is $O(\log n)$, where $n$ is the upper bound on the number of turns until a win. To be on the safe side, we can choose $n = 1000$, for the case where each turn lands on the first position on the board. This is technically impossible, but that's the absolute worst we can do.

Now, we can calculate the number of turns it takes for both players to win, and pick the one who wins first.
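The closed-form positions, the batched score, and the binary search can be sketched as follows (positions are 0-based, and the helper names are mine):

```rust
/// Position of player `which` (1 or 2) after n turns, starting at i
/// (0-based). The -3n of equation (position-1) is written as +7n to
/// stay in unsigned arithmetic (they agree mod 10).
fn position(which: u8, i: u64, n: u64) -> u64 {
    match which {
        1 => (i + 9 * n * n + 7 * n) % 10,
        _ => (i + 9 * n * n + 6 * n) % 10,
    }
}

/// Score after n turns, using the "batches of 10" trick: per-turn
/// gains repeat with period 10, so sum one period and scale.
fn score(which: u8, i: u64, n: u64) -> u64 {
    let batch: u64 = (1..=10).map(|m| position(which, i, m) + 1).sum();
    let tail: u64 = (1..=n % 10).map(|m| position(which, i, m) + 1).sum();
    (n / 10) * batch + tail
}

/// First turn on which the player's score reaches 1000: a binary
/// search over the monotone score function, bounded by 1000 turns.
fn winning_turn(which: u8, i: u64) -> u64 {
    let (mut lo, mut hi) = (1u64, 1000u64);
    while lo < hi {
        let mid = (lo + hi) / 2;
        if score(which, i, mid) >= 1000 { hi = mid } else { lo = mid + 1 }
    }
    lo
}
```

For the puzzle's example starts (spaces 4 and 8, i.e. $i = 3$ and $i = 7$), player 1 reaches exactly 1000 on turn 166, at which point player 2 has 745 points after 165 turns.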

### Part 2: The Multiverse

Oh boy, 444356092776315 universes? No way we can count that high.

#### What does it mean to win?

A single game of Dirac Dice consists of a sequence of moves between positions on the board. Let's focus, for now, on the moves of a single player only. What can we say about a sequence of moves that leads to a win?

Each position we land on contributes its value to the score, until a score of 21 is reached:

$$\sum_{i=1}^n a_i \ge 21$$

But that's not enough. The game ends as soon as a score of 21 is reached. So we need to add another constraint — that all moves but the last one don't add up to 21:

$$\sum_{i=1}^{n-1} a_i < 21$$

Rearranging a little:

$$\begin{equation} 21 - a_n \le \sum_{i=1}^{n-1} a_i < 21 \label{eq:win} \end{equation}$$

That is, a sequence of moves $a_1, \dots, a_n$ is a winning sequence iff it satisfies the above inequality.

#### Counting wins: the table

Okay, so we know what a win looks like, but how many are there? The inequality above hints at a different question we can ask: how many ways are there to win in $n$ moves, with the last move yielding a score of $a_n$, for some arbitrary $n$ and $a_n$?

Indeed, $a_n$ is just the last position we land on in our traversal of the board. So what we are asking is: how many ways are there to win in $n$ moves, with the last move landing on position $a_n$, for some arbitrary $n$ and $a_n$?

Then, to get the total number of winning sequences, we add up the counts for every value of $n$ and $a_n$. Indeed, these two values are bounded, so we won't be looping forever:

1. $1 \le a_n \le 10$, by the definition of the game board.
2. $1 \le n \le 21$, since the score always increases by at least 1 after every move[^4].

Actually, the question we are asking is recursive. If:

$$\begin{equation*} 21 - a_n \le \sum_{i=1}^{n-1} a_i < 21 \tag{\ref{eq:win} revisited} \end{equation*}$$

Then we can subtract the next element from the end, to get:

$$\begin{equation} 21 - a_n - a_{n - 1} \le \sum_{i=1}^{n-2} a_i < 21 - a_{n - 1} \label{eq:subtract-two} \end{equation}$$

Indeed, the number of solutions to \eqref{eq:win} is the number of solutions to \eqref{eq:subtract-two}, times the number of ways we can get from $a_{n - 1}$ to $a_n$ (for instance, we can get from 1 to 5 by rolling 1, 1, 2, or 2, 1, 1, etc.)

Hey, it's dynamic programming time again!

Let's generalize: given a starting position $s$, a final position $f$, a number of turns $t$, and a half-open range $[m,M)$, how many ways are there to satisfy the inequality:

$$m \le \sum_{i=1}^t a_i < M$$

Such that $a_t = f$ and the sequence $s, a_1 \dots a_t$ is a valid sequence of positions?

A valid sequence of positions is such that every position is reachable from the previous one by some throw of the dice. For instance, 1, 4 is a valid sequence, since we can get from 1 to 4 by rolling 1, 1, 1. On the other hand, 1, 2 is not a valid sequence, since the minimal number of positions we can move at a time is 3 (by rolling 1, 1, 1).

To answer our question we can now define a recursive function:

$$\begin{equation} T(m,M,s,f,t) = \begin{cases} 0 & M \le m \\ T(1,M,s,f,t) & m < 1 \\ 0 & t \le 0 \\ c(f - s \mod10) & t=1 \land m \le f + 1 < M \\ 0 & t=1 \\ \sum_{d=3}^9 T'(m,M,s,f,t,d) & \text{else} \\ \end{cases} \label{eq:wins} \end{equation}$$

With $T'$ being:

$$T'(m,M,s,f,t,d) = c(d) \cdot T(m - p(s,d) - 1, M - p(s,d) - 1, p(s,d), f, t - 1)$$

$p(i, d)$ being the position we end up in after moving $d$ spaces from position $i$:

$$p(i,d) = i + d \mod 10$$

(Note that $i$ is 0-based here.)

And $c(d)$ being the number of ways we can get a sum of $d$ after throwing three d3 dice[^5]:

$$\begin{equation} c(d) = {d - 1 \choose 2 } - 3 \cdot {d - 3 - 1 \choose 2} + 3 \cdot {d - 2 \cdot 3 - 1 \choose 2} \label{eq:dice-combos} \end{equation}$$
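The formula is easy to sanity-check against brute force; a quick sketch, with $\binom{n}{2} = n(n-1)/2$ taken as 0 for $n < 2$:

```rust
/// Binomial coefficient "n choose 2", zero for n < 2.
fn choose2(n: i64) -> i64 {
    if n < 2 { 0 } else { n * (n - 1) / 2 }
}

/// Equation (dice-combos): inclusion-exclusion over stars and bars.
fn c(d: i64) -> i64 {
    choose2(d - 1) - 3 * choose2(d - 4) + 3 * choose2(d - 7)
}

/// The same count, by enumerating all 27 rolls of three d3 dice.
fn c_brute(d: i64) -> i64 {
    let mut count = 0;
    for a in 1..=3 {
        for b in 1..=3 {
            for e in 1..=3 {
                if a + b + e == d {
                    count += 1;
                }
            }
        }
    }
    count
}
```

Both agree on the distribution 1, 3, 6, 7, 6, 3, 1 for $d = 3 \dots 9$.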

Let's go over \eqref{eq:wins} step-by-step. First, the edge cases:

1. If $M \le m$, the range $[m, M)$ is empty, so there's no way for us to reach a score in it. Thus, $T$ is 0.
2. Since the minimal score is 1, if $m < 1$ we can just "push it back" to 1. For instance, the number of ways to get a score in the range $[-100, 5)$ is the same as the number of ways to get a score in $[1, 5)$, since everything below 1 is unattainable.
3. No matter what range of scores we ask for, it is impossible to reach with $t \le 0$, so again we return 0.

What happens when $t = 1$? Then we are asking for the number of ways to reach position $f$ from position $s$, and get a score in the range $[m, M)$. However, since we're making a single move only, the score is precisely $f + 1$ (remember that we are using 0-based indexing). Then:

- If $f + 1$ is outside the range, we return 0 (no way to attain such a score in 1 move).
- If $m \le f + 1 < M$, the number of ways to attain the score is just the number of ways to get from $s$ to $f$ in a single move. That is: $c(f - s \mod 10)$.

Finally, let's tackle the recursive case. To get from some position $s$ to some position $f$ in $t$ moves and attain a score in $[m, M)$, we consider all valid moves from $s$. Given such a move $d$, we ask: how many ways are there to get from $p(s,d)$ (the position $d$ spaces after $s$) to $f$ in $t - 1$ moves while attaining a score in $[m - p(s,d) - 1, M - p(s,d) - 1)$? Note how we are subtracting the score of this next position from the range, like we did in \eqref{eq:win} and \eqref{eq:subtract-two}.

We then multiply this recursive solution by $c(d)$ — the number of ways to obtain such a move $d$ — and sum over all the possible values of $d$. The smallest possible value is 3, obtained by rolling 1, 1, 1. The largest is 9, by rolling 3, 3, 3.

Given this definition of the function, we can now build a table for the dynamic program. But to do so we need to set some boundaries:

1. For $m$ and $M$, we are only interested in scores up to 21. While a winning score is at least 21, per \eqref{eq:win} we'll only be looking up values up to 21.
2. For $s$ and $f$, the range is all possible positions: 0 to 9.
3. For $t$, we need to know the maximum number of moves we can take to reach 21. This value is 14. To take the longest path to victory, we need to make sure that every move lands on the minimal possible position on the board. Ideally, we'd want to land on 0 every time, but we can't since we can move at most 9 spaces at a time. So, the best we can do is alternate between positions 0 and 1, by rolling 1 and 9. After 14 such moves we get to exactly 21, so 14 is our upper bound.

All in all, we get a table of $21 \times 21 \times 10 \times 10 \times 14 = 617400$ elements. Not bad.

#### Counting wins: the solution

Given this table, we can now answer our original question: how many ways are there to win in $t$ moves, with the last move landing on position $f$, for some arbitrary $t$ and $f$?

Given that final position $f$, we may have arrived at it by rolling any value $d$ between 3 and 9. And, as indicated by \eqref{eq:win}, the score prior to that last position must have been in the range $[21 - f - 1, 21)$. The number of ways to achieve that score is:

$$T(21-f-1,21,s,(f-d \mod10),t-1)$$

Which we should multiply by $c(d)$ \eqref{eq:dice-combos}, since there are that many ways to arrive at $f$ with a step of $d$.

Now, given a number of moves $t$, we can calculate the number of ways to get a winning score from a starting position $s$, by summing over all possible final positions $f$:

$$\begin{equation} w(s,t) = \sum_{f=0}^9 \sum_{d=3}^9 c(d) \cdot T(21-f-1,21,s,(f-d \mod10),t-1) \label{eq:wins-in-t-moves} \end{equation}$$

#### Counting universes

That's only 90% of the work done, however. We still haven't accounted for the moves of the other player.

For a player to actually win in $t$ moves, the other player must not reach a score of 21 before them. How many ways are there for the other player to lose in such a way? That's easy:

$$\sum_{f=0}^9 T(1,21,s',f,t')$$

For any of the positions $f$ the other player (starting at $s'$) can end up in after $t'$ turns, we are looking for a score in the range $[1, 21)$. The $t'$ here is the number of turns the other player makes before the first player wins. If we are looking for wins by player 1, $t' = t - 1$, since player 1 goes before player 2. For wins by player 2, $t' = t$.

Now, combining this with \eqref{eq:wins-in-t-moves}, we can calculate in how many universes a player wins in $t$ turns:

\begin{align*} w_1(s_1,s_2,t) & = w(s_1,t) \cdot \sum_{f=0}^9 T(1,21,s_2,f,t-1) \\ w_2(s_1,s_2,t) & = w(s_2,t) \cdot \sum_{f=0}^9 T(1,21,s_1,f,t) \\ \end{align*}

And thus the total number of universes in which a player wins is:

\begin{align*} W_1(s_1,s_2) & = \sum_{t=1}^{14} w_1(s_1,s_2,t) \\ W_2(s_1,s_2) & = \sum_{t=1}^{14} w_2(s_1,s_2,t) \\ \end{align*}

And that's it!

## Day 24: Arithmetic Logic Unit

*CPU by sourcecoda, CC BY-NC-ND 2.0.*

Magic smoke starts leaking from the submarine's arithmetic logic unit (ALU). Without the ability to perform basic arithmetic and logic functions, the submarine can't produce cool patterns with its Christmas lights!

It also can't navigate. Or run the oxygen system.

We are tasked with finding the largest (or smallest, in the second part) 14-digit number accepted by a program for the ALU. Since the number can't contain zeroes, this leaves us with only 22876792454961 options. We might be able to bruteforce this if we had a supercomputing cluster, but unfortunately we don't. So we'll have to work smarter.

### Reverse Engineering

Let's take a look at the program. Maybe there's something in its structure that we can use to our advantage.

Here are the first 18 instructions of my version of it, from the first inp instruction to the next:

```
inp w
mul x 0
add x z
mod x 26
div z 1
add x 13
eql x w
eql x 0
mul y 0
add y 25
mul y x
add y 1
mul z y
mul y 0
add y w
add y 13
mul y x
add z y
```


Let's transform it into a more readable form:

```
inp w
x *= 0
x += z
x %= 26
z /= 1
x += 13
x = x == w
x = x == 0
y *= 0
y += 25
y *= x
y += 1
z *= y
y *= 0
y += w
y += 13
y *= x
z += y
```


The instructions at lines 2-4:

```
x *= 0
x += z
x %= 26
```


Can be further simplified to x = z % 26. The instructions at 7-8:

```
x = x == w
x = x == 0
```


Can be simplified to x = x != w. We now get:

```
inp w
x = z % 26
z /= 1
x += 13
x = x != w
y *= 0
y += 25
y *= x
y += 1
z *= y
y *= 0
y += w
y += 13
y *= x
z += y
```


Now, the code at lines 2-5:

```
x = z % 26
z /= 1
x += 13
x = x != w
```


Can be rearranged and simplified into:

```
x = (z % 26 + 13) != w
z /= 1
```


This code (6-10):

```
y *= 0
y += 25
y *= x
y += 1
z *= y
```


Can be simplified into z *= 25 * x + 1. And this (11-15):

```
y *= 0
y += w
y += 13
y *= x
z += y
```


Into z += (w + 13) * x.

We finally get:

```
inp w
x = (z % 26 + 13) != w
z /= 1
z *= 25 * x + 1
z += (w + 13) * x
```


In fact, by examining all "code blocks" in the program (running from one inp instruction to the next), we can observe that all of them follow the same pattern:

```
inp w
x = (z % 26 + A) != w
z /= B
z *= 25 * x + 1
z += (w + C) * x
```


Where A and C are arbitrary integers (even negative), and B is either 1 or 26. If x is 0, as determined by the input w, this block computes z /= B. If it is 1, the block computes:

```
z *= 26
z += w + C
```


Further examination reveals something else: in some blocks, x is always 1. For instance, consider the block above:

```
inp w
x = (z % 26 + 13) != w
z /= 1
z *= 25 * x + 1
z += (w + 13) * x
```


Since w can only be in the range 1 to 9, and since z % 26 + 13 is at least 13, it is always true that (z % 26 + 13) != w, so x is always 1, no matter the input.

Moreover, in the blocks where x is always 1, z is always divided by 1, and in the other blocks z is always divided by 26. So we have two kinds of blocks:

1. Type 1:

   ```
   z *= 26
   z += w + A
   ```

2. Type 2:

   ```
   x = (z % 26 + B) != w
   z /= 26
   z *= 25 * x + 1
   z += (w + C) * x
   ```


Where A, B, and C are arbitrary integers, with A and C always positive.

### The Big Picture

Recall that our objective is for the program to output 0 in the z variable. This means that for every Type 1 block that multiplies z by 26 and adds some value there must be a "balancing" Type 2 block that divides by 26.

Note that we're making two assumptions here:[^6]

1. That the z += w + A part of Type 1 blocks never adds more than 26, so we need only a single division by 26 to negate a block's effect.
2. That a Type 2 block in the program never comes before a Type 1 block. That is, Type 2 blocks exactly balance out Type 1 blocks.

Under the second assumption the first block in the program must be Type 1, and since z starts at 0 the block simply performs z = w + A. Under the first assumption z will be less than 26, so a division by 26 will bring the value down to 0.

### The Algorithm

From these observations, the basic idea for finding all valid inputs is to go over all possible digits for Type 1 blocks, and choose digits for Type 2 blocks such that they perform a division by 26. That is, choose a digit w such that (z % 26 + B) == w.

By varying the order of chosen digits for Type 1 blocks we can control whether the first found solution is the smallest accepted number, or the largest. For the smallest, we should choose digits in order from 1 to 9. For the largest, we should choose from 9 to 1.

More formally:

1. For each code block in the program:
   1. If it's a Type 1 block:
      1. Assign it some digit from 1 to 9 as input.
      2. If there are no more digits to assign, backtrack.
      3. Continue to the next block.
      4. If there's no solution with the chosen digit, return to step 1.
      5. Otherwise, return the found solution.
   2. If it's a Type 2 block:
      1. Assign it a digit w such that (z % 26 + B) == w, where z is the result of executing the whole program up to this point, with the digits chosen so far, and B is the constant in the block.
      2. If no such digit w exists (because it's out of the 1-9 range), backtrack.
      3. Otherwise, continue to the next block, and return the solution, if any.
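In code, the backtracking might look like the following sketch. It assumes each block has already been reduced to a triple `(is_type_2, b, c)`, where `b` is the Type 2 comparison offset and `c` the added constant; the parsing is omitted, and the example triples in the test are made up, not taken from my input:

```rust
/// Search for an accepted digit sequence, trying Type 1 digits in the
/// order given by `digits` ([9..1] finds the largest number first).
fn solve(blocks: &[(bool, i64, i64)], z: i64, digits: &[i64]) -> Option<Vec<i64>> {
    let (&(is_type_2, b, c), rest) = match blocks.split_first() {
        // All blocks consumed: accept iff z ended up at 0.
        None => return if z == 0 { Some(Vec::new()) } else { None },
        Some(pair) => pair,
    };
    if is_type_2 {
        // Type 2: the digit is forced so that (z % 26 + b) == w,
        // turning the block into a plain z /= 26.
        let w = z % 26 + b;
        if !(1..=9).contains(&w) {
            return None; // forced digit out of range: backtrack
        }
        solve(rest, z / 26, digits).map(|mut tail| {
            tail.insert(0, w);
            tail
        })
    } else {
        // Type 1: try every digit in the preferred order; the first
        // branch that reaches z == 0 wins.
        for &w in digits {
            if let Some(mut tail) = solve(rest, z * 26 + w + c, digits) {
                tail.insert(0, w);
                return Some(tail);
            }
        }
        None
    }
}
```

Passing the digits high-to-low yields the largest accepted number; low-to-high yields the smallest.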

## Appendix: Dynamic Programming

Dynamic programming is a general method for solving recursive problems, where the solution depends on solutions to overlapping sub-problems.

For example, take the Fibonacci sequence:

\begin{align*} F_0 & = 1 \\ F_1 & = 1 \\ F_n & = F_{n-1} + F_{n-2} \\ \end{align*}

A naive approach to calculating $F_n$ would be by closely following the definition:

```rust
fn fib(n: usize) -> u32 {
    if n <= 1 {
        1
    } else {
        fib(n - 1) + fib(n - 2)
    }
}
```


Except this has an exponential runtime. The problem is that we are making the same calculations over and over. For instance:

\begin{align*} F_4 & = F_3 + F_2 \\ & = (F_2 + F_1) + (F_1 + F_0) \\ & = ((F_1 + F_0) + F_1) + (F_1 + F_0) \\ \end{align*}

Here, $F_2$ was calculated twice.

However, since each Fibonacci term depends only on previous terms in the sequence, we can calculate them up front, and use the calculation when necessary:

```rust
fn fib(n: usize) -> u32 {
    let mut table = vec![0u32; n + 1];
    table[0] = 1;
    table[1] = 1;

    for i in 2..=n {
        table[i] = table[i - 1] + table[i - 2];
    }

    table[n]
}
```


This way, the runtime is merely $O(n)$.[^7]

This is the crux of dynamic programming: building a table of solutions to sub-problems in order to speed up finding a solution to the original problem. You can read more about it here.

[^1]: See here for an explanation of the difference between speed and velocity.

[^2]: I don't have a formal proof for this, but a hand-wavy explanation is that $a \mod 100$ returns the two least-significant digits of $a$, and $b \mod 10$ returns the least-significant digit of $b$. So $a \mod 100 \mod 10$ returns the least-significant digit of $a \mod 100$, which is just the least-significant digit of $a$, AKA $a \mod 10$.

[^3]: Here's one way to implement it.

[^4]: Actually, the bound on $n$ is smaller. But we'll get to that.

[^5]: I despise combinatorics with a fiery passion, so all I'm going to say on the matter is that this formula is derived from the stars and bars problem. See here and here for more.

[^6]: This was the case in my puzzle input, but may not be true in general. The problem is still solvable without making these assumptions, but I did not handle that case.

[^7]: In the particular case of the Fibonacci sequence there is also the closed-form expression.