Feedback by Nicholas Drozd

Posted: Jan. 4, 2022

Hello DZ! In your Opinion 155 you make some skeptical remarks about
the "Busy Beaver" problem. Specifically, you say that it is either
meaningless (when interpreted incorrectly) or boring (when interpreted
correctly). The interpretation problem has to do with philosophical
commitments to the existence of infinite sets, which in your view make
no sense.

I would like to attempt to convince you of two things:

  1. The Busy Beaver problem requires no commitments to infinite
     anything; on the contrary, it is quite finite and well within the
     range of what a computer can "understand".

  2. It is a fabulously interesting case of experimental mathematics.
     Every single concrete value that is known was discovered by
     computer, and nothing at all can yield more new values except for
     more computer searching.

If things go well, perhaps I can even convince you to take a crack at
an open problem. This problem is described at the end. (Feel free to
skip ahead.)

Now, the Busy Beaver question asks: what is the longest that a Turing
machine program of N states and K colors can run before halting when
started on the blank tape? We can think of "the" Busy Beaver problem
as a sequence of problem instances parameterized by N and K. Although
that sequence might go on "infinitely", no particular instance deals
with anything infinite. There are only finitely many programs of N
states and K colors.

In fact, there are approximately (NK)^(NK) of them. This value gets
out of hand quickly even for small values of N and K. There are
normalization techniques that can be applied to reduce this number
somewhat, and the fact that all programs are to be started on the
blank tape can also be exploited to reduce the program space under
consideration. But even these approaches become inadequate when N or K
gets too big.

Suppose we've pared the finite set of all N-state K-color TM programs
down to a reasonably small set of candidate programs. The next step is
to find which one among them runs the longest before halting. More
specifically, we're looking for a natural number S and an NK-program P
such that P halts after S steps and no NK-program halts after S steps.

There is no general method to upper-bound S in terms of N and K; for
if there were, it could be used to solve the halting problem.

Solving the Busy Beaver problem for NK-programs requires identifying
the "champion" program and also proving that it really is the champion
(that is, proving any other NK-program halts within S steps or doesn't
halt at all).

The proof part is tricky even in simple cases, and it isn't the kind
of "proof" that illuminates anything. A skeptic such as yourself might
describe it as meaningless or needlessly formalistic. Then again, the
1963 "proof" of the 3-state 2-color Busy Beaver problem was one of the
earliest cases of a computer-assisted proof. So maybe you will have
some sympathy for it. Keep in mind that there are only finitely many
cases to consider.

But finding the champion program is something that should in principle
be possible. It's a perfectly finitistic and constructive pursuit!
Once found, a champion program can be verified. For example, a 5-state
2-color program was discovered in 1989 that runs for 47,176,870 steps
before halting. Nobody had any clue that such a program existed until
it was discovered. Of course, it was turned up by computer search.

So we have our list of candidate programs, and we want to find the
champion. How would you go about doing this? It would be nice to be
able to run the programs forever and see which ones ultimately halt,
but obviously it's nonsensical to talk about running a program
"forever". We can only ever run programs for a finite number of steps.

Okay, so run all of them for T steps and see which ones halt. Great,
but how many steps is T? There doesn't seem to be any principled way
to decide on a good T; in practice it's determined by 1) how powerful
my computer is, 2) how long I'm willing to wait, and 3) my
out-of-thin-air estimate on the upper bound of S. The history of Busy
Beaver research is littered with estimates of S that turned out to be
comically wrong. I've even made a few stupid estimates myself!

I claim that there is no theory at all that can give a clue as to the
value of S. This makes the Busy Beaver search maximally experimental:
every single known value was a new empirical discovery.

You've said that computers will eventually make human mathematicians
obsolete. A practical way to understand the "undecidability" of the
Busy Beaver problem is that even computers will not be able to figure
it out in general. Nor, for that matter, could hyper-dimensional
pure-energy beings. Nobody can! Perhaps a few more values can be
figured out than humans can find, but anyone who tries will hit a wall
(and sooner rather than later). There will always be more to do.

Hopefully I've succeeded in showing that the Busy Beaver problem is
interesting to computer mathematicians. Is it interesting to
traditional mathematicians? I think so. Consider the function L on the
natural numbers:

  * If n is divisble by 3, then L(n) = 0.
  * Otherwise, n = 3k + r with r in {1, 2},
      and L(3k + r) = L(5k + r + 3).

Is L total? That is, do we have L(n) = 0 for all n? It seems that way,
but nobody knows how to prove it.

The function L, which is closely related to the Collatz function, is
implemented by the 4-state 2-color Blanking Beaver champion program
mentioned above. Keep in mind that 4 states and 2 colors is a pretty
short program! But still it is long enough to implement a function
with a whiff of undecidability. (I don't think traditional
mathematicians have the right vocabulary for dealing with functions
like these, but that's a discussion for another day.)

OPEN PROBLEM

The classic Busy Beaver problem asks how long an NK-program can run
before halting when started on the blank tape. Here's a variation
called the Blanking Beaver problem: how long can an NK-program run
when started on the blank tape before the tape becomes blank again?

Call the classic Busy Beaver function BB and call the Blanking Beaver
function BLB. How would you estimate the rate of growth of BLB in
terms of BB? Do the values stay close together, or does one grow a lot
faster than the other?

It's been known for a few decades that BB(4, 2) = 107. That is, every
4-state 2-color program when started on the blank tape either halts
within 107 steps or doesn't halt at all. It's also known that BB(5, 2)
≥ 47,176,870 (by a program discovered in 1989) and that BB(2, 4) ≥
3,932,964 (by a program discovered in 2005).

Just in July 2021 I found a witnessing program that shows BLB(4, 2) ≥
32,779,477. So BLB seems to grow a lot faster than BB.

Okay, here's the OPEN PROBLEM: what is the value of BLB(2, 4)?

The best that I've been able to find is that BLB(2, 4) ≥ 190,524. That
isn't a very impressive value, but my feeling is that it isn't the
true value. We know that BB(2, 4) > BB(4, 2) and we know that BLB(4,
2) > BB(4, 2), so a reasonable hypothesis is that BLB(2, 4) > BLB(4,
2). Is that hypothesis true? I've argued that no theory can answer
this question, and that it's strictly a matter of experimental
computer-based mathematics.

Gee, if only there were someone who could help out in that domain!

Nick Drozd