Honor Among Sieves
- Published on
- ∘ 15 min read ∘ ––– views
Previous Article
Next Article
I'm revisiting my implementation of the prime number sieve from this post since I was losing sleep over its sheer size due to the numerical format of its output.
A Preface About Finite State Automata
A mealy machine1 is a finite state machine (though it doesn't actually have to be infinite if you get clever about positioning of ellipses in your proofs), which differs from the perhaps-conventional understanding of a FSM in that its transition dynamics are determined explicitly by both its inputs and the current state rather than just the current state.
We express a Mealy machine as a tuple 2 where:
- is the set of all possible states,
- is the initial state,
- is the set of all possible inputs,
- the set of all possible outputs,
- is the transition function mapping state-input pairs to the successive state,
- is the output ("reward") function mapping state-input pairs to the corresponding output.
It proved useful to keep this in the back of my head when mulling over Tromp's sieve.
Digesting Diagrams
On his site, Tromp provides a diagram of the alternate form of the prime number sieve:
which is far more compact and visually pleasing than the behemoth I cooked up:
This is largely because of the nature of our respective sieve's outputs. Whereas Tromp's outputs an infinite list of booleans corresponding to whether or not the positive integer number at that index is prime, mine outputs the primes themselves – and church-encoded numerals (or any reliance thereof) naturally results in much larger diagrams.
Scattered throughout his Algorithmic Information Theory repo are various .lam
programs related to prime number sievery, and I believe this is the one which produces an infinite binary stream where a '1'
indicates that the number at that index is prime, and a '0'
indicates it's composite:
let
zero = \x\y.x;
one = \x\y.y;
consz = \xs\p.p zero xs;
oneS = \cont\x\xs\p.p x (xs cont);
zeroS = \cont\x\xs.consz (xs cont);
fix = \f.(let x=f x in x);
ones = fix (\ones\p.p one ones);
sieve = fix (\sv\n\x\xs\p.p x (x xs (xs (fix n)) (sv (\r.oneS (n r)))));
main = consz (consz ones (sieve zeroS))
in main
This function can be visualized as the above diagram in conjunction with Brauner's illustration tool when translated to the format the Brauner's script is expecting (->
s instead of .
s, and with curried functions instead of multiply nested \<var>
definitions).
However, it appears as though Brauner's tool uses applicative order reduction where reduceOneLmIm
presumably means leftmost innermost (i.e. eager) evaluation, which –as detailed in the prelude3 post– is unfit for recursively defined functions depending on the placement of the recursive call.
How, then, did Tromp manage to produce the infinitely expanding list of booleans corresponding to their indices primality? To answer this question, we'll first annotate and trace4 his implementation.
Rosetta
Constants
The first few definitions are comparatively straightforward, zero
and one
correspond to true
and false
.
Streams, Constructors, and combinators
consz = \xs\p.p zero xs;
consz
(read: cons zero
) builds a stream node where the head is hardcoded to zero
.
Recall that cons
is defined as
cons = \x\xs\p.p x xs;
where p
is some predicate returning true (fst
) or false (snd
), allowing us to conditionally select the head or tail of our cons
cell. consz
, then, simply hardcodes the first argument of cons
to be zero
.
oneS
and zeroS
are stream transformers defined as:
oneS = \cont\x\xs\p.p x (xs cont);
zeroS = \cont\x\xs.consz (xs cont);
oneS
takes as arguments:
- a function
cont
to transform (apply to) the tail (xs
) of the stream, - a stream of the form
x:xs
, - and a constructor
p
such ascons
orconsz
.
zeroS
is similar with the exception that it hardcodes the constructor p
to be our previously defined consz
constructor to prepend a '0'
the stream.
These two stream transformers will be used conditionally to keep/eliminate primes/composites from our stream of naturals.
Next we have the non-lambda-pure fix
:
fix = \f.(let x=f x in x);
which is just the syntactically-glucosed Z
combinator:
Z = \f -> (\x -> f (\v -> x x v)) (\x -> f (\v -> x x v));
the rub
ones = fix (\ones\p.p one ones);
creates an infinite stream of one
values to be used as seed values to the sieve (assume all numbers are prime until proven guilty of composition, and replaced with a zero via consz
).
sieve = fix (\sv\n\x\xs\p.
p x (
x xs (xs (fix n))
(sv (\r.oneS (n r)))
)
);
this is the beast.
sv
is the recursive self-reference,n
is a function which generates a mask to apply to the stream – akin to the wheel of composite elimination, initiallyn
iszeroS
,x:xs
is our output stream, initially set to be0:0:oneS
,- and again
p
is the constructor:cons
orconsz
.
The body of this function re-constructs the existing stream of naturals after applying the mask produced by n
that's been constructed so far, and updating the mask as needed if the current number being processed proves to be prime (thus eliminating all future multiples from consideration).
┌───────────────┐
│ Stream input │
└──────┬────────┘
↓
┌──────┴───────┐
│ Apply mask n │
└──────┬───────┘
↓
┌──────┴───────┐
│ Build node │ (using p)
└──────┬───────┘
↓
┌──────┴───────┐
│ Update mask │ (n → oneS (n r))
└──────┬───────┘
↓
┌──────┴───────┐
│ Recurse sv │ (with updated n)
└──────────────┘
Let's see how that works in practice.
Suppose we've determined that the head is prime – as is the case for the first call to sieve
for x = 2
. The sieve needs to update the filter n
such that all multiples of two are flagged as composite. The body of this function accomplishes that via:
- retaining the current value:
p x (...)
- Applying the current stream to itself and the filter mask:
x xs (xs (fix n))
- Recursing into the sieve with a new filter function:
sv (\r -> oneS (n r))
Every recursion of sv
pushes another 1
into the filter, delaying the next 0
that zeroes out a multiple.
E.g. for an initial stream of oneS
corresponding to the naturals:
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, ...]
at each step, if a number is marked as '1'
by the mask produced by n
, it's considered prime, and we update the mask such that all of the multiples of that number are marked as composite. This is achieved by prepending a '1'
to n
(which, recall, is not simply a stream, but rather a stream constructor)5 to "count" when the next multiple of x
occurs. Prepending/offsetting the mask is accomplished via:
n' = \r -> oneS (n r)
this means "add a '1'
immediately before whatever the previous mask-producing-function n
said." The first invocation, then, produces a function which masks even numbers since n
s initial value is just zeroS
, and so
n' = \r -> oneS (n r)
= \r -> oneS (zeroS r)
=> 1 0 1 0 1 0 1 0 ...
So, when applied to the tail of our stream `xs:
current | ↓ | ||||||||
---|---|---|---|---|---|---|---|---|---|
index | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
x:xs | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
n | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
the first mask is created, updating n
to filter even numbers by prepending a 1
to each entry:
current | ↓ | ||||||||
---|---|---|---|---|---|---|---|---|---|
index | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
x:xs | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
n | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
now applying n
to xs
:
current | ↓ | ||||||||
---|---|---|---|---|---|---|---|---|---|
index | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
x:xs | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
n | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
x:xs' | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
Then, recursing to the next index, we encounter another x=1
scenario at index 3:
current | ↓ | ||||||||
---|---|---|---|---|---|---|---|---|---|
index | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
x:xs | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
n | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
We update n
by again prepending another oneS
before each entry – where "entries" are function compositions, so n
(and its successors) can be thought of as groups:
n = (1 0) (1 0) (1 0) (1 0) (1 0) ...
n' = 1 (1 0) 1 (1 0) 1 (1 0) 1 (1 0) 1 (1 0) ...
n'' = 1 (1 1 0) 1 (1 1 0) 1 (1 1 0) 1 (1 1 0) 1 (1 1 0) ...
.
.
.
and at time of evaluation, the collection of n
s are folded via logical conjunction such that the mask applied at index 3 is the composition of n2 ∘ n3
. So conceptually we have
n2 = 1 0 1 0 1 0 1 0 1 ...
n3 = 1 1 0 1 1 0 1 1 0 ...
n2 ∘ n3 = 1 0 0 0 1 0 1 0 0 ...
However, we must note also that each successive iteration of n
is offset by the depth of the next multiple of the prime its generated from, so n3
begins at index 6:
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | |
---|---|---|---|---|---|---|---|---|---|---|---|
n | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
n2 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
n3 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 |
n2 ∘ n3 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
current | ↓ | ||||||||
---|---|---|---|---|---|---|---|---|---|
index | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
x:xs | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
n | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
x:xs' | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
okay swag, and the next prime (5) will improve the mask beginning at index 25, 7 at 49, etc. So we get diminishing returns (cardinally speaking), but returns nonetheless.
Additionally, by accomplishing recursive sievery by pushing offsets of oneS
into our mask generator n
, we never have to realize the cost of eliminating infinitely many multiples of infinitely many primes, and instead only ever have to evaluate a single function call (the composition of n
) on the head of our stream to produce the corresponding output! The worst thing that can happen to Haskell code is that it gets executed.
In Haskell, this algorithm can be more easily read as:
primes = let
one = \cont (x:xs) -> x:cont xs
zero = \cont (x:xs) -> '0':cont xs
succ n = one . n
primes n = '1':f (primes sn) where
f = sn f
sn = succ n
in
'0':'0':primes zero
main = putStr primes
Diagram
Now that we have an intuition about how these pieces come together to form the sieve, we can identify the component pieces within the complete diagram.
From left to right, we have the first two hardcoded 0s as the monolithic vertical bars.
Then we have two placeholders for the first instance of our mask n
, with the value of false
immediately to its right:
The gross middle bit is the representation of the body of sieve
:
sieve = fix (\sv\n\x\xs\p.
p x (
x xs (xs (fix n))
(sv (\r.oneS (n r)))
)
);
Next is the visualization of oneS
On the far right side, we can pick out consz
:
let
zero = \x y -> x;
one = \x y -> y;
consz = \xs p -> p zero xs
in
consz
Tromp's BLC sieve, translated from BLC to normal λ-calculus is:
which color codes to:
and (after a singular normal-order, not applicative order beta reduction) produces an intermediate result:
from which we can start to see some shape emerging with two leading true
= composite entries, two additional placeholders for the forthcoming false
= prime entries for 2
, 3
, and the recursive structure on the tail.