Bootstrapping a C compiler
In “Reflections on Trusting Trust”,
Ken Thompson shows that you can’t trust a program on your computer,
even if you “compiled it from source” yourself,
because your compiler is also untrusted.
It’s not good enough to recompile your compiler,
because for that you need another untrusted compiler.
The gcc currently on your machine has a very long compilation ancestry,
completely lost to the sands of time.
In response, people say “it’s turtles all the way down”.
The saying implies the stack is infinite, but it’s not.
A fix to the trust problem is to rebuild the stack of turtles yourself.
Imagine a project called “BootstrapCC”.
BootstrapCC begins with a handwritten x86 ELF program, b0.
b0 has a tiny job: to transform ASCII hex into binary.
b0 is our first compiler,
and ensures that the rest of our source can be written in readable ASCII.
b0 is small and trivial enough to be hand-verified.
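To give a feel for how little b0 has to do, here is a rough sketch of the same job in Python. This is only an illustration of the logic: the real b0 would be hand-written machine code, and the function name `hex_to_binary` and the `#` comment syntax are my own assumptions, not part of any real tool.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def hex_to_binary(source: str) -> bytes:
    """Convert ASCII hex text to raw bytes, skipping whitespace and comments."""
    digits = []
    for line in source.splitlines():
        # Assumption: '#' starts a comment that runs to the end of the line.
        code = line.split("#", 1)[0]
        for ch in code:
            if ch.isspace():
                continue
            if ch not in HEX_DIGITS:
                raise ValueError(f"not a hex digit: {ch!r}")
            digits.append(ch)
    # Every output byte needs exactly two hex digits.
    if len(digits) % 2 != 0:
        raise ValueError("odd number of hex digits")
    return bytes.fromhex("".join(digits))


if __name__ == "__main__":
    # The first four bytes of any ELF file: 0x7F followed by "ELF".
    sample = """
    7F 45 4C 46   # ELF magic number
    """
    print(hex_to_binary(sample))  # prints b'\x7fELF'
```

The whole program is one loop and two checks, which is roughly the scale at which reading the raw bytes by hand stays realistic.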
BootstrapCC builds the binary b1 using b0,
then b2 using both of those, and so on.
Eventually, BootstrapCC produces
a compiler for the subset of C used by the Tiny C Compiler.
From here, BootstrapCC can compile TCC,
from which most other things in existence can be built.
Along the way,
BootstrapCC creates successively higher-level languages:
an assembler, an IR, perhaps a stack-based language or a Lisp.
BootstrapCC does not rely on the lost 70-year lineage of modern programs. BootstrapCC does rely on some things, such as Intel hardware and the OS. But this is a much smaller footprint.
People are aware of the “trusting trust” paper, but they treat it as a curiosity. Some people are fanatical about compiling their programs themselves, but don’t take it to its logical conclusion. Lots of people love creating new programming languages, but all of them are written in some other language. There are even lots of people creating new compilers, but all of them need compiling with something else. BootstrapCC, sadly, is fictional, and I’m not aware of any project which performs the bootstrapping process I described here.
I wrote this because I felt like it. This post is not associated with my employer.