Bootstrapping a C compiler
In “Reflections on Trusting Trust”,
Ken Thompson shows that you can’t trust a program on your computer,
even if you “compiled it from source” yourself,
because your compiler is also untrusted.
It’s not good enough to recompile your compiler,
because for that you need another untrusted compiler.
The gcc currently on your machine has a very long compilation ancestry,
completely lost to the sands of time.
In response, people say “it’s turtles all the way down”.
The saying implies the stack is infinite, but it’s not.
A fix to the trust problem is to rebuild the stack of turtles yourself.
Imagine a project called “BootstrapCC”.
BootstrapCC begins with a handwritten ELF x86 program, b0.
b0 has a tiny job: to transform ASCII hex into binary.
b0 is our first compiler,
and ensures that the rest of our source can be written in readable ASCII.
b0 is small and trivial enough to be hand-verified.
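To make b0's job concrete, here is a minimal sketch of the same transformation written in C. This is only an illustration: the real b0 would be handwritten machine code, not C. The names hex_value and hex_decode, the '#' comment syntax, and the choice to skip every non-hex character are assumptions made for this example, not part of BootstrapCC.

```c
#include <stddef.h>

/* Value of one ASCII hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decode the NUL-terminated ASCII hex text in src into out,
 * writing at most cap bytes. Returns the number of bytes written.
 *
 * Assumptions for this sketch: non-hex characters (spaces, newlines)
 * are ignored, '#' starts a comment that runs to the end of the line,
 * and a trailing unpaired digit is dropped.
 */
size_t hex_decode(const char *src, unsigned char *out, size_t cap)
{
    size_t n = 0;
    int high = -1; /* pending high nibble, or -1 if none */

    for (; *src != '\0' && n < cap; src++) {
        if (*src == '#') {
            /* Skip the comment, stopping just before the newline. */
            while (src[1] != '\0' && src[1] != '\n')
                src++;
            continue;
        }
        int v = hex_value((unsigned char)*src);
        if (v < 0)
            continue; /* whitespace or other separator */
        if (high < 0) {
            high = v; /* first digit of a pair */
        } else {
            out[n++] = (unsigned char)(high * 16 + v);
            high = -1;
        }
    }
    return n;
}
```

For example, decoding the text "7f 45 4c 46" yields the four bytes 0x7f, 'E', 'L', 'F', which is the magic number that opens every ELF file. The whole job is a digit lookup and a shift, which is why a program like this is plausible to check by hand.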
BootstrapCC builds the binary b1 using b0,
then b2 using both of those, and so on.
Eventually, BootstrapCC produces
a compiler for the subset of C used by the Tiny C Compiler.
From here, BootstrapCC can compile TCC,
from which most other things in existence can be built.
Along the way,
BootstrapCC creates successively higher-level languages:
an assembler, an IR, perhaps a stack-based language or a Lisp.
BootstrapCC does not rely on the lost 70-year lineage of modern programs. BootstrapCC does rely on some things, such as Intel hardware and the OS. But this is a much smaller footprint.
People are aware of the “trusting trust” paper, but they treat it as a curiosity. Some people are fanatical about compiling their programs themselves, but don’t take it to its logical conclusion. Lots of people love creating new programming languages, but all of them are written in some other language. There are even lots of people creating new compilers, but all of them need compiling with something else. BootstrapCC, sadly, is fictional, and I’m not aware of any project which performs the bootstrapping process I described here.
Update!: BootstrapCC exists!
It’s a wonderful-looking project called
which describes itself thus:
This is a set of manually created hex programs in a Cthulhu Path to madness fashion. Which only have the goal of creating a bootstrapping path to a C compiler capable of compiling GCC, with only the explicit requirement of a single 1 KByte binary or less.
Additionally, all code must be able to be understood by 70% of the population of programmers. If the code can not be understood by that volume, it needs to be altered until it satisfies the above requirement.
All content copyright James Fisher 2018. This post is not associated with my employer.