How small could you get the first iteration of the bootstrappageness? Are you worried about possible Trojans in compilers that you're trusting?
it's an interpreter, and i could probably read the whole codebase within a couple of hours. a trojan in a programming language is not something i take very seriously, and definitely not if i can read the source. such things would have clear red flags, like uncommented, strange hexadecimal constants that look like they could be binary code. those don't belong in a compiler at all, there is no purpose for them. in elliptic curve code, by contrast, the constants are always well documented elements of the arithmetic group, along with the endomorphisms and symmetries and all that.
yes, bootstrapping can be very simple. the core C syntax is an example of a language that is already so small that it's simple to build into itself. but C has some terrible elements in it: unclear bit lengths that are often platform dependent (this was the hardest part when i ported a hamming code error correction algorithm from C into Go some time back), the union type is an abomination, and the pointer dereference operator versus the dot operator for struct fields is confusing. all of these complexities lead to slow compilation, and make it harder to validate the correctness of the translation into machine (or other) language.
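to illustrate the bit length point, here is a minimal sketch in Go. it is not the ported hamming code mentioned above, just a hypothetical `parity` helper of the kind such code relies on. because `uint8` is exactly 8 bits on every platform, the shifts below mean the same thing everywhere, which is not guaranteed for a plain C `int` or `long`.

```go
package main

import "fmt"

// parity returns 1 if b has an odd number of set bits, 0 otherwise.
// Hypothetical helper for illustration: uint8 is always exactly 8 bits
// in Go, so the folding below never depends on the target platform.
func parity(b uint8) uint8 {
	b ^= b >> 4 // fold the high nibble into the low nibble
	b ^= b >> 2 // fold the remaining 4 bits into 2
	b ^= b >> 1 // fold the remaining 2 bits into 1
	return b & 1
}

func main() {
	fmt.Println(parity(0b1011)) // three set bits
	fmt.Println(parity(0xFF))   // eight set bits
}
```

the same function written against a C `unsigned int` would need an explicit width assumption or a `<stdint.h>` type to be equally unambiguous.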
the things you can leave out of Go that are not needed for a compiler are pretty extensive: channels, mutexes, atomics, goroutines, interfaces. you can probably get away with no maps by using inverted indexes or by implementing a key/value index map purely with slices. you could leave out slices, too, but that would take a lot of fiddling to recreate them, and they are extremely handy for parsing streams of bytes. you could leave out strings, too, because they are immutable and weird (and that's one of the things i'm removing anyway), but of course you would still need a string literal, it would just map to a slice of bytes (ascii/utf-8).
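a rough sketch of the "key/value index map purely with slices" idea, assuming byte-slice keys and int values. the names `entry`, `sliceMap`, `Set` and `Get` are made up for illustration, and lookup is a plain linear scan, which is likely fine for the small symbol tables of a bootstrap compiler but is not a tuned design.

```go
package main

import "fmt"

// entry is one key/value pair. Keys are byte slices, matching the idea
// that a string literal could simply map to a slice of bytes.
type entry struct {
	key   []byte
	value int
}

// sliceMap is a hypothetical map replacement built only from a slice.
type sliceMap struct {
	entries []entry
}

// equalBytes compares two byte slices without any library help.
func equalBytes(a, b []byte) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// find returns the index of key in the entries, or -1 if it is absent.
func (m *sliceMap) find(key []byte) int {
	for i := range m.entries {
		if equalBytes(m.entries[i].key, key) {
			return i
		}
	}
	return -1
}

// Set inserts a new pair or overwrites the value of an existing key.
func (m *sliceMap) Set(key []byte, value int) {
	if i := m.find(key); i >= 0 {
		m.entries[i].value = value
		return
	}
	m.entries = append(m.entries, entry{key: key, value: value})
}

// Get returns the value stored for key and whether it was found.
func (m *sliceMap) Get(key []byte) (int, bool) {
	if i := m.find(key); i >= 0 {
		return m.entries[i].value, true
	}
	return 0, false
}

func main() {
	var symbols sliceMap
	symbols.Set([]byte("x"), 1)
	symbols.Set([]byte("y"), 2)
	symbols.Set([]byte("x"), 3) // overwrites the first entry
	v, ok := symbols.Get([]byte("x"))
	fmt.Println(v, ok)
}
```

the demo still uses `fmt` and a string-to-bytes conversion for convenience; a stripped-down language would need its own equivalents for both.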
Woosh! Sooooo many words I don't know. But it sounds like C can be small. Could you segment off different things and kinda make it modular, so that you add in a piece after you build it for additional capability? Those channels, mutexes, interfaces and maps? That way you could always return to the initial compiler and then redo the entire bootstrap.