32 comments
Sergey Bugaev

@janneke not to downplay the importance and coolness of this achievement, here's an uninformed question that you probably get a lot: how does this work wrt depending on a Linux kernel (which is tons of C), some basic userland (or can it run as PID 1-and-only?), and x86 hardware (which... who knows what it does) to run this 357 byte binary?

If you can't trust a compiler to build your program correctly, why can you trust a kernel and some hardware to run your binary correctly?

Andrius Štikonas

@bugaevc @janneke github.com/fosslinux/live-boot project has some initial code to bootstrap Linux. It can build Linux but we still need to kexec into it (which shouldn't be too hard).

Janneke

@bugaevc
Good question! Of course: you can't.

There is currently no good answer to that other than that we chose to start on getting rid of the obviously unnecessary and "easy" binary seeds first. Or: different people have different interests and competences, if we start then eventually we'll probably get there someday. There are some ideas, though.

The least elegant but easiest "solution" would be to revert to Diverse Double Compiling (DDC, dwheeler.com/trusting-trust/). The low level tools (stage0, m2-planet, and mes) can easily do cross builds. You could build on different architectures, and kernels if you like, and compare package checksums.
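
The comparison step can be sketched in a few lines. This is only an illustration, not what the Guix or Mes tooling does: it swaps the checksum comparison for a plain byte-for-byte comparison of two build outputs, and the name files_identical is made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if both files have identical contents,
   0 if they differ, and -1 if either file cannot be opened or read. */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;

    if (a == NULL || b == NULL) {
        result = -1;
    } else {
        unsigned char buf_a[4096], buf_b[4096];
        size_t n_a, n_b;

        do {
            n_a = fread(buf_a, 1, sizeof buf_a, a);
            n_b = fread(buf_b, 1, sizeof buf_b, b);
            if (ferror(a) || ferror(b)) {
                result = -1;            /* read error on either side */
                break;
            }
            if (n_a != n_b || memcmp(buf_a, buf_b, n_a) != 0) {
                result = 0;             /* length or content mismatch */
                break;
            }
        } while (n_a > 0);              /* both hit EOF at the same point */
    }

    if (a != NULL)
        fclose(a);
    if (b != NULL)
        fclose(b);
    return result;
}
```

If the same package built through two independent toolchains, architectures, or kernels comes out identical, a compromise would have to be present in both chains to go unnoticed.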

We did something like this for Mes (all x86_64-linux, though) at the fifth reproducible builds conference (RB-V, guix.gnu.org/en/blog/2019/repr)

Running as PID 1: During the same RB-V conference, Ludovic Courtès prototyped building a Guix package in the initial ramdisk. After the build the package is discarded, but before that its checksum is printed and can be checked with a build under GNU/Linux.

People have been working to build tiny kernels, such as: github.com/ironmeld/boot2now.

Also, Stage0 was designed to also run on the Knight VM, one could imagine running that on simpler hardware, or running the VM on different machines/architectures, dunno.

theruran 🌐🏴

@janneke @bugaevc The folks in #bootstrappable @liberachat are working towards resolving those questions. A POSIX kernel capable of building Linux, and a bootstrap from UEFI are some projects off the top of my head.

They want to get to a FPGA softcore bootstrap, then a manually constructed CPU in TTL to bootstrap from.

But yeah, there are many parts to work on that would improve our (collective) situation, such as bootstrapping GHC: @nomeata mastodon.online/@nomeata/11026

Sergey Bugaev

@theruran @janneke I was thinking something along these lines:

find an "open source hardware" board where you can somehow verify the hardware isn't playing games with you (in particular not running all of your code in a nearly undetectable hypervisor, like we know Intel does...), probably some RISC-V board

Sergey Bugaev

@theruran @janneke

run your bootstrapping code on it with no OS whatsoever; hopefully it doesn't need much from the OS

you'd have to build in a serial driver or something like that (blinking LEDs is cool but you can't input program source this way), not that I have any idea about hardware

theruran 🌐🏴

@bugaevc @janneke and #GNUHurd could be another approach, right? it can host GCC to build Linux already?

Sergey Bugaev

@theruran @janneke the Hurd surely can run GCC and cross-compile Linux; but I'm not sure you would be winning much, for two reasons:

1. It's nowhere near as trivial to do "syscalls" as on Linux — on Linux you place some values into some registers and perform "int 0x80" or "syscall", and that's it, you've called write or exit. On the Hurd, these all are implemented in glibc on top of Mach IPC, and that needs quite a lot of code to happen.

Sergey Bugaev

@theruran @janneke Here's a project of mine where I simply print "Hello world" without relying on glibc: github.com/bugaevc/hello-hurd — but that too is written in C, imagine writing it all in hex.

2. Linux is huge, but you can build it in a minimal configuration (see tiny.wiki.kernel.org/). Mach may be a microkernel, but it's minimal in functionality, not size. In fact it's a meme in the microkernel community just how large for a microkernel Mach is. But I don't have any numbers to quantify this.
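
To make the contrast in point 1 concrete, here is roughly what "place some values into some registers and perform syscall" looks like for write on x86_64 Linux. It is a sketch under assumptions: it uses GCC/Clang inline assembly, the name raw_write is invented for the example, and on any other target it simply falls back to the libc wrapper so that it still compiles.

```c
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: invoke write without the libc write() wrapper. */
long raw_write(int fd, const void *buf, size_t len)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* x86_64 Linux convention: syscall number in rax, arguments in
       rdi, rsi, rdx; the syscall instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"((long)SYS_write), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;     /* a negative errno value on failure */
#else
    /* Assumption: on other targets just use libc, to keep this buildable. */
    return (long)write(fd, buf, len);
#endif
}
```

The whole call is a handful of bytes of machine code, which is why it is feasible to hand-write in hex. On the Hurd the equivalent is a Mach message to a server, with the marshalling normally done by glibc.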

Ludovic Courtès

@bugaevc Speaking of the role of the kernel, an interesting question is how to implement isolated builds on the #Hurd—see “Isolated build environments” at guix.gnu.org/en/blog/2020/chil for an overview.

I’m curious what you think of this!

@janneke

Sergey Bugaev

@civodul hi!

I'm probably not Guix-savvy enough to fully comprehend the issue here — but as I understand it, you want to be super explicit about what each package needs to be built. Do you include libc, cc, binutils into this list of dependencies? (I imagine you do, otherwise it wouldn't be reproducible.) Apparently you do include /bin/sh.

@janneke

Sergey Bugaev

@civodul

So yeah, the Hurd servers aren't much different or any more "external" to the environment than /bin/sh. I don't think you should be firmlinking stuff from the host; you should probably just spawn a mini subhurd for each build. You want pipes and fork/exec, so you need pflocal, proc, and exec servers.

@janneke

Sergey Bugaev

@civodul

(Also /servers/proc, mentioned in your mail, is not a thing, of course 🙂 — the proc server is one of the two servers, the other one being auth, that are not accessible through the file system, but only through _hurd_ports.)

@janneke

Sergey Bugaev

@civodul

Your mail about /bin/sh also raises an interesting topic of paths. Do you want to change /dev/null and /servers/exec to some other (hash-derived I would imagine) paths? Sounds wild but you totally could!

You could then either patch glibc (and everyone who expects to find /dev/null at its usual place), or provide symlinks. But then again I don't know enough about Guix to judge here.

@janneke

Sergey Bugaev

@civodul

Unfortunately all this wouldn't help you too much with bootstrapping from source, since you cannot do I/O easily on the Hurd like you can on Linux with a few instructions; you need to do RPCs and all that (even to get your argv). This is of course hidden from you when you're using glibc.

@janneke

Sergey Bugaev

@civodul

> Also, one could argue that things like /dev/null have a well-defined interface that’s set in stone and that, consequently, how they’re implemented does not matter at all.

Yes, but also no: there certainly can be differences in behavior that are allowed by the interface (where it explicitly doesn't guarantee something), but (due to bugs) can influence the outcome. For instance, does every write to /dev/null always write the whole buffer, or can there be short writes?

@janneke

Sergey Bugaev replied to Sergey

@civodul

Or: can a signal interrupt a write to /dev/null? (On SerenityOS the answer used to be no, on the Hurd it's a resounding yes, dunno about Linux.)
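
Code that does not want to depend on either behavior usually wraps write in a loop. A minimal sketch, assuming plain POSIX write semantics (the name write_all is made up for this example):

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes, retrying after short writes
   and after interruption by a signal. Returns 0 on success, or -1 with
   errno set by the failing write(). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data was written */
            return -1;
        }
        p += n;                 /* short write: advance past what was written */
        len -= (size_t)n;
    }
    return 0;
}
```

A build tool that calls write once and ignores the return value would produce different output depending on how the kernel or server behind /dev/null (or any other file) happens to behave, which is the kind of difference described above.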

@janneke

Ludovic Courtès

@bugaevc Exactly! So the question becomes: assuming you have nothing but the Mach syscalls at your disposal, what chain of programs building on each other would eventually let you run a proc and an exec server so you have the beginning of a POSIX build environment?

The whole stage0/M2/Mes story on Linux was quite a puzzle; its Hurd version would push it further. :-)

@janneke

Ludovic Courtès

@bugaevc The Hurd code lives in /gnu/store/…-hurd-*, but the translation points in the build environment would remain /dev/* and /servers/*. Changing that would be impractical and bring nothing.

@janneke

Sergey Bugaev

@civodul

Here's a fun little problem: if you have lost your proc and auth ports, but still have your fs root dir port, how can you recover those two?

@janneke

Ludovic Courtès

@bugaevc Possibly (but not necessarily) by looking up /servers/proc for the first one; as for auth, it’s forever lost?

@janneke

Ludovic Courtès

@janneke @bugaevc Actually I keep making the same mistake: there’s no /servers/proc but for some reason we have it in childhurds, just with no translator on it (I may be the guilty party :-)).

Sergey Bugaev replied to Ludovic

@civodul

Yes, /servers/proc is not it :)

I was thinking of the following scheme, which I have not tried, so this is just a theory.

You create an executable (perhaps as an unnamed file) that is setuid to yourself, and then exec it (not over your own task, unless you want that), without passing an auth or proc ports (as you have none).

@janneke

Sergey Bugaev replied to Sergey

@civodul

The translator notices this and creates a new auth handle based on its idea of your effective uids/gids (see libfshelp/exec-reauth.c); and then the exec server gives the new task a fresh proc port. You cannot access the new task because of setuid/EXEC_SECURE, but as you created the executable you still control what it does.

@janneke

Sergey Bugaev replied to Sergey

@civodul

In particular it may send its proc/auth ports back to the original task, and the original proc port may then be recovered by a simple

proc_task2proc (other_proc, mach_task_self (), &my_proc)

The exact auth port I don't think can be recovered, but at least you now have another auth port with your effective uids/gids.

@janneke

Ludovic Courtès

@bugaevc The build environment includes nothing but the explicitly-declared userland dependencies. If a package depends on GCC and Binutils, it gets them; if not, it doesn’t.

There’s no /bin/sh there—no /bin, no /usr, nothing.

On Linux, there’s /dev and /proc, but in separate namespaces.

@janneke

Jonathan Frederickson

@civodul @bugaevc @janneke This now got me thinking... I found a Hurd post talking about how it adds POSIX compatibility to Mach: gnu.org/software/hurd/communit

And it says it still provides access to the capability-based permissions underneath, which sounds nice. But it also got me thinking: there's likely to be a lot more software targeting WASI soon, which is natively capability-based. Could it be possible for Hurd to have WASI compatibility too?

Ludovic Courtès

@jfred I have to admit I don’t know WASI…

But overall, I’m not enthused by the idea of adding an extra interpretation layer like Wasm on top of my CPU. Capsicum or the Hurd’s native interfaces look more appealing to me though.

@janneke @bugaevc

Csepp 🌢

@janneke @bugaevc
DuskOS might also be of interest, it has an even smaller binary "seed" than CollapseOS and builds everything else from Forth at boot time. It also has a (non-standard) C compiler.

Janneke

@csepp @bugaevc
Thanks, DuskOS looks pretty interesting. I think it's the first real bootstrapping effort I've seen built on Forth (after hearing a bit too often: Why don't you use Forth, bootstrapping will be trivial).

Especially as it seems that our efforts are largely complementary.
