1️⃣3️⃣ Here's the 13th installment of posts highlighting key new features of the upcoming v256 release of systemd.
ssh is widely established as *the* mechanism for controlling Linux systems remotely, both interactively and with automated tools. It not only provides means for secure authentication and communication for a tty/shell, but also does this for file transfers (sftp), and IPC communication (D-Bus or Varlink).
It relies on TCP as network transport, which is great for remote operation around the globe but really sucks for local communication with a VM and similar, as it usually requires delegation of an address space, DHCP lease, DNS and so on, which while manageable are certainly a major source of mistakes, fragility and headaches. In particular it means that logging into a system to debug networking doesn't really work, since without working networking you can't even log in. Sad!
7️⃣ Here's the 7th installment of my series of posts highlighting key new features of the upcoming v256 release of systemd.
In systemd we put a lot of focus on operating with disk images, specifically file system images that carry an expressive GPT partition table – something that we call DDIs ("Discoverable Disk Images").
DDIs are supposed to carry dm-verity authentication information, i.e. every single access to them is typically cryptographically protected, and linked back to a set of signing keys maintained by the system (ideally in the kernel keyring). systemd uses DDIs for the system itself, for systemd-nspawn containers, for systemd portable services, for systemd-sysext system extensions, for systemd-confext configuration extensions and more.
5️⃣ Here's the 5th installment of my series of posts highlighting key new features of the upcoming v256 release of systemd.
I am pretty sure all of you are well aware of the venerable "sudo" tool that has been a key component of most Linux distributions for a long time. On the surface it's a tool that allows an unprivileged user to acquire privileges temporarily, from within their existing login sessions, for just one command, or maybe for a subshell.
@pid_eins suid programs executing in the environment of the parent process means that I might become root in a user namespace and get the filesystem view of the current mount ns. This won't work with your approach, will it?
@pid_eins This is very nice - especially having it provide a clean context, that saves a ton of headaches and sanitization I need to do when using "sudo" in other programs! (not like that's an amazing thing anyway, but occasionally it's useful and justified) Only having run0 coloring / changing the output of the called command is not something I always like, but maybe that can be optionally disabled...
It's used all over the place in userspace. In systemd we use it:
1. to detect if a block device has partition scanning off or on
2. in our udev test suite, to validate devices are in order
3. udev rules use it for some feature checks (in older versions of systemd).
Anyone knows where the kernel's github/gitlab project is? Would love to file an issue or placeholder revert PR, but somehow I cannot find it! Anyone?
(Yes, this is a joke, I am fully aware of the concept of mailing lists – as a historical concept from the 2005 era... Yes, I am too lazy to figure out how to report this properly. Hence social media it is.)
Credit where credit is due! I'd really like to take a minute and thank Jia Tan for how they helped us to finally get sd_notify() support merged into OpenSSH upstream!
PSA: In the context of the xzpocalypse we have now added an example reimplementation of sd_notify() to our man page:
It's pretty comprehensive (i.e. uses it for reload notification too), but still relatively short.
In the past, I have been telling anyone who wanted to listen that if all you want is sd_notify() then don't bother linking to libsystemd, since the protocol is stable and should be considered the API, not our C wrapper around it. After all, the protocol is so trivial
that one can explain it in one sentence: send an AF_UNIX datagram containing READY=1 to a socket whose path you find in the $NOTIFY_SOCKET env var.
But apparently turning that sentence (which appears in similar fashion in the man page) into code is not trivial, hence this new example code.
Hence, copy away, the thing is MIT licensed. And the protocol has been stable for a decade, and I am pretty sure it's going to remain stable for another decade at least.
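For illustration, here's roughly what that one sentence could look like in C. This is a minimal sketch, not the man page example: the function name notify_sketch() and its return convention are made up here, and it only handles socket addresses that start with "/" (file system path) or "@" (assumed to denote an abstract namespace socket).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: sends one state string (e.g. "READY=1") as a single
 * AF_UNIX datagram to the socket named in $NOTIFY_SOCKET.
 * Returns 1 if sent, 0 if $NOTIFY_SOCKET is unset, negative errno on error. */
static int notify_sketch(const char *state) {
    assert(state);

    const char *path = getenv("NOTIFY_SOCKET");
    if (!path || !*path)
        return 0; /* apparently not started by a service manager */

    size_t len = strlen(path);
    struct sockaddr_un sa = { .sun_family = AF_UNIX };

    /* Assumption: only "/..." and "@..." style addresses are supported here */
    if ((path[0] != '/' && path[0] != '@') || len >= sizeof(sa.sun_path))
        return -EINVAL;

    memcpy(sa.sun_path, path, len);
    if (path[0] == '@')
        sa.sun_path[0] = '\0'; /* abstract namespace: leading NUL byte */

    int fd = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
    if (fd < 0)
        return -errno;

    socklen_t salen = (socklen_t) (offsetof(struct sockaddr_un, sun_path) + len);
    ssize_t n = sendto(fd, state, strlen(state), MSG_NOSIGNAL,
                       (const struct sockaddr *) &sa, salen);
    int r = n < 0 ? -errno : 1; /* pick up errno before close() can touch it */

    close(fd);
    return r;
}
```

A service would call notify_sketch("READY=1") once its startup is complete. When $NOTIFY_SOCKET is unset the call is a no-op, so the same binary still runs fine outside of a service manager.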
You might think the answer to this is 7, i.e. regular files, directories, symlinks, block device nodes, char device nodes, fifos, and sockets. But you are actually wrong: there's an 8th one. There's the concept of an anonymous inode on Linux which has the file type of zero. You can easily acquire fds to inodes of this type via eventfd(). If you call fstat() on such fds, then (.st_mode & S_IFMT) == 0 will hold. 🤯
And I am pretty sure there's a lot of software you might be able to break given that they do not expect this case on the most basic of fs concepts.
Also note that these anonymous inodes are not actually as anonymous as one might think: because open fds appear in /proc/self/fd/ as magic symlinks you can easily get an fs path that, when you call stat() on it, will return you a zero inode type.
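If you want to check this yourself, a small sketch along these lines should do. The two helper names are made up for this example, and the second one assumes /proc is mounted.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns (st_mode & S_IFMT) of a fresh eventfd as
 * reported by fstat() on the fd itself, or -1 on error. */
static int eventfd_type_fstat(void) {
    struct stat st;

    int fd = eventfd(0, EFD_CLOEXEC);
    if (fd < 0)
        return -1;

    int r = fstat(fd, &st);
    close(fd);
    return r < 0 ? -1 : (int) (st.st_mode & S_IFMT);
}

/* Hypothetical helper: same, but goes through the magic symlink
 * /proc/self/fd/<fd> with a path-based stat(). Assumes /proc is mounted. */
static int eventfd_type_proc(void) {
    char path[64];
    struct stat st;

    int fd = eventfd(0, EFD_CLOEXEC);
    if (fd < 0)
        return -1;

    int n = snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    assert(n > 0 && (size_t) n < sizeof(path));

    int r = stat(path, &st); /* stat() follows the magic symlink */
    close(fd);
    return r < 0 ? -1 : (int) (st.st_mode & S_IFMT);
}
```

Going by the above, both helpers should return 0, i.e. none of S_IFREG, S_IFDIR, S_IFLNK, S_IFBLK, S_IFCHR, S_IFIFO or S_IFSOCK. Code that switches over the file type without a default branch is exactly what this case trips up.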
Here's another little feature we scheduled for the next systemd release. Everyone knows SSH well, and it's great to connect to hosts remotely, and even do file transfer. It's probably *the* single most relevant way to talk to some host for administration and various other tasks. It's a bit fragile though: it requires networking, and that even if we talk to a local VM or full OS container. But precisely networking is one of the things you might want to administer via SSH, hence you have a cyclic…
…and risky dependency. But for the VM and full OS container case there's no real need to use SSH via the network: these things run on the local system, hence why bother with IP? To address that we are adding a small generator (that means: a plugin for systemd that generates units on the fly, based on system state, configuration) which binds SSH to a local AF_VSOCK socket in a VM, and to an AF_UNIX socket in a container. You can then use these to directly connect to the system without involving…
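To give an idea of what "no IP involved" means at the socket level, here's a rough sketch of a client connecting to an AF_VSOCK address in C. The helper names are made up, and which CID and port the SSH listener actually uses is not stated above: the port 22 in the usage note is an assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/vm_sockets.h>

/* Hypothetical helper: an AF_VSOCK address is just a (CID, port) pair, where
 * the CID identifies the VM (or the host). No IP address, DHCP or DNS. */
static struct sockaddr_vm vsock_address(unsigned cid, unsigned port) {
    struct sockaddr_vm sa = {
        .svm_family = AF_VSOCK,
        .svm_cid = cid,
        .svm_port = port,
    };
    return sa;
}

/* Hypothetical helper: connects a stream socket to the given CID and port.
 * Returns the connected fd, or negative errno on failure. */
static int vsock_connect(unsigned cid, unsigned port) {
    assert(cid != VMADDR_CID_ANY); /* the wildcard CID is only for binding */

    struct sockaddr_vm sa = vsock_address(cid, port);

    int fd = socket(AF_VSOCK, SOCK_STREAM | SOCK_CLOEXEC, 0);
    if (fd < 0)
        return -errno;

    if (connect(fd, (const struct sockaddr *) &sa, sizeof(sa)) < 0) {
        int r = -errno;
        close(fd);
        return r;
    }

    return fd;
}
```

From the host, something like vsock_connect(cid_of_vm, 22) would then yield a byte stream to the guest, assuming an SSH listener on vsock port 22 there. The guest needs no network configuration at all for this to work, which is the whole point.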
I recently implemented a fun little feature for systemd: inspired by MacOS' "target disk mode", a tiny tool called systemd-storagetm, that exposes all local block devices as NVMe-TCP devices, as they pop up. The idea is that if available in your initrd you can just boot into that (instead of into your full OS), and can access your disks via NVMe-TCP (in case you wonder what that is: it's the new hot shit for exposing block devices over the network, kinda like iSCSI, NBD, …, but cool).
@pid_eins super cool. NVMe-oF ftw! I'm actually looking for a way of trimming down some of the dependencies in systemd-udevd and udevadm to make it even smaller for a super small initrd. I only want systemd-udevd to initialize local storage devices for my use case. The Fedora systemd-udevd is dynamically linked to a large systemd .so. Is there a way of building it against smaller systemd libs like libudev etc.?
We recently added a new document to the systemd website focussing on one specific facet of the service manager: the fdstore. A concept that people should really use more to facilitate "seamless" service restarts and various other things. Please have a look:
Here's a fun new feature we are working on in systemd: userspace-only reboot. In order to reduce grey-out times on image-based OS updates to next to nothing we are making a reboot happen where kernel stays as it is, but userspace shuts down as usual, then possibly transitions into a new rootfs, and starts up again with an initial transaction as it would on a classic system boot. During the transition selected services can pass along their fds and listening sockets, to pass "live" resources…
…from the old system to the new system. This means: super-fast switching from one OS version to the next, with all service code restarted cleanly and comprehensively, but with selected resources passed through untouched, so that they can continue to operate. And it wasn't even that hard to implement: https://github.com/systemd/systemd/pull/27435
Or in other words: let's not wait for hardware, firmware, boot loader, kernel, initrd to reinitialize on a reboot, let's just focus on userspace alone.
@pid_eins >random ideas AI might have? The "AI" didn't have the idea, it's just regurgitating a systemd related idea someone wrote about in the text dataset.
The "AI" things are solely dedicated to making things even more proprietary, so more workload on humans pretty much.
Welcome to the *fed*iverse Lennart, we have free software and ですぅ。
@pid_eins@mastodon.social I especially love how chatgpt just made up some random shit because doing it properly would apparently have been too much work
PSA for C devs: if your library exposes a function that takes a pointer, and you add a "const" to that pointer later on, then yes, that's an API break. Why? Because the prototype of the function changed enough so that anyone taking a pointer to your function won't be able to assign it to the variable they intend to store it in. Yes, C API compat is hard. (libbpf, I am looking at you 👀👀👀, btw)
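A small sketch of the problem, using a made-up library function in two versions (all names here are hypothetical, and C11 is assumed for _Generic and static_assert):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical library function as shipped in "version 1" */
static size_t lib_measure_v1(char *s) { return strlen(s); }

/* The same function in "version 2": only a const was added */
static size_t lib_measure_v2(const char *s) { return strlen(s); }

/* Application code written against version 1 stores the function in a
 * variable whose type mirrors the version 1 prototype. */
typedef size_t (*measure_fn)(char *);

/* Evaluates to 1 if a pointer to f has exactly the type measure_fn */
#define FITS_V1_POINTER(f) _Generic(&(f), measure_fn: 1, default: 0)

static_assert(FITS_V1_POINTER(lib_measure_v1) == 1,
              "v1 matches the pointer type derived from its own prototype");
static_assert(FITS_V1_POINTER(lib_measure_v2) == 0,
              "adding const produced a different, incompatible function type");

static measure_fn pick_measure(void) {
    /* Fine with v1. Returning lib_measure_v2 here instead is a constraint
     * violation in C: compilers diagnose an incompatible pointer type. */
    return lib_measure_v1;
}
```

Direct calls keep compiling after the const is added, since a char * argument converts implicitly to const char *. It's only code that takes the function's address (callback tables, dlsym() wrappers, and so on) that runs into the changed type.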
@pid_eins Well, libbpf has always had an... interesting... approach to backwards compatibility. It's supposed to be better going forward, now that it's reached v1.0. I guess time will tell...