Hector Martin

Yet another person on Reddit surprised that Asahi Linux compiles stuff way faster than macOS.

"But macOS is so optimized for the hardware!" they all say... except Linux is already way more optimized in general than macOS is, for many workloads!

$ time tar xf linux-6.3.3.tar

macOS on APFS: 6.8 seconds
Linux on ext4: 1.0 seconds

Both on an M1 MacBook Air 13". That's how much faster Linux is at dealing with files than macOS.

The hardware drivers don't matter when you're dealing with pure CPU workloads and an NVMe SSD. We already have cpufreq and share the Linux NVMe core, so there's nothing left to optimize there that is specific to this hardware. The only thing missing is deep CPU idle, which will unlock boost clocks, but only for single-core workloads (multicore compiling is already at its max).

39 comments
gudenau

@marcan Does Apple still cheat in the NVMe drivers?

Hector Martin

@gudenau They cheat at the OS layer, we cheat at the driver layer because it'd be stupid not to. Our default flush interval is faster than macOS though, so your data is strictly safer on Linux (you can lose up to 30 seconds of work on macOS or so, only 1 second on Linux with default settings).

None of this matters on laptops because you are guaranteed a flush when the battery is about to die or when you hold down the power button. This only matters for desktops, when you yank power.

gudenau

@marcan Or there is a power outage or whatever, at least they are both journal filesystems I suppose so it shouldn't matter too much...

Hector Martin

@gudenau Yeah, I haven't done any explicit stress tests yet but I've hard-rebooted (including the NVMe controller) these things hundreds of times and never noticed any unexpected corruption (just the usual things with NULL trailing blocks in files, which is normal for ext4 defaults) so I'm pretty confident if there is some problem lurking it's *really* hard to hit.

Nicolás Alvarez

@gudenau @marcan IIRC you mentioned Mac mini lasting surprisingly long (1 whole second) when unplugged? Wish there was a way to detect that early enough to trigger a flush 😅

Hector Martin

@nicolas17 @gudenau Yeah, I looked for that... unfortunately, couldn't find any :(

Stephen Bannasch (316 ppm)

@nicolas17 @gudenau @marcan I wrote a kiosk application in Electron that ran on a windows 11 system. The network would go down on a power outage about 200ms before the power that ran the system failed. That was enough time to stop writing data.

gudenau

@stepheneb @nicolas17 @marcan That's pretty wild, but it's also very situation dependent.

Stephen Bannasch (316 ppm)

@gudenau @nicolas17 @marcan I was getting random corruption because my app updated persistent state when the network changed state. These were kiosks with large touchscreens in science museums. Took a while to debug this remotely because the network changing state occurred much more often than the power being randomly cut.

Stephen Bannasch (316 ppm)

@gudenau @nicolas17 @marcan there was a deterministic ordering for which systems shut down first, and luckily for my needs the network shut down first.

Nicolás Alvarez

@marcan hm what about sync; time (tar xf linux-6.3.3.tar && sync)?

Hector Martin

@nicolas17 Doesn't make much of a difference, the sync after only takes 0.2s or so.
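
The measurement discussed here (sync first, then time the extraction and the trailing sync separately) could be scripted roughly as follows. This is a minimal sketch, not the command anyone in the thread ran: `time_with_sync` and `extract` are hypothetical names, and `extract` assumes a `tar` binary on the PATH.

```python
import os
import subprocess
import time


def time_with_sync(work):
    """Return (work_seconds, sync_seconds) for a callable that writes files."""
    os.sync()                          # flush earlier dirty data so it is not billed to us
    start = time.perf_counter()
    work()
    after_work = time.perf_counter()
    os.sync()                          # flush whatever the work left in the page cache
    end = time.perf_counter()
    return after_work - start, end - after_work


def extract(tarball, dest):
    """Hypothetical helper: extract with the system tar, as in the thread."""
    subprocess.run(["tar", "xf", tarball, "-C", dest], check=True)
```

Usage would look like `time_with_sync(lambda: extract("linux-6.3.3.tar", "."))`. Reporting the two intervals separately shows how much of the total is the final flush rather than the extraction itself.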

Ariadne Conill 🐰

@marcan @nicolas17 Darwin's VFS is going to have a lot of overhead anyway, I would figure due to the Mach bullshit

Sergey Bugaev

@ariadne @marcan @nicolas17 huh? The VFS does not have anything to do with Mach whatsoever, no? This is not a microkernel system we're talking about.

Ariadne Conill 🐰

@bugaevc @marcan @nicolas17 well, for one, mmap(2) would have to somehow interact with mach ports, given that mach provides the virtual memory system.

i mean, i could be wrong, i'm not terribly interested in spelunking through the Darwin source code at this immediate moment

Ariadne Conill 🐰

@bugaevc @marcan @nicolas17

by extension, however, the point is that any resource that can possibly interact with mach should be backed by resources accessible by mach. e.g. kernel handles of all kinds.

it is, admittedly, possible that apple has decided to do something far more cursed, like translating resources to be accessible by the mach layer as needed, but this seems far more complicated than just backing all kernel resources by mach ports to begin with

Ariadne Conill 🐰

@bugaevc @marcan @nicolas17

though it seems that you're right that file handles do not normally have mach ports on Darwin (which is surprising to me):

a program which does nothing has 11 ports opened. modify that program to open a file and it still only has 11 ports opened.

🤷

Sergey Bugaev

@ariadne @marcan @nicolas17 there is a semi-public way to wrap an fd into a Mach port (fileport) that you can then send to another process via Mach IPC, and then unwrap to receive an fd in the new process (kind of like SCM_RIGHTS). But this is just that, an explicit wrapper.
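
For readers unfamiliar with the SCM_RIGHTS comparison, here is a minimal sketch of passing a file descriptor over a Unix socket in Python (3.9 or later). It illustrates only the SCM_RIGHTS side of the analogy, not the fileport API, and it runs both ends in one process for brevity; `pass_fd_roundtrip` is a hypothetical name.

```python
import os
import socket


def pass_fd_roundtrip(fd):
    """Send fd over a Unix socket pair via SCM_RIGHTS; return the received copy."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # One byte of ordinary payload carries the descriptor as ancillary data.
        socket.send_fds(left, [b"x"], [fd])
        _data, fds, _flags, _addr = socket.recv_fds(right, 1, 1)
        return fds[0]  # a new descriptor referring to the same open file
    finally:
        left.close()
        right.close()
```

In real use the two socket ends would live in different processes; the receiver gets its own descriptor number for the same underlying open file, much like the explicit wrap and unwrap steps described above.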

Nicolás Alvarez

@marcan I was thinking it could make the gap *bigger* due to macOS doing more fsync-cheating than Linux :P

Hector Martin

@nicolas17 We also cheat on Linux, we're not stupid, I'm not going to lose to them just because their NVMe firmware has abysmal flush performance :P

But our flush interval is way shorter so we cheat better. You can disable it with a module param if you really care (I can't imagine who would other than people running database servers on Apple Silicon on the internal NVMe?)

Edit: to be clear, the problem is *frequent* NVMe flushes suck on Apple controllers. A single flush at the end is negligible, it's when you have stuff like apt-get flushing on every file that we wound up with pathologically terribad performance on Linux. On Linux we throttle NVMe flushes to 1/sec; macOS just doesn't do them at all when you do a normal fsync, instead they have launchd doing systemwide flushes every 30 seconds or something silly like that, plus that secret nonstandard "no really, flush all the way" fcntl that nobody uses until they actually lose data and learn about it.
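
The difference between a normal fsync and the "flush all the way" fcntl can be sketched in Python. The post does not name the fcntl; this sketch assumes it is `F_FULLFSYNC`, which Python's `fcntl` module only defines on macOS. `durable_sync` is a hypothetical helper name.

```python
import fcntl
import os


def durable_sync(fd):
    """Flush fd as far down as the platform allows; return the method used."""
    full = getattr(fcntl, "F_FULLFSYNC", None)  # assumption: only present on macOS
    if full is not None:
        fcntl.fcntl(fd, full)   # ask the drive to flush its own cache as well
        return "F_FULLFSYNC"
    os.fsync(fd)                # elsewhere, plain fsync is the strongest portable call
    return "fsync"
```

On Linux this falls back to `os.fsync`; per the post, the Asahi NVMe driver then throttles the actual controller flush to about one per second unless that is disabled with the module parameter.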

James Just James

@marcan Interesting! Do you know if the same would hold true for battery life? My whole life I've dreamed of having a laptop that lasts a whole day of normal use, but I still only get ~4 hours on my ThinkPads. Is Linux still losing here? Thanks!

Cleo Menezes Jr. :verified:

I need to say it in Portuguese: "Perdendo na propria casa KKK" (losing in their own house, LOL)

Ondřej Surý

@marcan While I am not going to switch because of the other Apple devices, ease of use, and family integration, I think this is great stuff! I appreciate what you are doing, and I wish Apple would officially endorse the effort.

Bjornsdottirs

@marcan So Apple don't even know their own hardware. Amusing.

Bjornsdottirs

@marcan also are those both gtar or both bsdtar?

Haelwenn /элвэн/ :triskell:

@marcan Meanwhile on my T495 with a Ryzen 3500U that tarball takes ~40s to extract…
I guess that's thanks to M1 having great I/O throughput but still… WTF.

Only thing I could expect about MacOS is greater integration thanks to holistic design possibility but I wouldn't be so sure of it being that good in practice.

rain 🌦️

@marcan Yeah, this is completely unsurprising. In my experience macOS's technical implementation details are leagues behind Linux's.

Kevin Karhan :verified:

@marcan this reminds me how #Linux also was the first #OS to run on #Itanium and how the #GCC is even the best compiler for that architecture - even better than #Intel's own!

youtube.com/watch?v=3oxrybkd7M

But yeah, the only thing that would make @AsahiLinux even faster on #AppleSilicon would be if compiling stuff would be done entirely in RAM wherever possible, leveraging 10x->1000x more IOPS and lower latency.

Asahi Linux

@kkarhan @marcan Nothing stops you from using a tmpfs for your builds 😉

CaroCaronte

@marcan
ok but you're not doing (almost) any computation here; tar does not even compress anything. You're just reading and writing a set of files onto the disk, so your point is that ext4 is faster than APFS, which might be true. APFS is not known for speed but for reliability on SSDs, encryption, snapshot support, etc.

I personally like both mac and linux (and use windows at work) so I'm happy either way
#crossPlatformHappiness

Hector Martin

@caronte Okay, tried btrfs which has all those same features: 2.7 seconds. And this was on a machine under load. Still more than twice as fast.

Linux's VFS subsystem is significantly faster than macOS', that's a fact and anyone who has managed large git trees on either OS (and Windows for that matter) will tell you how much smoother it is under Linux ;)

Hector Martin

@caronte Let's try some more VFS benchmarks. This time with a hot cache, so filesystem shouldn't matter much. Still btrfs on the Linux side:

ls -alR linux-6.3.3 > /dev/null
Linux: 0.26 s, macOS: 1.00 s

Linux is 4 times faster at enumerating/stat()ing files.

time tar cf /dev/null linux-6.3.3
Linux: 0.55 s, macOS: 2.7 s

Linux is 5 times faster than macOS at reading a full Linux kernel tree from page cache.

Seriously, the difference is that huge. And I had to use a tmpfs on Linux instead of /dev/null since otherwise GNU tar is smart enough to optimize the actual data copy away.
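
A rough, portable way to repeat the hot-cache enumerate/stat comparison is to run the same script on both systems. This is a minimal sketch with hypothetical names (`walk_and_stat`, `hot_cache_time`); it includes Python interpreter overhead, so its absolute numbers are not comparable to the `ls -alR` figures above, only to the same script run on the other OS.

```python
import os
import time


def walk_and_stat(root):
    """lstat() every entry under root; return (entry_count, seconds)."""
    count = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count, time.perf_counter() - start


def hot_cache_time(root, runs=3):
    """Walk once to warm the cache, then return the best of `runs` timings."""
    walk_and_stat(root)
    return min(walk_and_stat(root)[1] for _ in range(runs))
```

Calling `hot_cache_time("linux-6.3.3")` on each system gives a like-for-like figure; taking the minimum of several runs reduces noise from other load on the machine.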

Janne Grunau

@caronte @marcan if apfs is made for reliability Apple seems to have failed at that. Evidence: the numerous users with silent file system corruption only discovered during resizing the apfs partition for an @AsahiLinux install
