@raph it would be awesome to have a bunch of mostly adaptable demos of key algorithms.
Matrix multiply seems important.
And performance tuning across devices.
There is a nifty little project that does a few things like this for Vulkan compute.
https://github.com/google/uVkCompute
And a nice demo of NVIDIA's cooperative matrix Vulkan extension.
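To make the matmul idea concrete, here's a toy CPU sketch of the tiling pattern a WebGPU matrix-multiply demo would typically use: each "workgroup" computes one TILE x TILE block of C, marching tiles of A and B along the k dimension (the names and tile size here are illustrative, not from uVkCompute or any existing codebase).

```python
TILE = 2  # illustrative tile size; real shaders pick this per-device

def matmul_tiled(a, b):
    """Tiled matrix multiply for square n x n matrices, n divisible by TILE.
    One (bi, bj) pair plays the role of one GPU workgroup."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for bi in range(0, n, TILE):
        for bj in range(0, n, TILE):
            for bk in range(0, n, TILE):  # march tiles along k
                # On a GPU these inner tiles would live in workgroup
                # (shared) memory; here we just index directly.
                for i in range(TILE):
                    for j in range(TILE):
                        acc = 0.0
                        for k in range(TILE):
                            acc += a[bi + i][bk + k] * b[bk + k][bj + j]
                        c[bi + i][bj + j] += acc
    return c

print(matmul_tiled([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

A real demo would then vary TILE and workgroup size per device, which is where the cross-device performance-tuning angle comes in.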
@dneto Ah, uVkCompute looks good, I agree an analog of that for WebGPU would be great.
I've thought seriously about doing the prefix sum part of that (and dipped my toe into it in the piet-gpu days), and could possibly be cajoled if someone else would run the project.
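For context, the workgroup-level building block of a prefix-sum demo is a barrier-synchronized scan; a minimal CPU sketch of the Hillis-Steele variant (names illustrative, not from piet-gpu):

```python
def inclusive_scan(data):
    """Hillis-Steele inclusive scan: log2(n) passes, each pass adding in
    the element 2^pass slots to the left. Each pass mirrors one
    barrier-separated step inside a GPU workgroup."""
    buf = list(data)
    n = len(buf)
    offset = 1
    while offset < n:
        nxt = buf[:]  # double-buffer, as shaders do to avoid read/write races
        for i in range(offset, n):
            nxt[i] = buf[i] + buf[i - offset]
        buf = nxt
        offset *= 2
    return buf

print(inclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
# -> [3, 4, 11, 11, 15, 16, 22, 25]
```

The hard part of a full demo is stitching workgroup scans into a global scan (e.g. decoupled look-back), which is where WebGPU's memory-model constraints start to bite.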
Now I'm reading up on the sort literature, and it's a pretty deep rabbit hole. On CUDA, Onesweep looks very good, but I might be finding out that, for this algorithm, the gap between CUDA and WebGPU is a yawning chasm.
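For anyone following along, the digit-binning core that Onesweep builds on is an ordinary LSD radix-sort pass: histogram, exclusive prefix sum of bucket counts, stable scatter. A CPU sketch (illustrative names; Onesweep's actual contribution is fusing the global scan into a single chained pass, which is the part that's hard to express in WebGPU):

```python
def radix_sort(keys, bits=32, radix_bits=8):
    """LSD radix sort of non-negative ints, one digit per pass."""
    buckets = 1 << radix_bits
    for shift in range(0, bits, radix_bits):
        # 1. Histogram of the current digit.
        counts = [0] * buckets
        for k in keys:
            counts[(k >> shift) & (buckets - 1)] += 1
        # 2. Exclusive prefix sum gives each bucket's base offset --
        #    this is the step a GPU version needs a global scan for.
        offsets = [0] * buckets
        total = 0
        for d in range(buckets):
            offsets[d], total = total, total + counts[d]
        # 3. Stable scatter into the output buffer.
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & (buckets - 1)
            out[offsets[d]] = k
            offsets[d] += 1
        keys = out
    return keys

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# -> [2, 24, 45, 66, 75, 90, 170, 802]
```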