David Neto

@raph it would be awesome to have a bunch of mostly adaptable demos of key algorithms.
Matrix multiply seems important.
And performance tuning across devices.

There is a nifty little project that does a few things like this for Vulkan compute.

github.com/google/uVkCompute

And a nice demo of NVIDIA's cooperative matrix Vulkan extension.

github.com/jeffbolznv/vk_coope

Raph Levien

@dneto Ah, uVkCompute looks good, I agree an analog of that for WebGPU would be great.

I've thought seriously about doing the prefix sum part of that (and dipped my toe into it in the piet-gpu days), and could possibly be cajoled if someone else would run the project.
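(For readers who haven't met the primitive: a minimal sequential sketch of an exclusive prefix sum, my illustration rather than anything from piet-gpu. A GPU scan computes the same result, but partitions the input across workgroups and combines partial sums in parallel.)

```rust
// Exclusive prefix sum (scan): each output element is the sum of all
// input elements strictly before it. This sequential version defines
// the result a parallel GPU scan must reproduce.
fn exclusive_prefix_sum(input: &[u32]) -> Vec<u32> {
    let mut out = Vec::with_capacity(input.len());
    let mut running = 0u32;
    for &x in input {
        out.push(running); // sum of everything before this element
        running += x;
    }
    out
}

fn main() {
    let data = [3, 1, 4, 1, 5];
    println!("{:?}", exclusive_prefix_sum(&data)); // prints [0, 3, 4, 8, 9]
}
```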

Now I'm reading up on the sort literature, and it's a pretty deep rabbit hole. On CUDA, Onesweep looks very good, but I might be finding out that, for this algorithm, the gap between CUDA and WebGPU is like a yawning chasm.
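(As background for the sorting discussion: Onesweep is a GPU least-significant-digit radix sort. Here is a sequential sketch, my illustration rather than Onesweep itself, of one counting-sort pass over an 8-bit digit; Onesweep's contribution is fusing the histogram and scatter phases on the GPU with decoupled look-back, but the per-digit logic is the same.)

```rust
// One LSD radix-sort pass: stably reorder u32 keys by the 8-bit digit
// starting at bit `shift`, using a counting sort.
fn radix_pass(keys: &[u32], shift: u32) -> Vec<u32> {
    // Histogram: count how many keys fall in each of the 256 buckets.
    let mut counts = [0usize; 256];
    for &k in keys {
        counts[((k >> shift) & 0xff) as usize] += 1;
    }
    // Exclusive prefix sum over the histogram gives each bucket's
    // starting offset in the output (scan and sort are linked this way).
    let mut offsets = [0usize; 256];
    let mut running = 0;
    for d in 0..256 {
        offsets[d] = running;
        running += counts[d];
    }
    // Stable scatter: keys go to their bucket in input order.
    let mut out = vec![0u32; keys.len()];
    for &k in keys {
        let d = ((k >> shift) & 0xff) as usize;
        out[offsets[d]] = k;
        offsets[d] += 1;
    }
    out
}

fn main() {
    // Four passes (shifts 0, 8, 16, 24) fully sort u32 keys.
    let mut keys = vec![0x0201u32, 0x0102, 0x0003, 0x0301];
    for shift in [0, 8, 16, 24] {
        keys = radix_pass(&keys, shift);
    }
    println!("{:x?}", keys); // prints [3, 102, 201, 301]
}
```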
