philpax

@BartWronski I agree on Python making it easy for R&D; it's hard to argue with the results.

That being said, my primary interest is in deploying Stable Diffusion and supporting models to the desktop (especially as part of games) with minimal dependencies, and as far as I can tell this is still a pretty major headache.

I don't want my users to have to install Python/PyTorch/Conda/CUDA: I want it to Just Work™.

There's some interesting work happening here and there, though, like github.com/webonnx/wonnx

Greg

@philpax @BartWronski I don't know if this will solve the problem for you, but here's a Rust version of Stable Diffusion

github.com/LaurentMazare/diffu

It uses tch-rs, which uses "py"torch, but only the C++ part of it (no Python is involved). I've shipped binaries using tch-rs to other machines by just copying around a few `.so` files (the ones in the `/lib` of a PyTorch tarball), but not to consumers, so I can't speak to what pitfalls there might be for that.
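
For a sense of scale, the happy path with tch-rs is just an ordinary Rust program. A minimal sketch (written against the `tch` crate's public API; actual model loading is elided):

```rust
// Minimal tch-rs sketch: this links against libtorch's C++ libraries
// only; no Python interpreter is involved at build time or runtime.
use tch::{Device, Kind, Tensor};

fn main() {
    // Uses CUDA when a CUDA-enabled libtorch and driver are present,
    // otherwise falls back to the CPU.
    let device = Device::cuda_if_available();
    let x = Tensor::randn(&[2, 3], (Kind::Float, device));
    x.relu().print();
}
```

At runtime the binary just needs to find those libtorch `.so` files, e.g. via `LD_LIBRARY_PATH` or an embedded rpath.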

philpax

@gregmorenz @BartWronski Yeah, really excited about that! It's a huge step forward (no more Python in the deployment stack!)

Unfortunately, as you mention, it still requires you to ship Torch dependencies and/or ensure that the user has the correct version of CUDA.

Server-side deployment should be a lot simpler, but client-side deployment is still problematic :(

Bart Wronski 🇺🇦🇵🇸

@philpax @gregmorenz Stable Diffusion is IMO a research library/project, not a product.

People are making fantastic, simpler wrappers, but I still don't consider it a commercial product.

Any productization will require wrapping it up properly and packaging, like with anything.

I also don't think one can hope to get any good performance without CUDA (though it can be bundled, as with other software).

philpax

@BartWronski @gregmorenz Mm, perhaps - I suppose it really depends on your definition of "product" :)

It's already quite usable by end-users and the rate of development on it is out of this world, but by the same token, it's quite hard to package it up as a library.

For my purposes I'm just running an SD web UI with an API locally, but at some point I would like a library-like solution that can run on arbitrary GPUs. wonnx is the closest I've seen so far.
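
For reference, wonnx's API is roughly this shape. The sketch below is paraphrased from the project's README, so exact names and types may have drifted; `model.onnx` is a placeholder path and `pollster` is an assumed helper crate:

```rust
// Rough sketch of running an ONNX model through wonnx, which executes
// the graph as WGSL compute shaders via wgpu (no CUDA required).
use std::collections::HashMap;
use wonnx::Session;

async fn run() {
    // "model.onnx" is a placeholder; "x" must match the graph's
    // declared input name.
    let session = Session::from_path("model.onnx").await.unwrap();

    let data: Vec<f32> = vec![-1.0, 1.0];
    let mut inputs = HashMap::new();
    inputs.insert("x".to_string(), data.as_slice().into());

    let outputs = session.run(&inputs).await.unwrap();
    println!("{:?}", outputs);
}

fn main() {
    pollster::block_on(run());
}
```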

Raph Levien

@philpax @BartWronski @gregmorenz So in the course of trying to do research on GPU rendering of 2D graphics, I've inadvertently done a lot of research into portable GPU infrastructure. I believe it's possible, and not a massive amount of work, to build compute infra that would run workloads like SD on Metal + DX12 + Vulkan, with something like 1 MB of binary and no additional complex runtime requirements.

For some reason I haven't been able to figure out, nobody really seems to care.

Raph Levien

@philpax @BartWronski @gregmorenz So many of the pieces are in place, and I think it will happen, first by running on WebGPU, then optimizing from there. As @philpax says, wonnx looks pretty good, but being restricted to WGSL leaves a *lot* of performance on the table compared with what GPUs can do.
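
To make the "portable compute infra" concrete: one WGSL kernel dispatched through the wgpu crate targets Vulkan, Metal, and DX12 from the same binary. Below is a rough sketch written from memory against wgpu ~0.19 (descriptor fields shift between releases), with `bytemuck` and `pollster` assumed as helper crates:

```rust
use wgpu::util::DeviceExt;

// Toy kernel: double every element of a storage buffer.
const KERNEL: &str = r#"
@group(0) @binding(0) var<storage, read_write> data: array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
    data[id.x] = data[id.x] * 2.0;
}
"#;

async fn run() {
    // The same code path picks Vulkan, Metal, or DX12 per platform.
    let instance = wgpu::Instance::default();
    let adapter = instance
        .request_adapter(&wgpu::RequestAdapterOptions::default())
        .await
        .expect("no GPU adapter found");
    let (device, queue) = adapter
        .request_device(&wgpu::DeviceDescriptor::default(), None)
        .await
        .expect("failed to open device");

    // Upload 64 floats the kernel can read and write.
    let input: Vec<f32> = (0..64).map(|i| i as f32).collect();
    let storage = device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
        label: None,
        contents: bytemuck::cast_slice(&input),
        usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_SRC,
    });
    // CPU-visible buffer for reading the result back.
    let readback = device.create_buffer(&wgpu::BufferDescriptor {
        label: None,
        size: (input.len() * std::mem::size_of::<f32>()) as u64,
        usage: wgpu::BufferUsages::COPY_DST | wgpu::BufferUsages::MAP_READ,
        mapped_at_creation: false,
    });

    // WGSL is compiled by wgpu's own shader machinery (naga); no
    // platform shader toolchain has to ship with the app.
    let module = device.create_shader_module(wgpu::ShaderModuleDescriptor {
        label: None,
        source: wgpu::ShaderSource::Wgsl(KERNEL.into()),
    });
    let pipeline = device.create_compute_pipeline(&wgpu::ComputePipelineDescriptor {
        label: None,
        layout: None, // infer the bind group layout from the shader
        module: &module,
        entry_point: "main",
    });
    let bind_group = device.create_bind_group(&wgpu::BindGroupDescriptor {
        label: None,
        layout: &pipeline.get_bind_group_layout(0),
        entries: &[wgpu::BindGroupEntry {
            binding: 0,
            resource: storage.as_entire_binding(),
        }],
    });

    // Record one dispatch plus a copy into the CPU-visible buffer.
    let mut encoder = device.create_command_encoder(&Default::default());
    {
        let mut pass = encoder.begin_compute_pass(&Default::default());
        pass.set_pipeline(&pipeline);
        pass.set_bind_group(0, &bind_group, &[]);
        pass.dispatch_workgroups(1, 1, 1); // 1 workgroup x 64 threads
    }
    encoder.copy_buffer_to_buffer(&storage, 0, &readback, 0, readback.size());
    queue.submit(Some(encoder.finish()));

    // Map and print the doubled values.
    readback.slice(..).map_async(wgpu::MapMode::Read, |r| r.unwrap());
    device.poll(wgpu::Maintain::Wait);
    let view = readback.slice(..).get_mapped_range();
    let doubled: &[f32] = bytemuck::cast_slice(&view);
    println!("{:?}", &doubled[..8]);
}

fn main() {
    pollster::block_on(run());
}
```

The restriction Raph mentions is visible even in this toy: everything must be expressible in WGSL as it stands, with no access to vendor intrinsics.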

philpax

@raph @BartWronski @gregmorenz Yeah, there's definitely a lowest common denominator problem with wgpu, but I imagine it'll be "good enough" for the short to medium term.

In the future, I hope that an actual standard for this kind of ML acceleration is formulated, but it's not really in Team Green's best interests to facilitate that...

Raph Levien

@philpax @BartWronski @gregmorenz I agree. And more to the point, once you actually get it running, the open source community can incrementally optimize pieces of it until it runs pretty well. The missing piece (that seems to have very little community interest) is ahead-of-time compiled shaders, which would also let you do custom WGSL extensions.

Choong Ng

@raph @philpax @BartWronski @gregmorenz My experience is that this work needs commercial sponsorship of a particular shape. Research framework users generally will only use something free + open source, yet adequate support for a given piece of hardware is way beyond student or hobbyist work. I started a company that built a performance-portable deep learning framework and learned this and many lessons slowly 😀
