philpax

@BartWronski I agree on Python making it easy for R&D; it's hard to argue with the results.

That being said, my primary interest is in deploying Stable Diffusion and supporting models to the desktop (especially as part of games) with minimal dependencies, and as far as I can tell this is still a pretty major headache.

I don't want my users to have to install Python/PyTorch/Conda/CUDA: I want it to Just Work™.

There's some interesting work happening here and there, though, like github.com/webonnx/wonnx

Greg

@philpax @BartWronski I don't know if this will solve the problem for you, but here's a Rust version of Stable Diffusion

github.com/LaurentMazare/diffu

It uses tch-rs, which uses "py"torch, but only the C++ part of it (no Python is involved). I've shipped binaries using tch-rs to other machines by just copying around a few `.so` files (the ones in the `/lib` of a PyTorch tarball), but not to consumers, so I can't speak to what pitfalls there might be for that.
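
For a sense of scale, the happy path with tch-rs is just an ordinary Rust program. A minimal sketch (written against the `tch` crate's public API; actual model loading is elided):

```rust
// Minimal tch-rs sketch: this links against libtorch's C++ libraries
// only; no Python interpreter is involved at build time or runtime.
use tch::{Device, Kind, Tensor};

fn main() {
    // Uses CUDA when a CUDA-enabled libtorch and driver are present,
    // otherwise falls back to the CPU.
    let device = Device::cuda_if_available();
    let x = Tensor::randn(&[2, 3], (Kind::Float, device));
    x.relu().print();
}
```

At runtime the binary just needs to find those libtorch `.so` files, e.g. via `LD_LIBRARY_PATH` or an embedded rpath.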

philpax

@gregmorenz @BartWronski Yeah, really excited about that! It's a huge step forward (no more Python in the deployment stack!)

Unfortunately, as you mention, it still requires you to ship Torch dependencies and/or ensure that the user has the correct version of CUDA.

Server-side deployment should be a lot simpler, but client-side deployment is still problematic :(

Bart Wronski 🇺🇦🇵🇸

@philpax @gregmorenz Stable Diffusion is IMO a research library/project, not a product.

People are making fantastic, simpler wrappers, but I still don't consider it a commercial product.

Any productization will require wrapping it up properly and packaging, like with anything.

I also don't think one can hope to get any good performance without CUDA (though it can be bundled, as with other software).

philpax

@BartWronski @gregmorenz Mm, perhaps - I suppose it really depends on your definition of "product" :)

It's already quite usable by end-users and the rate of development on it is out of this world, but by the same token, it's quite hard to package it up as a library.

For my purposes I'm just running an SD web UI with an API locally, but at some point I would like a library-like solution that can run on arbitrary GPUs. wonnx is the closest I've seen so far.
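
For reference, wonnx's API is roughly this shape. The sketch below is paraphrased from the project's README, so exact names and types may have drifted; `model.onnx` is a placeholder path and `pollster` is an assumed helper crate:

```rust
// Rough sketch of running an ONNX model through wonnx, which executes
// the graph as WGSL compute shaders via wgpu (no CUDA required).
use std::collections::HashMap;
use wonnx::Session;

async fn run() {
    // "model.onnx" is a placeholder; "x" must match the graph's
    // declared input name.
    let session = Session::from_path("model.onnx").await.unwrap();

    let data: Vec<f32> = vec![-1.0, 1.0];
    let mut inputs = HashMap::new();
    inputs.insert("x".to_string(), data.as_slice().into());

    let outputs = session.run(&inputs).await.unwrap();
    println!("{:?}", outputs);
}

fn main() {
    pollster::block_on(run());
}
```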

Raph Levien

@philpax @BartWronski @gregmorenz So in the course of trying to do research on GPU rendering of 2D graphics, I've inadvertently done a lot of research into portable GPU infrastructure. I believe it's possible, and not a massive amount of work, to build compute infra that would run workloads like SD on Metal + DX12 + Vulkan, with something like 1 MB of binary and no additional complex runtime requirements.

For some reason I haven't been able to figure out, nobody really seems to care.

Raph Levien

@philpax @BartWronski @gregmorenz So many of the pieces are in place, and I think it will happen, first by running on WebGPU, then optimizing from there. As @philpax says, wonnx looks pretty good, but being restricted to WGSL leaves a *lot* of performance on the table compared with what GPUs can do.
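
To make the "portable compute infra" concrete: one WGSL kernel dispatched through the wgpu crate targets Vulkan, Metal, and DX12 from the same binary. Below is a rough sketch written from memory against wgpu ~0.19 (descriptor fields shift between releases), with `bytemuck` and `pollster` assumed as helper crates:

```rust
use wgpu::util::DeviceExt;

// Toy kernel: double every element of a storage buffer.
const KERNEL: &str = r#"
@group(0) @binding(0) var<storage, read_write> data: array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
    data[id.x] = data[id.x] * 2.0;
}
"#;

async fn run() {
    // The same code path picks Vulkan, Metal, or DX12 per platform.
    let instance = wgpu::Instance::default();
    let adapter = instance
        .request_adapter(&wgpu::RequestAdapterOptions::default())
        .await
        .expect("no GPU adapter found");
    let (device, queue) = adapter
        .request_device(&wgpu::DeviceDescriptor::default(), None)
        .await
        .expect("failed to open device");

    // Upload 64 floats the kernel can read and write.
    let input: Vec<f32> = (0..64).map(|i| i as f32).collect();
    let storage = device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
        label: None,
        contents: bytemuck::cast_slice(&input),
        usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_SRC,
    });
    // CPU-visible buffer for reading the result back.
    let readback = device.create_buffer(&wgpu::BufferDescriptor {
        label: None,
        size: (input.len() * std::mem::size_of::<f32>()) as u64,
        usage: wgpu::BufferUsages::COPY_DST | wgpu::BufferUsages::MAP_READ,
        mapped_at_creation: false,
    });

    // WGSL is compiled by wgpu's own shader machinery (naga); no
    // platform shader toolchain has to ship with the app.
    let module = device.create_shader_module(wgpu::ShaderModuleDescriptor {
        label: None,
        source: wgpu::ShaderSource::Wgsl(KERNEL.into()),
    });
    let pipeline = device.create_compute_pipeline(&wgpu::ComputePipelineDescriptor {
        label: None,
        layout: None, // infer the bind group layout from the shader
        module: &module,
        entry_point: "main",
    });
    let bind_group = device.create_bind_group(&wgpu::BindGroupDescriptor {
        label: None,
        layout: &pipeline.get_bind_group_layout(0),
        entries: &[wgpu::BindGroupEntry {
            binding: 0,
            resource: storage.as_entire_binding(),
        }],
    });

    // Record one dispatch plus a copy into the CPU-visible buffer.
    let mut encoder = device.create_command_encoder(&Default::default());
    {
        let mut pass = encoder.begin_compute_pass(&Default::default());
        pass.set_pipeline(&pipeline);
        pass.set_bind_group(0, &bind_group, &[]);
        pass.dispatch_workgroups(1, 1, 1); // 1 workgroup x 64 threads
    }
    encoder.copy_buffer_to_buffer(&storage, 0, &readback, 0, readback.size());
    queue.submit(Some(encoder.finish()));

    // Map and print the doubled values.
    readback.slice(..).map_async(wgpu::MapMode::Read, |r| r.unwrap());
    device.poll(wgpu::Maintain::Wait);
    let view = readback.slice(..).get_mapped_range();
    let doubled: &[f32] = bytemuck::cast_slice(&view);
    println!("{:?}", &doubled[..8]);
}

fn main() {
    pollster::block_on(run());
}
```

The restriction Raph mentions is visible even in this toy: everything must be expressible in WGSL as it stands, with no access to vendor intrinsics.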

philpax

@raph @BartWronski @gregmorenz Yeah, there's definitely a lowest common denominator problem with wgpu, but I imagine it'll be "good enough" for the short to medium term.

In the future, I hope that an actual standard for this kind of ML acceleration is formulated, but it's not really in Team Green's best interests to facilitate that...

Raph Levien

@philpax @BartWronski @gregmorenz I agree. And more to the point, once you actually get it running, the open source community can incrementally optimize pieces of it until it runs pretty well. The missing piece (that seems to have very little community interest) is ahead-of-time compiled shaders, which would also let you do custom WGSL extensions.

Choong Ng

@raph @philpax @BartWronski @gregmorenz My experience is that this work needs commercial sponsorship of a particular shape. Research framework users generally will only use something free + open source, yet adequate support for a given piece of hardware is way beyond student or hobbyist work. I started a company that built a performance-portable deep learning framework and learned this and many lessons slowly 😀
