Basically, I feel it furthers fragmentation of the ecosystem, and fails to make progress toward a solid foundation for GPU compute. I also make the case for #WGSL and #WebGPU by comparison.
I think these are very promising ideas, and the main reason I'm putting this talk out now is to try to find open source collaborators to develop them further. But it is early days, and this is not yet a usable stack.
It's still rough work in progress, but I'm excited about the potential. And I'm really looking forward to talking with people. It won't be livestreamed, but there will be a recording.
@raph I was really excited about Swift4TF and Mojo looks like it incorporates the major lessons from that effort. Modular is well-funded enough that I expect it to be able to carry maintenance of Mojo until it starts to get traction.
I'm proud of this work. I think it's the best implementation in either the academic literature or shipping products, and am looking forward to continuing it. I'm also interested to see what applications people will find (the core ideas need tuning depending on the use case), and am also open to collaborations along these lines.
This points to the latest kurbo code (still in a PR) and also explains what was going on with the bumps, with a reasonably good solution. I believe this is now appropriate for use with fonts and similar applications. Feedback on the draft is welcome!
I've done some work in the past few weeks to land high quality Bézier path simplification in kurbo and make it more robust. It's not completely done yet (explicit corner handling is a todo), but it does a pretty good job with one of our test cases. Here it has fit the source curve (blue) with 41 Bézier segments. I believe it is close to the optimum possible solution for the error tolerance. New PR: https://github.com/linebender/kurbo/pull/269
This will be useful both for fonts and design tools for vector graphics.
@raph I'm just the peanut gallery here but I have to ask... what's going on at these two points, on the lower edge? normally I'd expect the simplification to be smoother, but these seem to be surprisingly sharp turns on a smooth section of the input curve? I had to double check, are the source and output curves swapped? (just asking because they jumped out, incredibly impressed with this work)
There is no exact translation of the word "petty" into German, but if I were to explain its meaning to someone, I would simply show them this article. Good one Raph ^ ^
I will be showing off Vello at the WebGL and #WebGPU meetup on March 22 in San Francisco. I'm excited, I think it'll be a chance to show it to a wider audience and meet a bunch of people. Now I just need to work on getting a compelling demo together.
Whew! Getting the last of the edge-crossing logic was hard, but I finally have my multisampled path rendering in a working state. The numerical robustness is not perfect yet, but I suspect that will involve relatively minor tweaks.
Performance is not quite as good as the area-based approach, but still very fast, and I have ideas how to address that.
Now I have to figure out how to write all this up!
It's not ready to check into main yet for a variety of reasons, one of which is that it only applies to fills, not strokes. The plan for that is to do stroke to fill conversion in a compute shader, but to say that's difficult is an understatement. I do seem to specialize in such things, though.
My research retreat was very productive. I didn't quite finish implementing sparse multisampled path rendering on GPU, but I got close (hopefully soon). I ran into a performance snag but have a good idea how to fix it.
But the real fruit is probably ideas for even higher quality path rendering - a hybrid of analytical area AA and multisampling - and also compositing. Expect to hear more before long.
@raph How much do you think people care about higher quality antialiasing? Skia looks like it is going all in on MSAA in Graphite, and that kind of shook my confidence that anything better is really needed.
I'm excited, I'm about to go on a week-long research retreat in a lakefront cottage. During the week, I'll focus on an idea for multisampled path rendering, and be able to tune out all other distractions. I've been craving doing something like this for a while, and the last time I did it was really great.
I hope to have an update about what I do during the week.
Anyone in driving range of Berkeley want a Dell P2415Q 4k monitor which may or may not be flaky? It wasn't turning on earlier today, so I got a replacement, but as Murphy's law would have it, it seemed to light when I plugged it in when I got back. I'll list on craigslist if I don't get a response here.
I just booked a cottage on a lake for a week-long research retreat Feb 13-18. I've done this kind of thing a couple times before and am hoping to make it a regular practice. The idea is to fully immerse into one deep topic, without the usual distractions and gear-switching.
This time it'll *probably* be path rendering without conflation artifacts, but there are a couple other deep topics I'd love to address. Attached is an image from that inquiry: how to parallelize line coverage on a grid.
@raph Hi Raph, I watched your great talk about Xilem, very interesting stuff, looking forward to seeing where it will go. I am currently building a 2D sprite editor in Rust, so this problem you posted is something I had to deal with recently (but without the parallelism). If you calculate the length of the line, you can find the intermediate points and give each pair to a thread/task, would that work?
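To make the suggestion in this reply concrete, here is a minimal sketch in Rust, assuming a plain `(f64, f64)` point type and standard-library scoped threads. The names `split_segment` and `process_in_parallel` are hypothetical, and the per-piece work (just measuring length) is a stand-in for real coverage computation, not anything from Vello.

```rust
use std::thread;

// Hypothetical point type for illustration only.
type Point = (f64, f64);

/// Linear interpolation; exact at t = 0.0 and t = 1.0.
fn lerp(p0: Point, p1: Point, t: f64) -> Point {
    ((1.0 - t) * p0.0 + t * p1.0, (1.0 - t) * p0.1 + t * p1.1)
}

/// Split the segment p0..p1 into `n` equal pieces, returned as endpoint pairs.
fn split_segment(p0: Point, p1: Point, n: usize) -> Vec<(Point, Point)> {
    assert!(n > 0);
    let pts: Vec<Point> = (0..=n)
        .map(|i| lerp(p0, p1, i as f64 / n as f64))
        .collect();
    pts.windows(2).map(|w| (w[0], w[1])).collect()
}

/// Hand each piece to its own scoped thread. The work here is a placeholder
/// (the piece's length); real coverage code would go in the closure.
fn process_in_parallel(pieces: &[(Point, Point)]) -> Vec<f64> {
    thread::scope(|s| {
        let handles: Vec<_> = pieces
            .iter()
            .map(|&(a, b)| {
                s.spawn(move || ((b.0 - a.0).powi(2) + (b.1 - a.1).powi(2)).sqrt())
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

One caveat with this split: equal-length pieces do not generally end on pixel boundaries, so a pixel that straddles two pieces would receive contributions from two threads and those would have to be combined somehow.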
@raph You might want to take a look at how forma currently does it: it reduces the problem to finding the ith term in the sorted union of two arithmetic progressions ax + c and bx + d, which can be solved in O(1) for some given level of accuracy.
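For readers unfamiliar with the reduction: the parameter values at which a line crosses vertical grid lines form one arithmetic progression, and the crossings of horizontal grid lines form another, so walking the pixels in order means walking the sorted union of the two. The sketch below is not forma's method and is not O(1). It is a simple binary search over values, assuming integer progressions with positive steps so the arithmetic is exact. The function names are hypothetical.

```rust
/// Number of terms of `step * k + start` (k = 0, 1, 2, ...) that are <= v.
/// Assumes step > 0.
fn count_le(step: i64, start: i64, v: i64) -> i64 {
    if v < start {
        0
    } else {
        (v - start) / step + 1
    }
}

/// The i-th term (0-indexed) of the sorted union, with duplicates kept, of
/// the progressions a*k + c and b*k + d for k >= 0.
/// Binary search on the value: O(log(range)), not the O(1) described above.
fn ith_of_union(a: i64, c: i64, b: i64, d: i64, i: i64) -> i64 {
    assert!(a > 0 && b > 0 && i >= 0);
    // The answer is at least the smallest first term...
    let mut lo = c.min(d);
    // ...and at most the i-th term of either progression taken alone.
    let mut hi = (a * i + c).min(b * i + d);
    // Find the smallest v such that at least i + 1 terms are <= v.
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if count_le(a, c, mid) + count_le(b, d, mid) >= i + 1 {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    lo
}
```

For example, with 3k (0, 3, 6, 9, ...) and 5k + 1 (1, 6, 11, ...), the union begins 0, 1, 3, 6, 6, 9, 11, 12. Because each index can be evaluated independently of the others, every thread can compute its own crossing without a sequential walk, which is what makes the formulation attractive for parallel work.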
Today's work on Vello was of the grungy, infrastructural nature. But that's just as important as the high-flying fancy parallel algorithms. The web port works now on main, which is actually a big deal: it uses #WGSL workgroupUniformLoad to avoid uniformity problems, which has now landed in Chrome Canary.
Also, clear_buffer was a no-op TODO in the wasm bindings, causing corrupt atomic buffers. Now fixed in wgpu main.
Now that the tree is in good shape, on to the fun stuff again!
Over the past week or so, I've been doing a pretty deep dive into GPU synchronization and also its relation to frame pacing. These are fairly arcane topics, though it's well understood that getting synchronization right is essential for modern APIs such as Vulkan.
I believe we have too much legacy from the single-threaded days and it's time to rethink it in terms of modern async.
If you were laid off at Google or otherwise impacted, please feel free to reach out to me to chat. I know several excellent people who lost their jobs on Friday.
Now is also a good time to plug the Alphabet Workers Union. I've somewhat quietly been a member for exactly a year (tomorrow is my anniversary), but going forward I will be less quiet about it. Your job does not love you back, but your union will look out for you.
@raph Raph, what do you think about these layoffs across the tech industry, by corporations that have not suffered a decline in profits?
As a historian, it looks to me like coordination to reduce the bargaining power of high-paid engineers, aka the rich taking the smart down a few notches.
Perfectly legal?
Daniel McNab tracked down a missing barrier in the pathtag scan shader, which was causing artifacts on Vulkan (but, as these things go, not affecting mac). The same day, Max finally got around to installing the RTX 4080 (which was a Christmas present). That means I'm now able to run Vello and profile it in Nsight.
The results are jaw-dropping. Not just butter-smooth zoom and pan of paris-30k at 120fps at 4k, but Nsight reports <1.4ms compute time to render.