@TomF I will be happy to update and correct the talk. There *might* be a bit of an element of Cunningham's Law there.
To respond though, I don't think I need a coherent memory fabric. For the stuff I'm doing, I'm fairly happy using atomics to indicate explicit communication between workgroups.
Interesting correction re texture queries. From my perspective I don't see a huge difference between "CISC instruction" and "send a packet" but from yours I can see it's pretty different.
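For anyone following along, here is a rough sketch of the "atomics for explicit communication" pattern mentioned above. It is only an illustration under assumptions: CPU threads stand in for workgroups, and `Mailbox`, `producer`, `consumer`, and `run_handoff` are made-up names, not anything from the talk. On a real GPU you would also have to pick the right memory scope, and spinning on another workgroup depends on forward-progress guarantees that not every device gives you.

```cpp
#include <atomic>
#include <thread>

// Hypothetical illustration: CPU threads stand in for GPU workgroups.
struct Mailbox {
    int payload = 0;            // plain, non-atomic data being handed over
    std::atomic<int> ready{0};  // flag that marks the explicit hand-off
};

// Writer side: fill in the data, then publish it with a release store.
void producer(Mailbox& m, int value) {
    m.payload = value;
    m.ready.store(1, std::memory_order_release);
}

// Reader side: wait on the flag with acquire loads, then read the data.
// The release/acquire pair is what makes the payload write visible here.
int consumer(Mailbox& m) {
    while (m.ready.load(std::memory_order_acquire) == 0) {
        std::this_thread::yield();
    }
    return m.payload;
}

// Runs one hand-off between two threads and returns what the reader saw.
int run_handoff(int value) {
    Mailbox m;
    int seen = -1;
    std::thread reader([&] { seen = consumer(m); });
    std::thread writer([&] { producer(m, value); });
    writer.join();
    reader.join();
    return seen;
}
```

The point of the sketch is that only the flag needs atomic semantics; everything else is ordinary memory that becomes visible because of the ordering on that one location, with no fabric-wide coherence assumed for the rest of the traffic.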
@raph Yup - unless you literally need the machine to run an off-the-shelf OS with very few changes, you clearly want to be able to bypass the coherent fabric for all sorts of traffic. It burns a lot of power and limits your bandwidth.
We had lots of plans for turning it off for certain areas of memory and making the traffic look more like a GPU's, but we never got the chance to implement them. Ah well.