Raph Levien

@dneto I realize what I posted here isn't very clear about exactly what I'm going for. There's more detail on my current thinking in a Zulip thread: xi.zulipchat.com/#narrow/strea

3 comments
David Neto

@raph
This made me think of the parallel kernels connected with real channels that Altera made about 10 years ago. It's in their OpenCL FPGA optimization guide. I don't recall whether the channel operations synchronized global memory writes; my hunch is they don't/didn't.

Raph Levien

@dneto I'm interested in prior art, so pointers are welcome (I'll look into this). There are also CUDA streams, which are maybe the closest existing thing, though I haven't yet carefully studied the alternatives in the CUDA world.

David Neto

@raph
Intel presented some overviews of this at IWOCL.

E.g.

iwocl.org/wp-content/uploads/i

The CNN work was published more formally too.
