@vertigo @millihertz Regarding that average basic-block size: I had an interesting (at least to me) solution for this use case of parsing (which probably brings the average down), though I'm not sure how well it generalises.
What if we split the processor in two, so one half executes machine code that's near-entirely branches, thus relying mainly on code density? And the other half primarily deals in straight-line code?
I saw a parser generator which included a tight-loop interpreter for such a machine.
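To make the split concrete, here is a minimal sketch of what such a tight-loop interpreter might look like. It is not the parser generator mentioned above; every name in it (`Branch`, `run`, `PROGRAM`, `ACTIONS`, the instruction layout) is made up for illustration. The "branch half" is a dense table in which every instruction is a test plus two jump targets, and the "straight-line half" is a separate list of branch-free actions that the table refers to by index. The example grammar, a comma-separated list of integers, is also just an assumption.

```python
from typing import Callable, List, NamedTuple, Optional

# Hypothetical halt targets for the branch machine.
ACCEPT, REJECT = -1, -2


class Branch(NamedTuple):
    """One instruction of the branch-only half: a test and two jump targets."""
    test: Callable[[str], bool]  # condition on the current input character
    on_match: int                # next instruction if the test passes
    on_fail: int                 # next instruction if the test fails
    action: Optional[int]        # index of a straight-line action run on match


# Straight-line half: branch-free actions, referenced by index.
def push_digit(state: dict, ch: str) -> None:
    state["acc"] = state["acc"] * 10 + int(ch)


def emit(state: dict, ch: str) -> None:
    state["out"].append(state["acc"])
    state["acc"] = 0


ACTIONS = [push_digit, emit]


def is_digit(ch: str) -> bool:
    return ch.isdigit()


def is_comma(ch: str) -> bool:
    return ch == ","


def is_end(ch: str) -> bool:
    return ch == ""  # the empty string stands for end of input


# Branch half for the grammar: digit+ ("," digit+)* end
PROGRAM = [
    Branch(is_digit, 1, REJECT, 0),     # 0: a number needs at least one digit
    Branch(is_digit, 1, 2, 0),          # 1: keep consuming digits
    Branch(is_comma, 0, 3, 1),          # 2: comma ends a number, start another
    Branch(is_end, ACCEPT, REJECT, 1),  # 3: end of input ends the last number
]


def run(program: List[Branch], actions: list, text: str) -> Optional[List[int]]:
    """Tight loop: fetch a branch, test the current character, jump."""
    state = {"acc": 0, "out": []}
    pc, pos = 0, 0
    while pc >= 0:
        ins = program[pc]
        ch = text[pos] if pos < len(text) else ""
        if ins.test(ch):
            if ins.action is not None:
                actions[ins.action](state, ch)
            if ch:
                pos += 1  # a matched character is consumed; end of input is not
            pc = ins.on_match
        else:
            pc = ins.on_fail
    return state["out"] if pc == ACCEPT else None


print(run(PROGRAM, ACTIONS, "12,7,305"))  # [12, 7, 305]
print(run(PROGRAM, ACTIONS, "12,,7"))     # None
```

The point of the layout is that `PROGRAM` contains nothing but branches, so it stays small, while all the actual work sits in `ACTIONS` as code with no control flow of its own.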
@vertigo @millihertz In other words: yes, my hypothetical did rely on a code cache.
Even if I toyed with an alternate way of handling it!