For speed-critical loops in #uxntal, consider this pattern:

Use the return-stack to juggle the items needed inside the loop, and if you know how many times the loop needs to run, flatten your boundaries to a single byte(so 0..0x10, becomes 0xf0).