@simon Yeah, I've used the `doctest.testfile` approach for longer tests *and* to reduce complaints about doctests "cluttering" the main source code. (To be fair, moving them out also lowers the friction against including much more thorough tests, so it's a net win.)
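For anyone who hasn't tried it, here's a minimal sketch of what that looks like. The file name `example.txt` and its contents are made up for illustration, and it's written to a temp directory only so the snippet is self-contained; in a real project the text file would be checked in next to the tests.

```python
import doctest
import tempfile
from pathlib import Path

# Hypothetical stand-alone test document. The prose is ignored by doctest;
# only the ">>>" examples and their expected output are checked.
DOC = '''\
Prose explaining the example goes here, like in a tutorial.

    >>> "Hello World".lower().replace(" ", "-")
    'hello-world'

Longer walkthroughs can continue with more prose and more examples.

    >>> sorted({"b": 2, "a": 1})
    ['a', 'b']
'''


def run_doc(text):
    """Write the document to a temporary file and run it with doctest.testfile."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.txt"
        path.write_text(text, encoding="utf-8")
        # module_relative=False makes doctest treat the name as a plain OS path
        # instead of a path relative to the calling module.
        return doctest.testfile(str(path), module_relative=False)


results = run_doc(DOC)
print(results.failed, "failed out of", results.attempted)
```

The main source file stays free of long examples, and the text file can be as thorough as you like.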
There's also an "adjacent" tool, cram (https://bitheap.org/cram/) - written in Python, but it processes the Markdown-style literal blocks as shell commands - great for CLI testing, especially in worked-example form.