@rrwo @collinsworth

Most of the time, tech that scales to a million-node cluster takes a lot more than a few seconds to run and burns a substantial number of CPU cycles just managing itself.

I used to use K8s, but for small-scale workloads it was just not the right tool. And the distribution I used (k3s) had quite a few subtle bugs that took hours to debug.

My current lean setup, based on Ansible and a mix of Docker and bare metal, is much easier to manage and a lot faster.