@mhoye (This is not particularly novel: I remember reading a similar paper about using gzip to de-anonymize written works, back around when I was learning about singular value decomposition for vector-space document search from Maciej Ceglowski, I think in the early 2000s. But it definitely is novel that this old technique is now outperforming current tech darlings.)
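
For anyone who hasn't seen the gzip trick, here is a rough sketch of the general idea, not the method from the paper I'm half-remembering. It assumes normalized compression distance (NCD) as the similarity measure and uses Python's `zlib` (the same DEFLATE algorithm gzip uses). The names `ncd` and `closest_author` are made up for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the DEFLATE-compressed data, at maximum compression."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: lower means the texts share more structure."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def closest_author(unknown: bytes, known: dict[str, bytes]) -> str:
    """Return the name whose known writing compresses best alongside the unknown text.

    Hypothetical helper: `known` maps an author name to a sample of their writing.
    """
    return min(known, key=lambda name: ncd(unknown, known[name]))
```

The intuition: if the compressor has already seen one author's habits in the known sample, an unknown text by the same author costs fewer extra bytes to encode. Real attribution work needs much larger samples and more care than this.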
@mhoye that prediction ≈ compression is also the basis of the Hutter Prize
https://en.m.wikipedia.org/wiki/Hutter_Prize