@Preuk Interesting modalities. Automatic collection, but no public access. That is a way to protect yourself from lawsuits, for sure. I wonder how they decide on “domiciled in France” – by top-level domain? I used to be a French citizen until about 2007. Did they archive my pages? Did they remove the archives? I never used the .fr top-level domain. Also, almost none of my pages are in French. But that doesn’t seem to be a requirement. I wonder if Switzerland has a similar program.
@alex here is the dataset
https://api.bnf.fr/index.php/fr/liste-des-adresses-url-des-collectes-ciblees-du-web-francais-par-la-bnf