@ShadowJonathan can you tell me more about how you know who is doing the scraping? I'm curious how you detected that it was happening
@ShadowJonathan @ikanreed could you share the list of the common user agents used? this is gonna be really useful for blocking purposes
@ShadowJonathan there might be another explanation for the IPs. My understanding of the Great Firewall is that it uses proxies for all requests to servers outside of China (for censorship purposes), and there's probably a finite number of IP addresses associated with those proxies. That could produce the same pattern of varied user agents coming from one IP. Still doesn't explain the weird API requests part.
@ikanreed i looked through my logs and saw a lot of "QQDownload" and "TencentTraveler" user agents. i found it cute and interesting, but when i did a zgrep on my logs to see what kind of traffic patterns they had, i saw they hit *only* this API, which was *very* suspicious to me
when i started looking into the IPs involved, i saw they used other user agents as well, which strongly suggests user agent spoofing. no normal browser, client, or scraper would have these kinds of traffic patterns
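for anyone who wants to run a similar check on their own logs, here is a rough Python sketch of the two steps described above: find IPs that only ever hit one API, then see whether those same IPs rotate through several user agents. This is not the exact method used in this thread. It assumes the common "combined" access log format, and `open_log`, `parse_lines`, `suspicious_ips`, and the API prefix in the usage note are all hypothetical names picked for illustration.

```python
import gzip
import re
from collections import defaultdict

# Assumption: logs are in the common "combined" access log format.
# Adjust the pattern if your server logs a different layout.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?:\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"'
)


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "rt", errors="replace")


def parse_lines(lines):
    """Yield (ip, path, user_agent) for every line that matches LINE_RE."""
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            yield match["ip"], match["path"], match["ua"]


def suspicious_ips(records, api_prefix,
                   markers=("QQDownload", "TencentTraveler"),
                   min_agents=2):
    """Return {ip: sorted user agents} for IPs that
    (a) only requested paths under api_prefix,
    (b) sent at least one user agent containing a marker string, and
    (c) used at least min_agents distinct user agents."""
    paths = defaultdict(set)
    agents = defaultdict(set)
    for ip, path, ua in records:
        paths[ip].add(path)
        agents[ip].add(ua)

    flagged = {}
    for ip, seen_paths in paths.items():
        only_api = all(p.startswith(api_prefix) for p in seen_paths)
        has_marker = any(m in ua for ua in agents[ip] for m in markers)
        if only_api and has_marker and len(agents[ip]) >= min_agents:
            flagged[ip] = sorted(agents[ip])
    return flagged
```

usage would look something like `with open_log("access.log.1.gz") as f: print(suspicious_ips(parse_lines(f), "/api/v1/example"))`, where both the file name and the prefix are placeholders. Note that a shared proxy could also trip the multiple-user-agents condition, so the "only hits one API" condition is the stronger signal here.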