Consent-based indexing is a hot-button issue on the fediverse, so the fact that Bluesky needs large-scale indexing to function is probably already a big red flag for many of you. But what are they indexing, exactly? Well, that "large-scale" metrics category includes likes, reposts and followers (which it would have to, in order to serve algorithms), so in effect, your entire social graph is being constantly indexed in a cloud somewhere.
As an aside: expect ads. The functionality of Bluesky depends in no small part on big index servers that constantly crawl and store data about every user, regardless of which PDS they're posting from. Those index servers will almost certainly be run by for-profit companies, and they're going to want to recoup their costs somehow. They may not serve ads directly to your timeline (though they handle the algorithms, so they surely could), but they'll serve them to you somewhere.