

Not with the way the protocol is designed, no. Content is pushed to other instances by basically sending the event to every subscriber, so it inherently requires some kind of active subscription to receive content. And thus the bots.
Technically, ActivityPub would support a system of private communities and profiles, where the remote user has to accept your subscription/follow first, so seen that way it makes more sense why it’s not just broadcasting everything to everyone. Lemmy doesn’t support that and makes all content visible to everyone, so each instance really only needs one bot user subscribed to every community it can find, and then that content shows up in everyone’s All feed, which many use to discover content. And thus the bot subscriptions, one per instance that runs one of those.
On my small private instance it also makes sense that I only receive content from communities I’m subscribed to: it makes my storage requirements much smaller and reduces the overall load for everyone by only federating what is necessary.
A simple workaround though would be for those bot users to have a special flag on them so instances can exclude them from the count and get a more accurate number, but it’s pretty low on the priority list. Plus when you have 1k, 5k, 10k subscribers, those 50-100 bot users stop being meaningful anyway.
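To make the flag idea concrete, here is a minimal sketch of how a count could skip flagged bots. The `is_federation_bot` field and the `Subscriber` type are hypothetical names made up for illustration; nothing like this exists in Lemmy as far as this thread goes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscriber:
    name: str
    instance: str
    # Hypothetical flag: not an existing Lemmy or ActivityPub field.
    is_federation_bot: bool = False


def subscriber_counts(subscribers):
    """Return (total, humans): the raw count and the count without flagged bots."""
    total = len(subscribers)
    humans = sum(1 for s in subscribers if not s.is_federation_bot)
    return total, humans


subs = [
    Subscriber("alice", "a.example"),
    Subscriber("bob", "b.example"),
    Subscriber("federation_bot", "c.example", is_federation_bot=True),
]
total, humans = subscriber_counts(subs)
```

The bot still counts as a follower for delivery purposes, so federation keeps working; only the displayed number changes.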
Both, but the instance the community is on is the one that forwards it, and that is where all the other instances will get it from.
It does delete everything, but it’s a bit buggy and cannot be guaranteed (because I can just restore from backup and undelete everything if I want).
Everything is pushed to all interested instances, and each of them hosts its own copy.
Only the ones that you subscribe to. But yes generally it would increase load on the big ones like lemmy.ml and lemmy.world since they’re popular.
But also in a way, it’s no different than one user viewing the post, and then your instance has a copy of it and can serve many more users. And the remote instance gets to push it to you when it’s convenient. So not really a problem.
Your instance already has a copy of it all. You always go through your instance (except for media, depending on whether your instance has the media cache/proxy enabled).
Roughly, how ActivityPub works is that instance A subscribes to B (by sending it a subscription request for a given community), and then B just sends A everything that happens from that point on. If you post, then A goes to B to inform it of the post, and then B broadcasts it to everyone else. A owns the user, B owns the community.
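As a rough illustration of that flow, here is a toy in-memory model. It is only a sketch: the `Instance` class and its method names are invented for this example, and real federation uses signed HTTP requests carrying ActivityPub activities, not direct method calls.

```python
class Instance:
    """Toy stand-in for a Lemmy instance (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.followers = {}  # community -> set of instances subscribed to it here
        self.store = {}      # community -> list of (author, text) copies held locally

    def host_community(self, community):
        self.followers[community] = set()
        self.store[community] = []

    def subscribe(self, home, community):
        # This instance asks the community's home instance to start pushing to it.
        home.followers[community].add(self)
        self.store.setdefault(community, [])

    def post(self, home, community, author, text):
        # The user's instance informs the community's instance of the post.
        home.receive_post(community, f"{author}@{self.name}", text)

    def receive_post(self, community, author, text):
        # The community's instance stores the post, then pushes it to every follower.
        self.store[community].append((author, text))
        for follower in self.followers[community]:
            follower.store[community].append((author, text))


a, b, c = Instance("a"), Instance("b"), Instance("c")
b.host_community("news")
a.subscribe(b, "news")
a.post(b, "news", "alice", "first")   # only A is subscribed at this point
c.subscribe(b, "news")
a.post(b, "news", "alice", "second")  # now both A and C receive it
```

In this model C ends up with only the second post, which matches the "from that point on" part: nothing is sent for what happened before the subscription.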
Most questions can then be answered by thinking of what would happen. What would happen if B bans a user from A? Well, A doesn’t care, neither does C, B will just ignore everything from that user. What happens if A bans the user? Well, that user can’t post at all, so they’re indirectly also banned from B and C. What happens if A bans a user from C but C posts to B? A will ignore it, while B and C see it. And so on.
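Those three cases can be worked through with a small sketch. The function `who_sees` and its arguments are made up for illustration, and it assumes a simplified model: a post goes from the author's home instance to the community's instance, which then forwards it to subscribers. It also assumes the home instance keeps its local copy when the community's instance rejects the post, which is a guess on my part and not something confirmed above.

```python
def who_sees(author, author_home, community_host, subscribers, bans):
    """Return the set of instance names that end up showing the post.

    bans maps an instance name to the set of user ids banned on that instance.
    """
    def banned_on(instance):
        return author in bans.get(instance, set())

    if banned_on(author_home):
        return set()            # banned at home: the user cannot post at all
    seen = {author_home}        # assumption: the home instance keeps its own copy
    if banned_on(community_host):
        return seen             # the community's instance ignores it, nothing is forwarded
    seen.add(community_host)
    for instance in subscribers:
        if not banned_on(instance):
            seen.add(instance)  # each receiver applies its own bans
    return seen


# B hosts the community; A and C are subscribed to it.
case_b_bans = who_sees("u@a", "a", "b", {"a", "c"}, {"b": {"u@a"}})
case_a_bans_own = who_sees("u@a", "a", "b", {"a", "c"}, {"a": {"u@a"}})
case_a_bans_remote = who_sees("v@c", "c", "b", {"a", "c"}, {"a": {"v@c"}})
```

Under these assumptions the first case leaves the post visible only on A, the second shows it nowhere, and the third shows it on B and C but not on A.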
Each instance is independent and makes its own decisions, so the view is slightly different from instance to instance.
And yes, the fact that everyone basically has a full copy of everything does have some considerable privacy implications. A rogue instance can just ignore deletes and keep full edit histories. Every post, every comment and every vote is public information. It’s entirely an honor system when it comes to deleting stuff.