Kubo’s bitswap sets the sendDontHave flag differently depending on whether the message carries peerEntries or broadcastEntries. As far as I understand, both should be set to true. Or is there some reasoning behind this implementation detail?
I’ve dug into other IPFS implementations, and it seems Helia doesn’t do this and always sets sendDontHave to true on outgoing messages?
The reason given to me by knowledgeable people was that on a broadcast, we only want to hear from peers that have the block. It makes sense: you are optimistically sending Want-Haves to a bunch of peers. The normal case is they won’t have the block. You don’t need to waste a return trip waiting for them to tell you they actually don’t. The only interesting peers are those who do have it.
When not broadcasting, iirc, sendDontHave is true. In that case, you are directing a request to a limited set of peers, expecting a positive answer. A negative response indicates that the peer is at least alive and trying to cooperate, rather than you sending want-haves into the void. I don’t know how a lack of response is handled there, but you could use it to rate and sort peers, and potentially drop unresponsive ones from the session.
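To make the two cases concrete, here is a minimal Python sketch of the rule as described in this thread. It is not Boxo's code: `WantEntry`, `build_entries`, and the field names are hypothetical, and only the flag values (off for broadcast entries, on for targeted ones) come from the discussion above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WantEntry:
    """Hypothetical stand-in for one wantlist entry in an outgoing message."""
    cid: str
    want_type: str        # "want-have" or "want-block"
    send_dont_have: bool  # ask the remote peer to answer even if it lacks the block


def build_entries(broadcast_cids, peer_cids):
    """Build outgoing entries, setting send_dont_have per the rule in this thread."""
    entries = []
    # Broadcast wants: optimistic, most peers will not have the block,
    # so only positive answers are interesting.
    for cid in broadcast_cids:
        entries.append(WantEntry(cid, "want-have", send_dont_have=False))
    # Targeted wants: a negative answer still shows the peer is alive
    # and cooperating, so ask for it explicitly.
    for cid in peer_cids:
        entries.append(WantEntry(cid, "want-have", send_dont_have=True))
    return entries


if __name__ == "__main__":
    for entry in build_entries(["cidA"], ["cidB"]):
        print(entry)
```

The only point of the sketch is that the flag depends on why the want is being sent, not on the CID itself.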
Regarding Helia, I don’t know. Does it perform optimistic bitswap broadcasts like Boxo? Perhaps @achingbrain can chime in!
Let’s say we have a set of peers {Pa … Pz} with a minDontHaveTimeout of 1s, and right now only Pz has the block Bz. When Pa requests block Bz, it’ll broadcast to a subset of peers, right? Since Boxo is hardcoded so that broadcasts get no dont-have response, we will wait 1s before realizing the subset of peers don’t have block Bz, and then it’ll pick another subset of peers to perform the same broadcast again. Since the peer selection algorithm is random, it might have to retry many times before it selects a subset of peers that contains Pz. During this time, Pa might be stuck in a broadcast loop.
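The worry in this post can be put in numbers with a small sketch. It assumes the model described here: each round picks a fixed-size subset uniformly at random, rounds are independent, and each failed round costs one full timeout. The subset size and the function name are hypothetical and not taken from Boxo.

```python
def expected_wait_seconds(num_candidates, subset_size, timeout_s):
    """Expected time spent in failed rounds before a round includes the one holder.

    Each round includes the holder with probability
    p = subset_size / num_candidates, so the number of rounds until
    success is geometric with mean 1 / p.
    """
    if not 0 < subset_size <= num_candidates:
        raise ValueError("subset_size must be between 1 and num_candidates")
    expected_rounds = num_candidates / subset_size  # 1 / p
    return (expected_rounds - 1) * timeout_s        # only failed rounds cost a timeout


if __name__ == "__main__":
    # Pa can pick from 25 other peers (Pb..Pz); a subset size of 5 is an assumption.
    print(expected_wait_seconds(25, 5, 1.0))
```

With 25 candidate peers, a subset of 5, and a 1s timeout, that comes to 4 seconds of waiting on average under this model.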
we will wait 1s before realizing the subset of peers don’t have block Bz
Broadcasts are not happening all the time, one after the other. iirc it’s more like every 30 seconds. The full wantlist is sent, so in principle you are also not fishing for a specific CID. The “broadcast loop” happens regardless, as long as there are CIDs in the wantlist.
I think right now the broadcast goes to all connected peers rather than a subset or a randomly selected portion. That is wasteful/spammy in general, and some optimizations are on the way (trying to hit the group of peers most likely to have something rather than everyone). It is a randomly selected subset only in the sense that the peers you have open connections to at any moment are rather random, and connections get pruned rather randomly.
Bitswap complements content routing (DHT etc.) with these “discovery” mechanisms, but you can’t fully rely on broadcasts to discover any content when the network grows large enough. On a small swarm where there are many copies of the content, it works. On a large network with very sparse copies, or with lots of content concentrated in one place, it’s mostly wasteful.
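As a rough illustration of why this gets spammy, here is a sketch of one broadcast round under the description in this reply: every connected peer receives the full wantlist as want-haves, with sendDontHave off. The function and field names are hypothetical, not Boxo's API.

```python
def broadcast_round(wantlist, connected_peers):
    """Return {peer: entries} for one broadcast round (hypothetical sketch)."""
    entries = [
        {"cid": cid, "want_type": "want-have", "send_dont_have": False}
        for cid in sorted(wantlist)
    ]
    # Every connected peer gets its own copy of the full wantlist.
    return {peer: list(entries) for peer in connected_peers}


if __name__ == "__main__":
    messages = broadcast_round({"cidA", "cidB"}, ["P1", "P2", "P3"])
    total = sum(len(entries) for entries in messages.values())
    print(f"{len(messages)} messages, {total} want entries")
```

The number of want entries sent per round grows as connected peers times wantlist size, which is why targeting only the peers most likely to have something would help.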