I have been researching IPFS/IPFS-cluster and am trying to clarify certain functionalities.
Say I have 3 nodes, each storing the same large file under one CID. If another node requests that CID, is there any way for it to download the CID's blocks from multiple nodes at once, instead of only from the first node that appears to store it? From the documentation on the Merkle DAG this should be possible, but every test I’ve run only pulls the file from a single node.
Thank you in advance for the help.
The protocol in charge of block exchange in IPFS is Bitswap. Bitswap leverages its direct connections to discover and retrieve the content from the peers storing it. Assuming your nodes are all directly connected, Bitswap operates as follows for a leecher L and your three seeders storing the content (S1, S2, S3):
- L broadcasts a WANT request for the root CID to all its connections to see who stores the content (S1, S2, S3).
- S1, S2, S3 answer with HAVE messages signalling they all store the content.
- L adds all seeders to what we call a session (i.e. nodes that potentially have the content). L chooses probabilistically one of the peers to request the transmission of the block. The probability of choosing a node depends on certain metrics related to previous interactions with the node (you can learn more about this here: https://github.com/ipfs/go-bitswap/blob/master/docs/how-bitswap-works.md).
- From there on, L traverses the DAG, requesting the next blocks from the peers in the session. This probabilistic selection of peers in the session is done for every block, so you would need to be really lucky for the same peer to be selected for every block. In short, Bitswap is designed to leverage multiple streams of transmission.
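To make the per-block selection concrete, here is a small toy simulation. It is not the go-bitswap implementation: the weights are made-up numbers standing in for the "metrics related to previous interactions", and `pick_peer` and `simulate_fetch` are hypothetical helper names. It only illustrates why a session with three peers spreads requests across them, while a session with a single peer cannot.

```python
import random
from collections import Counter


def pick_peer(session_weights, rng):
    """Pick one session peer with probability proportional to its weight."""
    peers = list(session_weights)
    weights = [session_weights[p] for p in peers]
    return rng.choices(peers, weights=weights, k=1)[0]


def simulate_fetch(session_weights, num_blocks, seed=0):
    """Count how many block requests each session peer receives."""
    rng = random.Random(seed)  # fixed seed so repeated runs are comparable
    return Counter(pick_peer(session_weights, rng) for _ in range(num_blocks))


# Hypothetical weights: all three seeders answered HAVE, S1 slightly preferred.
all_connected = simulate_fetch({"S1": 0.4, "S2": 0.3, "S3": 0.3}, num_blocks=1000)

# A session that only ever learned about one peer.
single_peer = simulate_fetch({"S1": 1.0}, num_blocks=1000)

print("three peers in session:", dict(all_connected))
print("one peer in session:   ", dict(single_peer))
```

With three peers in the session every seeder ends up serving part of the DAG; with one peer in the session, all 1000 requests necessarily go to that peer.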
In your case, could your leecher be connected to only one of the seeder nodes, so that this is the only peer included in the Bitswap session? Alternatively, could it be that the peers are not directly connected and Bitswap has to resort to the DHT to find a provider for the content? In that case, Bitswap adds only the provider found in the DHT lookup to its session and fetches all the content from that one peer.
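One way to check the first possibility is to look at the output of `ipfs swarm peers` on the leecher and see whether all three seeders appear in it. The sketch below assumes the plain (non-verbose) output, with one multiaddr per line ending in the peer ID; the peer IDs and addresses in the sample are placeholders, and `connected_peer_ids` and `missing_seeders` are hypothetical helper names.

```python
def connected_peer_ids(swarm_peers_output):
    """Extract peer IDs from `ipfs swarm peers` output (one multiaddr per line)."""
    ids = set()
    for line in swarm_peers_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The peer ID is the last path component of the multiaddr,
        # e.g. /ip4/10.0.0.2/tcp/4001/p2p/<peer-id>
        ids.add(line.rsplit("/", 1)[-1])
    return ids


def missing_seeders(swarm_peers_output, seeder_ids):
    """Return the seeders the leecher has no direct connection to."""
    connected = connected_peer_ids(swarm_peers_output)
    return [s for s in seeder_ids if s not in connected]


# Placeholder output, as if copied from `ipfs swarm peers` on the leecher.
sample_output = """\
/ip4/10.0.0.2/tcp/4001/p2p/QmSeeder1
/ip4/10.0.0.9/udp/4001/quic/p2p/QmOtherPeer
"""

missing = missing_seeders(sample_output, ["QmSeeder1", "QmSeeder2", "QmSeeder3"])
print("not directly connected:", missing)
```

Any seeder reported as missing can be connected manually with `ipfs swarm connect <multiaddr>` before retrying the download.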
There are also periodic broadcasts that populate sessions with more peers, but these happen every minute, so if the transfer lasts less than a minute you won’t see this behavior.
I hope this explanation of Bitswap has given you a clearer view of how file sharing works in IPFS. If you can give me a bit more context on the network topology you are using, we can try to identify what is happening and why your node is not leveraging multiple streams.