How does IPFS Cluster implement replication?

Hi Community!

I’ve seen in the documentation that ipfs-cluster-ctl add will add a file to several cluster peers, with the number determined by a replication factor. I’d love to get a more detailed picture of how IPFS Cluster manages this automatic replication. Which part of the source code should I refer to?

Thanks!

I’ve been wondering about this as well. An app I would love to see is a video streaming platform where my videos don’t get pulled because I walk by a radio playing something copyrighted.

I’ve also been wondering about the cluster management and how intelligent it is about distributing the data sensibly so that peers aren’t overloaded.

I dug around the repo a bit and found this in allocate.go:

This file gathers allocation logic used when pinning or re-pinning to find which peers should be allocated to a Cid. Allocation is constrained by ReplicationFactorMin and ReplicationFactorMax parameters obtained from the Pin object.

The allocation process has several steps:

  • Find which peers are pinning a CID
  • Obtain the last values for the configured informer metrics from the monitor component
  • Divide the metrics between “current” (peers already pinning the CID) and “candidates” (peers that could pin the CID), as long as their metrics are valid.
  • Given the candidates:
    • Check if we are overpinning an item
    • Check if there are not enough candidates for the “needed” replication factor.
    • If there are enough candidates:
      • Call the configured allocator, which sorts the candidates (and may veto some depending on the allocation strategy).
      • The allocator returns a list of final candidate peers sorted by order of preference.
      • Take as many final candidates from the list as we can, until ReplicationFactorMax is reached.
    • Error if there are fewer than ReplicationFactorMin.
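
To check my own understanding, here is a rough sketch of how I read that flow. This is not the actual ipfs-cluster code; Peer, Metric and the allocate function below are simplified stand-ins:

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Peer and Metric are simplified stand-ins for the real ipfs-cluster types.
type Peer string

type Metric struct {
	Peer  Peer
	Value int64 // e.g. free space in bytes, from the configured informer
	Valid bool  // expired or invalid metrics are discarded
}

// allocate decides which peers should pin a CID, constrained by the
// ReplicationFactorMin/Max parameters, roughly following the steps above.
func allocate(currentPinners map[Peer]bool, metrics []Metric, min, max int) ([]Peer, error) {
	var current, candidates []Metric
	for _, m := range metrics {
		if !m.Valid {
			continue // only peers with valid metrics are considered
		}
		if currentPinners[m.Peer] {
			current = append(current, m)
		} else {
			candidates = append(candidates, m)
		}
	}

	// Enough (or too many) peers already pin it: nothing new to allocate.
	if len(current) >= min {
		return nil, nil
	}

	// The "allocator" part: sort candidates by preference; here simply
	// by the metric value, so peers with the most free space come first.
	sort.Slice(candidates, func(i, j int) bool {
		return candidates[i].Value > candidates[j].Value
	})

	if len(current)+len(candidates) < min {
		return nil, errors.New("not enough peers to reach ReplicationFactorMin")
	}

	// Take as many candidates as allowed, until ReplicationFactorMax.
	allowed := max - len(current)
	if allowed > len(candidates) {
		allowed = len(candidates)
	}
	picked := make([]Peer, 0, allowed)
	for _, m := range candidates[:allowed] {
		picked = append(picked, m.Peer)
	}
	return picked, nil
}

func main() {
	metrics := []Metric{
		{"peerA", 500, true}, {"peerB", 100, true}, {"peerC", 900, true},
	}
	peers, err := allocate(nil, metrics, 2, 3)
	fmt.Println(peers, err) // [peerC peerA peerB] <nil>
}
```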

I’ve been trying to understand how the preference order is determined. It has something to do with “Metrics” coming from the pub/sub network. I’m still fuzzy on it.

The preference order depends on the metric configured, which by default is the free space available reported by each peer.
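
To illustrate (this is just a simplified sketch, not the actual allocator code), ordering candidates by free space and vetoing those below some threshold looks like this:

```go
package main

import (
	"fmt"
	"sort"
)

// candidate pairs a peer with the free space it last reported through
// its informer metric. The types and threshold are purely illustrative.
type candidate struct {
	peer      string
	freeSpace uint64 // bytes
}

// orderByFreeSpace vetoes peers below a minimum free-space threshold
// and returns the rest sorted with the most free space first.
func orderByFreeSpace(cands []candidate, minFree uint64) []candidate {
	var kept []candidate
	for _, c := range cands {
		if c.freeSpace >= minFree {
			kept = append(kept, c)
		}
	}
	sort.Slice(kept, func(i, j int) bool {
		return kept[i].freeSpace > kept[j].freeSpace
	})
	return kept
}

func main() {
	cands := []candidate{
		{"peerA", 10 << 30}, // 10 GiB free
		{"peerB", 2 << 30},  // 2 GiB free, will be vetoed
		{"peerC", 50 << 30}, // 50 GiB free
	}
	fmt.Println(orderByFreeSpace(cands, 5<<30)) // peerC first, then peerA
}
```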

In the case of adding, the cluster peer doing the job will do the chunking and dag-building in the same way that IPFS does, and then send the blocks to their destinations using raw /block/put (the destinations being the list of peers returned by the allocator).
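
As a rough illustration of that last step, a raw block put against an IPFS daemon’s HTTP API boils down to something like this (a minimal sketch; the cluster uses its own client and proper error handling):

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
)

// blockPut sends one raw block to an IPFS daemon's HTTP API at /api/v0/block/put.
func blockPut(apiURL string, block []byte) (string, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	part, err := w.CreateFormFile("data", "block")
	if err != nil {
		return "", err
	}
	part.Write(block)
	w.Close()

	resp, err := http.Post(apiURL+"/api/v0/block/put", w.FormDataContentType(), &body)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	out, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(out), nil // JSON with the stored block's Key (CID) and Size
}

func main() {
	res, err := blockPut("http://127.0.0.1:5001", []byte("hello cluster"))
	fmt.Println(res, err)
}
```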

Thanks for your reply!

So when adding a file to the IPFS cluster for replication, does the local ipfs-cluster-ctl itself take full charge of the file (chunking, DAG-building, etc.) and send the blocks to the allocated cluster peers, without the ipfs daemon being involved? (If that’s the case, I’m also wondering how blocks are sent to other peers.)

I had assumed that once a file is added, the IPFS cluster would synchronize a global pinset, and the allocated cluster peers would then ask their ipfs daemons to fetch and pin the file into their local caches.

Yes, the last step after all the blocks have been sent is to pin the file. Since the blocks are already in place from the add, the pinning should finish quickly.
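
For illustration, that final pin is essentially a single call to the IPFS daemon’s HTTP API (a sketch; the real code goes through the cluster’s IPFS connector, and the CID below is just a placeholder):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// pinAdd asks a local IPFS daemon to pin a CID via /api/v0/pin/add.
// Since the blocks were already block/put, no network fetching is needed.
func pinAdd(apiURL, cid string) (string, error) {
	resp, err := http.Post(apiURL+"/api/v0/pin/add?arg="+cid, "text/plain", nil)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	out, err := io.ReadAll(resp.Body)
	return string(out), err
}

func main() {
	out, err := pinAdd("http://127.0.0.1:5001", "QmSomeExampleCid") // placeholder CID
	fmt.Println(out, err)
}
```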

The local ipfs-cluster-service does the work. ipfs-cluster-ctl is just a client that performs a normal file upload to the local “server”, i.e. the cluster peer. As the file is received, the server chunks it and sends the blocks around to the other cluster peers (via cluster RPC), which then do a local /block/put to their respective IPFS daemons.
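
To picture the chunking step, fixed-size splitting (IPFS defaults to 256 KiB chunks) is conceptually just this; the real adder also builds the DAG linking the chunks, which this sketch leaves out:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

const chunkSize = 256 * 1024 // IPFS's default chunk size (256 KiB)

// chunks splits a stream into fixed-size blocks; the real adder also
// builds a DAG (linking the chunks under a root node) as it goes.
func chunks(r io.Reader) ([][]byte, error) {
	var out [][]byte
	for {
		buf := make([]byte, chunkSize)
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			out = append(out, buf[:n])
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
	}
}

func main() {
	data := bytes.Repeat([]byte("x"), 600_000) // ~600 KB example payload
	cs, _ := chunks(bytes.NewReader(data))
	fmt.Println(len(cs), "chunks") // 3 chunks: 256 KiB + 256 KiB + remainder
}
```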

Thanks so much!

After reading your comment, I went through the source code to find the functionality it uses to send blocks to other peers, and I found this official announcement particularly helpful! There is an Add() function in /adder/util.go that calls rpcClient.MultiCall() to send an IPLD node to the other peers’ BlockPut() method. You were totally right!
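
In case it helps anyone else reading this later, the fan-out pattern looks conceptually like the sketch below. The blockPut function here is a hypothetical stand-in for the remote BlockPut() RPC method (the real code goes over the cluster’s libp2p RPC client), not an actual cluster API:

```go
package main

import (
	"fmt"
	"sync"
)

// blockPut is a hypothetical stand-in for the remote peers' BlockPut()
// RPC method; in ipfs-cluster this goes over libp2p RPC, not a local call.
func blockPut(peer string, block []byte) error {
	fmt.Printf("sent %d bytes to %s\n", len(block), peer)
	return nil
}

// broadcastBlock sends the same block to every destination peer in
// parallel, roughly what a MultiCall-style fan-out achieves.
func broadcastBlock(dests []string, block []byte) []error {
	errs := make([]error, len(dests))
	var wg sync.WaitGroup
	for i, p := range dests {
		wg.Add(1)
		go func(i int, p string) {
			defer wg.Done()
			errs[i] = blockPut(p, block)
		}(i, p)
	}
	wg.Wait()
	return errs // one result per destination, like MultiCall's replies
}

func main() {
	errs := broadcastBlock([]string{"peerA", "peerB", "peerC"}, []byte("some block"))
	fmt.Println(errs)
}
```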
