Customizing sharding & replication in ipfs-cluster

0zAND1z · February 28, 2019, 7:58am

There’s some great progress going on at https://github.com/ipfs/ipfs-cluster/ to help pin and persist important data on IPFS. So first, kudos to the team.

I am trying to understand the project in depth and have 3 roadblocks:
1. Is the cluster fault tolerant in serving the files?
Suppose I add a file (say X) to IPFS and pin it to the cluster with a replication factor of 2. My understanding is that the file is pinned to 2 nodes (say N1 and N2) in a cluster of size M. Suppose both N1 and N2 crash after some time (Byzantine or not). Would the file still be available when a client requests it?

2. Does the cluster support customisable sharding?
Can the same file X pinned in the cluster be divided into S shards (say x1, x2, x3 … xS), distributed in an overlapping manner across the M nodes in the cluster?

3. Can we target replication and sharding at a specific node or set of nodes?
Suppose I am more confident in replicating a file to specific nodes identified by their multiaddrs (A1, A2, etc.). Is it possible (or part of the planned roadmap) to let the user who adds the file make this decision?

It would be helpful if somebody could address these questions. Thanks in advance.

hector · February 28, 2019, 3:29pm

Hi,

Technically speaking, there are cluster peers N1 and N2 running alongside IPFS peers IPFS1 and IPFS2. The IPFS daemons are the ones providing the content, regardless of whether the cluster peers are running or not. Of course, things won’t work if the IPFS daemons die and they are the only providers.
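The availability rule above can be sketched as a simple check. This is only an illustration under stated assumptions: `is_retrievable` and the daemon names are hypothetical, and real IPFS content routing involves much more than a set lookup.

```python
def is_retrievable(providers, alive_daemons):
    """True if at least one IPFS daemon that holds the content is still running.

    Hypothetical helper for illustration; cluster peers are deliberately absent,
    since only the IPFS daemons provide content.
    """
    return any(daemon in alive_daemons for daemon in providers)


# File X pinned with replication factor 2: IPFS1 and IPFS2 are the only providers.
providers = {"IPFS1", "IPFS2"}

# One of the two daemons is still up.
one_down = is_retrievable(providers, alive_daemons={"IPFS2", "IPFS3"})
# Both daemons holding X are gone.
both_down = is_retrievable(providers, alive_daemons={"IPFS3", "IPFS4"})

print(one_down, both_down)
```

If any other IPFS node has fetched and kept the content, it would count as an additional entry in `providers`.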

There is a bunch of code in Cluster to do exactly this kind of sharding, but we are missing support from IPFS for “recursive-pinning-to-a-max-depth”. The idea is that if the file needs to be split among several daemons, IPFS should not try to recursively pin the whole thing as soon as it gets a node from the DAG. Some discussions have happened and can be followed in ipfs/kubo issue #5133 (Feature: "children" pinning mode). Even though it can be done with the current pinning system, the IPFS team decided they want to refactor/rewrite the whole thing before adding support for this. Whenever this lands, Cluster could make use of it with relative ease, as most of the code is ready.
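To make the difference between a fully recursive pin and a depth-limited one concrete, here is a minimal sketch on a toy DAG. It assumes the DAG is a plain dict mapping a block name to its children; `blocks_to_pin` and `toy_dag` are hypothetical names for illustration, not IPFS or Cluster APIs.

```python
from collections import deque


def blocks_to_pin(dag, root, max_depth=None):
    """Collect the blocks that a pin on `root` would cover.

    max_depth=None models a fully recursive pin (everything reachable);
    max_depth=k stops k links below `root`.
    Breadth-first, so each block is first seen at its shallowest depth.
    """
    covered = {root}
    queue = deque([(root, 0)])
    while queue:
        cid, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not follow links below the depth limit
        for child in dag.get(cid, []):
            if child not in covered:
                covered.add(child)
                queue.append((child, depth + 1))
    return covered


# Toy DAG: a root linking to two shard nodes, which link to data blocks.
toy_dag = {
    "root": ["shard1", "shard2"],
    "shard1": ["a", "b"],
    "shard2": ["c", "d"],
    "a": ["a1"],
}

recursive = blocks_to_pin(toy_dag, "shard1")             # pulls in a1 as well
limited = blocks_to_pin(toy_dag, "shard1", max_depth=1)  # stops at a and b

print(sorted(recursive), sorted(limited))
```

With a depth limit, a daemon asked to hold `shard1` fetches only that shard's direct children instead of walking everything reachable from them.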

We just merged the ability to provide a custom list of “allocations” to pin content to. This list is a preference list and takes priority over the allocations decided by Cluster (by default, based on free IPFS repository space). This will be ready in the next release.
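The priority rule described above can be sketched roughly as follows. This is not Cluster's allocator: `allocate`, the peer names, and the free-space numbers are all assumptions made for illustration.

```python
def allocate(free_space, replication_factor, preferred=()):
    """Pick peers for a pin.

    Peers from the user's preference list come first, in the order given;
    remaining slots are filled by the peers with the most free repository space.
    Hypothetical sketch, not the actual Cluster implementation.
    """
    # Keep only known peers, dropping duplicates while preserving order.
    chosen = [p for p in dict.fromkeys(preferred) if p in free_space]
    # Rank the remaining peers by free space, largest first.
    rest = sorted(
        (p for p in free_space if p not in chosen),
        key=lambda p: free_space[p],
        reverse=True,
    )
    return (chosen + rest)[:replication_factor]


# Free IPFS repository space per peer (made-up numbers, arbitrary units).
free_space = {"A1": 10, "A2": 50, "A3": 30, "A4": 40}

default_choice = allocate(free_space, 2)
with_preference = allocate(free_space, 2, preferred=["A1"])

print(default_choice, with_preference)
```

Here the preferred peer `A1` is selected even though it has the least free space, and the second slot still goes to the peer with the most.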

SwJay · May 3, 2020, 1:58pm

Hi there, it’s been a whole year now, any updates on the customisable sharding?

hector · May 3, 2020, 10:11pm

No, we are only marginally closer.
