How to facilitate sharding on collaborative clusters?

From https://collab.ipfscluster.io/: “Note that ipfs-cluster-follow will try to pin the full archive, so you will need at least as much space available in your node as the Size requirement indicates for each archive.” How can I mitigate this issue by hosting a cluster and splitting it into manageable shards?


Most of the “sharding” feature is implemented in IPFS Cluster, but it is not activated because it was initially lacking some Kubo support. There is now a workaround, and at some point I’ll get around to finishing this.

That said, sharding for collab clusters is only useful for replication factors < “everyone”, which requires trusted or semi-trusted clusters.
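To put rough numbers on why the replication factor matters, here is a back-of-the-envelope sketch. The function name and the figures are made up for illustration, and it assumes shards are spread evenly across peers, which a real cluster will not guarantee.

```python
def avg_storage_per_peer(archive_size_gb: float, peers: int, replication_factor: int) -> float:
    """Average storage each peer contributes, assuming an even spread.

    replication_factor == peers models "everyone pins everything".
    """
    if not 1 <= replication_factor <= peers:
        raise ValueError("replication_factor must be between 1 and the number of peers")
    # Total stored data is archive_size * replication_factor, shared by all peers.
    return archive_size_gb * replication_factor / peers


# Hypothetical 1000 GB archive followed by 20 peers.
everyone = avg_storage_per_peer(1000, peers=20, replication_factor=20)
three_copies = avg_storage_per_peer(1000, peers=20, replication_factor=3)
print(everyone, three_copies)
```

With a replication factor equal to the number of peers, every peer still needs the full archive, so splitting it into shards saves nobody any space.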


Thanks for replying. By replication factors, I’m assuming you mean authorized followers. I understand that secure and trustless sharding is difficult to implement, but is it really impossible? It would actually be very useful, because that way massive clusters, such as the digital archive I’m building, could be pinned by random, untrusted volunteers without forcing them to pin the entire cluster. Having to pin everything would probably deter many of them. Also, what is this workaround? Are you saying it is incomplete, or does it currently work?

IPFS Cluster does not verify whether followers are actually pinning the content, nor how much space they have or any other metrics. It trusts whatever they say.

That means that a malicious follower can always pretend it has plenty of space, get pins or shards assigned, and then not pin anything in the end.

trustless (…) is it really impossible?

No, but that requires “proof of storage” and that takes us to the Filecoin paper.

A cluster made of random, untrusted volunteers cannot use, for example, “replication-factor=3”, which means that a pin, or a shard, gets assigned to 3 members of the cluster, because those 3 members might just be pretending to pin things.
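A toy model of the problem, as a sketch only: `allocate`, the peer names and the numbers are all hypothetical, and this is not IPFS Cluster's actual allocator. It just assumes pins go to whichever peers report the most free space.

```python
def allocate(reported_free_gb: dict[str, int], replication_factor: int) -> list[str]:
    """Hypothetical allocator: pick the peers that *claim* the most free space."""
    ranked = sorted(reported_free_gb, key=lambda peer: reported_free_gb[peer], reverse=True)
    return ranked[:replication_factor]


# Self-reported metrics; nothing verifies them.
reported = {
    "honest-a": 500,
    "honest-b": 300,
    "honest-c": 200,
    "liar-1": 1_000_000,
    "liar-2": 1_000_000,
    "liar-3": 1_000_000,
}
liars = {"liar-1", "liar-2", "liar-3"}

holders = allocate(reported, replication_factor=3)
# Peers that were assigned the pin and actually store it.
real_copies = [peer for peer in holders if peer not in liars]
print(holders, real_copies)
```

In this model all three allocations go to peers that only pretend to pin, so the content has zero real copies even though the cluster believes it has three.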

Thus, the scenario that works best is to let everyone pin everything and hope that some of the peers are not malicious.

I don’t see any other way in a fully open/untrusted cluster that cannot be gamed, but happy to hear any proposals.

Regarding sharding, the bulk of the code is there, but some changes need to be made to trick Kubo into pinning only what is in each shard and not following IPLD links to things outside the shard when traversing for pinning.
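Roughly, the difference between a normal recursive pin and a shard-bounded one looks like this. It is a conceptual sketch over an in-memory graph with made-up block names, not Kubo's API or the actual workaround.

```python
from collections import deque
from typing import Optional


def blocks_to_pin(root: str, links: dict[str, list[str]], shard: Optional[set[str]] = None) -> set[str]:
    """Walk the DAG from `root` and return the blocks that would be pinned.

    With shard=None every link is followed (ordinary recursive pin).
    With a shard set, blocks outside the shard are skipped and their links are never followed.
    """
    seen: set[str] = set()
    queue = deque([root])
    while queue:
        cid = queue.popleft()
        if cid in seen:
            continue
        if shard is not None and cid not in shard:
            continue  # link points outside the shard: do not fetch or traverse it
        seen.add(cid)
        queue.extend(links.get(cid, []))
    return seen


# Hypothetical DAG: "b" links to "x", which lives in a different shard.
links = {"shard-root": ["a", "b"], "a": ["c"], "b": ["x"], "x": ["y"]}
shard = {"shard-root", "a", "b", "c"}

recursive = blocks_to_pin("shard-root", links)
bounded = blocks_to_pin("shard-root", links, shard)
print(sorted(recursive), sorted(bounded))
```

The unbounded walk drags in `x` and `y` from the other shard, which is exactly what a peer holding only this shard should not have to store.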

Hey @hector
I am also facing the same issue, and this post will be helpful for me. Thanks :smiling_face_with_three_hearts:

What have you figured out so far in terms of solutions? This information could be invaluable to me.

Solutions to what exactly?