Using IPFS Cluster in a bandwidth-constrained environment

Hello everyone,

I’m currently experimenting with IPFS Cluster to assess its suitability for environments with resource constraints, such as intermittent connectivity, low bandwidth, and high latency. I have set up three Linux VMs on the same PC, each running IPFS, ipfs-cluster-ctl, and ipfs-cluster-service, and they all share the same cluster secret.

I have a couple of questions:

  1. I attempted to modify the service.json file to reduce the frequency of heartbeat messages. However, after restarting the ipfs-cluster-service daemon, I did not observe any change in message frequency when monitoring with Wireshark. Could anyone provide guidance on the correct way to make this adjustment? (The fields I have been editing are sketched just after this list.)
  2. Is there a way to reduce the overall bandwidth usage of IPFS through the configuration file, aside from altering the frequency of heartbeat messages?
  3. Are there any existing research papers that investigate how the number of nodes in an IPFS Cluster affects the bandwidth requirements for normal operation?
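For question 1, the values I have been editing in service.json are roughly the ones below (the intervals are just example values, and these are the heartbeat/monitoring knobs as I understand them from my install, so please correct me if they are not the right ones):

```json
{
  "cluster": {
    "monitor_ping_interval": "60s",
    "peer_watch_interval": "30s"
  },
  "monitor": {
    "pubsubmon": {
      "check_interval": "60s"
    }
  }
}
```

(Other fields omitted; I restart the ipfs-cluster-service daemon after saving the file.)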

I appreciate any insights or advice you can provide!

Thank you!

Hi @Jiajunn, just thought I’d mention the Traffic Control (tc) tool in Linux if you’re not already familiar with it. It can be helpful for limiting bandwidth, and introducing jitter and loss to a network interface. You can read more about it in the man pages for tc, and the following online reference is also a decent one: https://tldp.org/HOWTO/Traffic-Control-HOWTO/
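For example, something along these lines (interface name and numbers are placeholders) introduces delay, jitter, and loss with the netem qdisc, and can be removed again afterwards:

```sh
# add 200ms delay with 50ms jitter and 1% packet loss on eth0 (values illustrative)
tc qdisc add dev eth0 root netem delay 200ms 50ms loss 1%

# remove the qdisc again when you are done
tc qdisc del dev eth0 root
```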

Hi @cewood, thanks for the tip! I’ve actually been experimenting with the Traffic Control (tc) tool in Linux to test the IPFS Cluster. I noticed that when I set the throughput to 22Kbps, with a 5k burst and 5k limit, the peer loses connection to the cluster. I was wondering if there are any configuration options or methods to reduce the IPFS Cluster’s bandwidth usage so it can function in such an environment, or if modifying the source code would be necessary.
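Concretely, the qdisc I apply looks roughly like this (eth0 stands in for the actual interface; also, if I read the tc man page correctly, kbps there means kilobytes per second while kbit means kilobits, so the unit is worth double-checking):

```sh
# token bucket filter: ~22Kbps rate with a 5k burst and 5k limit (per the setup above)
tc qdisc add dev eth0 root tbf rate 22kbps burst 5k limit 5k
```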

Hello @Jiajunn … leaving out Kubo bandwidth usage (DHT, Bitswap retrievals, etc.), ipfs-cluster-specific bandwidth corresponds mostly to pubsub and correlates heavily with the number of peers in the cluster.

In the last stable version, an ipfs-cluster-ctl health bandwidth command was added, so you can check for yourself.
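For example, to keep an eye on it while you run your tc experiments, something as simple as:

```sh
# refresh the cluster bandwidth stats every 60 seconds
watch -n 60 ipfs-cluster-ctl health bandwidth
```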

The good news is that a bunch of work to “fix” high bandwidth usage from pubsub was merged into master a while ago (gossipsub: optimize for diverse clusters with many peers by hsanjuan · Pull Request #2071 · ipfs-cluster/ipfs-cluster · GitHub). Apart from improving things out of the box (a lot), it exposes pubsub configuration values so that they can be tuned.

The bandwidth usage comes down to how each pubsub peer broadcasts a message and how long it takes for that message to reach all the peers. The most efficient approach is to broadcast from one peer to everyone at once, but as soon as your number of peers is in the thousands, this stops being an option. Increasing heartbeat intervals and similar tweaks help, but in the end the pubsub settings should be optimized for the characteristics of your cluster.
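Roughly, the exposed values live under the cluster section of service.json. The sketch below is only illustrative (key names and values written from memory, and the comments are not valid in the real file), so check the configuration reference for the exact ones:

```jsonc
{
  "cluster": {
    "pubsub": {
      // longer heartbeats and a smaller mesh degree mean less background
      // traffic, at the cost of slower message propagation
      "heartbeat_interval": "10s",
      "d_factor": 1,
      "history_gossip": 2,
      "history_length": 6,
      "flood_publish": false,
      "seen_messages_ttl": "30m"
    }
  }
}
```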

Hi @hector, appreciate the help! I will look into tuning the pubsub configuration values. I’ve been using tc qdisc add dev [interface] root tbf rate [rate] burst [burst] latency [latency] to limit the throughput of my IPFS Cluster node, to find the throughput at which it disconnects from the rest of the cluster. Currently, I check for disconnection by running ipfs-cluster-ctl peers ls and looking for a context error in the output (since that means the request timed out). Is there a more straightforward way to check whether a node is disconnected from the rest of the cluster?
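For reference, the check I run at the moment is essentially:

```sh
# crude liveness check: a peer that timed out shows up with an error in peers ls
ipfs-cluster-ctl peers ls | grep -i error
```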