Unconf: IPFS Cluster infra workshop

A deep dive into IPFS Cluster, scalability, and observability with @hsanjuan. Some topics:

  • What should we be doing to ensure our IPFS node(s) don’t fall over from load, either via direct requests or pubsub?
  • How can we detect a buildup of problems before they become critical?
  • How should we respond if a node does start to have trouble handling the load?


  • The cluster swarm
  • The configuration file
  • Cluster peer start
  • The API
  • The pinning process
  • The allocations
  • The adding process
  • The pin batching
  • The CRDT DAG
  • CRDT DAG sync
  • Pin Queue management
  • Adding a new peer
  • Skipping state sync
  • Removing an existing peer
  • Metrics dashboard
  • Host management, e.g. growing/shrinking safely
  • APIs
  • Problems, how to detect them, how to proceed:
    • Too many uploads
    • Faulty disk
    • Corrupted Badger datastore
    • How to remove a node from DNS
    • How to prevent allocations to a node
    • Caveats, what NOT to do when solving issues