`ipfs pin add` often hangs with a wantlist of ~7 CIDs

I’m running a private swarm at work to push disk images to lab equipment, and I’ve found that ipfs pin add often hangs with seemingly just a few items in the wantlist shown by ipfs stats bitswap. Typically, if I run a second instance of ipfs pin add, it completes within a few seconds and both ipfs pin add processes then return.
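
For context, the wantlist I am referring to is the one reported by these two read-only commands (the second is just a more focused view of the first):

ipfs stats bitswap      # per-node bitswap counters, including the current wantlist
ipfs bitswap wantlist   # only the CIDs this node is currently asking its peers for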

I’ve taken to looping over all my nodes to launch a second, and sometimes a third, ipfs pin add process to get things moving along faster. That seems wrong, but I’m not sure how to collect enough forensics to make a constructive bug report.

My environment

Private swarm on a 10/8 corporate network. Latency is typically < 10 ms. Throughput is typically ~800 MiB/s over the 10 Gbps links, with the notable exception of my laptop, which only manages ~3 MiB/s and acts as a seed for the update.

All the nodes are running kubo.linux-arm64 v0.31.1 on Raspberry Pis with kernel 6.6.20+rpt-rpi-v8.

Steps to reproduce

  1. Laptop: CID=$(ipfs add ./update.img.zst)
  2. Laptop: ipfs-cluster-ctl pin add $CID --name update.img.zst
  3. Wait for the two bootstrap nodes to report “PINNED” in ipfs-cluster-ctl status (see the polling sketch just after this list)
  4. Laptop: Launch an Ansible playbook to start IPFS nodes on a number of Raspberry Pi 4Bs
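
Step 3 amounts to polling ipfs-cluster-ctl status until both bootstrap peers report PINNED; a minimal sketch (it matches against the human-readable output, so treat it as illustrative):

# Wait until at least two cluster peers report the CID as PINNED
until [ "$(ipfs-cluster-ctl status "$CID" | grep -c ': PINNED')" -ge 2 ]; do
  sleep 5
done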

Each Raspberry Pi 4B will then:

  1. ipfs daemon --enable-gc
  2. ipfs pin add $CID
  3. ipfs get $CID -o /some/path/update.img.zst
  4. ipfs pin rm $CID

Typically (2) will hang on about 1/5 of the IPFS nodes, so I loop over them with ssh -t $SOMEHOST ipfs pin add $CID, which usually resolves it in ~5-10 s on each node.
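
The unsticking loop is nothing fancy; roughly this (stuck-nodes.txt is a placeholder for however I happen to collect the list of stuck hosts):

# Re-issue the pin on each stuck node; the original and the new pin add
# usually both return within a few seconds of this
for host in $(cat stuck-nodes.txt); do
  ssh -t "$host" ipfs pin add "$CID"
done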

How should I go about troubleshooting this? For example, which ipfs log level commands would be most appropriate?
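
For example, is it something along these lines? (The subsystem name is a guess on my part; ipfs log ls presumably lists the real ones.)

ipfs log ls                   # list the available logging subsystems
ipfs log level bitswap debug  # guessing that bitswap is the relevant subsystem
ipfs log tail                 # stream log output while reproducing the hang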

Sample stuck nodes and loop to unstick them

What are the chances that GC is running while trying to add? GC will lock the pinning system.

Do you mean that 1 of the 5 nodes hangs, but then when running ipfs pin add in some other node, the node that was hanging finishes pinning then?

Does the node hang indefinitely otherwise? If not, for how long does it hang?

Does the hanging node report being connected to the other nodes that have the content (ipfs swarm peers)?

Does the hanging node have connectivity to all peers that have the content?

In the hanging node, what does ipfs routing findprovs "cid" say? If some are found, is it connected to those peer ids?
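
Something like this should show it at a glance (here $CID is the pin that hangs):

# For each advertised provider of the CID, check whether we hold a connection to it
for p in $(ipfs routing findprovs "$CID"); do
  if ipfs swarm peers | grep -q "$p"; then
    echo "provider $p: connected"
  else
    echo "provider $p: NOT connected"
  fi
done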

I can’t rule out that this is a bitswap bug, though, and there are some improvements coming in the next release, but first we should rule out other things.

(Sorry for the delay in replying)

What are the chances that GC is running while trying to add? GC will lock the pinning system.

Slim, I think. I do launch with ipfs daemon --enable-gc but it is the first time the daemon has ever been launched on the node. In fact all the IPFS nodes are launching for the very first time, because our systems run from RAM disks (an OverlayFS over read-only boot media) and wipe the SSD where we store the IPFS repo on every boot.

Do you mean that 1 of the 5 nodes hangs, but then when running ipfs pin add in some other node, the node that was hanging finishes pinning then?

No, I run another ipfs pin add command on the stuck node and that usually unsticks it. For a while there are two (or more!) ipfs pin add processes running on the system. Typically the node has been stuck for >10 min before I SSH in, run ps fo pid,cmd -A to find the ipfs pin add command that is stuck, then copy and paste that command to launch a second instance; often both then exit successfully after ~5 seconds.

Does the node hang indefinitely otherwise? If not, for how long does it hang?

I believe it to be indefinite, but the longest I have waited has been a couple of hours.

Does the hanging node report being connected to the other nodes that have the content (ipfs swarm peers)?

Yes, bizarrely. Before I developed the not-quite-a-workaround of running another ipfs pin add command, I would use ipfs swarm peers to check, and try ipfs swarm connect /ip4/10.1.26.158/tcp/4001/p2p/12D...6ag, or indeed ipfs swarm disconnect followed by ipfs swarm connect. I don’t feel that it made much difference.

Does the hanging node have connectivity to all peers that have the content?

Yes. They are all on the same logical network and can contact each other AFAIK. Certainly I am not aware of any firewall rules that would block IPFS traffic on just some nodes in the same rack while allowing others. I can SSH and ping all the nodes from my laptop, and they can all SSH and ping each other.

In the hanging node, what does ipfs routing findprovs "cid" say? If some are found, is it connected to those peer ids?

I will check that. (Hopefully this afternoon, as I am about to do another deployment.)

I was able to reproduce this during this afternoon’s deployment. Please see this (unlisted) pastebin document: https://pastebin.com/eMSFXhR0

(My previous post also has a pastebin link, but may have been missed since it was rendered in a way that is easy to dismiss as an advert)

My random sampling suggests that the stuck nodes are indeed connected to other nodes which have all the data they want.

I reran the script that generated the above but this time appended an ipfs pin add command, and that unstuck the nodes: https://pastebin.com/zV7N0Ngx

Oh, and I should explain the lipfs in my script: it’s a simple wrapper to make it easy to work with a Docker-based IPFS node. The L is for “local”, as in private:

#!/bin/bash

# Launch IPFS if necessary
if [[ $(docker container inspect ipfs --format '{{.State.Running}}' 2>&- ) != "true" ]]; then
  # Not running - has it ever?

  if [[ $(docker container inspect ipfs --format '{{.State.Status}}' 2>&- ) != "" ]]; then
    echo "IPFS container is no longer running. Check 'docker logs ipfs' to diagnose or delete with 'docker rm ipfs' and try again." >&2
    exit 1
  fi

  sudo mkdir -p /external-ssd/ipfs
  sudo chown $(id -u):$(id -g) /external-ssd/ipfs

  # Note: Assumes `ipfs:latest` is present locally (because it's part of dbfOS)

  echo "Launching IPFS ..." >&2
  docker run \
    --detach \
    --user $(id -u):$(id -g) \
    --name ipfs \
    --network host \
    --volume /etc/ipfs/swarm.key:/swarm.key:ro \
    --volume /:/host \
    --volume /external-ssd/ipfs/:/data/ipfs/ \
    ipfs >/dev/null

  echo "Waiting for IPFS to become ready ..." >&2

  until curl --silent http://127.0.0.1:7070/ &>/dev/null; do
    if [[ "$(docker container inspect --format '{{.State.Running}}' ipfs)" != "true" ]]; then
      echo "IPFS does not appear to have become ready! Try 'docker logs ipfs' to diagnose." >&2
      exit 1
    fi
    sleep 1  # Don't busy-loop against the gateway while the daemon starts up
  done

  echo "IPFS service started" >&2
fi

exec docker exec -it -w "/host/$PWD" ipfs ipfs "${@}"
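
In other words the per-node steps listed earlier go through this wrapper, e.g.:

# The per-node sequence via the wrapper; a relative -o path works because the
# wrapper maps the current host directory to /host/$PWD inside the container
lipfs pin add "$CID"
lipfs get "$CID" -o update.img.zst
lipfs pin rm "$CID"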

The entrypoint of this custom Docker container is as follows; it checks for the private swarm key, initialises the repo, sets the bootstrap nodes and our other settings, and then launches the daemon:

#!/bin/sh
# Configure an existing IPFS repo

if [ ! -r /swarm.key ]; then
  echo "ERROR: /swarm.key not present or not readable! Did you remember to mount it from /etc/ipfs/swarm.key?" >&2
  exit 1
fi

cp -v /swarm.key /data/ipfs/swarm.key

export GOMAXPROCS=$(nproc)

ipfs init
ipfs config profile apply local-discovery
ipfs bootstrap rm all  # Do not use any public nodes in this swarm
ipfs bootstrap add /ip4/10.58.203.80/tcp/4001/p2p/12D3KooWMq2yauGRfCU5AUKsiajvtrhEt2TnFfiYbfpgTX9hDk2A # nc-b9-8-2.iaas.eu02.arm.com, a bare metal host
ipfs bootstrap add /ip4/10.7.76.84/tcp/4001/p2p/12D3KooWGiXNzrq22WThzCKqv3xwv5f46xQL9Pn7fTp6wavKfbjV # aarch64.noir.arm.com, a bare metal host
ipfs config Routing.Type dht  # Only use this swarm for finding peers and data, not public services
ipfs config Addresses.Gateway /ip4/127.0.0.1/tcp/7070  # Avoid default :8080, that's used by other services on dbfOS
ipfs config --json Gateway.PublicGateways '{"localhost":{"Paths":["/ipfs", "/ipns"], "UseSubdomains":false}}'  # Prevent the gateway service (:7070) from redirecting users to bafyhashstuff.localhost:7070/
ipfs config Datastore.StorageMax 100GB
# Defaults for `ipfs add`
ipfs config Import.HashFunction blake3    # Faster than SHA256 on RPi4B
ipfs config Import.UnixFSChunker buzhash  # Split on certain hash values, similar in function and intent to `gzip --rsyncable`

exec ipfs daemon --enable-gc  # exec so the daemon receives container stop signals directly

The custom image’s Dockerfile is also very simple, and most importantly sets LIBP2P_FORCE_PNET in the environment, to ensure our private/local swarm does not connect to the public internet:

# Note: ipfs/kubo:release doesn't seem to work on Raspbian (32-bit); it always fails on launch with:
#   /sbin/tini: error while loading shared libraries: libc.so.6: cannot open shared object file: No such file or directory
# So we will make our own
FROM alpine:3.20 AS downloader

ARG VERSION=v0.32.1

# Build arg to get OS + architecture for multi-arch building (defined by buildx)
ARG TARGETOS
ARG TARGETARCH

ADD https://github.com/ipfs/kubo/releases/download/${VERSION}/kubo_${VERSION}_${TARGETOS}-${TARGETARCH}.tar.gz /tmp/ipfs.tar.gz
RUN cd /tmp && tar xvf ipfs.tar.gz

FROM alpine:3.20

COPY --from=downloader /tmp/kubo/ipfs /usr/local/bin/ipfs

# Limit traffic to private swarms; those for which a shared secret
# (`swarm.key`) is defined. If the shared secret is not available `ipfs daemon`
# will fail to launch with the following error:
#
# Error: constructing the node (see log for full detail): privnet: private network was not configured but is enforced by the environment
ENV LIBP2P_FORCE_PNET=1

# Set our data directory
ENV IPFS_PATH=/data/ipfs
VOLUME /data/ipfs

# Install our configuration script, which runs every time the container
# starts: it performs `ipfs init`, applies our settings, and then launches the
# daemon
COPY entrypoint.sh /entrypoint.sh
RUN chmod 0755 /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
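
For reference, the image is built with docker buildx (which is what populates TARGETOS and TARGETARCH); roughly like this, with the tag matching what the wrapper script expects and the platform list only as an example:

docker buildx build --platform linux/arm64 --build-arg VERSION=v0.32.1 -t ipfs --load .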

Sorry, I forgot to follow up…

This could be a deadlock in Bitswap… we have been fixing some things that could cause your problem in the latest release candidate: 0.33.0-rc1

Can you try that one and see if the issue still persists?

The fact that it works when manually retrying (i.e. a new bitswap session, etc.) makes it likely that it is at least bitswap-related.

If it still hangs, let it hang for at least 15 minutes, then run ipfs diag profile while it is hanging and upload the result somewhere.
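
Something along these lines should do it (check ipfs diag profile --help for the exact flags):

# Run on the stuck node while the pin add is hanging, then share the resulting zip
ipfs diag profile --profile-time=30s --output=/tmp/stuck-pin-profile.zip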