While doing some testing I found an error when replicating data to a different IPFS node: “context canceled”. Here is an attached image of it:
After doing some research (BlockPut context canceled · Issue #1192 · ipfs/ipfs-cluster · GitHub and https://github.com/ipfs/go-ipfs/issues/6402) I found that the likely culprit is a misconfiguration of proxy headers in the nginx reverse proxy in front of the IPFS node. I want to add these headers:
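For context, the exact headers I mean are in the attachment above, but the settings usually discussed for an nginx reverse proxy in front of the IPFS API look roughly like this. This is only an illustrative sketch, not the actual attachment contents; the `location` path and upstream port are assumptions about a typical setup:

```nginx
# Illustrative sketch only: common nginx proxy settings for long-lived
# IPFS API requests. Path and upstream address are assumptions.
location /api/ {
    proxy_pass http://127.0.0.1:5001;   # assumed: IPFS API on localhost:5001
    proxy_http_version 1.1;             # keep-alive for streaming requests
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_request_buffering off;        # stream large uploads instead of buffering
    proxy_read_timeout 600s;            # avoid cutting off slow transfers
}
```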
to every node I create via the Kubernetes StatefulSet. I found that there is an easy way to configure IPFS node parameters via the config JSON file, which is what my cluster-setup-confmap.yaml uses:
# This is a custom entrypoint for k8s designed to run ipfs nodes in an appropriate
# setup for production scenarios.
mkdir -p /data/ipfs && chown -R ipfs /data/ipfs
# If a repo already exists, only clear a stale lock and skip initialization.
if [ -f /data/ipfs/config ]; then
  if [ -f /data/ipfs/repo.lock ]; then
    rm /data/ipfs/repo.lock
  fi
  exit 0
fi
ipfs init --profile=badgerds,server
ipfs config Addresses.API /ip4/0.0.0.0/tcp/5001
ipfs config Addresses.Gateway /ip4/0.0.0.0/tcp/8080
ipfs config --json Swarm.ConnMgr.HighWater 2000
ipfs config --json Datastore.BloomFilterSize 1048576
ipfs config Datastore.StorageMax 5GB
chown -R ipfs /data/ipfs
I want to add the nginx headers through this file, but I've had problems doing so. Can anyone help me set them, please? Or point me to a useful link for doing so!
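One caveat worth noting: nginx proxy headers live in the nginx configuration, not in the IPFS config file, so they cannot be set with `ipfs config`. What `ipfs config` can set are the daemon's own HTTP response headers. A minimal sketch of lines that could be appended to the entrypoint above, with placeholder CORS values (this configures the daemon, not nginx):

```shell
# Sketch: set the daemon's own API response headers via the IPFS config.
# These are not nginx proxy headers; the values below are placeholders.
ipfs config --json API.HTTPHeaders.Access-Control-Allow-Origin '["*"]'
ipfs config --json API.HTTPHeaders.Access-Control-Allow-Methods '["PUT", "POST", "GET"]'
```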
Chances are those context-canceled errors just mean timeouts: those ipfs peers were unable to pin the item in time. Perhaps the logs have more info, though.
I ran the logs command for k8s:
kubectl logs test-javi-ipfs-clusternode-7 ipfs-cluster
and for ipfs logs:
kubectl logs test-javi-ipfs-clusternode-7 ipfs
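In case it helps, a few `kubectl logs` variations can surface more context than a plain dump; the pod and container names below match my StatefulSet:

```shell
# Follow logs live, include the previous container run if it crashed,
# and limit output to the recent window.
kubectl logs -f test-javi-ipfs-clusternode-7 -c ipfs-cluster
kubectl logs --previous test-javi-ipfs-clusternode-7 -c ipfs
kubectl logs --since=10m test-javi-ipfs-clusternode-7 -c ipfs
```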
I think it’s the bug mentioned in Issue #1192. Thanks for your help!
No, #1192 was an issue when adding. You are pinning.
Your context-canceled error happens 2 minutes after the pins arrive, which is the default PinTimeout setting. I think your ipfs daemon is not well connected to the rest of the network and is unable to pin/fetch the content, so cluster cancels the operations when it sees no progress.
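If that two-minute timeout is the trigger, the knob is the IPFS connector's `pin_timeout` in the cluster peer's service.json. A sketch of the relevant fragment, with surrounding fields elided and the new value purely illustrative:

```json
{
  "ipfs_connector": {
    "ipfshttp": {
      "pin_timeout": "10m"
    }
  }
}
```

Raising the timeout only hides the symptom if the daemon genuinely cannot fetch the blocks, so it is worth fixing connectivity first.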
I see. I thought it involved the same process because it had the same error log, and this happens very rarely. From all the tests I did, it depends on the size of the file being pinned: if it's a big one, it happens more often. It has happened when adding big files through ipfs-cluster-ctl, and when the cluster automatically replicates data to newly added nodes. But I think the node is well connected, since it can pin normally the majority of the time. @hector thanks
@hector do you know how I can further debug this one? It is a really infrequent error.
You would need to find out why some pins stop making progress. I would set more verbosity on the IPFS logs and run some tests. Make sure that the ipfs daemons in the cluster are well connected (ipfs swarm peers) and are not dying or being restarted.
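The dying/restarting part is quick to check from kubectl; a small sketch (pod names as in the setup above):

```shell
# The RESTARTS column reveals containers that are dying and being restarted;
# describe shows the last termination reason (e.g. OOMKilled).
kubectl get pods
kubectl describe pod test-javi-ipfs-clusternode-7
```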
I see, how do I enable more log information? I saw something about DEBUG in ipfs-cluster. Is there a way to set this via the StatefulSet? @hector thanks
Start by setting IPFS_LOGGING to INFO (go-ipfs/environment-variables.md at master · ipfs/go-ipfs · GitHub).
It is not ipfs-cluster debugging; it's ipfs daemon debugging. Your problem is that ipfs is not able to fetch content.
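For the StatefulSet question: environment variables go on the container spec in the pod template. A minimal sketch, assuming the container is named `ipfs`:

```yaml
# Fragment of the StatefulSet pod template (container name is an assumption).
containers:
  - name: ipfs
    env:
      - name: IPFS_LOGGING
        value: "INFO"   # per the go-ipfs environment-variables doc
```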
After adding the logging to the new nodes, this is the error: it can’t increase the buffer size.
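Assuming this is the QUIC receive-buffer warning that go-ipfs emits when the kernel's UDP buffer limit is too low (I'm inferring that from the wording), it can be addressed on the node itself. One common pattern on Kubernetes is a privileged init container; this is a sketch, and the limit value is illustrative:

```yaml
# Sketch: raise the kernel UDP receive buffer before the ipfs container
# starts. Requires a privileged init container; the value is illustrative.
initContainers:
  - name: sysctl-buffers
    image: busybox
    securityContext:
      privileged: true
    command: ["sysctl", "-w", "net.core.rmem_max=2500000"]
```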
You need to figure out why the ipfs node does not download blocks: ipfs swarm peers, checking that it is connected, checking that the item you are trying to pin has providers, checking that it is possible to connect to those providers, etc.
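Those checks translate into a handful of stock ipfs CLI commands; `<cid>` and `<multiaddr>` are placeholders to fill in from the failing pin:

```shell
# Run inside the ipfs container. <cid> is the item that fails to pin,
# <multiaddr> is one of the provider addresses returned by findprovs.
ipfs swarm peers                 # is the daemon connected to anyone at all?
ipfs dht findprovs <cid>         # does anyone provide the content?
ipfs swarm connect <multiaddr>   # can we actually reach a provider?
```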
So, after doing more debugging I found the issue: my nodes are in fact being created badly. I found this error.
It seems the bootstrap is failing.
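If bootstrapping is the failure, two quick checks from inside the ipfs container can confirm it; these are standard ipfs CLI commands, shown as a sketch:

```shell
# Show the configured bootstrap peers; if the list is empty or wrong,
# restore the default set shipped with go-ipfs.
ipfs bootstrap list
ipfs bootstrap add --default
```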