More info needed: Go-ipfs ERROR bitswap providerquerymanager/providerquerymanager

I’m setting up a new app server running about 5 docker containers. One of them is Kubo/go-ipfs running this version:

Kubo version: 0.14.0
Repo version: 12

This is a cloud server, and it’s running the exact same Docker containers as an older server that I’m getting ready to retire. The older server consumes less than 1 GB of disk space, but the new server fills up its disk within a few hours, growing at roughly 1 GB per hour.

Through trial and error, I narrowed things down: I started from a fresh installation by wiping the data directories, then took down all of the Docker containers except IPFS.

When I look at the IPFS logs, I get a steady stream of these errors:

2022-10-05T18:59:25.789Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (12D3KooWLsSWaRsoCejZ6RMsGqdftpKbohczNqs3jvNfPgRwrMp2) for cid (bafybeiglr77fwb4ul6dwpin5adhma4huwspearnyifz6a4r7p7tqvlbzoy) not requested
2022-10-05T18:59:44.447Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC) for cid (bafybeihohqbfp4velio5leyi7meycljazuibfruckhptm56fv7qn4rsgh4) not requested
2022-10-05T18:59:44.474Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (12D3KooWDLYiAdzUdM7iJHhWu5KjmCN62aWd7brQEQGRWbv8QcVb) for cid (bafybeihohqbfp4velio5leyi7meycljazuibfruckhptm56fv7qn4rsgh4) not requested
2022-10-05T19:00:12.128Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (12D3KooWLSMVRxtFrRWofS6MjysgWnPh7iiFEGYeEAeBQceNrf4G) for cid (bafybeic6szf64ijmnpoksr7mkw4bubpkshcp3ojje6y3tqwsjt4xr7u2t4) not requested
2022-10-05T19:00:12.143Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (12D3KooWBeb4VBQ7mfYEmLKkjcgvtfo6hZHCtyWdR2p8YeWFYD8P) for cid (bafybeic6szf64ijmnpoksr7mkw4bubpkshcp3ojje6y3tqwsjt4xr7u2t4) not requested
2022-10-05T19:00:12.604Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (12D3KooWNybYMN9JBAspGKeanZNTWUsnjaZc6ZUz55YZbgykwXwQ) for cid (bafybeighy33e65gdznjmpu6u4vlikzdiv5zxhzxpxganbbjr5ncg4sqxp4) not requested
2022-10-05T19:00:12.725Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (12D3KooWCTQbMuB6yZq9155kEXeuN2gPTHHtd2zZSzaXAMeUvPC9) for cid (bafybeighy33e65gdznjmpu6u4vlikzdiv5zxhzxpxganbbjr5ncg4sqxp4) not requested
2022-10-05T19:00:17.398Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (12D3KooWNybYMN9JBAspGKeanZNTWUsnjaZc6ZUz55YZbgykwXwQ) for cid (bafybeifyvlmqchivnbwsk6smmbw4ncfqs5gncyr5zhd2rplhuyeoa6jkte) not requested
2022-10-05T19:00:17.399Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (12D3KooWCTQbMuB6yZq9155kEXeuN2gPTHHtd2zZSzaXAMeUvPC9) for cid (bafybeifyvlmqchivnbwsk6smmbw4ncfqs5gncyr5zhd2rplhuyeoa6jkte) not requested
2022-10-05T19:01:26.010Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC) for cid (bafybeictjwh2f2qol2qdzijdgprny2dobu2nybdxkvpvhlmm3fytprxlf4) not requested
2022-10-05T19:05:36.896Z	ERROR	bitswap	providerquerymanager/providerquerymanager.go:338	Received provider (QmQzqxhK82kAmKvARFZSkUVS6fo9sySaiogAnx5EnZ6ZmC) for cid (bafybeihee2uqjw6fcb3mshmbsdchnwtfd32ncu3ec4pgpl3iik6epvllle) not requested
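(Side note: these lines can at least be silenced per subsystem while debugging, though that obviously doesn’t fix the underlying problem. Assuming the logging subsystem is the “bitswap” shown in the second column of each log line, something like this should work:)

    # list the available logging subsystems
    ipfs log ls

    # hide these errors by raising the bitswap subsystem's log threshold
    ipfs log level bitswap fatal

    # or set it at daemon start through the go-log environment variable
    GOLOG_LOG_LEVEL="error,bitswap=fatal" ipfs daemon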

After doing some additional debugging, I found that the difference between the two servers was the Bootstrap setting in the config file. The old server uses my own custom bootstrap servers in place of the defaults; the new server still had the default Bootstrap section in its config.

I’m still debugging and testing, but after removing the Bootstrap entry from the config file and waiting about 30 minutes for other IPFS nodes on the network to forget about my node, I fired up the new server again and the hard drive no longer appears to be filling up. The bitswap errors in the logs have also decreased significantly.
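For reference, this is roughly how I manipulate the Bootstrap list with the ipfs CLI (the multiaddr in the last command is only a placeholder for one of my own bootstrap nodes):

    # show the current Bootstrap list
    ipfs config Bootstrap

    # drop the default bootstrap peers entirely (what I tried on the new server)
    ipfs bootstrap rm --all

    # or point the node at a custom bootstrap server instead (placeholder multiaddr)
    ipfs bootstrap add /dns4/bootstrap.example.com/tcp/4001/p2p/12D3KooW...

I restart the daemon afterwards so it reconnects from scratch.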

I’ve encountered this problem before, and I’d like to understand the root cause. It appears to have something to do with the default Bootstrap nodes. Every few months I stumble into this issue. In the past I’ve gone as far as running my own private IPFS network to avoid it, and lately I’ve come to suspect the Bootstrap settings are the source, but replacing the bootstrap nodes doesn’t completely stop the issue.
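For context, the private-network setup I mean is just a shared pre-shared key in every node’s repo, along these lines (assuming the default ~/.ipfs repo path):

    # generate a swarm pre-shared key and drop it into the repo
    printf '/key/swarm/psk/1.0.0/\n/base16/\n%s\n' "$(openssl rand -hex 32)" > ~/.ipfs/swarm.key

    # copy the same swarm.key to every node in the private network,
    # then force private-network mode so the daemon refuses to run without it
    export LIBP2P_FORCE_PNET=1
    ipfs daemon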

I tried to pull up some of those CIDs using gateways, like this attempt, but I get 429 errors. That’s strange.

The only other references I was able to find on this topic were this GitHub issue and this discussion thread, neither of which explains what is happening.

Does anyone know what is happening under the hood? Can you give an explanation? Any suggestions on how to further debug the hard drive filling up?

I’d still like to understand more about the source of those bitswap errors: where they come from, what they mean, and how they are tied to the increase in hard drive usage.

But I’ve got an additional data point to add to the discussion:

To further debug the issue, I reconfigured the old server. The old server did not have the Gateway port open in the firewall, which I think largely explains its stability.

Once I opened port 8080 for the Gateway and assigned it to the domain name, the bitswap errors started immediately. That seems to indicate the errors are directly tied to use of the Gateway.
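Concretely, exposing the Gateway amounted to roughly this on the IPFS side (in my case the container also needs the port published, and gateway.example.com stands in for my real subdomain):

    # listen on all interfaces on port 8080 instead of only localhost
    ipfs config Addresses.Gateway /ip4/0.0.0.0/tcp/8080

    # optionally tell Kubo which hostname should behave as a public gateway
    ipfs config --json Gateway.PublicGateways '{
      "gateway.example.com": { "Paths": ["/ipfs", "/ipns"], "UseSubdomains": false }
    }'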

I then changed the subdomain that the Gateway was associated with, and the bitswap errors (and the hard drive growth) stopped. That leads me to believe those bitswap errors are the result of some kind of spam attack???

If that assessment is correct, how can I protect my IPFS Gateway from that attack? The link I provided in the first post, ipfs.w3s.link, returns a 429 error. Presumably that is their protection against this kind of abuse? How can I replicate it?

I route access to the Gateway through nginx, so I can add rate limiting there. But if there is some sort of blacklist, abuse-detection, or rate-limiting feature built into the IPFS node to mitigate these kinds of disruptions, I’d like to know.
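In case it’s useful to anyone else, the nginx side would be a standard limit_req setup, something like this (the zone name, rate, and hostname are arbitrary placeholders):

    # in the http {} block: track clients by IP, allow roughly 10 requests/second each
    limit_req_zone $binary_remote_addr zone=ipfs_gw:10m rate=10r/s;

    server {
        listen 80;
        server_name gateway.example.com;   # placeholder for my real subdomain

        location / {
            # allow short bursts, answer everything beyond that with 429
            limit_req zone=ipfs_gw burst=20 nodelay;
            limit_req_status 429;
            proxy_pass http://127.0.0.1:8080;
        }
    }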

Hey there, do you have any updates on this? I’m seeing something similar

No. I never figured out the source of those errors. That’s why I was posting here.

I did solve the root problem by setting the Gateway.NoFetch config flag. That stops the Gateway from fetching and serving content that isn’t already stored locally (e.g. the content I’ve pinned myself), which removed the massive load the node was taking on to retrieve and serve other people’s content.
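For anyone landing here later, this is how I set it, followed by a daemon restart:

    # serve only content that is already in the local repo;
    # don't go out to the network for gateway requests
    ipfs config --json Gateway.NoFetch true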

Right, good to know. I was about to scale the HighWater value or something like that, which would probably just take more time before the service hits an OOM.
I’ll try your solution, thanks!
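(For reference, the value I meant is, I believe, the connection manager’s watermark under Swarm.ConnMgr; the numbers here are just examples:)

    # raise the connection manager watermarks (example values only)
    ipfs config --json Swarm.ConnMgr.LowWater 300
    ipfs config --json Swarm.ConnMgr.HighWater 600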