How to diagnose file not propagating to other Gateways?

I have a set of several small files that I added to my local Kubo 0.23 node, but they don’t seem to be propagating through the network to other gateways. An example file from this set is bafybeihzdl3qzygoqrxm2uapjcx4meix6dag5kqydgpybinx26eu2bl6cm. I have my node set up with Cloudflare’s IPFS nodes as peers (using the “Peering > Peers” config setting), and viewing that CID on Cloudflare’s IPFS gateway works (which proves my node is able to get some data out at least).

But trying to call up that same CID on ipfs.io and dweb.link just times out (504 error).
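For anyone wanting to repeat this check without a browser, here is a minimal Python sketch that asks each gateway for the CID and reports the HTTP status. The path-style URL layout (`/ipfs/<cid>`), the function names, and the 60-second timeout are my own assumptions for illustration, not anything the gateways document in this thread.

```python
import urllib.error
import urllib.request

# Public gateways named in this thread; path-style URLs are an assumption.
GATEWAYS = ["https://ipfs.io", "https://dweb.link"]


def gateway_url(gateway: str, cid: str) -> str:
    """Build a path-style gateway URL for a CID."""
    return f"{gateway.rstrip('/')}/ipfs/{cid}"


def probe(gateway: str, cid: str, timeout: float = 60.0) -> str:
    """Return the HTTP status as a string, or a short error description."""
    request = urllib.request.Request(gateway_url(gateway, cid), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return str(response.status)
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx answers, e.g. the 504 described above.
        return str(err.code)
    except OSError as err:
        # Covers URLError, connection failures, and socket timeouts.
        return f"error: {err}"
```

Calling `probe(gw, cid)` for each entry in `GATEWAYS` gives a quick side-by-side view of which gateways can currently resolve the CID.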

The web UI for my node shows it has about 160 peers, and has some in/out traffic, but only a few KiB/sec on the gauges. What else can I do to figure out why these files aren’t discoverable by the big public gateways, even after my node’s had the files for over 24 hours now?

What does https://ipfs-check.on.fleek.co/ say?

I’m not sure what I should put in the “Multiaddr” field there to diagnose that further. If I put in my own node’s peer ID, it indicates the CID is present on that node, but also says “Could not find the multihash in the dht”.

Your own peerid is good, the page says:

Multihash not advertised in the dht. Your machine has not advertised that it has the given content in the IPFS Public DHT. This means that other machines will have to discover that you have the content in some other way (e.g. pre-connecting to you optimistically, pre-connecting to you since related content is already advertised by you, some rendezvous service, being on the same LAN, etc.). If using go-ipfs consider enabling the Accelerated DHT Client, which will advertise content faster and in particular should enable you to continue to republish your advertisements every 24hrs as required by the network.

Have you tried enabling the accelerated DHT client as suggested? :slight_smile:
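A rough way to repeat the “is it advertised in the DHT” check from the command line is to ask the DHT for providers of the CID and see whether your peer ID is among them. The sketch below wraps `ipfs routing findprovs`; the helper names are hypothetical, and I am assuming the command prints one provider peer ID per line. It is probably best run from a second node, since (as far as I understand) your own node may answer from its local records.

```python
import subprocess


def findprovs_command(cid: str) -> list[str]:
    """argv that asks the DHT which peers provide a CID."""
    return ["ipfs", "routing", "findprovs", cid]


def parse_providers(output: str) -> set[str]:
    """Collect peer IDs, assuming one per non-empty output line."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def is_advertised_by(peer_id: str, findprovs_output: str) -> bool:
    """True if peer_id appears among the providers that were found."""
    return peer_id in parse_providers(findprovs_output)


def check(cid: str, peer_id: str, timeout: float = 120.0) -> bool:
    """Run the lookup with the local ipfs binary and test for peer_id."""
    result = subprocess.run(
        findprovs_command(cid),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    return is_advertised_by(peer_id, result.stdout)
```

If `check(cid, your_peer_id)` stays false long after adding the file, that matches the “multihash not advertised” message above.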

Thanks @Jorropo; that link to “Accelerated DHT Client” doesn’t jump to a specific section on that page, and searching the page it links to for “Accelerated DHT”, it only appears a few times in the page, under the “Optimistic Provide” feature. Is that a rename of the same feature? I turned it on for now, to see if that has any effect.

For distributing/advertising specific CIDs, there’s no difference between “pinning a CID” and “having the item in your ‘Files’”, correct?

Optimistic provide is a different feature from the accelerated DHT client. I see what happened: the accelerated DHT client used to be experimental and now it’s not. https://github.com/ipfs/kubo/blob/master/docs/config.md#routingaccelerateddhtclient
I’ll go update the link on ipfs-check, thanks.

Aha! So, from that updated documentation, setting “Routing > AcceleratedDHTClient” to true will make my local node more aggressive about advertising content it knows about. That would include things that are in “Files”, things that are pinned, and things that are locally cached from browsing global IPFS content through it on my home network. I’ll try turning that on, and also setting “Reprovider > Strategy” to “pinned”, to see how that works. For my use case, the files I want to make sure are advertised are the ones I explicitly added; the content my node only knows about from casual browsing doesn’t need to be advertised as aggressively.
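For reference, the two settings mentioned here can be sketched as below: one helper patches a config dictionary, the other lists what I believe are the equivalent `ipfs config` invocations. The function names are made up for illustration, and I am assuming the daemon needs a restart before the changes take effect.

```python
import json


def apply_provide_settings(config: dict, strategy: str = "pinned") -> dict:
    """Return a copy of a Kubo config dict with the two settings applied."""
    updated = json.loads(json.dumps(config))  # deep copy of JSON-compatible data
    updated.setdefault("Routing", {})["AcceleratedDHTClient"] = True
    updated.setdefault("Reprovider", {})["Strategy"] = strategy
    return updated


def config_commands(strategy: str = "pinned") -> list[list[str]]:
    """CLI invocations assumed to make the same changes."""
    return [
        ["ipfs", "config", "--json", "Routing.AcceleratedDHTClient", "true"],
        ["ipfs", "config", "Reprovider.Strategy", strategy],
    ]


if __name__ == "__main__":
    # Show only the keys touched here, starting from an empty config.
    print(json.dumps(apply_provide_settings({}), indent=2))
```

Leaving `strategy` at “all” instead of “pinned” would keep the default behaviour of advertising everything the node holds.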

(Side thought: it would be great to have a “Reprovider > Strategy” option of “files”, to allow anything added into the “Files” virtual folder structure to be broadcast, rather than me having to also go through and explicitly pin each subfolder I care about…)

Not really more aggressive: both clients try equally hard. The algorithm changes to be more efficient: it does fancy batching of operations in the keyspace and is ~6 million times faster.

I completely agree, we have an issue open about that on Kubo.

Note that the default, all, advertises everything: files, directories, and the internal Merkle data structures of each file.

So it seems that the base issue was that my node didn’t “advertise to the DHT”, which the ipfs-check site identified. I switched on the Accelerated DHT Client, and that appears to have fixed the issue for the files whose resolution I was actively monitoring.

But if this sort of situation happens again, it appears there aren’t many diagnostic tools available to probe/debug whether my node was even trying to advertise a specific CID and, if it was, whether it was being rejected by peers for some reason. There’s the ipfs stats dht command, but that evaluates peers, not CIDs.

And now that I’ve had “Accelerated DHT client” running for a day, the ipfs stats dht report shows 16,000 peers, but the vast majority of them have “last useful” and “last queried” set to “never”, and have no “Agent Version”. The web UI only reports ~90 peers; are those the “active” peers from that DHT list?

Are there any debugging commands I can run on my peer to inspect the flow of advertisements and received broadcasts of individual CIDs, that could be used to see if knowledge of particular CIDs is flowing through the system?

In Kubo v0.22.0 and later, the node first advertises 128 CIDs and measures how long that took. If it takes too long, it prints a message asking you to activate the accelerated DHT client.
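To illustrate the kind of arithmetic behind such a check, here is a rough Python sketch that extrapolates from a timed sample of 128 provides to the full set of CIDs. This is not Kubo’s actual code: the function names and the comparison against the 24-hour republish window (taken from the ipfs-check message quoted earlier) are my assumptions.

```python
from datetime import timedelta


def estimated_reprovide_time(
    sample_duration: timedelta, sample_size: int, total_cids: int
) -> timedelta:
    """Extrapolate the time to advertise total_cids from a timed sample."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    per_cid = sample_duration / sample_size
    return per_cid * total_cids


def should_suggest_accelerated_client(
    sample_duration: timedelta,
    total_cids: int,
    reprovide_window: timedelta,
    sample_size: int = 128,
) -> bool:
    """True if a full pass would not fit in the window (assumed threshold)."""
    estimate = estimated_reprovide_time(sample_duration, sample_size, total_cids)
    return estimate > reprovide_window
```

For example, if the 128 sample provides took 64 minutes (30 seconds each), 10,000 CIDs would need about 83 hours, well past a 24-hour window.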

There are two things here.
First, “last useful” and “last queried”: that is a bug in the statistics. The bucket-based client can know about peers it has not queried; the accelerated DHT client does not do that, so all the peers in ipfs stats dht have been scanned. To fix this bug we would need to add a stats field that records the last scan and set it for all peers.

Second, the web UI lists open connections. You don’t need an open connection to every peer in the DHT; you will open one as needed.

You can run ipfs stats provide, which gives you aggregate stats. Sadly, the advertisements are not a flow, and we don’t track individual CIDs either, so adding more tracking than this is nontrivial.
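Since there is no per-CID tracking, one possible workaround is to announce a specific CID by hand and then look at the aggregate stats. The sketch below wraps `ipfs stats provide` and `ipfs routing provide`; the helper names are hypothetical, and I have not assumed anything about the output format, so it is returned as raw text.

```python
import subprocess


def stats_provide_command() -> list[str]:
    """argv for the aggregate provide statistics."""
    return ["ipfs", "stats", "provide"]


def manual_provide_command(cid: str) -> list[str]:
    """argv that asks the node to announce one CID to the DHT."""
    return ["ipfs", "routing", "provide", cid]


def run(argv: list[str], timeout: float = 300.0) -> str:
    """Run a command; return stdout on success, stderr otherwise."""
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )
    return result.stdout if result.returncode == 0 else result.stderr
```

Running `run(manual_provide_command(cid))` followed by `run(stats_provide_command())` at least shows whether a provide completed and roughly how long provides are taking.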