The DHT is unable to fetch multiaddresses for certain peers when a private network is scaled up (both in number of nodes and in number of DHT transactions). Because of this we face 2 issues:
1. It is unable to connect to certain nodes, giving the error “dial backoff” and/or “i/o timeout”.
2. It is unable to fetch the providers for a given key/contentID for certain nodes, i.e. it gives an empty response on the dht.FindProvidersAsync(key) call even when dht.Provide(key) has been called for these nodes.
For (1) we tried to fetch the multiaddress from the DHT using the dht.FindPeer API, but it gave the error “Routing: Not Found”.
In both cases all peers are online, but the issue still persists when scaling the network up. On a smaller network it works fine for any number of calls to the DHT.
We are running a network of 2000 nodes, with around 350-450 connections (as per the host.Network().Peers() API call). We did some local testing and found that the affected nodes were dialable from other nodes as well.
This sounds like those nodes are not reachable or not running.
Any chance the records have expired? (They need to be republished every 12h.)
Anything relevant in logs?
After checking the most obvious explanations, you might need to look closely at debug logs from the node that cannot find routing and from the node that it is trying to find routing to, and make sure that the latter is still listening, etc.
We tried to print the output of “GetClosestPeers”, which returned the NodeIds, and then we tried to use “FindPeer” to get the connection details of those nodes, but the Address Info was missing. How come the NodeId is being retrieved but the Address Info is not?
It also happens for records that were provided only around 30m-1h earlier (not all of them, but some).
The main thing I would check is what the routing tables of your peers look like, and whether those routing tables are filled to about the right level (i.e. you should see around Min(N/2^bucketNum, bucketSize) peers per bucket, where N is the number of nodes in the network).
Some code in go-ipfs that prints out stats like this is stat_dht.go in the ipfs/go-ipfs repository on GitHub (I highlighted the areas that deal with getting the peers out of the routing table and figuring out which buckets they are from).
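If you want to see the bucketing idea without reading the go-ipfs code, the following is a minimal standalone sketch, not the go-ipfs implementation. It assumes peers are placed by the common prefix length of the SHA-256 hashes of their IDs, and it uses made-up string IDs (`self`, `peer-0`, ...) in place of real peer IDs; `commonPrefixLen` and `bucketHistogram` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
)

// commonPrefixLen returns the number of leading bits shared by two 32-byte keys.
func commonPrefixLen(a, b [32]byte) int {
	for i := range a {
		if x := a[i] ^ b[i]; x != 0 {
			return i*8 + bits.LeadingZeros8(x)
		}
	}
	return len(a) * 8
}

// bucketHistogram counts peers per common-prefix-length bucket relative to self.
// ASSUMPTION: IDs are hashed with SHA-256 before comparison; plain strings
// stand in for real peer IDs here.
func bucketHistogram(self string, peers []string) map[int]int {
	selfKey := sha256.Sum256([]byte(self))
	hist := make(map[int]int)
	for _, p := range peers {
		hist[commonPrefixLen(selfKey, sha256.Sum256([]byte(p)))]++
	}
	return hist
}

func main() {
	// Simulate a 2000-node network with synthetic IDs.
	peers := make([]string, 0, 2000)
	for i := 0; i < 2000; i++ {
		peers = append(peers, fmt.Sprintf("peer-%d", i))
	}

	hist := bucketHistogram("self", peers)
	for b := 0; b < 12; b++ {
		fmt.Printf("bucket %2d: %d peers\n", b, hist[b])
	}
}
```

Note that this histogram counts every peer in the network, whereas a real routing table caps each bucket at the bucket size. To compare against a live node, feed in the peer IDs actually present in its routing table instead of the synthetic ones.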