We are running a private IPFS swarm. Many of the connected devices are developer machines that are not running constantly. There are also some Android devices that use `--routing=dhtclient` to reduce network bandwidth, so these act only as DHT clients and do not serve DHT requests. There is one node running on an EC2 instance that is permanently available (soon to be upgraded to an IPFS cluster).
We are having problems with DHT resolution. When we add some data on a machine `a` that is directly connected to `b`, resolution works. But if we have nodes `a` and `b` that are both behind a NAT and can only talk via an intermediary node, hash resolution does not work.
When troubleshooting this, I tried using `ipfs dht findprovs <the-hash>` and never got any answer, even for hashes that can be resolved. So it seems that the DHT does not work at all in our private swarm for some reason, and the only reason things usually work is that nodes talk directly to each other.
Is there some minimum number of permanently available nodes that is required for the DHT to work? What can we do to fix this?
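In case it helps anyone reproduce this, the checks can be sketched as a small shell function. This is only a sketch: `check_dht` is a hypothetical helper name, the hash and peer ID arguments are placeholders, and it assumes the `ipfs` CLI is on the PATH with a running daemon on the node where it is invoked.

```shell
# check_dht HASH PEER_ID
# Hypothetical helper: runs the basic DHT checks from one node (for example b).
# HASH is the content hash added on the other node, PEER_ID is that node's peer ID.
check_dht() {
    hash="$1"
    peer="$2"

    # Peers this node is currently connected to
    ipfs swarm peers

    # Ask the DHT which peers provide the hash
    ipfs dht findprovs "$hash"

    # Ask the DHT for the known addresses of the other node
    ipfs dht findpeer "$peer"
}

# Example call (placeholder values, not real identifiers):
#   check_dht <the-hash> <peer-id-of-a>
```

If `findprovs` prints a provider but `findpeer` returns only private addresses for it, that would point at a connectivity problem rather than a DHT problem.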
Update: I looked into this some more, and it turns out that the DHT is working fine. `a` adds something, and after a short time `ipfs dht findprovs <hash>` on `b` returns the node ID of `a`. But even then I am still not able to get the content for the hash on machine `b`. If I get the content on a machine that is a peer of both, resolution suddenly works on `b`.
So even if both `a` and `b` are behind a NAT, they should be able to communicate with each other, right?
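One thing that might be worth trying is an explicit connection through the intermediary using circuit relay. The sketch below assumes a go-ipfs release that ships circuit relay and that the intermediary has relay hopping enabled (as far as I know, the `Swarm.EnableRelayHop` config key in 0.4.x releases). `relay_addr` is a hypothetical helper, and the peer IDs are placeholders.

```shell
# relay_addr RELAY_ID TARGET_ID
# Hypothetical helper: builds a circuit relay multiaddr that reaches
# TARGET_ID through RELAY_ID (address form used by go-ipfs 0.4.x).
relay_addr() {
    printf '/ipfs/%s/p2p-circuit/ipfs/%s\n' "$1" "$2"
}

# Example on b (placeholder peer IDs, assumes the relay is already a peer of both):
#   ipfs swarm connect "$(relay_addr <relay-peer-id> <peer-id-of-a>)"
```

If that connection succeeds and the content then resolves on `b`, the problem is NAT traversal between the two nodes and not the DHT.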
Update 2: we have a hunch that the cause of the issues is one of our team members being behind a double NAT. It seems that this ticket applies: https://github.com/ipfs/go-ipfs/issues/2879
Is a solution for this in the works? Any info we could provide to debug this?