Dht.findpeer fails when finding a node behind NAT

I have two nodes. One has a public IP, so I call it the public node. The other is behind a NAT device, so I call it the private node.

Both nodes use the same config file: the DHT mode is set to ModeAuto, and AutoRelay and Relay are both set to true.

When I use dht.findpeer to find the peer addr of the public node, it usually returns quickly with the correct address.

But when I try to find the peer addr of the private node, it takes a long time and returns a “routing: not found” error. To confirm that the relay is working, I printed the addr on the private node:
/ip4/45.142.178.1/udp/4001/quic/p2p/12D3KooWMJ8DryrNd4Xyo1esbu6ayCQYFTLeTiSkduz1AMLcLHbq/p2p-circuit
That suggests the relay is working on the private node.
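For anyone who wants to check this programmatically rather than by eye, here is a minimal Go sketch using only the standard library. It treats the address as a plain string and looks for a `/p2p/<relayID>/p2p-circuit` sequence. The helper name `relayOf` is hypothetical and not part of go-libp2p; a real program would more likely parse the address with a multiaddr library.

```go
package main

import (
	"fmt"
	"strings"
)

// relayOf is a hypothetical helper (not a go-libp2p API). It reports whether
// addr contains a "/p2p/<id>/p2p-circuit" sequence and, if so, returns <id>,
// i.e. the peer ID of the relay the address goes through.
func relayOf(addr string) (string, bool) {
	parts := strings.Split(addr, "/")
	for i, p := range parts {
		if p == "p2p-circuit" && i >= 2 && parts[i-2] == "p2p" {
			return parts[i-1], true
		}
	}
	return "", false
}

func main() {
	// The address printed on the private node in this thread.
	addr := "/ip4/45.142.178.1/udp/4001/quic/p2p/12D3KooWMJ8DryrNd4Xyo1esbu6ayCQYFTLeTiSkduz1AMLcLHbq/p2p-circuit"
	relay, ok := relayOf(addr)
	fmt.Println("relayed:", ok, "relay peer:", relay)
}
```

Note that this only shows the node advertises a relay address; it does not prove that other peers can actually dial through that relay.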

The problem also occurs when I set the public node’s listenAddr to “127.0.0.1:4005”.

So, I have 2 questions:

  1. Why do they behave differently? As far as I know, the peer addr is queried through the DHT, and nodes without a public IP can also join the DHT. So why can’t they be found?

  2. Why does listenAddr also affect the result?

Thanks! @stebalien

What version of go-ipfs are you using? If you’re not using v0.9.0 try that and see if it resolves your issue.

This might be the same issue as “Several questions re discovery of nodes under NAT”.

Thanks. It seems to work!

But I always need to call findpeer two or three times to get a result.
I read the code to try to understand how it works.
When a node calls dht.findpeer, it:

  1. picks the K closest peers to the ID from its routing table
  2. asks each of these peers to return the K DHT server peers closest to the ID (call them peers2)
  3. asks each of peers2 to return the K DHT server peers closest to the ID.
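To make the steps above concrete, here is a toy Go sketch of a generic Kademlia-style iterative lookup. It is not the go-libp2p-kad-dht code: the 8-bit IDs, the `network` map, `demoNetwork`, and the stopping rule (keep going until the closest known peers have all been queried) are assumptions made for illustration. The point it shows is that the target is only found if some queried server still has it in its table.

```go
package main

import (
	"fmt"
	"sort"
)

// ID is a toy 8-bit identifier; real DHT keys are much longer.
type ID uint8

// network maps each node to the peers in its routing table (made-up data).
type network map[ID][]ID

// closest returns up to k of peers, ordered by XOR distance to target.
func closest(peers []ID, target ID, k int) []ID {
	out := append([]ID(nil), peers...)
	sort.Slice(out, func(i, j int) bool { return out[i]^target < out[j]^target })
	if len(out) > k {
		out = out[:k]
	}
	return out
}

// lookup repeatedly queries the k closest known peers that have not been
// queried yet, until the target shows up in a reply or nothing is left to ask.
func lookup(net network, self, target ID, k int) (rounds int, found bool) {
	seen := map[ID]bool{self: true}
	queried := map[ID]bool{self: true}
	known := []ID{}
	learn := func(peers []ID) {
		for _, p := range peers {
			if !seen[p] {
				seen[p] = true
				known = append(known, p)
			}
		}
	}
	learn(net[self])
	for !seen[target] {
		var toQuery []ID
		for _, p := range closest(known, target, k) {
			if !queried[p] {
				toQuery = append(toQuery, p)
			}
		}
		if len(toQuery) == 0 {
			return rounds, false // closest known peers exhausted
		}
		rounds++
		for _, p := range toQuery {
			queried[p] = true
			learn(closest(net[p], target, k))
		}
	}
	return rounds, true
}

// demoNetwork builds a small hypothetical topology where only 0xE0 knows 0xF0.
func demoNetwork() network {
	return network{
		0x00: {0x80, 0x40},
		0x80: {0xC0, 0x00},
		0x40: {0x00},
		0xC0: {0xE0},
		0xE0: {0xF0, 0xC0},
	}
}

func main() {
	net := demoNetwork()
	rounds, found := lookup(net, 0x00, 0xF0, 2)
	fmt.Println("target still in 0xE0's table:", found, "rounds:", rounds)

	net[0xE0] = []ID{0xC0} // 0xE0 drops the target from its table
	rounds, found = lookup(net, 0x00, 0xF0, 2)
	fmt.Println("target dropped by 0xE0:", found, "rounds:", rounds)
}
```

In this toy model the number of rounds depends on the topology rather than being fixed, and the lookup fails as soon as the one server that knew the target forgets it.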

It only does three rounds of iterative queries, so if we are unlucky, we won’t find the target peer? Do we then need to try more times?
Also, if a NATed node wants to be easier to find, is it better for it to connect to more DHT servers?

Not really. The DHT is a structured network, so being connected to nodes beyond the ones you need shouldn’t really help much.

It’s possible that the node behind a NAT keeps getting its connections pruned because it is mostly useless to the server node. It’s possible that we either need servers to keep information about clients around a little longer or need clients to reconnect more frequently to their designated servers (i.e. the K peers closest to their ID in the Kademlia space). It’s also possible that this problem is exacerbated if your target peer’s address changes a lot (i.e. your ISP frequently changes your IP address).

I haven’t heard too many reports of this, so it’s also possible that dialing your peer is failing for other reasons. ipfs dht findpeer doesn’t just find the peer’s address, but also tries to connect to it, which means it could fail if for some reason the peer was difficult to talk to (overwhelmed, network dropping packets, etc.).

One way you could check whether you’re connected to your 20 closest peers in Kademlia space is to run ipfs dht query <yourPeerID> to get your closest peers and then periodically run ipfs swarm peers and see how many of those peers you’re connected to. You shouldn’t need to be connected to them all the time, but understanding how frequently you’re connected could help.
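If you want to automate that comparison, a minimal Go sketch could look like the following. It assumes you have already captured the two command outputs as lines of text, that ipfs dht query prints one peer ID per line, and that ipfs swarm peers prints one multiaddr per line ending in /p2p/<peerID>. The helper names and the peer IDs in main are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// peerIDFromAddr returns the peer ID after the last "/p2p/" (or legacy
// "/ipfs/") component of a multiaddr string, or "" if there is none.
func peerIDFromAddr(addr string) string {
	parts := strings.Split(strings.TrimSpace(addr), "/")
	for i := len(parts) - 2; i >= 0; i-- {
		if parts[i] == "p2p" || parts[i] == "ipfs" {
			return parts[i+1]
		}
	}
	return ""
}

// countConnected reports how many of the closest peer IDs also appear
// among the currently connected swarm addresses.
func countConnected(closestIDs, swarmAddrs []string) int {
	connected := make(map[string]bool)
	for _, a := range swarmAddrs {
		if id := peerIDFromAddr(a); id != "" {
			connected[id] = true
		}
	}
	n := 0
	for _, id := range closestIDs {
		if connected[strings.TrimSpace(id)] {
			n++
		}
	}
	return n
}

func main() {
	// Placeholder data standing in for captured command output.
	closestIDs := []string{"QmPeerA", "QmPeerB", "QmPeerC"}
	swarmAddrs := []string{
		"/ip4/203.0.113.7/tcp/4001/p2p/QmPeerA",
		"/ip4/198.51.100.2/udp/4001/quic/p2p/QmRelay/p2p-circuit/p2p/QmPeerC",
		"/ip4/192.0.2.9/tcp/4001/p2p/QmOther",
	}
	n := countConnected(closestIDs, swarmAddrs)
	fmt.Printf("connected to %d of %d closest peers\n", n, len(closestIDs))
}
```

Running this on a timer and logging the count would show how often the node stays connected to its closest peers.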