How can I tell if my IPFS node is publicly available?

danieln · May 16, 2022, 11:53am

I have IPFS (go-ipfs) running on a machine in a local network behind NAT (my router supports UPnP) and would like to pin files/CIDs locally and test their discoverability through public gateways.

For this, I presume that I need to make sure that the IPFS node is publicly dialable/available for swarm connections.

As I understand it, go-ipfs binds to multiple ports:

4001 the libp2p swarm port, which should be publicly available. If it isn’t reachable, can content that exists only on this node even be requested by the rest of the network?
5001 the Daemon API used by the ipfs CLI to manage the node
8080 the http web gateway

After running ipfs id I saw the following address in the Addresses array:

	"Addresses": [
		"/ip4/127.0.0.1/tcp/4001/p2p/QmPQFVmn66NRCdR1aRBdkDXKJW1mBxSwFaxTc5qHVaMpD9",
		"/ip4/127.0.0.1/udp/4001/quic/p2p/QmPQFVmn66NRCdR1aRBdkDXKJW1mBxSwFaxTc5qHVaMpD9",
		"/ip4/192.168.178.180/tcp/4001/p2p/QmPQFVmn66NRCdR1aRBdkDXKJW1mBxSwFaxTc5qHVaMpD9",
		"/ip4/192.168.178.180/udp/4001/quic/p2p/QmPQFVmn66NRCdR1aRBdkDXKJW1mBxSwFaxTc5qHVaMpD9",
		"/ip4/77.143.135.5/tcp/64898/p2p/QmPQFVmn66NRCdR1aRBdkDXKJW1mBxSwFaxTc5qHVaMpD9",
		"/ip4/77.143.135.5/udp/64898/quic/p2p/QmPQFVmn66NRCdR1aRBdkDXKJW1mBxSwFaxTc5qHVaMpD9",
	]
// I changed the public IP and NodeID for privacy

I checked the router configuration and saw that port 64898 was indeed mapped automatically with UPnP.
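
To read an Addresses list like the one above at a glance, each multiaddr can be sorted by whether its IP is loopback, private (LAN), or public. This is a minimal sketch using only Python's standard library. `classify_multiaddr` is a hypothetical helper, not part of any IPFS tooling, and it only handles the `/ip4/...` and `/ip6/...` forms shown here.

```python
import ipaddress

def classify_multiaddr(addr: str) -> str:
    """Return 'loopback', 'private', or 'public' for an /ip4 or /ip6 multiaddr."""
    parts = addr.strip("/").split("/")
    if len(parts) < 2 or parts[0] not in ("ip4", "ip6"):
        raise ValueError(f"unsupported multiaddr: {addr}")
    ip = ipaddress.ip_address(parts[1])
    # Check loopback first: loopback addresses also count as private.
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"
    return "public"

# Three of the addresses from the `ipfs id` output in this post.
peer = "QmPQFVmn66NRCdR1aRBdkDXKJW1mBxSwFaxTc5qHVaMpD9"
addresses = [
    f"/ip4/127.0.0.1/tcp/4001/p2p/{peer}",
    f"/ip4/192.168.178.180/tcp/4001/p2p/{peer}",
    f"/ip4/77.143.135.5/tcp/64898/p2p/{peer}",
]
for a in addresses:
    print(classify_multiaddr(a), a)
```

A public address in the list only shows what the node announces. It does not prove that the address is actually dialable from outside.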

Questions

1. Which ports does go-ipfs attempt to automatically forward with UPnP? (A node binds to multiple ports, but you rarely want to make the admin API port publicly available.)
2. Why did go-ipfs map port 64898 rather than 4001?
3. How can I verify if my IPFS node is publicly reachable for swarm connections?
4. How can I tell if the DHT is in client or server mode?
5. If the DHT of a node is in client mode (because it isn’t publicly dialable), will content that is only pinned to the node be discoverable to the rest of the network?

ylempereur · May 16, 2022, 4:23pm

1. Whatever you have configured in Addresses.Swarm.
2. That’s what UPnP returned. You can’t force a specific port with UPnP; set up a manual port mapping on the router instead if you want to do that.
3. Do an ipfs id <your_node_id> from another node that is not on your local network, or use IPFS Check.
4. Not sure. If the node is reachable and Routing.Type is “dht”, it should be in server mode. The traffic pattern is also far busier in server mode.
5. That’s 2 questions. Content is discoverable through the DHT if your node does a good job of reproviding (reachability and client/server mode don’t matter for this). If your node is reachable, all the content in your cache is downloadable by the rest of the network (client/server mode doesn’t matter for this either).

Having your node in DHT server mode helps the network, but isn’t needed to serve content. The only thing that is truly needed is to be reachable.
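
As a rough first check of reachability, a plain TCP connection can be attempted against the announced public address from a machine outside the local network. This is a sketch under stated assumptions: `tcp_endpoint` and `tcp_port_open` are hypothetical helpers, only `/ip4|ip6/.../tcp/...` multiaddrs are handled, and a successful connect only shows that something is listening on that port. It does not perform a libp2p handshake, so it is weaker than running ipfs id from another node.

```python
import socket

def tcp_endpoint(multiaddr: str) -> tuple[str, int]:
    """Extract (host, port) from a multiaddr such as /ip4/1.2.3.4/tcp/4001/p2p/<id>."""
    parts = multiaddr.strip("/").split("/")
    if len(parts) < 4 or parts[0] not in ("ip4", "ip6") or parts[2] != "tcp":
        raise ValueError(f"not a TCP multiaddr: {multiaddr}")
    return parts[1], int(parts[3])

def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

# Example (run this from OUTSIDE the node's local network):
#   host, port = tcp_endpoint("/ip4/77.143.135.5/tcp/64898/p2p/<peer id>")
#   print(tcp_port_open(host, port))
```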

hector · May 17, 2022, 8:49am

easier:

when you run ipfs id on your node it will print a list of protocols, among them the DHT ones:

		"/ipfs/kad/1.0.0", <-- only if your node is publicly reachable
		"/ipfs/lan/kad/1.0.0",

danieln · May 17, 2022, 9:29am

Thanks!

"/ipfs/kad/1.0.0", is exactly what I was looking for.

The reason I was confused (and correctly suspected that the DHT is not in server mode) is that IPFS Desktop starts the IPFS daemon with --routing=dhtclient, which overrides the Routing.Type value in the IPFS configuration.
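
The protocol check mentioned above can be scripted. This is a minimal sketch, assuming that ipfs id prints JSON with a "Protocols" array, as it does in the output quoted in this thread. `is_dht_server` and `local_id_info` are hypothetical helper names.

```python
import json
import subprocess

# Protocol ID that, per the reply above, is only listed when the node
# is publicly reachable (DHT server mode).
DHT_SERVER_PROTOCOL = "/ipfs/kad/1.0.0"

def is_dht_server(id_info: dict) -> bool:
    """Return True if the parsed `ipfs id` output lists the WAN DHT protocol."""
    return DHT_SERVER_PROTOCOL in (id_info.get("Protocols") or [])

def local_id_info() -> dict:
    """Run `ipfs id` on the local node and parse its JSON output."""
    result = subprocess.run(
        ["ipfs", "id"], capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)

# Example (requires a running local daemon):
#   print(is_dht_server(local_id_info()))
```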

Content is discoverable through the DHT if your node does a good job of reproviding (reachability and client/server mode don’t matter for this)

What factors can help improve content discoverability/reachability for files that I care about being available assuming they are pinned to my local node?

Based on my reading of the docs, the PUT operation for a given provider record stores the record at the K closest peers and also on the providing node itself. So with the DHT in server mode, a given CID that is locally pinned should be more reachable.

Am I understanding this correctly?

hector · May 17, 2022, 10:42am

More or less… the search for that CID will be directed towards “close peers”. So even if your peer is storing the record, a peer looking for that record on the DHT won’t necessarily visit your peer looking for it unless the peer ID and the CID are “close” in terms of DHT-distance, but it may help I guess.

ylempereur · May 17, 2022, 3:32pm

Let me explain what I meant by that:

Your node will “reprovide” all your blocks every 12 hours. Those records will survive for 24 hours. The problem you run into is that, if you have a lot of blocks, it can take more than 12 hours to do a reprovide run. In fact, if you have more than just a handful of blocks, it can take days to do a run. If that happens, the records time out after 24 hours and some of your content is no longer discoverable.

On my primary node, I have calculated that a reprovide run takes over 100 days, leaving my content undiscoverable for 99 of those days.

If that is your situation too, you can use the Accelerated DHT Client. I use it, and my reprovide runs now take 13 minutes each time, solving the problem.

ipfs config --json Experimental.AcceleratedDHTClient true
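
The arithmetic behind those numbers can be sketched with a deliberately simplified model. It assumes each record is published once per reprovide cycle and then lives for exactly the record lifetime, and it ignores DHT churn. `discoverable_fraction` is a hypothetical helper, and the defaults are the 12-hour interval and 24-hour lifetime quoted above.

```python
def discoverable_fraction(
    run_hours: float,
    interval_hours: float = 12.0,
    ttl_hours: float = 24.0,
) -> float:
    """Fraction of time a record is live under the simplified model."""
    # A new run cannot start before the previous one ends, so the cycle
    # is whichever is longer: the run itself or the configured interval.
    cycle_hours = max(run_hours, interval_hours)
    # Each record is refreshed once per cycle and lives for ttl_hours.
    return min(1.0, ttl_hours / cycle_hours)

# A 100-day run: each record is live for about 1 day out of every 100.
print(f"{discoverable_fraction(100 * 24):.1%}")

# A 13-minute run: records are refreshed well before they expire.
print(f"{discoverable_fraction(13 / 60):.1%}")
```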

hsn10 · May 18, 2022, 6:44am

Running a DHT server will improve the reachability of your content in case your blocks are not found in the DHT.

The DHT is not very durable. While there is a 24-hour record timeout, in reality it’s much less because people turn off their nodes after a few hours. I have a script which reprovides my most important blocks to the network every 1.5 hours (empirically tested).

Running your node for a longer time (uptime) improves content reachability. Do not restart unless you have to.

danieln · May 18, 2022, 2:34pm

Yeah that makes sense, basically it’s a question of the XOR distance between my peerID and the CID that I’m providing.
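
The XOR distance idea can be sketched as follows. The assumptions are that both the peer ID and the CID's multihash are mapped into the keyspace by hashing with SHA-256, that plain byte strings stand in for the decoded peer ID and multihash (decoding them would need extra libraries), and that k = 20 is the bucket size. All three function names are hypothetical.

```python
import hashlib

def kad_key(raw: bytes) -> int:
    """Map raw bytes into a 256-bit keyspace (assumed: SHA-256 of the bytes)."""
    return int.from_bytes(hashlib.sha256(raw).digest(), "big")

def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia-style distance: XOR of the two keys, read as an integer."""
    return kad_key(a) ^ kad_key(b)

def closest_peers(target: bytes, peers: list[bytes], k: int = 20) -> list[bytes]:
    """Return the k peers whose keys are closest to the target's key."""
    return sorted(peers, key=lambda p: xor_distance(p, target))[:k]

# Stand-in values: real inputs would be decoded peer IDs and multihashes.
peers = [f"peer-{i}".encode() for i in range(100)]
cid = b"example-content-key"
nearest = closest_peers(cid, peers, k=3)
print(nearest)
print(b"peer-0" in nearest)  # is this particular peer among the closest?
```

Under this model, a lookup for the CID converges on the peers returned by `closest_peers`, so a node storing its own provider record only helps if it happens to be in that set.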

The problem you run into is that, if you have a lot of blocks, it can take more than 12 hours to do a reprovide run. In fact, if you have more than just a handful of blocks, it can take days to do a run.
On my primary node, I have calculated that a reprovide run takes over 100 days, leaving my content undiscoverable for 99 of those days.

What’s the reason that reproviding can take so long? Is it to avoid overloading the DHT network?

Where can I read more about the AcceleratedDHTClient?

If I want to force reproviding a given CID that I’m pinning locally, is this the correct command:
ipfs dht provide QmRKs2ZfuwvmZA3QAWmCqrGUjV9pxtBUDP3wuc6iVGnjA2

ylempereur · May 18, 2022, 3:00pm

I think the reason it takes so long is that it has to walk the DHT for each block in order to find the right DHT nodes to write the records to. With the Accelerated DHT Client, it knows about all the DHT nodes on the network and can talk to them directly.

Accelerated DHT Client

Yes, but you don’t have to use that. Just set Reprovider.Interval to something smaller than “12h” if you want to speed it up (I use “8h” and my records have been rock solid)
