Clarification question: how does an ipfs peer reach my local computer at port 4001 if I haven't opened that port on my home router?

I think this is a pretty “newbie” question. I’m just hoping to understand a little better how ipfs networking works. Maybe someone can clarify the concepts involved, explain what I’m not understanding, or point me to some helpful article.

Here’s my question:

My understanding is that when I’m connected to an ipfs swarm, I’m connected to a bunch of peers listening (receiving requests from other peers) via port 4001.

But in my experience, in order to run a home web server hosted on a computer within my LAN, I have had to manually go into my home router and open port 80 (and then forward all requests on port 80 to a specific MAC address). Otherwise, no outside computer can connect to the web server.

Similarly, I have had to explicitly open port 22 to allow ssh into my home LAN. Otherwise the router blocks these requests.

But I certainly have not done this for port 4001.

And yet, when I request a hash via the ipfs gateway (ipfs.io/ipfs) for a unique file I’ve pinned to my local computer (within my LAN), the ipfs gateway is able to retrieve the data.

My understanding is that this gateway has found a peer that is eventually connected to a peer that is connected to me, and this “direct peer” is requesting that data from my computer via port 4001 and then passing it on.

So I’m wondering how that is possible? I would be grateful for any clarifications, corrections, or pointers to helpful reading.

Two ways:

  • Either you have ended up connected to the gateway (your peer opened the connection to it as part of the peer-discovery mechanism).
  • Or the port was opened using UPnP (if supported/enabled in your router).

The relay option is possible too, but I am not sure if ipfs/anyone is running public relays for this. Does your node report a relay address (ipfs id)?

After ipfs id, under protocols I see: “/libp2p/circuit/relay/0.1.0”, but there is nothing that looks like “relay” under addresses.

But basically, the idea is: when I run “ipfs daemon”, some communication with my router is happening and port 4001 is in fact being opened?

Interesting. At least my intuition that 4001 must in some way have been opened is more or less correct, no? :slight_smile:

If UPnP is enabled on your router, then yes, IPFS will contact the router and request that port 4001 be forwarded to it. UPnP is a protocol that allows software to do what you can manually do when you forward ports.

Your IPFS daemon is also connecting OUT to other peers on the Internet. Unlike HTTP and other client/server protocols, any IPFS peer-to-peer connection can carry traffic in BOTH directions. So even though you connect out to other peers (a client connecting to a server in traditional networking), that peer now knows your peer ID and can send requests to you over that same connection (acting as a client to your server).
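
To make the "both directions" point concrete, here is a rough sketch using plain TCP sockets on localhost. It is not how IPFS or libp2p is actually implemented, and no NAT is simulated. The names `public_peer`, `home_peer`, the `WANT` message, and the block key are all made up for illustration. The only point is that the side that dialed out can end up answering requests on that same connection.

```python
import socket
import threading

def public_peer(server_sock, result):
    # Stand-in for a peer with an open port: it accepts one inbound
    # connection, then acts as the *requester* on that connection.
    conn, _ = server_sock.accept()
    with conn, conn.makefile("rb") as reader:
        conn.sendall(b"WANT block-123\n")   # request travels back to the dialer
        result["reply"] = reader.readline()

def home_peer(port, store):
    # Stand-in for the node behind the router: it only dials OUT,
    # then serves whatever request arrives on that connection.
    with socket.create_connection(("127.0.0.1", port)) as conn:
        with conn.makefile("rb") as reader:
            request = reader.readline().decode().strip()
        key = request.split(" ", 1)[1]
        conn.sendall(store.get(key, b"missing") + b"\n")

def demo():
    server = socket.socket()
    server.bind(("127.0.0.1", 0))           # let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    t = threading.Thread(target=public_peer, args=(server, result))
    t.start()
    home_peer(port, {"block-123": b"hello from behind NAT"})
    t.join()
    server.close()
    return result["reply"]

if __name__ == "__main__":
    print(demo())
```

The home side never listens on any port here, yet it is the one serving data.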

If you run “ipfs swarm peers --direction” you can see each peer connection and which direction it is. If your 4001 port is not open, they should all be “outbound”. If your swarm port is open, there should be a mix of “inbound” and “outbound”.
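
If you want a quick tally rather than eyeballing the list, something like the sketch below might help. It assumes each output line ends with the word "inbound" or "outbound" after the peer address. That is my reading of the output format, so check it against what your version prints. The helper name `count_directions` and the sample addresses are made up.

```python
from collections import Counter

def count_directions(swarm_output: str) -> Counter:
    """Tally connections by the direction word at the end of each line."""
    counts = Counter()
    for line in swarm_output.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        last = fields[-1]
        # Assumption: the direction is the last whitespace-separated field.
        counts[last if last in ("inbound", "outbound") else "unknown"] += 1
    return counts

# Hypothetical sample output, not captured from a real node.
sample = """\
/ip4/203.0.113.5/tcp/4001/p2p/QmExamplePeerA outbound
/ip4/198.51.100.7/tcp/4001/p2p/QmExamplePeerB outbound
/ip4/192.0.2.44/tcp/51234/p2p/QmExamplePeerC inbound
"""

if __name__ == "__main__":
    print(count_directions(sample))
```

To run it against a live node you could pass in `subprocess.run(["ipfs", "swarm", "peers", "--direction"], capture_output=True, text=True).stdout` instead of the sample text.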

If you do have an open port, your daemon will have an easier job of keeping up a good connected peer count because other peers that do not have their port open can connect to your instance.

Thanks, that’s really helpful.

I’m still a little fuzzy on how “knowing the peer id” is enough for the “other peer” to communicate “back” to me and for that request to get through my router/gateway.

(Perhaps, once the connection is open, it can remain open and communicate freely back and forth until it is closed. I need to do some more reading :slight_smile: Any good reading suggestions to clarify this kind of network behaviour in more depth?)

Overall, you’ve made it much clearer. Thanks.