IPFS behind a WireGuard sidecar

I have an IPFS node and a private IPFS cluster running in containers. They work well as they are, but as soon as I add a WireGuard VPN (ProtonVPN) sidecar container, things stop working.

Here is my IPFS config:

ipfs config --json Datastore.BloomFilterSize 4194304
ipfs config Datastore.StorageMax 5000GB
ipfs config Routing.Type dhtclient
ipfs config Swarm.ResourceMgr.MaxMemory 17GB
ipfs config Addresses.API "/ip4/0.0.0.0/tcp/5001"
ipfs config --json Experimental.AcceleratedDHTClient true

Below is a truncated version of my docker compose file (only including the IPFS node, not the IPFS cluster):

version: "3.4"
services:

  ipfs:
    container_name: ipfs
    image: ipfs/kubo:latest
    restart: unless-stopped
    volumes:
      - /path/to/data/:/data/ipfs
      - /path/to/config/001-ipfs-config.sh:/container-init.d/001-ipfs-config.sh
    network_mode: service:wireguard

  wireguard:
    container_name: wireguard
    image: lscr.io/linuxserver/wireguard
    restart: unless-stopped
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=CET
    networks:
      - ipfs_wireguard
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    volumes:
      - /lib/modules:/lib/modules
      - /path/to/config/wg0.conf:/config/wg0.conf
    sysctls:
      - net.ipv4.conf.all.src_valid_mark=1
    ports:
      # IPFS
      - "4001:4001/tcp" # ipfs swarm - expose if needed/wanted
      - "4001:4001/udp" # ipfs swarm - expose if needed/wanted
      # IPFS Cluster
      - "9096:9096/tcp" # Cluster swarm endpoint
      - "9096:9096/udp" # Cluster swarm endpoint

When the sidecar is enabled, I see that my true IP is indeed not disclosed anymore, and it’s replaced by 2-3 other IPs.

The IPFS container sometimes shows this kind of error: ERROR autorelay autorelay/autorelay.go:84 failed to start relay finder {"error": "relayFinder already running"}

The IPFS cluster shows more frequent errors:

cluster    | 2023-03-26T01:03:09.304Z   INFO    cluster ipfs-cluster/cluster.go:598     reconnected to xxxxxxxxxxxxxxxxxxxx
cluster    | 2023-03-26T01:03:09.304Z   INFO    cluster ipfs-cluster/cluster.go:598     reconnected to xxxxxxxxxxxxxxxxxxxx
cluster    | 2023-03-26T01:03:36.044Z   INFO    cluster ipfs-cluster/cluster.go:598     reconnected to xxxxxxxxxxxxxxxxxxxx
cluster    | 2023-03-26T01:04:22.049Z   INFO    cluster ipfs-cluster/cluster.go:598     reconnected to xxxxxxxxxxxxxxxxxxxx
cluster    | 2023-03-26T01:04:22.049Z   INFO    cluster ipfs-cluster/cluster.go:598     reconnected to xxxxxxxxxxxxxxxxxxxx
cluster    | 2023-03-26T01:05:07.059Z   INFO    cluster ipfs-cluster/cluster.go:598     reconnected to xxxxxxxxxxxxxxxxxxxx
cluster    | 2023-03-26T01:05:07.059Z   INFO    cluster ipfs-cluster/cluster.go:598     reconnected to xxxxxxxxxxxxxxxxxxxx
cluster    | 2023-03-26T01:05:46.796Z   INFO    cluster ipfs-cluster/cluster.go:598     reconnected to xxxxxxxxxxxxxxxxxxxx
cluster    | 2023-03-26T01:05:54.778Z   INFO    crdt    go-ds-crdt@v0.3.10/crdt.go:891  DAG repair finished. Took 3m19s
cluster    | 2023-03-26T01:05:54.778Z   ERROR   crdt    go-ds-crdt@v0.3.10/crdt.go:510  error getting node for reprocessing xxxxxxxxxxxxxxxxxxxx: context deadline exceeded
cluster    | 2023-03-26T01:06:12.529Z   INFO    cluster ipfs-cluster/cluster.go:598     reconnected to xxxxxxxxxxxxxxxxxxxx
cluster    | 2023-03-26T01:06:12.529Z   INFO    cluster ipfs-cluster/cluster.go:598     reconnected to xxxxxxxxxxxxxxxxxxxx

I also tried only announcing the endpoint IP of my WireGuard config.

On the ProtonVPN side, I have enabled a bunch of features that, without being an expert, I thought could help with port forwarding:

# Bouncing = 9
# NetShield = 0
# Moderate NAT = on
# NAT-PMP (Port Forwarding) = on
# VPN Accelerator = off
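For reference, ProtonVPN hands out its NAT-PMP port mapping via the tunnel gateway, and the mapping expires unless renewed. A common sketch uses `natpmpc` (the gateway address 10.2.0.1 and the 60-second lifetime follow ProtonVPN's documented WireGuard defaults, but treat them as assumptions for your setup; note that the public port is assigned by the server, not chosen by you):

```shell
# Request a TCP and a UDP mapping from the VPN gateway and renew it
# before the 60-second lifetime expires. natpmpc prints the public
# port the server assigned; that is the port to announce.
# Syntax: natpmpc -a <public port> <private port> <protocol> <lifetime>
# (0 lets the server pick).
while true; do
  natpmpc -a 1 0 tcp 60 -g 10.2.0.1 || { echo "NAT-PMP request failed"; break; }
  natpmpc -a 1 0 udp 60 -g 10.2.0.1 || { echo "NAT-PMP request failed"; break; }
  sleep 45  # renew well before the lifetime runs out
done
```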

Is it a lost cause? Or is running an IPFS node behind a WireGuard VPN supposed to be smooth?

Also, it’s worth noting that I used https://check.ipfs.network/ to check the status of my node, and the only reported failure is the last one:

✔ Successfully connected to multiaddr
✔ Found multiaddrs advertised in the DHT:
# All multiaddr without my true IP
✔ Found multihash advertised in the dht
❌ There was an error downloading the CID from the peer: transient connection to peer

My experience is that any kind of VPN prevents port mapping, so IPFS has to rely on relay v2 for establishing connections. That works ok for simple stuff, but the second you try to exchange blocks, it needs to use hole punching to establish a direct connection, and that rarely (never?) works across a VPN (latency issues prevent sync?).
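For what it's worth, you can check whether your current connections are direct or relayed: relay v2 connections show up in `ipfs swarm peers` as multiaddrs containing a `p2p-circuit` component. A rough sketch:

```shell
# List connections that go through a relay (multiaddrs with
# "p2p-circuit" in them); print a note if there are none.
ipfs swarm peers | grep p2p-circuit || echo "no relayed connections"

# Count relayed vs direct connections.
ipfs swarm peers | grep -c  p2p-circuit
ipfs swarm peers | grep -vc p2p-circuit
```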

Thanks for the insights. That’s helpful for better understanding where the bottleneck is.

I found a message from July last year from you (Roadmap for Circuit Relay v2 File Transfer? - #2 by ylempereur) with the following ipfs config:

ipfs config --json Swarm.RelayClient.Enabled true
ipfs config --json Swarm.EnableHolePunching true

Reading the docs, it seems like hole punching is already enabled by default, but I just wanted to make sure there are no other ipfs config settings I could play with to increase the success rate here?

Hole punching is fickle, even when it works.

I’m not familiar with the details of how WireGuard works. See if there is any way to configure a static port mapping to your node; that would solve the problem.

I gave it a try with a VPN provider supporting static port forwarding. I can call ipfs swarm connect using the exit VPN IP and the configured static port and the connection works well.

Now I have an issue announcing this IP/port properly from the IPFS node.

I followed the doc and used ipfs config --json Addresses.AppendAnnounce to add the correct multiaddr with the VPN IP and port.

Upon start, ipfs id correctly adds those to the Addresses key, but after a few seconds the configured multiaddrs just disappear. Not sure of the reason, since ipfs swarm connect works well.
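One way to watch this happen is to compare what the daemon is actually bound to with what it currently announces, using stock kubo commands (a sketch; both require the daemon to be running):

```shell
# Addresses kubo is listening on (driven by Addresses.Swarm).
ipfs swarm addrs local

# Addresses currently advertised to peers (Swarm + AppendAnnounce,
# minus anything the daemon has filtered out).
ipfs id -f '<addrs>'
```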

It seems like my VPN IP and port are persisted after adding:

ipfs config --json Swarm.RelayClient.Enabled false
ipfs config --json Swarm.DisableNatPortMap true

Does that make sense?

It turns out the ipfs swarm connect command worked for the wrong reason, and once behind the VPN I can’t make it work. I always hit a Connection refused error. I tried with telnet as well.

I am quite sure the port is correctly forwarded because when I run a port checker binary, it works well: docker-compose exec ipfs /bin/sh -c "wget -qO port-checker https://github.com/qdm12/port-checker/releases/download/v0.3.0/port-checker_0.3.0_linux_amd64 && chmod +x port-checker && ./port-checker -port 4322"

I have the impression ipfs is not listening on my port 4322, but I don’t know why. I also tried adding it to the Swarm addresses with:

ipfs config --json Addresses.AppendAnnounce '[
    "/ip4/xxx/tcp/4322",
    "/ip4/xxx/udp/4322/quic",
    "/ip4/xxx/udp/4322/quic-v1",
    "/ip4/xxx/udp/4322/quic-v1/webtransport"
]'

ipfs config --json Addresses.Swarm '[
    "/ip4/0.0.0.0/tcp/4001",
    "/ip6/::/tcp/4001",
    "/ip4/0.0.0.0/udp/4001/quic",
    "/ip4/0.0.0.0/udp/4001/quic-v1",
    "/ip4/0.0.0.0/udp/4001/quic-v1/webtransport",
    "/ip6/::/udp/4001/quic",
    "/ip6/::/udp/4001/quic-v1",
    "/ip6/::/udp/4001/quic-v1/webtransport",
    "/ip4/xxx/tcp/4322",
    "/ip4/xxx/udp/4322/quic",
    "/ip4/xxx/udp/4322/quic-v1",
    "/ip4/xxx/udp/4322/quic-v1/webtransport"
]'

but it fails in the same way.

The only thing that ends up working for me is:

ipfs config --json Addresses.Swarm '[
    "/ip4/0.0.0.0/tcp/4322",
    "/ip6/::/tcp/4322",
    "/ip4/0.0.0.0/udp/4322/quic",
    "/ip4/0.0.0.0/udp/4322/quic-v1",
    "/ip4/0.0.0.0/udp/4322/quic-v1/webtransport",
    "/ip6/::/udp/4322/quic",
    "/ip6/::/udp/4322/quic-v1",
    "/ip6/::/udp/4322/quic-v1/webtransport"
]'

You’re forgetting one important thing: a VPN is a NAT router. This means that, while your node is listening on port 4322 on the LAN, the VPN is probably listening on a different port on its public IP address. That’s the port you need to announce :confused:

I configured the VPN to listen on that port (4322) on the public IP address, and this same port is forwarded to the container.

I already checked it with port-checker. It runs a small server in the ipfs container and I have been able to access it via the VPN public IP and the same port as above.

An alternative fix in my config would have been to redirect 4322 to 4001 within the container (4322:4001), so that ipfs announces 4322 but listens only on 4001.

But the other fix, editing Addresses.Swarm, works just as well.

Maybe a question for you: when adding a multiaddr to Addresses.AppendAnnounce, is ipfs expected to also listen on it, or is it announce-only, with listening always and only controlled by Addresses.Swarm?

It only announces. It only listens on what’s listed in Swarm. So, the mapping is something you handle in the router.

For example, here is the config I use on my laptop (not my main node) when I use it:

  "Addresses": {
    ...
    "AppendAnnounce": [
      "/ip4/47.144.XX.XXX/tcp/4002",
      "/ip4/47.144.XX.XXX/udp/4002/quic",
      "/ip4/47.144.XX.XXX/udp/4002/quic-v1",
      "/ip4/47.144.XX.XXX/udp/4002/quic-v1/webtransport"
    ],
    ...
    "Swarm": [
      "/ip4/127.0.0.1/tcp/4001",
      "/ip4/127.0.0.1/udp/4001/quic",
      "/ip4/127.0.0.1/udp/4001/quic-v1",
      "/ip4/127.0.0.1/udp/4001/quic-v1/webtransport",
      "/ip4/192.168.2.8/tcp/4001",
      "/ip4/192.168.2.8/udp/4001/quic",
      "/ip4/192.168.2.8/udp/4001/quic-v1",
      "/ip4/192.168.2.8/udp/4001/quic-v1/webtransport"
    ]
  },

And my router is set to port map 47.144.XX.XXX:4002 to 192.168.2.8:4001
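On a Linux-based router, that kind of static map is just a DNAT rule. A hedged sketch using the addresses from the example above (eth0 as the WAN interface is an assumption; consumer routers expose the same thing as a "port forwarding" entry):

```shell
# Forward public port 4002 to the node's LAN address on port 4001,
# for both TCP and UDP (the latter covers QUIC).
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 4002 \
  -j DNAT --to-destination 192.168.2.8:4001
iptables -t nat -A PREROUTING -i eth0 -p udp --dport 4002 \
  -j DNAT --to-destination 192.168.2.8:4001
```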

Also, since it’s of no use to you, you should kill LAN discovery:

ipfs config --json Discovery.MDNS.Enabled false

OK, thanks. It makes perfect sense to me, and it’s what I ended up understanding. I think the original confusion came from the difference between announcing and listening.

Thanks again!