Peer behind a firewall

Hey,

I have 4 peers running ipfs-cluster that are reachable from the Internet. I would like to add other peers by bootstrapping them to the leader. Those peers will sit behind a firewall that I cannot change: inbound ports will certainly be filtered, while outbound traffic should be allowed. I cannot guarantee that the multiaddresses of those peers will be reachable from the Internet.

Currently, the cluster is able to see the peer asking for a connection, but I quickly get messages like this one: Failed to heartbeat to Qmc****: dial backoff logging.go:105, and then the service shuts down on the peer.

Is there any configuration on this peer that would make ipfs-cluster work?

Thanks,
Florian

Hey,

Can you post the logs of the Firewalled peer?

Are you bootstrapping to the same peer that shows the Failed to heartbeat messages? If not, can you try that?

If NAT hole-punching works and you bootstrap directly to the cluster Raft leader, I think the firewalled peer should then manage to punch holes to the rest of the peers. But if you bootstrap to a peer that is not the leader, the leader will not be able to heartbeat the firewalled peer.

If you are indeed already bootstrapping to the leader, then libp2p’s NAT hole-punching may not work in your environment. We might explore other options like QUIC or libp2p circuits then, but I haven’t tried them myself.

Are you bootstrapping to the same peer that shows the Failed to heartbeat messages? If not, can you try that?

I tried and here is the log

Can you post the logs of the Firewalled peer?

17:38:27.707  INFO  consensus: Current Raft Leader: QmcWtMUskZ7cANVnrABm7sBdEpmyv8iPsxST3YwMpTURSw raft.go:293
17:38:27.708  INFO    cluster: QmcnEty1HpQzzXq2X4tXPh2foFAVH152wQPhwdNvkceh3g: joined QmcWtMUskZ7cANVnrABm7sBdEpmyv8iPsxST3YwMpTURSw's cluster cluster.go:692

17:38:47.800 ERROR    cluster: no state has been agreed upon yet cluster.go:856
17:38:59.868 ERROR  p2p-gorpc: dial attempt failed: <peer.ID Qm*kceh3g> --> <peer.ID Qm*bvJHsz> dial attempt failed: context deadline exceeded call.go:63
17:39:00.082 ERROR  p2p-gorpc: dial attempt failed: <peer.ID Qm*kceh3g> --> <peer.ID Qm*gXD7iQ> dial attempt failed: context deadline exceeded call.go:63
17:39:00.234 ERROR  p2p-gorpc: dial attempt failed: <peer.ID Qm*kceh3g> --> <peer.ID Qm*Gu2RHf> dial attempt failed: context deadline exceeded call.go:63
17:39:29.371 ERROR    cluster: no state has been agreed upon yet cluster.go:856

Here is the log from the leader:

déc. 06 17:38:26 ipfs-amazon ipfs-cluster-service[8633]: 17:38:26.489  INFO  consensus: peer added to Raft: QmcnEty1HpQzzXq2X4tXPh2foFAVH152wQPhwdNvkceh3g consensus.go:355
déc. 06 17:38:27 ipfs-amazon ipfs-cluster-service[8633]: 17:38:27.074  INFO    cluster: Peer added  QmcnEty1HpQzzXq2X4tXPh2foFAVH152wQPhwdNvkceh3g cluster.go:602

And from another node:

Dec 06 17:41:03 ipfs-tutu ipfs-cluster-service[7591]: 17:41:03.467 ERROR  p2p-gorpc: dial attempt failed: <peer.ID Qm*bvJHsz> --> <peer.ID Qm*kceh3g> dial attempt failed: context deadline exceeded call.go:63
Dec 06 17:41:03 ipfs-tutu ipfs-cluster-service[7591]: 17:41:03.467 ERROR    cluster: <peer.ID Qm*bvJHsz>: error in broadcast response from <peer.ID Qm*kceh3g>: dial attempt failed: <peer.ID Qm*bvJHsz> --> <peer.ID Qm*kceh3g> dial attempt failed: context deadline exceeded  cluster.go:1180

If I try to pin a file from one of the cluster peers, I get the following message:

<peer.ID Qm*kceh3g> : CLUSTER_ERROR: dial attempt failed: <peer.ID Qm*bvJHsz> --> <peer.ID Qm*kceh3g> dial attempt failed: context deadline exceeded | 2018-12-06T16:41:03Z

It looks like the peer cannot be reached from the Internet.
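To see at a glance which direction the dials fail in, the p2p-gorpc errors above can be tallied per source and target peer. This is only a rough sketch: the function name `count_dial_failures` is made up for illustration, and the pattern assumes the log lines look exactly like the ones pasted in this thread.

```python
import re
from collections import Counter

# Matches the "<peer.ID A> --> <peer.ID B>" pair printed in the dial errors above.
DIAL_RE = re.compile(r"dial attempt failed: <peer\.ID (\S+)> --> <peer\.ID (\S+)>")


def count_dial_failures(log_lines):
    """Count failed dials per (source, target) pair of abbreviated peer IDs."""
    failures = Counter()
    for line in log_lines:
        # Count each pair at most once per log line, even if the line repeats it.
        for pair in set(DIAL_RE.findall(line)):
            failures[pair] += 1
    return failures
```

Feeding it the lines from both the firewalled peer and another cluster node shows failures in both directions, which is consistent with connections not being established either way before the firewalled peer has dialed out successfully.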

Ah, ok, but the firewalled peer is not dying on bootstrap, right?

What happens if you run ipfs-cluster-ctl peers ls from that peer? This will open connections from the peer to the rest. Can you pin afterwards?

Right, it is not dying anymore, thanks!

Indeed, the peers in the cluster do not complain anymore, and the logs stay quiet on the firewalled peer. Do you know why I need to run peers ls before doing anything else?

When pinning a file (42MB) from one of the members of the cluster, I got a timeout:

Dec 07 10:25:40 ipfs-tutu ipfs-cluster-service[1015]: 10:25:40.088 ERROR      adder: error adding to cluster:  read tcp4 127.0.0.1:9094->127.0.0.1:37032: i/o timeout adder.go:146

It looks like the file is too big to synchronise to all members of the cluster.

I tried with a smaller file (1.2MB) and it worked! I was able to download the file from the firewalled peer's gateway.

My plan is to synchronize a large number of files from a cluster to firewalled peers. I'm wondering how I am going to synchronize files over 1GB. Is there any tweak to be made?

Many thanks for your help!

Either this is a network issue unrelated to cluster, or your peer configuration has timeouts set (see read_timeout and write_timeout at https://cluster.ipfs.io/documentation/configuration/#restapi; they should be set to 0).
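A quick way to check for this is to load the peer's configuration and flag any restapi timeout that is not zero. The sketch below is hedged: the function name `nonzero_timeouts` is hypothetical, and it assumes the settings live under an "api" then "restapi" section and are written as duration strings such as "0s" or "30s". Adjust the lookup if your configuration file is laid out differently.

```python
import re


def nonzero_timeouts(config, keys=("read_timeout", "write_timeout")):
    """Return the restapi timeouts in `config` that are set to a non-zero duration."""
    # Assumed layout: {"api": {"restapi": {"read_timeout": "0s", ...}}}
    restapi = config.get("api", {}).get("restapi", {})
    found = {}
    for key in keys:
        # A missing key is treated here as "0s" (an assumption of this sketch).
        value = str(restapi.get(key, "0s"))
        # A duration such as "0s" or "0m0s" is zero when every number in it is zero.
        numbers = re.findall(r"\d+(?:\.\d+)?", value)
        if any(float(n) != 0 for n in numbers):
            found[key] = value
    return found
```

Pass it the dictionary obtained from `json.load` on the peer's configuration file; an empty result means neither of the two timeouts is set.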

As for peers ls: it is a workaround. It forces the firewalled peer to open connections to every other peer. Once those connections are established, they can be used to contact that peer. What operating system are you using though?
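If you want to script the workaround so it runs right after the service starts on the firewalled peer, something along these lines could do. It is only a sketch: the function name `open_outbound_connections` and the 60-second timeout are arbitrary choices, and it assumes ipfs-cluster-ctl is on the PATH of the firewalled peer.

```python
import subprocess


def open_outbound_connections(cmd=("ipfs-cluster-ctl", "peers", "ls"), timeout=60):
    """Run the given command and return (exit code, stdout)."""
    # By default this runs `ipfs-cluster-ctl peers ls`, which makes the local
    # peer contact every other peer and so opens the outbound connections.
    result = subprocess.run(list(cmd), capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stdout
```

Call it with no arguments on the firewalled peer once the cluster service is up; a non-zero exit code means the command itself failed and the connections were probably not opened.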

I have opened an issue (Improve experience with NAT'ed/Firewalled peers, Issue #614 in ipfs-cluster/ipfs-cluster on GitHub) so that we can figure out the best approach to make the process painless.

Nice! I have been able to pin a 700MB file; my settings were wrong. They came from an Ansible playbook. I will make a PR about it.

Should I replicate all the default settings from https://cluster.ipfs.io/documentation/configuration/ ?

I noticed the issue, thanks, I'll follow it!

Yes, I think so (or better, the defaults from ipfs-cluster-service init). Thanks for catching that. I am manually overwriting those everywhere so I didn’t notice.

Thanks, it works well now!

I want to build an ipfs-cluster topology with mixed architectures: a public amd64 leader and firewalled nodes (arm, arm64).
I saw your project https://github.com/hsanjuan/ansible-ipfs-cluster, but I am not yet familiar with Ansible…

I am preparing a workshop to teach how to build the New Internet during a resilience learning festival.
I would like to experiment with IPFS and CJDNS and write a detailed step-by-step guide.
I wonder if I could rely on your help?