How to restart ipfs-cluster after all the nodes went offline?

Hi everyone,
For a project I need to create an ipfs-cluster that copies data automatically. The number of connected nodes will vary from 0 to N, so I need to find a way to restart my cluster without changing the PeerID of my leader node when all the nodes are disconnected.

I’m currently using ipfs 0.4.19 and ipfs-cluster-service 0.10.0

I created an ipfs-cluster with 3 nodes: node 0 (the leader), node 1 and node 2, all sharing the same cluster secret.
Node 1 and node 2 are connected to node 0 using daemon --bootstrap /ip4/ip4_node_0/tcp/9096/ipfs/PeerID_node_0.

When I started the 3 nodes for the first time, everything worked well.
Then I stopped all the nodes and tried to restart node 0 to bring the cluster back up.

The node refuses to restart because I get Raft errors:

11:57:35.612  INFO  ipfsproxy: IPFS Proxy: /ip4/127.0.0.1/tcp/9095 -> /ip4/127.0.0.1/tcp/5001 ipfsproxy.go:273
11:57:35.612  INFO  consensus: existing Raft state found! raft.InitPeerset will be ignored raft.go:187
11:57:42.173 ERROR       raft: NOTICE: Some RAFT log messages repeat and will only be logged once logging.go:105
11:57:42.173 ERROR       raft: Failed to make RequestVote RPC to {Voter QmWd1WHa3tSNgdnqDS4SnVNmyH4sZAHeLe38eQ15mXXXXX QmWd1WHa3tSNgdnqDS4SnVNmyH4sZAHeLe38eQ15mXXXXX}: dial attempt failed: <peer.ID Qm*G3XYFz> --> <peer.ID Qm*mXXXXX> dial attempt failed: dial tcp4 0.0.0.0:9096->91.161.XX.XXX:33102: i/o timeout logging.go:105
11:57:42.568 ERROR       raft: Failed to make RequestVote RPC to {Voter QmWd1WHa3tSNgdnqDS4SnVNmyH4sZAHeLe38eQ15mXXXXX QmWd1WHa3tSNgdnqDS4SnVNmyH4sZAHeLe38eQ15mXXXXX}: dial backoff logging.go:105
11:57:47.173 ERROR       raft: Failed to make RequestVote RPC to {Voter QmNvG6WC1LhPJGFPDDLi1XUVUveM9J5fkFMuL4D58XXXXX QmNvG6WC1LhPJGFPDDLi1XUVUveM9J5fkFMuL4D58XXXXX}: dial attempt failed: <peer.ID Qm*G3XYFz> --> <peer.ID Qm*8XXXXX> dial attempt failed: dial tcp4 0.0.0.0:9096->91.161.XX.XXX:49114: i/o timeout logging.go:105
11:57:47.768 ERROR       raft: Failed to make RequestVote RPC to {Voter QmNvG6WC1LhPJGFPDDLi1XUVUveM9J5fkFMuL4D58XXXXX QmNvG6WC1LhPJGFPDDLi1XUVUveM9J5fkFMuL4D58XXXXX}: dial backoff logging.go:105
11:57:52.769 ERROR       raft: Failed to make RequestVote RPC to {Voter QmWd1WHa3tSNgdnqDS4SnVNmyH4sZAHeLe38eQ15mXXXXX QmWd1WHa3tSNgdnqDS4SnVNmyH4sZAHeLe38eQ15mXXXXX}: dial attempt failed: <peer.ID Qm*G3XYFz> --> <peer.ID Qm*mXXXXX> dial attempt failed: dial tcp4 0.0.0.0:9096->192.168.0.18:9096: i/o timeout logging.go:105
11:57:55.612 ERROR    cluster: ***** ipfs-cluster consensus start timed out (tips below) ***** cluster.go:411
11:57:55.612 ERROR    cluster:
**************************************************
This peer was not able to become part of the cluster.
This might be due to one or several causes:
  - Check the logs above this message for errors
  - Check that there is connectivity to the "peers" multiaddresses
  - Check that all cluster peers are using the same "secret"
  - Check that this peer is reachable on its "listen_multiaddress" by all peers
  - Check that the current cluster is healthy (has a leader). Otherwise make
    sure to start enough peers so that a leader election can happen.
  - Check that the peer(s) you are trying to connect to is running the
    same version of IPFS-cluster.
**************************************************
 cluster.go:412
11:57:55.613  INFO    cluster: shutting down Cluster cluster.go:483

My leader node can’t communicate with the others because they are offline and I can’t restart them.

So my question is: how can I restart the leader node and fix these Raft errors without changing the PeerID of my leader (and having to re-bootstrap all the nodes to the new PeerID)?


I think I found part of the solution.

To clean the state of my leader I can use ipfs-cluster-service state cleanup, which lets my leader cluster node restart properly.

This command cleans the state and removes all the peers, so I will need to re-add them all.

I don’t know if I need to create another topic for the next part:

Because my number of peers will change a lot, I need a solution to remove peers when they leave.

To remove a peer on a clean exit (for example Ctrl+C) I can set leave_on_shutdown to true.
However, most of my peers will not perform a clean exit…
Therefore I need to find a way to automatically remove offline peers.

Do you have any idea how to do that?

Thanks in advance for your answers!

Hi @Mairkur, --bootstrap is the process of joining the cluster (becoming part of the peerset that maintains the shared state using Raft for consensus). Starting the nodes with --bootstrap should only be done once. Afterwards, you can start them normally without the --bootstrap flag (as they are already bootstrapped). The first of your problems is that you need to restart at least 50%+1 of the peers; otherwise no quorum can be reached to elect a leader and the peer exits. With your 3 peers, that means at least 2 must be started. If you wish for peers to wait longer for a leader, there is a wait_for_leader_timeout configuration option.

Unfortunately, Raft imposes lots of constraints on the peerset. At least 50%+1 of the peers in the peerset should be available at all times or no pin/unpin is possible.

There is no good workaround at the moment, but in a matter of weeks we are going to offer an alternative to Raft (based on CRDTs), where all the constraints set by Raft around the peerset will disappear. Peers will be able to come and go freely, and bootstrapping will just be a helper to discover other peers, not even necessary in some scenarios. So what you want to do is going to be fully supported rather soon. You can follow the effort here: https://github.com/ipfs/ipfs-cluster/pull/685

Let me know if you have more questions.


I have been restarting my additional peers with an ipfs-cluster-service.service file that includes these lines:

ExecStart=/usr/local/bin/ipfs-cluster-service daemon --bootstrap /ip4/45.32.232.5/tcp/9096/ipfs/QmQdaAeef8Y7brNQpvNFdKnG6FhW3x6Z3bbqEbALasykv1
Restart=on-failure

So, this will run every time a peer goes down. Is this wrong? Should I use just

ipfs-cluster-service daemon

or maybe

ipfs-cluster-service daemon --upgrade

?

Thanks for your reply!
I will gladly follow the development.

I will try to find a temporary solution.

The idea is to launch ipfs-cluster-service daemon from another program (in Python, for example) to capture all the information printed to the console (like the errors containing a PeerID). This way I will know when an error occurs for a peer, based on the strings I receive.
If the error count for a peer exceeds X, I can automatically remove that peer.

I found how to append the printed information to a file:
ipfs-cluster-service daemon |& tee -a logsipfs/logsipfsservice

Next I need to use Python to read the file and look for errors. 🙂
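
Here is a rough sketch of the kind of script I have in mind (the log path matches the tee command above; the error pattern and the threshold X are just placeholders for my setup):

import re
import subprocess
from collections import Counter

LOG_FILE = "logsipfs/logsipfsservice"  # file written by the tee command above
MAX_ERRORS = 5                         # placeholder threshold (my "X")

# Raft logs the unreachable peer like:
# "Failed to make RequestVote RPC to {Voter Qm... Qm...}: dial attempt failed"
ERROR_RE = re.compile(r"Failed to make \w+ RPC to \{Voter (Qm\S+)")

error_counts = Counter()
with open(LOG_FILE) as log:
    for line in log:
        match = ERROR_RE.search(line)
        if match:
            error_counts[match.group(1)] += 1

for peer_id, count in error_counts.items():
    if count >= MAX_ERRORS:
        # drop the failing peer from the cluster peerset
        subprocess.run(["ipfs-cluster-ctl", "peers", "rm", peer_id])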

It is slightly wrong, because the state of that peer is wiped (the folder is rotated) to allow a clean bootstrap, which then copies the state from somewhere else. The reason we clean the state before bootstrapping is that you could potentially be bootstrapping a peer to a different cluster, and that could get you into a very ugly place, as you might end up with a diverging Raft log.

If the peer was part of the cluster already, you can start normally without --bootstrap.

Rather than parsing the logs (which are not meant to be parsed), you can just call the /peers endpoint of the REST API (:9094), equivalent to ipfs-cluster-ctl --enc=json peers ls, and check for peers in an error state in the response. Then call DELETE /peers/<peerID> (as long as 50% of the cluster is running).
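
For example, a minimal Python sketch of that approach could look like this (it assumes the API listens on the default 127.0.0.1:9094 and that each peer entry in the JSON response exposes "id" and "error" fields, which you can double-check in the --enc=json output):

import requests  # third-party HTTP client (pip install requests)

API = "http://127.0.0.1:9094"  # default REST API listen address

# Same data as ipfs-cluster-ctl --enc=json peers ls
peers = requests.get(f"{API}/peers").json()

for peer in peers:
    # unreachable or broken peers are expected to carry a non-empty "error" string
    if peer.get("error"):
        peer_id = peer["id"]
        print(f"removing peer {peer_id}: {peer['error']}")
        # Same as DELETE /peers/<peerID> mentioned above
        requests.delete(f"{API}/peers/{peer_id}")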

Thank you! I will make that change.