Kubernetes IPFS Cluster Help

Good day,

I’m trying to set up an IPFS Cluster via Kubernetes/minikube. I have some questions/issues using an example file that initializes a pod with 1 cluster and 1 node.

  1. Is it required to run a cluster associated with a node every time? Can I initialize a cluster in 1 pod and nodes on other pod files for scalability or do I always need to set up ipfsclusters for each new node? What would be the best approach here?

  2. When I’m adding a new cluster to the cluster network via the CLUSTER_SECRET, it generates errors at runtime and doesn’t replicate data correctly. It requires a second boot to replicate existing data properly. Here are the logs:

2021-06-30T03:14:38.671Z	INFO	cluster	ipfs-cluster/cluster.go:651	Cluster Peers (without including ourselves):
2021-06-30T03:14:38.671Z	INFO	cluster	ipfs-cluster/cluster.go:653	    - No other peers
2021-06-30T03:14:38.671Z	INFO	cluster	ipfs-cluster/cluster.go:666	** IPFS Cluster is READY **
2021-06-30T03:14:38.713Z	ERROR	ipfshttp	ipfshttp/ipfshttp.go:564	error posting to IPFS:Post "http://127.0.0.1:5001/api/v0/repo/stat?size-only=true": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:38.790Z	ERROR	ipfshttp	ipfshttp/ipfshttp.go:726	Post "http://127.0.0.1:5001/api/v0/repo/stat?size-only=true": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:38.791Z	ERROR	p2p-gorpc	go-libp2p-gorpc@v0.1.2/call.go:63	Post "http://127.0.0.1:5001/api/v0/repo/stat?size-only=true": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:38.791Z	ERROR	diskinfo	disk/disk.go:96	Post "http://127.0.0.1:5001/api/v0/repo/stat?size-only=true": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:38.837Z	WARN	monitor	pubsubmon/pubsubmon.go:200	discarding invalid metric: &{Name:freespace Peer:12D3KooWCtXAyqWsBrBMLbZRWaevcHKpJeG7toqrY9mPiMLS51Ys Value:0 Expire:1625022908835596727 Valid:false ReceivedAt:0}
2021-06-30T03:14:39.289Z	ERROR	ipfshttp	ipfshttp/ipfshttp.go:564	error posting to IPFS:Post "http://127.0.0.1:5001/api/v0/id": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:39.389Z	ERROR	ipfshttp	ipfshttp/ipfshttp.go:564	error posting to IPFS:Post "http://127.0.0.1:5001/api/v0/repo/stat?size-only=true": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:39.392Z	ERROR	ipfshttp	ipfshttp/ipfshttp.go:726	Post "http://127.0.0.1:5001/api/v0/repo/stat?size-only=true": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:39.392Z	ERROR	p2p-gorpc	go-libp2p-gorpc@v0.1.2/call.go:63	Post "http://127.0.0.1:5001/api/v0/repo/stat?size-only=true": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:39.394Z	ERROR	diskinfo	disk/disk.go:96	Post "http://127.0.0.1:5001/api/v0/repo/stat?size-only=true": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:39.398Z	WARN	monitor	pubsubmon/pubsubmon.go:200	discarding invalid metric: &{Name:freespace Peer:12D3KooWCtXAyqWsBrBMLbZRWaevcHKpJeG7toqrY9mPiMLS51Ys Value:0 Expire:1625022909396226048 Valid:false ReceivedAt:0}
2021-06-30T03:14:39.418Z	ERROR	ipfshttp	ipfshttp/ipfshttp.go:564	error posting to IPFS:Post "http://127.0.0.1:5001/api/v0/pin/ls?type=recursive": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:39.420Z	ERROR	p2p-gorpc	go-libp2p-gorpc@v0.1.2/call.go:63	Post "http://127.0.0.1:5001/api/v0/pin/ls?type=recursive": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:39.421Z	ERROR	pintracker	stateless/stateless.go:443	Post "http://127.0.0.1:5001/api/v0/pin/ls?type=recursive": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:39.421Z	ERROR	pintracker	stateless/stateless.go:492	Post "http://127.0.0.1:5001/api/v0/pin/ls?type=recursive": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:39.421Z	INFO	cluster	ipfs-cluster/cluster.go:1020	12D3KooWCtXAyqWsBrBMLbZRWaevcHKpJeG7toqrY9mPiMLS51Ys: joined QmUZuikUkuDqVbaKNaAtjvpf7x9Pg8sVLevA3kzHhjFYQU's cluster
2021-06-30T03:14:46.901Z	INFO	crdt	crdt/consensus.go:227	new pin added: Qmf8oj9wbfu73prNAA1cRQVDqA52gD5B3ApnYQQjcjffH4
2021-06-30T03:14:46.904Z	ERROR	ipfshttp	ipfshttp/ipfshttp.go:564	error posting to IPFS:Post "http://127.0.0.1:5001/api/v0/pin/ls?arg=Qmf8oj9wbfu73prNAA1cRQVDqA52gD5B3ApnYQQjcjffH4&type=recursive": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:46.904Z	ERROR	p2p-gorpc	go-libp2p-gorpc@v0.1.2/call.go:63	Post "http://127.0.0.1:5001/api/v0/pin/ls?arg=Qmf8oj9wbfu73prNAA1cRQVDqA52gD5B3ApnYQQjcjffH4&type=recursive": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:46.932Z	INFO	crdt	crdt/consensus.go:227	new pin added: QmXsVdkcq5DVDSs1ZKVBwTaPibNeZijYgcfCcTF8xyvACM
2021-06-30T03:14:46.934Z	ERROR	ipfshttp	ipfshttp/ipfshttp.go:564	error posting to IPFS:Post "http://127.0.0.1:5001/api/v0/pin/ls?arg=QmXsVdkcq5DVDSs1ZKVBwTaPibNeZijYgcfCcTF8xyvACM&type=recursive": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:46.934Z	ERROR	p2p-gorpc	go-libp2p-gorpc@v0.1.2/call.go:63	Post "http://127.0.0.1:5001/api/v0/pin/ls?arg=QmXsVdkcq5DVDSs1ZKVBwTaPibNeZijYgcfCcTF8xyvACM&type=recursive": dial tcp 127.0.0.1:5001: connect: connection refused
2021-06-30T03:14:46.936Z	INFO	crdt	go-ds-crdt@v0.1.20/heads.go:134	adding new DAG head: bafybeig65kdo4lg6de6puojxmeewzfm2a6iuszauugrrmdjjuq2rvhqedm (height: 2)
  3. I added a new file to the cluster and I could access it from all nodes of that cluster and the other clusters. After adding a new cluster to this existing cluster network, I could not access that file. It stayed loading indefinitely. Then, I cancelled that request, ran ipfs-cluster-ctl status and got this error:
/ # ipfs-cluster-ctl status
Qmf8oj9wbfu73prNAA1cRQVDqA52gD5B3ApnYQQjcjffH4:
    > ipfs-cluster-4       : PIN_ERROR: context canceled | 2021-06-30T03:19:16.630883712Z
    > ipfs-cluster-2       : PINNED | 2021-06-30T03:26:56.155684825Z
    > ipfs-cluster-1       : PINNED | 2021-06-30T03:26:56.165378395Z
    > ipfs-cluster-3       : PINNED | 2021-06-30T03:26:56.160042279Z
    > ipfs-cluster-0       : PINNED | 2021-06-30T03:26:56.170677135Z
QmXsVdkcq5DVDSs1ZKVBwTaPibNeZijYgcfCcTF8xyvACM:
    > ipfs-cluster-4       : PIN_ERROR: context canceled | 2021-06-30T03:19:16.658331509Z
    > ipfs-cluster-2       : PINNED | 2021-06-30T03:26:56.155811442Z
    > ipfs-cluster-1       : PINNED | 2021-06-30T03:26:56.165428762Z
    > ipfs-cluster-3       : PIN_ERROR: context canceled | 2021-06-30T03:19:16.648266711Z
    > ipfs-cluster-0       : PINNED | 2021-06-30T03:26:56.170748888Z

Can anybody help me understand what is happening here?

Thanks in advance for any help! :smiley:

There should be 1 cluster peer per IPFS daemon.

What these logs say is that the IPFS API is not reachable at 127.0.0.1:5001 as expected. I am not sure why a reboot fixes it.
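A "connection refused" on 127.0.0.1:5001 means nothing was accepting connections on that port at that moment. One way to check whether this is only a startup-ordering issue (an assumption; the logs here do not confirm it) is to poll the port before the cluster peer starts. A minimal Python sketch; `wait_for_port` and its parameters are hypothetical names, not part of IPFS or IPFS Cluster:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False if timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait a bit and retry.
            time.sleep(interval)
    return False


# Hypothetical usage: block until the IPFS API port answers.
# ready = wait_for_port("127.0.0.1", 5001)
```

If this returns False even after a generous timeout, the problem is probably not ordering but the IPFS container itself failing to start.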

“Context canceled” usually means the cluster peer asked IPFS to pin something and the operation timed out. Perhaps the IPFS daemon is not well connected to the rest of the network or has some issue (check its logs?).

At least it seems your cluster peers are well connected and can see each other.
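To see quickly which peers are failing for which CID, the text printed by ipfs-cluster-ctl status can be filtered. The sketch below assumes the output has exactly the layout pasted above (a CID line ending in a colon, then one `> peer : STATUS | timestamp` line per peer); `pin_errors` is a hypothetical helper, not part of ipfs-cluster-ctl:

```python
def pin_errors(status_text: str) -> dict:
    """Map each CID to the list of peer names reporting PIN_ERROR."""
    errors = {}
    cid = None
    for line in status_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(">"):
            # Peer line: "> name : STATUS[: detail] | timestamp"
            peer, _, rest = stripped[1:].partition(":")
            state = rest.split("|")[0].strip()
            if cid is not None and state.startswith("PIN_ERROR"):
                errors.setdefault(cid, []).append(peer.strip())
        elif stripped.endswith(":"):
            # Header line: "<CID>:"
            cid = stripped[:-1]
    return errors
```

Fed the status output from the question, this would list ipfs-cluster-4 for both CIDs and ipfs-cluster-3 for the second one, which points at those peers' IPFS daemons as the place to look.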


Hello @hector,

After doing some testing, my partner was able to fix this error by assigning a lower amount of storage to the cluster and its node (minikube did not have enough default storage allocated, so it crashed). However, I’m still having this crashloop error. Whenever I initialize a new replica of IPFS Cluster plus IPFS node, it crashes. Then, on delete/restart, it works fine. These are the logs. It seems my IPFS node denies some sort of permission on first startup:

  1. Created 5 replicas; it stopped at 2 because the 2nd one crashed.

  2. Upon looking at the logs of that new replica, ipfs-cluster-1, the cluster had issues accessing the IPFS node. It said “connection refused”.

  3. I can’t see the IPFS node logs due to some sort of lack of permission.

I am not sure. These errors (the ipfs container, running as the ipfs user, not having access to its own configuration) come from your Kubernetes deployment configuration: how it mounts volumes, with what permissions, etc.
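One way to confirm a volume-permission problem is to inspect ownership and mode of the mounted path from inside the container, as the same user the daemon runs as. A small Python sketch, assuming Python is available in that environment; `check_access` is a hypothetical helper and the path in the usage comment is only a placeholder for wherever your deployment mounts the IPFS data:

```python
import os


def check_access(path: str) -> dict:
    """Report owner, mode, and whether the current user can read/write path."""
    st = os.stat(path)
    return {
        "path": path,
        "owner_uid": st.st_uid,
        "owner_gid": st.st_gid,
        "mode": oct(st.st_mode & 0o777),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
        "current_uid": os.getuid(),
    }


# Hypothetical usage (placeholder path, adjust to your volume mount):
# print(check_access("/path/to/ipfs/data"))
```

If `owner_uid` differs from `current_uid` and `writable` is False, the volume was likely created with an owner or mode that the ipfs user cannot use.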