Help: ipfs-cluster is filling my disk

I'm running a two-node IPFS cluster. Disk space is quite limited on one of the machines. I do need to pin some files, but the data I need to pin is very small, only a few MB. The problem is that the folder .ipfs-cluster/badger grows until it fills the available disk space on one of the cluster nodes. The folder contains over 20GB of data when it maxes out the disk. Incidentally, this is much larger than all of the data in .ipfs.
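For reference, I am measuring the folder sizes with du (these are the default paths on my nodes):

    du -sh ~/.ipfs-cluster/badger ~/.ipfs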

Is there a way to cap the size of this badger database, ideally at 10GB or less? If so, how, and what would the consequences of doing so be?

Thanks for any help!

Hello,

See "Support a levelDB backend for cluster by hsanjuan · Pull Request #1364 · ipfs/ipfs-cluster · GitHub", which addresses "very large database · Issue #1320 · ipfs/ipfs-cluster · GitHub".

Great, thanks for the pointer! As I understand it, LevelDB support is due in version 0.14.0. Do we have any idea when that will be released?

There is a 0.14.0-rc2 release that you can try: https://dist.ipfs.io/ipfs-cluster-service/v0.14.0-rc2
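If you want to try the LevelDB backend in the RC, a fresh init should let you select it. If I remember the flag from the PR correctly it looks like this, but check ipfs-cluster-service init --help to be sure:

    ipfs-cluster-service init --datastore leveldb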

Also, I have added GC to badger in cluster, but I have not yet tested how it performs. It should help with this problem too.
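The GC settings live in the badger section of service.json. From memory the relevant keys look roughly like this, but please verify them against a freshly generated default config:

    "datastore": {
      "badger": {
        "gc_discard_ratio": 0.2,
        "gc_interval": "15m0s",
        "gc_sleep": "10s"
      }
    }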

Hi,
So, running 0.14.0-rc2 with a fresh init and default parameters (just copying the peerstore and identity files from the previous installation):

  • There is still a badger database using a lot of disk space; not as much as before, but still a lot.
  • There seems to be a memory leak: over a few days the ipfs-cluster-service process gradually ate up all the RAM.
  • I see this error message frequently in the log: ipfs ipfs-cluster-service[23277]: 2021-07-12T10:04:33.414Z ERROR monitor pubsubmon/pubsubmon.go:136 msgpack decode error [pos 1]: only encoded map or array can decode into struct

  • You can use 0.14.0 now.
  • What do you call "a lot"? The GC cycle happens every 15 minutes by default, and for me it correctly shrank the badger folders to a very acceptable size (from 100GB to 1GB). If you are very limited on space, GC may not help you much, and the only way to make it work would be to play with the badger configuration settings around table sizes and the like (see the sketch at the end of this post); you may be better off with leveldb then.
  • I am not sure about the memory leak; I do not see it on my own peers. If you can run with ipfs-cluster-service --stats daemon and get a heap profile for me with curl localhost:8888/debug/pprof/heap > heap, that would help. Other than that, how are you using your nodes? Are they idling while this is happening?
  • The error message appears because not all of your peers are on 0.14.0; you need to upgrade all peers.

Also curl localhost:8888/debug/pprof/goroutine > goroutines.
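For completeness, the table-size tuning I mentioned goes under badger_options in the same datastore section of service.json; those keys mirror Badger's own Go options, so treat the names and values below as illustrative and check them against your generated config:

    "datastore": {
      "badger": {
        "badger_options": {
          "max_table_size": 16777216,
          "value_log_file_size": 1073741823
        }
      }
    }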

Hi Hector,
Thanks for the help. I am now using v0.14.0.

I am unable to reproduce the memory leak. I'm not sure what happened before; either I was mistaken or it's fixed.

Using the LevelDB datastore, the size on disk is much smaller. It seems to have stabilised at about 1.3GB.

Everything seems to be working now. Thanks!