My team and I are running the ipfs/go-ipfs:v0.25.0 image in Kubernetes in a single pod. We've found that lately the node is using a huge amount of resources and we're not quite sure how to debug it. We currently have 40 GB of memory and 16 cores allocated to it, and the pod OOMs every ~2 days.
Here are what I think are the most relevant metrics; happy to pull any others. The red at the top of each graph marks an OOM event, and darker red means more OOMs in that metric interval.
Would recommend starting with kubo/docs/debug-guide.md at master · ipfs/kubo · GitHub and in particular getting a profile dump while the memory usage is high. This will let you run Go's `go tool pprof -http=:1234` on the heap file and see where all the memory is going.
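For reference, a minimal sketch of that workflow, assuming the node's RPC API is on the default port 5001 and reachable from where you run the commands (e.g. via `kubectl port-forward` or by exec'ing into the pod):

```sh
# Collect a full diagnostic bundle (heap, goroutine, CPU profiles) from the running node;
# it writes a zip you can copy out of the pod with `kubectl cp`
ipfs diag profile

# Or grab just the current heap profile from the RPC API
curl -o ipfs.heap http://127.0.0.1:5001/debug/pprof/heap

# Open an interactive view of where the memory is allocated
go tool pprof -http=:1234 ipfs.heap
```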
I'll chime in with the same issue. I upgraded to 0.29 a few days ago and increased memory from 4G to 16G.
Please check the Grafana RAM chart; the thin white lines are where kubo gets OOM-killed by the kernel and restarts.
It looks like a memory leak to me.
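If you want to confirm that rather than just eyeball the RSS chart, one option (a sketch, again assuming the default RPC port 5001) is to take two heap snapshots a few hours apart and let pprof diff them:

```sh
# Snapshot shortly after a restart...
curl -o heap-early.pb.gz http://127.0.0.1:5001/debug/pprof/heap

# ...and again once memory has clearly grown
curl -o heap-late.pb.gz http://127.0.0.1:5001/debug/pprof/heap

# Show only the allocations that grew between the two snapshots
go tool pprof -diff_base=heap-early.pb.gz -http=:1234 heap-late.pb.gz
```

Whatever keeps growing in that diff is the thing worth reporting upstream along with the profiles.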