Excessive/Expected IPFS memory, threads, CPU? - private deployment

Moved here, as suggested, from GitHub issue #7850 on ipfs/go-ipfs ("Excessive/Expected IPFS memory, threads, CPU? - private deployment").

Version information:

~$ ipfs version --all
go-ipfs version: 0.7.0
Repo version: 10
System version: amd64/linux
Golang version: go1.14.4

System information:

~$ cat /proc/cpuinfo | grep "model name"
model name : Intel® Xeon® CPU E5-2697 v3 @ 2.60GHz
model name : Intel® Xeon® CPU E5-2697 v3 @ 2.60GHz

~$ cat /proc/meminfo
MemTotal: 16398484 kB
MemFree: 522740 kB
MemAvailable: 11577716 kB
Buffers: 358036 kB
Cached: 5887524 kB
SwapCached: 12876 kB
Active: 6208340 kB
Inactive: 9162904 kB
Active(anon): 3178700 kB
Inactive(anon): 1027976 kB
Active(file): 3029640 kB
Inactive(file): 8134928 kB
Unevictable: 416 kB
Mlocked: 416 kB
SwapTotal: 1003516 kB
SwapFree: 0 kB
Dirty: 4 kB
Writeback: 0 kB
AnonPages: 9113824 kB
Mapped: 134616 kB
Shmem: 66848 kB
KReclaimable: 228508 kB
Slab: 339572 kB
SReclaimable: 228508 kB
SUnreclaim: 111064 kB
KernelStack: 13744 kB
PageTables: 39352 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 9202756 kB
Committed_AS: 16393256 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 25548 kB
VmallocChunk: 0 kB
Percpu: 13168 kB
HardwareCorrupted: 0 kB
AnonHugePages: 2048 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 1658752 kB
DirectMap2M: 15118336 kB
DirectMap1G: 2097152 kB


I am running an experiment for a potential PoC of a private IPFS deployment, to determine memory and CPU consumption and to become more familiar with IPFS in general. I am not looking at a clustered deployment yet, as we don't need replication to all nodes: only some nodes will seed/pin, and the others will pull on demand.

Overview of the experiment:
Started with a minimal set of nodes (4):
Peer 1 - Primary node (holds the pinned content for the experiment). Pinned 60K+ objects: a variety of files, large (1GB x 10 files generated from /dev/urandom), medium (10MB x 100 /dev/urandom files), small (1K x 5K /dev/urandom files), and other assorted files.
Peer 2 - Consumer node (for the experiment, it just continuously pulls from the Primary, running ipfs repo gc after every 10 files).
Peer 3 - Present but not participating in any get or other operations.
Peer 4 - Present but not participating in any get or other operations.

The operation consists of getting ten 1GB files from the primary node, issuing the GC operation, and repeating, while monitoring the memory, CPU, and threads of the ipfs daemon on the consumer node.
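For anyone wanting to reproduce the measurements, here is a minimal sketch of how one round of this loop might be scripted on the consumer node. It is not the exact script used in the experiment. The function names, the CID list file (one CID per line), and the scratch directory are hypothetical; the only existing interfaces relied on are ipfs get, ipfs repo gc, and the VmRSS and Threads fields of /proc/<pid>/status on Linux. It samples memory and threads only, not CPU.

```sh
# Parse /proc/<pid>/status-formatted text on stdin and print "rss_kb threads".
parse_status() {
    awk '
        /^VmRSS:/   { rss = $2 }      # resident set size, in kB
        /^Threads:/ { threads = $2 }  # number of OS threads in the process
        END         { print rss, threads }
    '
}

# Print one sample line for a PID: "epoch_seconds rss_kb threads" (Linux only).
sample_proc() {
    echo "$(date +%s) $(parse_status < "/proc/$1/status")"
}

# One round: fetch every CID listed in a file, sampling the daemon after each
# get, then garbage-collect. All three arguments are hypothetical inputs.
run_round() {
    cids_file="$1"; daemon_pid="$2"; scratch="$3"
    while read -r cid; do
        [ -n "$cid" ] || continue                  # skip empty lines
        # stdin is redirected so ipfs cannot consume the CID list
        ipfs get "$cid" -o "$scratch/$cid" < /dev/null || return 1
        sample_proc "$daemon_pid"
        rm -rf "$scratch/$cid"                     # free disk space before the next get
    done < "$cids_file"
    ipfs repo gc > /dev/null                       # remove the fetched, unpinned blocks
    sample_proc "$daemon_pid"
}
```

Assuming a single ipfs daemon process, a round could be started with something like run_round cids.txt "$(pgrep -x ipfs)" /tmp/ipfs-scratch >> samples.log, and the resulting log plotted against time.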

I ran the same setup a couple of times over the holiday season to capture some data. In the first iteration, the ipfs get operation simply stalled and became unresponsive once the ipfs instance reached 5.8G of resident memory.

In the second iteration the system kept running until I interrupted it this morning (it was started on the morning of 2nd Jan 2021), at which point it had reached the consumption level shown on the ipfs issue page (as a new user I can only embed one image here).

So, although this is not an expected deployment model or a typical use case, what numbers should I expect at this scale in terms of memory usage, threads, and garbage collection for the ipfs daemon (running without --enable-gc), with ipfs repo gc triggered between operations?

It is really hard to get an idea of what is happening. Your machine is swapping, so no wonder everything is stuck. Yet IPFS is reported to use 3x% of memory, so is there something else causing your machine to go into swap?
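The swap state can be read straight from /proc/meminfo. The following is only a rough sketch: swap_usage is a hypothetical helper name, and it relies on nothing beyond the SwapTotal and SwapFree fields shown in the output above.

```sh
# Read /proc/meminfo-formatted text on stdin and print
# "swap_used_kb swap_used_percent".
swap_usage() {
    awk '
        /^SwapTotal:/ { total = $2 }
        /^SwapFree:/  { free = $2 }
        END {
            if (total == 0) { print "0 0"; exit }   # no swap configured
            used = total - free
            printf "%d %d\n", used, used * 100 / total
        }
    '
}
```

Run it as swap_usage < /proc/meminfo. With the values posted above (SwapTotal 1003516 kB, SwapFree 0 kB) it reports the swap as completely used, which is consistent with the machine stalling regardless of which process is responsible.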

I suggest you dig a bit deeper into correlating your usage with the moments when memory consumption increases, and try to figure out what gets stuck or which operation has the most impact.

Resources: debug-guide.md in the ipfs/go-ipfs repository on GitHub (master branch)

Also, localhost:5001/debug/metrics/prometheus lets you scrape data with Prometheus, which you can then plot (e.g. with Grafana). There should be Docker container bundles for this on the web.
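Before setting up a full Prometheus and Grafana stack, a quick look at the endpoint from the shell may already help. This is a sketch under one assumption: that the endpoint exposes the standard Go runtime and process metrics (go_goroutines, go_threads, go_memstats_heap_inuse_bytes, process_resident_memory_bytes) that the Prometheus Go client normally registers. pick_metrics is a hypothetical helper name.

```sh
# Filter Prometheus text-format metrics on stdin down to a few
# memory/thread-related series, printing "name value" per line.
pick_metrics() {
    awk '
        /^#/ { next }   # skip HELP/TYPE comment lines
        $1 == "go_goroutines" || $1 == "go_threads" ||
        $1 == "go_memstats_heap_inuse_bytes" ||
        $1 == "process_resident_memory_bytes" { print $1, $2 }
    '
}
```

For example, curl -s http://localhost:5001/debug/metrics/prometheus | pick_metrics, repeated between get and gc operations, shows whether goroutines, OS threads, or heap are what keeps growing.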