Hello everyone,
I’m currently working on setting up an IPFS server for a production application and I could use some assistance with configuring it properly. Here’s the setup and the issue I’m facing:
Setup Details
- Master Server: AWS m6a.2xlarge machine
  - Specifications:
    - vCPU: 8
    - Memory: 32 GiB
    - Network Performance: Up to 12.5 Gbps
  - Purpose: Handles file uploads
  - Configuration: Running IPFS as a Docker image
- Cluster Nodes: Two other nodes serving as IPFS gateways on the internet
- Additional Info: Running IPFS-cluster image on all machines
Problem Description
I encounter an issue where IPFS commands start hanging indefinitely, particularly ipfs add and ipfs pin/ls. Some commands still work, but overall functionality is severely impaired. Interestingly, when ConnMgr is enabled the machine does not appear to run out of resources, yet the commands still hang. Without the ConnMgr settings, all RAM is eventually used up. Another interesting detail: if I restart the Docker container while it is in that state, the applications waiting on a command receive their response, including the CID of the file.
I’ve tried using Kubo versions 0.27, 0.28, and 0.29, but the issue persists across all versions.
Additional Details
- Repo Stats:
/ # ipfs repo stat
NumObjects: 2390888
RepoSize: 129242800328
StorageMax: 190000000000
RepoPath: /data/ipfs
Version: fs-repo@15
- Full Configuration:
{
  "API": {
    "HTTPHeaders": {}
  },
  "Addresses": {
    "API": "/ip4/0.0.0.0/tcp/5001",
    "Announce": null,
    "AppendAnnounce": null,
    "Gateway": "/ip4/0.0.0.0/tcp/8080",
    "NoAnnounce": null,
    "Swarm": [
      "/ip4/0.0.0.0/tcp/4001",
      "/ip6/::/tcp/4001",
      "/ip4/0.0.0.0/udp/4001/quic-v1",
      "/ip4/0.0.0.0/udp/4001/quic-v1/webtransport",
      "/ip6/::/udp/4001/quic-v1",
      "/ip6/::/udp/4001/quic-v1/webtransport"
    ]
  },
  "AutoNAT": {},
  "Bootstrap": [
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt",
    "/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
    "/ip4/104.131.131.82/udp/4001/quic-v1/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb",
    "xxx(node 1 address)",
    "xxx(node 2 address)"
  ],
  "DNS": {
    "Resolvers": {}
  },
  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "190GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": true
    }
  },
  "Experimental": {
    "FilestoreEnabled": false,
    "GraphsyncEnabled": false,
    "Libp2pStreamMounting": false,
    "P2pHttpProxy": false,
    "StrategicProviding": false,
    "UrlstoreEnabled": false
  },
  "Gateway": {
    "APICommands": [],
    "HTTPHeaders": {},
    "NoDNSLink": false,
    "NoFetch": false,
    "PathPrefixes": [],
    "PublicGateways": {
      "xxx.com": {
        "Paths": [
          "/ipfs",
          "/ipns"
        ],
        "UseSubdomains": true
      },
      "xxx.io": {
        "Paths": [
          "/ipfs",
          "/ipns"
        ],
        "UseSubdomains": true
      }
    },
    "RootRedirect": "",
    "Writable": false
  },
  "Identity": {
    "PeerID": "xxx",
    "PrivKey": "xxx"
  },
  "Internal": {},
  "Ipns": {
    "RecordLifetime": "",
    "RepublishPeriod": "",
    "ResolveCacheSize": 128,
    "UsePubsub": true,
    "MaxCacheTTL": "1m"
  },
  "Migration": {
    "DownloadSources": [],
    "Keep": ""
  },
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns"
  },
  "Peering": {
    "Peers": null
  },
  "Pinning": {
    "RemoteServices": {}
  },
  "Plugins": {
    "Plugins": null
  },
  "Provider": {
    "Strategy": ""
  },
  "Pubsub": {
    "DisableSigning": false,
    "Router": ""
  },
  "Reprovider": {
    "Interval": "11h0m0s",
    "Strategy": "pinned"
  },
  "Routing": {
    "AcceleratedDHTClient": true,
    "Methods": null,
    "Routers": null
  },
  "Swarm": {
    "AddrFilters": null,
    "ConnMgr": {
      "Enabled": true,
      "LowWater": 1500,
      "HighWater": 2500,
      "GracePeriod": "3m"
    },
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": true,
    "RelayClient": {},
    "RelayService": {},
    "ResourceMgr": {
      "Enabled": true,
      "Limits": {},
      "MaxMemory": "20GiB"
    },
    "Transports": {
      "Multiplexers": {},
      "Network": {},
      "Security": {}
    }
  }
}
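For reference, here is a short sketch of the arithmetic behind the repo stats and Datastore settings above. The numbers are copied from the ipfs repo stat output and the config. It assumes StorageGCWatermark is a percentage of StorageMax, and the variable names are only illustrative.

```python
# Values copied from the `ipfs repo stat` output and the Datastore config.
num_objects = 2_390_888
repo_size = 129_242_800_328      # bytes (RepoSize)
storage_max = 190_000_000_000    # bytes (StorageMax, "190GB")
gc_watermark_pct = 90            # StorageGCWatermark

# Assumption: GC becomes eligible at StorageGCWatermark percent of StorageMax.
gc_threshold = storage_max * gc_watermark_pct // 100
usage_pct = 100 * repo_size / storage_max
headroom = gc_threshold - repo_size
avg_object_size = repo_size / num_objects

print(f"repo usage:      {usage_pct:.1f}% of StorageMax")
print(f"GC threshold:    {gc_threshold / 1e9:.0f} GB")
print(f"headroom to GC:  {headroom / 1e9:.1f} GB")
print(f"avg object size: {avg_object_size / 1024:.1f} KiB")
```

By this calculation the repo is at roughly 68% of StorageMax, so it is still below the 90% watermark.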
Specific Issues
- Command Hanging: ipfs add and ipfs pin/ls commands hang indefinitely.
- Docker Image Unhealthy: Without the ConnMgr settings, the Docker container running IPFS frequently becomes unhealthy and requires manual intervention to reset.
Questions
- How can I resolve IPFS hanging on certain commands?
- Is it possible to set a timeout for commands?
- Are there any recommended configurations or best practices for running IPFS in a production environment, especially regarding connection and resource management?
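To make the timeout question concrete, a client-side timeout could look roughly like the sketch below. It is only an illustration, not something from my actual setup: `run_with_timeout` is a hypothetical helper, and the commented usage assumes the ipfs binary is on PATH and uses a placeholder file name.

```python
import subprocess
from typing import Optional, Sequence


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> Optional[str]:
    """Run a command and return its stdout, or None if it exceeds timeout_s.

    Hypothetical helper for illustration; it is not part of Kubo.
    """
    try:
        result = subprocess.run(
            list(cmd),
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return None
    return result.stdout


# Example usage (assumes `ipfs` is on PATH; "file.bin" is a placeholder):
#   cid = run_with_timeout(["ipfs", "add", "-Q", "file.bin"], timeout_s=60)
#   print("timed out" if cid is None else f"CID: {cid.strip()}")
```

This would only stop the calling application from waiting forever. It does not explain or fix why the daemon stalls, which is the part I need help with.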
Any advice or suggestions would be greatly appreciated! Thanks in advance for your help.
Best regards,
Tine