I use ipfs/go-ipfs:latest to serve IPFS data to a Helia client. The server runs under Docker. Sometimes I get strange behavior from the server: it stops delivering data to Helia. When I look at datastore/LOG, I see the following at the time the server stops delivering data:
09:45:04.523321 db@close closing
09:45:04.523681 db@close done T·334.55µs
It’s actually strange, as the Docker container is still running at that time. When I restart it the error disappears, but then it reappears at a random time.
Here is my configuration:
Dockerfile.ipfs
# Dockerfile.ipfs
FROM ipfs/go-ipfs:latest
# Initialize IPFS with the server profile
RUN ipfs init --profile=server && \
# Allow all origins for API access
ipfs config --json API.HTTPHeaders.Access-Control-Allow-Origin '["*"]' && \
# Set the maximum storage limit to 40GB
ipfs config Datastore.StorageMax "40GB"
Thank you very much for the reply. I’m not sure about the previous case, as I didn’t check whether the ipfs daemon was running. However, a similar issue appeared yesterday: this time the LOG had no entry about the DB closing and the daemon was running, yet Helia wasn’t able to connect to go-ipfs.
Here is the most recent log before I restarted the container and Helia reconnected:
WARN[0000] network default: network.external.name is deprecated in favor of network.name
ipfs_slonig_org | Changing user to ipfs
ipfs_slonig_org | ipfs version 0.27.0
ipfs_slonig_org | Found IPFS fs-repo at /data/ipfs
ipfs_slonig_org | Initializing daemon...
ipfs_slonig_org | Kubo version: 0.27.0-59bcea8
ipfs_slonig_org | Repo version: 15
ipfs_slonig_org | System version: amd64/linux
ipfs_slonig_org | Golang version: go1.21.7
ipfs_slonig_org | 2024/03/15 01:00:03 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes for details.
ipfs_slonig_org | Swarm listening on /ip4/127.0.0.1/tcp/4001
ipfs_slonig_org | Swarm listening on /ip4/127.0.0.1/udp/4001/quic-v1
ipfs_slonig_org | Swarm listening on /ip4/127.0.0.1/udp/4001/quic-v1/webtransport/certhash/uEiDhxeqn95zUHSZKgmxghEW2hALByB5HaKgdfRVcMTsFZQ/certhash/uEiDGlzl6dkqqkkj6AHRc0gGLUJlCvl6xCnqR_WYhP1SDJg
ipfs_slonig_org | Swarm listening on /ip4/192.168.128.2/tcp/4001
ipfs_slonig_org | Swarm listening on /ip4/192.168.128.2/udp/4001/quic-v1
ipfs_slonig_org | Swarm listening on /ip4/192.168.128.2/udp/4001/quic-v1/webtransport/certhash/uEiDhxeqn95zUHSZKgmxghEW2hALByB5HaKgdfRVcMTsFZQ/certhash/uEiDGlzl6dkqqkkj6AHRc0gGLUJlCvl6xCnqR_WYhP1SDJg
ipfs_slonig_org | Swarm listening on /p2p-circuit
ipfs_slonig_org | Swarm announcing /ip4/127.0.0.1/tcp/4001
ipfs_slonig_org | Swarm announcing /ip4/127.0.0.1/udp/4001/quic-v1
ipfs_slonig_org | Swarm announcing /ip4/127.0.0.1/udp/4001/quic-v1/webtransport/certhash/uEiDhxeqn95zUHSZKgmxghEW2hALByB5HaKgdfRVcMTsFZQ/certhash/uEiDGlzl6dkqqkkj6AHRc0gGLUJlCvl6xCnqR_WYhP1SDJg
ipfs_slonig_org | Swarm announcing /ip4/192.168.128.2/tcp/4001
ipfs_slonig_org | Swarm announcing /ip4/192.168.128.2/udp/4001/quic-v1
ipfs_slonig_org | Swarm announcing /ip4/192.168.128.2/udp/4001/quic-v1/webtransport/certhash/uEiDhxeqn95zUHSZKgmxghEW2hALByB5HaKgdfRVcMTsFZQ/certhash/uEiDGlzl6dkqqkkj6AHRc0gGLUJlCvl6xCnqR_WYhP1SDJg
ipfs_slonig_org | Swarm announcing /ip4/65.109.58.6/udp/4001/quic-v1
ipfs_slonig_org | Swarm announcing /ip4/65.109.58.6/udp/4001/quic-v1/webtransport/certhash/uEiDhxeqn95zUHSZKgmxghEW2hALByB5HaKgdfRVcMTsFZQ/certhash/uEiDGlzl6dkqqkkj6AHRc0gGLUJlCvl6xCnqR_WYhP1SDJg
ipfs_slonig_org | RPC API server listening on /ip4/0.0.0.0/tcp/5001
ipfs_slonig_org | WebUI: http://0.0.0.0:5001/webui
ipfs_slonig_org | Gateway server listening on /ip4/0.0.0.0/tcp/8080
ipfs_slonig_org | Daemon is ready
ipfs_slonig_org | 2024/03/15 01:20:32 websocket: failed to close network connection: close tcp 192.168.128.2:39916->147.75.87.27:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 01:20:32 websocket: failed to close network connection: close tcp 192.168.128.2:39904->147.75.87.27:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 02:36:08 websocket: failed to close network connection: close tcp 192.168.128.2:49330->192.227.67.185:4002: use of closed network connection
ipfs_slonig_org | 2024/03/15 06:20:21 websocket: failed to close network connection: close tcp 192.168.128.2:37438->139.178.91.71:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 06:35:34 websocket: failed to close network connection: close tcp 192.168.128.2:41596->139.178.88.95:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 06:50:36 websocket: failed to close network connection: close tcp 192.168.128.2:44002->139.178.91.71:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 07:05:53 websocket: failed to close network connection: close tcp 192.168.128.2:49284->139.178.88.95:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 17:25:54 websocket: failed to close network connection: close tcp 192.168.128.2:54096->192.227.67.185:4002: use of closed network connection
ipfs_slonig_org | 2024/03/15 17:52:52 websocket: failed to close network connection: close tcp 192.168.128.2:50586->139.178.88.95:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 18:22:09 websocket: failed to close network connection: close tcp 192.168.128.2:36062->139.178.88.95:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 18:35:42 websocket: failed to close network connection: close tcp 192.168.128.2:46374->139.178.91.71:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 18:51:37 websocket: failed to close network connection: close tcp 192.168.128.2:46062->139.178.88.95:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 19:21:30 websocket: failed to close network connection: close tcp 192.168.128.2:55856->139.178.88.95:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 19:39:50 websocket: failed to close network connection: close tcp 192.168.128.2:38258->147.75.87.27:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 19:40:22 websocket: failed to close network connection: close tcp 192.168.128.2:56382->139.178.91.71:443: use of closed network connection
ipfs_slonig_org | 2024/03/15 21:37:03 websocket: failed to close network connection: close tcp 192.168.128.2:58404->147.75.87.27:443: use of closed network connection
Is it possible to not let Docker restart the container, so you can get the logs before it actually dies? Is it running out of file descriptors? Do the system logs say anything around that time? Is Docker actually restarting it? Can you check the container’s uptime?
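For example, something like this (assuming the container is named ipfs_slonig_org, as in your log prefix):
# tell Docker not to restart the container automatically, so its last logs are preserved
docker update --restart=no ipfs_slonig_org
# has Docker restarted it, and when was it started?
docker inspect -f 'restarts={{.RestartCount}} started={{.State.StartedAt}}' ipfs_slonig_org
# rough file-descriptor check; replace 7 with the PID of "ipfs daemon" inside the container
docker exec ipfs_slonig_org sh -c 'ls /proc/7/fd | wc -l; grep -i "open files" /proc/7/limits'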
docker exec -it ipfs_slonig_org /bin/sh
date
Fri Mar 15 22:13:29 UTC 2024
top
Mem: 63651296K used, 2106104K free, 8204K shrd, 4135340K buff, 42111900K cached
CPU: 0.2% usr 0.1% sys 0.0% nic 99.4% idle 0.1% io 0.0% irq 0.0% sirq
Load average: 1.01 0.83 0.86 1/1711 61311
PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND
7 1 ipfs S 8238m 12.4 3 0.0 ipfs daemon --migrate=true --enable-gc
61222 0 root S 4404 0.0 8 0.0 /bin/sh
61311 61222 root R 4404 0.0 7 0.0 top
1 0 root S 2472 0.0 11 0.0 /sbin/tini -- /usr/local/bin/start_ipfs daemon --migrate=true --enable-gc
As you can see, “ipfs daemon” had been running for about 6 days until Helia stopped connecting.
Could you recommend which log data to collect? I use Ubuntu 22.04.
Isn’t that supposed to just show the current date?
Your Docker host must have a syslog (journalctl). What does it mean that Helia stops connecting, though? What error? What is docker ps saying about the container?
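For example, adjusting the time window to roughly when Helia stopped getting data:
# container status as Docker sees it: running, restarting, or exited with a code
docker ps -a --filter name=ipfs_slonig_org
# Docker daemon log and container lifecycle events around that time
journalctl -u docker.service --since "2024-03-15 20:00" --until "2024-03-15 23:00"
docker events --since "2024-03-15T20:00:00" --until "2024-03-15T23:00:00" --filter container=ipfs_slonig_org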
I mean that “8238m 12.4 3 0.0 ipfs daemon” shows that the container had been running for a prolonged time and wasn’t restarted. journalctl didn’t show any notable info at that time. I didn’t run docker ps last time, but as I’ve mentioned, “top” inside the container showed that the container had not just been restarted.
“Helia stops connecting” means that get requests take more than 1 minute and never finish.
Your container has only lived 7 hours, so it was likely restarted after dying? Or did you restart it? I think you need to get more familiar with running applications under Docker and with system administration; I’m sure there is some log on your system that shows the issue, if there is one.
And what do you mean by “stops delivering data to Helia”? You are also perhaps running some nginx proxy on top. You have sent several unrelated errors too, as the “db@close” entry hasn’t been seen again. The other errors are not fatal AFAIK, but you don’t even say whether they caused the process to die.
I have no crystal ball, but I think the problem, if any, is in your application, your proxy, or your setup, and if you cannot do basic troubleshooting, like actually showing the logs from the moment Kubo dies, it’s going to be very difficult to help you further.
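Next time Helia hangs, it would also help to check whether Kubo itself still serves data, bypassing Helia and any proxy. For example, from the Docker host (replace <CID> with one of the CIDs Helia is requesting, and adjust host/ports to however you publish 8080 and 5001):
# does the gateway still return the block within a minute?
curl -m 60 -o /dev/null -w '%{http_code} in %{time_total}s\n' "http://127.0.0.1:8080/ipfs/<CID>"
# does the Kubo RPC API still respond?
curl -m 10 -X POST "http://127.0.0.1:5001/api/v0/id"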
Hector, thank you very much for the prompt reply. The container lived 7 hours because of a planned backup operation - I stop it every midnight. Could you clarify which system log you would like to see?
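For example, I could collect something like this around the time it happens, if that is what you mean:
# kernel messages (e.g. OOM kills) around the incident
journalctl -k --since "2024-03-15 20:00" --until "2024-03-15 23:00"
# full system journal for the same window
journalctl --since "2024-03-15 20:00" --until "2024-03-15 23:00"
# the container's own output, with timestamps
docker logs --timestamps --since "2024-03-15T20:00:00" ipfs_slonig_org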