I don’t want to file a bug just yet, but there’s definitely something hinky with rc2 when compared to rc1, which possibly has to do with the changes to the connection code.
The biggest symptom is that reprovides now take vastly longer (in fact, they never complete), and the number of connections stays high: 4000+, which is normal during a reprovide, compared to around 200 when not reproviding.
My guess is that it’s failing to close certain kinds of connections, eventually reaches its maximum, and stalls, pretty much forever. Another symptom is that if I then try to stop the daemon (with ctrl-C), it doesn’t exit until I hit ctrl-C again (and I’ve waited a long time).
I’ve only gone through a couple of cycles of this so far, but the behavior was the same in both cases. I’ll continue investigating until I gain a better understanding of what’s actually going on, but I wanted to give a preliminary report, so that others can look into it as well.
Just for info, I’m not using the accelerated DHT, but I’m using the optimistic provide. Exact configuration will be provided when I file a bug.
Thank you for the early flag, @ylempereur.
This release updated go-libp2p and go-libp2p-kad-dht, so it would be good to understand where the difference comes from. It could be a regression, or it could be an effect of connectivity fixes lifting artificial limits that existed before.
On the high number of connections: I’ve seen it on a publicly dialable node that is also a DHT server, so a few follow-up questions:
Are you sure the reprovide system is slower, or is that a guess? (Is ipfs stats provide showing a lower AvgProvideDuration for a similar TotalProvides compared to the older version?)
Do you have a custom Swarm.ConnMgr in your ipfs config, or is it empty (running defaults)?
Small ask: are you able to check whether you experience the same high number of connections if you set Routing.Type to autoclient in your ipfs config and restart the node?
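For anyone wanting to try the same check, this is the sequence I mean (standard `ipfs config` commands; `Routing.Type` defaults to `auto` when unset):

```shell
# Show the current routing mode
ipfs config Routing.Type

# Switch to DHT client-only mode (node stops acting as a DHT server)
ipfs config Routing.Type autoclient

# Restart the daemon for the change to take effect
ipfs daemon
```

Switching back is just `ipfs config Routing.Type auto` and another restart.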
fwiw I see a high number of connections only when running with auto. Switching to autoclient is night and day and keeps it at a few hundred, which makes me think this is related to the DHT server somehow.
Is the ctrl-C issue occurring every time, or only after running for a while and reaching >3k connections? (I was not able to reproduce it yet.)
Shot in the dark: in -rc2 we updated from go1.22 to go1.23. Do both problems go away if you run with the env variable that restores the Go 1.22 timers (GODEBUG=asynctimerchan=1 ipfs daemon)?
I’m unfortunately on vacation right now, so it’s hard to control my node that is running at home. I’ll try and find some time to experiment in the next few days.
A couple of things I can tell you right away, though:
The reprovide NEVER completes. My node has been running more than two days now, and the first reprovide is still running (it normally takes a bit over 2 hours). It still has over 4K connections, and “ipfs stats provide” still reports all zeros.
Swarm.ConnMgr is {}
I’ll try and test the other 3 things as soon as I get a chance.
No worries. I’ve run two nodes side by side overnight and confirmed that the go1.23 timers are what makes the difference. Running with GODEBUG=asynctimerchan=1 fixes the regression.