Why do so many peers need to be constantly connected?

Hi everyone,

I woke up this morning asking myself why a full node needs so many connected peers (constantly between 600 and 900) while doing nothing (no content hosted or downloaded).

I know that Go can't easily be restricted in its resource usage, which is normal, and I don't think the problem comes from that.

[trolling]
When I start a computer and browse the Internet, the network stack doesn't use anywhere near as many resources as running an IPFS full node (up to 90% of CPU, 70% of RAM, etc.).
[/trolling]

From a naive design point of view, I would expect connections to be limited to:

  1. The closest peers, those in the same bucket, as described by the Kademlia DHT routing table
  2. Peers querying the DHT for a resource my bucket/node should be aware of (some sort of content-hash-based splitting to restrict queries to a part of the network)
  3. Peers sending updates to the resource list my bucket/node is responsible for (this can be huge, but it's ephemeral… and in a pub/sub pattern it shouldn't need a direct connection)
  4. Peers I'm downloading/uploading data from/to (which is ephemeral too)

I may need a really strong coffee to reset my mind, or maybe I'm just having a dumb day, but can someone explain to me where I'm wrong?
Does the huge number of connected peers come only from DHT flooding due to too many nodes entering and leaving the network? In that case, isn't there a technical design problem that needs refactoring, and are there ideas being explored already?

Why are between 600 and 900 peers always connected?

The IPFS connection manager defaults to a low watermark of 600 and a high watermark of 900 peers, so it will always try to keep at least 600 peers connected and will start pruning connections once it reaches 900.
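For reference, these defaults live under Swarm.ConnMgr in the config file; on a recent go-ipfs the section looks roughly like this (GracePeriod is how long a new connection is protected from pruning):

```json
"Swarm": {
  "ConnMgr": {
    "Type": "basic",
    "LowWater": 600,
    "HighWater": 900,
    "GracePeriod": "20s"
  }
}
```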

You can either apply the “lowpower” profile or manually lower these settings in the IPFS config file to reduce compute and network requirements, at the expense of content discovery performance.
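Concretely, with the stock ipfs CLI (the watermark values below are only examples; restart the daemon for the changes to take effect):

```sh
# Apply the preset lowpower profile:
ipfs config profile apply lowpower

# Or lower the watermarks by hand:
ipfs config --json Swarm.ConnMgr.LowWater 100
ipfs config --json Swarm.ConnMgr.HighWater 200
```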

Why are so many connections necessary?

IPFS has content discovery methods other than the DHT, some of which are only usable on already-connected peers, most notably bitswap sessions. Additionally, connecting to a peer is an expensive operation compared to just maintaining a keepalive for hours.

I am not sure how many connections are required to maintain the DHT, but most of the connections are kept open for bitswap performance, not DHT maintenance; a rough estimate is sketched below.
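As a back-of-envelope (assuming textbook Kademlia with bucket size k = 20; the network sizes are just guesses), the routing table only needs on the order of k·log2(N) live contacts, which is well below the 600 low watermark:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// Rough Kademlia estimate: up to k peers per bucket, with about
	// log2(N) non-empty buckets for a network of N nodes.
	// k = 20 is the usual Kademlia default; the N values are guesses.
	const k = 20.0
	for _, n := range []float64{1e3, 1e4, 1e5} {
		fmt.Printf("N = %.0f nodes: ~%.0f DHT contacts\n", n, k*math.Log2(n))
	}
}
```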

With that said, IMHO 600 peers at idle is too many for the content discovery systems currently implemented. The background resource usage is an issue most new users bring up.

As for solutions, I probed around for ideas a while back with minor success, though I never got around to writing any proof-of-concept code to test them.

Most of the ideas would probably end up either too computationally expensive relative to what they save, or too difficult to implement, but I am still interested in what rklaehn brought up about using bloom filters: that could potentially cull a huge number of wantlist requests to peers that don't have what you are looking for.
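To make that idea concrete, here is a minimal hand-rolled sketch in Go (purely illustrative; none of these types exist in go-ipfs): each peer would advertise a bloom filter built over the CIDs of the blocks it holds, and the sender would skip wantlist entries the filter rules out. A bloom filter can return false positives but never false negatives, so "definitely absent" is always safe to act on:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// BloomFilter is a toy fixed-size filter; a peer would advertise one
// built over the CIDs of the blocks it holds.
type BloomFilter struct {
	bits []uint64
	m    uint64 // number of bits
	k    int    // number of probe positions per key
}

func NewBloomFilter(m uint64, k int) *BloomFilter {
	return &BloomFilter{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// indexes derives k bit positions from two FNV hashes (double hashing).
func (b *BloomFilter) indexes(key string) []uint64 {
	h1 := fnv.New64a()
	h1.Write([]byte(key))
	a := h1.Sum64()
	h2 := fnv.New64()
	h2.Write([]byte(key))
	c := h2.Sum64() | 1 // force an odd stride so positions cycle
	idx := make([]uint64, b.k)
	for i := 0; i < b.k; i++ {
		idx[i] = (a + uint64(i)*c) % b.m
	}
	return idx
}

func (b *BloomFilter) Add(key string) {
	for _, i := range b.indexes(key) {
		b.bits[i/64] |= 1 << (i % 64)
	}
}

// MayContain is false only when the key is definitely absent, which is
// exactly the property that lets us cull wantlist sends.
func (b *BloomFilter) MayContain(key string) bool {
	for _, i := range b.indexes(key) {
		if b.bits[i/64]&(1<<(i%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical filter a peer advertised for its blockstore.
	peerBlocks := NewBloomFilter(1<<16, 4)
	peerBlocks.Add("QmSomeBlockCID")

	for _, want := range []string{"QmSomeBlockCID", "QmMissingCID"} {
		if peerBlocks.MayContain(want) {
			fmt.Println("send want for", want)
		} else {
			fmt.Println("skip peer for", want, "(definitely not held)")
		}
	}
}
```

The trade-off would be filter size versus false-positive rate, plus the cost of keeping the advertised filters fresh as peers gain and drop blocks.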

Thanks for the reply.

I forgot about bitswap sessions… (dumb, as I said)… I'm too DHT-focused.

Do you know if there's an option to disable bitswap sessions?
Something like the “--routing=dhtclient” option would be great (so I wouldn't need to go back to a really old IPFS release), as I'm trying to work around the DHT problem based on research like the following:

Sub-Second Lookups on a Large-Scale Kademlia-Based Overlay
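(For reference, that mode is passed when starting the daemon; the node then queries the DHT without serving records itself:)

```sh
ipfs daemon --routing=dhtclient
```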

Thanks.