Number of peers far exceeds lowWater/highWater settings

Hi,

I’m running IPFS on my desktop with lowWater set to 50 and highWater set to 300. I got these numbers from this topic. Do note that this has nothing to do with the Brave browser!
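(For reference, those limits live under Swarm.ConnMgr in the go-ipfs config. Something along these lines should set them; the ipfs_host container name matches the script further down, and the daemon needs a restart to pick the new values up.)

# The watermarks are integers, so --json is needed when setting them.
docker exec ipfs_host ipfs config --json Swarm.ConnMgr.LowWater 50
docker exec ipfs_host ipfs config --json Swarm.ConnMgr.HighWater 300
# Restart so the daemon picks up the new config.
docker restart ipfs_host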

Now I’ve always suspected those numbers to be… not so meaningful, as you can get a load more peers than you set. Or at least, it’s my impression that you set the minimum and maximum number of peers. They’re just very confusingly named, with “water” as a suffix.

Now I wrote a tiny script to get my peers and log them to a file with a timestamp:

#!/usr/bin/env bash
# Log the current UTC timestamp and the number of connected peers.
curr_date=$(date -u --rfc-3339=seconds)
num_peers=$(docker exec ipfs_host ipfs swarm peers | wc -l)
echo "$curr_date;$num_peers" >> ipfs_peer_count

Then, after some minutes I plot the result, which gives this surprising image:

The image contains about 10 minutes of log data.
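(In case anyone wants to reproduce the plot: gnuplot can chart the semicolon-separated log directly. The sketch below plots the count against the sample index and ignores the timestamps in column 1 to keep it simple.)

gnuplot -persist <<'EOF'
set datafile separator ";"
set xlabel "sample"
set ylabel "connected peers"
plot "ipfs_peer_count" using 0:2 with lines title "ipfs swarm peers | wc -l"
EOF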

Here I’m observing two things that I don’t get.

  1. As you can see, the number of peers I have a connection to is vastly higher than the “highWater”. Sometimes well over twice as many.
  2. Every minute the number of connected peers drops sharply! I had seen this happen in IPFS Companion but never had the numbers to back it up.

I’m rather curious to know what exactly is happening here. There is clearly “some” peer eviction going on at almost exactly every minute.

What I would have expected is a line rising rapidly to about the highWater setting and then bouncing around that line for the application’s lifetime. Why bounce? Well, IPFS probably checks whether peers are alive, so every couple of seconds a few peers likely drop off and new ones get added. But I see no such thing in the graph…

What’s going on?

There’s an explanation here:

To be closed, a connection needs to have existed for more than GracePeriod seconds and be idle (unprotected by any subsystem).

I suspect the peer manager kicks in on a one-minute basis, but connections have increased to more than 600 by then. At that point it removes all the connections older than 30 seconds (or whatever grace period is configured).
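(The grace period sits next to the watermarks in the Swarm.ConnMgr config and can be read or changed the same way as the other values; the 10s below is only an example, and the daemon needs a restart afterwards.)

# Read the currently configured grace period (go-ipfs default is "20s", if I recall correctly).
docker exec ipfs_host ipfs config Swarm.ConnMgr.GracePeriod
# Lower it so idle connections become eligible for trimming sooner, e.g.:
docker exec ipfs_host ipfs config Swarm.ConnMgr.GracePeriod 10s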

Peer creep is otherwise normal, I’d say, particularly if you’re actively using the node.

Hi Hector,

Thank you for that explanation!

Still, it looks very weird to set a max peer value that is then seemingly totally ignored.
I mean, what does “highWater” mean if it doesn’t limit the peers?

And why would peers be added if there are idle peers and if the total number of peers is already above the “highWater” limit?

It’s not ignored. When the reaper comes, it only acts because the peer count is over that value.

Peers increase because new connections are needed to perform DHT lookups, to keep connections healthy and fresh, etc. It could probably be less aggressive, but the components increasing the connections do not have visibility into the fact that they are limited, so they do what they need and then the peer manager comes in to clean up. (I think; I’m not super familiar with this.)
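(Easy enough to check against your log: the configured thresholds can be read back from the daemon and compared with the live connection count.)

# Configured connection manager settings...
docker exec ipfs_host ipfs config show | grep -A 5 '"ConnMgr"'
# ...versus the number of peers currently connected.
docker exec ipfs_host ipfs swarm peers | wc -l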

Interesting! The node you saw the numbers from is a node with a blank-slate startup. It hasn’t visited any IPFS content yet and isn’t pinning anything yet. It’s literally a fresh startup out of the box plus that “water” configuration.

I don’t get why peers would increase for “DHT lookups” if there are already enough peers connected. I get your point, though: if there is a DHT lookup for some content that I don’t have and that none of my directly connected peers have, then new connections probably need to be made to reach nodes that do have this data.

Ouch. It might work for IPFS, but that sounds like a bit of a brute-force approach: just connect however it pleases, and the peer manager will come and clean up.

I’m asking this because there are modems out there that don’t like you opening many connections and simply crash. Changing the “water” values then helps a bit, but as my numbers show, it’s not a fixed upper limit. I’m not in this situation now, as my current modem doesn’t seem to care, but I was with an older one.

Yes, I think this is a problem. I’ve verified my suspicion: the connection manager runs on a 1-minute interval:

If the number of connections can grow by 300 in that interval, it is not too helpful in keeping things like memory consumption in check, or in keeping IPFS friendly with shitty routers. I was going to open an issue but there is one already: Trim frequency should never be lower than the grace period. · Issue #51 · libp2p/go-libp2p-connmgr · GitHub

I’ll mention it to the libp2p devs and see if someone can take a closer look.

Thank you for looking into it! :slight_smile:

While that sounds awesome, the proposed solution is something like “graceperiod/2”…
That might give the appearance of it working properly, simply because connections would, in this specific case, be trimmed every 10 seconds or so. It still means a lot of connections are opened and closed every second. With more rapid trimming you don’t prevent just as many connections from being opened; you only prevent them from being kept open.

You cannot have both:

  1. Not letting new connections form because you already have enough
  2. Looking up, finding, and providing content to the network

Number 2 requires opening new connections to new places. It is better to let that happen, but be able to quickly clean up connections that exist but have become unused (as needed).

Hmm, that does sound quite sensible!

Still, 300 connections per minute… It makes me eager to know what exactly it’s doing. Sure, it might be making DHT connections. If that assumption is correct, is that for IPFS to be as ready as it can be, just in case someone asks it to look up a resource?