Number of peers far exceeds lowWater/highWater settings

Hi,

I’m running IPFS on my desktop with lowWater set to 50 and highWater set to 300. I got these numbers from this topic. Do note that this has nothing to do with the Brave browser!
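(For reference, those limits live under Swarm.ConnMgr in the go-ipfs config. Something along these lines should set them; the ipfs_host container name matches the script further down, and the daemon needs a restart to pick the new values up.)

# The watermarks are integers, so --json is needed when setting them.
docker exec ipfs_host ipfs config --json Swarm.ConnMgr.LowWater 50
docker exec ipfs_host ipfs config --json Swarm.ConnMgr.HighWater 300
# Restart so the daemon picks up the new config.
docker restart ipfs_host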

Now I’ve always suspected those numbers to be… not so meaningful, as you can get a load more peers than you set. Or at least, it’s my impression that you set the minimum and maximum number of peers. They’re just very confusingly named, with “water” as a suffix.

Now I wrote a tiny script to get my peers and log them to a file with a timestamp:

#!/usr/bin/env bash
# Log the current UTC timestamp and the number of connected peers.
curr_date=$(date -u --rfc-3339=seconds)
num_peers=$(docker exec ipfs_host ipfs swarm peers | wc -l)
echo "$curr_date;$num_peers" >> ipfs_peer_count

Then, after some minutes I plot the result, which gives this surprising image:

The image contains about 10 minutes of log data.
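(In case anyone wants to reproduce the plot: gnuplot can chart the semicolon-separated log directly. The sketch below plots the count against the sample index and ignores the timestamps in column 1 to keep it simple.)

gnuplot -persist <<'EOF'
set datafile separator ";"
set xlabel "sample"
set ylabel "connected peers"
plot "ipfs_peer_count" using 0:2 with lines title "ipfs swarm peers | wc -l"
EOF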

Here I’m observing two things that I don’t get.

  1. As you can see, the number of peers I have a connection to is vastly higher than the “highWater”. Sometimes well over twice as many.
  2. Every minute the number of connected peers drops sharply! I had seen this happen in IPFS Companion but never had the numbers to back it up.

I’m rather curious to know what exactly is happening here. There is clearly “some” peer eviction going on at almost exactly every minute.

What I would have expected is a line rising rapidly to about the highWater setting and then bouncing around that line for the application’s lifetime. Why bounce? Well, IPFS probably checks whether peers are alive, so every couple of seconds a few peers likely drop off and new ones get added. But I see no such thing in the graph…

What’s going on?

There’s an explanation here:

To be closed, a connection needs to have existed for more than GracePeriod seconds and be idle (unprotected by any subsystem).

I suspect the peer manager kicks in on a one-minute basis, but connections have increased to more than 600 by then. At that point it removes all the connections older than 30 seconds (or whatever grace period is configured).
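(The grace period sits next to the watermarks in the Swarm.ConnMgr config and can be read or changed the same way as the other values; the 10s below is only an example, and the daemon needs a restart afterwards.)

# Read the currently configured grace period (go-ipfs default is "20s", if I recall correctly).
docker exec ipfs_host ipfs config Swarm.ConnMgr.GracePeriod
# Lower it so idle connections become eligible for trimming sooner, e.g.:
docker exec ipfs_host ipfs config Swarm.ConnMgr.GracePeriod 10s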

Peer creep is otherwise normal, I’d say, particularly if you’re actively using the node.

Hi Hector,

Thank you for that explanation!

Still, it looks very weird to set a max peer value that is then seemingly totally ignored.
I mean, what does “highWater” mean if it doesn’t limit the peers?

And why would peers be added if there are idle peers and if the total number of peers is already above the “highWater” limit?

It’s not ignored. When the reaper comes, it only acts because the peer count is over that value.

Peers increase because new connections are needed to perform DHT lookups, to keep connections healthy and fresh, etc. It could probably be less aggressive, but the components increasing the connections do not have visibility into the fact that they are limited, so they do what they need and then the peer manager comes in to clean up. (I think; I’m not super familiar with this.)
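(Easy enough to check against your log: the configured thresholds can be read back from the daemon and compared with the live connection count.)

# Configured connection manager settings...
docker exec ipfs_host ipfs config show | grep -A 5 '"ConnMgr"'
# ...versus the number of peers currently connected.
docker exec ipfs_host ipfs swarm peers | wc -l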

Interesting! The node you saw the numbers from is a node with a blank-slate startup. It hasn’t visited any IPFS content yet and isn’t pinning anything yet. It’s literally a fresh startup out of the box plus that “water” configuration.

I don’t get why peers would increase for “DHT lookups” if there are already enough peers connected. I get your point, though: if there is a DHT lookup for some content that I don’t have and that none of my directly connected peers have, then new connections probably need to be made to reach nodes that do have this data.

Ouch. It might work for IPFS, but that sounds like a bit of a brute-force approach: just connect however it pleases, and the peer manager will come and clean up.

I’m asking this because there are modems out there that don’t like you opening many connections and simply crash. Changing the “water” values then helps a bit, but as my numbers show, it’s not a fixed upper limit. I’m not in this situation now, as my current modem doesn’t seem to care, but I was with an older one.

Yes, I think this is a problem. I’ve verified my suspicion: the connection manager runs on a 1-minute interval:

If the number of connections can grow by 300 in that interval, it is not too helpful in keeping things like memory consumption in check, or in keeping IPFS friendly with shitty routers. I was going to open an issue but there is one already: Trim frequency should never be lower than the grace period. · Issue #51 · libp2p/go-libp2p-connmgr · GitHub

I’ll mention it to the libp2p devs and see if someone can take a closer look.

Thank you for looking into it! :slight_smile:

While that sounds awesome, the proposed solution is something like “graceperiod/2”…
That might give the appearance of it working properly, simply because connections would, in this specific case, be trimmed every 10 seconds or so. It still means a lot of connections are opened and closed every second. With more rapid trimming you don’t prevent just as many connections from being opened; you only prevent them from being kept open.

You cannot have both:

  1. Not letting new connections form because you already have enough
  2. Looking up, finding, and providing content to the network

Number 2 requires opening new connections to new places. It is better to let that happen, but be able to quickly clean up connections that exist but have become unused (as needed).

Hmm, that does sound quite sensible!

Still, 300 connections per minute… It makes me eager to know what exactly it’s doing. Sure, it might be making DHT connections. If that assumption is correct, is that for IPFS to be as ready as it can be, just in case someone asks it to look up a resource?