TL;DR
- Since 4 December, the ProbeLab team, which monitors the IPFS network, has observed two major anomalies in Amino (the public IPFS DHT):
- A significant increase in the number of DHT server nodes on the Amino IPFS network that appear offline to our crawler. Because these nodes are mostly offline, they directly affect the latency observed when publishing content to and fetching content from the network (see next plots). (Source)
- Increased latency in publishing to Amino (Source)
- Increased latency in looking up records in Amino (Source)
- For the raw data from the latest crawl, see the latest report from the Nebula crawler
- This may degrade content and peer routing performance on the network.
- We are investigating the cause. Our initial investigation suggests that it results from the Avail network merging with the Amino IPFS DHT.
- We’re working together with the Avail team to mitigate this.
Background
There is more than one DHT
The Amino DHT, used by IPFS and implemented in libp2p (in its various language implementations), provides content routing (mapping CIDs → PeerIDs) and peer routing (mapping PeerIDs → IP addresses).
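To make the two roles concrete, here is a toy sketch in Python. The dictionaries stand in for records that are really spread across many DHT servers, and every identifier in it (the CID, the PeerIDs, the addresses) is made up for illustration.

```python
# Toy stand-ins for DHT records. In Amino these are distributed across DHT
# servers; all identifiers below are hypothetical.
provider_records = {
    # Content routing: CID -> PeerIDs of peers that provide the content.
    "bafy-example-cid": ["peer-A", "peer-B"],
}
peer_records = {
    # Peer routing: PeerID -> network addresses.
    "peer-A": ["/ip4/192.0.2.10/tcp/4001"],
    "peer-B": ["/ip4/198.51.100.7/udp/4001/quic-v1"],
}


def resolve(cid):
    """Return {peer_id: addresses} for every known provider of cid."""
    providers = provider_records.get(cid, [])
    return {peer: peer_records.get(peer, []) for peer in providers}


result = resolve("bafy-example-cid")
```

Fetching content therefore depends on both lookups: first finding who has the CID, then finding where those peers can be reached.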
Peers in the DHT use protocol IDs to declare which network they are a part of. For example, IPFS nodes advertise the protocol ID /ipfs/kad/1.0.0, alongside other protocols such as /ipfs/bitswap/1.2.0.
The versatility of this DHT implementation has led many other decentralised networks to adopt it. Typically, a separate network that adopts it should use a different protocol ID, e.g. /specialnetwork/kad/1.0.0.
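The protocol ID convention can be sketched in a few lines of Python. This is a simplified illustration, not libp2p's actual negotiation logic: the helper names `kad_protocol_id` and `same_dht` are hypothetical, and real peers negotiate protocols per stream rather than comparing sets.

```python
def kad_protocol_id(prefix):
    """Build a Kademlia protocol ID from a network prefix such as "/ipfs"."""
    return f"{prefix}/kad/1.0.0"


def same_dht(protocols_a, protocols_b, kad_id):
    """Two peers can speak the same DHT protocol only if both advertise kad_id."""
    return kad_id in protocols_a and kad_id in protocols_b


AMINO = kad_protocol_id("/ipfs")

# Hypothetical sets of advertised protocols.
ipfs_node = {"/ipfs/bitswap/1.2.0", "/ipfs/kad/1.0.0"}
separate_node = {kad_protocol_id("/specialnetwork")}  # uses its own prefix
reusing_node = {"/ipfs/kad/1.0.0"}  # another network reusing the IPFS ID

isolated = same_dht(ipfs_node, separate_node, AMINO)
compatible = same_dht(ipfs_node, reusing_node, AMINO)
```

With a distinct prefix the two nodes never treat each other as DHT peers. With a reused ID, nothing at the protocol level tells them apart.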
Merging of DHTs
Based on IPFS network crawls, it appears that the Avail nodes are advertising the same Kademlia protocol ID as IPFS. Identical protocol IDs alone aren’t enough to cause two networks to merge, since nodes still need to discover each other, but the networks do in fact seem to have merged.
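The two conditions for a merge (a matching protocol ID and a discovery path) can be illustrated with a toy graph model. This is a sketch under simplifying assumptions: the peer names and the `knows` lists are hypothetical, and how the two networks actually discovered each other is not established here.

```python
from collections import deque

# Hypothetical peers: the kad protocol ID each advertises and the peers it knows.
peers = {
    "ipfs-1": {"kad": "/ipfs/kad/1.0.0", "knows": ["ipfs-2"]},
    "ipfs-2": {"kad": "/ipfs/kad/1.0.0", "knows": ["ipfs-1"]},
    "other-1": {"kad": "/ipfs/kad/1.0.0", "knows": ["other-2"]},
    "other-2": {"kad": "/ipfs/kad/1.0.0", "knows": ["other-1"]},
}


def reachable(start):
    """Peers reachable from start via links between peers with matching kad IDs."""
    seen, queue = {start}, deque([start])
    while queue:
        current = queue.popleft()
        for nxt in peers[current]["knows"]:
            if nxt not in seen and peers[nxt]["kad"] == peers[current]["kad"]:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Same protocol ID everywhere, but no discovery path between the two groups yet.
before = reachable("ipfs-1")

# A single pair of peers learning about each other is enough to join the groups.
peers["ipfs-1"]["knows"].append("other-1")
peers["other-1"]["knows"].append("ipfs-1")
after = reachable("ipfs-1")
```

In this model `before` contains only the two IPFS peers, while `after` contains all four: once one link exists, routing table exchanges can spread knowledge of the other group across the whole network.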
On the surface, there is nothing wrong with that, assuming that peers from both networks adhere to the protocol’s spec and respond to the RPC messages. However, since most of the newly observed DHT servers appear offline and latency has increased, we suspect that something is malfunctioning.
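A back-of-the-envelope model shows why unresponsive DHT servers can raise latency. It is a deliberately simple sketch, not a description of how libp2p performs lookups: it assumes peers are tried one at a time, that each attempt fails independently, and that every failure costs a full timeout. Real lookups query several peers concurrently, so the model overstates the effect. The function name and all numbers are hypothetical, not measurements.

```python
def expected_step_latency(offline_fraction, rtt_s, timeout_s):
    """Expected time to get one response when peers are tried one at a time.

    Each attempt hits an offline peer with probability offline_fraction and
    then costs timeout_s; the first online peer answers after rtt_s. The
    number of failures before the first success is geometric, with mean
    p / (1 - p).
    """
    if not 0 <= offline_fraction < 1:
        raise ValueError("offline_fraction must be in [0, 1)")
    expected_failures = offline_fraction / (1 - offline_fraction)
    return expected_failures * timeout_s + rtt_s


# Illustrative inputs only: 100 ms round trip, 5 s timeout.
healthy = expected_step_latency(0.0, rtt_s=0.1, timeout_s=5.0)
degraded = expected_step_latency(0.5, rtt_s=0.1, timeout_s=5.0)
```

Under these assumptions, the timeout term dominates as soon as a sizeable share of the peers in routing tables do not respond, even though every online peer is as fast as before.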
What’s next?
We’re investigating the root cause of this and working on a solution along with the Avail Project team, who have been very responsive and are also investigating solutions from their end.
For now, expect higher latencies for DHT operations on the network. For up-to-date information, monitor https://probelab.io, in particular the DHT Lookup performance and the DHT Publish latency.
We will update this post as soon as we have more information.