Offloading DHT to dedicated nodes for scalability

DHT traffic overhead is about 1TB/month. This really doesn’t scale well with the number of nodes running on the same network.

The optimal design would be to have a few nodes running as DHT servers only (no hosted content), with the other local nodes using them for DHT queries and provides.

There are some upcoming changes to the DHT to make it more private. While making this change, I feel it’s necessary to add the capability to announce other nodes’ data / forward announcements.

It would work like this: the original node hosting the content creates a DHT announcement message and signs it. This signed message can be forwarded to DHT servers by a dedicated forwarder node, which aggregates announcements from multiple nodes and announces them in batches to save network traffic. A signed announcement can still be securely processed by the DHT server.
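A minimal sketch of the forwarder side of this idea, in Python. Everything here is hypothetical: the names `SignedAnnouncement` and `Forwarder` are made up for illustration, the signature is treated as opaque bytes that the hosting node produced and the DHT server would verify, and the "closest server" choice is a simplified Kademlia-style XOR distance over SHA-256 digests rather than the real routing logic.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedAnnouncement:
    cid: str          # content identifier being announced
    provider: str     # peer ID of the node that actually hosts the content
    signature: bytes  # produced by the provider; opaque to the forwarder


def xor_distance(a: str, b: str) -> int:
    """Kademlia-style XOR distance between the SHA-256 digests of two strings."""
    da = int.from_bytes(hashlib.sha256(a.encode()).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b.encode()).digest(), "big")
    return da ^ db


class Forwarder:
    """Collects announcements from many nodes and groups them per DHT server."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one DHT server is required")
        self.servers = list(servers)
        self.pending = defaultdict(list)  # server ID -> queued announcements

    def submit(self, ann: SignedAnnouncement) -> None:
        # Queue the announcement for the server closest to its CID.
        target = min(self.servers, key=lambda s: xor_distance(s, ann.cid))
        self.pending[target].append(ann)

    def flush(self) -> dict:
        # Hand back one batch per server instead of one message per announcement.
        batches = dict(self.pending)
        self.pending = defaultdict(list)
        return batches


fwd = Forwarder(["server-a", "server-b"])
for i in range(10):
    fwd.submit(SignedAnnouncement(f"cid-{i}", "peer-1", b"opaque-signature"))
for server, batch in fwd.flush().items():
    print(server, len(batch))
```

The forwarder never needs the provider's private key: it only relays what the hosting node signed, which is why the receiving DHT server could still check each record on its own.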

For 1TB of DHT traffic monthly, it seems you have on the order of ~100M CIDs per node.
This looks like the range where you would want to use IPNI announcements instead of republishing daily.
Or to have some DHT extension that extends your announcement without having to retransmit all of your CIDs.
I know some people are working on what you described. I don’t think we should sign records: it does not protect against anything and uses non-trivial CPU time for both signing and verifying when you have to handle 100M CIDs.

You would probably find the #content-routing-wg channel on Slack interesting.

What exactly has an overhead of 1TB/month? (1) DHT server routing table maintenance, (2) serving get requests, (3) being allocated content with PUT requests, (4) reproviding content to the DHT, (4.1) if yes, how many CIDs?

We recently measured that DHT servers are storing on average 1-2M provider records (republished every 22h). So 1TB of monthly traffic (~100M CIDs per node) seems to be a couple of orders of magnitude larger.
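One rough way to compare these numbers is to ask how many bytes each CID would account for per republish cycle. The Python sketch below assumes, purely for illustration, that all of the 1TB is reprovide traffic and that records are republished every 22h; the function names are hypothetical and the result is only a sanity check, not a measurement.

```python
def republishes_per_month(interval_hours: float = 22.0, days: float = 30.0) -> float:
    """Number of republish cycles in a month for a given republish interval."""
    return days * 24.0 / interval_hours


def implied_bytes_per_cid(monthly_bytes: float, cids: int,
                          interval_hours: float = 22.0) -> float:
    """Bytes per CID per republish cycle, if ALL traffic were reproviding."""
    return monthly_bytes / (cids * republishes_per_month(interval_hours))


TB = 1e12  # 1TB/month as reported in the thread
for cids in (1_000_000, 10_000_000, 100_000_000):
    per_cid = implied_bytes_per_cid(TB, cids)
    print(f"{cids:>11,} CIDs: {per_cid:,.0f} bytes per CID per republish")
```

Under these assumptions, 100M CIDs works out to roughly 300 bytes per CID per republish, and every 10x reduction in the CID count raises the implied per-CID cost by 10x, which is why the breakdown of what the traffic actually consists of matters.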

This is exactly the distinction between DHT server mode and DHT client mode. There are many more clients than servers; DHT servers mostly have public IP addresses and are likely beefier machines. The DHT clients rely on the DHT servers to route their requests.

It’s MUCH less than 100M CIDs per node. The 1TB is measured at the “wire level” - OS monitoring per process.

How bad is my estimate? I would be surprised if you have fewer than 10M CIDs per node.

I forgot about that earlier, but you can try ipfs config Routing.Type autoclient

The DHT client is still too heavy. It’s heavier than most people are willing to tolerate on their machines or network. People are not very willing to run p2p software because they fear the impact on CPU/network.

A DHT client still needs to communicate with 30k other nodes; this eats connection state on firewalls like crazy, and the current “new DHT client” is pretty CPU hungry. Resource limits will make it harder to announce / locate content: while they fix one problem, they make operation less reliable. On the other hand, if you are just providing stored blocks, the overhead is much lower and more people would be willing to run IPFS.

The second problem is resource management inside the IPFS node. If the DHT client eats too many resources, such as connections, there is nothing left for downloading blocks. This started happening when I upgraded from 0.17 to 0.22.

A DHT proxy cache server / client would solve that issue. An IPFS node could run in “DHT proxy client” mode and completely offload dealing with the DHT to some server with more resources. Ideally it could use the DHT to locate a public DHT proxy server. It’s completely reasonable to have one DHT proxy server on the network; it would scale much better.
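To make the proxy cache idea a bit more concrete, here is a small Python sketch. It is not an existing IPFS feature or API: `DhtProxyCache`, `find_providers` and the `lookup` callback are hypothetical names, and the real DHT walk is stubbed out. The point is only that many local nodes asking for the same CID would cost one DHT query instead of many.

```python
import time
from typing import Callable, List


class DhtProxyCache:
    """Answers provider lookups for local nodes, caching results for a TTL."""

    def __init__(self, lookup: Callable[[str], List[str]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._lookup = lookup      # performs the expensive DHT query
        self._ttl = ttl_seconds    # how long a cached answer stays valid
        self._clock = clock        # injectable so the cache can be tested
        self._cache = {}           # cid -> (expiry time, providers)

    def find_providers(self, cid: str) -> List[str]:
        now = self._clock()
        hit = self._cache.get(cid)
        if hit is not None and hit[0] > now:
            return hit[1]          # still fresh: no DHT traffic at all
        providers = self._lookup(cid)
        self._cache[cid] = (now + self._ttl, providers)
        return providers


def stub_dht_lookup(cid: str) -> List[str]:
    # Stand-in for a real DHT walk; always returns the same made-up peer.
    print("DHT query for", cid)
    return ["peer-A"]


proxy = DhtProxyCache(stub_dht_lookup, ttl_seconds=60)
proxy.find_providers("cid-1")  # goes to the (stubbed) DHT
proxy.find_providers("cid-1")  # served from the cache
```

A lightweight “DHT proxy client” would then only need a single connection to the proxy, rather than holding connection state for thousands of DHT peers.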