Does pinning content on multiple servers make accessing that content faster?

I’ve tried (and mostly failed) to understand the white paper on IPFS to answer this basic question:

If I pin the same hash on multiple servers around the world, does accessing that content become faster (e.g. measured by time to first byte)? i.e. does ipfs.io determine which server is closest to the client and choose that one to serve? So, practically speaking, should I ask my friends to pin my hash to their servers?

Thanks!

Currently, we use a protocol called bitswap for content exchange. It works by simply asking all peers we’re connected to for the content we’re looking for and taking the first response. If we don’t receive a response quickly, we go to the DHT and try to connect to peers advertising the content.

So yes, putting the content on multiple peers will speed things up.

Note: If asking everyone for content sounds extremely inefficient, it is. We’re working on improvements to ask fewer peers for content and then widen our search as necessary. You can track the latest work on this in:

(and blame me for holding up this work a bit (sorry))


Do we have some estimates or plans for a better bitswap/DHT?

There are two sides to bitswap performance, throughput and latency.

Throughput

  • Bitswap sessions (by end of 2017, likely less than a month). They significantly reduce wasted bandwidth (freeing up bandwidth for useful data) by asking fewer peers. They’ve already been partially integrated.
  • IPLD selectors/dagswap (designed by the end of 2017). Instead of operating on the block level like bitswap, dagswap will operate on the IPLD level and you’ll be able to write selectors (simple “programs”/queries) to fetch an entire (sub)dag instead of having to ask for every block (like one does in bitswap).
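As a rough illustration of the sessions idea (ask fewer peers first, then widen the search as necessary), here is a small Go sketch. It is a sequential simplification written for this thread, not the actual session implementation; `Session`, `Holdings`, and `Find` are hypothetical names.

```go
package main

import "fmt"

// Holdings is a toy lookup: peer ID -> set of block keys that peer holds.
type Holdings map[string]map[string]bool

// Session remembers which peers answered earlier requests.
type Session struct {
	Preferred []string // peers that have served us before
	Asked     int      // total number of requests sent, for comparison
}

// Find asks the preferred peers first and widens to all connected peers
// only if none of them has the block.
func (s *Session) Find(all []string, have Holdings, key string) (string, bool) {
	asked := map[string]bool{}
	for _, group := range [][]string{s.Preferred, all} {
		for _, id := range group {
			if asked[id] {
				continue // do not ask the same peer twice
			}
			asked[id] = true
			s.Asked++
			if have[id][key] {
				s.remember(id)
				return id, true
			}
		}
	}
	return "", false
}

// remember adds a peer to the preferred list if it is not already there.
func (s *Session) remember(id string) {
	for _, p := range s.Preferred {
		if p == id {
			return
		}
	}
	s.Preferred = append(s.Preferred, id)
}

func main() {
	all := []string{"a", "b", "c", "d"}
	have := Holdings{"d": {"block1": true, "block2": true}}

	s := &Session{}
	s.Find(all, have, "block1") // no history yet: asks a, b, c, d
	s.Find(all, have, "block2") // asks only d, which answered last time
	fmt.Println(s.Asked)        // 5 requests, versus 8 when asking everyone twice
}
```

The saving comes from the second request: blocks of the same file tend to live on the same peers, so a peer that served one block is a good first guess for the next.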

Latency

  • Better prediction (no concrete plans). Bitswap sessions improve throughput and reduce wasted bandwidth but don’t help with finding content. We currently have a lot of nice metadata about where content should be that we don’t use. Ideally, we’d “seed” sessions with a set of peers who are likely to have the content in question.
  • Faster DHT (probably by November). The DHT wasn’t nearly as concurrent as it could have been so we’re improving that.

That is AWESOME! :slight_smile:
Can’t wait to have a faster IPFS with that!

Thanks @stebalien that helps.

Do you have any idea how often a piece of web content is found through the DHT versus bitswap? It seems to me, naively, that bitswap is great for CDNned stuff (jquery, etc.) whereas rare content (a picture of my cat) might always go through the DHT. Is there a benefit (additional to bitswap sharing) to running a server for hosting DHT records then?

By the way, I’m intrigued about contributing where I can! I program Go so I’ll be looking for easy tickets as soon as I have a good understanding of what’s going on :slight_smile:

The DHT and bitswap serve different purposes:

  • The DHT is used to find things (routing). That is, find peers with content, lookup the IP address of peers (map peer IDs to internet addresses), resolve IPNS records.
  • Bitswap is used to actually retrieve content.

We don’t generally put “content” on the DHT as:

  • Using the DHT is expensive. One generally has to establish multiple connections (expensive!) to new peers for every DHT request (as each DHT request will often be served by a different peer). Downloading a single file would likely take multiple requests.
  • DHT nodes don’t get to choose what records they serve. In general, we want nodes to choose what data they serve for legal/moral reasons. If we put content on the DHT, DHT nodes would end up serving arbitrary content instead of just routing information.

So, for example, when looking up a picture of your cat, a node will probably ask the DHT for any peers “providing” the picture of your cat (really, they’ll ask the DHT for any peers providing the root block in the set of blocks that make up the picture of your cat) and then they’ll bitswap with those peers to actually download the picture of your cat.
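The two-step flow in the cat example (ask the DHT who provides the root block, then fetch the block from one of those peers) can be sketched in Go roughly as follows. Plain maps stand in for the DHT and for each peer’s block store; `ProviderIndex`, `BlockStores`, and `Fetch` are hypothetical names, not go-ipfs APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// ProviderIndex is a toy stand-in for DHT provider records: it maps a content
// key to the IDs of peers advertising it. It holds routing data only.
type ProviderIndex map[string][]string

// BlockStores maps a peer ID to the blocks that peer has chosen to store.
type BlockStores map[string]map[string][]byte

// Fetch first asks the index who provides the key (routing), then retrieves
// the block from one of those peers (content exchange).
func Fetch(index ProviderIndex, stores BlockStores, key string) ([]byte, string, error) {
	providers := index[key]
	if len(providers) == 0 {
		return nil, "", errors.New("no providers advertised for " + key)
	}
	for _, id := range providers {
		if data, ok := stores[id][key]; ok {
			return data, id, nil
		}
	}
	return nil, "", errors.New("providers were advertised for " + key + " but none served it")
}

func main() {
	index := ProviderIndex{"QmCatRoot": {"alice", "bob"}}
	stores := BlockStores{
		"bob": {"QmCatRoot": []byte("cat picture root block")},
	}
	data, from, err := Fetch(index, stores, "QmCatRoot")
	fmt.Println(string(data), from, err)
}
```

Note that the index never holds the picture itself, only the names of peers that do. That is the routing/content split described above.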

Is there a benefit (additional to bitswap sharing) to running a server for hosting DHT records then?

Yes. It helps others find content, peers, etc. faster and more reliably.

By the way, I’m intrigued about contributing where I can! I program Go so I’ll be looking for easy tickets as soon as I have a good understanding of what’s going on :slight_smile:

That would be awesome! Take a look at the issues labeled “help wanted” for good places to start. Feel free to hit us up on IRC (#ipfs-dev on freenode) any time if you have any quick questions or want a more synchronous conversation.
