Is using ipfs for large scale content distribution a good idea?

From @jillesvangurp on Wed Jan 04 2017 22:42:24 GMT+0000 (UTC)

One use case that comes to mind for IPFS is hosting large amounts of content for e.g. games, virtual worlds, or flight simulators. This currently requires expensive hosting, lots of bandwidth, and typically some sort of CDN to do cost-effectively. Many smaller content providers struggle with distribution for this reason, and typically this stuff is distributed through BitTorrent in the form of huge archives that take ages to expand after downloading.

To give one example, somebody created X-Plane scenery for Spain based on open data (i.e. distribution would be legal) that includes detailed 3D meshes, high-resolution photo scenery, etc. Awesome stuff; check here for details. The whole of Spain comes in at about 0.5 TB, distributed over gazillions of files of varying sizes. In other words, downloading all of this is rather time-consuming. A typical use of this data would be taking off somewhere in Spain and flying along some narrow path, loading objects on an as-needed basis. You’d be unlikely to access more than a tiny fraction of the data and would only need the high detail at low altitudes. So downloading 0.5 TB is probably a colossal waste of bandwidth for most users.

Question: is this a use case IPFS could handle, or is it even remotely a good idea? Could it scale to these levels? Ideally, this stuff would load just in time, Google Earth style, or perhaps more like FlightGear’s TerraSync. That would require pretty high sustained download throughput and reasonably low latency when starting downloads of what would probably be many files concurrently.
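
To make the "load just in time" idea concrete, here is a rough sketch of what a client might do: ask an IPFS HTTP gateway for only the files it needs, several at a time, using the `/ipfs/<root>/<path>` addressing scheme. Everything specific here is an assumption for illustration: the gateway address (a local daemon on its usual default port), the `ROOTCID` placeholder, the tile file names, and the helper names `tile_url`, `fetch` and `fetch_tiles`. It says nothing about whether real-world throughput or latency would be good enough.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: a local IPFS daemon exposing its HTTP gateway on this address.
GATEWAY = "http://127.0.0.1:8080"


def tile_url(root_cid, relative_path, gateway=GATEWAY):
    """Build a gateway URL for one file inside the directory published under root_cid."""
    # quote() percent-encodes unsafe characters but leaves "/" intact.
    return f"{gateway}/ipfs/{root_cid}/{quote(relative_path)}"


def fetch(url, timeout=30):
    """Download one URL and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_tiles(root_cid, relative_paths, fetcher=fetch, max_workers=8):
    """Fetch several files concurrently; returns {relative_path: bytes}."""
    urls = [tile_url(root_cid, path) for path in relative_paths]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so bodies line up with relative_paths.
        bodies = list(pool.map(fetcher, urls))
    return dict(zip(relative_paths, bodies))
```

A simulator would call something like `fetch_tiles(root_cid, paths_near_aircraft)` as the aircraft moves, so only the tiles along the flight path are ever transferred. The `fetcher` parameter is there so the download step can be swapped out (for caching, retries, or testing without a network).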

Copied from original issue: https://github.com/ipfs/faq/issues/216


From @flyingzumwalt on Wed Jan 18 2017 19:45:25 GMT+0000 (UTC)

Yes that is one of the target uses of IPFS. We absolutely intend for it to work at those scales. @whyrusleeping is eager to make it work at exabyte scale. In the meantime, here’s some of the stuff we’re doing to make IPFS work smoothly at large scales:

From @corysimmons on Fri May 19 2017 01:25:22 GMT+0000 (UTC)

@flyingzumwalt High level, how will IPFS be able to store exabytes? Is this in the paper or a blog post?

From @flyingzumwalt on Fri May 19 2017 01:53:49 GMT+0000 (UTC)

Making IPFS performant at Exabyte scale is one of @whyrusleeping’s favorite things to bring up in Roadmap discussions. He might be able to provide some technical reference points off the top of his head. We haven’t had the opportunity to focus on that kind of scaling yet, but I’m optimistic that we will get to tackle it soon. At the very least, we have a list of known optimizations on this front that I hope to get on the roadmap this year, now that we have the basics of the InterPlanetary Test Lab ready to use.

From @whyrusleeping on Sat May 20 2017 03:29:36 GMT+0000 (UTC)

@corysimmons when we say “store an exabyte”, what we really mean is that the ipfs network will be able to have 1 exabyte of data in it, not that a single machine will have 1 exabyte of storage.

From @flyingzumwalt on Sat May 20 2017 15:43:21 GMT+0000 (UTC)

Ah yes. I mis-remembered. The roadmap item @whyrusleeping proposed was making individual nodes work at Petabyte scale – a worthy goal.

Regarding how the entire global network will scale, I’m not aware of any formal studies or blog posts about it.

From @corysimmons on Sat May 20 2017 22:34:10 GMT+0000 (UTC)

Pretty sure I’ll have to stare at the papers for a few days to figure out what’s going on, but tl;dr: IPFS’ “storage” is just p2p seeding/leeching right?

From @flyingzumwalt on Sun May 21 2017 03:49:48 GMT+0000 (UTC)

@corysimmons getting other nodes to hold your content is an out-of-band concern for the IPFS protocol. There are a number of tools evolving to handle this. For example, you can use ipfs-cluster to form participating networks of nodes who share a pinset. Alternatively, Filecoin will allow you to pay the network to store stuff for you.

This thread on discourse gets into the topic: IPFS for community-led research

Another use case might be managing operating system updates, for example on Linux.
Currently almost all Linux repos rely on multiple centralized servers to host the upgrade packages. Serving them over IPFS could let everybody download these packages at a consistent speed without depending on those centralized servers, which are not only costly for companies to maintain but also unrewarding.

EDIT:
Though it will need a better Linux geek than me to confirm whether something like this makes sense.

In this regard, there are also *nix and FOSS distributions shipped via torrent that could essentially move to IPFS at some point, whether integrated or not. Even Firefox (at least on Windows) has an updater app that fetches the actual software from one of the central servers; this could also be done over IPFS, if Mozilla integrated IPFS into Firefox.