IPFS Propagation

Hello,

I'm trying to understand the way a file propagates over IPFS.

I've spent a lot of time googling and can't seem to get a straight answer on this.

If a file is uploaded to IPFS, does it replicate to some standard? Do a certain number of nodes try to ensure 2x or 3x replication? Or does a file only exist on the machine that 'uploads' it until another machine requests it, and only THEN is it replicated to that machine?

I understand that files are chunked, and that if a chunk with the same hash already exists then it is not duplicated... how does that work?

Hi John,

To answer your first question: files only propagate through IPFS when they are requested by another node. If you're running a node and you add a file to it, no other node will pick up that file by default. If you were to then request that file (by hash) through the ipfs.io gateway, a copy would be cached on the gateway's node for a certain amount of time. If someone then requested it from their own IPFS node, they would receive a copy from the two nodes that already have it. Unless the file is "pinned", nodes will delete it when their stores start getting full, to make space for other files.
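As a quick illustration (assuming a local daemon is running, the `ipfs` CLI is on your PATH, and `example.txt` is just a stand-in file), adding a file only produces a CID and announces it; no copy is made anywhere until someone fetches that CID:

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
	"strings"
)

func main() {
	// Adding does NOT push the file anywhere; it stores it locally
	// and announces to the DHT that this node can provide it.
	out, err := exec.Command("ipfs", "add", "-Q", "example.txt").Output()
	if err != nil {
		log.Fatal(err)
	}
	cid := strings.TrimSpace(string(out))
	fmt.Println("CID:", cid)

	// A second copy only comes into existence when another node (or a
	// gateway such as ipfs.io) requests this CID.
	fmt.Printf("fetch elsewhere with: ipfs cat %s\n", cid)
}
```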

Your second question might vary depending on the implementation, but currently files are stored in 256 KiB chunks. Each of these chunks is referenced by a hash of its contents. Those chunk hashes are then combined into a Merkle DAG whose root hash represents the full file. Because files (and chunks) are referenced by these hashes, and the hashes are deterministic (always the same for the same data), identical content only needs to be stored once.
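Here is a deliberately simplified sketch of that idea in Go. Real IPFS builds a UnixFS Merkle DAG using multihashes rather than a raw SHA-256 over concatenated digests, and `example.txt` is a hypothetical input, but it shows why two files sharing an identical chunk only need that chunk stored once:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"os"
)

const chunkSize = 256 * 1024 // 256 KiB, the default chunk size

func main() {
	data, err := os.ReadFile("example.txt") // stand-in input file
	if err != nil {
		panic(err)
	}

	// Hash each fixed-size chunk. Identical chunks produce identical
	// hashes, so a content-addressed store keeps only one copy.
	var concatenated []byte
	for start := 0; start < len(data); start += chunkSize {
		end := start + chunkSize
		if end > len(data) {
			end = len(data)
		}
		h := sha256.Sum256(data[start:end])
		fmt.Printf("chunk %x\n", h)
		concatenated = append(concatenated, h[:]...)
	}

	// A hash over the list of chunk hashes stands in for the root of
	// the Merkle DAG that real IPFS builds; same data, same root.
	root := sha256.Sum256(concatenated)
	fmt.Printf("root  %x\n", root)
}
```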

Hope that helps,
Matthew

Thanks Matthew!

So just to confirm: if my node downloads a copy of a file, it will delete it after some time?

I thought the idea was to have many 'seeders'?

I understand there is a garbage collection process; I thought files were kept 'forever' by their owning node...

Are there any docs on this?

  • Files are not 'uploaded' anywhere when you first add them; adding merely makes them available for other nodes to request.
  • Pinned files are exempt from the garbage collector.
  • When you add a file to IPFS, it is pinned by default (see the sketch just after this list).
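A minimal sketch of that pin/GC lifecycle, driving the `ipfs` CLI from Go. The CID is a placeholder (substitute one printed by `ipfs add`), and the `run` helper is just for brevity:

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
)

// run executes an ipfs CLI command and returns its combined output.
func run(args ...string) string {
	out, err := exec.Command("ipfs", args...).CombinedOutput()
	if err != nil {
		log.Fatalf("ipfs %v: %v\n%s", args, err, out)
	}
	return string(out)
}

func main() {
	cid := "QmExampleCid" // placeholder: use a CID printed by `ipfs add`

	// Files added with `ipfs add` are pinned recursively by default:
	fmt.Print(run("pin", "ls", cid))

	// Once unpinned, the blocks survive only until the next GC pass:
	fmt.Print(run("pin", "rm", cid))
	fmt.Print(run("repo", "gc")) // removes everything that is not pinned
}
```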

Having the primary replication method be driven by user interest allows the system to scale capacity and demand very closely together.

The primary replication method does nothing to assure data durability, though; that is where something like ipfs-cluster comes in. It orchestrates multiple nodes that you control to ensure 2x, 3x, 4x, etc. replication.
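For example, a sketch of pinning a CID to a cluster with bounded replication. The `--replication-min`/`--replication-max` flag names are from the ipfs-cluster docs as I recall them, so double-check with `ipfs-cluster-ctl pin add --help`, and the CID is again a placeholder:

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
)

func main() {
	cid := "QmExampleCid" // placeholder: a CID produced by `ipfs add`

	// Ask the cluster to keep between 2 and 3 copies of this CID
	// across the nodes it orchestrates.
	out, err := exec.Command("ipfs-cluster-ctl", "pin", "add",
		"--replication-min", "2", "--replication-max", "3", cid).CombinedOutput()
	if err != nil {
		log.Fatalf("%v\n%s", err, out)
	}
	fmt.Print(string(out))
}
```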