How IPFS backs up data

Problem: When uploading a new file, will it be backed up to other nodes, or does it stay only on the uploading node?

I set up a private network for testing: when a new file is uploaded, it is stored as blocks on the current node only and is not distributed to other nodes as backup blocks.
I also analyzed the go-ipfs source code and found no code that distributes data blocks to other nodes for backup.
Is my analysis correct? And is there a detailed description of the process IPFS follows when a file is uploaded?

Welcome wwwgel,

Your analysis is correct. Adding a file to your client only makes it accessible to the network: another client can retrieve it only if it knows the file's checksum (the CID produced when it is added), either because it added the same content itself or because you shared that CID with it.
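For illustration, a minimal sketch on the CLI (the file name is made up and `<cid>` stands for whatever hash `ipfs add` prints for you):

```
# On node A: add a file. Only node A stores the blocks at this point;
# the network merely learns, via the DHT, that node A can provide the CID.
ipfs add report.pdf
# -> added <cid> report.pdf

# On node B: fetching by that CID is what actually copies the blocks over.
ipfs cat <cid> > report.pdf
```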

If you're searching for a way to distribute data reliably within the IPFS network, the cluster implementation (ipfs-cluster) might fit your needs.

I have no idea what this is supposed to mean. You're free to publish any file you want, just as you're free to publish anything you want to the internet if you run a webserver.

The difference is that, with IPFS, the "webserver" (your node) and the person downloading/viewing exchange enough information that the downloader can help spread the file to the network as well, and can even fully replace your webserver. :slight_smile:

May I understand it as a single node working like a webserver?
When I publish a new file to the IPFS network with my node, I suppose my node is the only node which has a copy of that file.
As soon as other users try to reach that file, certain nodes of those users will cache it.
So the more users try to reach that file, the more nodes will cache it?

That’s basically the concept.

In the case of files that are also available elsewhere, like an ISO of a Linux distribution, another user can add the exact same file this way, and the Content IDs would match.

This means you don't have to transfer files from one node to another; you can also add the same file in different locations to get the same effect.
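A minimal sketch, assuming both nodes add the byte-identical file with the same default settings (chunker, CID version), since those influence the resulting hash; the file name is just an example:

```
# On node A
ipfs add ubuntu-20.04-desktop-amd64.iso
# -> added <cid> ubuntu-20.04-desktop-amd64.iso

# On node B: adding the identical file yields the same <cid>,
# so both nodes become providers of the same content without
# ever transferring it between them.
ipfs add ubuntu-20.04-desktop-amd64.iso
# -> added <cid> ubuntu-20.04-desktop-amd64.iso
```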

If you want to 'hold' specific content, the command is `ipfs pin add` on the CLI. This fetches any data it still needs from the network, according to the Content ID, and prevents the data from being deleted by the garbage collector.
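For example:

```
# Fetch (if needed) and pin the content behind a CID so the
# garbage collector will never drop it.
ipfs pin add <cid>
# -> pinned <cid> recursively

# List what is currently pinned.
ipfs pin ls --type=recursive
```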

Additionally, all data you add in the Files tab of the GUI will be spared from being garbage collected when you run low on disk space.
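The rough CLI counterpart of the Files tab is the Mutable File System (`ipfs files`); content referenced from it is likewise protected from garbage collection:

```
# Reference already-added content from MFS so the GC keeps it around.
ipfs files cp /ipfs/<cid> /my-backup.iso
ipfs files ls -l /
```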

The default storage limit is set to 10 GB; once about 9 GB (90% of the limit) is filled, the garbage collector will try to make room by dropping content you have not pinned.
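These values come from the node's config and can be adjusted:

```
# Default storage limit and GC watermark (90% of the limit, i.e. ~9 GB).
ipfs config Datastore.StorageMax
# -> 10GB
ipfs config Datastore.StorageGCWatermark
# -> 90

# Raise the limit, e.g. for a node that should hold more data.
ipfs config Datastore.StorageMax 100GB

# Trigger garbage collection manually; only unpinned blocks are removed.
ipfs repo gc
```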

So if someone accesses your file, it ends up in their cache, but it might not be pinned, or might be only partly cached. If that user is running low on disk space, your data might be dropped from their cache again, since content that was merely accessed is not pinned.

Running a cluster, on the other hand, guarantees that a given number of copies is held in the network, allowing for redundancy and parallel downloads of the data.
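As a sketch of how that looks with ipfs-cluster (command and flag names as I remember them from ipfs-cluster-ctl, so double-check against its help output):

```
# Pin through the cluster; it keeps the requested number of replicas
# alive across the cluster peers.
ipfs-cluster-ctl pin add <cid> --replication-min 2 --replication-max 3

# See which peers currently hold the pin.
ipfs-cluster-ctl status <cid>
```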

I understand, thank you very much

You’re welcome!

Explain a bit what you're trying to achieve with IPFS, and we might figure out if there's an application built upon IPFS which can provide the functionality. :slight_smile:

Best regards

Ruben

I want to use IPFS to make a CDN system

Cloudflare already offers a gateway to IPFS; you can use it if you're talking about serving content from IPFS to the regular web in a CDN-like way.
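For instance, any HTTP client can pull content through such a gateway (the Cloudflare gateway hostname below is the one I recall; verify it before relying on it):

```
# Fetch IPFS content over plain HTTPS via Cloudflare's gateway,
# which also puts their CDN/caching layer in front of it.
curl https://cloudflare-ipfs.com/ipfs/<cid> -o file.bin
```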

If you're just looking for a pinning service, there are Pinata and Infura, for example.

Excuse me, I have a new question.
When I publish a new file to the IPFS network with my node, only my node owns a copy of that file. How do other nodes find this file on my node?
I read the IPFS white paper. IPFS uses a Kademlia DHT, so according to my understanding a block should be stored on the node corresponding to its hash value, so that other nodes can quickly find the node that stores the data.
But from my analysis so far, the blocks are not actively backed up to those corresponding nodes. This seems incompatible with the Kademlia design.
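For reference, what the DHT stores near a content hash are provider records (pointers to peers that have the data), not the blocks themselves, and this can be inspected from the CLI; `<cid>` is again a placeholder:

```
# Ask the DHT which peers provide a given CID. These provider records,
# not the data blocks, are what lives "close to the hash" in the DHT.
ipfs dht findprovs <cid>

# A node (re-)announces itself as a provider for a CID it already holds.
ipfs dht provide <cid>
```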