Hi, it's a nice concept that will revolutionize the internet, but a doubt came to my mind while reading the key points of “Here’s how IPFS works”: IPFS removes duplicates of data, stores versioned data, and lets a network node store only the content it is interested in. So if a network node is not accessible, what happens when somebody tries to access content hosted by that node?
If the node hosting certain content is the only one and it turns off its daemon, no one can access that content via the IPFS network. However, in IPFS multiple nodes can host the same data block. The DHT (Distributed Hash Table) knows which nodes host (or have) a certain block, so when someone requests data by its content address (IPFS address), they can download it as long as at least one hosting node exists.
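A rough sketch of that provider lookup, with a plain dict standing in for the DHT's provider records and SHA-256 standing in for real CIDs (all names here are hypothetical, not the actual libp2p DHT API):

```python
import hashlib

# Toy stand-in for the DHT's provider records: content address -> set of
# nodes that announced they host the block. (Hypothetical structure.)
providers = {}

def content_address(data: bytes) -> str:
    # IPFS derives the address from the content itself; plain SHA-256 here.
    return hashlib.sha256(data).hexdigest()

def announce(node: str, data: bytes) -> str:
    cid = content_address(data)
    providers.setdefault(cid, set()).add(node)
    return cid

def fetch(cid: str, offline: set) -> str:
    # Any online provider can serve the block; the fetch fails only when
    # every node that announced the block is unreachable.
    online = providers.get(cid, set()) - offline
    if not online:
        raise LookupError("no reachable provider for " + cid)
    return next(iter(online))

cid = announce("nodeA", b"hello ipfs")
announce("nodeB", b"hello ipfs")
# Even with nodeA offline, the block is still reachable via nodeB.
print(fetch(cid, offline={"nodeA"}))  # prints "nodeB"
```

The key point the sketch shows: availability depends on *some* provider being online, not on any particular one.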
Thanks for the info. But if we allow multiple nodes to host the same data block, don't we move away from our basic principle of removing duplicated data (I personally love this concept very much)? Can you guide me on how this is planned, or if it's already done, can you explain how it works?
Deduplication doesn’t refer to storing data on only one node; it refers to storing duplicate data only once per node. Data can be stored on as many nodes as want it.
Yes, as @leerspace said, deduplication happens when you load the same data into your local IPFS storage. Storage is organized as a Merkle DAG, which makes it possible to check whether a certain data block already exists in your local IPFS storage. But many nodes can still have the same data block locally.
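A minimal sketch of that per-node behavior: a toy content-addressed block store (hypothetical, not the real IPFS repo layout) where blocks are keyed by the hash of their bytes, so adding identical data twice stores only one copy:

```python
import hashlib

class BlockStore:
    """Toy content-addressed block store for a single node."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so a duplicate block is
        # detected simply by its key already being present.
        cid = hashlib.sha256(data).hexdigest()
        if cid not in self.blocks:
            self.blocks[cid] = data
        return cid

store = BlockStore()
a = store.put(b"same block")
b = store.put(b"same block")   # second add is a no-op
print(a == b, len(store.blocks))  # prints "True 1": one stored copy
```

This is the per-node deduplication: the node never stores the same bytes twice, regardless of how many other nodes also hold that block.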
Also, in my opinion, from the perspective of the requesting node, it requests data through the global IPFS network and doesn't need to know who has the data (it just knows the content-based address). In other words, the requesting node behaves as if there were only one host of the data, because it can access it only through the content-based address. I think this is a kind of deduplication too: there is no need to know multiple addresses for the same content.