I'm trying to understand how IPFS verifies the data it receives, to be sure a node did not send incorrect content.
From my understanding, a node receives the data along with the parameters needed to rebuild the MerkleDAG of the file, and checks whether the resulting CID is in the want-list.
Is this correct? If so, where can I learn more about this process?
In IPFS, things are requested by CID. A CID corresponds to a single, unique block. That block can be interpreted and may contain the CIDs of other blocks (its children), which is what allows traversal of MerkleDAGs.
Every time a block is downloaded, IPFS hashes it according to the parameters of the CID/Multihash it was requested with, and verifies that the result matches. I suspect this happens in the Bitswap code.
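To make the per-block check concrete, here is a minimal sketch in Python. It is not the actual Bitswap code. It assumes the multihash layout (function code, digest length, digest), handles only sha2-256 (0x12) and sha2-512 (0x13), and treats the code and length as single bytes, which holds for these two functions. The names make_multihash and verify_block are made up for illustration.

```python
import hashlib

# Multihash function codes mapped to hashlib names (only two, for illustration).
HASHERS = {0x12: "sha256", 0x13: "sha512"}

def make_multihash(data: bytes, code: int = 0x12) -> bytes:
    """Build <code><length><digest> for a block (single-byte varints assumed)."""
    digest = hashlib.new(HASHERS[code], data).digest()
    return bytes([code, len(digest)]) + digest

def verify_block(data: bytes, multihash: bytes) -> bool:
    """Re-hash the block with the function named in the multihash and compare."""
    if len(multihash) < 2:
        return False
    code, length, expected = multihash[0], multihash[1], multihash[2:]
    if code not in HASHERS or len(expected) != length:
        return False
    actual = hashlib.new(HASHERS[code], data).digest()
    return actual == expected

requested = make_multihash(b"hello ipfs")
print(verify_block(b"hello ipfs", requested))   # block matches what was requested
print(verify_block(b"something else", requested))  # block from a misbehaving peer
```

The point is that the hash function and digest length come from the identifier that was requested, so the receiver needs nothing from the sender except the block bytes.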
IPFS, a hypermedia protocol created to increase the web’s resilience by addressing data by its content rather than by its location, is the glue that keeps everything together. Instead of using URLs for this purpose, IPFS employs CIDs, which point to the server where the data is stored.
This is incorrect. CIDs don’t point to a server. CIDs are generated from the data they represent. This means that the data could be on any server. Once you retrieve that data from any server, you can verify it by hashing it and checking that the resulting CID is the same.
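A small Python sketch of why the server doesn't matter. This is a toy: the "CID" here is just a hex SHA-256 digest rather than a real CID, and the providers are plain functions standing in for peers. The names content_id and fetch_verified are hypothetical.

```python
import hashlib

def content_id(data: bytes) -> str:
    # Stand-in for a CID: the hex SHA-256 digest of the content.
    return hashlib.sha256(data).hexdigest()

def fetch_verified(cid: str, providers):
    """Ask each provider in turn and accept the first response whose hash matches."""
    for name, fetch in providers:
        data = fetch(cid)
        if data is not None and content_id(data) == cid:
            return name, data
    raise LookupError("no provider returned content matching " + cid)

original = b"the same bytes, wherever they are stored"
cid = content_id(original)

providers = [
    ("lying-peer", lambda c: b"not what you asked for"),
    ("empty-peer", lambda c: None),
    ("honest-peer", lambda c: original),
]
name, data = fetch_verified(cid, providers)
print(name, data == original)
```

Nothing in the identifier says where the data lives. Any peer can serve it, and a peer that serves the wrong bytes is simply rejected.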
It depends on how you retrieve that data:
If it’s over Bitswap, you receive individual blocks and can verify each one against its CID as it arrives. Because every parent block contains the CIDs of its children, verifying block by block from the root CID down verifies the whole MerkleDAG incrementally.
If it’s using an IPFS gateway, this can be done in either a verified or a trusted manner. See the HTTP Gateway page in the IPFS Docs for more information.
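For the block-by-block case, here is a rough Python sketch of walking a MerkleDAG from a root identifier and checking every block before using it. It is a toy format, not UnixFS or dag-pb: leaves are raw chunks, the root is a JSON object listing child identifiers, and identifiers are hex SHA-256 digests instead of real CIDs. The dict named store plays the role of an untrusted peer, and build_dag and fetch_file are made-up names.

```python
import hashlib
import json

def block_id(block: bytes) -> str:
    # Stand-in for a CID: the hex SHA-256 digest of the block bytes.
    return hashlib.sha256(block).hexdigest()

def build_dag(data: bytes, chunk_size: int = 4):
    """Split data into leaf blocks plus one root block that links to them."""
    store, links = {}, []
    for i in range(0, len(data), chunk_size):
        leaf = data[i:i + chunk_size]
        store[block_id(leaf)] = leaf
        links.append(block_id(leaf))
    root = json.dumps({"links": links}).encode()
    store[block_id(root)] = root
    return block_id(root), store

def fetch_file(root_id: str, store: dict) -> bytes:
    """Fetch blocks from an untrusted store, verifying each before it is used."""
    def get_verified(bid: str) -> bytes:
        block = store[bid]
        if block_id(block) != bid:
            raise ValueError("block " + bid[:8] + " failed verification")
        return block

    # The root is checked first; only then are its links trusted and followed.
    root = json.loads(get_verified(root_id))
    return b"".join(get_verified(link) for link in root["links"])

root_id, store = build_dag(b"hello merkle dag")
print(fetch_file(root_id, store))
```

Since the root block is verified before its links are read, a tampered child is caught as soon as it is fetched, without waiting for the rest of the file.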