About ipfs add and MFS

Hi,
I’m confused about how MFS stores files.
After adding a file with ipfs add <file>, I wrote an identical file to MFS. The CID of the file and the CIDs of its leaf nodes (the data chunks) in MFS are different from the file CID and chunk CIDs produced by ipfs add. The file is the same, the chunk size is the same, and the encoding seems to be CIDv0 in both cases. What accounts for the difference between the two sets of CIDs?

I think MFS packages the file with more information than just the raw file content, including UnixFS-related metadata such as attributes, a timestamp, or the name. I’m not sure, but it’s something like that.

Can you add some more details on exactly what you did? If I’m not mistaken, you can’t really add files directly to MFS; rather, you add them to IPFS and then copy them into MFS. Depending on its size, a file can be broken into multiple chunks, so it’s not surprising that the leaf-node chunk CIDs aren’t the same as the file CID.
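To make the chunking point concrete, here is a rough Python sketch (not go-ipfs code). It splits some stand-in content into fixed-size chunks of 262144 bytes, which is the default chunk size of ipfs add, and hashes each chunk separately. It is a simplification: plain SHA-256 digests stand in for CIDs, and a real root CID is the hash of a dag-pb node that links to the chunks, not of the file bytes. The name split_chunks and the sample data are made up for illustration.

```python
import hashlib

CHUNK_SIZE = 262144  # 256 KiB, the default fixed-size chunker of `ipfs add`


def split_chunks(data, size=CHUNK_SIZE):
    """Split data into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


# 640,000 bytes of non-repeating stand-in content (hypothetical file body).
data = b"".join(i.to_bytes(4, "big") for i in range(160000))

chunks = split_chunks(data)
whole_digest = hashlib.sha256(data).hexdigest()
chunk_digests = [hashlib.sha256(c).hexdigest() for c in chunks]

print("chunks:", len(chunks))
print("whole file digest:", whole_digest)
for i, digest in enumerate(chunk_digests):
    print("chunk", i, "digest:", digest)
```

Every chunk gets its own hash, and none of them equals the hash of the whole content, which is why a leaf CID never matches the file CID once a file spans more than one chunk.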

For example, take a file cat.jpg. First add it to IPFS with ipfs add cat.jpg, which returns the CID of the file;
ipfs ls <CID> then lists the CIDs of its sub-blocks.

Similarly, you can write it to MFS with ipfs files write /cat.jpg cat.jpg --create, use ipfs files ls -l / to get its CID, and run ipfs ls <CID> on that CID to list its sub-blocks.

As you can see, the file CID and the leaf-node CIDs differ between the two listings. That is probably because the leaf nodes are encoded differently, even though both listings show CIDv0, which means both use the dag-pb codec.

It looks like it’s because the data stored in MFS isn’t being written as unixv2 but as IPLD. Interesting to know. I’m guessing that if you add the file and then copy it into MFS, it will just add a link. But if you add the file, you get a Unixv2 node, and if you then write it with 'ipfs files write …', you get a second IPLD node with a data section.

In the go-ipfs repo, I found that the UnixFS node in MFS uses the Trickle layout, which makes the leaf node type raw (UnixFS Raw), whereas ipfs add uses the Balanced layout, where the leaf node type is file (UnixFS File). So even if the final DAG shape is the same, the CIDs are not.
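To see why the leaf type alone changes the CID, here is a minimal Python sketch (again, not go-ipfs code). It hand-encodes the same chunk as a dag-pb leaf block twice, once with UnixFS type File (2) and once with type Raw (0), and derives a CIDv0 from each. Assumptions: both leaves carry the chunk in the UnixFS Data field plus a filesize field, and nothing else; the helper names varint, base58, unixfs_leaf, and cid_v0 are made up for this example.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

UNIXFS_RAW = 0   # UnixFS DataType enum value for Raw
UNIXFS_FILE = 2  # UnixFS DataType enum value for File


def varint(n):
    """Encode a non-negative integer as a protobuf varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def base58(data):
    """Base58 (Bitcoin alphabet) encoding, as used by CIDv0 strings."""
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    pad = len(data) - len(data.lstrip(b"\x00"))  # leading zero bytes map to '1'
    return "1" * pad + encoded


def unixfs_leaf(chunk, unixfs_type):
    """Build a link-less dag-pb block wrapping a UnixFS message (assumed layout)."""
    unixfs = (
        b"\x08" + varint(unixfs_type)           # field 1: Type
        + b"\x12" + varint(len(chunk)) + chunk  # field 2: Data
        + b"\x18" + varint(len(chunk))          # field 3: filesize
    )
    return b"\x0a" + varint(len(unixfs)) + unixfs  # PBNode field 1: Data


def cid_v0(block):
    """CIDv0 is the base58 form of a sha2-256 multihash of the block."""
    return base58(b"\x12\x20" + hashlib.sha256(block).digest())


chunk = b"hello world\n"
file_cid = cid_v0(unixfs_leaf(chunk, UNIXFS_FILE))
raw_cid = cid_v0(unixfs_leaf(chunk, UNIXFS_RAW))
print("File-type leaf:", file_cid)
print("Raw-type leaf: ", raw_cid)
```

The two blocks differ by a single byte (the type field), yet the hashes, and therefore the CIDs, are unrelated. Since parent nodes link to their children by CID, the difference propagates all the way up to the file's root CID.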