You can add a hash as a sub-file in a directory like this with object patch.
If you're adding "file_added.txt" to root_folder/folder1/, where $DIR_HASH is root_folder's current hash, and $FILE_ADDED_HASH is the hash of "file_added.txt"…
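For anyone scripting this, here is a rough Python sketch of that patch call. It assumes the `ipfs` binary is on your PATH and that your version uses the `ipfs object patch add-link <root> <name> <ref>` argument order (older releases ordered the arguments differently). The helper names `add_link_command` and `patch_file_into_dir` are made up for illustration.

```python
import subprocess


def add_link_command(root_hash, link_path, child_hash, create_parents=False):
    """Build the argv for `ipfs object patch add-link <root> <name> <ref>`."""
    cmd = ["ipfs", "object", "patch", "add-link"]
    if create_parents:
        # --create asks the CLI to create missing intermediate directories.
        cmd.append("--create")
    cmd += [root_hash, link_path, child_hash]
    return cmd


def patch_file_into_dir(root_hash, link_path, child_hash):
    """Run the patch and return the NEW root hash printed by the CLI.

    Assumption: the command prints the new root hash on stdout.
    """
    result = subprocess.run(
        add_link_command(root_hash, link_path, child_hash),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With the names from the post, the call would look like `patch_file_into_dir(DIR_HASH, "folder1/file_added.txt", FILE_ADDED_HASH)`, and the return value is the hash of the updated root_folder.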
Assuming you're correct @jaidedau, I'm pretty confused by this. Are you saying this will give the same hash (root_folder hash) as if file_added.txt had always been there from the beginning? Or does it just give some arbitrary different hash that happens to point to all the same files?
Also it seems really bizarre that "object patch" is the way you add a file to a folder. Seems like there should just be some option on the "add" operation itself where you'd specify you want to add to some existing folder. Pretty much every time I "learn more" about how IPFS handles folders I get even more confused than I was the day before. lol.
So, if I understand correctly, there is no need to update all of the ancestor directories. Only the root directory.
Until now, I thought that EVERY directory contains data (filenames, their hashes, sizes etc.) but if I understand correctly, only the root directory stores that information for the complete hierarchy.
Am I correct?
If not, why must I update only the root directory and not all ancestors?
Good question Nick, I would've assumed if there was some merkle tree at the core of all this, there would be a merkle node somewhere in that tree representing each folder in a hierarchy of folders.
I guess maybe when you add a folder to IPFS it's just taking all the path names relative to that folder, and gathering them recursively deep, but then just taking the results of that recursion and treating that as a simple flat "list" of files. So maybe regardless of how deep the directory structure is, it's just always treated as if it was a bucket of files without really any inherent "structure" except for what you can glean from analyzing the slashes in the file names of the contained files.
The only "downside" to this, if there is one, is that after adding a large directory structure, you don't end up being able to reference any of the sub-folders by their own CID. I might be wrong.
EDIT: This post was proven to be slightly incorrect because each subfolder does have its own CID determined by what's under it, and I clarified it better in my next post (below).
Folders in IPFS are just merkledag objects. The reason you only need to give the hash of the root directory is that object patch automatically updates the necessary subdirectories. Each directory holds only the information on its direct children. In IPFS, a directory hierarchy is a tree.
The hash of b/ changed from QmeWH5Rkv69fUBuazFM23iNXvi2da1FnQCR6arxCg2n6QM to Qmd6qmfeLAtc5ebemL8hxfLo84HqU1qtEytVi9i2JwNB4x, and the hash of a/, the root, changed too.
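To make the "each directory only knows its direct children" point concrete, here is a toy Python model. It is not how IPFS actually computes CIDs (real nodes are encoded as UnixFS/dag-pb and hashed with a multihash); truncated SHA-256 over a JSON list of links is just an assumed stand-in, and the sibling folder c/ is added purely for illustration.

```python
import hashlib
import json


def file_hash(content: bytes) -> str:
    """Toy stand-in for a file CID."""
    return hashlib.sha256(content).hexdigest()[:16]


def dir_hash(tree: dict) -> str:
    """Hash a directory from the names and hashes of its DIRECT children only.

    `tree` maps a name to bytes (a file) or to a nested dict (a subdirectory).
    """
    links = {
        name: dir_hash(child) if isinstance(child, dict) else file_hash(child)
        for name, child in tree.items()
    }
    encoded = json.dumps(links, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()[:16]


# a/ is the root; it contains b/ and a sibling c/.
a = {"b": {"old.txt": b"old"}, "c": {"other.txt": b"other"}}
before = {"a": dir_hash(a), "b": dir_hash(a["b"]), "c": dir_hash(a["c"])}

# Add a file under b/ and recompute.
a["b"]["file_added.txt"] = b"new content"
after = {"a": dir_hash(a), "b": dir_hash(a["b"]), "c": dir_hash(a["c"])}

for name in ("a", "b", "c"):
    status = "changed" if before[name] != after[name] else "unchanged"
    print(f"{name}/: {before[name]} -> {after[name]} ({status})")
```

Only the directories on the path from the new file up to the root get new hashes; c/ keeps its old one, which is why unchanged subtrees can be shared between the old and new root.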
So adding a file to the folder in the above example was a two-step process:
Add the file itself, to get its own CID
Use the CID to "patch" the file into the "folder structure" ROOT as a link, because directory structures are apparently stored as links (trees of links, really)
In other words, adding the file to IPFS vs. putting that file in some specific folder are two completely separate, independent operations?
One final obvious question arises:
If we did the patch directly onto "/b" instead of "/a" (in the example @jaidedau gave), would that also mean that "/a" would AUTOMATICALLY get updated too (w/ a new hash) representing the folder that contains the new file? I'm guessing "no" is the answer.
Probably the patch is only able to update things recursively underneath that level of the folder, right? So what I said earlier is "partially" true, meaning once you upload a folder structure, it's kind of its own "bucket" (at the root), and when you add files anywhere under a data structure you still always need to patch the ROOT (bucket), because the patch will only affect things "underneath" (recursively) itself rather than at higher levels up. I'm saying all that as a question… to ask if I've got it right yet?
A patch can only affect those under it, and if you patch "b/" instead of "/a", "/a" isn't automatically updated. A hash can only refer to one object, so the old hash of "/a" can never have different subdirectories, or be updated. object patch has gotta give you a new one.
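A small Python sketch of why that is, using a made-up in-memory content-addressed store. The names `store`, `put` and `add_link` are hypothetical and the hashing is a toy, not real CID computation; the point is only that a patch builds new nodes and never touches existing ones.

```python
import hashlib
import json

# hash -> directory node, where a node maps child names to child hashes
store = {}


def put(node: dict) -> str:
    """Store a node under the hash of its content and return that hash."""
    encoded = json.dumps(node, sort_keys=True).encode()
    node_hash = hashlib.sha256(encoded).hexdigest()[:16]
    store[node_hash] = node
    return node_hash


def add_link(root_hash: str, path: str, child_hash: str) -> str:
    """Return the hash of a NEW node: the old root plus one link at `path`."""
    name, _, rest = path.partition("/")
    node = dict(store[root_hash])  # copy, stored nodes are never mutated
    if rest:
        # Patch the subdirectory first, then point this copy at the result.
        node[name] = add_link(node[name], rest, child_hash)
    else:
        node[name] = child_hash
    return put(node)


b = put({"old.txt": "hash-of-old"})
a = put({"b": b})

# Patching b/ directly yields a new b, but a/ still links to the old one.
new_b = add_link(b, "file_added.txt", "hash-of-new")
print("a/ still points at old b/:", store[a]["b"] == b)

# Patching through the root yields a new root that links to the new b/.
new_a = add_link(a, "b/file_added.txt", "hash-of-new")
print("new a/ points at new b/:", store[new_a]["b"] == new_b)
```

The old hashes `a` and `b` remain valid and still describe the old tree; anything that wants the updated tree has to be handed `new_a`.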
You can use MFS if you need the root folder's hash to auto-update after modifications. It is also easier to work with when adding, removing or moving files around. ipfs files stat <path> will give the current root hash of the DAG at that path.
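A minimal Python sketch of the MFS route, again assuming the `ipfs` binary is on your PATH. It only wraps two CLI calls, `ipfs files cp` and `ipfs files stat --hash`; the helper names and the example MFS path are hypothetical.

```python
import subprocess


def mfs_cp_command(cid: str, mfs_path: str) -> list:
    """argv that links already-added content into MFS at `mfs_path`."""
    return ["ipfs", "files", "cp", f"/ipfs/{cid}", mfs_path]


def mfs_hash_command(mfs_path: str = "/") -> list:
    """argv that prints only the current hash of an MFS path."""
    return ["ipfs", "files", "stat", "--hash", mfs_path]


def run(cmd: list) -> str:
    """Run an ipfs command and return its trimmed stdout."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Usage would be along the lines of `run(mfs_cp_command(FILE_ADDED_HASH, "/root_folder/folder1/file_added.txt"))` followed by `run(mfs_hash_command("/root_folder"))`, which should then report a different hash than before the copy.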
I have a filestore directory tree added with "add --nocopy".
Does "auto updated" mean that when I delete, add or modify a file or directory in the filestore, all of these ipfs links will be updated automatically? (OK, not all, only the ones that need it.)
The documentation states that I must inform ipfs of such changes myself, so I take it that jaidedau's solution above is the way to do it.
I am thinking of writing a Python script which finds the differences from the previous scan and patches them in.
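A possible starting point for such a script, sketched in Python. It only covers the "find the differences" half and assumes that a change in file size or modification time is a good enough signal; `scan` and `diff` are hypothetical helper names, and how the previous snapshot is persisted between runs is left open.

```python
import os


def scan(root: str) -> dict:
    """Map each file path (relative to root, '/'-separated) to (size, mtime_ns)."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            snapshot[rel] = (st.st_size, st.st_mtime_ns)
    return snapshot


def diff(old: dict, new: dict):
    """Return sorted lists of (added, removed, modified) relative paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, modified
```

Each added or modified path would then need to be re-added and linked into the root (for example with `ipfs object patch add-link`), and each removed path unlinked (for example with `ipfs object patch rm-link`), carrying the new root hash forward from one patch to the next.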
If you're using the "object patch" way of updating a folder (non-MFS folder), you must always patch the top-level root itself, and you'll therefore get an entirely new root CID as a result. You can't just patch some sub-folder and expect a new root CID to automatically come into existence.
But MFS is different, and more like a real file system, so that parent folders ARE always automatically aware of things happening in child folders.