You can add a hash as a sub-file in a directory like this with object patch.
If you’re adding “file_added.txt” to root_folder/folder1/, where $DIR_HASH is root_folder’s current hash, and $FILE_ADDED_HASH is the hash of “file_added.txt”…
Assuming you’re correct, @jaidedau, I’m pretty confused by this. Are you saying this will give the same hash (root_folder hash) as if file_added.txt had always been there from the beginning? Or does it just give some arbitrary different hash that happens to point to all the same files?
Also it seems really bizarre that “object patch” is the way you add a file to a folder. Seems like there should just be some option on the “add” operation itself where you’d specify you want to add to some existing folder. Pretty much every time I “learn more” about how IPFS handles folders I get even more confused than I was the day before. lol.
So, if I understand correctly, there is no need to update all of the ancestor directories. Only the root directory.
Until now, I thought that EVERY directory contains data (filenames, their hashes, sizes etc.), but if I understand correctly, only the root directory stores that information for the complete hierarchy.
Am I correct?
If not, why must I update only the root directory and not all ancestors?
Good question Nick, I would’ve assumed if there was some merkle tree at the core of all this, there would be a merkle node somewhere in that tree representing each folder in a hierarchy of folders.
I guess maybe when you add a folder to IPFS it’s just taking all the path names relative to that folder, and gathering them recursively deep, but then just taking the results of that recursion and treating that as a simple flat “list” of files. So maybe regardless of how deep the directory structure is, it’s just always treated as if it was a bucket of files without really any inherent ‘structure’ except for what you can glean from analyzing the slashes in the file names of the contained files.
The only ‘downside’ to this, if there is one, is that after adding a large directory structure, you don’t end up being able to reference any of the sub-folders by their own CID. I might be wrong.
EDIT: This post was proven to be slightly incorrect because each subfolder does have its own CID determined by what’s under it, and I clarified it better in my next post (below).
Folders in IPFS are just merkledag objects. The reason you only need to give the hash of the root directory is that object patch automatically creates updated versions of the necessary subdirectories along the path. Each directory holds information only on its direct children. In IPFS, a directory hierarchy is a tree.
The hash of b/ changed from QmeWH5Rkv69fUBuazFM23iNXvi2da1FnQCR6arxCg2n6QM to Qmd6qmfeLAtc5ebemL8hxfLo84HqU1qtEytVi9i2JwNB4x, and the hash of a/, the root, changed too.
So adding a file to the folder in the above example was a two-step process:
1. Add the file itself, to get its own CID
2. Use the CID to ‘patch’ the file into the ‘folder structure’ ROOT as a link, because directory structures are apparently stored as links (trees of links really)
In other words, adding the file to IPFS vs. putting that file in some specific folder are two completely separate, independent operations?
One final obvious question arises:
If we did the patch directly onto ‘/b’ instead of ‘/a’ (in the example @jaidedau gave), would that also mean that ‘/a’ would AUTOMATICALLY get updated too (w/ a new hash) representing the folder that contains the new file? I’m guessing “no” is the answer.
Probably the patch is only able to update things recursively underneath that level of the folder, right? So what I said earlier is “partially” true, meaning once you upload a folder structure, it’s kind of its own ‘bucket’ (at the root), and when you add files anywhere under a data structure you still always need to patch the ROOT (bucket), because the patch will only affect things ‘underneath’ (recursively) itself rather than at higher levels up. I’m saying all that as a question. …to ask if I’ve got it right yet?
A patch can only affect the objects under it, and if you patch ‘b/’ instead of ‘/a’, ‘/a’ isn’t automatically updated. A hash can only refer to one object, so the old hash of ‘/a’ can never have different subdirectories or be updated; object patch has to give you a new one.
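To make the point about hashes concrete, here is a minimal toy model in Python. It is only a sketch and does not use IPFS or its real object format: `STORE`, `put_dir`, `patch_add_link` and `resolve` are hypothetical names, and the "hash" is just a truncated SHA-256 of a JSON dump. It shows that each directory records only its direct children, that patching from the root produces a new hash for every directory on the path, and that patching ‘b/’ alone leaves the old ‘a/’ untouched.

```python
import hashlib
import json

# Toy content-addressed store: hash -> object. NOT the real IPFS format.
STORE = {}

def put(obj):
    """Store an object and return a hash derived purely from its content."""
    data = json.dumps(obj, sort_keys=True).encode()
    h = hashlib.sha256(data).hexdigest()[:16]
    STORE[h] = obj
    return h

def put_file(text):
    return put({"type": "file", "data": text})

def put_dir(links):
    # A directory records only its direct children: name -> child hash.
    return put({"type": "dir", "links": dict(links)})

def patch_add_link(root, path, child):
    """Return the hash of a NEW directory: `root` plus `child` linked at `path`.

    Intermediate directories must already exist in this toy version.
    """
    name, _, rest = path.partition("/")
    links = dict(STORE[root]["links"])  # copy; stored objects never change
    if rest:
        links[name] = patch_add_link(links[name], rest, child)
    else:
        links[name] = child
    return put_dir(links)

def resolve(root, path):
    """Follow `path` link by link, starting from the directory `root`."""
    h = root
    for name in path.split("/"):
        h = STORE[h]["links"][name]
    return h

# Build a/ containing b/ and c/, then add a file under b/.
old_b = put_dir({"x.txt": put_file("x")})
c_dir = put_dir({"y.txt": put_file("y")})
old_a = put_dir({"b": old_b, "c": c_dir})
new_file = put_file("hello")

# Patching from the root yields a new a/ and, inside it, a new b/.
new_a = patch_add_link(old_a, "b/file_added.txt", new_file)
new_b = resolve(new_a, "b")

# Patching b/ directly yields the same new b/, but no new a/ exists yet.
patched_b = patch_add_link(old_b, "file_added.txt", new_file)

# Building the same tree from scratch gives the same root hash in this model.
fresh_a = put_dir({
    "b": put_dir({"x.txt": put_file("x"), "file_added.txt": new_file}),
    "c": c_dir,
})

print("old a:", old_a, "-> new a:", new_a)
print("old b:", old_b, "-> new b:", new_b)
```

In this toy model the old hash of ‘a/’ still resolves to the old ‘b/’, and the untouched sibling ‘c/’ keeps its hash. Whether real IPFS gives the same root hash as a from-scratch add depends on how the content was imported, so treat that part as a property of the sketch only.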
You can use MFS if you need root-folder hash to auto-update after modifications. And it is easier to work with when adding, removing or moving files around. ipfs files stat <path> will give the root hash of the path-dag as it is.
I have a filestore directory tree added with “add --nocopy”.
Does auto-updated mean that when I delete, add or modify a file or directory in the filestore, all of these ipfs links will be automatically updated? (OK, not all, only the ones needed.)
The documentation about file systems states that I must inform ipfs of those changes, which fits with jaidedau’s solution above.
I am thinking of writing a Python script that finds the differences from the previous scan and patches them in.
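A rough sketch of the scanning half of such a script might look like the following. The function names `scan` and `diff_scans` are made up for illustration, and the step that actually tells ipfs about each change is deliberately left out; this only computes what was added, removed or modified between two scans.

```python
import hashlib
import os
import tempfile

def scan(root):
    """Map each file path (relative to root, '/'-separated) to a SHA-256 of its bytes."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as f:
                result[rel] = hashlib.sha256(f.read()).hexdigest()
    return result

def diff_scans(old, new):
    """Return sorted lists of (added, removed, modified) relative paths."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and old[p] != new[p])
    return added, removed, modified

def write(root, rel, text):
    """Helper for the demo: create or overwrite a file under root."""
    full = os.path.join(root, *rel.split("/"))
    os.makedirs(os.path.dirname(full), exist_ok=True)
    with open(full, "w") as f:
        f.write(text)

# Demo on a temporary directory tree.
with tempfile.TemporaryDirectory() as root:
    write(root, "folder1/keep.txt", "same")
    write(root, "folder1/edit.txt", "before")
    write(root, "folder2/gone.txt", "bye")
    previous = scan(root)

    write(root, "folder1/edit.txt", "after")           # modified
    write(root, "folder1/file_added.txt", "new")       # added
    os.remove(os.path.join(root, "folder2", "gone.txt"))  # removed
    current = scan(root)

added, removed, modified = diff_scans(previous, current)
print("added:", added)
print("removed:", removed)
print("modified:", modified)
```

Each of the three lists would then be turned into the corresponding update on the ipfs side. Hashing every file on every scan is the simplest approach; comparing size and modification time first would be faster on a large tree, at the cost of missing some edits.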
If you’re using the “object patch” way of updating a folder (non-MFS folder), you must always patch the top level root itself, and you’ll therefore get an entirely new root CID as a result. You can’t just patch some sub-folder and expect any new root CID to automatically come into existence.
But MFS is different, and more like a real file system, so that parent folders ARE always automatically aware of things happening in child folders.