I’ve read what I can on the --raw-leaves flag, but the documentation is a little light on exact details, offering only descriptions like “Use raw blocks for leaf nodes” or “Allows files to be added with no formatting in the leaf nodes of the graph”. What I’m wondering is what exactly is going on here. Does this imply that there is no chunking applied? What “formatting” is usually applied?
Usually IPFS uses IPLD objects; raw leaves work more like a regular file system.
Chunking isn’t changed by using raw leaves. If you expect your data to be consumed as a stream or in one piece, you might want to set the --trickle option for a little more efficiency.
If your data isn’t compressed and you might later add newer versions of the files with only small changes, you might want to use the rabin chunker instead of the standard one.
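To illustrate why a content-defined chunker helps with slightly changed files, here is a minimal Python sketch. It is not the rabin chunker IPFS ships: it swaps in a much simpler gear-style rolling hash, and the chunk sizes, the `GEAR` table, and all function names are assumptions made up for the example. It only shows the principle that cut points follow the content, so a one-byte insertion at the start shifts every fixed-size chunk but leaves most content-defined chunks intact.

```python
import hashlib

# Per-byte pseudo-random values for a gear-style rolling hash.
# Assumption: this is a simplified stand-in for Rabin fingerprinting.
GEAR = [int.from_bytes(hashlib.sha256(bytes([b])).digest()[:4], "big")
        for b in range(256)]


def fixed_chunks(data, size=1024):
    """Split data at fixed offsets, like a size-based chunker."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data, min_size=64, avg_bits=10):
    """Cut wherever the rolling hash of the most recent bytes hits a pattern."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Old bytes are shifted out of the 32-bit hash after 32 steps,
        # so h depends only on the last 32 bytes of content.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if i + 1 - start >= min_size and (h >> (32 - avg_bits)) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared_blocks(a, b):
    """Count chunk hashes that appear in both chunk lists."""
    def ids(chunks):
        return {hashlib.sha256(c).hexdigest() for c in chunks}
    return len(ids(a) & ids(b))


# 64 KiB of deterministic pseudo-random data, then a one-byte insertion.
original = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest()
                    for i in range(2048))
edited = b"X" + original

shared_fixed = shared_blocks(fixed_chunks(original), fixed_chunks(edited))
shared_cdc = shared_blocks(content_defined_chunks(original),
                           content_defined_chunks(edited))
print("shared blocks, fixed-size chunks:", shared_fixed)
print("shared blocks, content-defined chunks:", shared_cdc)
```

With fixed-size chunks the two versions share no blocks, while the content-defined chunks resynchronise shortly after the inserted byte.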
This might be an interesting introduction:
This would affect the block hashes, wouldn’t it? That adds another way to break de-duplication, in addition to choosing different chunkers as summarized in my deduplication summary. The same file added with different chunk or leaf settings will not deduplicate at all.
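The point about settings breaking de-duplication can be sketched with a toy model in Python. The envelope used for "wrapped" leaves below is invented for the illustration and is not the real UnixFS encoding, and plain SHA-256 hex digests stand in for CIDs. The sketch only assumes that a block's identifier is a hash of its exact bytes.

```python
import hashlib


def chunk(data, size):
    """Fixed-size chunking."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def raw_leaf_id(block):
    """Raw leaf: the identifier is the hash of the chunk bytes themselves."""
    return hashlib.sha256(block).hexdigest()


def wrapped_leaf_id(block):
    """Wrapped leaf: hash of the chunk inside an envelope.

    Assumption: this envelope is a toy format, NOT the real UnixFS protobuf.
    """
    envelope = b"toy-unixfs-file:" + len(block).to_bytes(8, "big") + block
    return hashlib.sha256(envelope).hexdigest()


def leaf_ids(data, size, raw):
    ident = raw_leaf_id if raw else wrapped_leaf_id
    return {ident(block) for block in chunk(data, size)}


# 128 KiB of deterministic pseudo-random data standing in for a file.
data = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest()
                for i in range(4096))

same = leaf_ids(data, 4096, raw=False) & leaf_ids(data, 4096, raw=False)
diff_chunk = leaf_ids(data, 4096, raw=False) & leaf_ids(data, 8192, raw=False)
diff_leaf = leaf_ids(data, 4096, raw=False) & leaf_ids(data, 4096, raw=True)

print("same settings, shared blocks:", len(same))
print("different chunk size, shared blocks:", len(diff_chunk))
print("wrapped vs raw leaves, shared blocks:", len(diff_leaf))
```

Only the run with identical settings shares blocks; changing either the chunk size or the leaf format gives a completely disjoint set of identifiers for the same file.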
I understand that IPFS is new and still developing and experimenting with these settings, but it might be nice to expend some effort on making the defaults the best-practice recommendations for good de-duping, and if you play with the advanced settings make it clear that you are breaking deduping.
BTW, IMHO raw leaves are better and should be the default
IPFS is still alpha software. Until it’s ready for production usage, many of the things you mentioned will change.
Raw leaves are not “better”. They have their place and use cases.
You can break “deduplication” in many ways, but why would you? Usually you add files with the standard settings. That using an experimental flag breaks stuff is to be expected.
Ok I think I get it. Raw leaves aren’t wrapped in IPLD and raw leaves are just plain chunks of binary file content.
Can you give a use case for when you might choose raw leaves and when you wouldn’t?
You would use --raw-leaves for binary data such as encrypted, compressed or encoded formats, because the format is simpler and the usual use case is to read the whole file from IPFS in one go.
MFS currently uses raw leaves, but there’s work on a UnixFSv2 which will be IPLD compatible.
A typical example of data you wouldn’t store as raw is a git repository.
I’m not familiar with MFS, what is it?
I’m trying to round out my understanding of IPFS and I have to say that I find the multiple uses of the term “file” a bit confusing. There’s the File in IP(F)S, filestore, UnixFS (which is referred to as just “files” in the ipfs command), and the FuseFS notion of IPFS-mounted files.
I think I’ve got it mostly sorted out but just wanted to share what it looks like from a new user perspective.
EDIT: Got it… MFS = Mutable File System, another use of “file”: M(F)S.
Yeah, it’s currently not ideal.
UnixFS is just the data structure which is used to represent files and folders.
MFS is the system which uses UnixFS to store files and folders and present them to the user, even though any change to a file or folder always changes the CID of the modified object, plus the CIDs of all folders above it.
So if you have the CID of a file or a folder, you can be sure that it will never change.
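The way a change ripples up through the folder CIDs can be sketched with a toy Merkle tree in Python. Plain SHA-256 hex digests stand in for CIDs, and the folder layout and file names are made up for the example; real UnixFS nodes are encoded differently.

```python
import hashlib


def file_id(content):
    """Identifier of a file: a hash of its content."""
    return hashlib.sha256(b"file:" + content).hexdigest()


def dir_id(entries):
    """Identifier of a folder: a hash over its (name, child id) entries."""
    listing = "".join(f"{name}={child}\n"
                      for name, child in sorted(entries.items()))
    return hashlib.sha256(b"dir:" + listing.encode()).hexdigest()


def build(readme, photo):
    """Build a hypothetical root folder with two subfolders."""
    docs = dir_id({"readme.txt": file_id(readme)})
    pics = dir_id({"photo.jpg": file_id(photo)})
    root = dir_id({"docs": docs, "pics": pics})
    return {"docs": docs, "pics": pics, "root": root}


before = build(b"hello", b"\x89fake-image-bytes")
after = build(b"hello, world", b"\x89fake-image-bytes")  # only the readme changed

for name in ("docs", "pics", "root"):
    state = "changed" if before[name] != after[name] else "unchanged"
    print(name, state)
```

Editing the readme changes the identifier of docs and of root, while pics keeps its identifier. The old identifiers still refer to the old content, which is why a CID you already hold never changes underneath you.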
MFS adds human-readable file and folder names to IPFS objects… so if you add a folder with the -r flag via ipfs add, it will not appear in the GUI under ‘files’, nor will the CLI command ipfs files ls list the newly added folder.
In this case you’ve created a CID of the folder which you can share with somebody, but it’s totally static.
If you want to modify it easily, you have to link it to the MFS via an ipfs files command:
ipfs files cp /ipfs/<CID> /time_travel_stuff
There will be no data copied, since it’s just a link, but you can now edit the folder with the ipfs files commands.
Note that if you now delete the folder, the folder will still be pinned, since you’ve added it with ipfs add.
Currently everything which is either pinned or inside the MFS (or both) won’t be cleaned up by the Garbage Collection process.
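The retention rule can be written down as a small set computation. This is a toy model in Python, not how the garbage collector is implemented: the block names, the `links` table, and both function names are hypothetical, and it assumes pins are recursive, which is the default for ipfs add.

```python
def reachable(roots, links):
    """Return every block reachable from the given root identifiers."""
    seen, stack = set(), list(roots)
    while stack:
        block = stack.pop()
        if block not in seen:
            seen.add(block)
            stack.extend(links.get(block, []))
    return seen


def collectable(all_blocks, links, pinned_roots, mfs_roots):
    """Blocks that are neither pinned nor referenced from MFS."""
    keep = reachable(set(pinned_roots) | set(mfs_roots), links)
    return set(all_blocks) - keep


# Hypothetical repository: two folders sharing one file, plus a stray block.
links = {"folderA": ["file1", "file2"], "folderB": ["file2", "file3"]}
all_blocks = {"folderA", "folderB", "file1", "file2", "file3", "orphan"}

# folderA was added with `ipfs add` (pinned) and linked into MFS;
# folderB only lives in MFS.
with_mfs = collectable(all_blocks, links, {"folderA"}, {"folderA", "folderB"})

# After removing both folders from MFS, the pin still protects folderA.
without_mfs = collectable(all_blocks, links, {"folderA"}, set())

print(sorted(with_mfs))
print(sorted(without_mfs))
```

In the second case folderA and its files survive because of the pin, which is the situation described above: deleting the folder from MFS does not unpin it.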
If you want to share a folder which can be modified, you need the IPNS system. It’s a private/public key system where you can publish different CIDs under a static hash.
To use it you generate a key with a subcommand of ipfs key and then publish the new signed CID for your key with ipfs name publish.
This way your /ipfs/<CID> will be reachable as /ipns/<hash>.
Still not really human readable, but it’s now static. So if you share the link, the person can always resolve it and gets the latest CID you published under this ipns.
To create human-readable URLs you can use DNSLink, where you create a TXT record on a (sub)domain in the regular DNS system, and this TXT record contains the /ipns/<hash>.
So every time you resolve the URL you get the same ipns hash, which then gets resolved to the latest CID you published, which then loads an MFS folder and displays the content.
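The chain from domain name to IPNS name to CID can be sketched as a toy resolver in Python. Everything here is an in-memory stand-in: the two dictionaries replace DNS and the IPNS record store, the name derivation is invented, and signatures are left out entirely. The only details taken from real DNSLink are the `_dnslink.` subdomain and the `dnslink=` prefix of the TXT value.

```python
import hashlib

dns_txt = {}       # hypothetical stand-in for DNS TXT records
ipns_records = {}  # hypothetical stand-in for published IPNS records


def ipns_name(public_key):
    """Derive a stable name from a key (toy derivation, not the real one)."""
    return "k-" + hashlib.sha256(public_key).hexdigest()[:16]


def publish(public_key, cid):
    """Point the key's static name at a new CID, bumping a sequence number."""
    name = ipns_name(public_key)
    seq = ipns_records.get(name, {"seq": 0})["seq"] + 1
    ipns_records[name] = {"seq": seq, "value": "/ipfs/" + cid}
    return name


def resolve(path, max_hops=8):
    """Follow /ipns/ links (DNSLink or key name) until an /ipfs/ path appears."""
    for _ in range(max_hops):
        if path.startswith("/ipfs/"):
            return path
        if not path.startswith("/ipns/"):
            raise ValueError("unsupported path: " + path)
        name = path[len("/ipns/"):]
        txt = dns_txt.get("_dnslink." + name)
        if txt is not None and txt.startswith("dnslink="):
            path = txt[len("dnslink="):]        # domain -> /ipns/<hash>
        else:
            path = ipns_records[name]["value"]  # /ipns/<hash> -> /ipfs/<CID>
    raise RuntimeError("too many hops")


key = b"example-public-key"  # placeholder bytes, not a real key
name = publish(key, "CID_v1")
dns_txt["_dnslink.example.org"] = "dnslink=/ipns/" + name

first = resolve("/ipns/example.org")
publish(key, "CID_v2")  # new content, same static name
second = resolve("/ipns/example.org")
print(first, second)
```

The DNS record is written once; only the IPNS record changes when new content is published, so the same domain resolves to the latest CID.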
Easy, isn’t it?
An IPNS or IPFS link can also be resolved through the ipfs.io address. I’ve forwarded some domains to this via simple HTTP redirects.
But the domains also contain a DNSLink, so if you want to receive the content with IPFS, you can with the same domain name.
Take a look:
I had seen that. I’ll watch the video eventually but it was a little disappointing to have the docs follow up a summary with, “…and go watch a video”.
I haven’t watched a video on IPFS and I’m still very new to this project. The CLI documentation helps a little. But I agree, the documentation could be better in some areas.
Feel free to extend the documentation as you go, and send pull requests. They are surely highly appreciated.
Thanks, that’s really helpful. I’ll have to read it over a couple of times to make sure I’ve got everything right in my head.
Feel free to do so, I won’t charge you extra
For anyone following along I noticed that there are bounties for IPFS issues including documentation. I was probably going to do it anyway but this is cool. https://github.com/ipfs/devgrants/projects/1
The new beta docs also have links at the bottom of each page for suggesting new content, which open a GitHub issue or a fork for submitting a PR.
I think the IPFS docs could really use some illustrations and I’ll see what I can do about that. I’ll submit an issue for how to keep the look and feel of illustrations consistent across the docs.