IPFS objects form a DAG, so whenever I add or pin something I never know whether it is already used and pinned from somewhere else in my application. So how do I know when it is safe to unpin something?
I would have to keep track manually of how many times I have referenced the object from anywhere within my application to make sure I only unpin it once I really don't need it anymore.
I would have hoped for the pinning system to do some kind of reference counting to keep track of how often an object is pinned. Without that, I don't see how I can ever confidently unpin something, because it could be used from somewhere else.
$ ipfs pin ls QmR9pC5uCF3UExca8RSrCVL8eKv7nHMpATzbEQkAHpXmVM
QmR9pC5uCF3UExca8RSrCVL8eKv7nHMpATzbEQkAHpXmVM indirect through QmZNFvNfWjtrj5iSjfs4Y6x4AR3vgzT1brnqCvw6MwEpJU
Yes it is, but now via QmZNFvNfWjtrj5iSjfs4Y6x4AR3vgzT1brnqCvw6MwEpJU aka tree2.
So the indirect pinning works exactly as expected. Not sure if it is via reference counting or via another mechanism, but I don't really care.
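To make the "indirect through" behaviour concrete, here is a minimal sketch of one way such a status could be computed: treat every recursive pin as a root and check whether the object is reachable from it. This is an illustration only and is not taken from the IPFS source; the `links` table, the names `tree1`, `fileA` and `fileB`, and the functions `reachable` and `pin_status` are all made up for the example.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical in-memory DAG: each key links to the listed children.
links: Dict[str, List[str]] = {
    "tree1": ["fileA", "fileB"],
    "tree2": ["fileA"],
}


def reachable(dag: Dict[str, List[str]], roots: Iterable[str]) -> Set[str]:
    """Return every node reachable from the given roots (roots included)."""
    seen: Set[str] = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(dag.get(node, []))
    return seen


def pin_status(dag: Dict[str, List[str]], recursive_pins: Set[str], target: str) -> str:
    """Describe how target is kept alive by the given recursive pins."""
    if target in recursive_pins:
        return "recursive"
    # Sorted only to make the reported root deterministic in this sketch.
    for root in sorted(recursive_pins):
        if target in reachable(dag, [root]):
            return f"indirect through {root}"
    return "not pinned"


print(pin_status(links, {"tree1", "tree2"}, "fileA"))  # indirect through tree1
print(pin_status(links, {"tree2"}, "fileA"))           # indirect through tree2
print(pin_status(links, {"tree2"}, "fileB"))           # not pinned
```

With this model no counter is needed for indirect pins: removing one root simply leaves the object reachable through whichever other roots still link to it.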
The question now becomes:
How can I get this "safe behaviour" that takes into account that an object can be pinned from multiple places in a DAG for things that I directly pin, to get more fine grained control over what is pinned?
Is there some detailed documentation for how the pinning system works, including internal approach and performance characteristics, or is it just "use the source, luke"?
I don't think you should think of something being pinned as a reference count. You're in a decentralized environment. The pinning only means that it's pinned from your perspective. If you need it to be pinned, then you have to pin it. You can't rely on the fact that someone else pinned something. That's what it means to be decentralized.
Yes, I am very much aware of this. I am talking solely about a single node. Let's say I have some piece of data which is very common. It might be linked from many places in my application (we are not talking about a blog or website here, but a pretty complex database application).
Now for whatever reason I don't need one particular link anymore. How do I decide that it is "the last" link, and I have to unpin?
This is exactly identical to handling pointers to objects on a heap. There are basically three approaches:
1. The malloc approach
You have to make sure that you delete (pin rm) an object on the heap only when there are no more references to it. If you mess up, all hell breaks loose.
2. The smart pointer / reference counting approach
There is a counter that keeps track of how many pointers point to a location, and you automatically delete (pin rm) once the last pointer is gone. This famously does not work for cyclic references such as a doubly linked list, but these are impossible anyway in IPFS since it is an acyclic graph.
3. The garbage collector approach
You perform global analysis on the heap and detect when an object is no longer referenced directly or indirectly from a "root". Can be very fast, but is a very complex piece of software.
I was kind of expecting ipfs to follow the reference counting approach, since a DAG cannot have cycles, so this should work pretty well. But this seems to not be the case. It seems to be more like 3., which unfortunately in my case forces me to do 1.
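For the direct-pin case, the reference counting approach can be layered on top of the node by the application itself. The following is a minimal sketch under stated assumptions, not an IPFS feature: the class `PinRefCounter` and its `acquire`/`release` methods are hypothetical, and the `pin` and `unpin` callbacks are assumed to wrap whatever actually talks to the node (for example something that runs `ipfs pin add <cid>` and `ipfs pin rm <cid>`).

```python
from collections import Counter
from typing import Callable


class PinRefCounter:
    """Application-level reference counts layered on top of direct pins."""

    def __init__(self, pin: Callable[[str], None], unpin: Callable[[str], None]):
        self._pin = pin      # assumed to pin the CID on the local node
        self._unpin = unpin  # assumed to unpin the CID on the local node
        self._counts: Counter = Counter()

    def acquire(self, cid: str) -> int:
        """Register one more reference; pin only on the 0 -> 1 transition."""
        if self._counts[cid] == 0:
            self._pin(cid)
        self._counts[cid] += 1
        return self._counts[cid]

    def release(self, cid: str) -> int:
        """Drop one reference; unpin only when the last one is gone."""
        if self._counts[cid] == 0:
            raise KeyError(f"{cid} has no outstanding references")
        self._counts[cid] -= 1
        remaining = self._counts[cid]
        if remaining == 0:
            del self._counts[cid]
            self._unpin(cid)
        return remaining


# Usage with stand-in callbacks that only record what would be sent to the node.
calls = []
rc = PinRefCounter(lambda c: calls.append(("pin", c)),
                   lambda c: calls.append(("unpin", c)))
rc.acquire("QmExample")
rc.acquire("QmExample")
rc.release("QmExample")   # still referenced once, nothing is unpinned
rc.release("QmExample")   # last reference gone, unpin is called
print(calls)  # [('pin', 'QmExample'), ('unpin', 'QmExample')]
```

In a real application the counts would have to be stored durably alongside the application data, otherwise they are lost on restart and the bookkeeping falls back to approach 1.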
In my current understanding it is more like (3), if you pin recursively. A recursive pin is like a GC root on a garbage collected heap. The problem is that this is very coarse. What if you want to have more fine-grained control over what is pinned? For example, you have some nested dag objects, and somewhere you got a dag link to this: Main Page
Obviously pinning everything recursively is not the answer, since it might not even fit on your device.
Sorry to be a noob, but please help me understand something. In your example above, you never do an ipfs pin add on the files. Are they pinned by virtue of being part of the folder that you added (but didn't pin)? What happens in your example if you do ipfs add... and then ipfs pin add... Does it matter?
The default behavior for ipfs pin add is to also pin the root hash. This can be overridden using the --pin option.
If you do ipfs add followed by ipfs pin add it is redundant if you used the default ipfs add behavior. In either case the ipfs pin add should complete quickly since the content should be cached in your repo.
@tjayrush I don't see any typos in what I wrote. If there are specific things I wrote that don't make sense to you I'd be happy to clarify. Otherwise if nothing I wrote made sense maybe someone else can put it more clearly.
edit: looking at the example above that I think you were referencing, I might not have understood what you're asking. Please feel free to disregard what I wrote if it doesn't apply. Apologies if I've only added noise.
Which confused me a bit. Did you mean to say "The default behaviour for ipfs add is..."? In the original example, the OP doesn't do ipfs pin add, they only do ipfs add -r. I'm just learning, so I'm not sure.