Quick question before I head down a rabbit hole, is there any way or possibly discussion on adding different GC strategies? I’m thinking I’d like to have a simple LRU strategy where a node basically acts as a cache.
Everything is possible if you are willing to make it.
The main reason IPFS use a full → flush strategy is that GC is fucking expensive.
Currently IPFS runs a touching GC.
When a GC is triggered IPFS list all recursive pins on your node, then for each pin it list of all of the subblocks and “touch” them, and then on all subblocks listing their subsubblocks and repeating the same process recursively until no blocks are left.
You then have a huge list of “touched” blocks, all touched blocks are descendant of one of your recursive pin.
Everything not touched is removed.
Since this require scanning all your pins and all their tree that really expensive.
Ideally this would being removed and replaced by a refcount, each block would hold a counter counting all references (blocks that have this block as a child), when you pin something new you add one to that refcount, and when you remove a pin you decrement it.
The main reason to do this is that right now if you unpin a block you don’t know if all of it’s tree also need to be removed, because maybe part of it is also part of the tree of an other pin.
With refcount you just list all subblocks and decrement their refcount, if the refcount reach 0 that mean the block you just unpinned was the last block needing that block and you can safely remove it.
If a refcount) is bigger than 1 that mean there are multiple pins leading to that block and removing one doesn’t mean the block can be purged safely, you need to remove both (you need to remove as many as stored in the refcount).
Note: touch GC is not actually that expensive, it depends a lot on how often you run it and your trigger strategy, but if dialled just right touch GC can be less expensive than refcount.
However refcount support incremental updates, if you unpin 3 blocks out of a 7 block tree, you just need to scan 6 blocks or so (all blocks you remove + all of their direct child, not 2nd degree childs).
Touching GC is all or nothing.
Also touch GC pack all the GCing overhead at one time.
If you use a touch GC your GC might take 100s to run.
If you use a refcount you might total 130s however the refcount will be spread out in 1300 events 100ms each.
Refcount are also more predictible.
With a refcount same code will always almost have the same latency. With touch GC there is a lot to do with when the GC runs. For example you might have all requests be very fast, however if a request happen to land when the GC runs, your program is just stalled for 1m straight.
Agree but I figured it’s a good idea to ask if anyone thought of inventing a wheel before thinking it’s an original idea
Thanks for the detailed feedback. So sounds like something interesting to dig into that will probably lead in a thousand different directions.