Is there any documentation on how GC is performed? Are there any criteria (LRU, size, etc.), or does it just start randomly deleting unpinned content until it gets below a certain threshold?
I’m also wondering how that works with files stored using urlstore, or for that matter filestore, since GC’ing them doesn’t really free up much space: the actual raw blocks live on the filesystem or on some remote HTTP server.
It would be kind of cool if there were an in-between mode for urlstore where it would cache the blocks locally and GC them when you needed space, but still keep the URL reference in case someone requested that content again. Could you do something like that with two IPFS daemons and some fancy routing between them?
The GC will scan the whole repo and delete every block that is neither pinned nor part of the MFS.
There’s currently no way to wipe just “parts” of the cache.
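In rough pseudocode, the behaviour looks like this (a minimal Go sketch; the `Blockstore` and `Pinner` interfaces are hypothetical stand-ins, not the real go-ipfs ones):

```go
package main

type CID string

// Hypothetical views of the repo; the real go-ipfs interfaces differ.
type Blockstore interface {
	AllKeys() []CID // every block in the repo
	Delete(c CID)   // remove one block
}

type Pinner interface {
	PinnedClosure() map[CID]bool // all pinned blocks, recursively
	MFSClosure() map[CID]bool    // everything reachable from the MFS root
}

// GC keeps a block only if it is pinned or reachable from the MFS;
// everything else is deleted in a single full-repo sweep.
func GC(bs Blockstore, p Pinner) {
	keep := p.PinnedClosure()
	for c := range p.MFSClosure() {
		keep[c] = true
	}
	for _, c := range bs.AllKeys() {
		if !keep[c] {
			bs.Delete(c)
		}
	}
}

func main() {}
```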
I’m currently working on a solution for this, which involves tracking which blocks are pinned and which are not. This was once part of IPFS, but the performance impact of the implementation was quite large.
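To give an idea of what that tracking could look like (this is my assumption, not the actual design), here is a toy pin-reference index. Note how pinning a large DAG has to touch a counter for every block in it, which hints at why the earlier implementation was expensive:

```go
package main

type CID string

// Hypothetical pin-state index kept alongside the blockstore.
type PinIndex struct {
	refs map[CID]int // pin reference count per block
}

func NewPinIndex() *PinIndex {
	return &PinIndex{refs: make(map[CID]int)}
}

// Pin walks the DAG below root and bumps a counter for each block.
func (ix *PinIndex) Pin(dag map[CID][]CID, root CID) {
	ix.refs[root]++
	for _, child := range dag[root] {
		ix.Pin(dag, child)
	}
}

// Unpin walks the same DAG and decrements the counters again.
func (ix *PinIndex) Unpin(dag map[CID][]CID, root CID) {
	ix.refs[root]--
	for _, child := range dag[root] {
		ix.Unpin(dag, child)
	}
}

// Evictable blocks are exactly those with no pin references.
func (ix *PinIndex) Evictable(c CID) bool { return ix.refs[c] == 0 }

func main() {}
```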
It won’t take size into consideration for the metric, since most blocks are expected to be completely filled to the maximum block size of 256 KB.
The metric will be a custom one, calculated from the age and the number of accesses of a block, with a non-linear weighting to prevent a burst of accesses from poisoning the cache forever.
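Just to illustrate the shape of it (the concrete formula below is my own guess, not the final metric): access counts are dampened logarithmically and decay with age, so a burst of hits only boosts a block temporarily:

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// score: higher means "more worth keeping".
func score(accesses int, lastAccess time.Time) float64 {
	ageHours := time.Since(lastAccess).Hours()
	// Logarithmic dampening: 1000 accesses count only a few times
	// as much as 10, so heavy bursts don't dominate.
	weight := math.Log1p(float64(accesses))
	// Exponential decay with a one-week half-life (arbitrary choice).
	decay := math.Exp2(-ageHours / (7 * 24))
	return weight * decay
}

func main() {
	weekAgo := time.Now().Add(-7 * 24 * time.Hour)
	fmt.Printf("hot block:  %.3f\n", score(1000, time.Now()))
	fmt.Printf("old burst:  %.3f\n", score(1000, weekAgo))
	fmt.Printf("fresh once: %.3f\n", score(1, time.Now()))
}
```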
The idea is to measure how much new space is used per interval and then delete the oldest and least-accessed blocks that are neither pinned nor stored in the MFS, until roughly the same amount of space has been freed.
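Sketched out, with hypothetical names and the metric from above reduced to a plain field:

```go
package main

import (
	"fmt"
	"sort"
)

type Block struct {
	Size      int64
	Score     float64 // from the age/access metric above
	Evictable bool    // neither pinned nor referenced by the MFS
}

// evict frees roughly grewBy bytes, lowest score first, and returns
// the blocks that survive the pass.
func evict(blocks []Block, grewBy int64) []Block {
	sort.Slice(blocks, func(i, j int) bool {
		return blocks[i].Score < blocks[j].Score
	})
	var freed int64
	kept := blocks[:0]
	for _, b := range blocks {
		if freed < grewBy && b.Evictable {
			freed += b.Size // would delete from the blockstore here
			continue
		}
		kept = append(kept, b)
	}
	return kept
}

func main() {
	blocks := []Block{
		{Size: 256 << 10, Score: 0.1, Evictable: true},
		{Size: 256 << 10, Score: 5.0, Evictable: true},
		{Size: 256 << 10, Score: 0.0, Evictable: false}, // pinned
	}
	// The repo grew by one block's worth this interval.
	remaining := evict(blocks, 256<<10)
	fmt.Println(len(remaining), "blocks left")
}
```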
Details can be found here:
Note that neither urlstore nor filestore is part of this, since the algorithm couldn’t free that “data” anyway. Their metadata blocks will be handled like any other blocks, though.