When are files no longer accessible? After the file has been unpinned or only after it has been GC’ed?
I’m specifically asking because I’d like to use ipfs-cluster and have data that needs to be aged off at a specific time. The exact time it is deleted from disk isn’t as important as making sure users can’t access it. It would be a large number of files, so I’d like to be able to unpin them and lazily delete them afterwards with a GC.
I think I answered my own question. Content is still available until it has been GC’ed. It would be a nice feature to have: something like a hard-unpin or delete that would stop sharing the file immediately as well as unpin it.
I don’t think it is possible to GC specific pins right now. You may want to open an issue in go-ipfs.
That said, once content has been fetched, it has been copied, and it can potentially be provided by anyone (that happens on the WWW too, so it’s not specific to IPFS, but it’s even easier on IPFS).
Without knowing why you want to make something unavailable from your peers, I can say that the reason why IPFS does not focus much on making something unavailable after it’s been available is that it’s very much pointless in a content-addressed network like this.
I was thinking of it in the case of a private network like ipfs-cluster where you’re in control of all the nodes. I think I might be able to accomplish what I’m looking for at the gateway. I wouldn’t really need specific pins GC’ed, just that unpinned content gets GC’ed eventually.
I’m thinking about the case where, say, you have user data that you are only allowed to retain for 30 days. It would be OK for the data to remain on disk for a short time afterwards, but it needs to become inaccessible at exactly 30 days. It wouldn’t be OK for it to still be accessible on day 31 because the GC is taking a long time.
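A hard cutoff like this can be enforced in front of the gateway, independently of when the GC actually runs. Here is a minimal sketch in Python, assuming you keep your own record of when each piece of content was stored. The names `is_servable` and `gateway_status`, and the choice of HTTP status codes, are hypothetical and not part of IPFS or ipfs-cluster:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 30-day retention window from the example above.
RETENTION = timedelta(days=30)


def is_servable(stored_at, now, retention=RETENTION):
    """Return True only while the content is strictly inside its retention window."""
    return now - stored_at < retention


def gateway_status(stored_at, now):
    """Hypothetical gateway hook: pick the HTTP status for a request.

    stored_at is None when we have no record of the content at all.
    """
    if stored_at is None:
        return 404  # unknown content
    if is_servable(stored_at, now):
        return 200  # still within the window, pass the request through
    return 410  # expired: refuse even if the blocks are still on disk
```

With a check like this, access stops at day 30 even if the blocks sit on disk until the next GC.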
It seems like it would be pointless on a distributed system but not because it’s content-addressed.
Is it unreasonable to do a daily garbage collect as a cron job or similar on each machine?
If your threshold is 30 days, you could make it weekly, etc.
‘ipfs repo gc’
Once unpinned I believe it should get garbage collected if you run it manually. After gc’d from both your storage nodes and the gateways, assuming no one outside your network is sharing it, it should be effectively gone.
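A scheduled GC can be a thin wrapper around that command. This is a rough sketch in Python, assuming the `ipfs` binary is on the PATH of the user running the job and that each non-empty output line corresponds to one removed block. The function name `run_gc` and the injectable `runner` parameter are my own, added so the wrapper can be exercised without a live node:

```python
import subprocess


def run_gc(runner=subprocess.run):
    """Run `ipfs repo gc` on the local node and return its non-empty output lines.

    Assumption: the command prints one line per removed block.
    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = runner(
        ["ipfs", "repo", "gc"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]
```

A small script that calls `run_gc()` and logs the number of returned lines could then be scheduled daily or weekly with cron on each machine.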
I’m interested in time-based unpinning, but not time-based garbage collection. I’ll probably just keep a separate table recording when each pin was made.
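For what it’s worth, such a table can be very small. Below is a minimal sketch in Python using SQLite, assuming a 30-day maximum age. The table layout and every function name here are hypothetical, and the actual unpin call is passed in as a callable rather than hard-coded:

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def open_table(path=":memory:"):
    """Open (or create) the hypothetical pin-tracking table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS pins (cid TEXT PRIMARY KEY, pinned_at REAL NOT NULL)"
    )
    return db


def record_pin(db, cid, pinned_at):
    """Remember when a CID was pinned, stored as a Unix timestamp."""
    db.execute("INSERT OR REPLACE INTO pins VALUES (?, ?)", (cid, pinned_at.timestamp()))
    db.commit()


def expired_cids(db, now, max_age=timedelta(days=30)):
    """Return CIDs pinned at least max_age ago, oldest first."""
    cutoff = (now - max_age).timestamp()
    rows = db.execute(
        "SELECT cid FROM pins WHERE pinned_at <= ? ORDER BY pinned_at", (cutoff,)
    )
    return [cid for (cid,) in rows]


def unpin_expired(db, now, unpin, max_age=timedelta(days=30)):
    """Call unpin(cid) for every expired CID, then drop it from the table."""
    for cid in expired_cids(db, now, max_age):
        unpin(cid)
        db.execute("DELETE FROM pins WHERE cid = ?", (cid,))
    db.commit()
```

In a cluster setup, `unpin` could be something like `lambda cid: subprocess.run(["ipfs-cluster-ctl", "pin", "rm", cid], check=True)`. The blocks themselves would still stay on disk until the next GC.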
It’s perfectly reasonable, but in this case it’s a legal requirement, and when the lawyers say gone in 30 days they don’t want to hear, “it’ll be gone whenever the GC gets around to it”. Think medical data that can’t be retained for more than 30 days. Contrary to what many people like to think about the profession, lawyers do understand that there are practical considerations, so they’re fine if the data still exists on disk for a little while. But access must be restricted at day 30, and the data must be deleted within a reasonable time after that.