I’ve noticed that while adding and serving files is a solved problem in IPFS, file discovery is not.
What are some idiomatic methods to advertise a list of files?
Of course, the most obvious solution is to share a directory CID from the Files API using IPNS or DNSLink.
However, this naive solution doesn’t necessarily scale, because the client has to download the whole list. Imagine downloading the whole filesystem tree into memory every time you open the file explorer.
Sane APIs would have you limit your queries, e.g. GET /files?start=0&end=100, so that both ends don’t experience undue load, or offer changes since a timestamp, e.g. GET /files?since=1234567890, but this isn’t possible in IPFS because it only serves static files.
It should be possible to partially alleviate the issue by structuring the data behind the CID in an intelligent way, e.g. by serving “pages” (lists of lists) which the end user can dereference at will. However, this technique is not friendly to the network if the files are constantly mutating: every change produces new CIDs that initially only the publisher holds, so the catalog will lean towards centralization and be poorly distributed.
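To make the “pages” idea concrete, here is a rough sketch in Python. It is not real IPFS code: the `store` dict and the SHA-256 hex keys are stand-ins I’m assuming in place of an actual node and real CIDs, and `build_catalog` and `read_range` are hypothetical helper names. The point is only that a reader fetches the small root block plus the pages covering the requested range, rather than the whole list.

```python
import hashlib
import json

# Stand-in for a content-addressed block store (assumption: a plain dict
# keyed by SHA-256 hex digests; real IPFS CIDs are encoded differently).
store = {}

def put(obj):
    """Serialize obj, store it under the hash of its bytes, return the key."""
    data = json.dumps(obj, sort_keys=True).encode()
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key

def get(key):
    """Fetch and decode one block by its key."""
    return json.loads(store[key])

def build_catalog(entries, page_size=100):
    """Split entries into fixed-size pages and return the key of a root
    block that lists the page keys (a list of lists)."""
    pages = [put(entries[i:i + page_size])
             for i in range(0, len(entries), page_size)]
    return put({"page_size": page_size, "total": len(entries), "pages": pages})

def read_range(root_key, start, end):
    """Return entries[start:end], fetching only the pages that overlap
    the range. Assumes 0 <= start."""
    root = get(root_key)
    size = root["page_size"]
    end = min(end, root["total"])
    out = []
    for page_no in range(start // size, (end - 1) // size + 1):
        page = get(root["pages"][page_no])
        lo = max(start - page_no * size, 0)
        hi = min(end - page_no * size, size)
        out.extend(page[lo:hi])
    return out

entries = [f"file-{i}" for i in range(250)]
root = build_catalog(entries)
window = read_range(root, 95, 105)  # touches pages 0 and 1 only
```

With fixed-size pages, changing one entry in place only replaces that page and the root, but inserting or deleting near the front shifts every later page, which is where the churn comes from.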
An alternative could be to implement the file catalog along the lines of an append-only database that undergoes occasional “garbage collection”. This reduces the amount of churn on the network and should make it possible for the user to query using a timestamp. I think it sounds similar to OrbitDB’s feedstore.
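Roughly what I have in mind for the append-only catalog, sketched in Python. Again this is not OrbitDB or IPFS code: the `store` dict with SHA-256 keys stands in for a content-addressed store, the entry layout (`op`, `path`, `ts`, `prev`) is made up for illustration, and `append`, `changes_since` and `compact` are hypothetical names. Each entry links to the previous one, so only the head key would need to be published (e.g. via IPNS), and I’m assuming timestamps never decrease along the log.

```python
import hashlib
import json

# Stand-in for a content-addressed block store (assumption: a plain dict
# keyed by SHA-256 hex digests rather than real CIDs).
store = {}

def put(obj):
    """Serialize obj, store it under the hash of its bytes, return the key."""
    data = json.dumps(obj, sort_keys=True).encode()
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key

def get(key):
    """Fetch and decode one block by its key."""
    return json.loads(store[key])

def append(head, op, path, ts):
    """Add one immutable log entry pointing at the previous head and
    return the new head key. Existing entries are never rewritten."""
    return put({"op": op, "path": path, "ts": ts, "prev": head})

def changes_since(head, since):
    """Walk back from the head and collect entries newer than `since`,
    stopping at the first older one (relies on non-decreasing timestamps)."""
    out = []
    key = head
    while key is not None:
        entry = get(key)
        if entry["ts"] <= since:
            break
        out.append((entry["ts"], entry["op"], entry["path"]))
        key = entry["prev"]
    out.reverse()  # oldest first
    return out

def compact(head):
    """Occasional 'garbage collection': replay the log, keep only files
    that are still live, and write them out as a fresh, shorter log."""
    history = []
    key = head
    while key is not None:
        entry = get(key)
        history.append(entry)
        key = entry["prev"]
    live = {}
    for entry in reversed(history):  # replay oldest first
        if entry["op"] == "add":
            live[entry["path"]] = entry["ts"]
        else:
            live.pop(entry["path"], None)
    new_head = None
    for path, ts in sorted(live.items(), key=lambda kv: kv[1]):
        new_head = append(new_head, "add", path, ts)
    return new_head

head = None
head = append(head, "add", "a.txt", 10)
head = append(head, "add", "b.txt", 20)
head = append(head, "remove", "a.txt", 30)
recent = changes_since(head, 15)
compacted = compact(head)
```

A reader who last synced at timestamp 15 only fetches the two newest entries. The trade-off is that compaction produces an entirely new chain, so it should happen rarely if the goal is to keep churn low.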