IPFS for personal use: keeping track of "current version", discoverability, and IPFS file system

I’ve set up two private nodes using the instructions at https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#private-networks and it worked great. Now I’ve got some questions about the experience of actually using it…

Here’s my use case: I’ve got a bunch of projects with audio files. IPFS seems like it could be very useful for snapshotting the projects, sharing between systems, and de-duplicating files across projects. So for example, I’ve got a project with the following structure:


I add it to IPFS with ipfs add -r myproject and get a hash hash1. Then I add a file audio3.wav and add the dir again to get hash2.

Now I’m a bit stuck. How do I know which version is the most current version of the project? From what I’ve seen so far, IPFS doesn’t have any metadata to tell me that hash2 was created more recently than hash1 – which makes sense, because I’d get the same results if I had added all three audio files first and then deleted one!

But still, I want some record of the versions so I know what’s current.

It seems to me like there needs to be an index of hashes that sits outside of IPFS – is that the typical pattern?
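One way such an outside index could look is a plain text log that records each root hash together with the time it was added. This is only a sketch: the index file location, the tab-separated format, and the function names snapshot and current are made up for illustration; the only IPFS call used is ipfs add -r -Q, which prints just the final root hash.

```shell
#!/bin/sh
# Hypothetical index location; any file outside the IPFS repo would do.
INDEX_FILE="${INDEX_FILE:-$HOME/.ipfs-project-index.tsv}"

# snapshot DIR NAME
# Adds DIR recursively and appends "timestamp<TAB>name<TAB>hash" to the index.
snapshot() {
    dir="$1"
    name="$2"
    # -Q (--quieter) makes ipfs add print only the final root hash.
    root_hash="$(ipfs add -r -Q "$dir")" || return 1
    printf '%s\t%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$name" "$root_hash" >> "$INDEX_FILE"
    printf '%s\n' "$root_hash"
}

# current NAME
# Prints the most recently recorded hash for NAME (last matching line wins).
current() {
    awk -F '\t' -v n="$1" '$2 == n { h = $3 } END { if (h != "") print h }' "$INDEX_FILE"
}
```

With this, snapshot ./myproject myproject would be run after each change, and ipfs ls "$(current myproject)" would list the latest recorded version. Because the log is append-only, the older hashes stay available as history.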

Another question I have is about discoverability of my IPFS content. How do I get any insight into what’s sitting in my IPFS? So far the only thing I’ve figured out is ipfs ls hash, or clicking every DAG link in the webui. That’s pretty long-winded. I guess the way we solve that with regular file systems is with names :)

Actually, I may have just figured something out using ipfs files. I can create a named directory structure:

ipfs files mkdir /myproject
ipfs files cp /ipfs/hash1 /myproject/version1
ipfs files cp /ipfs/hash2 /myproject/version2

Now, if I do an ipfs pin ls and pass it the hash of /myproject (retrieved via ipfs files stat /myproject) then it doesn’t show up in the pins list. But when I do ipfs repo gc it doesn’t remove them. So does ipfs files provide a sort of “soft” pin, where whatever’s referenced in ipfs files won’t get garbage collected? Is it reliable, or should I manually pin each version to be safe?
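If relying on that behaviour feels too implicit, each version can also be pinned explicitly. The following is a rough sketch, assuming every entry directly under the MFS directory is one version; the function name pin_versions is hypothetical, while ipfs files ls, ipfs files stat --hash and ipfs pin add are the regular commands.

```shell
#!/bin/sh
# pin_versions MFS_DIR
# Pins (recursively, the default for ipfs pin add) the root hash of every
# entry directly under MFS_DIR, e.g. /myproject/version1, /myproject/version2.
pin_versions() {
    mfs_dir="$1"
    ipfs files ls "$mfs_dir" | while IFS= read -r entry; do
        # Look up the hash that the MFS entry currently points to.
        entry_hash="$(ipfs files stat --hash "$mfs_dir/$entry" </dev/null)" || continue
        # </dev/null keeps the command from consuming the loop's input.
        ipfs pin add "$entry_hash" </dev/null
    done
}
```

After pin_versions /myproject, the version hashes should show up in ipfs pin ls --type=recursive, so they no longer depend on whatever protection ipfs files gives them.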

Related: Can I ipfs files ls on a different node? I’d like to be able to see the state of the project that’s on a different computer… so far the closest thing I can think of is doing an ipfs name publish myprojecthash, but if I could query node1 for what it thinks /myproject is that would be great.
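The ipfs name publish route could be wrapped roughly like this. It is a sketch under the assumption that IPNS publishing and resolution work between the nodes of the private network; the function names publish_mfs and list_remote are hypothetical, and PEER_ID stands for the other node's peer ID (as printed by ipfs id on that node).

```shell
#!/bin/sh
# publish_mfs MFS_PATH   (run on the node that owns the project)
# Publishes the current root hash of MFS_PATH under this node's IPNS name.
publish_mfs() {
    root_hash="$(ipfs files stat --hash "$1")" || return 1
    ipfs name publish "/ipfs/$root_hash"
}

# list_remote PEER_ID   (run on the other node)
# Resolves the publishing node's IPNS name and lists what it points to.
list_remote() {
    target="$(ipfs name resolve "/ipns/$1")" || return 1
    ipfs ls "$target"
}
```

node1 would need to run publish_mfs /myproject again after every change, since the published name points at a fixed hash rather than at the live MFS path.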

Also, if I ipfs mount to make the IPFS file system readable through standard command-line tools, is there a way to access the named /myproject path? Or do I have to get the hash and access it that way? e.g. I’d love to be able to ls /ipfs/myproject but so far I seem to have to do ls /ipfs/$(ipfs files stat --hash /myproject)

My understanding at this point comes down to this: IPFS provides a flexible distributed file system, with little-to-no built-in tools for assigning meaning to the contents of the file system. If I want to incorporate meaning, I have to do that myself. Is that accurate?

I think that’s the correct conclusion. It sounds like part of what you need is a version control system? I just posted an experiment of mine. It uses the MFS (ipfs files) to keep track of versions, like git does. It’s not quite usable yet, but could be interesting nonetheless.

Yeah seems like it.

Your response made me think about why I’m interested in IPFS for this sort of thing in the first place. I think it really comes down to two things:

  1. de-duplication of files in IPFS
  2. easy ability to copy a sub-directory

I’ve got a bunch of projects that aren’t necessarily related, but might share some data (they just happen to reference the same media files). Most of the time I don’t need access to the actual project dir – once I’m finished with the project, I archive it. IPFS is attractive for that purpose because I can have multiple versions of multiple projects all stored in a single IPFS without storing any file twice. git will do that too, but then I’d have to keep everything in a single repo, and I wouldn’t be able to split projects out very easily.

With IPFS I can store all of my projects in a single IPFS, but if the storage gets to be too big then I can copy a subset of it over to a different IPFS and remove those references from my main one.

Definitely something to explore more…