Work-plans for kubo, helia, & other Shipyard IPFS projects in 2025

There’s already a lot here and both @hector and @adin shared some good insights. I’d like to add a couple of points.

  • Mainnet as a shared participatory global network is one of the most useful things about IPFS. Alas, Mainnet has had its problems, but recent advancements have alleviated many of them. I’d say we’re about 70% of the way there. Granted, there are inherent trade-offs with the Mainnet approach, like increased latency and the overhead of relying on a “forgetful” DHT. But the global namespace/singleton nature of IPFS Mainnet makes it unique and unlike any other protocol/network out there.
  • We already support a lot of “non-UnixFS” use-cases in the implementations/tooling Shipyard maintains, albeit without support for large blocks and incremental verification. I’ll leave this for the moment, but I agree that this is an important project. We should tackle this collaboratively with a real-world use-case.
  • There’s more work to do to improve developer experience around these use-cases. But in my view, such improvements should find the sweet spot between helping developers adopt CIDs in their protocol/app and providing on-ramps to IPFS Mainnet. For example, if an application developer chooses to adopt DASL CIDs, it should be easy to also make that data available through IPFS Mainnet (by either users or the app builder).
  • CIDs become especially useful if you can retrieve data without special knowledge about how your app leverages them. For this reason, integrations with Mainnet are where a lot of the potential lies, in my opinion.
  • I see a huge opportunity in combining WebSeeds (by which I mean HTTP gateway endpoints with their data announced to the DHT, IPNI, or an app-specific delegated routing endpoint) with some of the emerging use-cases like AT Protocol/Bluesky, so as to make it easier to make data available on Mainnet.
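
To make the “no special knowledge” point concrete, here is a minimal sketch of why any HTTP endpoint can act as a source for CID-addressed data: the client only needs the CID and the bytes to check them. The function name `verify_raw_block` is hypothetical, fetching is left out on purpose, and the sketch assumes the simplest case only (a base32 CIDv1 with the raw codec and a sha2-256 multihash).

```python
import base64
import hashlib

# Header of a CIDv1 with the raw codec and a 32-byte sha2-256 multihash:
# <version 0x01><codec 0x55 raw><hash 0x12 sha2-256><digest length 0x20>
RAW_SHA256_HEADER = bytes([0x01, 0x55, 0x12, 0x20])


def verify_raw_block(cid: str, data: bytes) -> bool:
    """Return True if `data` hashes to the digest embedded in `cid`.

    Assumption: only base32 CIDv1 / raw / sha2-256 is handled here;
    anything else raises ValueError.
    """
    if not cid.startswith("b"):
        raise ValueError("expected a base32 CID (multibase prefix 'b')")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that multibase strips
    raw = base64.b32decode(body)
    if len(raw) != 36 or raw[:4] != RAW_SHA256_HEADER:
        raise ValueError("unsupported CID: expected CIDv1 raw sha2-256")
    return hashlib.sha256(data).digest() == raw[4:]


if __name__ == "__main__":
    # CID of the empty raw block.
    empty_cid = "bafkreihdwdcefgh4dqkjv67uzcmw7ojee6xedzdetojuzjevtenxquvyku"
    print(verify_raw_block(empty_cid, b""))          # bytes match the CID
    print(verify_raw_block(empty_cid, b"tampered"))  # bytes do not match
```

Because the check depends only on the CID, it does not matter whether the bytes came from a gateway, a WebSeed-style HTTP endpoint, or a peer found via the DHT.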

What about UnixFS?

  • UnixFS is mostly useful for representing files and directories.
  • If you are just working with files (no directories), perhaps you can forgo UnixFS and just use hashes with raw data like AT Protocol does for blobs. It may take some work for interop with Mainnet (incremental verification and all other aspects discussed in Supporting Large IPLD Blocks).
  • There are still challenges and drawbacks with UnixFS (some came up in CID Congress) that we need to address:
      • The same data can result in different CIDs, aka the problem of hash equivalency across different systems, or CID determinism (for all the reasons mentioned in Should we profile CIDs?).
      • Much more pre-processing (chunking) is necessary to get CIDs than just generating a raw hash.
      • Some find the dependency on protobufs (especially if they already depend on CBOR) undesirable, particularly in web environments.
      • Once data is merkleised as UnixFS it often needs to be stored twice (this is an implementation detail, not a hard limit).
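
To illustrate the difference between a raw hash and a chunked representation, here is a rough sketch. `raw_cid` builds a CIDv1 (raw codec, sha2-256) directly from the bytes, which is the “just hash the blob” approach. `toy_chunked_root` is a hypothetical stand-in for a chunked DAG: it is not real UnixFS or dag-pb, and only demonstrates that the root hash changes when the chunk size changes even though the data does not.

```python
import base64
import hashlib

RAW_CODEC = 0x55  # multicodec code for "raw"
SHA2_256 = 0x12   # multihash code for sha2-256


def raw_cid(data: bytes) -> str:
    """CIDv1 (raw codec, sha2-256) of `data`, as a base32 string."""
    digest = hashlib.sha256(data).digest()
    # <version><codec><hash code><digest length><digest>; every header value
    # is below 0x80, so each varint fits in a single byte.
    cid_bytes = bytes([0x01, RAW_CODEC, SHA2_256, len(digest)]) + digest
    b32 = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + b32  # "b" is the multibase prefix for lowercase base32


def toy_chunked_root(data: bytes, chunk_size: int) -> str:
    """Toy root hash: sha256 over the concatenated sha256 of each chunk.

    Assumption: this is a simplified illustration, NOT the UnixFS layout.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    leaf_hashes = b"".join(hashlib.sha256(c).digest() for c in chunks)
    return hashlib.sha256(leaf_hashes).hexdigest()


if __name__ == "__main__":
    blob = b"hello world" * 1000
    # One hash over the whole blob: the same bytes always give the same CID.
    print(raw_cid(blob))
    # Same bytes, different chunk sizes: the toy roots differ.
    print(toy_chunked_root(blob, 256))
    print(toy_chunked_root(blob, 1024))
```

The raw approach is deterministic by construction, at the cost of needing the whole blob before it can be verified, which is where the large-block and incremental-verification work comes in.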

Many of these problems are solvable, and the proposed work plans address a good number of them while staying realistic in scope.

Would love to hear any feedback on this.