Those are great questions. I’ll answer these to the best of my knowledge.
It’s hard to answer this question without addressing what IPFS actually is.
So here’s my working definition of IPFS:
IPFS is a set of protocols that enable content discovery, routing and verified transfer using content addressing – the process of addressing data based on hash fingerprints.
In practice, this is achieved by representing data (including files and directories) as Merkle DAGs. UnixFS specifies how to represent a filesystem of files and directories using these Merkle DAGs – both serialised as binary and as structured data in a programming language – covering chunking, tree (DAG) layout, and metadata. For serialisation it uses Protocol Buffers.
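To make "content addressing" and "Merkle DAG" concrete, here's a toy sketch in Python. This is not the real UnixFS or CID format (which uses Protocol Buffers, multihashes, and configurable chunkers) – the chunk size, function names, and root-node encoding below are all simplifications of my own, just to show the shape of the idea:

```python
import hashlib

CHUNK_SIZE = 4  # tiny, for illustration; real chunkers use ~256 KiB chunks

def chunk(data: bytes) -> list[bytes]:
    """Split data into fixed-size chunks (the simplest chunking strategy)."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

def address(block: bytes) -> str:
    """Content address = hash fingerprint of the block's bytes."""
    return hashlib.sha256(block).hexdigest()

def build_dag(data: bytes) -> tuple[str, dict[str, bytes]]:
    """Store leaf chunks plus one root node that links to them by hash."""
    blocks: dict[str, bytes] = {}
    leaf_addrs = []
    for c in chunk(data):
        a = address(c)
        blocks[a] = c  # identical chunks dedupe automatically
        leaf_addrs.append(a)
    # Toy root node: just the child addresses joined together.
    # UnixFS instead encodes links and metadata with Protocol Buffers.
    root = "\n".join(leaf_addrs).encode()
    root_addr = address(root)
    blocks[root_addr] = root
    return root_addr, blocks

root, store = build_dag(b"hello ipfs")
# The same content always yields the same root address,
# and any block fetched from anywhere can be verified against its address.
assert root == build_dag(b"hello ipfs")[0]
```

The key property this demonstrates: because every link is a hash of the linked content, the root address commits to the entire tree, so data can be fetched from any untrusted peer and verified locally.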
(this paragraph is my reading of the history)
At some point, developers wanted to do more than just files and directories with IPFS, and IPLD was born. IPLD became a superset of UnixFS insofar as it generalised many of the ideas from UnixFS and introduced more codecs (in addition to Protocol Buffers) like dag-cbor and dag-json.
Why was IPLD arguably useful as an abstraction? Because IPFS was already pretty good at solving content discovery, routing, and transfer of content-addressed data, so expanding the kinds of data it could move around would unlock new possibilities.
Moreover, it could allow IPFS to interoperate with existing content-addressed (hash-addressed, to be more specific) systems like Git, blockchains, BitTorrent, Docker images, etc. (see this proposal for some of the current constraints that limit this interoperability)
This is all to emphasise that while UnixFS is the oldest use-case for IPFS (and arguably the most useful since everyone’s familiar with files and directories), IPFS itself is all about content addressing and has many different approaches to:
- Content discovery and routing: the DHT with Kademlia and, more recently, network indexers and delegated routing
- Peer-to-peer transfer of content-addressed data: HTTP, Bitswap, and potentially other new protocols yet to be developed
When it comes to implementations, it’s useful to think about IPFS implementations in terms of three key properties:
- Programming language
- Use-case
- Platform
For IPFS to run everywhere and be available to every networked device, e.g. automation robots in a factory, mobile phones, browsers, and large data centres, we need different implementations of IPFS.
The most obvious example is browsers, which, until WASM came around, only supported JavaScript for programmability. Hence the js-ipfs implementation.
Another is low-powered, resource-constrained devices, for which ipfs-embed was developed.
They don’t have to, but UnixFS is the most common form of content-addressed data supported by IPFS, so the answer is likely yes. Alternatives to UnixFS include some of the formats offered by IPLD like dag-json and dag-cbor (the comparison isn’t direct), or WNFS, which improves on UnixFS by introducing hierarchical encryption, metadata, and other features.
How to store blocks is an implementation detail – a very important one, but still an implementation detail. Since current block limits in IPFS are 1–2 MB, any key-value store capable of holding blocks of that size can be used to implement a block store.
Some examples I’m aware of:
- Iroh’s block store uses RocksDB – a key-value store in C++ based on LevelDB (which is used by Bitcoin Core)
- Elastic IPFS stores CAR files and indexes them in DynamoDB
- GitHub - ipfs/go-ds-s3: An s3 datastore implementation
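As a toy illustration of the "block store is just a key-value store" point, here's a minimal Python sketch. The class name, size limit constant, and in-memory dict backend are my own placeholders – real implementations swap the dict for RocksDB, DynamoDB, S3, etc., as in the examples above:

```python
import hashlib

class BlockStore:
    """Toy block store: any key-value store works, keyed by content hash."""

    MAX_BLOCK_SIZE = 2 * 1024 * 1024  # roughly the 1-2 MB IPFS block limit

    def __init__(self):
        # In-memory dict stands in for RocksDB, DynamoDB, S3, ...
        self._kv: dict[str, bytes] = {}

    def put(self, block: bytes) -> str:
        """Store a block under its own hash and return that key."""
        if len(block) > self.MAX_BLOCK_SIZE:
            raise ValueError("block exceeds limit; chunk it first")
        key = hashlib.sha256(block).hexdigest()
        self._kv[key] = block
        return key

    def get(self, key: str) -> bytes:
        """Fetch a block and verify it against its content address."""
        block = self._kv[key]
        # Content addressing makes every read verifiable:
        assert hashlib.sha256(block).hexdigest() == key
        return block

store = BlockStore()
key = store.put(b"some content-addressed data")
assert store.get(key) == b"some content-addressed data"
```

The verify-on-read step in `get` is the point: because keys are derived from content, the store's backend doesn't need to be trusted, which is why it can be freely swapped for whatever key-value storage suits the platform.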
Hope that’s helpful. Would be interesting to get @hector’s perspective on this