Two IPFS addresses for one file - how did this happen?

TLDR: IPFS supports content addressing, and there are multiple ways to hash a file into an address. So while a single CID will only ever verify one specific set of bytes, any given set of bytes can be represented by multiple different CIDs.
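To make that concrete, here is a minimal sketch that builds two CIDv1 strings for the same bytes by hand, changing only the hash function. It assumes the simplest case (the "raw" codec, i.e. hashing the bytes directly with no UnixFS wrapping) and the standard multicodec code points for raw, SHA2-256 and SHA2-512. The function name `raw_cidv1` is made up for illustration and is not part of any IPFS library.

```python
import base64
import hashlib

CID_VERSION = 0x01   # CIDv1
RAW_CODEC = 0x55     # multicodec code for "raw" bytes
# Multihash codes for the two hash functions used here (assumed from the multicodec table).
MULTIHASH_CODES = {"sha256": 0x12, "sha512": 0x13}


def raw_cidv1(data: bytes, hash_name: str) -> str:
    """Return a base32 CIDv1 string for `data` using the raw codec (illustrative helper)."""
    digest = hashlib.new(hash_name, data).digest()
    # multihash = <hash code><digest length><digest>; all values here fit in one byte.
    multihash = bytes([MULTIHASH_CODES[hash_name], len(digest)]) + digest
    cid_bytes = bytes([CID_VERSION, RAW_CODEC]) + multihash
    # Multibase prefix "b" means lowercase base32 without padding.
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


data = b"hello"
cid_sha256 = raw_cidv1(data, "sha256")
cid_sha512 = raw_cidv1(data, "sha512")
print(cid_sha256)
print(cid_sha512)
```

Both strings address exactly the same five bytes, but they are different CIDs because the hash function is part of the address.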

Some examples:

  • You could use a different hash function (e.g. SHA2-256, SHA2-512, SHA3-256, Blake3, etc.)
  • You could use the UnixFS specification to encode your files/folders instead of just hashing the file bytes. Within the IPFS ecosystem this is predominantly how files are handled, unless they’re small (e.g. under 2MiB), in which case the address might just be a hash of the raw file bytes.
  • While using UnixFS you could choose any number of possible ways to ingest your file
    • You could use fixed-size chunks that are smaller (e.g. 256KiB) or larger (e.g. 1MiB), use content-based chunkers such as those based on Rabin fingerprints, give the tree a larger or smaller fanout/depth for larger files, etc.
    • People even come up with interesting content/application-specific chunking schemes, like IPFS Custom File Chunking for WARC and WACZ, all of which are compatible with and readable by IPFS applications that implement UnixFS.
    • You can use https://dag.ipfs.tech to visualize a few of the possible configuration options
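The effect of chunk size can be sketched with a toy model: hash each fixed-size chunk, then hash the concatenated chunk digests to get a "root". This is not real UnixFS (which wraps chunks in dag-pb nodes and produces CIDs), and `toy_root_digest` is a hypothetical helper, but it shows why the same bytes end up with a different root when the chunking changes.

```python
import hashlib


def toy_root_digest(data: bytes, chunk_size: int) -> str:
    """Toy stand-in for a Merkle DAG root; NOT the real UnixFS/dag-pb encoding."""
    # Hash every fixed-size chunk of the input (the "leaves").
    leaf_digests = [
        hashlib.sha256(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]
    # The root commits to the ordered list of leaf digests.
    return hashlib.sha256(b"".join(leaf_digests)).hexdigest()


# 2 MiB of deterministic sample bytes.
data = bytes(range(256)) * 8192

root_256k = toy_root_digest(data, 256 * 1024)   # 8 leaves
root_1m = toy_root_digest(data, 1024 * 1024)    # 2 leaves

print(root_256k)
print(root_1m)
print("same root?", root_256k == root_1m)
```

The leaves differ in number and content between the two runs, so the roots differ even though the underlying file is identical. Real UnixFS importers behave analogously, with the extra variables (chunker, fanout, codec) listed above.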

Note: Some people are interested in enumerating the most common “profiles” (ways of encoding files/folders into UnixFS); see the post “Should we profile CIDs?”.