Hi.
Lurking in the GitHub repos, I see that a huge effort is under way to upgrade the default CID from CIDv0 to CIDv1.
I’m trying to wrap my head around CIDv0 and CIDv1, and around the Multiformats “stack” as a whole. What I understand from https://multiformats.io/ and https://github.com/multiformats/cid is that, if you have a block and want a CIDv1 out of it, you must do the following:
Hash_function(block) = a hash (as a binary)
Concat(hash_function_code, digest_length, hash) = a multihash (as a binary)
(Here, if the hash is sha256 and the length is 32, it is almost a CIDv0. We just have to encode it in base58btc, right?)
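To make the two steps above concrete, here is a minimal Python sketch of how I picture them. The function names are my own, not taken from any library; 0x12 is the sha2-256 code from the multihash table, and the base58btc encoder is hand-rolled for illustration. The input is whatever bytes make up the block, so the result is not necessarily what a given tool would print for a file.

```python
import hashlib

# Bitcoin-style base58 alphabet (no 0, O, I, l)
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58btc(data: bytes) -> str:
    """Hand-rolled base58btc encoder (illustrative helper)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, r = divmod(n, 58)
        out = B58_ALPHABET[r] + out
    # Each leading zero byte is represented by a leading '1'
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out

def sha256_multihash(block: bytes) -> bytes:
    """Concat(hash_function_code, digest_length, hash) for sha2-256."""
    digest = hashlib.sha256(block).digest()
    # 0x12 = sha2-256, len(digest) = 32 = 0x20; both fit in one byte
    return bytes([0x12, len(digest)]) + digest

def cidv0(block: bytes) -> str:
    """A CIDv0 as I understand it: the bare multihash in base58btc."""
    return base58btc(sha256_multihash(block))

if __name__ == "__main__":
    print(cidv0(b"hello"))
```

If this is right, it would explain why every CIDv0 starts with "Qm": the first two bytes are always 0x12 0x20.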
Since we want a CIDv1:
Concat( multicodec_code_for_multihashes, multihash) = a multicodec (specifically a multihash's multicodec, as a binary)
(and the multicodec_code_for_multihashes = 0x31)
Concat("0x01", multicodec) = something almost useful (as binary)
(and “0x01” is the version of the CID)
Encode(previous binary) = a string of characters
Concat(code_for_this_encoding, previous string) = a multibase, and more specifically a CIDv1 (as a string)
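The steps above can be sketched in Python as follows. This is only my reading of the layout, under a few assumptions: the codec byte is left as a parameter because I'm not sure which value belongs there (0x55, the `raw` entry in the multicodec table, is used purely for illustration), the version and codec are small enough to be written as single bytes, and the encoding is base32 (lowercase, no padding), whose multibase prefix is "b". The function names are my own.

```python
import base64
import hashlib

def sha256_multihash(block: bytes) -> bytes:
    """Concat(hash_function_code, digest_length, hash) for sha2-256."""
    digest = hashlib.sha256(block).digest()
    return bytes([0x12, len(digest)]) + digest

def cidv1_bytes(block: bytes, codec: int) -> bytes:
    """Concat(version, codec, multihash), all as binary.

    Assumes version and codec are below 0x80, so each fits in one byte.
    """
    return bytes([0x01, codec]) + sha256_multihash(block)

def cidv1(block: bytes, codec: int = 0x55) -> str:
    """Encode the binary CID and prepend the multibase prefix."""
    raw = cidv1_bytes(block, codec)
    # RFC 4648 base32, lowercased and with '=' padding stripped
    encoded = base64.b32encode(raw).decode("ascii").lower().rstrip("=")
    return "b" + encoded

if __name__ == "__main__":
    print(cidv1(b"hello"))
```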
So to sum up, for a particular block of data:
- there is only one CIDv0
- there are a lot of CIDv1
- the different CIDv1s depend on the hash function, digest length, encoding type choice
- If we set aside the “0x01” for the CID version, a CIDv0 is just a particular flavour of CIDv1: the one with the base58btc-encoded, untruncated sha256 hash (which is 32 bytes long)
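To illustrate the "a lot of CIDv1" point, here is a small sketch that builds several strings for the same block by varying the hash function and the encoding. As before, the names are my own, the 0x55 codec byte is only a placeholder assumption, and 0x12 / 0x13 are the sha2-256 / sha2-512 codes from the multihash table; "b" and "f" are the multibase prefixes for lowercase base32 and base16.

```python
import base64
import hashlib

HASH_CODES = {"sha256": 0x12, "sha512": 0x13}

def multihash(block: bytes, name: str) -> bytes:
    """Concat(hash_function_code, digest_length, hash)."""
    digest = hashlib.new(name, block).digest()
    return bytes([HASH_CODES[name], len(digest)]) + digest

def cidv1_bytes(block: bytes, name: str, codec: int = 0x55) -> bytes:
    """Binary CIDv1; the codec value is an illustrative assumption."""
    return bytes([0x01, codec]) + multihash(block, name)

def as_base32(raw: bytes) -> str:
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")

def as_base16(raw: bytes) -> str:
    return "f" + raw.hex()

def variants(block: bytes) -> list[str]:
    """Every combination of hash function and encoding for one block."""
    out = []
    for name in HASH_CODES:
        raw = cidv1_bytes(block, name)
        out.append(as_base32(raw))
        out.append(as_base16(raw))
    return out

if __name__ == "__main__":
    for cid in variants(b"hello"):
        print(cid)
```

The two strings built from the same hash function decode to the same bytes and differ only in how they are written, while switching the hash function changes the bytes themselves.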
Is everything above correct?