What is the formula to calculate cid from sha256 hash

mctrivia · June 12, 2023, 12:02pm

For a new project I am using libIPFS to provide an IPFS interface inside a C++ application. It works well, but it is missing some functions I am used to having. The biggest one: if I have the SHA256 of some data, how do I get its CID?

For example, if I know the SHA256, how do I get to bafkreicr2pggml4j5bjv3hhxi5i5ud4rgnnaqphrfs2mtoub77zfiwbhju

Sorry, I forgot what the SHA256 for this example was, but I know this is a CID I have gotten in the past from a SHA256.

Jorropo · June 12, 2023, 12:20pm

CIDs are not plain hashes of the data; we first chunk the data and lay it out in a UnixFS Merkle tree.
This enables parallel incremental verification, streaming, and seeking.

If you really want to use your sha256 hashes as-is there is a PoC of doing incremental verification here:

Supporting Large IPLD Blocks

Note that sha256 is not a Merkle tree, so the performance ceiling is worse (hopefully not that bad) and there would be no support for seeking, fetching random file offsets, or streaming the file (all of which a Merkle tree can do). Actually, this would stream the file, but backward.
This would need some work but could be done.

mctrivia · June 12, 2023, 3:27pm

You're half correct. In my particular situation the data was already hashed and the old hashes need to be used. Raw mode allows the sha256 to be used directly, and the CID I gave is for data that was added this way.

Jorropo · June 12, 2023, 3:40pm

Raw nodes are limited to 2MiB. That is why I linked Supporting Large IPLD Blocks, which lets you both reuse your old hashes and support blocks of any size.

mctrivia · June 12, 2023, 4:58pm

2MB is more than enough for JSON data. Either way, I don't see the math for converting from SHA256 to CID anywhere in your link.

Discordian · June 12, 2023, 6:41pm

Does this help you? It’s the specification (and explanation) of CIDv1: GitHub - multiformats/cid: Self-describing content-addressed identifiers for distributed systems

If needed, I can try to dive deeper with you.

mctrivia · June 12, 2023, 6:57pm

From a quick skim, this looks like what I am looking for. Will dive into it in more depth now. Thanks.

mctrivia · June 13, 2023, 2:59am

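Thanks, it turned out to be pretty simple. If anyone is curious:

"b" + base32(0x01551220 + hash + 0b00)

Here + is concatenation. 0x01551220 is the CIDv1 prefix: version 1, raw codec (0x55), sha2-256 multihash code (0x12), and digest length 0x20 (32 bytes). The two trailing zero bits pad the 288-bit input up to a multiple of 5 bits, and the base32 is lowercase with no = padding.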
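A minimal Python sketch of the formula above, for anyone who wants to check it. The function name `cid_from_sha256` is made up for illustration and is not part of libIPFS or any IPFS library. It assumes the input is a hex-encoded SHA-256 digest of a single raw block. The standard library's base32 encoder handles the zero-bit fill of the last character on its own, so only the `=` padding has to be stripped.

```python
import base64
import hashlib

# CIDv1 prefix for a single raw block hashed with sha2-256:
#   0x01 = CID version 1
#   0x55 = multicodec "raw"
#   0x12 = multihash code for sha2-256
#   0x20 = digest length (32 bytes)
RAW_SHA256_PREFIX = bytes([0x01, 0x55, 0x12, 0x20])


def cid_from_sha256(digest_hex: str) -> str:
    """Build a base32 CIDv1 (raw codec) from a hex-encoded SHA-256 digest."""
    digest = bytes.fromhex(digest_hex)
    if len(digest) != 32:
        raise ValueError("a SHA-256 digest must be exactly 32 bytes")
    payload = RAW_SHA256_PREFIX + digest
    # RFC 4648 base32, lowercased, with the '=' padding stripped.
    body = base64.b32encode(payload).decode("ascii").lower().rstrip("=")
    # 'b' is the multibase prefix for lowercase base32.
    return "b" + body


if __name__ == "__main__":
    # Hypothetical sample data; any bytes that fit in one raw block work.
    data = b"hello world\n"
    print(cid_from_sha256(hashlib.sha256(data).hexdigest()))
```

Every CID produced this way is 59 characters long and starts with `bafkrei`, which matches the example CID in the first post.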

Jorropo · June 13, 2023, 10:21am

This is only valid for single block raw sha256 files. Discord ( ipfs ) also explains how to get other hashes from the table.

For multiblock graphs there is no solution.
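To tell whether a given CID falls into the single-block raw sha256 case, one option is to decode it and compare the first four bytes against 0x01551220. A rough Python sketch, under the assumption that the CID uses the lowercase base32 multibase (the `b` prefix); `sha256_from_raw_cid` is a hypothetical helper name, not a function from any IPFS library:

```python
import base64

# version 1, raw codec (0x55), sha2-256 (0x12), 32-byte digest (0x20)
RAW_SHA256_PREFIX = bytes([0x01, 0x55, 0x12, 0x20])


def sha256_from_raw_cid(cid: str) -> str:
    """Return the hex SHA-256 digest embedded in a base32 raw-block CIDv1.

    Raises ValueError for anything else (other multibase, codec, or hash).
    """
    if not cid.startswith("b"):
        raise ValueError("only lowercase base32 ('b' multibase prefix) is handled")
    body = cid[1:].upper()
    # Restore the '=' padding that CIDs omit so b32decode accepts the input.
    body += "=" * (-len(body) % 8)
    payload = base64.b32decode(body)
    if len(payload) != 36 or payload[:4] != RAW_SHA256_PREFIX:
        raise ValueError("not a single-block raw sha2-256 CIDv1")
    return payload[4:].hex()


if __name__ == "__main__":
    # The CID from the first post; prints the SHA-256 it was built from.
    print(sha256_from_raw_cid(
        "bafkreicr2pggml4j5bjv3hhxi5i5ud4rgnnaqphrfs2mtoub77zfiwbhju"
    ))
```

A CID that starts with `bafybei` carries the dag-pb codec (0x70) instead of raw, so this check rejects it. The digest inside such a CID is the hash of the root node of the graph, not of the file contents.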

mctrivia · June 13, 2023, 10:48am

That's all that matters when looking up data that was already hashed before IPFS was around.
