How to convert from hex to CID

Let’s say I get a CID when I push a file to IPFS. Now I need to store it in a Solidity bytes variable. Because it is a bytes variable, I can’t send the CID string directly, as the call will fail. So I transform the CID into hex, depending on which base the CID uses, and that hex goes to Solidity.

After some time, the contract emits an event, which I catch in another client (JavaScript). Now I need to somehow get the CID back from this hex. How do I do this? The tricky part is that I don’t know which base was used when the hex was produced, which means I could have received the hex of a CIDv0 or a CIDv1.

What would be the right way? I definitely need to know which base was used so that I can decode from hex to CID again.

Any ideas?

You should just think of a CID as a string. Most of them start with “Qm”, but not all. Those first two characters do encode everything about the algorithm, etc. So if you don’t worry about the details and think of it as a big world-wide hashmap where every “key” is a “string”, then you’re done. That’s it. You’re probably just overthinking it (if I can say that politely). :slight_smile: Good luck. Correct me if I totally misunderstood.

@wclayf

The problem is that in the smart contract my parameter type is bytes and I can’t change it, so I have to pass a bytes value to the contract. The CID Qm... is not of that type, so I need to convert it to hex first, and then that hex gets sent to the contract.

If I send the hex of the string directly and then, in my other client, transform from hex back to string, it will work, but I am also trying to optimize things…

So, let’s say our CID is QmRLVtwER37MfXvrxEmCFpARhH32s8u1ydcbDpNPSNPi1e. We can transform this into hex in 2 ways:

WAY 1: hex-encode the characters of the string, as on this website - Best String to Hex Converter Online to Convert Text to Hex.
WAY 2: base58-decode the string and hex-encode the resulting bytes, as on this website - Bitcoin Base58 Encoder, Decoder, and Validator

If you do both, you will notice that the second way produces a shorter hex, which would be much better. But if I use WAY 2, I have a problem in my other client, where I have to decode: a plain hex-to-string conversion won’t work, because I would need to turn the hex back into bytes and base58-encode them. At that point I won’t know whether the hex came from base58 or from some other base, because all I have is the hex.
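To make the size difference concrete, here is a minimal sketch in plain Node.js (no IPFS library) that hex-encodes the same CIDv0 both ways. The `base58Decode` helper is hypothetical and hand-rolled for illustration only; in real code a maintained library would be the safer choice.

```javascript
// Bitcoin-style base58 alphabet, which CIDv0 ("Qm...") strings use.
const BASE58_ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

// Hypothetical helper (not from any library): decode base58 text to bytes.
function base58Decode (text) {
  let value = 0n
  for (const char of text) {
    const digit = BASE58_ALPHABET.indexOf(char)
    if (digit === -1) throw new Error('Invalid base58 character: ' + char)
    value = value * 58n + BigInt(digit)
  }
  // Each leading '1' stands for one leading zero byte.
  let zeros = 0
  while (zeros < text.length && text[zeros] === '1') zeros++
  let hex = value.toString(16)
  if (hex.length % 2) hex = '0' + hex
  const body = value === 0n ? Buffer.alloc(0) : Buffer.from(hex, 'hex')
  return Buffer.concat([Buffer.alloc(zeros), body])
}

const cidText = 'QmRLVtwER37MfXvrxEmCFpARhH32s8u1ydcbDpNPSNPi1e'

// WAY 1: hex of the characters of the string (46 characters, 46 bytes).
const asciiHex = Buffer.from(cidText, 'utf8').toString('hex')

// WAY 2: hex of the bytes the base58 text stands for (a 34-byte multihash).
const multihashHex = base58Decode(cidText).toString('hex')

console.log(asciiHex.length / 2, 'bytes via WAY 1')
console.log(multihashHex.length / 2, 'bytes via WAY 2')
```

WAY 2 is shorter because it stores the multihash itself (which starts with 0x12 0x20 for a sha2-256 CIDv0) rather than the text that represents it.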

Does it make sense ?

I think maybe you just want to convert the “Base58 string” (CID) directly to ‘bytes’ array, and then back, and not ever convert to any hex. :slight_smile:

@wclayf

There’re 2 ways. 1 - as you said, bytes array, and 2 - using hexes.

Way 1:

// UTF-8 bytes of the CID text (one byte per character)
var buffer = Buffer.from('QmRLVtwER37MfXvrxEmCFpARhH32s8u1ydcbDpNPSNPi1e')
var myBuffer = [...buffer] // plain array of byte values
await contract.methods.functionName(myBuffer).send({from: from})

Way 2:

var buffer = Buffer.from('QmRLVtwER37MfXvrxEmCFpARhH32s8u1ydcbDpNPSNPi1e')
var hex = '0x' + buffer.toString('hex') // 0x prefix so the value is read as hex-encoded bytes
await contract.methods.functionName(hex).send({from: from})

My thinking is that both produce the same gas costs: in Way 1, even though myBuffer looks much smaller than the hex string in Way 2, I think it still gets sent as the same hex value, so the gas costs stay the same and there is nothing to gain here.
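A small sketch of why the two ways should carry the same payload: the array and the hex string are two notations for one byte sequence. This only compares the bytes in plain Node.js; that the resulting gas cost is identical for a given web3 version is an assumption that is not checked here.

```javascript
const cidText = 'QmRLVtwER37MfXvrxEmCFpARhH32s8u1ydcbDpNPSNPi1e'

// Way 1: the text's bytes as a plain array of numbers.
const asArray = [...Buffer.from(cidText)]

// Way 2: the same bytes written as a 0x-prefixed hex string.
const asHex = '0x' + Buffer.from(cidText).toString('hex')

// Turning the hex string back into bytes yields the identical sequence.
const fromHex = [...Buffer.from(asHex.slice(2), 'hex')]
const samePayload =
  asArray.length === fromHex.length &&
  asArray.every((byte, i) => byte === fromHex[i])

console.log(asArray.length, fromHex.length, samePayload)
```

The hex string only looks longer because each byte is written as two characters; the number of bytes that reach the contract is the same.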

Makes sense ?

IMO you should always keep it as a string, unless some JS API you have no control over insists on using a byte array. But in every case I can think of converting to HEX is wasteful and accomplishes nothing.

If I don’t convert it to hex, then how do I do it? :smiley:

The smart contract only accepts the parameter as bytes and I can’t change that now. So in JS I have this CID - QmWJMqj5WhaFeUezsrq9ebm2h9Z8RYq5WNtpM8t96xDLtF . As I said, the smart contract doesn’t accept this, so what value do I send to the contract?

@wclayf

Post a link to the docs for the API you’re calling and I can try to help. I don’t know what that ‘send()’ method is expecting.

@wclayf

This is the contract: contract.sol · GitHub. We need to call the setBytes function.

This is my CID: QmcGPbKtsw8o6ezZ821HHRZuLusnUoXYU9Ddjq81WSYM3n

I am using web3; here is the link - web3.eth.Contract — web3.js 1.0.0 documentation

Let me know if anything else is needed. send does nothing other than specify which account the call should be made from (if you understand how smart contracts + Ethereum work).

The ‘from’ in that javascript is the identity of the person doing the save, and the other code (how to store a string/bytes on Ethereum) is totally unrelated to IPFS.

Yes, it’s unrelated to IPFS, but I still thought that someone who knows both IPFS and Ethereum might be able to help.

Hi @novaknole, I’ve added basic docs in docs: cid conversion by lidel · Pull Request #734 · ipfs/ipfs-docs · GitHub but posting below in case there are follow-up questions. Hopefully this will help reduce complexity of your code.


CID itself should provide you with everything you need. You can represent it as text or binary.

CIDv1 in its text representation is self-describing and includes information about the base encoding:

bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi

In the case of the smart contract you describe, the binary representation indeed makes more sense.
In JS you can leverage the cids library to normalize any v0 to v1 and extract the raw bytes of the entire CIDv1 as a Uint8Array:

const CID = require('cids')
// Parse the CID (v0 or v1) and normalize it to CIDv1
const cid = new CID('bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi').toV1()
const binaryCid = cid.bytes // Uint8Array

Those are the bytes after the multibase prefix (it is only present in the text form). See cid spec for more details.
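To illustrate what those bytes contain, here is a minimal sketch in plain Node.js that splits a binary CIDv1 into its parts: version, content codec, hash function code, digest length, and digest. The `readVarint` and `parseCidV1` helpers are hypothetical names written for this example and are not part of the cids library; the input is the hex form of the example CID used in this thread.

```javascript
// Binary form of the example CID (its base16 text without the 'f' prefix).
const cidBytes = Buffer.from(
  '01701220c3c4733ec8affd06cf9e9ff50ffc6bcd2ec85a6170004bb709669c31de94391a',
  'hex'
)

// Hypothetical helper: read one unsigned varint starting at `offset`.
function readVarint (bytes, offset) {
  let value = 0
  let shift = 0
  let position = offset
  while (true) {
    if (position >= bytes.length) throw new Error('Truncated varint')
    const byte = bytes[position++]
    value += (byte & 0x7f) * 2 ** shift
    shift += 7
    if ((byte & 0x80) === 0) return { value, next: position }
  }
}

// Hypothetical helper: split CIDv1 bytes into their self-describing fields.
function parseCidV1 (bytes) {
  const version = readVarint(bytes, 0)
  if (version.value !== 1) throw new Error('Expected CID version 1')
  const codec = readVarint(bytes, version.next)
  const hashFunction = readVarint(bytes, codec.next)
  const digestLength = readVarint(bytes, hashFunction.next)
  const digest = bytes.slice(digestLength.next, digestLength.next + digestLength.value)
  if (digest.length !== digestLength.value) throw new Error('Truncated digest')
  return {
    version: version.value,
    codec: codec.value,
    hashFunction: hashFunction.value,
    digest
  }
}

const parsed = parseCidV1(cidBytes)
console.log(parsed.version, parsed.codec.toString(16), parsed.hashFunction.toString(16), parsed.digest.length)
```

For this CID the fields are version 1, codec 0x70 (dag-pb), hash function 0x12 (sha2-256) and a 32-byte digest. No multibase information appears anywhere in the bytes.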

I suggest including an explicit normalization step via .toV1() because it removes any ambiguity when an older CIDv0 is used, and you only need to worry about supporting v1 in your codebase (which will become default soon).

Be mindful that if a CID comes from a third party, it can use a custom hashing function and the length may vary. Your code responsible for storing the binary representation should account for that.
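At the byte level, the normalization can be pictured as follows. This is a sketch in plain Node.js, not the cids library's implementation: a CIDv0 is a bare sha2-256 multihash, and its CIDv1 form adds the version byte 0x01 and the dag-pb codec 0x70 in front. The helper name `cidV0BytesToV1Bytes` is hypothetical.

```javascript
// Multihash of the example CID: 0x12 (sha2-256), 0x20 (32-byte digest), then the digest.
const multihash = Buffer.from(
  '1220c3c4733ec8affd06cf9e9ff50ffc6bcd2ec85a6170004bb709669c31de94391a',
  'hex'
)

// Hypothetical helper: prepend version 1 (0x01) and the dag-pb codec (0x70),
// which every CIDv0 implies, to obtain the CIDv1 bytes.
function cidV0BytesToV1Bytes (multihashBytes) {
  const isCidV0 =
    multihashBytes.length === 34 &&
    multihashBytes[0] === 0x12 &&
    multihashBytes[1] === 0x20
  if (!isCidV0) {
    throw new Error('Not a CIDv0 multihash (expected sha2-256 with a 32-byte digest)')
  }
  return Buffer.concat([Buffer.from([0x01, 0x70]), multihashBytes])
}

const v1Hex = cidV0BytesToV1Bytes(multihash).toString('hex')
console.log(v1Hex)
```

Storing the v1 form costs two extra bytes compared with the bare multihash, in exchange for a value that describes its own version and codec.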


PS. If you really want the CID as HEX (e.g. to show it in this form for debugging), you can either convert cid.bytes (Uint8Array) to HEX by hand, or… (this one is a bit silly, but fun) use the built-in support for base16 encoding and skip the f (multibase prefix):

> cid.toString('base16')
'f01701220c3c4733ec8affd06cf9e9ff50ffc6bcd2ec85a6170004bb709669c31de94391a'

> cid.toString('base16').substring(1)
'01701220c3c4733ec8affd06cf9e9ff50ffc6bcd2ec85a6170004bb709669c31de94391a' // "cid as hex"

If you want to convert such HEX to a valid CID, prepend it with f (if all lowercase) or F (if all uppercase). You can also use the cids lib to convert it to base32 and get the original text representation (this does not matter in practice, as both point at the same multihash).
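For readers who want to see what the base32 conversion involves, here is a minimal sketch in plain Node.js that turns the "cid as hex" value back into the base32 text form. The `base32Encode` helper is hypothetical and hand-rolled for illustration (lowercase RFC 4648 alphabet, no padding, with the b multibase prefix added in front); in practice the cids library does this for you.

```javascript
// Lowercase RFC 4648 base32 alphabet, used by the 'b' multibase prefix.
const BASE32_ALPHABET = 'abcdefghijklmnopqrstuvwxyz234567'

// Hypothetical helper: encode bytes as unpadded lowercase base32.
function base32Encode (bytes) {
  let bits = 0
  let value = 0
  let out = ''
  for (const byte of bytes) {
    value = (value << 8) | byte
    bits += 8
    while (bits >= 5) {
      out += BASE32_ALPHABET[(value >>> (bits - 5)) & 31]
      bits -= 5
    }
    value &= (1 << bits) - 1 // keep only the bits not yet emitted
  }
  // Left-align any remaining bits in a final 5-bit group.
  if (bits > 0) out += BASE32_ALPHABET[(value << (5 - bits)) & 31]
  return out
}

const cidHex = '01701220c3c4733ec8affd06cf9e9ff50ffc6bcd2ec85a6170004bb709669c31de94391a'

// Valid base16 CID text: multibase prefix 'f' + lowercase hex.
const base16Cid = 'f' + cidHex

// Valid base32 CID text: multibase prefix 'b' + base32 of the same bytes.
const base32Cid = 'b' + base32Encode(Buffer.from(cidHex, 'hex'))

console.log(base16Cid)
console.log(base32Cid)
```

Both strings name the same content; only the text encoding differs.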

:sparkles:

@lidel

Thanks a lot for the nice explanation. Though, I have a couple of points/questions.

  1. Let’s say I take cid.bytes and send it to the smart contract. Other clients (JS) can then read it from the smart contract. The idea is that those clients should be able to fetch whatever is stored on IPFS, so they will need to convert cid.bytes to a CID again. How can they do that? They won’t be able to get the CID again because they don’t know which multibase encoding to use, since cid.bytes doesn’t contain the multibase at all. Makes sense? CIDv1 can use many different multibases, so this solution (sending cid.bytes) doesn’t seem correct to me. What do you think?

  2. I think, one way or another, the hex (base16) solution seems better than the previous one, because I know that the other clients that fetch from the smart contract can prepend f to whatever they get. Though I am not sure how to get the CID from this - f01701220c3c4733ec8affd06cf9e9ff50ffc6bcd2ec85a6170004bb709669c31de94391a This hex is the one that you posted above.

convert cid.bytes to cid again. How can they do that ?

You can create a CID instance from a Uint8Array and then serialize it to string via new CID(bytes).toV1().toString() (docs)

They won’t be able to get the cid again because they don’t know which multibase encode to use since the cid.bytes doesn’t contain multibase at all.

Multibase does not matter, it is just relevant to text representation, does not change the meaning behind the CID, all base variants point at the same hash.

Unless you have a preference to always use the original multibase, or use custom one for some reason, just use the default base.
FYI cid.toString() for CIDv1 is the same as cid.toString('base32').

I am not sure how to get the cid from this - f01701220c3c4733ec8affd06cf9e9ff50ffc6bcd2ec85a6170004bb709669c31de94391a This hex is the one that you posted above.

You already have it.
The f0170.. is a valid CIDv1 encoded in base16 :slight_smile:

The key takeaway here is that multibase does not change the underlying identifier, it only changes the way it is represented in text. Multibase impacts length, character set, case-sensitivity, but does not modify the underlying bytes with a cryptographic identifier.
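This takeaway can be checked with a short sketch in plain Node.js: decode the base16 and base32 text forms of the example CID and compare the resulting bytes. The `base32Decode` and `decodeCidText` helpers are hypothetical, written for this example, and handle only the f, F and b multibase prefixes.

```javascript
const BASE32_ALPHABET = 'abcdefghijklmnopqrstuvwxyz234567'

// Hypothetical helper: decode unpadded lowercase base32 text to bytes.
function base32Decode (text) {
  let bits = 0
  let value = 0
  const out = []
  for (const char of text) {
    const digit = BASE32_ALPHABET.indexOf(char)
    if (digit === -1) throw new Error('Invalid base32 character: ' + char)
    value = (value << 5) | digit
    bits += 5
    if (bits >= 8) {
      out.push((value >>> (bits - 8)) & 0xff)
      bits -= 8
      value &= (1 << bits) - 1 // keep only the bits not yet emitted
    }
  }
  return Buffer.from(out) // trailing padding bits are dropped
}

// Hypothetical helper: strip the multibase prefix and decode the rest.
function decodeCidText (text) {
  const prefix = text[0]
  const rest = text.slice(1)
  if (prefix === 'f' || prefix === 'F') return Buffer.from(rest.toLowerCase(), 'hex')
  if (prefix === 'b') return base32Decode(rest)
  throw new Error('Unsupported multibase prefix in this sketch: ' + prefix)
}

const fromBase16 = decodeCidText('f01701220c3c4733ec8affd06cf9e9ff50ffc6bcd2ec85a6170004bb709669c31de94391a')
const fromBase32 = decodeCidText('bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi')

const sameBytes = fromBase16.equals(fromBase32)
console.log(fromBase16.length, fromBase32.length, sameBytes)
```

Since both texts decode to the same 36 bytes, a client that reads the bytes from the contract is free to print the CID in whichever base it prefers.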

Finally makes sense. I think I was confused about multibase, and what you explained now finally makes sense.

Thanks a lot @lidel, and thanks for the amazing cids library.

Thanks for this discussion, cleared up a lot for me.
For more context on this, I find this answer useful too: solidity - How to store IPFS hash using bytes32? - Ethereum Stack Exchange