Best practices for storing CIDv1s in EVM smart contracts

I’m trying to compile best practices for storing CIDv1s on EVM chains.

Just the hash

I saw some cases where people store only the sha256 digest because it fits into a single bytes32. The CID version, codec, and hash function are then implied rather than stored.
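As a rough illustration of what "just the hash" means off-chain, here is a minimal Python sketch that pulls the 32-byte digest out of a base32 CIDv1 and rebuilds the CID from it. The function names are made up for this example, and it assumes a single-byte codec (such as dag-pb, 0x70) and sha2-256; other CIDs would need a proper varint parser.

```python
import base64

def cid_to_bytes(cid: str) -> bytes:
    """Decode a base32 CIDv1 string (multibase prefix 'b') to binary."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles base32 ('b') CIDs")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that multibase omits
    return base64.b32decode(body)

def sha256_digest(cid: str) -> bytes:
    """Return the 32-byte digest, i.e. the value that fits in a bytes32."""
    raw = cid_to_bytes(cid)
    # Assumed layout: version, codec, hash function, digest length, digest.
    # Version and codec are varints; one byte each only holds for small codes.
    if len(raw) != 36 or raw[0] != 0x01 or raw[2] != 0x12 or raw[3] != 0x20:
        raise ValueError("expected CIDv1 with a single-byte codec and sha2-256")
    return raw[4:]

def cid_from_digest(digest: bytes, codec: int = 0x70) -> str:
    """Rebuild the CID string; the codec (default dag-pb) must be known off-chain."""
    raw = bytes([0x01, codec, 0x12, 0x20]) + digest
    return "b" + base64.b32encode(raw).decode().lower().rstrip("=")

cid = "bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi"
digest = sha256_digest(cid)
print("0x" + digest.hex())             # value to pass as a bytes32
print(cid_from_digest(digest) == cid)  # round trip, given the right codec
```

The trade-off is visible in cid_from_digest: the codec is a parameter, so a reader of the contract has to know it from somewhere else.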

The whole CID as binary

Others suggest concatenating two bytes32 values (see https://ethereum.stackexchange.com/questions/17094/how-to-store-ipfs-hash-using-bytes32/17112#17112), since storage gas is charged per 32-byte slot anyway.
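For a sense of how the binary CID would split across two 32-byte words, here is a small Python sketch. The helper names are hypothetical, and it assumes a base32 CIDv1 whose binary form is at most 64 bytes (a dag-pb/sha2-256 CID is 36 bytes).

```python
import base64

def cid_to_bytes(cid: str) -> bytes:
    """Decode a base32 CIDv1 string (multibase prefix 'b') to binary."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles base32 ('b') CIDs")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that multibase omits
    return base64.b32decode(body)

def pack_words(raw: bytes):
    """Split a binary CID into two zero-padded 32-byte words."""
    if len(raw) > 64:
        raise ValueError("binary CID does not fit into two 32-byte words")
    padded = raw.ljust(64, b"\x00")
    return padded[:32], padded[32:]

def unpack_words(word0: bytes, word1: bytes, length: int) -> bytes:
    """Reverse pack_words; the original length has to be known or stored."""
    return (word0 + word1)[:length]

cid = "bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi"
raw = cid_to_bytes(cid)
word0, word1 = pack_words(raw)
print(len(raw))            # size of the binary CID in bytes
print("0x" + word0.hex())  # first bytes32
print("0x" + word1.hex())  # second bytes32, mostly padding
```

Most of the second word is padding, which is the cost of keeping the version, codec, and hash function on-chain.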

Storing as a string

The old NFT Storage docs recommended storing IPFS URI strings, and so do the IPFS docs.

A base32-encoded CID string with the IPFS URI scheme, like ipfs://bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi, is 66 bytes long, so its data occupies three 32-byte storage slots (plus one more slot that holds the string length).
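To compare the options side by side, here is a short Python sketch of the slot arithmetic. It assumes Solidity's documented storage layout for string and bytes: values of up to 31 bytes share one slot with their length, and longer values use one slot for the length plus one slot per 32 bytes of data. The function name is made up for this example.

```python
def dynamic_slots(length: int) -> int:
    """Storage slots used by a Solidity string/bytes value of this length."""
    if length <= 31:
        return 1  # short values are stored together with their length
    return 1 + (length + 31) // 32  # length slot + ceil(length / 32) data slots

cid = "bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi"
uri = "ipfs://" + cid

options = {
    "URI string": dynamic_slots(len(uri)),  # 66 bytes of text
    "CID string": dynamic_slots(len(cid)),  # 59 bytes of text
    "binary CID as bytes": dynamic_slots(36),
    "binary CID as two bytes32": 2,
    "sha256 digest as bytes32": 1,
}
for name, slots in options.items():
    print(f"{name}: {slots} slot(s)")
```

By this count, dropping the ipfs:// prefix saves one slot, and the fixed-size layouts avoid the separate length slot.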

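What’s more, strings used to be awkward to return from contract functions: in older Solidity versions, a calling contract could not read a dynamically sized string returned by another contract. Current compilers no longer have this limitation. See:
https://ethereum.stackexchange.com/questions/11556/use-string-type-or-bytes32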

Do you really need to store the ipfs:// scheme?

If you need the path too, you probably want to keep the scheme. But since a CID can point to anything in a DAG (a UnixFS directory, a file, dag-cbor, etc.), I’m not sure there’s much value in storing a path on-chain when you can store the root CID of the content you’re interested in.

So if you don’t need the path, why not just name the state variable cid?


I’m not really a smart contract developer, so any input here would be appreciated.


A quick peek into the ENS resolver code reveals the following:

Internally, the content hash is:

  • Stored as a bytes value in the resolver’s storage.
  • Returned as-is by the contenthash() function.

As for the bytes type (see the Solidity docs), it is a dynamically sized byte array.

Even though bytes is dynamically sized, Solidity does allow bytes and string as mapping keys; the key is hashed to locate the value.

As for gas costs:

| Feature | bytes32 | bytes |
| --- | --- | --- |
| Type | Fixed (32 bytes) | Dynamic (0 to ∞ bytes) |
| Storage slots used | 1 | 1 + ⌈len / 32⌉ (just 1 when len ≤ 31) |
| New write gas cost | 20,000 | 20,000–100,000+ (length-based) |
| Read gas cost | ~2,100 | Higher, depends on length |