Let’s say I receive a hash Qm... that’s supposedly an IPFS multihash. How can I verify that, short of running ipfs get or ipfs cat and getting an error if it’s not an IPFS hash? Does ipfs have some form of multihash verification routine? (Meaning that it checks whether (a) there’s an actual IPFS object “waiting” behind that hash, or (b) at least that the multihash has been correctly calculated and is not just a random string meant to mimic an IPFS multihash.)
Without actually fetching the file, (a) is not possible: you can check whether the CID/multihash is well formed, but not whether a corresponding file exists (existence isn’t a property of any of the hash functions we use, except for the identity hash function). One could use SNARKs instead of hashes to guarantee that the author of the link knew the corresponding data, but that would be prohibitively expensive and almost certainly not worth the effort.
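To illustrate what a "well formed" check can look like, here is a minimal sketch for the Qm... (CIDv0) case only. It relies on a CIDv0 being a base58btc-encoded sha2-256 multihash, i.e. 34 bytes starting with 0x12 (sha2-256) and 0x20 (32-byte digest). The function names are made up for this example, CIDv1 strings are not handled, and passing the check says nothing about whether any data exists behind the hash.

```python
# Bitcoin-style base58 alphabet (no 0, O, I, l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text):
    """Decode a base58btc string to bytes; raise ValueError on a bad character."""
    number = 0
    for ch in text:
        index = B58_ALPHABET.find(ch)
        if index < 0:
            raise ValueError("invalid base58 character: %r" % ch)
        number = number * 58 + index
    # Each leading '1' encodes one leading zero byte.
    pad = len(text) - len(text.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return b"\x00" * pad + body


def is_well_formed_cidv0(text):
    """Hypothetical helper: True if text decodes to a sha2-256 multihash.

    This checks the encoding only, not whether the object exists.
    """
    try:
        raw = b58decode(text)
    except ValueError:
        return False
    # 0x12 = sha2-256 function code, 0x20 = digest length (32 bytes).
    return len(raw) == 34 and raw[0] == 0x12 and raw[1] == 0x20
```

Something like is_well_formed_cidv0(candidate) can then be used to reject obvious garbage before asking the network for anything.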
I use ipfs object stat. Python code, using a local go-ipfs install with a running daemon:
import subprocess

def ipfs_stat(ipfs_hash):
    '''
    Returns a dictionary of key, integer value pairs for an ipfs object, or False if the stat
    fails.
    '''
    # Ask the local daemon's API for the object's stats.
    stat = subprocess.getoutput('ipfs --api /ip4/127.0.0.1/tcp/5001 object stat ' + ipfs_hash)
    if 'Error:' in stat:
        return False
    dstat = {}
    for line in stat.split('\n'):
        # Each line looks like "CumulativeSize: 1234".
        key, value = line.split(':', 1)
        dstat[key] = int(value)
    return dstat
Thus, “if ipfs_stat(potential_hash)” can be used as an availability check / validation, and the returned dictionary gives useful information such as the cumulative size, which is nice to know. Using ‘ls’ will fail on a file, and ‘get’ without knowing the size of what you’re about to get is generally a bad idea.
I had a look at object stat too, but the command only returns output immediately if the object is actually on your node. If it’s a remote object, e.g. QmUkPucZ1WUxwGqR979YAKj2UfUsqpSze6MPDcmhtbzmst, it can take very long, and if you’re unlucky, it will take forever, even though it’s a valid IPFS object. With an invalid IPFS object, it also takes very long. (Or even forever?) Would a timeout command suffice? I doubt it, because the timeout could kill stat on a valid IPFS object, and then you’d have a false negative.
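One way to live with that false-negative problem is to treat a timeout as "unknown" rather than "invalid". Below is a minimal sketch of that idea. The function name, the three result strings and the default timeout are made up for this example, and it assumes the ipfs CLI exits with a non-zero status when stat fails.

```python
import subprocess


def stat_with_timeout(cid, timeout=10.0, cmd=("ipfs", "object", "stat")):
    """Hypothetical helper: return 'available', 'error', or 'unknown'.

    'unknown' means the command did not finish within `timeout` seconds,
    so the object may still be valid but slow to find.
    """
    try:
        result = subprocess.run(
            [*cmd, cid],          # no shell involved, the CID is one argument
            capture_output=True,
            text=True,
            timeout=timeout,      # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "unknown"
    # Assumption: exit status 0 means stat succeeded.
    return "available" if result.returncode == 0 else "error"
```

A caller would then only reject on "error" and retry or defer on "unknown". That does not remove the problem, because a slow valid object and a nonexistent one still look the same until something answers.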
There is now a standalone utility, cid-fmt, in go-cid (https://github.com/ipfs/go-cid) that will format a CID in various ways. A simple way to verify a CID using the utility would be: