We deal with a lot of blockchain data that references metadata held in IPFS.
There is published guidance for how IPFS URI references should be made on blockchains, but people don’t necessarily follow it.
So we have to deal with cases where these metadata URIs aren’t ipfs:// URIs, but rather are references to IPFS gateways, and more specifically, are references to arbitrary third-party IPFS gateways, rather than to the canonical ipfs.io / dweb.link hostnames.
Users permanently cache the URLs we return, so we want to give them URLs that they will still be able to fetch from years later. But we have experienced flakiness with these arbitrary third-party gateways; and there is also no guarantee that they will be “permanent” in the way that IPFS references should be. In short, we don’t trust them.
So, currently, when we recognize that a metadata URL is a reference to one of these third-party IPFS gateways, we rewrite/canonicalize the URL: either into an ipfs:// reference (for long persistence), or into a URL pointing to a trusted IPFS gateway, e.g. ipfs.io (for immediate fetch).
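To make the rewrite concrete, here is a rough sketch of the kind of canonicalization described above. It is not our production code: the function name `canonicalize`, the `TRUSTED_GATEWAY` constant, and the CID regexes are illustrative assumptions, and the regexes are only a heuristic (CIDv0 `Qm...` and base32 CIDv1 `b...`), not real CID validation. It handles the two common gateway layouts, path-style (`/ipfs/<cid>/...`) and subdomain-style (`<cid>.ipfs.<host>`).

```python
import re
from urllib.parse import urlsplit

# Heuristic CID shapes (assumption: good enough for recognition, not validation).
_CID_V0 = r"Qm[1-9A-HJ-NP-Za-km-z]{44}"   # base58btc, 46 chars total
_CID_V1 = r"b[a-z2-7]{58,}"               # base32, lowercase
_CID = rf"(?:{_CID_V0}|{_CID_V1})"

_PATH_RE = re.compile(rf"^/ipfs/({_CID})(/.*)?$")
_SUBDOMAIN_RE = re.compile(rf"^({_CID_V1})\.ipfs\.[^/]+$")

# Assumption: the gateway we choose to trust for immediate fetches.
TRUSTED_GATEWAY = "https://ipfs.io"


def canonicalize(url):
    """Return (ipfs_uri, trusted_gateway_url), or None if the URL does not
    look like a reference to an IPFS gateway."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return None

    host = parts.hostname or ""
    sub = _SUBDOMAIN_RE.match(host)
    if sub:
        # Subdomain-style: the CID is in the hostname, the path is the rest.
        cid, rest = sub.group(1), parts.path
    else:
        # Path-style: /ipfs/<cid>[/<path>]
        m = _PATH_RE.match(parts.path)
        if m is None:
            return None
        cid, rest = m.group(1), m.group(2) or ""

    query = f"?{parts.query}" if parts.query else ""
    return (
        f"ipfs://{cid}{rest}{query}",
        f"{TRUSTED_GATEWAY}/ipfs/{cid}{rest}{query}",
    )
```

Note that this only recognizes a URL by its shape; it says nothing about whether the host is actually a working gateway, which is what the second question below is about.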
Two questions:
- Is recognizing+rewriting third-party IPFS gateways in URLs like this a best practice, or should we be leaving them alone?
- Is there a simple probe that can be used to determine whether an arbitrary HTTP host is an IPFS gateway? Does the IPFS Companion browser extension use logic like this, or does it have a manually-curated list of recognized IPFS gateway domains?