More information on how IPFS gateways work

Hello! I would like to learn more about how the IPFS gateways work to funnel data through ipfs.io. Is there a particular doc file I could read? How do IPFS nodes set up a gateway that is used by other IPFS nodes?

Thanks!

The public gateways are just normal IPFS nodes with the difference that they have automatic GC on.

We currently don’t have any documentation on how to run a public gateway, but basically we’re running NGINX in front of the default gateway of go-ipfs. You can see the NGINX configuration here: https://github.com/ipfs/infrastructure/blob/master/nginx/nginx.conf

Anything in specific you want to know? Easier to answer scoped questions than open-ended ones.

Thanks for the information.

I guess my biggest confusion stems from my understanding of a web server being a centralized node. So you’ve got ipfs.io and then IPFS gateways make it possible to access distributed data via ipfs.io/ipfs but that’s the part I don’t understand. How do you make it possible to access content from many nodes through ipfs.io/ipfs?

I apologize if there is an obvious answer to this question, but hoping you could point me in the right direction so I can learn. Thanks!

Right, that’s a good question.

It’s indeed correct that the gateways are (currently) centralized. It’s an important step in our upgrade path of the web to be able to use existing technologies to access distributed content. The trade-off currently is that the HTTP part becomes centralized, as that’s how HTTP works.

So in short without too many details, this is the flow from you adding content on your IPFS node to your browser showing the content from one of our gateways:

  1. You add the content on your local IPFS node, getting $HASH for future use
  2. Your local IPFS advertises the content to all the nodes around it, effectively telling the world that you have $HASH available in case anyone needs it
  3. You go to https://ipfs.io/ipfs/$HASH in your browser (or any other public gateway)
  4. The HTTP gateway receives a request for /ipfs/$HASH and checks if that content is available locally
  5. If it’s available locally, it just serves the content directly
  6. If it’s not available locally, it asks the network which peers have the content
  7. Once we know some peers that have the content, start downloading the parts of the content from those peers
  8. When chunks come in via IPFS, serve what we can via HTTP to the browser
  9. Now you can see the content you added at step #1! \o/

Basically the flow is the same within IPFS, but without outputting it via an HTTP endpoint.
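To make the steps above concrete, here is a minimal in-memory sketch in Python. It is not how go-ipfs is implemented: `Node`, `ToyNetwork`, `add` and `gateway_get` are hypothetical names made up for illustration, a plain SHA-256 hex digest stands in for a real IPFS hash, and a lookup table stands in for the network.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy IPFS node: a name plus a local block store keyed by hash."""
    name: str
    blocks: dict[str, bytes] = field(default_factory=dict)


class ToyNetwork:
    """Stand-in for content routing: maps a hash to the nodes that advertised it."""

    def __init__(self) -> None:
        self.providers: dict[str, list[Node]] = {}

    def advertise(self, content_hash: str, node: Node) -> None:
        self.providers.setdefault(content_hash, []).append(node)

    def find_providers(self, content_hash: str) -> list[Node]:
        return self.providers.get(content_hash, [])


def add(node: Node, network: ToyNetwork, data: bytes) -> str:
    """Steps 1-2: store the content locally and advertise its hash."""
    content_hash = hashlib.sha256(data).hexdigest()  # assumption: not a real CID
    node.blocks[content_hash] = data
    network.advertise(content_hash, node)
    return content_hash


def gateway_get(gateway: Node, network: ToyNetwork, path: str) -> tuple[int, bytes]:
    """Steps 4-8: handle GET /ipfs/<hash> and return (HTTP status, body)."""
    prefix = "/ipfs/"
    if not path.startswith(prefix):
        return 404, b""
    content_hash = path[len(prefix):]
    # Steps 4-5: serve directly if the gateway already has the content.
    if content_hash in gateway.blocks:
        return 200, gateway.blocks[content_hash]
    # Steps 6-7: ask the network who has it, then fetch from a provider.
    for peer in network.find_providers(content_hash):
        data = peer.blocks.get(content_hash)
        if data is not None:
            gateway.blocks[content_hash] = data  # kept locally, but not pinned
            return 200, data  # step 8: hand the bytes to the browser
    # Simplification: a real gateway would keep searching until it times out.
    return 404, b""
```

For example, after `h = add(my_node, network, b"hello")`, calling `gateway_get(gateway, network, "/ipfs/" + h)` fetches the bytes from `my_node` the first time and serves them from the gateway’s own store on later requests.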

I saw this link on our forum here, and here is the gist:

Thank you, @VictorBjelkholm and @nothingismagick! Really helpful.

About step 7 (“Once we know some peers that have the content, start downloading the parts of the content from the peers”):

Does the gateway download and pin the requested content from other peers? If so, would the amount of stored data grow in proportion to the variety of content requested through the gateway?

Found the answer, in case someone has the same question: yes, the gateway will download the object and save it to the datastore. It will not pin it, though, so the object will be a candidate for GC.
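A toy Python sketch of that distinction, for anyone who finds code easier to follow. `ToyDatastore` is a hypothetical class made up for illustration, not part of go-ipfs, and it ignores details such as blocks that are kept because a pinned object links to them.

```python
class ToyDatastore:
    """Minimal model of a block store with a pin set."""

    def __init__(self):
        self.blocks = {}     # hash -> bytes
        self.pinned = set()  # hashes that GC must keep

    def put(self, key, data, pin=False):
        """Store a block; content fetched on behalf of a gateway request uses pin=False."""
        self.blocks[key] = data
        if pin:
            self.pinned.add(key)

    def gc(self):
        """Delete every unpinned block and return the removed keys, sorted."""
        removed = sorted(k for k in self.blocks if k not in self.pinned)
        for key in removed:
            del self.blocks[key]
        return removed
```

So content fetched for a gateway request stays in the store until the next GC run, while anything pinned explicitly survives it.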

Correct! Important to note if you’re planning to run your own gateway: by default the configuration of go-ipfs does not have automatic GC turned on and you would have to change that yourself. That’s part of the DatastoreConfig: https://github.com/ipfs/infrastructure/blob/04b32f7090f6a40b3778fe54054b4eb64a65eb75/ipfs/config.tpl#L6-L14

I’m sorry, but what does GC mean?

Garbage collection: cleaning up stored data that isn’t needed anymore. In IPFS that means removing blocks from the local datastore that are not pinned.

Could you tell me how to set up a new public gateway? Thanks.

Correct me if I’m wrong but this worked for me.

  1. Set up an IPFS node on a public server.
  2. In the Addresses section of the node’s config file, change the Gateway value so that it listens on 0.0.0.0:

       "Addresses": {
         "Gateway": "/ip4/0.0.0.0/tcp/8080"
       },

     A value of 127.0.0.1 means that only the local machine can access the gateway, while 0.0.0.0 means any machine can now use the server as a gateway.
  3. Open up the firewall on the desired port.
  4. Enjoy the benefits of your own public gateway: faster loads for locally added files.
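If you would rather script the config change than edit the file by hand, something like the following Python sketch could work. `make_gateway_public` is a hypothetical helper, and it assumes the config is plain JSON with an `Addresses.Gateway` string, as in the snippet above. Back up your config before trying anything like this.

```python
import json


def make_gateway_public(config_text: str, port: int = 8080) -> str:
    """Return the config JSON with Addresses.Gateway bound to all interfaces."""
    config = json.loads(config_text)
    # 0.0.0.0 accepts connections on every interface; 127.0.0.1 is local only.
    config.setdefault("Addresses", {})["Gateway"] = f"/ip4/0.0.0.0/tcp/{port}"
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    before = '{"Addresses": {"Gateway": "/ip4/127.0.0.1/tcp/8080"}}'
    print(make_gateway_public(before))
```

The function only transforms text, so reading and writing the actual file (and restarting the daemon afterwards) is left to you.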

Hi All
I know it’s a very old post, but for some reason I am unable to host my ReactJS components on IPFS. Any help?

Hello. I want to know what “all the nodes around it” means exactly. Is it all the peers the node is already connected to, or the peers whose peer ID is close to the CID by XOR distance?
Thanks

The latter: a Kademlia DHT provider record is placed on the peers found to be XOR-closest to the CID.
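As a rough illustration of what “XOR-close” means, here is a small Python sketch. It assumes both peer IDs and CIDs are mapped into one keyspace by hashing them with SHA-256, and that distance is the XOR of the two keys read as an integer. The function names and the default `k = 20` are assumptions for the example, not the go-ipfs API.

```python
import hashlib


def kad_key(identifier: bytes) -> int:
    """Map a peer ID or CID into the shared keyspace (assumed: SHA-256)."""
    return int.from_bytes(hashlib.sha256(identifier).digest(), "big")


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia distance: XOR of the two keys, compared as an integer."""
    return kad_key(a) ^ kad_key(b)


def closest_peers(cid: bytes, peer_ids: list, k: int = 20) -> list:
    """Pick the k peers whose keys are XOR-closest to the CID's key."""
    return sorted(peer_ids, key=lambda peer: xor_distance(peer, cid))[:k]
```

The provider record for a CID would go to the peers returned by `closest_peers`, which in general have nothing to do with which peers you happen to be connected to or how near they are on the network.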