IPFS Performance in AWS

Hi

I’m building a dApp that requires us to scrape NFT metadata over potentially large collections (let’s set a baseline at 10k or so) in as fast a manner as possible (aiming for sub 30s). The assumption is that this metadata is stored on IPFS. While the number of files is large, the total size is only a handful of MBs.

As a first step, I set up a single IPFS node on AWS but found that ipfs get calls are prohibitively slow (more than 3-5 minutes for a 10k collection).

As a fallback, I plan to scale horizontally (say 5-10 nodes) and place the IPFS nodes behind a load balancer. However, even with this approach I see sharply worsening slow-downs once the number of file retrievals gets high (5k+).

Questions

  1. Is there anything glaringly obvious that I’m missing? I’m new to the dev side of IPFS, so I wouldn’t be surprised if I’m just doing something entirely wrong.
  2. Is there something I can do to make ipfs get faster? Some configuration, etc.?
  3. What hardware specs would influence an IPFS gateway’s performance the most? CPU? Memory? Network I/O? I’ve played around with a few different types of AWS instance and found similar results.

Any help would be greatly appreciated
Thanks in advance

That’s 18 milliseconds per retrieval (180 s / 10,000 files, taking the 3-minute figure). That’s pretty fast, isn’t it? …considering the network latency and the fact that each call is at minimum one round trip?
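To spell out that arithmetic, here is a small sketch. The function names are made up for illustration, and the concurrency estimate assumes per-request latency stays constant as more requests run in parallel, which may well not hold in practice.

```python
import math

def per_file_ms(total_seconds: float, files: int) -> float:
    """Average time per retrieval, in milliseconds."""
    return total_seconds * 1000 / files

def min_concurrency(sequential_seconds: float, target_seconds: float) -> int:
    """Fewest requests that must be in flight at once to hit the target,
    assuming latency per request does not grow under load (an idealisation)."""
    return math.ceil(sequential_seconds / target_seconds)

print(per_file_ms(180, 10_000))   # 3 minutes for 10k files
print(per_file_ms(300, 10_000))   # 5 minutes for 10k files
print(min_concurrency(180, 30))   # requests in flight needed for a 30 s target
print(min_concurrency(300, 30))
```

Under those assumptions, the 30 s target would need roughly 6 to 10 retrievals running at the same time rather than one after another.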

For context: I can retrieve 10k metadata files hosted on a normal HTTP server in <7s with the same infrastructure. These too are individual file retrievals, not a glob.

Just wondering if I’m missing anything groundbreaking… I used this guide for setup: "Host a decentralised application with IPFS and AWS" by Alexander Lechner (Coinmonks, Medium).

Question: does go-ipfs do any sort of parallel processing of requests in its queue, or is the entire thing sequential?
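Whatever the node does internally, the client side can keep several requests in flight at once. A minimal sketch, assuming a local go-ipfs HTTP gateway on its default address (127.0.0.1:8080); the helper names, the worker count, and the path layout are illustrative assumptions, not something taken from this thread.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable
from urllib.request import urlopen

# Assumption: a local go-ipfs gateway listening on its default address.
GATEWAY = "http://127.0.0.1:8080/ipfs/"

def fetch_via_gateway(path: str, timeout: float = 30.0) -> bytes:
    """Fetch one IPFS path (e.g. '<cid>' or '<cid>/1.json') over HTTP."""
    with urlopen(GATEWAY + path, timeout=timeout) as resp:
        return resp.read()

def fetch_all(paths: Iterable[str],
              fetch: Callable[[str], bytes] = fetch_via_gateway,
              workers: int = 32) -> Dict[str, bytes]:
    """Fetch every path with up to `workers` requests in flight at once.

    The first failed fetch raises when its result is collected.
    """
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each path with its body.
        return dict(zip(paths, pool.map(fetch, paths)))
```

With a hypothetical list of metadata paths, `fetch_all(paths, workers=32)` returns a dict mapping each path to its raw bytes. The `fetch` parameter is injectable so the concurrency can be tried against a plain HTTP server or a stub for comparison.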

I suggest you greatly increase the configuration values for Bitswap in go-ipfs:
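In go-ipfs these settings live under Internal.Bitswap in the config file. As a sketch only: the numbers below are placeholders chosen for illustration (assumptions, not recommended values and not figures from this thread), and the helper name is made up. It just builds the `ipfs config --json` commands so they can be reviewed before being run.

```python
import json
from typing import Dict, List

# Placeholder values for illustration only; tune them for your own hardware.
EXAMPLE_BITSWAP: Dict[str, int] = {
    "TaskWorkerCount": 64,
    "EngineBlockstoreWorkerCount": 1024,
    "EngineTaskWorkerCount": 64,
    "MaxOutstandingBytesPerPeer": 1 << 20,
}

def bitswap_config_commands(values: Dict[str, int]) -> List[List[str]]:
    """Build `ipfs config --json Internal.Bitswap.<key> <value>` argv lists."""
    return [
        ["ipfs", "config", "--json", f"Internal.Bitswap.{key}", json.dumps(value)]
        for key, value in values.items()
    ]

for argv in bitswap_config_commands(EXAMPLE_BITSWAP):
    # Print for review; pass argv to subprocess.run(argv, check=True) to apply.
    print(" ".join(argv))
```

The daemon needs a restart after the config changes before they take effect.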

For reference, there is a section about how to configure IPFS for production at larger scale in the IPFS Cluster documentation: "Download and setup" (Pinset orchestration for IPFS).