I have to transfer data between HPC clusters and have tried IPFS, but performance was very poor. I suspect the issue is that the file system involved (Lustre) copes badly with large numbers of small files, so the default block store, which writes each block as a separate file, slows everything down.
Are there ways to store the blocks that are friendlier to these kinds of file systems?
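For reference, I'm on a stock Kubo repo; a quick way to confirm which block store is in use (assuming a default `ipfs init`):
```sh
# Inspect the datastore section of the repo config. On a stock
# `ipfs init`, the block store is flatfs, which keeps every block
# as its own file under $IPFS_PATH/blocks, exactly the small-file
# access pattern that Lustre handles poorly.
ipfs config show | grep -A 6 '"mounts"'
```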
I have done some reading and there used to be some general performance caveats [1]. Have these been fixed?
[1] How to tune a private IPFS swarm for large files?
If the file system on which your IPFS repo is stored is the issue, it may be worth trying one of the other IPFS storage backends, such as pebbleds (see Kubo CLI | IPFS Docs).
It might also be worth deliberately stressing the file system with flatfs, the default one-file-per-block backend: if switching backends doesn't change the observed behaviour, then it is probably not the performance of the IPFS repo and/or its backing store that is limiting you.
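For example, a fresh repo can be initialised directly on the pebbleds backend; a minimal sketch, assuming a recent Kubo (the pebbleds profile ships with Kubo 0.28 and later) and a hypothetical scratch path:
```sh
# Point the repo at a hypothetical scratch location and initialise it
# with the pebbleds profile, which stores blocks in a Pebble database
# (a few large files) rather than one file per block.
export IPFS_PATH=/scratch/$USER/ipfs-pebble   # hypothetical path
ipfs init --profile=pebbleds

# Confirm the backend actually in use:
ipfs config show | grep '"type"'
```
If you already have data in an existing repo, the ipfs-ds-convert tool can migrate between datastore formats, though starting fresh is simpler for a benchmark.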
> I have done some reading and there used to be some general performance caveats [1]. Have these been fixed?
I started that thread! I'm pleased to say that network transfer speeds have improved dramatically since I posted it: where I used to get ~5 MiB/s, I now frequently see 50-70 MiB/s.
That said, I have been experiencing occasional hangs in transfers.
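For what it's worth, when a transfer stalls I poke at the node before restarting anything; a minimal sketch of the checks, assuming Kubo:
```sh
# Is the node moving any data at all right now?
ipfs stats bw

# Which blocks is Bitswap still waiting for, and how many
# partners does it currently have?
ipfs bitswap stat

# Is the peer holding the data still connected?
ipfs swarm peers | wc -l
```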