Hello!
It seems that its working speed is high enough (ipfs add/get).
But can someone provide a real, actual speed comparison between traditional HTTP file upload/download and IPFS?
It would be very useful!
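In case anyone wants to gather their own numbers, here is a minimal sketch of a one-file comparison, assuming the same test file is reachable both over plain HTTP and as a CID on the network (the URL and CID below are placeholders):

```bash
#!/usr/bin/env bash
# Rough single-file comparison: plain HTTP download vs a cold `ipfs get`.
# HTTP_URL and CID are placeholders -- substitute your own test file.
HTTP_URL="http://example.com/testfile.bin"
CID="QmYourTestFileCidHere"

# HTTP baseline: curl reports total wall-clock time and average speed.
curl -s -o /dev/null -w "HTTP: %{time_total}s at %{speed_download} bytes/s\n" "$HTTP_URL"

# IPFS: time the fetch from a node that does not already have the blocks.
time ipfs get "$CID" -o /tmp/testfile.bin >/dev/null
```

Running the `ipfs get` against a fresh repo (or after `ipfs repo gc`) keeps the comparison honest, since otherwise the blocks come straight from the local cache.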
Ideally it can be faster than downloading resources from a single server, since IPFS relies on a peer-to-peer network… but it depends on how widely the files are spread and on the resources/specs of each peer.
I am interested as well in seeing actual data/comparisons, if any official data is already out there.
[It would also be cool to use it to compare different versions of IPFS.]
Yeah, agree with that.
And before IPFS nodes start downloading a file, it takes some time to find the locations of the file's blocks via the DHT.
HTTP also needs to locate the file by resolving DNS, but that is much faster than a DHT lookup, I think.
And IPFS nodes will download duplicate data, which leads to lower performance.
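If someone wants to observe both effects directly, here is a rough sketch (the CID and hostname are placeholders): the DHT provider lookup is roughly the IPFS counterpart of HTTP's DNS resolution, and `ipfs bitswap stat` reports how much duplicate data a node has received.

```bash
# CID and hostname are placeholders -- substitute your own.
CID="QmYourTestFileCidHere"

# Time the DHT provider lookup (who has this content?).
time ipfs dht findprovs "$CID" >/dev/null

# Time the HTTP-world equivalent: resolving the origin's DNS name.
time dig +short example.com >/dev/null

# After an `ipfs get`, see how much duplicate data arrived
# (exact field names vary a little between go-ipfs versions).
ipfs bitswap stat | grep -i dup
```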
GitHub issue (opened 17 Jan 2018, closed 29 May 2020):
#### Version information:
ipfs version --all
go-ipfs version: 0.4.13-
…
Repo version: 6
System version: amd64/linux
Golang version: go1.9.2
#### Type: Bug
#### Severity: Medium
#### Description:
When running `ipfs get` on a given file, the more nodes that have the file and supply it to the test node, the more duplicate data the node receives. This leads to lower performance and massive amounts of wasted bandwidth.
Test Setup:
Testing was done with EC2 medium instances that all lived on the same subnet. Bootstrap was updated to ensure all nodes could find each other correctly.
`ipfs swarm peers` was used to confirm that the nodes were connected before doing `ipfs get` on the test file.
Before testing different file sizes, the ipfs daemon was stopped, `.ipfs` was deleted and the node was re-provisioned using `ipfs init` and the `ipfs bootstrap add` command. This ensured no files were cached and the stats were only for the test file in question.
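A rough sketch of that reset step, with a placeholder bootstrap multiaddr standing in for Node 1's actual address and peer ID:

```bash
# Rough sketch of the per-run reset described above; the bootstrap
# multiaddr is a placeholder for Node 1's actual address and peer ID.
ipfs shutdown                # stop the daemon (or kill the process)
rm -rf ~/.ipfs               # wipe the repo so nothing is cached
ipfs init
ipfs bootstrap rm --all
ipfs bootstrap add /ip4/10.0.0.1/tcp/4001/ipfs/QmNode1PeerIdPlaceholder
ipfs daemon &
```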
Test files used were as follows:
5.1GB - sintel_4k.mov - QmWntgau1qWJh7hos91e6CqEzSWfaSn7permky8A3WJEnS
1.1GB - Sintel.2010.1080p.mkv - QmUwZFGPptdF5ZG58EdozjDSXYugPsxe1MwPZFQ4vZmAsb
649MB - Sintel.2010.720p.mkv - QmcdSfr63CHZ3sJkubrozeRmT4bo2DqpD8DKPFfhNby4FB
Test files can be found here: http://download.blender.org/durian/movies/
Replication procedure:
1. Configure a fresh ipfs node
2. Add the test file to the node (Node 1)
3. Configure another ipfs node (Node N)
4. Run `ipfs get HASHHERE` on Node N
5. Record the output of `ipfs stats bw` and `ipfs bitswap stat` (a minimal script for steps 4-5 is sketched after this list)
6. Repeat steps 3-5 until you have Node 6 retrieving the test file from Nodes 1-5 (which have each done an `ipfs get` on the test file over the course of this testing).
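A minimal sketch of steps 4-5, assuming a placeholder CID and one results file per node:

```bash
# Sketch of steps 4-5: fetch the test file, then append the transfer
# stats to a per-node log. CID is a placeholder for the test file hash.
CID="QmYourTestFileCidHere"
NODE="node-6"

ipfs get "$CID" -o /tmp/testfile >/dev/null

{
  echo "=== $NODE $(date -u) ==="
  ipfs stats bw
  ipfs bitswap stat
} >> "results-$NODE.txt"
```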
See full dataset here: https://gist.github.com/natewalck/c739b57b1e90dfe2092344f78bf7de78
For each node that the test file was retrieved from, the duplicate data rate increased in a linear fashion with the number of nodes that served the file. For instance, if Node 3 retrieved the test file from Nodes 1 and 2, the duplicate data can be expected to be 100% of the file size. If Node 4 retrieves data from Nodes 1, 2 and 3, the expected duplicate data is around 200% of the file size.
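To compare the observed numbers against that expectation, the duplicate bytes reported by `ipfs bitswap stat` can be checked against (providers - 1) x file size; a rough sketch with placeholder values:

```bash
# Quick check of the linear pattern: with N providing nodes, the
# expected duplicate data is roughly (N - 1) x file size.
PROVIDERS=3                 # e.g. Node 4 pulling from Nodes 1-3
FILE_SIZE=649000000         # placeholder: rough size of the 649MB file, in bytes

echo "expected duplicate bytes: $(( (PROVIDERS - 1) * FILE_SIZE ))"
# Observed value reported by bitswap (output format varies by version):
ipfs bitswap stat | grep -i "dup data"
```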
`iftop` was used to validate that the *actual* traffic incoming to the node matched the data observed in `TotalIn` from the `ipfs stats bw` command.
In the chart below, each of the test files is compared as number of nodes vs duplicate data received. As you can see, it is almost completely linear.
<img width="801" alt="screenshot 2018-01-17 00 37 36" src="https://user-images.githubusercontent.com/867868/35027423-a4ee0500-fb1e-11e7-8d49-06aa735aae57.png">
I'm not sure if the situation is better for small files (it probably impacts the transfer of smaller files to a lesser degree due to their size), but this seems like a rather large issue for big files.
One use case for ipfs is a distributed yum/software repo. With the current bitswap/wantlist performance, it would be difficult to host rpms and serve them out to clients in a performant fashion.
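As a rough illustration of that use case (a hypothetical sketch, not something from this test), each client could point a `.repo` file at its own local gateway, with a placeholder CID standing in for the repo's root hash:

```bash
# Hypothetical sketch: serve a yum repo through the local IPFS gateway.
# QmRepoRootCidPlaceholder stands in for the CID of the repo directory
# (i.e. the root hash printed by `ipfs add -r /path/to/repo`).
cat > /etc/yum.repos.d/ipfs-example.repo <<'EOF'
[ipfs-example]
name=Example repo served over the local IPFS gateway
baseurl=http://127.0.0.1:8080/ipfs/QmRepoRootCidPlaceholder/
enabled=1
gpgcheck=0
EOF
```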
I'm not sure where to start looking to optimize this, but I wanted to investigate it and provide some data. Is it possible this is caused by a node requesting the same blocks from all of the other nodes on its wantlist, receiving those blocks from all of them at nearly the same time, and then sending the same requests to all of the nodes yet again, and so on?
Thanks for all the work you are doing on IPFS, it is a fantastic project! :)