Deduplication Ratio

Could you please tell me the deduplication ratio of IPFS?
I mean the overall deduplication ratio of the entire system.

There is no deduplication ratio of the entire system.

Sorry, I’m a little confused.
Do you mean that each file has its own deduplication ratio?

No, I mean that IPFS just doesn’t track any deduplication.
A file can be duplicated anywhere from 0 to ∞ times and IPFS just doesn’t care.

OK, got it! Thanks for your reply.


If the same file is uploaded multiple times with the same chunking algorithm, the chunks ARE fully deduplicated. That means even if a large number of people upload that same file, only one “file” will be “stored”. It’s stored chunk by chunk and the deduplication is 100%, i.e. no duplicate data is stored. (Although yes, of course, multiple different servers might hold copies, but all those copies are the same CIDs pointing to the same chunks, with no duplicates.)
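A minimal sketch of why that works (toy code, not the real IPFS API: plain SHA-256 hex digests stand in for CIDs, and the 256 KiB fixed chunk size mirrors kubo’s default chunker):

```python
import hashlib
import os

CHUNK_SIZE = 256 * 1024  # same size as kubo's default fixed-size chunker


class BlockStore:
    """Toy content-addressed store: each unique chunk is keyed by its hash."""

    def __init__(self):
        self.blocks = {}  # digest -> chunk bytes

    def put(self, chunk: bytes) -> str:
        digest = hashlib.sha256(chunk).hexdigest()
        # If this digest is already present, nothing new is written.
        self.blocks.setdefault(digest, chunk)
        return digest


def add_file(store: BlockStore, data: bytes) -> list:
    """Split data into fixed-size chunks and store each chunk by its hash."""
    return [store.put(data[i:i + CHUNK_SIZE])
            for i in range(0, len(data), CHUNK_SIZE)]


store = BlockStore()
payload = os.urandom(1024 * 1024)    # a 1 MiB "file"

first = add_file(store, payload)     # first person adds the file
second = add_file(store, payload)    # someone else adds the same file

assert first == second               # identical chunk hashes both times
print(len(store.blocks))             # 4 chunks stored, not 8: no duplicate data
```

The second add changes nothing in the store, which is the “only one file is stored” behaviour described above.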

So chunks could be deduplicated, if aligned similarly?

They’re deduplicated if they’re aligned exactly the same and use the same hashing function. The deduplication story for IPFS is a bit confusing. IPFS deduplicates at the node level but duplicates across nodes. It doesn’t so much deduplicate as store based on content, so if you go to store something a second time it just says, “nope, we’re good. Already got it.” When you go to pin something you’re duplicating content across nodes.
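A quick sketch of that node-level vs. cross-node distinction (dict-based toy stores, with a SHA-256 digest standing in for a real CID):

```python
import hashlib


def cid(block: bytes) -> str:          # toy stand-in for a real CID
    return hashlib.sha256(block).hexdigest()


node_a, node_b = {}, {}                # each node's local block store

block = b"some popular content"

# Adding the same block to node A twice: deduplicated within the node.
node_a.setdefault(cid(block), block)
node_a.setdefault(cid(block), block)
print(len(node_a))                     # 1 -- "nope, we're good. Already got it."

# Node B pins the same CID: the bytes are now duplicated across nodes,
# even though each node still holds exactly one copy.
node_b.setdefault(cid(block), block)
print(cid(block) in node_a and cid(block) in node_b)  # True
```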

The deduping with IPFS is based on hash and content, not semantics. If you add a file using two different hash functions it will be stored twice. An identical file in two different file formats will be stored twice. If you had an HTML version and a text version of a file you might get some deduplication if you get lucky and the files can be broken into chunks where the two share identical chunks of text.
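To make that concrete, a quick sketch with plain hashlib (illustrative digests only, not the actual CID computation):

```python
import hashlib

content = b"Hello, IPFS!"

# Same bytes, different hash functions -> different keys, so stored twice.
print(hashlib.sha256(content).hexdigest())
print(hashlib.blake2b(content).hexdigest())

# Same text, different file formats -> different bytes -> different keys.
as_text = b"Hello, IPFS!"
as_html = b"<html><body>Hello, IPFS!</body></html>"
print(hashlib.sha256(as_text).hexdigest() ==
      hashlib.sha256(as_html).hexdigest())   # False: no dedup between formats
```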

There’s also the added confusion of what happens when you add a file. When you add a file to IPFS you’ll have two copies of the file: the original file and the chunked, hashed copy in the IPFS block store. This can be a problem if you’re storing a large amount of data in IPFS. In that case you can use the IPFS filestore. With the filestore, IPFS stores pointers into the file system and you don’t get the two copies, but if you move the file IPFS won’t be able to find it, and there is no deduplication.
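Roughly, the filestore keeps references into the original file instead of copying its bytes into the block store. A simplified sketch of that idea (the names and layout here are illustrative, not kubo’s actual filestore format):

```python
import hashlib

CHUNK_SIZE = 256 * 1024


class FileRef:
    """A pointer to chunk bytes that still live in the original file."""

    def __init__(self, path: str, offset: int, length: int):
        self.path, self.offset, self.length = path, offset, length

    def read(self) -> bytes:
        # Fails if the original file has been moved, renamed, or rewritten.
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            return f.read(self.length)


refs = {}  # digest -> FileRef (no second copy of the data is kept)


def add_nocopy(path: str) -> None:
    """Index a file by chunk hashes while leaving the bytes where they are."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            refs[hashlib.sha256(chunk).hexdigest()] = FileRef(path, offset, len(chunk))
            offset += len(chunk)
```

Retrieving a block re-reads the bytes from the original path, which is why moving the file breaks the node’s copy.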

Thanks for your elaboration, this makes sense. In my opinion IPFS sometimes reinvents the wheel, while in other situations it should build better on the shoulders of giants. For example, IPFS and ZFS (and I assume BTRFS too) would be good companions, but many operations between IPFS and ZFS are duplicated; better integration for ‘heavy’ nodes could significantly reduce overhead.

For example, IPFS and ZFS (and I assume BTRFS too) would be good companions

Yes: Make IPFS reflink aware, dedup file storage between IPFS and user downloaded files · Issue #8201 · ipfs/kubo · GitHub 🙂

However, I think reinventing the wheel is a good thing sometimes. For example, in this context, if IPFS just relied on ZFS or BTRFS to dedup files, that wouldn’t work on Windows, wouldn’t work on macOS, and wouldn’t work in the browser.
Plus, those Linux filesystem dedup features only work in specific cases (such as using the kernel-accelerated copy syscalls), which few tools support.
Plus, those filesystems don’t support advanced, costly dedup techniques such as buzzhash.

I think there may be a chunker (not sure if it’s the default) that uses some kind of statistical analysis to make chunking produce a very high number of duplicate chunks (a high dedup factor), even in cases where, say, a large file has a single block of bytes added to the front of it, which would otherwise make every fixed-size chunk hash differently. With that special algorithm/chunker the data is still divided up so that dedup happens a lot.
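That sounds like content-defined chunking. As far as I know the default chunker is still fixed-size (256 KiB), but kubo has opt-in `rabin` and `buzhash` chunkers via `ipfs add --chunker=...`. A rough sketch of the idea using a simplified gear-style rolling hash (the constants and table are made up for illustration, not kubo’s implementation):

```python
import hashlib
import os
import random

random.seed(0)
GEAR = [random.getrandbits(32) for _ in range(256)]  # random value per byte
MASK = (1 << 13) - 1          # -> roughly 8 KiB average chunk size
MIN_CHUNK, MAX_CHUNK = 2048, 64 * 1024


def chunk(data: bytes) -> list:
    """Cut boundaries where a rolling hash of the most recent bytes hits a pattern."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF  # only recent bytes affect h
        length = i - start + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def digests(chunks) -> set:
    return {hashlib.sha256(c).hexdigest() for c in chunks}


original = os.urandom(1 << 20)         # 1 MiB of data
shifted = b"NEW HEADER" + original     # same data with bytes added to the front

a, b = digests(chunk(original)), digests(chunk(shifted))
print(f"shared chunks: {len(a & b)} of {len(a)}")
# With fixed-size chunks the shift would misalign everything and share ~0 chunks;
# here boundaries depend only on nearby content, so most chunks still match.
```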

UPDATE: I found the old discussion where I learned about this: