Thank you for your responses.
users would have time to do it, without substantial incentive.
If there’s little incentive to change those, what’s the rationale for exposing them as switches? Are they perhaps mostly for experimental purposes, with no one actually expected to use them?
So assuming that IPFS doesnt change the default settings frequently
This is a concern of mine. How often can we expect them to change? For one, it seems like the hash algorithm is already expected to change (SHA256 to BLAKE2b). I get that IPFS is likely still in the experimental phase at the moment, but such a change could really impact a real-world deployment.
And this is ignoring potential other implementations.
If the current defaults are not expected to change, shouldn’t they be documented and/or written up as a specification? This would allow others to write alternative implementations that are compatible and don’t end up partitioning the network. The documented hash should be exact, that is, contain no ambiguities or tunables such as a customizable chunking algorithm.
Unfortunately, flexibility and hashing don’t really mix, since hashing has to be exact. That is, the same input must always give the same output hash (and not 10 different hashes depending on settings used). I get that some flexibility may be desired, for example, if a cryptographic hash has been broken, and it seems that the multi-hash format is intended to deal with this, but such changes should be rarely performed, and considered somewhat breaking.
Is there such documentation available, or is it still being worked out, or something else?
Further, I don’t think each file will only exist once in the network (zero duplication)
I interpret “zero duplication” as referring to not having wasteful duplicates (or zero duplication in named resources). That is, if you and I have a copy of the same file, and person C wished to obtain a copy, he can obtain it from either or both of us.
However, if the hash generated by you differs from mine, there is now a duplicate resource on the network. That is, person C can no longer download the file from both of us, and if he tries to download from both hashes (since there’s no way to determine whether the underlying file is identical), he will have two copies of the same file on his disk.
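To make the concern concrete, here is a toy sketch of what I mean. It is not IPFS's actual chunker or DAG layout; `chunk`, `root_hash`, and the two chunk sizes are made up for illustration. It only assumes that a file's identifier is derived from the hashes of its chunks, so changing the chunk size changes the identifier even though the bytes are identical.

```python
import hashlib


def chunk(data: bytes, size: int) -> list[bytes]:
    """Split data into fixed-size pieces (hypothetical chunker, not IPFS's)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def root_hash(data: bytes, chunk_size: int) -> str:
    """Hash each chunk, then hash the concatenated chunk digests.

    This is a simplified stand-in for a Merkle-style root identifier.
    """
    leaf_digests = [hashlib.sha256(c).digest() for c in chunk(data, chunk_size)]
    return hashlib.sha256(b"".join(leaf_digests)).hexdigest()


# The same 1 MiB file, held by two people.
data = bytes(range(256)) * 4096

# Same bytes, different (arbitrary) chunk-size settings.
alice = root_hash(data, 256 * 1024)
bob = root_hash(data, 128 * 1024)

print(alice == bob)  # False: two identifiers for one file

# A store keyed by identifier now holds the same content twice.
store = {alice: data, bob: data}
print(len(store))
```

With identical settings the two identifiers match and the store holds a single entry, which is why I think the defaults need to be pinned down in a specification.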