We do so much better now.
We live in a world where cryptography exists.
1st, is this a good idea? or am I just reinventing the wheel for some solution that is already out there?
Yes it is, and it would work. This is actually a very well-known problem.
Your solution is pretty good, but it has issues: checks can only happen while your database is online, and you as the company can remove entries from the database. In theory, other actors wouldn’t like either of those.
The other way to do it is with a signature.
Instead of having a database of hashes, your company holds a private key that it uses to sign the hashes of correct files, producing “signatures”.
Then when someone wants to check a file’s authenticity, they take their tool, hash the file, and verify the signature against that hash with your public key. If the verification succeeds, the file is correct. (With RSA this is often described as “decrypting” the signature with the public key and comparing the result with the hash; with Ed25519 it is a single verify step.)
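To make the sign-then-verify flow concrete, here is a minimal sketch in Python. It assumes the third-party cryptography package is installed (pip install cryptography); the function names sign_file and is_authentic and the demo data are made up for illustration, and in practice the private key would be generated once and stored securely rather than created on every run.

```python
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def sign_file(private_key: Ed25519PrivateKey, data: bytes) -> bytes:
    """Company side: hash the file contents, then sign the hash."""
    digest = hashlib.sha256(data).digest()
    return private_key.sign(digest)


def is_authentic(public_key: Ed25519PublicKey, data: bytes, signature: bytes) -> bool:
    """Checker side: recompute the hash and verify the signature offline."""
    digest = hashlib.sha256(data).digest()
    try:
        public_key.verify(signature, digest)  # raises if the signature is bad
        return True
    except InvalidSignature:
        return False


# Hypothetical demo: a throwaway key pair and made-up file contents.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

original = b"contents of a correct file"
signature = sign_file(private_key, original)

print(is_authentic(public_key, original, signature))          # True
print(is_authentic(public_key, b"tampered file", signature))  # False
```

Only the public key and the signatures need to be published; the checker never contacts your servers again after downloading them.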
Other parties usually like that more because they only have to download the public key from you once and can then do all checks offline themselves. That avoids downtime (a simple HTTP server cluster easily gets close to 100% uptime, while a database of hashes is more likely to hit ~99%), and if they have a LOT of files to check they are not bound by your database speed: they can just use more servers and parallelize. It’s also better privacy-wise: with the database you can know which person is checking which files, while with signatures you only know that a person downloaded your public key once. You don’t know how many files they check, nor which ones.
However, you as the company might dislike signatures: if you get hacked and the private key is leaked, you can’t revoke bad signatures, and you basically need to start over with a new key pair.
For an example of how signatures work, you can open Index of /debian-cd/current/amd64/iso-cd.
There are 3 types of files:
- *.iso, which contain the actual data.
- SHA{256,512}SUMS, which contain the hashes of those files.
- *.sign, which are signatures of the hash files.
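As a rough illustration of the middle step, checking a downloaded file against a SHA256SUMS-style listing could look like the Python sketch below. It assumes the common "<hex digest>  <file name>" line format; the helper names and the fake example.iso are invented for the demo, and verifying the .sign file itself (normally done with a tool like gpg) is not covered here.

```python
import hashlib
import os
import tempfile


def parse_sums(text: str) -> dict[str, str]:
    """Parse lines of the form '<hex digest>  <file name>' into {name: digest}."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        digest, name = line.split(maxsplit=1)
        # Some tools prefix the name with '*' to mark binary mode.
        sums[name.strip().lstrip("*")] = digest.lower()
    return sums


def sha256_of_file(path: str) -> str:
    """Hash a file in 1 MiB pieces so large ISOs do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def matches_sums(path: str, sums: dict[str, str]) -> bool:
    """True if the file's hash equals the one listed under its base name."""
    expected = sums.get(os.path.basename(path))
    return expected is not None and sha256_of_file(path) == expected


# Hypothetical demo with a tiny stand-in for an ISO image.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.iso")
    with open(path, "wb") as f:
        f.write(b"pretend this is an ISO image")
    listing = f"{sha256_of_file(path)}  example.iso\n"
    ok = matches_sums(path, parse_sums(listing))
    print(ok)
```

Signing the small hash file instead of each ISO means one signature covers every image in the directory.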
2nd, is this doable with IPFS technology?
Yes you can, but there is no real reason to do so.
IPFS doesn’t just hash files, it does a lot of work to make them shareable in the network.
Basically, IPFS cuts your files into 256 KiB chunks (we call them leaves), then it creates metablocks (we call them “roots”) that contain the list of all blocks to download. If those roots get above 256 KiB too, it creates roots that list other roots, until you get to a single block that lists all the original chunks. The hash of that block is the CID.
If you only care about the hash part, just hash the file. The chunking part of IPFS is only useful because we are sending chunks over the network: it allows us to download different chunks from multiple nodes at the same time.
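To show why a CID is not simply the hash of the file, here is a toy Python sketch of the chunk-then-link idea. It is a deliberate simplification and does NOT produce real IPFS CIDs: actual IPFS wraps blocks in its own encoding, uses its own fan-out limits and formats the final hash as a CID. The constants and the root_hash name are assumptions made for the illustration.

```python
import hashlib

CHUNK_SIZE = 256 * 1024              # 256 KiB leaves, as described above
DIGEST_SIZE = 32                     # SHA-256 output length in bytes
FANOUT = CHUNK_SIZE // DIGEST_SIZE   # child hashes that fit in one 256 KiB block


def root_hash(data: bytes) -> bytes:
    """Toy version of chunk-then-link hashing; NOT a real IPFS CID."""
    # Leaves: hash each fixed-size chunk of the file (one empty leaf if no data).
    level = [
        hashlib.sha256(data[i:i + CHUNK_SIZE]).digest()
        for i in range(0, max(len(data), 1), CHUNK_SIZE)
    ]
    # Roots: hash lists of child hashes until a single block remains.
    while len(level) > 1:
        level = [
            hashlib.sha256(b"".join(level[i:i + FANOUT])).digest()
            for i in range(0, len(level), FANOUT)
        ]
    return level[0]


# Hypothetical demo: a file slightly larger than three chunks.
data = b"x" * (3 * CHUNK_SIZE + 1)
plain = hashlib.sha256(data).digest()
print(root_hash(data) == plain)  # False: the tree root differs from the plain hash
```

For pure authenticity checks the plain hash is enough; the tree only pays off when chunks are fetched independently.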
Or should I be looking for some other technology to do this with?
The tech you need is called SHA256 (for hashing) and Ed25519 (for signing). There are debates about which is the “best” one, but those are very strong bets.
Crypto is hard, but realistically this completely falls under the category of things that are faster to build than to search for an existing tool that does them. (Note: I don’t recommend you implement SHA256 and Ed25519 yourself, but those algorithms are figured out. Just google “Sha256 <your programming language>” and you will likely find the best result in the top 3 Stack Overflow links; same for Ed25519.)
3rd, anybody out there think they can do this? Are you interested in talking about it further?
I could. Basically: hash and sign on a server in your company.
And in the website part: hash the file, verify the signature and check that they match.
It’s like 1~3 days of work for a PoC.
The biggest questions are how your existing software is going to interact with the database or signing servers, and how you are going to protect them from hackers. That will take up to a few weeks or months of work (and you can hire a front-end developer to make it pretty in the meantime).