I am working on a similar project in Postgres. I will need to rewrite some things, I think, but I am sure it is feasible.
As of right now, I don’t know if it is the best choice, but what I can tell you is that you will need a lot of space, and I mean a lot.
Running ipfs-cluster, maybe just on the primary database node, adding files with
ipfs-cluster-ctl add <file> --expire-in 99h and running the garbage collector
(ipfs-cluster-ctl ipfs gc) frequently might be a good choice. But what about data files that are never changed, like the system database files? Those files would need to stay reachable under a stable name, and IPNS might work for that: whenever a system database file is edited by the DBMS, it would have to pin the new version and publish the new hash in association with the IPNS key.
And then, once the new version has been added, you can unpin the old one and cron the garbage collector to run every day.
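The pin-and-republish step above could look something like this. It is a dry-run sketch: it only prints the commands it would run (drop the echoes to execute them), and the file path, key name, and CIDs are placeholders, not real values. The commands themselves (`ipfs-cluster-ctl add`, `ipfs name publish`, `ipfs-cluster-ctl pin rm`) are real, though flags can differ between versions.

```shell
#!/bin/sh
# Dry-run sketch: when the DBMS rewrites a system database file, pin the
# new version across the cluster, republish its hash under a stable IPNS
# key, and unpin the previous version so the next GC can reclaim it.
# All arguments are placeholders.

publish_version() {
  file=$1; key=$2; old_cid=$3
  # 1. Add + pin the new version on the cluster (prints the new CID).
  echo "ipfs-cluster-ctl add --quiet $file"
  # 2. Point the stable IPNS name at the new CID.
  echo "ipfs name publish --key=$key /ipfs/<new-cid>"
  # 3. Unpin the previous version, if there was one.
  if [ -n "$old_cid" ]; then
    echo "ipfs-cluster-ctl pin rm $old_cid"
  fi
}

publish_version /var/lib/db/system.dat system-db QmOldVersionCid
```

The IPNS key here is assumed to have been created beforehand (e.g. with `ipfs key gen system-db`), so readers always resolve the same name to the latest pinned version.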
But there are some things I don’t think we can use IPFS for.
1 - When you run queries on your DBMS, you sometimes need temporary files for the query to compute. They need to be really fast, and since they are temporary, I don’t think I would add them to an IPFS node.
2 - Assume you edit a tuple; this tuple is stored in a data file. Once the file is edited, you will add the new version to your node and unpin the last version (or not; it depends on your use case, maybe for going-back-in-time scenarios). For a while you will have both versions of the file. What happens when you have 4 GB of data modified in the same day, or 400? What if you run an IPFS cluster?
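A back-of-the-envelope way to see the problem in point 2: if the previous version of every edited data file stays pinned until a nightly GC, the churn roughly doubles on disk until collection runs. The figures below are assumptions for illustration, not measurements.

```shell
#!/bin/sh
# Rough space estimate for keeping old versions pinned until a nightly GC.
# The daily churn figure is an assumption.
churn_gb=400                      # data-file bytes rewritten in one day
extra_gb=$((2 * churn_gb))        # old + new copies pinned at the same time
echo "up to ${extra_gb} GB of churn held on disk before the nightly gc"
```

And that is per replica: with a cluster whose replication factor pins each file on several peers, multiply again by that factor.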
3 - I think performance will be a very big problem. First, on the standbys: if they have not pinned the file, they will need to download it from the primary node, and it might take some time just to find it. With a cluster this might not be a problem, but then you will need to unpin the edited file not only on the primary but on the standbys as well, run the garbage collector on all your nodes, and have some system for letting your nodes know which files to unpin. Second, on the primary: what if two sessions access different tuples that are stored in the same data file? You will need to add some kind of locking mechanism, and that is, for me, the hardest part.
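For the cleanup part of point 3, this is where a cluster actually helps: `ipfs-cluster-ctl pin rm` is propagated to every peer, and `ipfs-cluster-ctl ipfs gc` asks all peers to collect, whereas without a cluster you would have to reach each standby yourself. A dry-run sketch of both cases (it only prints the commands; CIDs and hostnames are placeholders):

```shell
#!/bin/sh
# Dry-run sketch contrasting cleanup with and without ipfs-cluster.
# Everything is echoed rather than executed; arguments are placeholders.

cleanup_with_cluster() {
  cid=$1
  echo "ipfs-cluster-ctl pin rm $cid"   # unpin is propagated to every peer
  echo "ipfs-cluster-ctl ipfs gc"       # triggers GC on all cluster peers
}

cleanup_without_cluster() {
  cid=$1; shift
  for host in "$@"; do                  # standby hostnames (assumed)
    echo "ssh $host ipfs pin rm $cid"
    echo "ssh $host ipfs repo gc"
  done
}

cleanup_with_cluster QmEditedFileCid
cleanup_without_cluster QmEditedFileCid standby1 standby2
```

The locking problem from point 3 is not touched here at all; this only covers telling the nodes what to unpin and when to collect.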
Transaction logs and database backups won’t be a problem.