IPFS and multicloud

I think IPFS is a good FS for storing data in the open science ecosystem. However, at least in Germany, this is not yet on the roadmap, cf.

Marius Alfred Dieckmann, Alexander Goesmann, Kilian Schwarz, Christoph Wissing, Alexander Sczyrba, & Sebastian Jünemann. (2022). NFDI Multi Cloud concept (1.0.1). Zenodo. NFDI Multi Cloud concept

I would like to promote the idea of using IPFS to store data in the open science context. To do that, however, one needs a solution that is easy to use for non-technical users. Pinning services like web3.storage or Filebase are really user-friendly and demonstrate that this would be an option, but I think it would be better to also show that less commercial tools can help users work with files on IPFS in the same way they are used to working with files today. In the scientific community (in Germany), Nextcloud has become quite popular recently. However, the Nextcloud plugin

is broken at the moment. Before considering a fix, I was wondering whether there are tools similar to Nextcloud that support reading and writing files to IPFS.

Moritz


Willkommen, Moritz! I think the repo you found was a proof-of-concept or prototype; all of its commits seem to have been made within a two-week period. AFAICT it wasn’t a project of the IPFS core developers, or even done with any help from them, just an experiment.

I also think that in the 3 years since, much better tooling has emerged for running a Kubo service within a private network (say, a Nextcloud!) to store private files and sync them with, e.g., a backup service over time. I would distinguish between the “private network” use-case (closer to most Nextcloud on-prem/self-host use-cases) and the classic “public DHT” use-case (for, e.g., publishing and archiving open science data sets). If you have the bandwidth to embark on something big, I would recommend setting up two different modules, one for each: a Kubo node as an internal “bucket storage” service, and a public gateway that “pins” (hosts, persists, publishes) only what has already been synced to it from the other one, as a kind of “pinning service” for trusted content. (One common way of derisking such a service is to set the Gateway.NoFetch flag to true, so that the gateway serves only content it already holds locally and does not fetch anything else from the network.) Perhaps that is more work than you were volunteering to do by updating @justicenode’s module, but it is also the 2024 configuration that I would most recommend to the Nextcloud self-hosting/own-your-own-infrastructure crowd (and universities!). Who knows, maybe they would be willing to help! Shared work is always more fun.
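As a rough illustration of the Gateway.NoFetch suggestion, here is a minimal Python sketch that enables the flag on a Kubo config that has already been parsed from JSON. The helper name `with_no_fetch` and the example config excerpt are made up for illustration; only the `Gateway.NoFetch` key itself comes from Kubo's configuration.

```python
import copy
import json


def with_no_fetch(config: dict) -> dict:
    """Return a copy of a parsed Kubo config with Gateway.NoFetch enabled."""
    updated = copy.deepcopy(config)
    # setdefault keeps any other Gateway settings that are already present.
    updated.setdefault("Gateway", {})["NoFetch"] = True
    return updated


# Hypothetical excerpt of a Kubo config file, for illustration only.
example = {
    "Addresses": {"Gateway": "/ip4/127.0.0.1/tcp/8080"},
    "Gateway": {"NoFetch": False},
}

hardened = with_no_fetch(example)
print(json.dumps(hardened["Gateway"]))
```

On a real node the same change is usually made from the command line with `ipfs config --json Gateway.NoFetch true`, followed by a restart of the daemon, rather than by editing the file by hand.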

Reach out if I can be of any help! I’m @bumblefudge on GitHub and in the Filecoin public Slack.


Thank you. I found the link to the Nextcloud plugin here

Can you suggest alternatives?

We are currently facing challenges with Nextcloud.

  1. Due to organization policies, it is often impossible to create accounts for all users who should have access to the data. Thus, the files (primarily small documents) are shared publicly. (For research purposes, this is not a severe problem, as no confidential data is shared, only data that is often premature or “work in progress”.)

  2. If the data structure within Nextcloud is reorganized, links break, and external users cannot return to the files they used.

If Nextcloud is not the way to go:

  • As of December 2023, are there IPFS/Filecoin-backed open source alternatives to Nextcloud, Dropbox, OneDrive, and Google Drive that skilled IT admins can install themselves? These admins can contribute stable nodes with sufficient compute and storage resources, but cannot buy services from external companies like Google, Microsoft, or even SMEs.

Oh, sorry, I think I misunderstood the original question. Whether you end up staying within the framework of Nextcloud and dockerizing a Nextcloud-friendly version of Kubo or just running it independently, I think my advice probably still holds:

  1. Standing up an “internal” Kubo “server” running inside your perimeter/lab/identity system is the best place to start, so that participants you trust can upload data for internal sharing. (Sidenote: w3up might be helpful if you want to use CAR files to structure and future-proof large and multi-part uploads into single CIDs.)
  2. A public gateway that a subset of the Kubo node’s contents gets synced to for publishing might also be worth standing up if you want datasets or final publications to be archivally and globally accessible.
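The sync step between the two nodes could be sketched roughly as follows. This is only a sketch under stated assumptions: the “approved for publication” set, the helper names `plan_sync` and `pin_add_url`, and the CIDs are all hypothetical placeholders. The only real interface used is Kubo's RPC endpoint `/api/v0/pin/add`, which listens on port 5001 by default and expects a POST request.

```python
from urllib.parse import urlencode


def plan_sync(approved: set[str], already_pinned: set[str]) -> list[str]:
    """CIDs approved for publication that the public node has not pinned yet."""
    return sorted(approved - already_pinned)


def pin_add_url(api_base: str, cid: str) -> str:
    """Build the Kubo RPC URL for pinning one CID (to be sent as a POST)."""
    return f"{api_base.rstrip('/')}/api/v0/pin/add?{urlencode({'arg': cid})}"


# Placeholder identifiers, not real CIDs.
approved = {"bafy-dataset-a", "bafy-paper-b"}
already_pinned = {"bafy-paper-b"}

# Print the requests the public node would still need; nothing is sent here.
for cid in plan_sync(approved, already_pinned):
    print(pin_add_url("http://127.0.0.1:5001", cid))
```

For the pin to succeed, the public node must be able to fetch the blocks from the internal node, for example via a direct peering connection. The RPC port should stay reachable only from trusted hosts, since it gives full control over the node.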

As to your question about whether a reasonably skilled admin can do this kind of stuff without too much intervention or phoning a friend, I think it’s always tricky to estimate the level of work or the skill prerequisites. In some cases, people get the basic stuff up and running in an afternoon; in other cases, no amount of support is enough :sweat_smile:. Historically, tutorials, docs, and dockerizations weren’t really sufficient to hand off to the average admin (if the latter is expected to conform to university-level security requirements and integrate with a complex identity/authN system), but the docs have been improving at a good rate over the last few months and hopefully will keep improving into the new year.

In any case, poke around, talk to an admin or two, and report back if you have more specific questions.

This other thread may be helpful (if not yet, then soon):
