Hi everyone, I have an application question and am looking for general feedback or to initiate a discussion.
I cofounded our-sci.net and photosynq.org, two projects focused on empowering communities to build research capacity. That means large groups (100s or 1000s of people) should be able to design and implement experiments, collect comparable, validated data, and analyze the results to address research questions of interest to the community. The Gathering for Open Science Hardware (openhardware.science) community has overlapping needs as well.
Data is collected both as sensor data (from devices connected to your phone/computer over USB or Bluetooth) and as survey data entered by the user. In PhotosynQ we built a survey app and backend from scratch (4 years ago), but today using Open Data Kit (opendatakit.org) is a much smarter way to go for data collection. The data is stored as XML or JSON, and contains metadata like user, location, time/date, etc.
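To make the record shape concrete, here is a minimal Python sketch of what one submission might look like as JSON. The field names (user, location, timestamp, sensor, survey), the helper build_record, and all the values are made up for illustration; this is not the actual PhotosynQ or Open Data Kit schema.

```python
import json
from datetime import datetime, timezone


def build_record(user, lat, lon, sensor, survey, when=None):
    """Bundle one measurement with its metadata (hypothetical field names)."""
    when = when or datetime.now(timezone.utc)
    return {
        "user": user,
        "location": {"lat": lat, "lon": lon},
        "timestamp": when.isoformat(),
        "sensor": sensor,  # readings from a USB/Bluetooth device
        "survey": survey,  # answers entered by the user
    }


# All values below are invented placeholders.
record = build_record(
    user="example_user",
    lat=42.28,
    lon=-83.74,
    sensor={"device": "handheld-01", "value": 0.71},
    survey={"plant_height_cm": 12, "notes": "healthy"},
    when=datetime(2017, 1, 1, 12, 0, tzinfo=timezone.utc),
)

# Serialize the record the way it might be saved on the phone.
payload = json.dumps(record, indent=2, sort_keys=True)
print(payload)
```

Each submission is a small, self-describing file like this, so the total volume is driven mostly by the number of participants and measurements rather than by the size of any single record.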
So far, our approach has been the standard SaaS approach: we create a server (RoR + PostgreSQL) and let users collect/submit/analyze/discuss on it. The server and development cost money, so either we charge users for server access (which reduces access), or we let users fork and set up their own server (which is beyond the capacity of most communities and requires lots of documentation effort and support to make reasonable).
I hate this “pay” or “host” model. It seems IPFS could really change the game by eliminating server costs and dramatically expanding access…
Am I right? I can see some issues right off the bat:
It’s unclear from reading through the documentation whether IPFS automatically backs up data in other locations in the network. So if I collect data (stored as data1.json on my phone), does it automatically back up parts or all of that on other devices on the IPFS network? Of course, when I go home to look at the data, I would download it to my local machine, but is that file replicated on other machines by default as soon as it’s created?
Let’s imagine a community wants to use IPFS for community-led research projects. And let’s suppose we built a JS/HTML program, located on IPFS, which they can use to collect data, collaborate, analyze data, post results, etc. We’d probably also need an Android app for data collection. At that point, is it as easy as saying “OK, everyone in the community who wants to take part, install these programs and off you go! Data is automagically distributed to those in the network and scales up to mega / giga / terabytes”? Besides adding features and fixing bugs, are there other issues/costs that I’m missing on the developer’s side, or other downsides for the users?
… I’m sure there are more.
Just to elucidate things further, here are some actual use cases.
Someone in Ukraine wants to research the best ways to repopulate native plants in the forest, which are currently grown only in universities. It’s surprisingly hard to transfer the plants without killing them, so the research involves identifying the best transplant methods for each plant, and identifying the hardiest varieties which can survive the transplant process. That means lots of individual tests where people collect plant health data (mostly survey questions, maybe some handheld sensor data) and outcomes (transplant worked / not) on many species using many transplant methods. Data is collected on the phone, and analyzed on the computer.
Someone in the US wants to develop new open source seed varieties by growing new seed varieties (crosses) in gardens across the US. Gardeners would collect data (survey, maybe some sensor) about the plant as it grew, collect the finished seed, and send it on to someone else to plant and measure. This data would be analyzed to identify which breeds performed best in which soil/climate/management conditions, to create a matrix of optimized seed varieties. Data is collected on the phone, and analyzed on the computer.
I’m intentionally simplifying things, so please don’t point out every flaw in these projects, but hopefully this gives you an idea.
I’d love any thoughts, feedback, or interest.