Standard for IPFS over sneakernet?

jphastings · June 3, 2024, 12:27pm

Hi folks! Is there a standard for using physical media as the medium for IPFS transport? i.e. a standard for IPFS over sneakernet?

I’d like to build a mechanism for low bandwidth communities to share megabytes-to-gigabytes of data with all the benefits of IPFS.

My current sketch-of-a thought is:

Create a standard disk format for “I’m requesting the file with ” requests.
Standardize on (a) common disk format(s) that are understood by most machines/OSes natively, e.g. FAT32
Decide to keep all complete UnixFS files being transferred in directories on the disk (rather than inside a CAR file, so they’re accessible on machines without IPFS tooling; more below)
Standardize on a normally hidden area on disks like this which contains:
- A list of “requests” being made by previous users of the disk
- All the data chunks that aren’t complete UnixFS files (eg. Partially complete transfers, other IPLD entries etc; all associated with their CIDs, just like within an IPFS node)
- A set of pointers associating CIDs with the aforementioned complete UnixFS files & their location on the disk. (To avoid duplicates, but keeping the trustless assurances of IPFS)
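To make the "hidden area" idea concrete, here is a minimal sketch of one possible on-disk layout. Everything named here is hypothetical and not part of any existing standard: the `.ipfs-sneakernet` directory name, the `requests.json` and `pointers.json` files, and the `blocks/` directory are all assumptions made for illustration.

```python
import json
from pathlib import Path

# Hypothetical name for the "normally hidden area"; not an existing standard.
HIDDEN_DIR = ".ipfs-sneakernet"


def init_hidden_area(disk_root):
    """Create the hidden area (if missing) and return its path."""
    root = Path(disk_root) / HIDDEN_DIR
    # blocks/ would hold chunks that are not complete UnixFS files, keyed by CID.
    (root / "blocks").mkdir(parents=True, exist_ok=True)
    for name, empty in (("requests.json", []), ("pointers.json", {})):
        path = root / name
        if not path.exists():
            path.write_text(json.dumps(empty))
    return root


def add_request(disk_root, cid):
    """Record that someone who used this disk wants the data behind `cid`."""
    path = init_hidden_area(disk_root) / "requests.json"
    requests = json.loads(path.read_text())
    if cid not in requests:
        requests.append(cid)
        path.write_text(json.dumps(requests))
    return requests


def add_pointer(disk_root, cid, relative_path):
    """Associate a CID with a complete file stored in the visible part of the disk."""
    path = init_hidden_area(disk_root) / "pointers.json"
    pointers = json.loads(path.read_text())
    pointers[cid] = relative_path
    path.write_text(json.dumps(pointers))
    return pointers
```

JSON is used only because it is easy to inspect by hand; a real format would need to consider atomic updates and much larger request lists.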

With this standardised, a machine with a running IPFS node (or similar program) could check inserted disks for the “hidden area” mentioned above, and (if one exists) search the local IPFS peers for the CIDs described in the requests, adding the resulting data to the disk appropriately if found. Such a disk could be shared between low bandwidth communities, collecting unfulfilled requests from participating machines, and (later) providing the requested data for participating or non-participating machines.
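The check-and-fulfil step might look roughly like the sketch below. It assumes a hypothetical `.ipfs-sneakernet/requests.json` file holding a JSON list of CIDs, and it takes a `fetch_block` callable standing in for "ask the local IPFS node and its peers"; neither is an existing API, and a real tool would fetch and verify a whole DAG rather than a single block.

```python
import json
from pathlib import Path


def fulfil_requests(disk_root, fetch_block, hidden_dir=".ipfs-sneakernet"):
    """Try to satisfy the requests stored on an inserted disk.

    fetch_block(cid) is a stand-in for the local node: it returns the bytes
    for `cid`, or None if the data is not available right now.
    Returns the list of CIDs that were fulfilled.
    """
    hidden = Path(disk_root) / hidden_dir
    requests_file = hidden / "requests.json"
    if not requests_file.exists():
        return []  # not a sneakernet disk, nothing to do

    blocks = hidden / "blocks"
    blocks.mkdir(exist_ok=True)

    fulfilled, still_pending = [], []
    for cid in json.loads(requests_file.read_text()):
        data = fetch_block(cid)
        if data is None:
            still_pending.append(cid)  # keep it for the next machine
        else:
            (blocks / cid).write_bytes(data)
            fulfilled.append(cid)

    # Unfulfilled requests stay on the disk so they travel onwards.
    requests_file.write_text(json.dumps(still_pending))
    return fulfilled
```

Whether fulfilled requests should be dropped from the list, as here, or kept so the requester can see what arrived is a design decision the standard would have to make.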

The added benefit (beyond just keeping a kubo node’s data directory on the disk) is that this disk works as expected on machines which don’t speak IPFS.

Is anyone else thinking about this? Any obvious issues with the sketch I’ve laid out? Anyone else interested?

Jorropo · June 3, 2024, 12:40pm

exFAT is pretty good in that regard: it’s better than FAT32 and is its successor.
It supports files bigger than 4GiB and has transactions on metadata, so you shouldn’t lose the whole FS in case of power failure or similar, just the files that were being written.

Storing only the deserialized files prevents incremental verification, so range requests, streaming video, streaming copies, … wouldn’t work. The smart thing to do is to use the --raw-leaves option (which should be the default now or soon); then you can have both the deserialized file and a CAR file: the CAR file contains the dag-pb proof blocks, and the raw blocks can be read from the non-IPFS file.

jphastings · June 3, 2024, 2:28pm

use --raw-leaves option

Awesome, yes, that’s exactly the kind of thing I’m thinking of: provide all the benefits of IPFS’ metadata while making the UnixFS files available to dumb exFAT clients. (exFAT is also an excellent suggestion btw, thank you!)

A lot of this seems “obvious” right now, which is a good sign for an achievable project — and a strong endorsement of IPFS’ design!

zacharywhitley · June 3, 2024, 3:11pm

How do you separate the dag-pb proof blocks from the raw blocks when creating the car file? Wouldn’t you also have to be careful to make sure the raw file was added with the same parameters that were used on the file to generate the dag-pb?

Jorropo · June 3, 2024, 3:12pm

This doesn’t exist yet, but it’s definitely achievable: all the “tech” already exists.
You need to write the glue and decide what the UX is going to be.
If you want to use Kubo, you can imagine something like ipfs ext-volume mount /path/to/your/usb/stick/ipfs.

Jorropo · June 3, 2024, 3:47pm

There is no existing code to create such CAR files, but it is about 20 lines of code using go-car and boxo, 15 of them boilerplate. Nothing in the CAR format requires you to include all the blocks; you store whatever blocks you want.

This is an issue in the inverse case: without the dag-pb proofs you need to precisely repeat all the hashing steps to check the CID.
With the dag-pb proofs you only need to know how to map raw-CID → filename, offset, length.
You could imagine something smart that uses the 1-to-1 correspondence between the UnixFS file and the deserialized file, so if you need bytes 1024-2048 of path/to/file you open path/to/file on the disk. The problem is that this information exists in the higher levels of the stack; the blockstore layer only deals in blocks and does not know about it.
So it would require writing more code and some rearchitecting.
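As an illustration of the raw-CID → filename, offset, length mapping, here is a minimal sketch in Python rather than Go. It assumes the file was added with raw leaves, sha2-256, and a fixed-size chunker (262144 bytes is used as the default here); if the original import used different parameters, the computed CIDs will not match the dag-pb proofs. The function names are made up for this example.

```python
import base64
import hashlib

# Assumed fixed-size chunker; must match whatever was used at import time.
CHUNK_SIZE = 262144


def raw_leaf_cid(block):
    """CIDv1 string for a raw block hashed with sha2-256 (base32, 'b' prefix)."""
    # 0x01 = CID version 1, 0x55 = raw codec, 0x12 0x20 = sha2-256 with a 32-byte digest
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + hashlib.sha256(block).digest()
    return "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")


def build_leaf_map(path, chunk_size=CHUNK_SIZE):
    """Map each raw-leaf CID of the file to (filename, offset, length)."""
    leaf_map = {}
    offset = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            leaf_map[raw_leaf_cid(block)] = (str(path), offset, len(block))
            offset += len(block)
    return leaf_map


def read_leaf(leaf_map, cid):
    """Read a leaf block straight from the plain file and verify it against its CID."""
    filename, offset, length = leaf_map[cid]
    with open(filename, "rb") as f:
        f.seek(offset)
        block = f.read(length)
    if raw_leaf_cid(block) != cid:
        raise ValueError("file contents no longer match " + cid)
    return block
```

Identical chunks hash to the same CID, so a file with repeated content ends up with fewer map entries than chunks; any one location is enough to serve the block.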

The simplest solution is to implement the blockstore.Blockstore interface and store the leaf map in an embedded database such as SQLite, badger, or bbolt (pick your favorite).
Note: if you do that, only implement the read-only side; for write mode it will be easier to parse the UnixFS files yourself.

jphastings · June 3, 2024, 4:20pm

careful to make sure the raw file was added with the same parameters that were used on the file to generate the dag-pb

Absolutely; as I see it, the metadata I’d be storing in the “hidden area” would include all the IPFS parameters the device is using (though the CID contains most of this information as part of the multihash).

zacharywhitley · June 4, 2024, 12:02pm

Thanks. Great info. I’m not familiar with the code, but is that basically what urlstore and filestore are doing? The mapping part at least.
