Encryption, private data, and private swarms with IPFS

Using IPFS for user apps with user data typically requires encryption. Since IPFS does not natively provide encryption, developers need to bring their own solution.

Encryption of user data is a broad topic, and with user-owned data, this is even more challenging as it typically requires some form of key management.

Tools, libraries and services offering encryption over IPFS

In the last couple of months, I came across several tools and approaches to this problem, so I’m trying to curate a list of tools, libraries, and services that address this.

Libraries and tooling built on top of IPFS/IPLD

  • WNFS is a filesystem built on top of IPFS by the Fission team. WNFS is pretty clever and uses a unique symmetric encryption key for each file and directory while encapsulating the encryption keys in the actual IPLD nodes. This concept is known as a Cryptree (more on this in this talk: https://youtu.be/3se17NAS-Lw?t=435).
  • Peergos is another filesystem built on top of IPFS that also implements Cryptrees, the same pattern used by WNFS.
  • Ceramic created the dag-jose codec for IPLD which allows storing encrypted data.

Usability

WNFS is the only one I’ve tried, thanks to some help I got from the Fission team.

The nice thing about WNFS is that it works in browsers, and Fission’s work on WalletAuth means that the encryption keys for the private filesystem are derived from the user’s blockchain wallet, e.g. via MetaMask.

As far as I know, WNFS is agnostic to where you derive the root key from, and they also have an example that uses the WebCrypto API to generate non-extractable keys in the browser. In this case, when using multiple devices, each device has its own key and access is delegated with UCANs.
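
To make the "derive the root key from something the user already has" idea concrete, here is a minimal Python sketch. It is not how WalletAuth or WNFS actually does it (I haven't checked their derivation); it only shows one generic approach: feed a deterministic wallet signature through HKDF (RFC 5869) to get a 32-byte symmetric key. The signature bytes, salt, and info label are all made-up placeholders.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand to `length` bytes."""
    # Extract: condense the input keying material into a pseudorandom key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output bytes are available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Hypothetical stand-in for the signature a wallet returns for a fixed message.
wallet_signature = bytes.fromhex("ab" * 65)

root_key = hkdf_sha256(
    ikm=wallet_signature,
    salt=b"example-app-salt",         # assumption: an app-specific constant
    info=b"private-filesystem-root",  # assumption: a domain-separation label
)
print(root_key.hex())
```

This only works if the wallet produces the same signature for the same message every time, which is an assumption you would need to verify for the wallet in question.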

APIs and services offering encryption with IPFS

Unlike the projects above, the following are hosted services that handle encryption for you. I haven’t actually tried any of these, but they seem suitable for trusted setups where you want to ensure data isn’t public by default, but are willing to delegate trust to the service to manage encryption (in a somewhat similar way to how you use Google Drive and Dropbox):

  • Fileverse: an encrypted user storage and sharing app built on top of IPFS that integrates with crypto wallets.
  • ChainSafe Files: an encrypted user storage/sharing app built on top of IPFS.
  • Lighthouse Storage: a hosted IPFS service with encryption. They manage the keys for you. They have an SDK, which means you can use this programmatically in apps.

Private IPFS Swarms

Another thing that’s worth flagging is private IPFS swarms.

Private IPFS swarms do not encrypt the data itself. Instead, they limit network participation and communication to nodes that share a secret key.

From the config docs:

It allows ipfs to only connect to other peers who have a shared secret key.

Some more details about private IPFS swarms from a GitHub comment:

Basically, everyone on the network uses the same symmetric key to encrypt all traffic (on top of the other encryption we do). This means you can’t join without this symmetric key.
Forward secrecy: connections are already encrypted and secured with a Diffie-Hellman handshake before they’re re-encrypted with this shared secret. So yes, it does have forward secrecy.
However, if you leak the secret key, anyone with access to the secret key can now join the network unless you rotate the secret key first.
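
For reference, in Kubo the shared secret lives in a `swarm.key` file in the node's repo directory (`$IPFS_PATH`, `~/.ipfs` by default). Here is a minimal Python sketch that generates one in the three-line format Kubo expects: a protocol header, an encoding line, and 32 random bytes as hex. The function name is my own, and distributing the file to every node is left to you.

```python
import secrets


def generate_swarm_key() -> str:
    """Return the contents of a swarm.key file with a fresh 256-bit secret."""
    key_hex = secrets.token_hex(32)  # 32 random bytes, hex-encoded (64 chars)
    return f"/key/swarm/psk/1.0.0/\n/base16/\n{key_hex}\n"


swarm_key = generate_swarm_key()
print(swarm_key, end="")
```

Write the output to `swarm.key` on each node that should be part of the swarm. Rotating the secret means generating a new file and replacing it on every node.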


Any services, tools, or libraries I missed? Let me know by replying! I’d love to know.


Another user app that I forgot to mention is https://fileverse.io/


WNFS seems really nice. Thanks for sharing, this is a great topic. Have you found any information on how to use dag-jose with Kubo? I saw this, but haven’t yet found an example of how it’s done with IPLD.

The link to the ceramic.network blog post on dag-jose doesn’t open here (other links work), copying the URL manually works though, thanks.


Just came across this guide and example repo. It delegates encryption to GPG.


I haven’t yet, but maybe someone from the Ceramic team could chime in.

Hi Daniel, the frameworks look very interesting, thanks for researching. Do you perhaps also know whether these frameworks allow changing the encryption?

We are looking for ways to host private data on IPFS where ownership of the data can change. We are thinking of changing the encryption when the data changes owner, so that the new owner can access the data while the old owner no longer can.

Looking forward to your thoughts.

My company is working on a solution that posts a “key file” to IPFS, which is a binary blob of asymmetrically encrypted keys that allows one-to-many / many-to-many encryption without payload replication. Each file is encrypted with a symmetric key, and that key is then asymmetrically encrypted using a few different schemes that allow one or many users to decrypt the symmetric key. Rekeying is as easy as re-encrypting the file with a new key, then regenerating the key file based on the new rule sets / keys.


Very interesting. Do you have a link where I can find more information about it? You can also send me a PM if you prefer that.

Some of the libraries we are developing are at:

Anything with “key” in the repo name. Unfortunately they are un[der]documented, but sometime in the near future we will be pulling it all together.


Dropping this here following a conversation for further investigation:

  • Privy
  • Medusa
  • Arcana

This is pure gold, thanks so much for collecting resources here.

Broadly, seems like there are two options today:

  • Putting encrypted data on public IPFS network
  • IPFS private swarm (persistent, transient, whatever)

The former needs a comprehensive write-up on risks, trade-offs, and threat model. It was a non-starter for some folks designing standards for personal data stores in the W3C+DIF collaboration group.

The latter needs comprehensive documentation and examples that are not solely Kubo. E.g., you could have web pages on two separate computers connected over an E2E encrypted connection (WebSocket, WebRTC, etc.) sharing unencrypted content-addressed data… and is that an IPFS private swarm if it never went on the IPFS DHT?


That’s a good summary as far as my understanding goes.


Regarding IPFS private swarms, it’s worth pointing out that they come with trade-offs of their own. There was recently a proposal to deprecate pnet, and the discussion includes many interesting insights about private swarms.

The TL;DR of that thread (from my understanding):

  • Double encryption is inefficient
  • Key management of a symmetric key can be a pain as the number of nodes in the swarm grows
  • It isn’t supported by all protocols (only TCP and WebSockets)

Agreed

I think private swarm is the term for this specific feature. But I agree that there are other ways to exchange CIDs privately that we should maybe address at some point in the docs.

FWIW, we know that not all pinning services publish to the DHT but their content is still discoverable with the help of Bitswap WANTs.

Here’s another npm package that uses AES encryption to store files on IPFS


One thing I’d like to prefix this with: if you are writing an app, don’t use an encryption library that hasn’t been audited by cryptographers. Look for public audits. Also, if you’re doing anything non-trivial cryptographically, get it audited before telling people to use it.

I can also expand on Peergos, as you’ve mentioned us.

We implement cryptree+, which has much better privacy properties than the original cryptree. In particular, we don’t make ciphertext public; you need to authenticate to retrieve it. This is enforced with an extension to Bitswap, which you can read about here:
https://peergos.org/posts/bats

This means we create a third category which is:

  • private encrypted data available with auth via the public DHT

We also have a unique sandbox that lets anyone write apps using Peergos, without needing to worry about storage, encryption, identity, or access control. An app has a simple REST API which is handled locally in the browser using service workers. This means users own their identity and data, and can take data between apps easily. The apps are also sandboxed to protect against data exfiltration. The idea is that you should be able to run an untrusted app over your private data and not worry about it stealing the data or tracking you. You can read more about this here:
https://peergos.org/posts/a-better-web

All our data is stored in IPLD (dag-cbor). You have 2 encryption keys for each file or directory (one for the data, and one for the metadata). We have a super cool feature which is zero-IO seeking within a file. This means that you can seek within an arbitrarily large file without doing any intermediate IO. This is achieved without the server knowing which encrypted chunks are part of the same file.

Peergos works in browsers, and the browser client treats the server as untrusted - verifying hashes and signatures on anything it receives. You can log in through any device. Your keys are derived at login.
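
To illustrate the hash-verification part in general terms (this is not Peergos code, just a minimal Python sketch of the idea): content-addressed blocks are identified by a multihash, which for sha2-256 is the two bytes `0x12 0x20` followed by the 32-byte digest. A client that doesn't trust the server recomputes the digest of whatever bytes it receives and compares. The function names here are my own, and signature checks are left out.

```python
import hashlib
import hmac

SHA2_256_CODE = 0x12  # multihash function code for sha2-256
SHA2_256_LEN = 0x20   # digest length in bytes (32)


def sha256_multihash(block: bytes) -> bytes:
    """Return the sha2-256 multihash (code, length, digest) of a block."""
    return bytes([SHA2_256_CODE, SHA2_256_LEN]) + hashlib.sha256(block).digest()


def verify_block(block: bytes, expected_multihash: bytes) -> bool:
    """Check that received bytes match the multihash they were requested by."""
    if expected_multihash[:2] != bytes([SHA2_256_CODE, SHA2_256_LEN]):
        raise ValueError("only sha2-256 multihashes are handled in this sketch")
    return hmac.compare_digest(sha256_multihash(block), expected_multihash)


wanted = sha256_multihash(b"encrypted chunk bytes")
print(verify_block(b"encrypted chunk bytes", wanted))  # matching block
print(verify_block(b"tampered chunk bytes", wanted))   # modified block
```

The point is that the check works on ciphertext just as well as plaintext, so the server never needs to see keys for the client to detect tampering.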

You can read more in our tech book:
https://book.peergos.org

If you want, I can get you a free account on peergos.net to try it out, just email me.


@ianopolous, thanks for the suggestions regarding the ipfs-encrypted Node.js module. I’d love to know more about Peergos.


I have written about some of the questions I had as I was researching Cryptree:
Problems-with-access-control-mechanisms
I’d like to discuss them.
