Sharing protocol (not IPFS)

Hi everybody!

I’ve been following IPFS (Kademlia, blockchain tech, etc.) for quite some time, and in parallel I’m developing a “sharing protocol” which is not IPFS, but maybe a bit!

The main difference is that the data can be changed without needing to update the link, though that’s not the only one, of course.

Would you like to host efforts like this here on the IPFS board?

There are lots of similarities between these kinds of protocols, especially when it comes to the networking part I guess, and I think IPFS can benefit from my protocol and vice versa.
Plus it’s always nice to group and chat with like minded people :slight_smile:

Cheers

I’m not one of the developers.

I’d recommend you take a look at IPNS.

If you have barely started, maybe it makes more sense to contribute to the project? If you already have something, maybe the devs can help guide your decision.

Cheers and best of luck!

Thank you!

I have looked at IPNS, and from what I understand it’s a centralised “DNS” system, right? I really like the idea of distributed data, but I felt there were some shortcomings with distributed hash tables: the impossibility of changing the data without redistributing the link (or using some central system), and the lack of incentive (without resorting to some sort of crypto, like Filecoin).

I have been working for quite some time on my project, and I think IPFS has strengths where I have weaknesses and vice versa, that’s why I thought hanging out here could be beneficial for all of us :slight_smile:

IPNS is not centralised and lets you deal with mutable data. What you are referring to is DNSLink, which is indeed centralised (and uses IPNS) as another layer pointing to an IPNS or IPFS record.

If you want decentralisation + mutability + choosing the name, use the combo Ethereum Name Service (ENS) + IPFS, for example :slight_smile:


Also, are you able to share a bit more about your project, ideas and goals? :slight_smile:

Hi Akita!

Interesting. I did a quick search but didn’t find anything about how IPNS works “under the hood”; if you have any information I’d be more than happy to check it out.

I did toy with the idea of mixing crypto into my protocol, but I think it’s better if you can do without:
it takes up lots of disk space for a history I don’t need, consumes lots of electricity (globally), can become a speculative asset, etc.
I also think a cryptocurrency is still centralised in a sense, even if no one really controls it (except for the looming 51% attack/control problem, of course).

That said :smiley: I figured out another way of doing it, also based on incentive: mutual sharing.

So my protocol (Tenfingers is its little name) is built roughly like this:

A node layer (much like IPFS, Kademlia…): the nodes reach out to known nodes and gather information about the existence of new nodes…

But here comes the difference: instead of sending data to a specific node, you swap data with a random node (obviously, the nodes have to accept).
So you store (and share) their data, and they’ll store (and share) yours.
You do this with a handful of nodes, producing a list of addresses where your data can now be found.

The addresses get packaged into a ‘link’ file, which can be used to access your data.

So now, if you want to change the data (say you update your blog), you just need to send the updated data to the nodes in your address list.

All topped off with RSA & AES for security and user liability (you can’t know what you are sharing, for example, unless you have access to the link file).

I didn’t go into details, like how you can bootstrap a new node with a link, how addresses are authenticated by the data owner,
how you can boost download speed since you have a bunch of addresses pointing to the same data, etc.

but this is the gist of it anyway :slight_smile:
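To make the shape of this concrete, here is a rough stdlib-only sketch of what such a link file and an update might look like. All names, the address format, and the `send` callback are my own guesses for illustration, not the actual Tenfingers format (which also carries key material, elided here).

```python
import hashlib

# Hypothetical link file: an identifier for the data plus the addresses
# of the nodes that agreed to store (and share) it.
link = {
    "data_id": hashlib.sha256(b"my-blog").hexdigest(),
    "addresses": [
        "203.0.113.10:4040",
        "198.51.100.7:4040",
        "192.0.2.55:4040",
    ],
}

def update_data(link, new_bytes, send):
    """Push a new version to every storing node; the link itself never changes."""
    for addr in link["addresses"]:
        send(addr, link["data_id"], new_bytes)

sent = []
update_data(link, b"blog v2", lambda addr, data_id, blob: sent.append(addr))
print(len(sent))  # 3: one upload per stored copy
```

The key property the post claims falls out of the last few lines: readers keep the same link file, and only the stored copies change.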

Cheers

It looks a lot like IPFS, with the extra “friendship” concept on top of it.

Basically, you have at least 3 ways to retrieve a value from IPNS: via the DHT (decentralized), via DNS (that’s the centralized method), or via ENS (decentralized). (There is also PubSub (peers interested in the same topic), and more to come.) An IPNS record typically contains an IPFS address of static content. But as the owner of the IPNS record, you can update it at any time to point to other static content (typically an updated version of your content).
Check this tutorial to see how to use it.

What do you mean? Ethereum is bound to become centralized eventually, but Bitcoin-like systems seem pretty decentralized to me…

If it’s random nodes, how do you know 1) it’s reliable?, 2) it won’t store more on your node than you on it?

Looks like an IPFS multiaddress of static content, or an IPNS record for mutable content.

Looks like the Pubsub feature, with other nodes subscribing to the topic “YourNode”, and you publishing to that.

Do you mean encrypted files? This is not included in IPFS; you will have to encrypt before uploading. Some libs exist already.

In IPFS, you can bootstrap with a link to another node, and it will send you addresses of peers it knows. Is that what you mean?

That is baked into IPNS: its address is derived from the peerID, and the content is signed.

Baked into IPFS.

Hope it helps!

Thanks for all the information, it’s really nice, and very close to what I want to achieve!

So, for the questions, in order:

Basically, you have at least 3 ways to retrieve a value from IPNS:

So, maybe I’m missing something here, but for me:

ENS: well, it’s decentralised (except that things where money is involved usually end up in the hands of the few), but there are all the other problems I listed (which I personally don’t like; it would definitely function).

DNS: OK, it’s centralised.

DHT: maybe I have missed something, but if this is a Distributed Hash Table, then the data is not centralised, but access to the data is, just like in IPFS!

So for me, something is still missing (again, for me).

What do you mean? Ethereum is bound to become centralized eventually, but Bitcoin-like seems pretty decentralized to me…

As for the centralised part of Bitcoin, I don’t know what the status is today, but at a certain moment the biggest pool had over 50% of the hash rate, so it effectively controlled Bitcoin (they probably won’t do anything, as they earn money with it, but it’s IMO far from fully decentralised!).

If it’s random nodes, how do you know 1) it’s reliable?, 2) it won’t store more on your node than you on it?

  1. For reliability, it doesn’t matter that much (as you’ll use several nodes), but a service checks the quality of the nodes, and if a node is deemed too bad it’s dropped (with its data) and another random node is searched for.

  2. A node only stores data when it needs other nodes to share its data. So when you share a file and set the redundancy to 10, you’ll store data from 9 other nodes (as you store one copy yourself), and those 9 nodes will of course store your data.

The sharing is also negotiated (WIP, but the idea is to configure a percentage range with a maximum size), so you won’t store 1 GB when the other node only stores your 5 kB text file.

Also, you’ll only store data when you need your own data shared; that’s the “incentive”! From what I understand, Kademlia/IPFS nodes always store data (but I’m aware I don’t know very much about this).
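The redundancy arithmetic and the negotiation idea above can be sketched like this; `swap_ok` and its ratio/size limits are made-up placeholders for the negotiation the post describes as work in progress.

```python
# Toy model of the storage bargain: with redundancy r you keep one copy
# yourself and ask r - 1 peers to hold the rest, storing one of their
# blobs in return. The ratio and size limits are invented numbers.
def swap_ok(my_size, peer_size, max_ratio=10.0, max_bytes=100_000_000):
    ratio = max(my_size, peer_size) / max(1, min(my_size, peer_size))
    return ratio <= max_ratio and peer_size <= max_bytes

redundancy = 10
peers_needed = redundancy - 1          # 9: you already hold one copy yourself
print(peers_needed)                    # 9
print(swap_ok(5_000, 1_000_000_000))   # False: 1 GB vs a 5 kB text file
print(swap_ok(5_000, 6_000))           # True: roughly fair swap
```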

snip (interesting information, but no questions per se)

Do you mean encrypted files?

Yes, the files are encrypted, and the communication too (throwaway AES-256 over an RSA-4096 ‘handshake’). The addresses are encrypted and signed. The data uses AES-256.

In IPFS, you can bootstrap with a link to another node, and it will send you addresses of peers it knows. Is that what you mean?

Yeah, I do that too (and yes, your assumption is correct), but if you have an old version whose bootstrap nodes don’t exist anymore, then a recent link file might get you started (as it contains addresses of, hopefully, valid nodes).
This is just icing on the cake, though.

Baked in in IPFS.

Nice.

There is some WIP here, like the dropping of stale nodes. I don’t have a big enough test net just yet, but again, it’s not rocket science IMO :slight_smile:

I don’t follow. The DHT is distributed, meaning you can ask any peer, and you will end up finding one of the several peers holding the record. That peer will tell you the addresses of several (if not all) of the peers who have the content. How is it not decentralized?
Or do you mean that only one, or a few, nodes have the content? In that case I understand, but I have 2 remarks. The first is that this is the situation you will be in at the beginning, when you create data, before you spread it. The second is that in a distributed system, you can only spread data if another node agrees to store it. That means either the node being nice, you paying it (via fancy crypto such as Filecoin, or a dead-simple cloud contract), finding a buddy as you propose, or controlling the second node (your NAS at home).
IPFS doesn’t care about the incentive for others to host your data. That is an upper layer you want to build.
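A minimal in-memory model of the provider lookup being described, with a plain dict standing in for the distributed table (in a real DHT the mapping is spread across many peers, but any node can still run the query; no central server is involved):

```python
from collections import defaultdict

# A content key maps to the set of peers announcing they can serve it.
providers = defaultdict(set)

def provide(key, peer):
    """A peer announces it holds the content for `key`."""
    providers[key].add(peer)

def find_providers(key):
    """Ask the network who has `key`; returns their (sorted) ids."""
    return sorted(providers[key])

provide("QmFoo", "peerA")       # the original publisher
provide("QmFoo", "peerB")       # a node that fetched it and kept it cached
print(find_providers("QmFoo"))  # ['peerA', 'peerB']
```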

Got it.

Meaning that on average, you will store 90% of your capacity for others. (BTW, if you want to use regular end-user laptops for that, you will need way more than 10 buddies to ensure correct reliability, as the churn rate will be high.)
As a rational user, my interest will be to claim that I host data for others when really I don’t. When they spot me, I will be dropped by them, so I will choose a new 10th buddy and make it download my data for redundancy. I’m fine with my data hosted 10 times and no disk space shared. Or I can even spawn a new peer if I’m notorious :smiling_imp:

Communications are encrypted in IPFS, of course. Addresses are signed. Why would they be encrypted? If you only want some peers to access them, just send them to those peers only.

On IPFS, you can add any node as a bootstrap node. No need to use the defaults if you don’t want to/ they become unavailable.

This is an incoming feature too (maybe completed by now).

To sum up, I think your use case is well covered by IPFS, except the “buddy data hosting” part. You can even make your network a “private network” not interacting with other IPFS nodes.

By contributing to the buddy hosting for IPFS, you would be able to leverage all the work done by the community. The rest is for free! :slight_smile: :ok_hand:

Unfortunately, it very well might be… Good luck! :wink:

I don’t follow.

Yes I’m sorry, I’m unclear in my wording here.

What I was trying to express is that in a DHT (barring the use of ENS, for example, to resolve this specific problem), when a publisher changes the data it shares (say a blog page), the user will, in some way or another, have to go through a central server (or central something, like getting an email from a mail server) to get a new link to the new data.

You can’t just have a blog page and then change it, without the user having to go to, say, your (HTTP) web page to get the latest link to the blog to see the new version.

With my protocol, once you have the link to the blog, you can just hit refresh and see the new version.

It might seem like a detail, but for me it’s the whole difference :slight_smile:

This means either it being nice, you paying it (via fancy crypto such as Filecoin, or dead-simple cloud contract), finding a buddy as you propose, or having control on the second node (your NAS at home).

Or sharing the other nodes’ data if (and only if) they share yours. That’s the incentive I think will work too (all the ones you listed already work, of course).

As a rational user, my interest will be to claim that I host data for others, when really I don’t.

So, yes, you are right, you can do this. But slowly, a node will find its “buddy list” of reliable nodes and mostly use them.

Meaning that on average, you will store 90% of your capacity for others.
… Addresses are signed. Why would they be encrypted?

Okay, yes, there will be some waste of storage space for sure, so it will not be practical for very large data (then again, storage space is kind of cheap and will only get cheaper).
This waste could be quite limited, say by sharing with only 2 other nodes, because the addresses are encrypted (your second question) and can be sent around to nodes without giving away the information:

So when your laptop hooks up to your work wifi and gets a new public IP, this address can be broadcast to the concerned nodes, so that a user downloading the data can update the link file with your new address. Links can thus stay valid even after IP changes.
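A minimal sketch of that refresh step, assuming the link file simply lists addresses; the function name and shape are mine, not the protocol’s:

```python
# When a node's public IP changes, it announces the new address; a
# downloader can then rewrite its local link file, so the link stays
# valid across IP changes.
def refresh_address(link, old, new):
    link["addresses"] = [new if a == old else a for a in link["addresses"]]
    return link

link = {"addresses": ["192.0.2.1:4040", "198.51.100.2:4040"]}
refresh_address(link, "192.0.2.1:4040", "203.0.113.9:4040")
print(link["addresses"])  # ['203.0.113.9:4040', '198.51.100.2:4040']
```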

make your network a “private network”

That’s kind of cool, I will think about it, but for later.

By contributing to the buddy hosting for IPFS,

You mean run an IPFS node? Guess I’ll have to look into that now :slight_smile:

Unfortunately…

Yeah, we all dream of big exploits, and rarely know what will happen in the future. We’ll see if my experience will fit the bill :slight_smile:

This is incorrect. The user will look up the DHT a first time and ask “what is the last IPNS record published by this peerID (AKA the blog I like)?” The DHT will provide this record; the user will check that it was signed by the rightful publisher, and read it: it’s the IPFS CID of the last version of the blog. The user will then look up the DHT a second time, this time for the addresses of peers having this content (the original publisher, or people who have fetched it and have it in their cache). The user will then connect to them directly and download the content concurrently.
When “asking the DHT”, there is no central server involved either.
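The two-step resolution just described can be modelled in a few lines. This is a toy model, not the IPFS wire format: an HMAC stands in for the real public-key signature, and plain dicts stand in for the DHT.

```python
import hashlib, hmac

# Step 1: fetch the latest signed IPNS-style record for a name.
# Step 2: find providers of the CID it points to, then download from them.
SECRET = b"publisher-secret"  # stand-in for the publisher's signing key

def cid(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

blog_v2 = b"my blog, version 2"
record = {
    "value": cid(blog_v2),
    "sig": hmac.new(SECRET, cid(blog_v2).encode(), "sha256").hexdigest(),
}

name_table = {"peer123": record}                     # name -> signed record
provider_table = {cid(blog_v2): ["peerA", "peerB"]}  # CID -> peers

rec = name_table["peer123"]
expected = hmac.new(SECRET, rec["value"].encode(), "sha256").hexdigest()
assert hmac.compare_digest(expected, rec["sig"])  # reject forged records
print(provider_table[rec["value"]])  # ['peerA', 'peerB']
```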

That is what I called the buddy(ies): Use Case: Pinning Buddy System · Issue #36 · ipfs/ipfs-gui · GitHub

I see.

I don’t get what you mean by that. If they are not encrypted, users need to know them to access the content. If they are, users need to know the decryption key to access the content. So in practice, this decryption key is just another way to define an address. They can spread it just like unencrypted addresses.

Baked in.

In IPFS you even have several addresses per peer (local network, public IP, via TCP, uTP, QUIC, WebSocket, someday Bluetooth, your own custom transport, Tor, etc.). (Not all are implemented yet, though.)
Peers broadcast all the addresses they can be contacted through (we call these “multiaddresses”) to the DHT.

No, I mean by implementing this feature in IPFS, since it’s the only missing part to cover your use-case :slight_smile:

This is incorrect.

Interesting. But isn’t a CID a hash of the data?
So how does a publisher update its CID (or data)? By signing off some sort of CID_old=>CID_new record?

I don’t get what you mean by that. If they are not encrypted, users need to know them to access the content.

For the encrypted addresses, it’s so that they can be sent around to any node, but only read via the link, which contains the AES key to decrypt them.
This way, only the people with access to the link can get information concerning that specific data.

No, I mean by implementing this feature in IPFS

I have to think about this a bit :slight_smile: :warning:

Yes, an IPFS CID is basically a hash of the data. Nobody can change this mapping from a hash to a string of bytes.
But an IPNS record is a signed record mapping a peerID (or yourBlogID) to an arbitrary CID (i.e. the CID of the blog’s last version, or of anything else). :slight_smile:

They update their IPNS record from peerID=>CID_old to peerID=>CID_new
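The CID_old=>CID_new question above boils down to this: the hash of the content cannot be edited, only the signed pointer to it. A stdlib sketch (the "Qm" prefix and truncation are purely cosmetic here, not real CID encoding):

```python
import hashlib

# A CID is (roughly) a hash of the content, so it cannot be edited in
# place; the mutable part is the name record pointing at it.
def cid(data: bytes) -> str:
    return "Qm" + hashlib.sha256(data).hexdigest()[:16]

old, new = b"blog v1", b"blog v2"
assert cid(old) != cid(new)       # new content means a brand-new CID

ipns = {"myPeerID": cid(old)}     # the mutable pointer, signed by the owner
ipns["myPeerID"] = cid(new)       # publish: peerID=>CID_old becomes peerID=>CID_new
print(ipns["myPeerID"] == cid(new))  # True
```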

Ok, the link would basically be the decryption key, right?

Just so you know, IPFS Cluster is a tool to coordinate pinning across several nodes.
There is a plan to implement “collaborative pinning”, which is close to your use case.
Maybe you can toy around and see what’s possible right now with the different consensus options.

Hi Akita, sorry for not answering earlier

Okay, so I tried to hunt down some information about how IPNS actually works, but couldn’t find very much, and I didn’t have time to read the source code :-p

I did stumble onto a hackernoon article though, which raises some questions (from the site):

Note: IPNS is still a bit shaky and forgets published names after about 12 hours. You might want to run a cron job to republish every 8 hours or so.

This is part of what I remember from when I checked out IPNS last year: (IIRC) records are republished every 12 hours or so (or take 12 h?),
which hints at a centralised system (even if it is not).
Do you have some information about how it works behind the scenes?

BTW, my protocol is now up and running, yay :-), including updating data. It could surely (also) be used to publish information like this, and the workflow is really simple.
Maybe you’d like to check it out one day? There is still some work to do, but mostly refactoring, packaging and documentation.

I’ll check out the IPFS cluster and the consensus, seems interesting.

Cheers

Check this out to have a good introduction to IPNS.
More generally, @vasa-develop gathered some resources.

This is by design. It is precisely because it is decentralized.
Once you publish an IPNS record, it will be gossiped in the network. Nodes all around the network will keep it. When you update it, there will be two versions in circulation (the old and the new) precisely because there is no single place to perform the update. And if you close your blog, an old IPNS record will still point to it. You don’t want that.
So IPFS nodes will consider an IPNS record invalid if

  1. they learn about a fresher record,
  2. the record expires (12h).

It helps keep the DHT clean not to keep records from 5 years ago.

If the publisher still wants it valid, they should republish it regularly. It’s like a “still valid” message to the community.
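The two invalidation rules listed above (a fresher record, or expiry) can be written down as a toy validity check; the field names and the sequence-number mechanism here are illustrative, not the actual IPNS record format.

```python
# A node considers a record invalid once it has seen a fresher one
# (higher sequence number) or once the record's lifetime (12 h in this
# thread) has elapsed since publication.
LIFETIME = 12 * 3600  # seconds

def is_valid(record, best_seq_seen, now):
    if record["seq"] < best_seq_seen:
        return False              # superseded by a fresher record
    return now - record["published"] < LIFETIME

rec = {"seq": 4, "published": 1_000_000}
print(is_valid(rec, best_seq_seen=4, now=1_000_000 + 6 * 3600))   # True: still fresh
print(is_valid(rec, best_seq_seen=5, now=1_000_000 + 6 * 3600))   # False: superseded
print(is_valid(rec, best_seq_seen=4, now=1_000_000 + 13 * 3600))  # False: expired
```

Republishing then just means re-emitting the record with a fresh `published` timestamp before the lifetime runs out.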

No, but it’s long for now. IPFS is growing very fast, but the core team is working on it.

Sure! :slight_smile:

Also related:

IPNS is getting some pretty big updates soon. If you check out the issues related to the design of IPNS over PubSub, there’s a lot of discussion around building much faster IPNS resolution and publishing, as well as goals to support longer expiry times and to let people who pin/cohost your site/content maintain the availability of an IPNS entry, so the author doesn’t need to republish every few hours to keep it available. It is fully decentralised. The time it currently takes to publish and resolve IPNS entries is because the IPFS node spends quite a bit of time crawling the DHT, contacting nodes that are likely to know something about that IPNS record, and trying to make sure the local node has the most up-to-date record, rather than just accepting the first record it finds, since that may be out of date.

PubSub-based IPNS was demoed at IPFS Camp a month or two ago (there’s a YouTube video online) and it’s very fast and effective, but not yet widely deployed. By using PubSub, IPNS records can stay up to date on every interested node, instead of only being updated when the original author republishes and contacts that node through the DHT.

Hi Akita!

I have a first operational version of my sharing protocol that lets you share data with a small link file.
Would you like to check it out?

Cheers

Valmond

Hey! Why not! Keep in mind I’m no expert though. :slight_smile: