# Design Exploration
This document explores some areas in which it would be nice to improve IPNS and some ways in which to do so. I'm posting it here so I/we don't lose it and in case anyone else is interested in picking up this work. This is a very rough document with some good ideas but not nearly enough thought put into them.
## Design Goals
First, we need to decide on what we need out of IPNS (what we have (checked) and what
needs to be improved):
1. [ ] A consistency/threat model:
- [ ] Consistency
- [ ] Between IPNS addresses.
- [ ] Within a single IPNS address.
- [x] Forgery Proof (signed IPNS records).
- [ ] Freshness Guarantees (IIRC, the current guarantee is "not expired").
- [x] (?) Censorship Resistance (DHTs can be censorship resistant).
- [x] (?) DoS Resistance (DHTs can be DoS resistant).
2. [ ] Mobile/IoT friendly lookups:
- [x] No CPU-intensive operations (e.g., really expensive crypto).
- [ ] Minimal bandwidth usage (DHT lookups require many round trips).
- [ ] No continuous background communication to avoid wasting power and
bandwidth (IIRC, the DHT does a fair amount of background communication).
- [ ] Minimal storage requirements.
3. [ ] Low latency initial lookups
4. [ ] Low latency updates (PubSub).
5. [ ] Avoid having to constantly re-publish IPNS records.
This document focuses on a system to tackle goals 1-3. We're currently working on using PubSub to tackle goal 4, and goal 5 could probably be bolted onto any system solving goals 1-3 with some incentive model.
Given these goals, the main design decisions from my perspective are:
* Funding Model
* Consistency/Threat Model
## Parties
First, so we can have some consistent terminology, I'll define the following
parties:
* Provider: An entity participating in the server-side of the IPNS distributed
system.
* Publisher: An entity who publishes IPNS records.
* Client: An entity that retrieves IPNS records.
## Funding Model
With the DHT, we amortize the cost over all participants. However, due to the
low-latency requirements of this system, we'll probably have to go with a more
centralized model and therefore may need a way to fund it.
### Nice People Pay
One solution is to assume that some set of "benevolent" organizations will run
IPNS providers on their own dime. I say "benevolent" in quotes because
organizations may be willing to act as a provider for many reasons:
1. Metadata: They want to know who requests which IPNS records, how IPNS
records are clustered, etc.
2. Political: Decentralized systems tend to be censorship resistant and tend to
promote free speech.
3. Invested: We (and other companies that rely on IPFS) might want to run one as
it increases the value of IPNS, IPFS, and all related technology.
**Pros**
* End users don't have to pay.
* We don't have to implement a payment system.
**Cons**
There's no such thing as a truly free lunch. I'm always wary of free
centralized (or semi-centralized) services, as anyone offering one usually has an
agenda (which may not be in society's best interest).
### Publishers Pay
We could use a peering system (like ISPs) where publishers pay a single provider
and that provider agrees to store and serve the publisher's IPNS records and
exchange them with their (the provider's) peers.
**Pros:**
* Someone is directly paying (no free lunch problem).
**Cons:**
Publishing IPNS records would cost something. This could be alleviated by
allowing free short-lived records and/or allowing free limited accounts.
Furthermore, publishers could always just publish to the DHT if they don't
care about latency.
### Clients Pay
An alternative would be to let clients pay. For example, clients could pick a
few trusted IPNS providers, pre-pay for some number of request tokens (ecash?)
using a cryptocurrency, and then return a request token each time they make a
request to one of their chosen providers. Alternatively (far future), ISPs could
simply provide IPNS resolution as a service to their customers.
**Pros:**
* May incentivize IPNS providers to store and serve popular IPNS records.
Depending on the consistency model, this may not be relevant (IPNS providers
may have to store *all* IPNS records).
* Doesn't necessarily involve manual peering agreements.
**Cons:**
Costs users, which could be a significant barrier to entry (*any* price is a
barrier to entry, even if it's absurdly low). However, we can always start off
by providing our own free IPNS provider and move to a paid model if necessary.
### Conclusion
As I believe we could upgrade to either (or both) of the latter two payment
models later if needed, we should probably hold off on them for now. However, we
should leave room in the protocol for a payment system.
## Consistency/Threat Model
I'm going to mix the consistency and threat models because they're inextricably
linked: the ways in which we trust each party determine how each party can
affect the system's guarantees.
1. Authenticated: It must be (practically) impossible to create IPNS records
without the associated private key. Luckily, we already have this property.
2. Censorship Resistant: I would like to make it infeasible to censor IPNS
record updates. While we *do* want to support censorship of IPLD objects
using a voluntary blacklist, IPNS address censorship is too easily abused.
Unfortunately, this censorship resistance could be abused to turn the IPNS
system into a censorship resistant data-store (by storing the illegal/illicit
data in the IPNS records themselves).
3. DoS Resistant: The system must be resistant to DoS attacks. Unfortunately,
it's generally impossible to be completely resistant. Note: this is not the
same as censorship resistant. I consider censorship to be censoring
individual records while allowing the system to continue functioning.
4. Consistent: Exactly what consistency model we want is still an open question.
I consider this part of the threat model because a malicious provider may
attack the system by presenting clients with inconsistent state.
In the following discussion, I assume:
1. There is a set of "important" (non-garbage) IPNS keys/records.
2. There is a set of "trusted" IPNS providers.
I then break privilege into three categories:
1. Unprivileged: Entities that don't control an "important" IPNS key and aren't
in the "trusted" set of providers.
2. Publishers: Entities that control an "important" IPNS key.
3. Providers: Entities that control a "trusted" IPNS provider.
For reference:
* [ ] Means I intend to allow the attack (trust the party not to do this).
* [-] Means I intend to try to stop the attack but may not be able to entirely
thwart it.
* [x] Means I intend to prevent the attack (don't trust the party not to do this).
### Unprivileged
I divide attacks from unprivileged entities into (exhaustive):
* [-] DoS: Anyone may attempt to prevent this system from functioning (DoS). The
system will have to have some DoS resistance but there's no way we can
completely protect against DoS.
* [x] Forgery: Attackers may try to forge IPNS records. This will be
(practically) impossible (enforced by crypto).
* [x] Censorship: Attackers may try to censor IPNS records. It should be
practically impossible to censor an individual IPNS record without causing a
wider DoS. While we can't stop a global adversary from shutting down the
system (or disconnecting individual clients), we should be able to prevent
such adversaries from censoring individual records.
### Publishers
In addition to the unprivileged attacks listed above, publishers can try to
perform the following potentially undesirable actions (non-exhaustive?) with
respect to their *own* IPNS addresses:
1. [ ] Collision: Publish multiple different IPNS records for the same IPNS
address with the same timestamp.
2. [ ] Backdate: Backdate an IPNS record.
3. [ ] Partition: Present two different IPNS records to two different parties at
the same time. These IPNS records don't necessarily need to have the same
timestamp so this doesn't quite fall under the collision category.
All these problems are present in DNS+HTTPS. Furthermore, I believe these
problems are mostly out-of-scope for this system and we can probably layer a
"trusted" IPNS system on top. I've outlined a few ways to discourage such
behavior in Appendix A.
Therefore, I'm inclined to largely trust publishers when it comes to how they
manage their own IPNS addresses. We may end up choosing a consistency model that
forbids these actions but I don't consider that a goal.
### Providers
Providers may perform the same attacks as unprivileged entities; however, the
threat model is significantly more nuanced. Furthermore, they may try to violate
the chosen consistency model.
Given that providers have reputation, the extent to which they can attack the
system isn't a simple can/can't. Therefore, I break the extent to which
providers can attack the system into the following categories:
1. Can hinder (e.g., slow down).
2. Can do but may lose reputation.
3. Can do with sufficient collusion.
4. Can.
#### DoS
For DoS, we can limit providers to attacks 2 and 3. If all servers in a client's
"trusted providers" set refuse to operate, there's nothing we can do about it.
If a subset of providers tries to DoS the system, they'll lose reputation.
#### Forgery
As before, we prevent forgery using cryptography so we should be able to prevent
all attacks of this type (assuming the adversary has limited computational power
and the crypto is sound).
#### Censorship
A trickier attack to defend against is censorship, mostly due to the legal
issues involved (and the fact that this *won't* be a fully decentralized
system). I believe the best solution is to prevent silent censorship and allow
clients to try multiple providers (in different legal jurisdictions) if they
wish to work around censored records (possibly falling back on a fully
decentralized system like a DHT).
Unless we go for a fully decentralized system, we can't outright prevent
providers from censoring IPNS records because, given sufficient collusion, the
trusted set of IPNS providers could simply erase all evidence of an IPNS
record's existence. Furthermore, simply discouraging censorship by punishing
providers that censor IPNS record updates (category 2) will not be sufficient as
providers in some (all?) jurisdictions will inevitably be legally compelled to
censor some IPNS addresses (or face being shut down entirely). These are
just inherent issues with centralized systems and governments.
However, we should be able to discourage *silent* censorship by requiring that
providers inform clients that an IPNS record is being censored instead of
returning an old IPNS record or claiming that one doesn't exist. While some law
enforcement agencies won't *like* this requirement, I believe most will accept
that a provider can't do otherwise and abstain from simply shutting down the
provider entirely. This way, clients can try another provider in a different
legal jurisdiction if they encounter censorship. This will increase the latency
for retrieving highly censored records but I believe it's a reasonable
compromise.
So, at the end of the day, I believe we can restrict providers to attacks 1-3
(prevent them from outright censoring IPNS records).
#### Consistency
Finally, we need to decide on an actual consistency model.
In this section, I don't consider the threat model (under what attacks does the
system remain consistent) as this section is already more detailed than I would
like. Furthermore, doing so would (mostly) be a waste of effort as there's
little point in considering the possible threat models of consistency models we
don't end up using.
In this section, I consider the following consistency models:
1. Monotonic Consistency: For any given IPNS address, if a client observes an
IPNS record with timestamp T1, the client will never accept any record with a
lower timestamp T2 < T1. I believe this is effectively the consistency model
of the DHT.
2. Strict Consistency: There's a total order of all IPNS record updates.
3. Explicit Causal Dependencies: IPNS records explicitly state the minimum
version/timestamp of all IPNS addresses on which they depend.
4. Causal Consistency: If a publisher observes a set of IPNS records for
addresses `A[]` with timestamps `T[]` and then publishes an IPNS record
`R`, any client that observes `R` and then looks up `A[i]` will receive an
IPNS record for `A[i]` with a timestamp greater than or equal to `T[i]`. This
is a common consistency model in shared memory systems.
5. Application Level Consistency: Instead of enforcing a consistency model at
the IPNS level, we can enforce it at the application level as needed.
##### Monotonic Consistency
This can be enforced client-side by simply rejecting old IPNS records when a
newer one is known.
**Pros**
* It's really simple (enforced entirely by the client).
* IPNS providers could be simple caching resolvers built on top of the DHT
(without modifying it).
**Cons**
* The only real guarantee clients can rely on is that the IPNS records they get
haven't expired.
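The client-side check above can be sketched in a few lines. This is purely illustrative (not the actual go-ipfs API; the record shape and `MonotonicResolver` name are invented): the client remembers the highest timestamp it has observed per address and rejects anything older.

```python
# Minimal sketch of client-side monotonic consistency (illustrative
# names): remember the highest timestamp seen per IPNS address and
# reject any record older than that.

class MonotonicResolver:
    def __init__(self):
        self._latest = {}  # IPNS address -> highest timestamp observed

    def accept(self, address, record):
        """True iff the record is at least as new as anything seen."""
        seen = self._latest.get(address)
        if seen is not None and record["timestamp"] < seen:
            return False  # older than a record we've already observed
        self._latest[address] = record["timestamp"]
        return True

r = MonotonicResolver()
assert r.accept("/ipns/x", {"timestamp": 10})
assert r.accept("/ipns/x", {"timestamp": 12})
assert not r.accept("/ipns/x", {"timestamp": 11})
```

Note that the state lives entirely in the client, which is why no provider cooperation is needed; the flip side is that a fresh client (with an empty map) has no protection at all on its first lookup.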
##### Strict Consistency
Enforcing strict consistency would require some form of global byzantine
consensus system to ensure global consistency and some way to verify that an
IPNS record belongs to the current agreed upon state.
See Appendix B for details on how this system might work.
**Pros**
* Strict consistency generally makes developers' lives easier.
* Strict consistency would imply censorship resistance.
* Should be fairly easy to design (see Appendix B)
**Cons**
* It requires that *every* provider agree on the entire state of the world at
any given point in time.
* It's not interplanetary. At the very least, we'd need a separate IPNS network
per "latency zone".
##### Explicit Causal Consistency
In practice, strict consistency may be overkill. Instead, we could consider a
system where each IPNS record describes the minimum state of the world on which
it depends. That is, the minimum version/timestamp of all IPNS addresses an IPNS
record depends on. Clients would use this information to determine the minimum
acceptable timestamp for any given IPNS record.
**Pros**
* Doesn't require any cooperation between providers.
* Works well in a partitioned network (all consistency information is encoded in
the IPNS records themselves).
**Cons**
* Potentially large IPNS records.
* Publishers (applications) need to somehow track dependencies. Simply treating
the entire "read set" (the set of IPNS records seen to date) as the dependency list isn't
feasible as it would make IPNS records massive.
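The client-side verification described above might look like the following sketch (all field names, including `deps`, are invented for illustration): a record carries a map from depended-on address to minimum acceptable timestamp, and the client flags any dependency that resolves below its floor.

```python
# Illustrative sketch of explicit causal dependency checking (field
# names invented): flag any dependency whose resolved record is missing
# or older than the minimum timestamp the record declares.

def stale_dependencies(record, resolve):
    """resolve(address) -> record dict or None. Returns the addresses
    whose resolved record fails the declared minimum."""
    stale = []
    for address, min_ts in record.get("deps", {}).items():
        dep = resolve(address)
        if dep is None or dep["timestamp"] < min_ts:
            stale.append(address)
    return stale

store = {
    "/ipns/a": {"timestamp": 5},
    "/ipns/b": {"timestamp": 3},
}
record = {"timestamp": 7, "deps": {"/ipns/a": 4, "/ipns/b": 4}}
# /ipns/b resolved older than the record requires, so it's flagged.
assert stale_dependencies(record, store.get) == ["/ipns/b"]
```

On a stale result the client would presumably retry, wait, or fall back to another provider rather than hand the application an inconsistent view.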
##### Causal Consistency
Instead of explicitly listing dependencies, IPNS records could include, for
every provider used by the publisher (likely just one), a "pointer" to the
current state of that provider's timeline. Clients would then use this information
to verify that their provider's state is at least as up-to-date as the listed
states.
Note: This sounds really inefficient but I've convinced myself that there are
reasonably efficient ways to do this, especially if we trust third parties
(e.g., a set of trusted providers) to do the actual consistency verification.
**Pros**
* Unlike strict consistency, this is interplanetary. One can think of this as
strict consistency from the publisher's point of view.
**Cons**
* This would require a lot more "original" design work than a strict consistency
system (unless I just haven't read the relevant material).
##### Application-Level Consistency
It's worth noting that explicit consistency can be achieved at the application
level by including the oldest acceptable IPNS record along with IPNS links. For
example, given:
```
minimum_ipns_records = {
  "/ipns/$a": {
    timestamp: ...,
    link: "/ipld/...",
  },
  "/ipns/$b": {
    timestamp: ...,
    link: "/ipld/..."
  }
}; // Can be a separate object (shared between multiple IPLD objects).

thing = {
  "minimum_ipns_records": minimum_ipns_records,
  ...
}
```
The application would fall back on the listed IPNS records if the ones it
retrieves from the system are older.
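The fallback logic can be sketched as follows (hedged: all field names are invented, and `resolve` stands in for whatever IPNS lookup the application uses): resolve normally, but if the live record is missing or older than the embedded minimum, use the embedded link.

```python
# Hedged sketch of the application-level fallback: prefer the live IPNS
# record, but fall back to the embedded known-good link when the live
# record is missing (expired) or older than the declared minimum.

def resolve_with_fallback(address, minimum_ipns_records, resolve):
    floor = minimum_ipns_records[address]
    live = resolve(address)  # may be None if the newest record expired
    if live is None or live["timestamp"] < floor["timestamp"]:
        return floor["link"]  # link shipped alongside the IPLD object
    return live["link"]

minimum = {"/ipns/$a": {"timestamp": 10, "link": "/ipld/embedded"}}
assert resolve_with_fallback("/ipns/$a", minimum, lambda a: None) == "/ipld/embedded"
assert resolve_with_fallback(
    "/ipns/$a", minimum, lambda a: {"timestamp": 12, "link": "/ipld/live"}
) == "/ipld/live"
```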
**Pros**
* By doing this at the application level, applications can enforce any
consistency guarantees they need without paying (in performance) for any
consistency guarantees they don't.
* Even if an IPNS link goes dead (the newest IPNS record expires), applications
will be able to resolve the IPNS address to some valid IPLD name.
**Cons**
* It shifts the burden to application developers.
# Appendix
## Appendix A: Publisher Shenanigans
Below, I outline a few ways we can deal with misbehaving publishers with crypto.
Even if we choose a consistency model for the global IPNS system that doesn't
allow these kinds of shenanigans, the ideas presented below may still be useful
in partitioned networks. Note: you don't have to read this section, I mostly
included it to have these ideas recorded somewhere.
### Collision
We could use some special crypto to ensure that issuing two IPNS records with
the same timestamp reveals the secret key. This will likely not be an effective
deterrent for short-term attacks so it may not be that useful. However, unlike a
byzantine agreement system, this solution works in a partitioned network.
### Backdate
We can build an unbroken chain of IPNS records where each record can have at
most one successor. We should be able to enforce this property by using some
form of single-use signature scheme where making two signatures with the
same key reveals the key. This would obviously need to be significantly fleshed
out.
Unfortunately, unlike a byzantine agreement system, this would still allow
publishers to backdate up to their last published IPNS record.
However, like the crypto solution to collisions, this can be used in a
partitioned network.
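To make the key-reveal idea behind both appendix items concrete, here is a toy Lamport-style one-time signature. This is strictly illustrative (8-bit messages, short secrets, no connection to real IPNS record formats): signing reveals one of the two secret preimages per message bit, so signing two *different* messages with the same key exposes both preimages at every bit where the messages differ.

```python
# Toy Lamport-style one-time signature (purely illustrative): signing a
# second, different message with the same key leaks secret preimages.
import hashlib
import os

BITS = 8  # absurdly small, just to keep the demo readable

def keygen():
    sk = [(os.urandom(16), os.urandom(16)) for _ in range(BITS)]
    pk = [(hashlib.sha256(a).digest(), hashlib.sha256(b).digest())
          for a, b in sk]
    return sk, pk

def bits(msg):
    # Truncate the message hash to 8 bits for this toy example.
    h = hashlib.sha256(msg).digest()[0]
    return [(h >> i) & 1 for i in range(BITS)]

def sign(sk, msg):
    # Reveal the secret preimage selected by each message bit.
    return [sk[i][b] for i, b in enumerate(bits(msg))]

def verify(pk, msg, sig):
    return all(hashlib.sha256(sig[i]).digest() == pk[i][b]
               for i, b in enumerate(bits(msg)))

def recovered_secrets(msg1, sig1, msg2, sig2):
    # Wherever the two messages' bits differ, both preimages are now
    # public, i.e., that part of the key has leaked.
    leaked = {}
    for i, (b1, b2) in enumerate(zip(bits(msg1), bits(msg2))):
        if b1 != b2:
            pair = [None, None]
            pair[b1] = sig1[i]
            pair[b2] = sig2[i]
            leaked[i] = tuple(pair)
    return leaked
```

A real construction would need full-length hashes, a key-rotation chain (since each key is single-use), and some binding between the "message" and the record's timestamp so that only colliding or successor-violating records trigger the leak.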
## Appendix B: Strict Consistency IPNS
Given that I like this option the most, I've put a bit of thought into
how it might be implemented. Unless I'm mistaken, we should be able to build
much of it on top of IPLD (which is one of the reasons I like this option).
First, at every time step, the system would need to agree on what IPNS record
updates to accept. To do this, we'd use some form of byzantine agreement system.
However, the threat model isn't the same as that of cryptocurrencies. Instead of
double spending, malicious parties will likely want to remove an IPNS address
from the system for either monetary (extortion, competition) or political
reasons (censorship). We may be able to use this to simplify some parts of the system.
Second, we'd need a way to distribute the IPNS records along with proofs that
each IPNS record belongs to the current block. To do this, I'd store all IPNS
records in a single merkle tree (using IPLD) that all providers agree on (using
the byzantine consensus system). The root hash would name the current state and
the proof of membership would consist of a path of IPLD objects from the root to
the IPNS record.
Finally, to prove that the root hash is authentic without forcing clients to
verify the entire blockchain, providers could all sign the current root hash
(e.g., using IPNS records) and clients could fetch a quorum of such signatures
as needed.
Note: This m/n signatories system may not be necessary in some byzantine consensus systems. For example, given an expected-consensus-based system, it should be possible for peers to *know* whether a block is valid for some timestamp (we could even have an expected-N system where we usually end up with duplicate blocks, but that's not really an issue for us because we don't have the double-spending problem).
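The m-of-n check itself is simple; a structural sketch (all names invented, with HMAC standing in for real public-key signatures) might look like:

```python
# Structural sketch of the quorum check: trust a root hash only if at
# least `quorum` of the client's chosen providers validly signed it.
import hashlib
import hmac

def root_is_trusted(root_hash, signatures, provider_keys, verify, quorum):
    """Count valid provider signatures over root_hash; require >= quorum."""
    valid = sum(
        1
        for provider, sig in signatures.items()
        if provider in provider_keys
        and verify(provider_keys[provider], root_hash, sig)
    )
    return valid >= quorum

# Toy stand-in for real signatures: HMAC with per-provider keys.
def toy_sign(key, msg):
    return hmac.new(key, msg, hashlib.sha256).digest()

def toy_verify(key, msg, sig):
    return hmac.compare_digest(toy_sign(key, msg), sig)

keys = {"p1": b"k1", "p2": b"k2", "p3": b"k3"}
root = b"current-root-hash"
sigs = {
    "p1": toy_sign(keys["p1"], root),
    "p2": toy_sign(keys["p2"], root),
    "p3": b"bogus",  # a misbehaving provider
}
assert root_is_trusted(root, sigs, keys, toy_verify, quorum=2)
assert not root_is_trusted(root, sigs, keys, toy_verify, quorum=3)
```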
### Example IPNS Tree
**Given:**
* /ipns/aaa
* /ipns/aab
* /ipns/abb
**Tree:**
```
root = {
"a": ipns_a
};
// ---
ipns_a = {
"a": ipns_aa,
"b": ipns_ab
};
// ---
ipns_aa = {
"a": ipns_aaa,
"b": ipns_aab
};
ipns_ab = {
"b": ipns_abb
};
// ---
ipns_aaa = IPLD_RECORD_AAA;
ipns_aab = IPLD_RECORD_AAB;
ipns_abb = IPLD_RECORD_ABB;
```
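Membership proofs against this trie can be checked as sketched below. This is hedged: `cid` here is just the hash of a deterministic serialization, a crude stand-in for real IPLD CIDs. The proof is the list of nodes on the path from the root to the record; the client re-hashes each node and checks that its parent actually links to it.

```python
# Sketch of membership-proof verification for the example trie above.
import hashlib
import json

def cid(node):
    # Deterministic serialization + hash as a toy content address.
    return hashlib.sha256(json.dumps(node, sort_keys=True).encode()).hexdigest()

def verify_path(root_cid, path_nodes, key, record_cid):
    """path_nodes: trie nodes from the root down; key: e.g. 'aaa'."""
    expected = root_cid
    consumed = ""
    for node in path_nodes:
        if cid(node) != expected:
            return False  # node doesn't match the link that led to it
        edge = key[len(consumed)]
        if edge not in node:
            return False  # trie has no branch for this key
        expected = node[edge]
        consumed += edge
    return consumed == key and expected == record_cid

# Rebuild the example trie for /ipns/aaa, /ipns/aab, /ipns/abb.
rec_aaa, rec_aab, rec_abb = "cid-aaa", "cid-aab", "cid-abb"
ipns_aa = {"a": rec_aaa, "b": rec_aab}
ipns_ab = {"b": rec_abb}
ipns_a = {"a": cid(ipns_aa), "b": cid(ipns_ab)}
root = {"a": cid(ipns_a)}

assert verify_path(cid(root), [root, ipns_a, ipns_aa], "aaa", rec_aaa)
assert not verify_path(cid(root), [root, ipns_a, ipns_aa], "aab", rec_aaa)
```

A production version would compress single-child runs (a Patricia trie) to keep proofs short, but the verification idea is the same.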