It appears that storing peer IDs is subject to GDPR, because a peer ID may eventually reveal the node’s IP address. So I need a way to identify users by peer ID while storing something non-unique, like incremental numbers (e.g. 1, 2, 3, 4, 5), in the distributed database that everyone has a copy of. The code will be open source, so it shouldn’t be possible to trace the incremental ID back to the peer ID.
While a long string of letters and numbers may not be a “Johnny Appleseed” level of human-readable specificity, your PeerID is still a long-lived, unique identifier for your node. Keep in mind that it’s possible to do a DHT lookup on your PeerID and, particularly if your node is regularly running from the same location (like your home), find your IP address.
The public key of a natural person is a unique identifier and its use in online services is generally associated with other types of information that make it possible to identify and profile the person holding such a key. Under these conditions, the public key is personal data that uniquely identifies a person and thus its processing is subject to the provisions of the GDPR, although it can be considered as a method of pseudonymisation insofar as it can conceal a person’s real name.
How much you anonymize depends on how much you store. It’s the same as anonymizing IPs by storing something like 192.168.X.X. Sure, if only one person has a 192.168.X.X IP, then you could identify them. But not having the full peer ID makes it difficult to look up a peer, and you would be relying on just waiting to see a peer ID that matches the partial information. If you store only the last 4 characters, then there will probably be several peer IDs like that in the network.
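To make the truncation idea concrete, here is a minimal Python sketch. The peer IDs are made up for illustration (they are not real nodes), and the helper names `truncate_peer_id` and `group_by_suffix` are hypothetical. It keeps only the last few characters and groups peers that end up sharing the same stored value.

```python
from collections import defaultdict

def truncate_peer_id(peer_id: str, keep: int = 4) -> str:
    """Keep only the last `keep` characters of a peer ID."""
    return peer_id[-keep:]

def group_by_suffix(peer_ids, keep: int = 4):
    """Group full peer IDs by the truncated value that would be stored."""
    buckets = defaultdict(list)
    for pid in peer_ids:
        buckets[truncate_peer_id(pid, keep)].append(pid)
    return dict(buckets)

# Made-up peer IDs, for illustration only.
peer_ids = [
    "QmFakePeerOne111x7Kq",
    "QmFakePeerTwo222x7Kq",
    "QmFakePeerThree33Zb9P",
]

buckets = group_by_suffix(peer_ids)
for suffix, members in buckets.items():
    # A suffix shared by more than one peer cannot single out a node.
    print(suffix, len(members))
```

Assuming base58-encoded IDs with roughly evenly distributed trailing characters, four characters allow 58^4 = 11,316,496 possible suffixes, so how many peers actually share a suffix depends on the size of the network. Storing fewer characters increases the ambiguity.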
Going through the IPFS Desktop Peers interface, I can copy any of the peer IDs and then scan them for the sequence. Also, because it’s unique, it would easily identify the person. The app has to be able to identify that single person based on what is persisted, but no one else should be able to. That’s why I was thinking of mapping peer IDs to incremental numbers, for example, as they are not unique. But to map them without persisting the mapping means I need to make the code closed source so no one knows what is being mapped.
Take the second half of the peer ID (or even the whole peer ID) and then hash it. That increases the amount of effort that someone would need to undertake to reverse it, as they’d need to start with a PeerID and hash it to see if it matches.
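A rough Python sketch of that suggestion, assuming SHA-256 as the hash (the post does not name one) and using made-up peer IDs. The function names `pseudonym` and `matches` are hypothetical. It also shows the check an outsider would have to run: hash a peer ID they already know and compare it with the stored value.

```python
import hashlib

def pseudonym(peer_id: str, use_second_half: bool = False) -> str:
    """Return the SHA-256 hex digest of the peer ID, or of its second half."""
    material = peer_id[len(peer_id) // 2:] if use_second_half else peer_id
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

def matches(candidate_peer_id: str, stored: str, use_second_half: bool = False) -> bool:
    """Check whether a known peer ID hashes to the stored value."""
    return pseudonym(candidate_peer_id, use_second_half) == stored

# Made-up peer IDs, for illustration only.
stored = pseudonym("QmFakePeerOne111x7Kq")

# The app (or anyone else holding a candidate peer ID) can test for a match,
# but the stored digest alone cannot be turned back into a peer ID.
print(matches("QmFakePeerOne111x7Kq", stored))
print(matches("QmFakePeerTwo222x7Kq", stored))
```

Note that this only raises the effort: peer IDs are visible on the network, so anyone can hash the ones they collect and compare them against the database.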
Maybe this helps.
In my app I use OrbitDB, and I create an identity which is not linked to the peer ID of the IPFS node on top of which OrbitDB is mounted.
Each node, however, can prove that the IPFS peer ID is controlled by the OrbitDB key, and this information is revealed only if necessary. Communication between two peers starts at a public rendezvous point whose name the peers can infer just from knowing what they want to do and their OrbitDB identity key.
In this channel the peers reveal themselves to each other, and if this process goes through, then something else will happen.
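One way the rendezvous name could be derived, sketched in Python. This is purely an assumption about how such a scheme might look, not how my app or OrbitDB actually does it: the function `rendezvous_topic`, the `purpose` string, the key value and the `rendezvous-` prefix are all hypothetical.

```python
import hashlib

def rendezvous_topic(purpose: str, identity_key: str) -> str:
    """Derive a channel name from a purpose and an identity key (hypothetical scheme)."""
    digest = hashlib.sha256(f"{purpose}:{identity_key}".encode("utf-8")).hexdigest()
    # Shorten the digest to keep the channel name manageable.
    return f"rendezvous-{digest[:32]}"

# Hypothetical values: both peers know the purpose and the identity key,
# so each can compute the same channel name without exchanging a peer ID.
topic_a = rendezvous_topic("file-exchange", "example-orbit-identity-key")
topic_b = rendezvous_topic("file-exchange", "example-orbit-identity-key")
print(topic_a == topic_b)
```

The point of the design is that the channel name depends only on the identity key and the intent, so the IPFS peer ID never needs to be stored or published to find the other peer.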