I’m new to the pubsub system in IPFS. I have a few questions regarding the privacy properties of pubsub:
Are the topics globally enumerable, either explicitly (as a built-in feature) or implicitly (e.g. by crawling the DHT)?
If the topic name is known, can other peers observe who published a message to that topic?
If the topic name is unknown, is it observable in general which peers are publishing to topics? If so, is the name of the topic deducible, or only the fact of publishing?
If the topic name is known, can the rest of the network observe who is subscribed to that topic?
If the topic name is unknown, is it observable which peers are listening to topics? If so, is the name of the topic deducible, or only the fact of subscribing?
PubSub has a couple of implementations; perhaps the most common is the GossipSub protocol. It is more of a libp2p concept, even though we support PubSub in IPFS as a transport for IPNS records.
As far as I understand, GossipSub doesn’t really aim for privacy.
They aren’t. PubSub topics are not advertised to the DHT, though you can use the DHT for bootstrapping (finding initial peers) by publishing a provider record for the hash of the topic.
In order to subscribe to the topic, a node P needs to locate one or more nodes in the topic and join the overlay. The initial contact nodes can be obtained via rendezvous with DHT provider records. Source: pubsub/gossipsub/episub.md in the libp2p/specs repository on GitHub.
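To make the rendezvous idea concrete, here is a minimal sketch of deriving a lookup key from a topic name. The helper `rendezvous_key` and the topic string are hypothetical, and a plain SHA-256 hex digest is an assumption for illustration; real implementations wrap the hash in their own key format and may namespace the topic first.

```python
import hashlib


def rendezvous_key(topic: str) -> str:
    """Hypothetical helper: hash a topic name into a fixed-size lookup key.

    Assumption: plain SHA-256 over the UTF-8 topic name. This is NOT the
    exact key format any libp2p implementation uses.
    """
    return hashlib.sha256(topic.encode("utf-8")).hexdigest()


# Anyone who knows the topic name derives the same key and can look up
# whoever announced themselves under it.
key = rendezvous_key("my-app/chat")  # illustrative topic name
print(key)
```

The privacy-relevant point is that the key is deterministic: whoever knows or guesses the topic name can compute it and query for the peers behind it, while the key alone does not reveal the name.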
GossipSub clients can require message signing, which allows linking messages to PeerIDs. I believe Ethereum consensus relies on this. This arguably also serves as a thin layer of Sybil resistance, since generating a key and signing messages is more computationally expensive than not doing so.
I’m pretty sure that without knowing the topic name, it’s harder to observe which peers are publishing to the topic. In general, though, it’s still possible by:
Using the PubSub RPC messages to find out which topics they’re subscribed to.
I believe topic names are hashed (I’d let someone with more knowledge chime in), so you might be able to determine the hash of topics specific peers listen on rather than the string name.
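As a rough illustration of the observation described above, here is a minimal sketch of a peer that records the subscription announcements it receives from its neighbours. The class `SubscriptionObserver`, its method names, and the peer and topic values are all hypothetical; this models only the bookkeeping, not the actual wire protocol, and `topic_id` stands for whatever topic identifier appears on the wire (string name or hash).

```python
from collections import defaultdict


class SubscriptionObserver:
    """Hypothetical observer: tracks which topic identifiers each peer announces."""

    def __init__(self) -> None:
        # peer ID -> set of topic identifiers that peer is currently subscribed to
        self.topics_by_peer = defaultdict(set)

    def on_subscription(self, peer_id: str, topic_id: str, subscribe: bool) -> None:
        """Record a subscribe or unsubscribe announcement from a connected peer."""
        if subscribe:
            self.topics_by_peer[peer_id].add(topic_id)
        else:
            self.topics_by_peer[peer_id].discard(topic_id)

    def peers_for(self, topic_id: str) -> list:
        """Return the peers known to be subscribed to a topic identifier."""
        return sorted(
            peer for peer, topics in self.topics_by_peer.items() if topic_id in topics
        )


observer = SubscriptionObserver()
observer.on_subscription("peerA", "topic-1", True)
observer.on_subscription("peerB", "topic-1", True)
observer.on_subscription("peerA", "topic-2", True)
observer.on_subscription("peerB", "topic-1", False)  # peerB unsubscribes
print(observer.peers_for("topic-1"))
```

An observer like this only learns about peers it is directly connected to, so how much of the network it can map depends on how many peers it connects to.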