I’ve seen the warnings about modifying your bootstrap list, for example here:
Only edit this list if you understand the risks of adding or removing nodes from this list.
Don’t change this list unless you understand what it means to do so. Bootstrapping is an important security point of failure in distributed systems: malicious bootstrap peers could only introduce you to other malicious peers. It is recommended to keep the default list provided by the IPFS dev team, or – in the case of setting up private networks – a list of nodes you control. Don’t add peers to this list that you don’t trust.
But neither place explains what bad things can happen if you add a malicious peer to your bootstrap list. How are bootstrap peers used and how can they be abused?
Basically, in a distributed network like a blockchain or IPFS, all the peers connect to each other without a central server. However, when you set up a new node, it doesn’t just scan the entire internet looking for someone to connect to. Instead, it attempts to connect to one or more “bootstrap” nodes, which are predefined and in most cases ship with the software itself. Once connected to a bootstrap node, your node learns about the other nodes connected to it, and the nodes those nodes know of, and so on, until you’re (hypothetically) connected to all the nodes out there.
In the case of IPFS specifically, it’d be like going to microsoft.com to download Windows 10, except the microsoft.com you reached wasn’t the genuine one but a Trojan site designed to look like it. You could still get the genuine Windows, it’s not impossible, but more likely you’ll receive a modified one (or, in IPFS, a list of other malicious nodes instead of the genuine friendly ones). And if you have a bad Windows, then anything you download through it could also be modified. In IPFS terms, your node could request a file and the swarm of malicious nodes could send you a different file instead, and your node wouldn’t know the difference because the only reference it has is fraudulent.
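To make the discovery step concrete, here is a minimal sketch of "learn about peers from the peers you already know". It is a toy breadth-first crawl over a made-up network, not how IPFS actually walks its DHT; the peer names and the `neighbours` table are hypothetical.

```python
from collections import deque

def discover(bootstrap, neighbours):
    """Breadth-first crawl: start from the bootstrap peers and keep asking
    each newly learned peer which peers it knows."""
    seen = set(bootstrap)
    queue = deque(bootstrap)
    while queue:
        peer = queue.popleft()
        for other in neighbours.get(peer, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen

# Hypothetical toy network: what each peer answers to "who do you know?".
# The malicious peers (evil-boot, m1, m2) only ever mention each other.
neighbours = {
    "good-boot": ["a", "b"],
    "a": ["b", "c"],
    "b": ["a"],
    "c": ["good-boot"],
    "evil-boot": ["m1", "m2"],
    "m1": ["m2", "evil-boot"],
    "m2": ["m1"],
}

honest_view = discover(["good-boot"], neighbours)
eclipsed_view = discover(["evil-boot"], neighbours)
mixed_view = discover(["good-boot", "evil-boot"], neighbours)
```

In this toy setup, bootstrapping only from `evil-boot` never reaches an honest peer, while a mixed bootstrap list still reaches all of them (plus the malicious ones).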
Wikipedia has more information: https://en.m.wikipedia.org/wiki/Bootstrapping_node
Simply put, distributed networks verify content by a sort of “majority rules” method: if the majority of nodes agree that a file is what you wanted to download, that’s what you’ll get. If two nodes have two files with the same identifier, they could both try to send you the file, one of which is presumably not what you wanted. A bootstrapping node serves to make sure you connect first to friendly nodes on the network, so that unfriendly nodes can’t take over so easily.
What CyberVenus is describing is an Eclipse attack (What is an Eclipse Attack? | The Radix Blog | Radix DLT).
It is the decentralized equivalent of a Man In The Middle attack.
In the specific case of IPFS (or BitTorrent), this attack would partially fail, because you ask for content by its hash, so on receiving malicious content, you would hash it and discover it’s fake.
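A minimal sketch of that check, assuming the identifier is simply the SHA-256 digest of the whole payload. Real IPFS CIDs wrap a multihash and large files are split into blocks, so `content_id` and `verify` here are illustrative helpers, not the IPFS API.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Simplified stand-in for a content identifier: hex SHA-256 of the bytes."""
    return hashlib.sha256(data).hexdigest()

def verify(requested_id: str, received: bytes) -> bool:
    """Re-hash what was received and compare it with what was asked for."""
    return content_id(received) == requested_id

genuine = b"hello ipfs"
wanted = content_id(genuine)           # the identifier you request by

ok = verify(wanted, genuine)           # an honest peer sends the real bytes
tampered = verify(wanted, b"evil!!")   # a malicious peer sends something else
```

Because the identifier is derived from the content, a peer that sends different bytes cannot make them pass the check without finding a hash collision.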
If you bootstrap only to malicious peer(s), they can:
- Watch all your requests and track what you ask for and what you provide way more easily. (NB: IPFS is not private, so with some effort, an adversary can do that nonetheless)
- Deny you access to content (by pretending it wasn’t found on the DHT)
- Deny you publication of content (by never propagating updates to the DHT while pretending they did, and/or by never relaying requests from outside to you)
- In some cases, mess with your discoverability if you are behind a NAT and the eclipsing peers don’t cooperate to help you hole-punch through it. Refusing is in their best interest, since it keeps you from “breaking out” of the eclipse
- Cut your access to the network altogether
- Maybe more…
If you connect to at least 1 good node:
- Malicious peers can watch part of your requests and track what you ask for and publish way more easily.
- In some cases, mess with your discoverability if you are behind a NAT, a malicious peer is chosen to help you go through, and that peer doesn’t cooperate to help you hole-punch through it.
- Slow down your access to the network, since only “good” peers will cooperate
- Maybe more…
Didn’t know what it was called. Thanks for that.
Silly me assumed that you’d have to manually hash any files separately after they were downloaded in order to determine they were fake. It didn’t occur to me that the IPFS client would hash the files itself after downloading them… guess it makes sense that it would, though.
An additional note: the “majority rule” wouldn’t help if you are totally eclipsed, because the eclipsing nodes cooperate to provide the same wrong file.
But like I wrote earlier, in IPFS, BitTorrent or other content-addressed networks, the client will check integrity upon receiving.
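To illustrate the difference, here is a small sketch comparing a hypothetical "majority rules" picker with a hash check when every responder colludes. Both helpers are made up for the example, and the plain SHA-256 digest is a simplification of a real content identifier.

```python
import hashlib
from collections import Counter

def majority_pick(responses):
    """Hypothetical 'majority rules': accept whichever payload most peers sent."""
    return Counter(responses).most_common(1)[0][0]

def hash_pick(requested_digest, responses):
    """Content addressing: accept only a payload whose hash matches the request."""
    for data in responses:
        if hashlib.sha256(data).hexdigest() == requested_digest:
            return data
    return None  # nothing valid was received

genuine = b"genuine file"
fake = b"fake file"
requested = hashlib.sha256(genuine).hexdigest()

eclipsed = [fake, fake, fake]      # every connected peer colludes
partly = [fake, fake, genuine]     # one honest peer among the responders

voted_when_eclipsed = majority_pick(eclipsed)           # the fake wins the vote
checked_when_eclipsed = hash_pick(requested, eclipsed)  # everything is rejected
checked_when_partly = hash_pick(requested, partly)      # the genuine file gets through
```

Under a total eclipse the hash check cannot get you the file, but it does stop you from accepting the wrong one, which matches the "deny access" rather than "substitute content" threat described above.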
Thank you both for the responses!
So if I’m understanding correctly, as long as you don’t remove any honest nodes, which would enable an eclipse attack, the dangers of adding more bootstrap nodes (of unknown honesty) are:
- They can mess with DHT functionality, which does depend on peer cooperation.
- They will be able to see what content you’re requesting.
  - But couldn’t they anyways just by adding you to their swarm?
- They can hinder your NAT-punching?
  - How does this part work? Is just any random bootstrap peer chosen, and thus the probability of being discoverable drops to #honest/#total? Or will it just take longer because you’ll keep trying until an honest peer helps you make it work?
I think that your node will ask for information and publish to all the connected peers if you use PubSub, so I think it should be OK, if it’s the gossip protocol you chose. But I don’t know the Bitswap part very well, so double-check on that. And the default gossip protocol is now GossipSub in Go, where I think you don’t contact all the connected nodes. So be sure to carefully read the specs, or explicitly choose PubSub if you’re not sure. This will come with a higher bandwidth cost, of course.
Yes, you are right. The thing is that if it’s a bootstrap peer, they can introduce you to only malicious peers, so a significant part of your connected peers can be malicious. With the default hardcoded safe bootstrap nodes likely being under heavy load and maybe too busy to answer quickly, it’s something to consider. (I haven’t measured that, though. Maybe they are lightning quick.)
I’m not sure about that. You should look up the autoNAT and autoRelay features in the specs and their status in your chosen implementation to be sure. I know your node stops trying eventually, but it may be once it has exhausted all peers or after N tries. IDK.
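Which behaviour a given implementation has is not settled here, but the arithmetic of the two hypotheses from the question can be sketched. The assumptions are loud ones: helpers are picked uniformly at random from the connected peers, malicious peers always refuse, and honest peers always succeed. The function names are made up for the example.

```python
from fractions import Fraction
from math import comb

def p_single_pick(honest: int, total: int) -> Fraction:
    """Hypothesis 1: one helper is picked at random and never replaced."""
    return Fraction(honest, total)

def p_with_retries(honest: int, total: int, tries: int) -> Fraction:
    """Hypothesis 2: up to `tries` distinct helpers are tried in turn.
    Failure means every one of the picked helpers was malicious."""
    malicious = total - honest
    tries = min(tries, total)
    return 1 - Fraction(comb(malicious, tries), comb(total, tries))

# Example: 10 candidate helpers, 4 of them honest.
single = p_single_pick(4, 10)          # 2/5
three_tries = p_with_retries(4, 10, 3) # 1 - C(6,3)/C(10,3) = 5/6
exhaustive = p_with_retries(4, 10, 7)  # more tries than malicious peers: 1
```

Under these assumptions, retrying turns "discoverable with probability #honest/#total" into "discoverable, just later", as long as the node tries more helpers than there are malicious ones.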
NB: I don’t even think you have to use a bootstrap peer. (But I’m not sure you can use any node, though)
All that being said, the specs say that your client will drop bad connections following some heuristics (bandwidth, delay, reliability, uptime, …). I have no idea if it’s implemented yet. If that’s the case, your client will be able to clean up its connection set and keep healthy friends. To be checked. In that case, being snooped on, and performance issues on the specific content the eclipsing nodes don’t want you to access or publish, are the only threats I see. (And they can only filter a few items, or you will drop the connection because they are unreliable at providing content that other peers are able to find.)
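As a rough illustration of that kind of pruning, here is a sketch with an invented scoring rule. The `PeerStats` fields, the 0.7/0.3 weights and the `max_delay_ms` cutoff are all assumptions for the example and are not taken from the libp2p specs.

```python
from dataclasses import dataclass

@dataclass
class PeerStats:
    peer_id: str
    requests: int        # requests we sent to this peer
    answered: int        # requests it answered usefully
    avg_delay_ms: float  # average response delay

def score(p: PeerStats, max_delay_ms: float = 2000.0) -> float:
    """Hypothetical health score in [0, 1]: mostly reliability, partly speed."""
    if p.requests == 0:
        return 0.5  # no evidence yet, stay neutral
    reliability = p.answered / p.requests
    speed = max(0.0, 1.0 - p.avg_delay_ms / max_delay_ms)
    return 0.7 * reliability + 0.3 * speed

def prune(peers, keep: int):
    """Keep the `keep` best-scoring connections and drop the rest."""
    return sorted(peers, key=score, reverse=True)[:keep]

peers = [
    PeerStats("honest-1", requests=100, answered=95, avg_delay_ms=200),
    PeerStats("honest-2", requests=50, answered=45, avg_delay_ms=400),
    PeerStats("filtering", requests=100, answered=20, avg_delay_ms=150),
]
kept = prune(peers, keep=2)
```

With a rule like this, a peer that silently drops most requests ends up with a low score even if it answers quickly, which is why an eclipsing peer could only afford to filter a small share of content without being dropped.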
Maybe you wanna have a look here: https://docs.libp2p.io/concepts/ and https://libp2p.io/implementations/
Actually, forget the autoNAT part. If you can’t get out of your NAT, you won’t be able to bootstrap over the Internet and join the whole network. So if you are in an unfriendly/badly-configured NAT, you’ll need a friend that is both accessible to you and connected to the Internet to get you out. It had better be a good peer, or you’re stuck at this step and can only communicate within your NAT, at most…