Hey there! This might be a stupid question, but what does a node need to do to be considered a “public” node?
As a bit of context, I would like to contribute to the IPFS ecosystem by launching my own node(s) into the swarm, but I don’t really know whether launching the daemon and forgetting about it is enough, or whether there’s some kind of special setup I need to do (I have seen some posts out there that leave /api/v0/add open for some reason).
Is there any official documentation for this? Should we make some?
Is there a need to make it a public gateway too?
And on the more technical side: I haven’t dived deep into the IPFS papers, but I assume a node needs a somewhat decent CPU to work with file blocks. Could this run on a 500GB server with a crappy CPU from OVH, or is it better to get a smaller (30GB) VPS from one of the well-known providers with a fraction of a Xeon CPU?
Again, I think this should have been answered somewhere (and the documentation needs this kind of “running your own public/private things” guide), but I didn’t find anything official.
You should never open up the API port (5001 by default) to the internet. If you do, any data stored on your node can be removed, anyone can get copies of all your IPFS keys, etc., and you basically give random internet users full control over your node. You could open up the gateway port (8080 by default), but never open up the API port.
That aside, there is nothing special that must be done to participate in the network. Just launch your node, make sure you have peers, and realistically that’s all that needs to be done. If you want to take it a step further, you can become a circuit relay hop and do a couple of other cool things.
One thing I would suggest is setting Pubsub.Router to gossipsub in your config. Additionally, when launching your node you might want to enable IPNS over pubsub with the command-line flag --enable-namesys-pubsub.
In terms of physical resources needed, you can technically run an IPFS node on a Raspberry Pi. So it’s really about what your budget is, and how much you want to spend on running an IPFS node.
To take it a step further, you can spin up multiple IPFS nodes and join them together via ipfs-cluster to ensure any data pinned on your node is replicated across multiple hosts.
If you want to monitor your nodes, my personal preference is Zabbix. I’ve got a couple of neat tools for Zabbix to monitor IPFS and IPFS Cluster nodes.
Thanks for your reply @postables! I’ll comment section by section:
In this case they only opened the /api/v0/add, not the entire API.
About hosting public gateways, is there some sort of website that lists public gateways? I only know of two: IPFS and Cloudflare.
I will take a look at that!
What are the benefits of all of this? Just point me to any documentation I should read!
Yeah, I know about that, but I had a spare OVH Kimsufi server lying around (the cheapest one, with a really crappy CPU) and I didn’t know whether that was enough or whether I should just spin up VPSs at DigitalOcean/ScaleWay/Vultr/… that have better CPUs but less storage.
Yeah, that’s another thing I need to take a look at. Especially connecting my local IPFS node to my remote nodes, so that stuff I publish on my local node gets pinned on the remote ones and is updated automatically.
You’ll want to make sure you’re doing that via something like a reverse proxy with ACLs, as the IPFS daemon itself doesn’t have that level of granularity with access controls.
Which model do you have? If you’re using a model with the Intel Atom you might have problems, but I’m not sure, to be honest. Version 0.4.19 had a CPU performance issue caused by a bitswap change. However, if you build the IPFS daemon from the master branch, it includes a fix for that.
I have the KS-1. I only used it to put some files on there, and it’s currently unused (well, with an IPFS node installed but not 100% properly configured). The CPU is an Intel Atom D425, running IPFS 0.4.18; I’m waiting for the 0.4.19 release to land in the Arch repos before upgrading.
I get why you wouldn’t want to open the API port, but is it correct that you can get access to IPFS keys from there? Why would that even be possible?
Any keys you store, including the peer ID key, are available by listing the stored keys. So if you don’t block that command, or at least wrap it with authorization, anyone with access can get your node’s private key, on top of messing with other things like removing pinned data.
I believe you would also have access to the config command, in which case you can display the information in there, like the encoded private key of your node.
The main issue, I think, is that there’s no way to do fine-grained authorization of API calls, but I believe that is on the roadmap of things to change. You just really shouldn’t give access to the API, and if you do, you should wrap it with some layer of authentication.
For example, I’ve got a pretty basic Golang reverse proxy that blocks calls to non-whitelisted commands. Proxy code here, whitelisted commands here. You can probably accomplish the same thing with nginx, but I’m not too familiar with nginx.
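To give a rough idea of what such a filtering proxy can look like, here is a minimal sketch in Go. It is not the proxy linked above: the whitelist contents, the names isAllowed and newFilteringProxy, and the listen port 5002 are all assumptions made up for illustration. It only filters by exact path and does no authentication, so anyone who can reach it can still call the whitelisted commands.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// allowed lists the only API paths the proxy forwards. This set is a
// hypothetical example, not the whitelist from the post above.
var allowed = map[string]bool{
	"/api/v0/add": true,
	"/api/v0/cat": true,
}

// isAllowed reports whether a request path is whitelisted. Only exact
// matches pass, so variants like "/api/v0/add/../key/list" are rejected.
func isAllowed(path string) bool {
	return allowed[path]
}

// newFilteringProxy forwards whitelisted requests to target and answers
// everything else with 403 Forbidden.
func newFilteringProxy(target *url.URL) http.Handler {
	proxy := httputil.NewSingleHostReverseProxy(target)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !isAllowed(r.URL.Path) {
			http.Error(w, "command not allowed", http.StatusForbidden)
			return
		}
		proxy.ServeHTTP(w, r)
	})
}

func main() {
	// Assumes the daemon's API listens on the default 127.0.0.1:5001.
	target, err := url.Parse("http://127.0.0.1:5001")
	if err != nil {
		log.Fatal(err)
	}
	// Port 5002 is an arbitrary choice for this sketch.
	log.Fatal(http.ListenAndServe(":5002", newFilteringProxy(target)))
}
```

With something like this in front, the daemon’s API itself stays bound to localhost and only the proxy port is exposed. Anything not in the map, such as the key or config commands, never reaches the daemon.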
The default setting is to only listen for API connections from localhost, so it’s not too big of an issue imo. It’s just genuinely a bad idea to allow unfiltered access to the API.
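If you want to double-check that yourself, here is a small sketch in Go that tests whether an API address in the multiaddr form used in the config (the default being /ip4/127.0.0.1/tcp/5001) is loopback-only. The helper name isLoopbackAPI is made up for this example, and it deliberately handles only the simple /ip4/… and /ip6/… forms, treating anything else as unsafe.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// isLoopbackAPI reports whether a multiaddr-style API address such as
// "/ip4/127.0.0.1/tcp/5001" binds to a loopback IP only. It is a
// simplified check, not a full multiaddr parser.
func isLoopbackAPI(addr string) bool {
	parts := strings.Split(strings.TrimPrefix(addr, "/"), "/")
	if len(parts) < 2 || (parts[0] != "ip4" && parts[0] != "ip6") {
		return false // unknown form: treat as unsafe
	}
	ip := net.ParseIP(parts[1])
	return ip != nil && ip.IsLoopback()
}

func main() {
	addrs := []string{
		"/ip4/127.0.0.1/tcp/5001", // default: localhost only
		"/ip4/0.0.0.0/tcp/5001",   // listens on every interface
	}
	for _, a := range addrs {
		fmt.Printf("%s loopback-only: %v\n", a, isLoopbackAPI(a))
	}
}
```

The first address reports true and the second false. If the address your daemon is configured with comes back false, the API is reachable from other machines unless a firewall or proxy sits in front of it.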