After my local server stops hosting the file, where does the file go?

Since I have an extremely limited background in tech, I am having difficulty wrapping my head around the fundamentals of IPFS. That said, from what I do understand from a consumer point of view, I am extremely excited about the potential this can create in the world.

While doing some internal tests, I uploaded a file and sent it to my friend: https://ipfs.io/ipfs/QmPNVt1aNN4KTZhjtS91PtRgrdNCcEL6N2sooScfWfgh42

However, after I stopped running my local “Daemon” server, the file still seems to be active and viewable (even though no other person has this file… that I know of). Can you help me understand how this link is still active even though I am not hosting it on my server anymore?

Thanks,

There are two nodes currently seeding the file. If these are not your own and your friend’s nodes, then someone else has cached or even pinned the file, i.e. it’s visible to everyone who knows its multihash, as long as those two nodes are online.

ipfs dht findprovs QmPNVt1aNN4KTZhjtS91PtRgrdNCcEL6N2sooScfWfgh42
QmSFb4crR8GaDw9Ypb1GSAuhmVheQyo9cMfDqtYjQEvRpD
Qmbut9Ywz9YEDrz8ySBSgWyJk41Uvm2QJPhwDJzJyGFsD6

Note: The gateway probably also has the file cached (it usually caches nodes for a few hours at least).

Thank you for the reply, that clears up a lot. If you don’t mind, that prompted just a few last questions:

  1. For a device to become a “node”, do they have to have IPFS running on their end? For example, if I email the hash link above to a friend (who has never even heard of IPFS) and they open it, are they now a node? Or must they be running the daemon?

  2. After I shut down the terminal running “Daemon”, is my node still active? How can I check?

  3. The “Gateway” you mentioned sounds like a good feature in keeping early links alive. That said, I am still confused on “where” the gateway is saving this information. Does IPFS have insight/control of the gateway?

I apologize if these questions are trivial or use incorrect terms. I want to bring this technology to the masses/everyday person, so understanding this foundational flow is helpful. Thanks for your time.

Good questions.

Once a file is added, it is replicated in the IPFS network; other nodes will cache a copy. So even if you stop your node, others can still get the data!

The “Gateway” is a server in the cloud that allows HTTP access to IPFS files. The link you send in an email should be the “Gateway” URL plus the file hash; in your case https://ipfs.io/ipfs/QmPNVt1aNN4KTZhjtS91PtRgrdNCcEL6N2sooScfWfgh42

This means that your friend can see the file without knowing anything about IPFS.
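To make that concrete, a gateway URL is just the gateway’s host followed by /ipfs/ and the file’s hash. A minimal sketch (the hash is the one from this thread; ipfs.io is the public gateway mentioned above):

```shell
# Build a shareable gateway URL from a file hash.
HASH="QmPNVt1aNN4KTZhjtS91PtRgrdNCcEL6N2sooScfWfgh42"
GATEWAY="https://ipfs.io"     # public gateway run by the IPFS project
URL="$GATEWAY/ipfs/$HASH"
echo "$URL"
# -> https://ipfs.io/ipfs/QmPNVt1aNN4KTZhjtS91PtRgrdNCcEL6N2sooScfWfgh42
```

Anyone with a browser can open that URL; the gateway fetches the blocks from the IPFS network on their behalf.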

For a device to become a “node”, do they have to have IPFS running on their end? For example, if I email the hash link above to a friend (who has never even heard of IPFS) and they open it, are they now a node? Or must they be running the daemon?

For now, they must be running a go-ipfs daemon. In the future, we’d like to make a gateway (maybe the default; this has yet to be determined) use js-ipfs, so browsers will also be able to act as (light) nodes.

After I shut down the terminal running “Daemon”, is my node still active? How can I check?

No, the daemon is your node. You can check if your node is online by trying to connect to it from another node: ipfs swarm connect /ipfs/YourOtherPeerID. However, this may not work depending on your network setup (firewalls and NATs). We’re working on improving that with relays, but that’s still in progress.
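As a sketch, assuming go-ipfs is installed and the daemon is running on the first machine (the peer ID below is a made-up placeholder; substitute your real one):

```shell
# On your machine, find your node's peer ID (requires a running daemon):
#   ipfs id -f '<id>'
PEER_ID="QmYourPeerIDHere"   # hypothetical placeholder, not a real peer ID
ADDR="/ipfs/$PEER_ID"        # the address another node would dial
echo "$ADDR"
# On a second machine with its own daemon, try:
#   ipfs swarm connect /ipfs/QmYourPeerIDHere
# A quick local sanity check that your daemon is up and connected:
#   ipfs swarm peers
```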

The “Gateway” you mentioned sounds like a good feature in keeping early links alive. That said, I am still confused on “where” the gateway is saving this information. Does IPFS have insight/control of the gateway?

The gateway is just a service we run (on ipfs.io, currently). Actually, every go-ipfs node also has a gateway; it just isn’t exposed to the network. When running a node, you can always replace https://ipfs.io/ipfs/... with http://localhost:8080/ipfs/....
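For example, assuming a daemon running with the default gateway port (8080), the same file from this thread can be fetched through your own node instead of ipfs.io:

```shell
HASH="QmPNVt1aNN4KTZhjtS91PtRgrdNCcEL6N2sooScfWfgh42"
LOCAL_URL="http://localhost:8080/ipfs/$HASH"   # your node's own gateway
echo "$LOCAL_URL"
# With the daemon running, this serves the file from your local node:
#   curl "$LOCAL_URL"
```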

As for the other part of your question, gateways are also nodes so they cache the data in the local repo. This repo will be cleaned (non-“pinned” objects will be removed) if you either (a) have GC enabled and run out of space or (b) manually run ipfs repo gc. Note: Everything you add manually via ipfs add ... is pinned by default so GC won’t delete it.
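The pin/GC lifecycle above can be sketched with a few commands (they need a running go-ipfs daemon to try for real, so they’re shown as comments; myfile.txt and <hash> are placeholders):

```shell
#   ipfs add myfile.txt            # adds AND pins the file, so GC keeps it
#   ipfs pin ls --type=recursive   # list the pinned roots
#   ipfs pin rm <hash>             # unpin it
#   ipfs repo gc                   # now the blocks can actually be collected
# Rule of thumb:
RULE="pinned blocks survive ipfs repo gc; unpinned cached blocks may not"
echo "$RULE"
```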

Not necessarily. If another node fetches the content from your node, they’ll cache and serve a copy. However, it isn’t automatically replicated in the background. You’ll see this behavior if you add the file and then fetch it from ipfs.io but that’s only because ipfs.io has now fetched a copy (so it could serve it to you).


To follow up on this.

Does IPFS make any determination about the “nearness” of nodes when fetching data? For example, say I want to fetch an ISO of the latest version of Ubuntu. Will IPFS prioritize nodes in my country before pulling data from other countries? If somebody on my own network has the file, will it prioritize that so I don’t choke up the company’s bandwidth to the provider?

Thanks.

At the moment, no. However, that’s something we’d like. The plan is to extend something we call bitswap sessions.

First, a bit of background. In IPFS, we break files into chunks and then build a merkle-tree structure on top of them (kind of like BitTorrent). So,

  • Basic bitswap: Ask everyone we’re connected to for every piece of every file we want.
  • Current bitswap: Ask everyone for the root, add those that respond to the “session”, and only ask nodes in the session for the children.
  • Next step (hopefully before the end of 2017): Ask everyone for the root, add those that respond to the session, but weight them by response time. Ask the first responders (likely the closest peers) for the children (and then ask the next fastest, etc., until we actually find what we’re looking for).
  • Future: In addition to ranking by latency, rank by “cost” (i.e., allow custom per-connection cost functions).

Thanks.

I think this would be a really good feature to have when trying to share files between co-workers or at a university.