Vector for abuse, any solutions?

I’m new to IPFS and am just trying to get a website up and running. While doing so I noticed there doesn’t seem to be anything stopping me from using visitors’ nodes to cache more than just the page they are viewing. My understanding is that your local node caches any blocks you access; if that’s the case, then it could be abused to force the caching of other material. If I’m browsing sites on IPFS, there’s a lot going on in the background: fetching images, JS, CSS, XHR, etc. To me there seems to be an incentive for websites to download as much as they can in the background while the user is on a page, to maximise the availability of their content.
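To illustrate what I mean (the CIDs are made up, and I’m assuming the page is being viewed through the visitor’s local path gateway), a page could quietly pull unrelated content onto the visitor’s node with something like:

```typescript
// Placeholder CIDs for content the visitor never asked to see.
const unrelatedCids = [
  "bafybeia...",
  "bafybeib...",
];

async function prefetchInBackground(): Promise<void> {
  for (const cid of unrelatedCids) {
    // When the page is served from the local path gateway, this is a
    // same-origin request, and the visitor's node will fetch and cache
    // every block behind the CID just to answer it.
    await fetch(`/ipfs/${cid}`).catch(() => {
      // Ignore failures; this runs silently while the user reads the page.
    });
  }
}

prefetchInBackground();
```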

Potentially an even bigger issue would be the caching of illegal material without the user’s knowledge when they visit what seem to be perfectly legitimate websites.

Am I misunderstanding something, or is this a known issue? And are there any ways to mitigate such abuse?

One mitigation would be to garbage collect often. That’ll wipe out anything you haven’t intentionally pinned.
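On the command line that’s just running `ipfs repo gc` regularly (or starting the daemon with `--enable-gc`). A script that does the same thing over the node’s HTTP RPC API might look roughly like this (default API address assumed):

```typescript
// Trigger a GC run on the local node (API address assumed to be the default).
// Newer go-ipfs/Kubo versions require POST for RPC calls.
async function runGc(apiUrl = "http://127.0.0.1:5001"): Promise<void> {
  const res = await fetch(`${apiUrl}/api/v0/repo/gc`, { method: "POST" });
  if (!res.ok) {
    throw new Error(`gc failed: ${res.status} ${res.statusText}`);
  }
  // The endpoint streams one JSON object per removed block; just print it here.
  console.log(await res.text());
}

runGc().catch(console.error);
```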

FYI, websites can already do this (although you usually don’t end up serving this data).

But yes, this is a very real (and complex) problem without any clear solutions. Things we can do:

  • Provide blocklists for known “bad bits” (using double hashing to avoid indexing the bad bits).
  • Allow users to determine which websites/apps they want to make available. To make this work, we’d have to associate every block with each “origin” through which we downloaded the block. (HARD)
  • Allow users to explicitly decide what data they want to make available. That is, we could allow websites to add a “pin this!” button (kind of like a “like this!”) that would (at the user’s request) tell the IPFS daemon to make the requested resource available to the network (rough sketch below).
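To sketch that last idea (everything here is hypothetical: the CID is a placeholder, the default API address is assumed, and in practice the node’s API blocks cross-origin requests unless the user has explicitly allowed the site), a “pin this!” button might boil down to:

```typescript
// Placeholder CID for the resource the user chose to help host.
const RESOURCE_CID = "bafybei...";

// Ask the user's local node to pin the resource so it stays available
// to the network (and survives garbage collection).
async function pinResource(cid: string, apiUrl = "http://127.0.0.1:5001"): Promise<void> {
  const res = await fetch(
    `${apiUrl}/api/v0/pin/add?arg=${encodeURIComponent(cid)}`,
    { method: "POST" },
  );
  if (!res.ok) {
    throw new Error(`pin failed: ${res.status}`);
  }
}

// Only pin in response to an explicit click, i.e. at the user's request.
document.querySelector("#pin-this")?.addEventListener("click", () => {
  pinResource(RESOURCE_CID).catch(console.error);
});
```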

Websites can already download stuff without the user’s knowledge, but like you say they don’t serve that content, so website authors have no reason to put unnecessary load on their own servers.

I was giving it some thought and did come up with one potential solution that might at least remove the incentive to cache as much content as possible on the user’s node. It should be possible for the HTTP server to deny access to anything outside of the root hash by using the referrer in the request. Then, if you get the HTTP server to wait for the entirety (or at least a good chunk) of the root hash to download before providing the entry point, you encourage websites to minimise the content available to them on a single page.
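Very roughly, the referrer check I have in mind looks like this (a hypothetical front-end server; the port is made up, and it doesn’t cover the “wait for the root hash to finish downloading” part):

```typescript
import { createServer } from "node:http";

const GATEWAY_PORT = 8081; // made-up port for this front-end

// Extract the root CID from a /ipfs/<cid>/... path, if there is one.
function rootCidOf(pathname: string): string | null {
  const match = pathname.match(/^\/ipfs\/([^/]+)/);
  return match ? match[1] : null;
}

const server = createServer((req, res) => {
  const url = new URL(req.url ?? "/", `http://localhost:${GATEWAY_PORT}`);
  const requestedRoot = rootCidOf(url.pathname);
  const referer = req.headers.referer;

  if (requestedRoot && referer) {
    let refererRoot: string | null = null;
    try {
      refererRoot = rootCidOf(new URL(referer).pathname);
    } catch {
      // Malformed Referer: treat it as absent.
    }
    if (refererRoot && refererRoot !== requestedRoot) {
      // A page under one root hash is asking for content under a different
      // root hash: refuse, so it can't push arbitrary extra content at us.
      res.writeHead(403);
      res.end("cross-root requests are not allowed");
      return;
    }
  }

  // Entry points (no Referer) and same-root requests fall through to
  // whatever actually serves the content.
  res.writeHead(200);
  res.end(`would serve ${url.pathname}`);
});

server.listen(GATEWAY_PORT);
```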

Such a solution does limit some use cases (e.g. streaming video) for IPFS, so it might be a case of having user permission to add exceptions.