Multiple problems when allowing DAPP users to directly interact with your own IPFS gateway

I have spent the whole day reading dozens of articles, FAQs and other questions here on discuss. Yet I can’t find an answer on how to realize my use case:

IPFS Gateway: http://ipfs.example.com
Website: http://example.com (static Vue.js application to interact with IPFS and Ethereum)
No further server-side processing.

Within the JS application, users upload images, which in turn are written to the IPFS gateway:

import ipfsAPI from 'ipfs-api'

const ipfs = ipfsAPI('ipfs.example.com', '5001')

// ...

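// Push to ipfs
const fileReader = new FileReader()

// Runs once the file has been read, e.g. via fileReader.readAsArrayBuffer(file)
fileReader.onload = () => {
  const buffer = Buffer.from(fileReader.result)
  ipfs.add(buffer, {progress: this.progress})
    .then((response) => {
      console.log('ipfs hash', response[0].hash)

      // ...store hash in ethereum...
    })
    .catch((err) => console.error('ipfs add failed', err))
}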

Problem 1:
ipfs.add() does not allow adding the file to a specific directory. I’d have to use object.patch.addLink to append the uploaded file to a known directory object and store the new hash somewhere for subsequent users.

Problem 2:
The directory object’s hash changes every time someone adds a file, which can happen multiple times per second. It’s impossible to guarantee that all current clients agree on what the current hash is.

Problem 3:
Even though I can restrict access to the gateway via CORS to just example.com, a malicious user could still manipulate the client-side JavaScript to circumvent the restriction to JPEG and PNG uploads.

Question 1: Is there any other method to tell the gateway ‘Add this file to the latest version of folder XYZ - you know its hash - and only if the file mime type is either image/png or image/jpeg’?

Problem 4:
All images are loaded individually in image tags like this:
<img src="http://ipfs.example.com/ipfs/QmRuzPpc1tjJ5TbhG7B2Ato8LtaY2DK2Y5DMWWb29cqFF5"/>
A malicious user could just press F12 and enter any hash he wants, forcing my gateway to load illegal content.

Question 2: Is it possible to configure a gateway to serve only already pinned files?

Has anyone built a serverless single page application that stores user data in IPFS yet? How did you get around these problems?

Is that only possible by hiding the gateway behind some server-side logic (e.g. a Node server which queues the uploads + adds the files to IPFS + keeps track of the latest directory hash + also stores the images outside of IPFS to serve as <img> source to prevent loading of unpinned content)?
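To make the "queue the uploads and keep track of the latest directory hash" part concrete, here is a minimal sketch of what such server-side logic might look like. Everything in it is hypothetical: `createDirectoryTracker` is an invented helper, and `addLink` stands in for whatever call appends a file to a directory object (for example object.patch.addLink). The usage below plugs in a fake so the snippet runs without an IPFS node.

```javascript
// Hypothetical helper: serializes directory updates so that each one builds
// on the directory hash produced by the previous one.
// `addLink(dirHash, link)` is an assumed function returning a promise for the
// new directory hash.
function createDirectoryTracker(initialHash, addLink) {
  let latestHash = initialHash
  let tail = Promise.resolve()

  return {
    // Queue one upload; resolves with the directory hash after this upload.
    enqueue(fileHash, name) {
      const run = tail.then(async () => {
        latestHash = await addLink(latestHash, { name, hash: fileHash })
        return latestHash
      })
      // Keep the chain usable even if this particular update fails.
      tail = run.catch(() => {})
      return run
    },
    // The most recent directory hash known to the server.
    current() {
      return latestHash
    }
  }
}

// Usage with a fake addLink that just appends the file name to the hash.
const fakeAddLink = async (dirHash, link) => `${dirHash}+${link.name}`
const tracker = createDirectoryTracker('dir0', fakeAddLink)
const pending = Promise.all([
  tracker.enqueue('QmA', 'a.png'),
  tracker.enqueue('QmB', 'b.jpg')
])
```

Because every update waits for the previous one, two uploads arriving at the same moment cannot both patch the same stale directory hash. Clients would then ask the server for `current()` instead of trying to stay in sync themselves.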

Hi,
Problem 2: this is normal behavior. IPFS hashes are based on a Merkle tree, so the hash of your directory changes whenever you add a link to it.
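A toy illustration of why this happens, assuming a simplified model in which a directory's hash is just the SHA-256 of its link names and hashes. Real IPFS uses multihash-encoded CIDs over DAG nodes, so the values here will not match real IPFS hashes; `directoryHash` is a made-up helper.

```javascript
const crypto = require('node:crypto')

// Toy content hash (SHA-256 hex), NOT the real IPFS hashing scheme.
const sha256 = (data) => crypto.createHash('sha256').update(data).digest('hex')

// In this simplified model, a directory's hash is derived from the names and
// hashes of its links, so any new link produces a new directory hash.
function directoryHash(links) {
  const sorted = [...links].sort((a, b) => a.name.localeCompare(b.name))
  return sha256(sorted.map((l) => `${l.name}:${l.hash}`).join('\n'))
}

const cat = { name: 'cat.png', hash: sha256('cat bytes') }
const dog = { name: 'dog.jpg', hash: sha256('dog bytes') }

const before = directoryHash([cat])
const after = directoryHash([cat, dog])

console.log(before === after) // false: adding a link changes the directory hash
```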

As for your questions, I’m sorry, I’m not sure my knowledge is enough to help with those.

Regards

Hi, some drive-by answers below :slight_smile:

It should be possible to define arbitrary hierarchies for data added via ipfs.add.

Try using wrapWithDirectory along with the path parameter:

const files = [{
  path: 'dir/sub1/sub2/filename',
  content: buffer
}]
const res = await ipfs.files.add(files, {
  wrapWithDirectory: true,
  progress: updateProgress
})

Sounds like you want to look at realtime updates via pubsub: make sure to check this cool example.

If you don’t want to run a public gateway, put go-ipfs behind an additional orchestration app that:

  1. performs validation of added data and rejects it if it does not mime-sniff as one of the whitelisted media types
  2. tracks CIDs of every added file and returns HTTP 403 for non-whitelisted ones
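The two steps above could be sketched roughly as follows. This is only an illustration: the function names are made up, the sniffing checks just the leading magic bytes of PNG and JPEG (it does not validate the rest of the file), and the in-memory Set would need to be persistent storage in a real proxy.

```javascript
// Magic-byte signatures for the two allowed image types.
const SIGNATURES = {
  'image/png': [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a],
  'image/jpeg': [0xff, 0xd8, 0xff]
}

// Returns the sniffed media type, or null if the bytes match neither signature.
function sniffImageType(bytes) {
  for (const [type, sig] of Object.entries(SIGNATURES)) {
    if (bytes.length >= sig.length && sig.every((b, i) => bytes[i] === b)) {
      return type
    }
  }
  return null
}

// CIDs the proxy has accepted so far (hypothetical in-memory store).
const allowedCids = new Set()

// Step 1: validate the upload; remember its CID only if it passes.
function registerUpload(cid, bytes) {
  if (sniffImageType(bytes) === null) return false
  allowedCids.add(cid)
  return true
}

// Step 2: decide the HTTP status for a read request.
function statusForRequest(cid) {
  return allowedCids.has(cid) ? 200 : 403
}
```

The proxy would call `registerUpload` after adding a file to go-ipfs and `statusForRequest` before forwarding any `/ipfs/<cid>` request to it.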

Hope this helps!


Hi lidel,
thank you for this very informative post! :grinning:

So if I add a second file with a path in the same directory, e.g. path: 'dir/sub1/sub2/filename2', both will be appended to the same directory object? That would be cool - but I guess it wouldn’t help if you add files directly from a buffer, would it?

Indeed, that looks promising. I just feel a little uncomfortable letting each user run a client node because it sounds quite heavyweight.
I already started to build a proxy application in front of IPFS as you suggested. The SPA will be totally isolated from the IPFS API. A good starting point is the solution from Decentraland.

I love learning new things about IPFS and possible solutions to complex problems. Thanks for your drive-by :slight_smile:


files.add will work the same, no matter if you add a Buffer or a stream. If you add a list of multiple files/buffers all will be added under the same directory.

See interface-ipfs-core/SPEC/FILES.md → files.add :sparkles:

Depends on how you configure client nodes. By default js-ipfs in the browser will connect to ~8 bootstrap nodes over websockets, but you can change that behavior, e.g. replace them with your own nodes, when you pass options.config to the constructor of your js-ipfs node.

I think if you also tweak the defaults of libp2p’s connection manager via options.connectionManager, it is possible to make it no heavier than any other JS-based app.
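As a rough sketch of what such options might look like: the helper `buildNodeOptions` is made up, the multiaddr and peer ID are placeholders, and the `maxPeers` key is an assumption about the connection manager's option names, so check the js-ipfs documentation for the version you use before relying on it.

```javascript
// Builds constructor options for a browser js-ipfs node that bootstraps only
// from your own nodes and caps the number of open connections.
function buildNodeOptions(bootstrapAddrs, maxPeers) {
  return {
    config: {
      // Replaces the default bootstrap list with your own nodes.
      Bootstrap: bootstrapAddrs
    },
    // ASSUMPTION: `maxPeers` is the option name; verify against your version.
    connectionManager: { maxPeers }
  }
}

const options = buildNodeOptions(
  // Placeholder multiaddr and peer ID, not a real node.
  ['/dns4/ipfs.example.com/tcp/443/wss/ipfs/QmYourNodePeerId'],
  10
)

// With js-ipfs loaded, the options would be passed to the constructor:
// const node = new IPFS(options)
```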

What I meant was adding buffers isolated from each other, like each time a user uploads an image. IPFS wouldn’t know they belong to the same directory and would wrap each one in its own directory.

Hi @haggis, did you find a solution for this? I am trying to do something similar: simply letting the app talk to IPFS and Ethereum directly instead of having to go through a REST API layer with business logic and a DB. So, my questions are:

  1. Since we use IPFS as our backend, we are planning to store the app data in IPFS as well, along with the files. Is it possible or advisable? If not, why?
  2. If 1 is possible, how can we access the app data in IPFS from the app?
  3. How can we ensure that the app data is requested by the right client/user?
  4. What is the best way to access the app data / files in IPFS from the app directly to get better performance?

Thanks in advance.

Hi @decentralizedMe, nope. I went for a layer between the IPFS node and the frontend. Otherwise there’s no way to control who uploads or downloads what, when, and how much to or from your node. Even if you grant access only to requests coming through your domain, anyone can inject custom JavaScript into your website and let it access the IPFS node under the given domain.

To make this work, IPFS would need customizable “filter hooks” so that node operators have more control over what their node is used for.

Possible, yes. Whether it is advisable depends solely on the requirements of your application, the confidentiality of your data, etc.

If you leave your node open to the public, then you can link to it right from your website like this: https://your-ipfs-node.org/ipfs/[hash].
I chose to mirror all application data on Amazon S3 and communicate with IPFS only within backend services (see here and here).

What do you mean by “right user”? That only a particular user is allowed/able to access specific data exclusively? This isn’t possible at all. Once you write data to IPFS, it’s accessible to everyone. However, you could encrypt the data with your own logic to achieve this goal.
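A minimal sketch of the "encrypt with your own logic" idea, assuming Node's built-in crypto module and AES-256-GCM. The `encrypt` and `decrypt` helpers and the payload layout (IV, then auth tag, then ciphertext) are choices made for this example only, and key distribution between users is left out entirely.

```javascript
const crypto = require('node:crypto')

// Encrypts data before it is added to IPFS.
// Payload layout (an assumption of this sketch): 12-byte IV | 16-byte tag | ciphertext
function encrypt(plaintext, key) {
  const iv = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv)
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()])
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext])
}

// Decrypts a payload fetched from IPFS; throws if the key or data is wrong.
function decrypt(payload, key) {
  const iv = payload.subarray(0, 12)
  const tag = payload.subarray(12, 28)
  const ciphertext = payload.subarray(28)
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv)
  decipher.setAuthTag(tag)
  return Buffer.concat([decipher.update(ciphertext), decipher.final()])
}

// The 32-byte key must be shared only with users allowed to read the data.
const key = crypto.randomBytes(32)
const payload = encrypt(Buffer.from('profile data'), key)
```

Anyone can still fetch `payload` by its hash, but only holders of `key` can turn it back into the original bytes with `decrypt(payload, key)`.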

So far I only know about two ways to access IPFS data “directly”.

  1. Fetch data from your hosted node via https://your-ipfs-node.org/ipfs/[hash]
  2. Have a js-ipfs node running on the client side (as suggested above by @lidel)

Best would be to run your own benchmarks as this highly depends on the nature of your data, your infrastructure, users and so on.

Hi @haggis, if we directly access the hash on the node, isn’t that like a centralised solution, so that we lose the advantage of being decentralised? I.e., will all my requests hit the same node, even if the same file is synced to a nearer node?

Also, are there any recommended ways or tools for benchmarking and evaluating performance, e.g. by spawning n nodes?