.ipfs/blocks grew to 2TB despite the StorageMax setting; finding the CID to datastore key conversion

This is my Datastore config:

  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "10GB"
  },

My first question: how has the blocks directory grown to 2TB so far?

Perhaps I need to restart the daemon with
ipfs daemon --enable-gc

Or manually clean with:
ipfs repo gc

Beyond that, I am trying to figure out whether I can use this storage as a fast cache for the files.
I got really stuck trying to figure out how to “convert a CID to a datastore (ds) key”.
Let's say I have this file:
.ipfs/blocks/DY/CIQA272DMOWSAP3RBQQJESHRQWMQM76EAHLAD2XUDXWRMFVKZLLPDYQ.data

I found out that I can prefix the file name with B and run commands like:

ipfs dag resolve BCIQA272DMOWSAP3RBQQJESHRQWMQM76EAHLAD2XUDXWRMFVKZLLPDYQ
ipfs cid format BCIQA272DMOWSAP3RBQQJESHRQWMQM76EAHLAD2XUDXWRMFVKZLLPDYQ -v 0

The second one gives me the CIDv0 QmPFLZYMsVVCgdKwrLC1FxKRQcZHb6A3kNrzMuivJTWDfF
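For a script, the same file name to CIDv0 conversion can be sketched in plain Python without calling ipfs. This is a minimal sketch: the function names are made up for illustration, and it assumes the block uses a sha2-256 multihash (file names starting with CIQ). The file name does not record the content codec, so treating the result as a CIDv0 (dag-pb) is also an assumption.

```python
import base64

# base58btc alphabet, as used by CIDv0 strings
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Base58btc-encode raw bytes."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is written as a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def block_file_to_cidv0(file_name: str) -> str:
    """Turn a flatfs file name such as 'CIQ...DYQ.data' into a CIDv0 string."""
    key = file_name[:-5] if file_name.endswith(".data") else file_name
    # Python's base32 decoder needs '=' padding to a multiple of 8 characters.
    padded = key + "=" * (-len(key) % 8)
    multihash = base64.b32decode(padded)
    # 0x12 = sha2-256, 0x20 = 32-byte digest; only this form maps to CIDv0.
    if multihash[:2] != b"\x12\x20" or len(multihash) != 34:
        raise ValueError("not a sha2-256 multihash")
    return b58encode(multihash)


name = "CIQA272DMOWSAP3RBQQJESHRQWMQM76EAHLAD2XUDXWRMFVKZLLPDYQ.data"
print(block_file_to_cidv0(name))
```

With the file from this post, this should print the same CIDv0 that ipfs cid format reported.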

But I couldn't find a command to do the reverse with the Kubo ipfs tool.

I found some repositories, such as ipfs/js-stores on GitHub (packages/interface-datastore).

And an example of code that uses it:

// Run as an ES module (e.g. save as cid-to-key.mjs). Mixing require() with
// TypeScript's `import Key = require(...)` does not work in plain Node.
import base32 from 'base32.js';
import { Key } from 'interface-datastore';

/**
 * Transform a raw buffer to a base32 encoded key.
 *
 * @param {Uint8Array} rawKey
 * @returns {Key}
 */
const keyFromBuffer = (rawKey) => {
  const enc = new base32.Encoder();
  // Datastore keys always use '/', so do not use the OS-specific path.sep.
  return new Key('/' + enc.write(rawKey).finalize(), false);
};

/**
 * Transform a CID to its datastore key.
 *
 * @param {CID} cid
 * @returns {Key}
 */
const cidToDsKey = (cid) => {
  // cid.buffer is the property used by the original example; for a CIDv0
  // these bytes are exactly the multihash.
  return keyFromBuffer(cid.buffer);
};

But with my limited TypeScript experience I couldn't get it to work; it complains:

[ERR_PACKAGE_PATH_NOT_EXPORTED]: No "exports" main defined in ./node_modules/interface-datastore/package.json

Also, some links might be in this format:

ipfs://bafybeiao4wmudgiy32muigyaes6zs76ks5yikq56yjigaa46ksji4nhoua/11635.json

How can I also convert the ‘/11635.json’ part to get the filesystem path?
Or, in this case, how do I test whether the above link is present locally on the system?

I want to scan every block and see which CID it is. That seems doable with the commands I mentioned, but I would also like to get the ‘/11635.json’ part from the datastore key.
And conversely, given any link (it may be just a CID in v0 or v1, or it may also have a sub-path like ‘/11635.json’), I want to get its local datastore key and/or a command to check whether it is present locally.

StorageMax does not set a hard limit; it only defines the point at which GC should run when automatic GC is enabled (ipfs daemon --enable-gc). When GC runs, it deletes everything that is not pinned.
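As a rough illustration of the numbers in the config above: with automatic GC enabled, collection is triggered once the repo size reaches StorageGCWatermark percent of StorageMax. The sketch below only does that arithmetic; the function name is made up, and reading "10GB" as 10 * 10^9 bytes is an assumption.

```python
def gc_trigger_bytes(storage_max_bytes: int, watermark_percent: int) -> int:
    """Repo size at which automatic GC would start, in bytes."""
    return storage_max_bytes * watermark_percent // 100


# Values from the config in this thread: StorageMax "10GB", StorageGCWatermark 90.
# Assumption: "10GB" means 10 * 10**9 bytes (decimal units).
storage_max = 10 * 10**9
print(gc_trigger_bytes(storage_max, 90))  # 9000000000
```

Without --enable-gc nothing watches this threshold, which is consistent with the blocks directory growing far past 10GB.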

Try ipfs cid format -b base32upper -f '%M' QmPFLZYMsVVCgdKwrLC1FxKRQcZHb6A3kNrzMuivJTWDfF, which does base32upper encoding on the multihash bytes of a CID.
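If you want the same conversion inside a script, here is a minimal Python sketch of what that command does: extract the multihash bytes from the CID and encode them as uppercase base32 without padding. The helper names are made up, and only two CID string forms are handled (CIDv0 starting with Qm, and CIDv1 in lowercase base32 starting with b); anything else raises an error.

```python
import base64

# base58btc alphabet, as used by CIDv0 strings
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    """Decode a base58btc string to raw bytes."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # Each leading '1' stands for one leading zero byte.
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def skip_varint(data: bytes, pos: int) -> int:
    """Return the index just past the unsigned varint that starts at pos."""
    while data[pos] & 0x80:
        pos += 1
    return pos + 1


def cid_to_multihash(cid: str) -> bytes:
    """Extract the multihash bytes from a CIDv0 or base32 CIDv1 string."""
    if len(cid) == 46 and cid.startswith("Qm"):
        return b58decode(cid)  # a CIDv0 is just the base58btc multihash
    if cid.startswith("b"):  # multibase prefix for lowercase base32
        body = cid[1:].upper()
        raw = base64.b32decode(body + "=" * (-len(body) % 8))
        pos = skip_varint(raw, 0)    # CID version
        pos = skip_varint(raw, pos)  # content codec
        return raw[pos:]
    raise ValueError("unsupported CID encoding")


def cid_to_ds_key(cid: str) -> str:
    """Uppercase base32 (no padding) of the multihash, as in flatfs file names."""
    return base64.b32encode(cid_to_multihash(cid)).decode("ascii").rstrip("=")


print(cid_to_ds_key("QmPFLZYMsVVCgdKwrLC1FxKRQcZHb6A3kNrzMuivJTWDfF"))
```

Because only the multihash is used, a CIDv0 and a CIDv1 of the same block map to the same key.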

However, I'm not sure what the point is of interacting with the low-level flatfs datastore directly. You can also check whether a block is present with ipfs --offline block stat <cid>.

Because /11635.json is a named link inside the object referenced by the original CID, you would need to decode the protobuf data of that object, find the CID that /11635.json points to, and test that. Or you can run ipfs --offline block stat <cid>/whatever.json, which does it too.

Great, thanks, at least something is sorted.
So, the ipfs command for the ‘CID to ds key’ conversion is
ipfs cid format -b base32upper -f '%M' <cid>

I had tried it with only the -f '%M' option, so I also need to specify -b base32upper. OK.

I wanted to edit the initial message to say that I had already tried the commands for checking whether a CID is present locally and that they returned an error, but the edit was blocked by the Akismet spam bot…

So here is how it looks for me now:

.ipfs/blocks #  ipfs cid format -b base32upper -f '%M' QmPFLZYMsVVCgdKwrLC1FxKRQcZHb6A3kNrzMuivJTWDfF
CIQA272DMOWSAP3RBQQJESHRQWMQM76EAHLAD2XUDXWRMFVKZLLPDYQ
.ipfs/blocks #  ll DY/CIQA272DMOWSAP3RBQQJESHRQWMQM76EAHLAD2XUDXWRMFVKZLLPDYQ.data 
-rw------- 1 root root 585 Jan  5 03:17 DY/CIQA272DMOWSAP3RBQQJESHRQWMQM76EAHLAD2XUDXWRMFVKZLLPDYQ.data
.ipfs/blocks #   ipfs --offline block stat QmPFLZYMsVVCgdKwrLC1FxKRQcZHb6A3kNrzMuivJTWDfF
Error: block was not found locally (offline): ipld: could not find QmPFLZYMsVVCgdKwrLC1FxKRQcZHb6A3kNrzMuivJTWDfF

And I still do not understand the ‘/whatever.json’ case, as on the local filesystem it's always CIQWHATEVER.data

I can't explain that, unless the ipfs daemon is running as a different, non-root user or something, and reading from a datastore in a different location.

<cid>/whatever.json resolves to a single CID. You need to resolve the path first and then you have the cid of the corresponding block.

I found that the issue was that IPFS_PATH was not set, so the ipfs command was looking in ~/.ipfs.
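Putting the last few replies together (resolve the path first, then check the resulting block, with IPFS_PATH pointing at the right repo), a local-presence check could be sketched like this. It assumes the ipfs binary is on PATH; the function names and the repo argument are made up for illustration. It only checks the block the path resolves to, not every chunk of a large file.

```python
import os
import subprocess
from typing import Optional


def to_ipfs_path(link: str) -> str:
    """Normalise 'ipfs://<cid>/<sub/path>' or a bare CID to '/ipfs/<cid>/...'."""
    if link.startswith("ipfs://"):
        return "/ipfs/" + link[len("ipfs://"):]
    if link.startswith("/ipfs/"):
        return link
    return "/ipfs/" + link


def is_local(link: str, repo: Optional[str] = None) -> bool:
    """Return True if the block that `link` resolves to is in the local repo."""
    env = dict(os.environ)
    if repo:
        # Without IPFS_PATH the CLI falls back to ~/.ipfs.
        env["IPFS_PATH"] = repo
    # Step 1: resolve '<cid>/name' to the CID of the target, without the network.
    resolved = subprocess.run(
        ["ipfs", "--offline", "resolve", to_ipfs_path(link)],
        env=env, capture_output=True, text=True,
    )
    if resolved.returncode != 0:
        return False  # some block along the path is missing locally
    cid = resolved.stdout.strip().rsplit("/", 1)[-1]
    # Step 2: ask for the block itself, still offline.
    stat = subprocess.run(
        ["ipfs", "--offline", "block", "stat", cid],
        env=env, capture_output=True, text=True,
    )
    return stat.returncode == 0
```

For example, is_local("ipfs://bafybeiao4wmudgiy32muigyaes6zs76ks5yikq56yjigaa46ksji4nhoua/11635.json", repo="/path/to/.ipfs") would run both steps against that repo; the repo path here is a placeholder.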

I remember finding some documentation (maybe the --help of a Kubo command) that describes how the local blocks filesystem is laid out, but I can't find it again now.
I remember it referring to how GitHub organises its file system: one level of directories with 2-character names.
For example, from the hash CIQA272DMOWSAP3RBQQJESHRQWMQM76EAHLAD2XUDXWRMFVKZLLPDYQ it takes DY, the first 2 of the last 3 characters, and puts the file at .ipfs/blocks/DY/CIQA272DMOWSAP3RBQQJESHRQWMQM76EAHLAD2XUDXWRMFVKZLLPDYQ.data
But where can I find this documentation, so that I can write a script that converts a hash to a filesystem path?
Is the directory always the first 2 of the last 3 characters?
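For the shardFunc in the config at the top of this thread (/repo/flatfs/shard/v1/next-to-last/2), that rule can be sketched in Python as follows: the directory is the 2 characters that come just before the last character of the key. The function names are made up, and the sketch assumes the key is longer than the shard width.

```python
from pathlib import PurePosixPath


def shard_width(shard_func: str) -> int:
    """Read the width from a spec like '/repo/flatfs/shard/v1/next-to-last/2'."""
    name, width = shard_func.rstrip("/").rsplit("/", 2)[-2:]
    if name != "next-to-last":
        raise ValueError("this sketch only handles next-to-last sharding")
    return int(width)


def block_path(blocks_dir: str, key: str, width: int = 2) -> PurePosixPath:
    """Path of the .data file for a flatfs key under next-to-last/<width>."""
    if len(key) <= width:
        raise ValueError("key too short for this sketch")
    # Skip the final character, then take the <width> characters before it.
    shard = key[-(width + 1):-1]
    return PurePosixPath(blocks_dir) / shard / (key + ".data")


key = "CIQA272DMOWSAP3RBQQJESHRQWMQM76EAHLAD2XUDXWRMFVKZLLPDYQ"
width = shard_width("/repo/flatfs/shard/v1/next-to-last/2")
print(block_path(".ipfs/blocks", key, width))
```

For the key above this yields the DY directory, matching the file that exists on disk.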

UPDATE: I have found a quite comprehensive description, "Understanding the filenames inside .ipfs/blocks/", of how to convert a CIDv0 to the CIDv1 form used in the blocks filesystem, then to the actual filename, and vice versa.
The documentation I was looking for is in .ipfs/blocks/_README

It also mentions that bigger files are split into several blocks.

My idea was to create symlinks to the pinned files so that they look like normal local files. This is because this domain/path is already configured to be cached by Cloudflare, whereas caching a separate IPFS node might be problematic because it may hold many undesired/unpinned files.

The symlink method works perfectly, but according to the Reddit article it will not work properly for bigger files.