Disk space consumption in IPFS

Hello,

Unfortunately, this is where you went wrong.

The size is summed from the directory entries. These directories and nodes come from ipfs/go-ipfs-files on GitHub (described there as an old files library; users are asked to migrate to `github.com/ipfs/go-libipfs/files` instead). Directory and Node are interfaces: the actual type behind them is the unixfs implementation (called from here, which ends up at https://github.com/ipfs/go-unixfs/blob/v0.4.0/file/unixfile.go#L153).

In terms of directories, the underlying type would be ufsDirectory (https://github.com/ipfs/go-unixfs/blob/59752aec6306c2ca2d9a020a2a9556d5f8bce956/file/unixfile.go#L19). This has a Size() method, which returns the sum of the sizes of the links (at least for basicDirectories). That is the size of the data of a unixfs directory node itself.

The gateway wants to show the sum of the sizes of the files and folders in the directory. The size of a unixfs folder is counted as I described just above. The rest are unixfs files, which are of type ufsFile, and that type implements the Size() method by calling FileSize().

FileSize() eventually reads the FileSize field of the unixfs node. This does not require any computation, because the value is already stored in the unixfs file protobuf.

As you know, a big file is made of many blocks glued together in a MerkleDAG. The FileSize is set during the construction of the DAG and contains the sum of the sizes of the blocks (https://github.com/ipfs/go-unixfs/blob/59752aec6306c2ca2d9a020a2a9556d5f8bce956/importer/balanced/builder.go#L157-L170).
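
To make that concrete, here is a minimal sketch of the idea. It is not the go-unixfs builder: the `node` type and the `chunk` and `build` helpers are names I made up for illustration. It only shows how a size recorded on each leaf can be summed into the parents while the tree is assembled, so that the root ends up holding the total.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a unixfs file node: leaves hold data,
// inner nodes hold links. fileSize mirrors the stored FileSize field.
type node struct {
	data     []byte
	children []*node
	fileSize uint64
}

// chunk splits data into leaves of at most size bytes; each leaf records
// the length of its own payload.
func chunk(data []byte, size int) []*node {
	var leaves []*node
	for len(data) > 0 {
		n := size
		if len(data) < n {
			n = len(data)
		}
		leaves = append(leaves, &node{data: data[:n], fileSize: uint64(n)})
		data = data[n:]
	}
	return leaves
}

// build groups nodes under parents with at most width links (width must be
// at least 2) until a single root remains. Each parent's fileSize is the sum
// of its children's fileSize, computed once while the tree is built.
func build(level []*node, width int) *node {
	if len(level) == 0 {
		return &node{}
	}
	for len(level) > 1 {
		var next []*node
		for i := 0; i < len(level); i += width {
			end := i + width
			if end > len(level) {
				end = len(level)
			}
			parent := &node{children: level[i:end]}
			for _, c := range parent.children {
				parent.fileSize += c.fileSize
			}
			next = append(next, parent)
		}
		level = next
	}
	return level[0]
}

func main() {
	root := build(chunk(make([]byte, 1000), 256), 2)
	// The total is read from the root alone, without walking the tree.
	fmt.Println(root.fileSize) // prints: 1000
}
```

Reading the size later is then a single field lookup on the root, which is why FileSize() is cheap.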

And this is why the directory view in the gateway can report the total size of the files in the directory. Note this requires fetching the root node for every item in the directory, which is why a recent optimization avoids that for directories with many entries, in which case I think it won’t show the size.
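
The listing logic can be pictured roughly like this. This is a sketch under my own assumptions, not the gateway code: `sizer` is a cut-down stand-in for the Node interface, `file` and `dir` are invented types, and `maxEntries` is a hypothetical cutoff that only illustrates the "skip the size for big directories" behaviour.

```go
package main

import "fmt"

// sizer is a hypothetical, reduced version of the Node interface:
// only the Size method matters here.
type sizer interface {
	Size() (int64, error)
}

// file models a unixfs file: Size returns the stored FileSize value.
type file struct{ fileSize int64 }

func (f file) Size() (int64, error) { return f.fileSize, nil }

// dir models a basic unixfs directory: Size sums the sizes recorded
// on its links.
type dir struct{ linkSizes []int64 }

func (d dir) Size() (int64, error) {
	var total int64
	for _, s := range d.linkSizes {
		total += s
	}
	return total, nil
}

// listingTotal sums Size() over all entries. The boolean is false when the
// directory has more than maxEntries entries (an invented cutoff), in which
// case no size is computed at all.
func listingTotal(entries []sizer, maxEntries int) (int64, bool, error) {
	if len(entries) > maxEntries {
		return 0, false, nil
	}
	var total int64
	for _, e := range entries {
		s, err := e.Size()
		if err != nil {
			return 0, false, err
		}
		total += s
	}
	return total, true, nil
}

func main() {
	entries := []sizer{
		file{fileSize: 1000},
		file{fileSize: 24},
		dir{linkSizes: []int64{300, 200}},
	}
	fmt.Println(listingTotal(entries, 100)) // prints: 1524 true <nil>
	fmt.Println(listingTotal(entries, 2))   // prints: 0 false <nil>
}
```

In the real gateway each Size() call needs the entry's root node to be available, which is the cost the optimization avoids.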


Now for the topic of what size is what. There are several layers wrapping each other, each of them with its own concept of what “size” is: MerkleDAG (“ipfs object stat…” for merkledag-pb DAG nodes), unixfs (“ipfs files stat” for unixfs nodes) and blocks (“ipfs block stat”, which is what the BlockStat you found gives). See “Finding the DataType of a dag-pb block via the HTTP API” (post #6 by hector). “ipfs dag stat” operates on the same level as “ipfs object stat”, except that it works for arbitrary IPLD DAG nodes, not necessarily merkledag-pb.
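
As a rough illustration of how the three numbers can differ for the same file, here is a toy model. Everything in it is an assumption made for the example: the `block` type, the helper names, and the byte counts (14 bytes of framing per leaf, a 100-byte root) are invented and do not match real encodings. Real nodes also record a size on each link instead of recursing as this sketch does.

```go
package main

import "fmt"

// block is a hypothetical model of one stored block.
type block struct {
	raw      []byte   // serialized bytes as kept in the blockstore
	links    []*block // child blocks referenced by this one
	fileSize uint64   // unixfs FileSize recorded in the node
}

// leaf builds a block carrying payload bytes of file data plus some
// invented framing overhead.
func leaf(payload, framing int) *block {
	return &block{raw: make([]byte, payload+framing), fileSize: uint64(payload)}
}

// blockSize is the block layer: the bytes of this one block only.
func blockSize(b *block) int { return len(b.raw) }

// cumulativeSize is the DAG layer: this block plus everything it links to.
func cumulativeSize(b *block) int {
	total := len(b.raw)
	for _, l := range b.links {
		total += cumulativeSize(l)
	}
	return total
}

// unixfsSize is the unixfs layer: the file content length stored in the node.
func unixfsSize(b *block) uint64 { return b.fileSize }

func main() {
	a, c := leaf(256, 14), leaf(256, 14)
	root := &block{
		raw:      make([]byte, 100), // links and metadata only, no file data
		links:    []*block{a, c},
		fileSize: a.fileSize + c.fileSize,
	}
	fmt.Println(blockSize(root), cumulativeSize(root), unixfsSize(root))
	// prints: 100 640 512
}
```

So the same root can legitimately be "100", "640" or "512" bytes depending on which layer you ask.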

Hope that helps.
