Finding the DataType of a dag-pb block via the HTTP API

Hi everyone,

I have been using IPFS at the IPLD level for a while, but now I have to deal with files, i.e. UnixfsV1. I need to do some sanity checks on CIDs passed into my code, such as checking whether a CID refers to a file or a directory, and checking the file size before downloading a potentially huge stream of bytes.

Therefore I am looking for a way to obtain metadata for a given dag-pb CID, in particular the DataType and filesize fields of the ProtoBuf record, using the HTTP API. In principle, I can use the block API (/api/v0/block/get) to obtain the raw data and decode the ProtoBuf myself, but I would prefer to avoid going into such low-level details.
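
For reference, the low-level route could look roughly like the sketch below. It assumes the block bytes have already been fetched (for example via /api/v0/block/get) and that the block is a dag-pb node wrapping a UnixFS Data message, with PBNode.Data as field 1 and, inside it, Type as field 1 and filesize as field 3. The function names (read_varint, parse_fields, unixfs_info) are made up for illustration, and only the two protobuf wire types needed here are handled.

```python
# Minimal protobuf reader: just enough to pull Type and filesize out of a
# dag-pb block that wraps a UnixFS Data message. Not a general decoder.

UNIXFS_TYPES = {
    0: "Raw", 1: "Directory", 2: "File",
    3: "Metadata", 4: "Symlink", 5: "HAMTShard",
}


def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def parse_fields(buf):
    """Yield (field_number, wire_type, value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire = key >> 3, key & 7
        if wire == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire == 2:  # length-delimited (bytes or nested message)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire}")
        yield field, wire, value


def unixfs_info(block):
    """Return (type_name, filesize) for a dag-pb block; filesize may be None."""
    node_data = None
    for field, wire, value in parse_fields(block):
        if field == 1 and wire == 2:  # PBNode.Data
            node_data = value
    if node_data is None:
        raise ValueError("dag-pb node has no Data field")

    dtype = None
    filesize = None
    for field, wire, value in parse_fields(node_data):
        if field == 1 and wire == 0:  # UnixFS Data.Type
            dtype = value
        elif field == 3 and wire == 0:  # UnixFS Data.filesize
            filesize = value
    return UNIXFS_TYPES.get(dtype, "unknown"), filesize
```

This only covers dag-pb blocks. A CID with a different codec (raw leaves, for instance) carries no UnixFS record and would need separate handling.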

Is there a simpler way?

Seems like ipfs object stat would help you.

Thanks @hector, that looks useful indeed. The size isn’t quite the file size, but close enough to distinguish “reasonable” from “too big”.

There is Size and CumulativeSize. One should be “file size” and the other “unixfs dag size” (larger).

There is BlockSize, DataSize, and CumulativeSize. They are all close to but not identical to the file size. Example:

~/ $ ipfs ls bafybeibxm2nsadl3fnxv2sxcxmxaco2jl53wpeorjdzidjwf5aqdg7wa6u
QmZTR5bcpQD7cFgTorqxZDYaew1Wqgfbd2ud9QqGPAkK2V 1677 about
QmYCvbfNbCwFR45HiNP45rwJgvatpiW38D961L5qAhUM5Y 189  contact
QmY5heUM5qgRubMDD1og9fhCPA6QdkMp3QCwd4s7gJsyE7 311  help
QmejvEPop4D7YUadeGqYWmZxHhLc4JBUCzJJHWMzdcMe2y 4    ping
QmXgqKTbzdh83pQtKFb19SpMCpDDcKR2ujqk3pKph9aCNF 1681 quick-start
QmPZ9gcCEpqKTo6aq61g2nXGUhM4iCL3ewB6LDXZCtioEB 1091 readme
QmQ5vhrL7uv6tuoN9KeVBwd4PwfQkXdVVmDLUZuTNxqgvm 1162 security-notes

~/ $ ipfs object stat QmZTR5bcpQD7cFgTorqxZDYaew1Wqgfbd2ud9QqGPAkK2V
NumLinks:       0
BlockSize:      1688
LinksSize:      3
DataSize:       1685
CumulativeSize: 1688

~/ $ ipfs get QmZTR5bcpQD7cFgTorqxZDYaew1Wqgfbd2ud9QqGPAkK2V -o=about
Saving file(s) to about
 1.64 KiB / 1.64 KiB [==============================================] 100.00% 0s

~/ $ ls -l about
-rw-r--r--  1 hinsen  staff  1677 Oct 28 16:01 about

Sorry, I got it slightly wrong.

ipfs object operates on the merkledag-pb layer, which wraps the unixfs-pb.

1677 is the actual file size as reported by a field in the unixfs-protobuf (which ls uses).

1685 is the DataSize of the block, i.e. the full block size minus the links size (1688 - 3, the full merkledag-pb block being 1688 bytes). So this is the size of the unixfs protobuf inside the merkledag-pb block.
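
The three numbers can be reproduced with a little protobuf arithmetic. The sketch below is a reconstruction, not something taken from the IPFS code: it assumes the file is a single block with no links, and that the UnixFS record holds only Type, Data and filesize. The helper names are invented for illustration.

```python
# Rebuild the byte layout of a single-block UnixFS file node to see where
# 1677, 1685 and 1688 come from. Assumes only Type, Data and filesize are set.

def varint(n):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def varint_field(field_number, value):
    """Encode a varint field (wire type 0)."""
    return varint(field_number << 3) + varint(value)


def bytes_field(field_number, payload):
    """Encode a length-delimited field (wire type 2)."""
    return varint(field_number << 3 | 2) + varint(len(payload)) + payload


def unixfs_file(content):
    """UnixFS Data message: Type=File (2), Data=content, filesize=len(content)."""
    return (varint_field(1, 2)
            + bytes_field(2, content)
            + varint_field(3, len(content)))


def dagpb_node(data):
    """dag-pb PBNode with no links: just the Data field (field 1)."""
    return bytes_field(1, data)


content = bytes(1677)           # stand-in for the 1677-byte "about" file
unixfs = unixfs_file(content)   # UnixFS protobuf, reported as DataSize
block = dagpb_node(unixfs)      # full dag-pb block, reported as BlockSize

print(len(content), len(unixfs), len(block))  # prints "1677 1685 1688"
```

Under these assumptions the UnixFS record adds 8 bytes to the file content, and the dag-pb wrapper adds another 3 (one tag byte plus a two-byte length), which would explain a LinksSize of 3 on a node with no links.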

Instead, if we want answers from the unixfs layer we need to use ipfs files:

~/ $ ipfs files stat /ipfs/bafybeibxm2nsadl3fnxv2sxcxmxaco2jl53wpeorjdzidjwf5aqdg7wa6u
Size: 0
CumulativeSize: 6544
ChildBlocks: 7
Type: directory

~/ $ ipfs files stat /ipfs/bafybeibxm2nsadl3fnxv2sxcxmxaco2jl53wpeorjdzidjwf5aqdg7wa6u/about
Size: 1677
CumulativeSize: 1688
ChildBlocks: 0
Type: file

In the second output, Size is the actual file size as it would be on disk. CumulativeSize corresponds to the value reported by ipfs object stat, that is, the actual size the dag under that CID takes up on IPFS.
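
Since the original question was about the HTTP API, here is a minimal sketch of the same call over HTTP. It assumes a local node with the API on the default 127.0.0.1:5001, that the endpoint is /api/v0/files/stat and accepts POST with the path in the arg parameter, and that the JSON response carries Type, Size and CumulativeSize keys. The helper names and the size limit are hypothetical, so check them against your node before relying on this.

```python
import json
import urllib.parse
import urllib.request

# Assumed default address of the local node's HTTP API.
API_BASE = "http://127.0.0.1:5001/api/v0"


def files_stat_url(path, api_base=API_BASE):
    """Build the files/stat URL for an /ipfs/<cid>[/subpath] path."""
    return f"{api_base}/files/stat?" + urllib.parse.urlencode({"arg": path})


def summarize(stat):
    """Keep only the fields needed for a sanity check."""
    return {
        "type": stat["Type"],                       # "file" or "directory"
        "size": stat["Size"],                       # file size as on disk
        "cumulative_size": stat["CumulativeSize"],  # size of the whole dag
    }


def files_stat(path, api_base=API_BASE, timeout=10):
    """POST to files/stat and return the summarized response."""
    request = urllib.request.Request(files_stat_url(path, api_base), method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return summarize(json.load(response))


def is_acceptable(info, max_bytes):
    """True if the path is a regular file no larger than max_bytes."""
    return info["type"] == "file" and info["size"] <= max_bytes


if __name__ == "__main__":
    cid = "bafybeibxm2nsadl3fnxv2sxcxmxaco2jl53wpeorjdzidjwf5aqdg7wa6u"
    info = files_stat(f"/ipfs/{cid}/about")
    print(info, is_acceptable(info, max_bytes=10 * 1024 * 1024))
```

The timeout matters because the node may have to fetch the root block from the network before it can answer.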

Ahhhh… thanks, that’s really useful. And I hadn’t even looked at the files section of the API because in my mind it was for MFS only.