Should we profile CIDs?

Thank you.

The difference is the following:

  • IPFS: bafybeignp2eaklnbejnlcrxaldpiuoc63tk63vdsokleegajxpvczzxiau
    • Intermediary node 1 → 1024 links
    • Intermediary node 2 → 1 link (last block)
  • Singularity: bafybeia2jsxebrhwuehoptuhpmhmlxhot74nalyihzud2uufosptoakjyu
    • Intermediary node 1 → 1024 links
    • Last block

Both DAGs export to the same file (whether by chance or not).

Our “balanced” DAG builder documentation says:

// Package balanced provides methods to build balanced DAGs, which are generalistic
// DAGs in which all leaves (nodes representing chunks of data) are at the same
// distance from the root. Nodes can have only a maximum number of children; to be
// able to store more leaf data nodes balanced DAGs are extended by increasing its
// depth (and having more intermediary nodes).

In Singularity’s DAG, the last leaf node is not at the same distance from the root as the others.
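To make the difference concrete, here is a minimal Go sketch that models the two shapes as I read them from the listings above and measures how far each leaf is from the root. The `node` type and all function names are hypothetical and only for illustration; this is not the real UnixFS node type or anything from our codebase.

```go
package main

// node is a hypothetical, simplified stand-in for a DAG node. A leaf holds a
// chunk of data and has no children; an intermediary node only has children.
type node struct {
	children []*node
}

// leaves returns n fresh leaf nodes.
func leaves(n int) []*node {
	out := make([]*node, n)
	for i := range out {
		out[i] = &node{}
	}
	return out
}

// leafDepthRange returns the smallest and largest distance from root to any
// leaf. The DAG is balanced only when both values are equal.
func leafDepthRange(root *node) (lo, hi int) {
	if len(root.children) == 0 {
		return 0, 0 // a leaf is at distance 0 from itself
	}
	lo, hi = -1, -1
	for _, c := range root.children {
		cLo, cHi := leafDepthRange(c)
		if lo == -1 || cLo+1 < lo {
			lo = cLo + 1
		}
		if cHi+1 > hi {
			hi = cHi + 1
		}
	}
	return lo, hi
}

// ipfsShape mirrors the first listing:
// root -> node 1 (1024 leaves), node 2 (1 leaf).
func ipfsShape() *node {
	node1 := &node{children: leaves(1024)}
	node2 := &node{children: leaves(1)}
	return &node{children: []*node{node1, node2}}
}

// singularityShape mirrors the second listing, assuming the last block is
// linked straight from the root: root -> node 1 (1024 leaves), last leaf.
func singularityShape() *node {
	node1 := &node{children: leaves(1024)}
	last := &node{}
	return &node{children: []*node{node1, last}}
}
```

Under these assumptions, `leafDepthRange(ipfsShape())` returns (2, 2), while `leafDepthRange(singularityShape())` returns (1, 2), which is the imbalance described above.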

There’s a UnixFS spec in the ipfs/specs repo on GitHub (specs/UNIXFS.md at main):

The balanced layout creates a balanced tree of width ‘max width’. The tree is formed by taking up to ‘max width’ chunks from the chunk stream, and creating a unixfs file node that links to all of them. This is repeated until ‘max width’ unixfs file nodes are created, at which point a unixfs file node is created to hold all of those nodes, recursively. The root node of the resultant tree is returned as the handle to the newly imported file.

It could be worded much better, but I think it matches what our implementation does:

  • Add chunks to a node until max width is reached.
  • At that point, do the same with a new node.
  • Create a unixfs node that links to “those nodes” (meaning the nodes that link to the chunks, not the chunks themselves).
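The three steps above can be sketched in Go roughly as follows. This is a simplified illustration, not our actual builder: it assumes the number of chunks is known up front and groups them level by level, whereas the real implementation consumes a chunk stream. The `node` type and `buildBalanced` are hypothetical names. Only the resulting tree shape is meant to match.

```go
package main

// node is a hypothetical, simplified stand-in for a unixfs file node.
// A leaf (one chunk of data) has no links.
type node struct {
	links []*node
}

// buildBalanced puts numChunks leaves under parents holding at most maxWidth
// links each, then groups those parents the same way, until a single root
// remains. Every leaf ends up at the same distance from the root.
func buildBalanced(numChunks, maxWidth int) *node {
	if numChunks < 1 || maxWidth < 2 {
		return nil // nothing to build, or grouping would never converge
	}
	level := make([]*node, numChunks)
	for i := range level {
		level[i] = &node{}
	}
	for len(level) > 1 {
		var parents []*node
		for start := 0; start < len(level); start += maxWidth {
			end := start + maxWidth
			if end > len(level) {
				end = len(level)
			}
			// The last parent of a level may hold fewer than maxWidth links.
			parents = append(parents, &node{links: level[start:end]})
		}
		level = parents
	}
	return level[0]
}
```

With 1025 chunks and a max width of 1024, `buildBalanced(1025, 1024)` yields a root with two links: one node with 1024 links and one node with a single link to the last block. The last chunk still gets its own intermediary node, so it sits at the same depth as the others.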

Is it possible to adapt your implementation at this point?
