How are git-raw IPLD nodes created?

Among the examples given for exploration in the IPFS Web UI, there is the CID z8mWaJHXieAVxxLagBpdaNWFEBKVWmMiE which is of type “git-raw”. I assume that it was somehow generated from a git repository, but how? I found various explanations of how to store git repositories in IPFS, but they only store the git objects as plain files, so the CIDs will be of type dag-pb.

You can traverse a range of data structures using IPLD.

Here you can find the IPLD codec for git objects, which allows path traversals across the git graph:

Also, there are tons of codecs for Ethereum, Bitcoin, torrent, Zcash, etc. Here is a list of them.

Thanks, that solves a part of the mystery!

But I am still wondering how the example repository was transferred to IPFS. Using the standard DAG API, I’d have to convert each git object into the correct JSON syntax first, which is certainly possible, but equally certainly cumbersome.

I looked a bit at the raw blocks in that example repository. Their contents look similar to the output of git cat-file, but they aren’t quite the same (for example, there’s a zero byte after the length field). So I suspect that someone has written a script to convert the git objects from a repository to DAG nodes, and I’d really like to see that script.
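For anyone else puzzled by that zero byte: git stores every object as a short header, `<type> <size>` followed by a NUL byte, and then the content. `git cat-file -p` prints only the content, which would explain the difference. Here is a minimal Python sketch of that layout; the function names are made up for illustration and are not taken from any IPFS or git tooling.

```python
import hashlib


def git_object_bytes(obj_type: str, content: bytes) -> bytes:
    """Build the uncompressed git object: '<type> <size>' + NUL + content."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return header + content


def git_object_id(raw: bytes) -> str:
    """Git's object id is the SHA-1 of the uncompressed object, header included."""
    return hashlib.sha1(raw).hexdigest()


raw = git_object_bytes("blob", b"hello\n")
print(raw)                 # b'blob 6\x00hello\n'
print(git_object_id(raw))  # ce013625030ba8dba906f756967f9e9ca394464a
```

The id printed here is the same one `git hash-object` reports for a file containing `hello` and a newline.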

Yes, I am also wondering about this. When I do ipfs dag get z8mWaJHXieAVxxLagBpdaNWFEBKVWmMiE, I gather https://github.com/ipfs/go-ipld-git is being used to unmarshal the data into JSON?

[Aside: In general I am confused about the exact architecture with respect to codecs. I think I get what is going on conceptually, but I am curious how the implementations, not just the concept, compose. I saw https://github.com/ipld/specs/issues/244 on the docs, which I think would help me understand better.]

I guess the minimum question I have is whether there is a combination of --format and --input-enc args to ipfs dag put such that I can create data with a git-raw CID which I could convert into a git hash without knowing the data referenced.
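On the CID-to-hash part: as far as I understand it, a git-raw CID is a CIDv1 whose bytes are the version (0x01), the git-raw multicodec (0x78), the sha1 multihash code (0x11), the digest length (0x14), and then the 20-byte SHA-1, which is the git object id. So the conversion should not need the referenced data at all. Below is a rough Python sketch under that assumption; it only handles base58btc CIDs (the ones starting with `z`), and the helper names are hypothetical.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# Assumed prefix: CIDv1, git-raw codec, sha1 multihash, 20-byte digest.
GIT_RAW_PREFIX = bytes([0x01, 0x78, 0x11, 0x14])


def b58decode(text: str) -> bytes:
    """Decode a base58btc string (without the multibase 'z' prefix)."""
    num = 0
    for ch in text:
        num = num * 58 + B58_ALPHABET.index(ch)
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))  # leading '1's encode zero bytes
    return b"\x00" * pad + body


def b58encode(data: bytes) -> str:
    """Encode bytes as a base58btc string (without the multibase prefix)."""
    num = int.from_bytes(data, "big")
    out = ""
    while num:
        num, rem = divmod(num, 58)
        out = B58_ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def cid_from_git_hash(hex_hash: str) -> str:
    """Wrap a 40-character git object id in a git-raw CIDv1."""
    digest = bytes.fromhex(hex_hash)
    if len(digest) != 20:
        raise ValueError("expected a 20-byte SHA-1 digest")
    return "z" + b58encode(GIT_RAW_PREFIX + digest)


def git_hash_from_cid(cid: str) -> str:
    """Extract the git object id from a base58btc git-raw CID."""
    if not cid.startswith("z"):
        raise ValueError("only base58btc ('z') CIDs are handled here")
    raw = b58decode(cid[1:])
    if len(raw) != 24 or raw[:4] != GIT_RAW_PREFIX:
        raise ValueError("not a git-raw CIDv1 with a sha1 multihash")
    return raw[4:].hex()
```

Calling `git_hash_from_cid` on the example CID from the first post should then give the 40-character id that git itself uses for that object.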

You should be able to push the Git blob directly via the Block API instead of the DAG API. You also need to use the uncompressed Git objects. For more information about that, see https://github.com/ipld/js-ipld-git/tree/master/test/fixtures.
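A rough Python sketch of that workflow, with several assumptions: the object is a loose object (packed objects would need unpacking first), the `ipfs` binary is on the PATH, and your version of `ipfs block put` accepts `--format` and `--mhtype` (newer releases may spell the codec option differently, so check `ipfs block put --help`). The function names are hypothetical.

```python
import subprocess
import zlib
from pathlib import Path


def read_loose_object(git_dir: str, object_id: str) -> bytes:
    """Read a loose object from <git_dir>/objects and zlib-decompress it."""
    path = Path(git_dir) / "objects" / object_id[:2] / object_id[2:]
    return zlib.decompress(path.read_bytes())


def block_put_command() -> list:
    """Command line for storing one block as git-raw; flags are an assumption."""
    return ["ipfs", "block", "put", "--format=git-raw", "--mhtype=sha1"]


def put_git_object(git_dir: str, object_id: str) -> str:
    """Pipe the uncompressed object into 'ipfs block put' and return the CID."""
    raw = read_loose_object(git_dir, object_id)
    result = subprocess.run(
        block_put_command(), input=raw, capture_output=True, check=True
    )
    return result.stdout.decode().strip()
```

Something like `put_git_object(".git", "<40-character object id>")` would then store a single object; a whole repository would need a loop over every object reachable from the refs.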

Update: I forgot to mention that Bitswap (the main mechanism in IPFS to transfer blocks between peers) has a certain block size limit (around 1MiB IIRC). So don’t be surprised if you have larger blocks and they are not available as expected.

I’ve been exploring a variety of Git + IPFS approaches. These projects may be of interest.

@boris, I’m the author of the IGiS remote. It converts all the structures to CBOR and Protobuf so that the contents are accessible outside of git.

For example, QmS4QPcGCrN7rvn1JynNtBgCCudx5U1Y4fuCa5D14cEhgF. The branch that was pushed is the filesystem that you see at the root, and the git information is in the .git/ CBOR-DAG.

I’ve been messing around trying to create a system that supports pull requests, so there is some IOTA code mixed in with the IPFS pushing currently. I’m going to be separating that out and working on making fetching more performant this week.

The program for inserting raw git blocks is the IPLD remote. Raw git objects include a header with the type and size, so they can’t be used outside of git without some processing.
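The processing needed is small. As a sketch (plain Python, no IPFS involved, function name made up for illustration), splitting a raw object into its type and content could look like this:

```python
def split_git_object(raw: bytes):
    """Split an uncompressed git object into (type, content)."""
    header, sep, content = raw.partition(b"\x00")
    if not sep:
        raise ValueError("missing NUL byte after the header")
    obj_type, _, size_text = header.partition(b" ")
    # The declared size must match the number of content bytes.
    if int(size_text) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode("ascii"), content


print(split_git_object(b"blob 6\x00hello\n"))  # ('blob', b'hello\n')
```

For blobs the content is the file itself; trees and commits have their own internal formats that still need decoding after this step.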


Yeah, I’m very interested in your work. I’d like to potentially integrate/use it with what we’re doing at Fission. I’ll do a write-up in our forum about my ideas as well as experiment with it myself.

Thanks for your work so far!


Thanks! I’m belatedly reporting that everything you said worked perfectly. As to the size limit, I thought about it some more and left a comment about what could be done: Git on IPFS - Links and References. It would be nice to see that resolved statelessly, but in the meantime it’s not a blocker for me.
