Getting a CID by combining chunks of file

noname · July 24, 2022, 5:28am

Hello,
I want to use Crust for storing large files.
The recommended size for storing a file on Crust is 1GB.
I want to store large files and share the CID publicly.
Let’s suppose I have a 5GB file, which I split into 1GB chunks with the split utility and pin to Crust. Rather than making everyone download these chunks and combine them themselves, can I obtain a single new CID to share publicly, so that anyone can download the whole file with just that one CID?

All I want is to get a CID which combines these chunks together.

dvntaudio · July 24, 2022, 6:53am

Does Crust have 1GB as a hard limit or as a “best practice”? From what I have observed working on IPFS Cluster and private IPNS networks, there is a bottleneck with pins and uploads. This leads to guidance to use smaller files, to avoid 504 errors and other service timeouts. If Crust disallows objects over 1GB, I would do it as a private torrent and use a solid p2p client. I like Tixati (cross-platform), and also qBittorrent on a Windows PC. Or you could adapt the “custom remote pinning API” to your use case. Lastly, there is ZeroTier, which is excellent for rolling your own p2p solutions.

noname · July 24, 2022, 7:31am

1GB is a best practice, not a hard limit. You can store up to 32GB (the limit), and the practical size limit is 5GB. Though the network allows up to 32GB, when files get larger than 5GB, fewer (or no) nodes store the content. Files up to 1GB are ideal and are stored by many nodes.

Thanks for suggesting a private torrent, I will look into it.

fusetim · July 24, 2022, 10:05am

So, it seems that Crust does not support CAR import, so my solution can work but is quite time-consuming. I would only do it if I thought pinning this file was very important.

First, import your file into IPFS and generate its CID (this CID is the one you want).
Then, export this file into a CAR archive using IPFS.

This archive is currently too big to be sent as is. You need to split it, so use carbites (or some other utility) to split this big CAR file into multiple ones of a fixed size.
This step creates new CAR archives (each with its own CID), each containing part of your bigger file rearranged as a partial subtree.

Using the Crust tool integrated into IPFS, first import one CAR part into IPFS, then open its content by its CID in the file explorer, then upload it to Crust. Repeat the same steps for all the parts.

After that, you will normally have uploaded the entire file, using parts that are pinned independently from each other. But as the part archives contain exactly the same blocks as the entire archive, the original CID of your big file will just work.
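
To illustrate why the original CID keeps working, here is a minimal Python sketch. It is not how carbites actually splits a CAR: the `fake_cid` and `split_blocks` helpers are made up for this example, a SHA-256 hex digest stands in for a real CID, and the sketch ignores the parent nodes a real split would add to each part. It only shows that packing blocks into size-limited parts loses nothing, so the parts together still hold every block of the file.

```python
import hashlib


def fake_cid(data: bytes) -> str:
    """Stand-in for a real CID: just a SHA-256 hex digest (assumption)."""
    return hashlib.sha256(data).hexdigest()


def split_blocks(blocks: dict[str, bytes], max_part_bytes: int) -> list[dict[str, bytes]]:
    """Greedily pack blocks into parts whose payload stays within max_part_bytes."""
    parts: list[dict[str, bytes]] = [{}]
    used = 0
    for cid, data in blocks.items():
        # Start a new part when the current one is non-empty and would overflow.
        if parts[-1] and used + len(data) > max_part_bytes:
            parts.append({})
            used = 0
        parts[-1][cid] = data
        used += len(data)
    return parts


# Ten 256-byte blocks stand in for the leaves of a large file.
blocks = {fake_cid(bytes([i]) * 256): bytes([i]) * 256 for i in range(10)}
parts = split_blocks(blocks, max_part_bytes=1024)

# Merging all parts gives back exactly the original set of blocks.
merged = {cid: data for part in parts for cid, data in part.items()}
assert merged == blocks
```

Since blocks are addressed by their content, it does not matter which part a block was uploaded in. Once every part is pinned, a node resolving the original CID can find all the blocks it needs.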

Jorropo · July 24, 2022, 6:03pm

I have written code that does exactly that. I call it “the leaf hack”.

(If you want to use it, just write a Crust driver for linux2ipfs.)

veermetri05 · June 6, 2024, 11:27am

IPFS chunks files by default. If you inspect the CID of a large file, you can see that it links to other nodes, which in turn link to the raw data. What you could do is try pinning those links, which will be smaller than the large file, and then just store your CID's DAG data somewhere. Or, if it is feasible, pin all the chunks and then pin the whole file again (you will essentially pay more than 2x the cost of storing it in a single go).
I am not sure how to do this programmatically, but you can use any IPFS library to parse and traverse the DAG and then make a list of the CIDs that are links within the large file. After you have obtained the list of CIDs, you can place an order on the network for those tiny chunks. Once all chunks are pinned, try placing an order on the network for your original CID.
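
As a rough idea of what the traversal could look like, here is a minimal Python sketch. It does not talk to IPFS: `collect_cids` is a made-up helper, `toy_dag` is an invented DAG, and `get_links` is a placeholder for whatever call your IPFS library offers to list the links of a node.

```python
from typing import Callable, Iterable


def collect_cids(root: str, get_links: Callable[[str], Iterable[str]]) -> list[str]:
    """Return every CID reachable from root (root excluded), each listed once."""
    seen: set[str] = set()
    order: list[str] = []
    stack = [root]
    while stack:
        cid = stack.pop()
        for child in get_links(cid):
            if child not in seen:  # shared blocks are only listed once
                seen.add(child)
                order.append(child)
                stack.append(child)
    return order


# Hypothetical DAG: a root node links to two intermediate nodes,
# which link to the raw leaves. The names are placeholders, not real CIDs.
toy_dag = {
    "root": ["mid-a", "mid-b"],
    "mid-a": ["leaf-1", "leaf-2"],
    "mid-b": ["leaf-3"],
}
cids = collect_cids("root", lambda cid: toy_dag.get(cid, []))
```

With a running IPFS node, `ipfs refs -r <cid>` prints the same kind of list, so in practice `get_links` may not need to be written by hand.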

This topic is interesting, I will try to build some code that will help you to achieve this.
