I would definitely use Lucene (with Tika to extract text from the files), and perhaps make the project itself a crawler that walks IPFS data and loads it into a Lucene index. Solr, which is built on Lucene, already exists and can give you a GUI on top of the index.
There’s a sample crawler app for Lucene (part of the Lucene distro) that scans an entire folder structure of files and indexes them, and you could use that example as your starting point for crawling IPFS data. There may already be some Lucene-based tooling for IPFS; I haven’t checked.
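To make the crawl-and-index idea concrete, here is a rough sketch of the loop in Python. It is not Lucene: a plain in-memory inverted index stands in for the Lucene index, simple regex tokenizing stands in for Tika and Lucene's analyzers, and it assumes the IPFS content has already been fetched or mounted into a local folder. The names `build_index` and `search` are made up for illustration.

```python
import os
import re
from collections import defaultdict

# Lowercase alphanumeric runs count as terms (a crude stand-in for a real analyzer).
TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase terms."""
    return TOKEN.findall(text.lower())


def build_index(root):
    """Walk `root` recursively and map each term to the set of file paths containing it.

    Assumption: `root` is a local folder holding content already pulled from IPFS.
    """
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Treat every file as text; undecodable bytes are dropped.
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file: skip it and keep crawling
            for term in tokenize(text):
                index[term].add(path)
    return index


def search(index, query):
    """Return the paths that contain every term in `query` (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

You would call `build_index("/some/local/folder")` once and then `search(index, "some words")` as often as you like. A real version would swap the dictionary for a Lucene `IndexWriter` and let Tika handle PDFs, Office documents and so on.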
Random links I found (I’m interested in this topic too, btw):