The web platform I’ve developed, Quanta (https://quanta.wiki), is I think uniquely positioned to be the search engine, the browsing interface, and even the editing and upload interface for a new distributed Web 3.0 Wikipedia. It’s federated, supports IPFS, and is for the most part a “completed” project/app. It can do industrial-strength, high-performance full-text search using MongoDB (i.e. Lucene) as its data storage (in addition to IPFS).
For years I’ve planned to load the DB with Wikimedia data to show what it can do as a competitor to Wikipedia. I may now look into ZIM files. Are ZIM files the standard Wikimedia file format? Standing up a new Wikipedia on Quanta is basically just a matter of importing the data. However, Quanta expects content to be plain text or Markdown, so admittedly getting the formatting right for the articles would be a separate task.
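To give a feel for that formatting task, here is a minimal sketch of converting a few common wikitext constructs to Markdown. It is only an illustration: the function name `wikitext_to_markdown` is hypothetical, it is not part of Quanta, and it covers just headings, bold, italics, and links. Templates, tables, references, and combined bold-italic markup are not handled.

```python
import re


def wikitext_to_markdown(text: str) -> str:
    """Convert a small subset of wikitext to Markdown (illustrative only)."""

    # Headings: "== Title ==" becomes "## Title"; the number of '=' sets the level.
    def heading(match: re.Match) -> str:
        level = len(match.group(1))
        return "#" * level + " " + match.group(2).strip()

    text = re.sub(
        r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", heading, text, flags=re.MULTILINE
    )

    # Bold must be handled before italics, since ''' contains ''.
    text = re.sub(r"'''(.+?)'''", r"**\1**", text)
    text = re.sub(r"''(.+?)''", r"*\1*", text)

    # Internal links: [[Target|label]] keeps the label, [[Target]] keeps the target.
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]+)\]\]", r"\1", text)

    # External links: [https://url label] becomes [label](https://url).
    text = re.sub(r"\[(https?://\S+)\s+([^\]]+)\]", r"[\2](\1)", text)

    return text


sample = (
    "== History ==\n"
    "'''Physics''' is a ''natural science'' about [[Energy]] "
    "and [[Quantum mechanics|quantum theory]]."
)
print(wikitext_to_markdown(sample))
```

A real import would more likely need a proper wikitext parser, since regular expressions break down quickly on nested templates and tables.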
Here’s a link:
Quanta has been mentioned in a previous IPFS Newsletter, but for some reason it hasn’t yet been added to their “Ecosystems” page, despite being far larger and more feature-rich than probably any other project listed. Quanta is a ‘general purpose’ content platform, with a unique set of features and a powerful design unlike any other existing platform.
Thanks @lidel. Do you happen to know where I can download a subset of Wikimedia data (rather than the full 100GB download), for example only the field of “Physics” (or any small niche genre of documents), to use as content for a small technology demonstration? Preferably in JSON format.
As far as I could determine, ZIM files are a proprietary format for a specific reader app called Kiwix. I know the topic-specific XML/JSON files are out there, because I’ve downloaded the physics one in the past.
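If only an XML file turns up, turning it into JSON is not much work. The sketch below assumes the file follows the usual MediaWiki XML export layout (a `<mediawiki>` root containing `<page>` elements, each with a `<title>` and a `<revision>` holding `<text>`). The helper names `local_name` and `pages_from_export` are hypothetical, and the sample document is made up for the demonstration.

```python
import json
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the XML namespace, e.g. '{http://...}page' becomes 'page'."""
    return tag.rsplit("}", 1)[-1]


def pages_from_export(xml_text: str) -> list:
    """Collect title and wikitext for every <page> in an export document."""
    root = ET.fromstring(xml_text)
    pages = []
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title, text = None, ""
        for element in page.iter():
            name = local_name(element.tag)
            if name == "title":
                title = element.text
            elif name == "text":
                # If a page carries several revisions, the last one wins.
                text = element.text or ""
        pages.append({"title": title, "text": text})
    return pages


# Made-up sample in the assumed export layout.
sample_xml = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Physics</title>
    <revision><text>'''Physics''' is a natural science.</text></revision>
  </page>
  <page>
    <title>Energy</title>
    <revision><text>''Energy'' is a conserved quantity.</text></revision>
  </page>
</mediawiki>"""

print(json.dumps(pages_from_export(sample_xml), indent=2))
```

For a full-size dump, `ET.iterparse` would be a better fit than loading the whole document into memory at once.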