I’m trying to understand the feasibility of podcast distribution via IPFS.
The idea is to build a site that converts podcast/RSS feeds into IPFS/RSS feeds. Literally make a copy of the original feed with IPFS gateway URLs for the media content. Anyone who uses the IPFS feed will get their media via IPFS gateways.
All the content would remain on the original hosting provider. The IPFS feed would load media files “on-demand” from the original feed into IPFS then naturally disappear as demand goes away.
I don’t want to pin every media file for every podcast in the world (I don’t want to host the files), just get them loaded into the IPFS network temporarily until demand drops off.
If someone requests a podcast that’s 5 years old, it will also load “on-demand” (from the original feed) into IPFS for delivery. There’s not much performance increase in this case, but it uses the same mechanism as a new podcast/episode.
I’m thinking I would create hashes for all the URLs in the feed without actually downloading the original files and (somehow) stream the data only when requested.
Is on-demand file creation possible? i.e. create a hash, but stream the data only when requested (not pinned/stored in IPFS).
I’ve seen dynamic loading IPLD examples, but not sure if that’s what I need.
Can someone point me in the right direction?
Or let me know “it doesn’t work that way”?
…create hashes for all the urls in the feed without actually downloading the original files…
I don’t think you will be able to do this unless you have the cooperation of the source podcast(s), and then they would have to serve the files over IPFS. The IPFS hash is created from the content, and without downloading the entire content the IPFS application will not have the data it needs to create the hash. The only way around that is to have someone else generate the hash and then give it to you while also hosting the data on an IPFS node.
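To make the point concrete, here is a minimal sketch using plain SHA-256. Real IPFS CIDs are built from a multihash of the (chunked) content, so this is only an illustration of the principle, not how CIDs are actually computed:

```python
import hashlib

def content_digest(data: bytes) -> str:
    """Digest of the raw bytes. Real IPFS CIDs wrap a multihash of
    the chunked content, but the principle is the same: without the
    bytes, there is nothing to hash."""
    return hashlib.sha256(data).hexdigest()

# Two files behind the same kind of URL but with different bytes get
# different digests, so a URL alone can never predict the hash.
a = content_digest(b"episode-001 audio bytes")
b = content_digest(b"episode-001 audio bytes, re-encoded")
assert a != b

# Identical bytes always produce the identical digest.
assert content_digest(b"episode-001 audio bytes") == a
```

This is why rewriting a feed with IPFS links forces you (or someone cooperating with you) to download each file at least once before a hash can exist.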
Something that would be possible is a standard HTTP server that provides RSS feeds with rewritten links that redirect to the original media, tracks popular requests, has those files downloaded and added to IPFS, and then provides a link to a public IPFS gateway to serve each file. You won’t be able to get around downloading the original content somewhere, but you would get away with downloading far fewer files than everything you are providing links to.
The public server will need to have the IPFS hash, but it doesn’t have to host the audio files themselves, as long as it gets the hash and someone is hosting the content. Downloading and adding the podcast audio files can happen on a different computer. That computer just has to be able to download the files, run the IPFS daemon, and have a way to communicate with the public server to exchange the URLs to download and the IPFS hashes generated.
Thanks for the explanation. Can’t believe I didn’t make the connection that the hash is based on the data. Ugh!
I’m thinking the performance benefit only applies to the newest episode (when everybody wants it). So rewriting links to all the old episodes would actually be slower than getting them directly from the original host.
The new process seems easier if I just store the latest episode:
- User submits/updates their original feed.
- Download new/latest episode to IPFS node (create episode hash).
- Create/Update IPFS version of feed with new episode gateway URL. Leave all the old episodes as-is (original host).
- Update IPNS for feed.
- Podcast clients request IPNS feed & download the new episode via IPFS gateway.
When the original feed changes, only download the new/latest episode, and delete any old IPFS episodes from the node (because the old episodes in the feed will return to the original hosting URL).
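The feed-rewrite step above can be sketched with the standard library. The gateway URL and the assumption that the first `<item>` is the newest episode are both illustrative:

```python
import xml.etree.ElementTree as ET

GATEWAY = "https://ipfs.io/ipfs/"  # assumed public gateway

def rewrite_latest_enclosure(feed_xml: str, cid: str) -> str:
    """Return a copy of an RSS feed where only the first (newest)
    item's enclosure points at an IPFS gateway; every older episode
    keeps its original hosting URL."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    items = channel.findall("item")
    if items:
        enc = items[0].find("enclosure")
        if enc is not None:
            enc.set("url", GATEWAY + cid)
    return ET.tostring(root, encoding="unicode")
```

Running the old feed through this each time the original updates, then republishing under the same IPNS name, matches the process in the list above.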
One downside I can see is that the original feed will always be posted sooner, so people may not want to wait for the IPFS feed to update. Though performance should improve for the podcaster if they hide their feed and have clients use the IPFS feed.
Another downside is traffic on the IPFS gateway. If I can only put one URL in the feed, everyone will flood that gateway. So it’s no different from using the original host, and maybe worse?
Are gateways the bottleneck (until there’s an IPFS podcast app)? I quickly found this article, which makes it seem like using IPFS gateways for distribution won’t help anybody.
Thanks for the help,
Podcasts over IPFS is something lots of people are thinking about (Adam Curry and his new PodcastIndex are one example of a team considering it). It’s true that most IPFS gateways aren’t going to want to be used as “free” bandwidth providers, although I’m not sure what they’re doing in general to combat that. I do know some don’t allow video streaming, for example. They just block it.
The interesting use case would be pure browser-based IPFS instances running peer-to-peer and getting podcast data from each other (originally sourced from the normal non-IPFS url), and sharing the bandwidth and gaining performance similar to how BitTorrent works.
But here’s the thing: because most people will NOT be using any IPFS players, podcasters are forced to use a normal podcast hosting service (Libsyn, Blubrry, etc.), and once they do that, their bandwidth problems all vanish. So there’s no incentive. And on the consumer end, the podcast hosts are serving up data just fine as is today. So neither end of the equation currently has any incentive to jump to IPFS.
One actor may have interest in the switch, though: the podcast hosting service.
Exactly right Akita. If podcast services provided a “player” (app) that used p2p tech (like IPFS) on the client to pull data from other peers then that could cut down on their bandwidth expenses. That is, decrease the bandwidth load on the service provider network itself.
Agreed. Ideally, distribution would use p2p natively, but it seems like that is a long way off.
The idea originated after listening to Adam Curry talk about IPFS & PodcastIndex (+1 wclayf). I still have issues downloading from Adam’s feeds. I get disconnected often while downloading. Even podcastindex.org gives me problems (and I assume it doesn’t have much traffic). I’m hoping that distributing the load would help.
Also agreed that diverting bandwidth to the gateways isn’t a viable solution. But it feels like that is their purpose: to donate bandwidth and promote access to IPFS. If they’re going to block or throttle traffic, it seems counter to the goal. But I understand why they need to prevent abuse.
Newbie here. I’d love to dig into IPFS but I’m just not there yet with either the practice or the syntax.
So I’ve purchased literally every course on itulearning.com… I’m knocking out CompTIA A+ and Network+, but from that point, what would y’all say is the next course / series of courses I should aim for?
I’m also learning Python, but I’d like to learn it via Docker / Azure / Visual Basic / Kubernetes so that I can put all the pieces together in a more thorough way. Anyhow, I’d love to hear y’all’s advice / thoughts.
The PodcastIndex stuff is still a work in progress and they are actively working on all that, so it’s not stable right now, I guess, but it’s great that they are helping revive RSS, because BigTech wants to kill it off so they can maintain control over all the world’s feeds. It’s a big threat to BigTech if the world gets to where they’re no longer needed as the middleman between content providers and content consumers. Part of what IPFS itself is about is putting power back in the hands of the people through decentralization.
Just to follow up. This is what I’ve built.
A website that reads podcast feeds and loads episodes into IPFS. This is achieved by instructing nodes to download and pin content (and also when to unpin). Then the podcast client uses the “IPFS Enabled” feed to download episodes via public gateways.
I only have a few local nodes participating, so if anyone wants to test adding their node to the site, it would help. Basically, you run a Python script on your node to request “work” and report results to the website.
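For anyone curious what that script roughly looks like, here is a sketch. The endpoints and the work-item payload shape are hypothetical placeholders for illustration (the real API may differ); `ipfs add -Q` and `ipfs pin rm` are the actual CLI commands for pinning and unpinning:

```python
import json
import subprocess
import urllib.request

# Hypothetical endpoints -- placeholders, not the site's real API.
WORK_URL = "https://example.invalid/api/work"
REPORT_URL = "https://example.invalid/api/report"

def handle_work(item: dict) -> dict:
    """Run one work item on the local IPFS node. Assumed shape:
    {"action": "pin"|"unpin", "url": ..., "cid": ...}."""
    if item["action"] == "pin":
        # Download the episode, then add (and pin) it locally.
        path, _ = urllib.request.urlretrieve(item["url"])
        cid = subprocess.check_output(
            ["ipfs", "add", "-Q", path], text=True).strip()
        return {"url": item["url"], "cid": cid, "status": "pinned"}
    if item["action"] == "unpin":
        subprocess.run(["ipfs", "pin", "rm", item["cid"]], check=True)
        return {"cid": item["cid"], "status": "unpinned"}
    # Unknown actions are ignored so old clients stay compatible.
    return {"status": "skipped"}

def poll_once() -> dict:
    """Ask the site for one unit of work and report the result."""
    with urllib.request.urlopen(WORK_URL) as resp:
        item = json.loads(resp.read())
    result = handle_work(item)
    req = urllib.request.Request(
        REPORT_URL, data=json.dumps(result).encode(),
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)
    return result
```

Calling `poll_once()` in a loop with a sleep between iterations is all a participating node would need to do.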
You can also try adding your favorite RSS feeds.
It’s still experimental, but I wanted to get some public testing done before announcing to the podcast community.