(Edit 08/03/2023: incredibly enough, this mess still works. One has to change the Invidious instance from time to time, but still.)

Instead of giving you the answer directly, I will explain the incredible journey I undertook to avoid clicking on three separate buttons. For the caffeine enjoyers among you, the complete code is already available here, along with a shorter, clearer version below.

Setting the scene

A few months ago, I stumbled upon a nice-looking essay about Newsletters and RSS, by Robin Rendle. I encourage you to read it, along with their other rants; it is informative, carefully crafted and an unexpected source of old book illustrations. This discovery pushed me to hop on the RSS train, long after it left the station, only following some blogs for a while.

Mustering my courage after postponing it for days, I recently clicked two buttons and transferred all my YouTube subscriptions to my RSS reader. Everything worked fine, but a tiny thingy was missing: the videos’ duration. You see, clicking on a brand new video, realizing it is a 47-minute documentary instead of a 30-second meme edit, and going back to the feed to mark it as “unread” is a tedious process. “One that could be avoided by simply adding the duration in the title, how hard could it be?”, said past-me, ignoring all the development stories he’s ever read and his obvious need for sleep. So here we are.

Like any other cool side project, this one is a perfect opportunity to learn new things. In this case, push-and-deploy-to-our-AWS-instance services, such as Vercel, but worry not, as the underlying principles do not change.

Note: yes, Heroku exists. It seemed more complicated than necessary, and their price specification is confusing (dynos? why do you need to bring a T-rex into the mix?). Anyway.

1. Parsing YouTube’s RSS feed

Vercel helps create APIs by giving developers a neat boilerplate. For example, a file with the path api/channel/[id].js means an HTTP request to https://example.com/api/channel/abcd is handled by the [id].js file, and abcd is available as a variable during processing. To demonstrate that, the following sample of code gets the id and returns it as JSON:

export default async (req, res) => {
  return res.json({'id': req.query.id})
}

Note: to see it in action, install the Vercel CLI and then run vercel dev in your terminal. Finally, visit http://localhost:3000/api/channel/abcd.

Cool, now we need to get our RSS feed. YouTube has a not-so-straightforward-but-still-replicable way of getting RSS feeds for channels:

  1. Click on a video
  2. Click on the channel’s name
  3. Extract the channel’s ID from the URL and add it at the end of this one: https://www.youtube.com/feeds/videos.xml?channel_id=.
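
The three steps above can be sketched as a small function. This is only an illustration: channelFeedUrl is a hypothetical helper that is not part of the project, and it assumes the channel page URL has the /channel/<id> form (it will not work with other URL shapes such as custom handles).

```javascript
// Hypothetical helper: turns a channel page URL into its RSS feed URL.
// Assumes the URL looks like https://www.youtube.com/channel/<id>.
function channelFeedUrl(channelPageUrl) {
  const match = new URL(channelPageUrl).pathname.match(/^\/channel\/([\w-]+)/)
  if (!match) {
    throw new Error('No channel ID found in ' + channelPageUrl)
  }
  return 'https://www.youtube.com/feeds/videos.xml?channel_id=' + match[1]
}

console.log(channelFeedUrl('https://www.youtube.com/channel/UCMOgdURr7d8pOVlc-alkfRg'))
// https://www.youtube.com/feeds/videos.xml?channel_id=UCMOgdURr7d8pOVlc-alkfRg
```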

Although previous iterations of the code used various techniques to fetch and convert the feed to an exploitable JSON object, I finally settled on rss-parser. After creating a new parser, a single call is necessary to retrieve the feed. In this example, it is fetched and returned without any change:

let Parser = require('rss-parser')
let parser = new Parser()

export default async (req, res) => {
  let channel_id = req.query.id
  let feed = await parser.parseURL("https://www.youtube.com/feeds/videos.xml?channel_id=" + channel_id)
  return res.json(feed)
}

Visiting http://localhost:3000/api/channel/UCMOgdURr7d8pOVlc-alkfRg should show a happy little JSON object showcasing various information about the Kurzgesagt channel.

2. Where are the videos’ durations? Invidious to the rescue

As a cold sweat runs down your back, a glance at the data provided by the YouTube RSS feed confirms your greatest fear: no trace of the videos’ duration here. Keep calm; there are multiple ways of obtaining such information.

The first one is by kindly asking good ol’ YouTube itself. Its API, more specifically the Videos: list endpoint, gives access to the contentDetails.duration value for each video. Although it may look simple enough, I prayed to the Open Web gods for two days, and what was bestowed upon me were only OAuth 2 requirements and obscure error messages. So no YouTube API for today.

Fortunately, an open-source, privacy-friendly and API-enhanced alternative front-end to YouTube called Invidious exists. Its api/v1/channels/<id> endpoint provides a lot of information about a channel, among which latestVideos, the list of its latest videos (who would have thought), containing a sneaky yet priceless lengthSeconds value for each one of them.

OK, let’s recap. We already have the content of the official YouTube RSS feed. For now, we want to:

  1. Get the videos’ duration from the Invidious API
  2. Add the duration from the API to the relevant video in the YouTube feed.

First, the node-fetch package is used to call the API, while specifying latestVideos as the only field to be retrieved.

import fetch from 'node-fetch'
// [...]
let invidious_instance_url = "https://<your_instance_here>/"
let invidious_result = await fetch(invidious_instance_url + "api/v1/channels/" + channel_id + "?fields=latestVideos")

Note: there are multiple instances of Invidious; pick one from the official list. Not all of them allow access to their API, so you need to test the one you picked manually.

Then, the videos’ durations are stored in an object, using each video’s id as the key.

let durations = {}
if(invidious_result.status == 200) {
  let invidious_data = await invidious_result.json()
  for(let vid of invidious_data.latestVideos) {
      durations[vid.videoId] = vid.lengthSeconds
  }
}

Finally, a loop iterates over the YouTube feed videos and adds the corresponding duration to each title. Some string manipulation is needed to get the actual id of the video and to convert the durations from seconds to an elegant HH:MM:SS format (see this Stack Overflow question for details).

for(let vid of feed.items) {
  let id = vid.id.replace("yt:video:", "")
  if(id in durations) {
    let duration = new Date(durations[id] * 1000).toISOString().slice(11, 19)
    vid.title = "[" + duration + "] " + vid.title
  }
}

Titles now contain the duration, hooray!
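
One caveat with the Date trick: it reads the time-of-day part of a date, so anything lasting 24 hours or more wraps around (90000 seconds would show up as 01:00:00). If that ever matters, a plain arithmetic version avoids it. This is a minimal sketch, and formatDuration is a hypothetical helper that is not part of the deployed code.

```javascript
// Hypothetical helper: formats a number of seconds as HH:MM:SS
// without going through Date, so long durations do not wrap.
function formatDuration(totalSeconds) {
  const hours = Math.floor(totalSeconds / 3600)
  const minutes = Math.floor((totalSeconds % 3600) / 60)
  const seconds = totalSeconds % 60
  return [hours, minutes, seconds]
    .map((part) => String(part).padStart(2, '0'))
    .join(':')
}

console.log(formatDuration(2820))  // 00:47:00
console.log(formatDuration(90000)) // 25:00:00
```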

3. From RSS to JSON Feed

But not so fast! At the moment, our API does not return a valid RSS feed. To be fair, it will never return one. In fact, we will use JSON Feed instead, the format developers tell XML-based feeds not to worry about. As our code already works with a JavaScript/JSON object, returning it directly without fighting to convert it to XML is a true timesaver.

Most of the remaining modifications are required to be compliant with the 1.1 specification of JSON Feed so I won’t go over them all, except for a key detail. What would YouTube be without its aggressive red arrows, ridiculous reaction faces and other click-bait titles in gigantic letters? Probably a better place, but still, some thumbnails are needed.

Their URLs are hidden inside a field named media:thumbnail in the YouTube feed. The rss-parser package does not recognise it natively, and only top-level custom fields are supported, so the parent has to be specified when creating a new parser. Thus, we can access it with some property gymnastics inside the loop and add the thumbnail to each video in the feed.

let parser = new Parser({
  customFields: {
    item: ['media:group'],
  }
})
// ...
for(let vid of feed.items) {
  vid.image = vid['media:group']['media:thumbnail'][0]['$'].url
}

The only thing left now is to check the JSON Feed specification and set the remaining values accordingly while deleting the unwanted ones.
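
For reference, here is a minimal sketch of the shape a JSON Feed 1.1 document takes, as far as I understand the specification: a version URL, a title and an items array, where each item carries an id and some content. The buildJsonFeed helper, its input field names and the sample values are all hypothetical and only illustrate the structure; the code in this article modifies the rss-parser object in place instead.

```javascript
// Hypothetical helper illustrating the JSON Feed 1.1 structure.
function buildJsonFeed(title, channelId, items) {
  return {
    version: 'https://jsonfeed.org/version/1.1',
    title: title,
    home_page_url: 'https://www.youtube.com/channel/' + channelId,
    items: items.map((item) => ({
      id: item.id, // every item needs a unique id
      url: item.url,
      title: item.title,
      content_text: item.description,
      image: item.thumbnail,
      date_published: item.published,
    })),
  }
}

// Made-up sample values, for illustration only.
const sample = buildJsonFeed('Some channel', 'UCxxxxxxxx', [{
  id: 'yt:video:abc123',
  url: 'https://www.youtube.com/watch?v=abc123',
  title: '[00:47:00] Some documentary',
  description: 'A made-up description.',
  thumbnail: 'https://example.com/thumbnail.jpg',
  published: '2023-03-08T12:00:00Z',
}])
console.log(JSON.stringify(sample, null, 2))
```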

Example source code

If you want to copy and paste the following code, update the invidious_instance_url variable. You may also want to replace the value of vl_url with your Vercel project’s URL, if you actually push it somewhere.

Note: a much more complete, more-or-less-deployment-ready version is available here, with instructions.

import fetch from 'node-fetch'

let Parser = require('rss-parser')
let parser = new Parser({
  customFields: {
    item: ['media:group'],
  }
})

let invidious_instance_url = ""
let vl_url = "http://localhost:3000/"

export default async (req, res) => {
  let channel_id = req.query.id
  let feed = await parser.parseURL("https://www.youtube.com/feeds/videos.xml?channel_id=" + channel_id)

  let invidious_result = await fetch(invidious_instance_url + "api/v1/channels/" + channel_id + "?fields=latestVideos")

  let durations = {}
  if(invidious_result.status == 200) {
    let invidious_data = await invidious_result.json()
    for(let vid of invidious_data.latestVideos) {
        durations[vid.videoId] = vid.lengthSeconds
    }
  }

  for(let vid of feed.items) {
    let id = vid.id.replace("yt:video:", "")
    if(id in durations) {
      let duration = new Date(durations[id] * 1000).toISOString().slice(11, 19)
      vid.title = "[" + duration + "] " + vid.title
    }

    vid.image = vid['media:group']['media:thumbnail'][0]['$'].url
    vid.content_text = vid['media:group']['media:description'][0]
    delete vid['media:group']
    vid.url = vid.link
    delete vid.link
    vid.date_published = vid.isoDate.replace('Z', '')
    delete vid.isoDate
    delete vid.pubDate
    vid.authors = [{name: vid.author}]
  }

  feed.version = "https://jsonfeed.org/version/1.1"
  feed.feed_url = vl_url + 'api/channel/' + channel_id
  feed.home_page_url = "https://www.youtube.com/channel/" + channel_id
  delete feed.link

  return res.json(feed)
}

To go a bit further

Those of you who checked the source code probably noticed various differences between what I showed you in this tutorial and what is actually being deployed. Some adjustments are made:

  1. Environment variables are used to set the Invidious instance URL and the Vercel URL.
  2. It is possible to use only Invidious, as it provides an RSS feed similar to YouTube’s.
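
To give an idea of the first adjustment, here is a minimal sketch of reading both URLs from the environment. The variable names INVIDIOUS_INSTANCE_URL and VL_URL and the readConfig helper are assumptions made for this example; the deployed project may name and handle them differently.

```javascript
// Hypothetical helper: reads the two URLs from environment variables.
// The variable names are made up for this example.
function readConfig(env = process.env) {
  if (!env.INVIDIOUS_INSTANCE_URL) {
    throw new Error('INVIDIOUS_INSTANCE_URL is not set')
  }
  // The rest of the code concatenates paths, so make sure URLs end with "/".
  const withSlash = (url) => (url.endsWith('/') ? url : url + '/')
  return {
    invidiousInstanceUrl: withSlash(env.INVIDIOUS_INSTANCE_URL),
    // Falls back to the local development URL when VL_URL is not set.
    vlUrl: withSlash(env.VL_URL || 'http://localhost:3000'),
  }
}

console.log(readConfig({ INVIDIOUS_INSTANCE_URL: 'https://invidious.example' }))
```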

I hope this article interested you. If you have any questions, feel free to ask. May you find new exciting ways to customize your RSS feeds.