YouTube API
Calculate the total duration of a youtube playlist by getting all video ids it contains, and accumulating their durations
Refer to the link below to learn how to enable the API. Note that at the time of writing this notebook, the code examples in the official guide use the OAuth 2.0 method to authenticate, while this notebook simply uses an API key (option "a" in the guide). OAuth 2.0 is only required when accessing account related content.
- Official API intro guide: https://developers.google.com/youtube/v3/quickstart/python
In [1]:
import getpass
import googleapiclient.discovery
import numpy as np
import pandas as pd
from datetime import timedelta
In [2]:
api_key = getpass.getpass("Enter your API key: ")
youtube = googleapiclient.discovery.build("youtube", "v3",
developerKey=api_key)
Enter you API key: ··········
Performing the neccessary requests
Carefully specifying the "part" parameter of the requests is important to conserve quoatas, as it specifies which info we want in our response. Set it as narrow/concise as possible.
In [3]:
playlist_id = input("Enter a youtube playlist id: ")
Enter a youtube playlist id: PLGVZCDnMOq0rLLb519Ah3EntCUAAHPnfU
In [4]:
# limit used for pagination
page_size = 50
# get video identifiers from playlist identifier (page_size at a time)
videoIds = []
query = dict(
part="contentDetails", maxResults=page_size, playlistId=playlist_id)
while True:
request = youtube.playlistItems().list(**query)
result = request.execute()
videoIds += [i["contentDetails"]["videoId"] for i in result["items"]]
if "nextPageToken" in result:
query["pageToken"] = result["nextPageToken"]
else:
break
# get video metadata by video identifiers (page_size at a time)
durations = []
while videoIds:
idList = videoIds[:page_size]
videoIds = videoIds[page_size:]
request = youtube.videos().list(
part="contentDetails", id=",".join(idList))
result = request.execute()
durations += [r["contentDetails"]["duration"] for r in result["items"]]
Playlist duration calculation
In [5]:
timedeltas = pd.to_timedelta(pd.Series(durations).str.slice(start=2))
print(f"playlist: https://www.youtube.com/playlist?list={playlist_id}\n"
f"videos: {len(timedeltas)}\n"
f"mean_playtime: {timedeltas.mean()}\n"
f"total_playtime: {timedeltas.sum()}")
playlist: https://www.youtube.com/playlist?list=PLGVZCDnMOq0rLLb519Ah3EntCUAAHPnfU videos: 52 mean_playtime: 0 days 00:46:44.961538461 total_playtime: 1 days 16:30:58