Step-by-Step: Fetch YouTube Playlist Information with Python

If you’re looking to gather detailed statistics from your YouTube playlists, such as views, ratings, and video information, Python offers a robust solution. By leveraging libraries like pytube and pandas, you can automate the process and organize the data into a structured format. In this post, we'll walk through the steps to achieve this.


Steps

Here's a step-by-step explanation of what the code does:

  1. Install Required Libraries:

     !pip install pytube pandas
    

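    This line installs the necessary Python packages: pytube for interacting with YouTube and pandas for data manipulation and analysis. The leading ! is notebook syntax (Jupyter, Colab); in a regular terminal, run pip install pytube pandas without it.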

  2. Import Libraries:

     from pytube import Playlist, YouTube
     import pandas as pd
    

    These lines import the Playlist and YouTube classes from the pytube library and the pandas library under the alias pd.

  3. Define Playlist URLs:

     all_playlist_urls = {
         'DAX Tutorials': 'https://www.youtube.com/watch?v=So6vr3mTHsA&list=PL03Lfvph34mezSKrmQS3Q9ZcZKZgbF7E3&pp=iAQB',
     }
    

    This dictionary maps playlist titles to their respective YouTube playlist URLs.
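The URL above is a watch-page link that carries the playlist ID in its list query parameter. As a minimal sketch, the ID can be pulled out with the standard library to confirm that a URL really points to a playlist before it is used. The helper name playlist_id_from_url is hypothetical and not part of pytube.

```python
from urllib.parse import urlparse, parse_qs


def playlist_id_from_url(url):
    """Return the value of the `list` query parameter, or None if absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("list")
    return values[0] if values else None


url = "https://www.youtube.com/watch?v=So6vr3mTHsA&list=PL03Lfvph34mezSKrmQS3Q9ZcZKZgbF7E3&pp=iAQB"
print(playlist_id_from_url(url))
```

A URL without a list parameter returns None, which is a cheap way to catch a plain video link that was pasted into the dictionary by mistake.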

  4. Initialize Result List:

     result = []
    

    This line initializes an empty list called result to store information about each video.

  5. Iterate Over Playlists:

     for playlist_title, playlist_url in all_playlist_urls.items():
         playlist = Playlist(playlist_url)
    

    This loop iterates over each playlist in the all_playlist_urls dictionary. For each playlist, it creates a Playlist object using the URL.

  6. Iterate Over Videos in Playlist:

     for video_url in playlist.video_urls:
    

    This inner loop iterates over each video URL in the playlist.

  7. Fetch Video Details:

     try:
         video = YouTube(video_url)
         result.append(
             [
                 playlist_title,
                 video.title, 
                 video.views, 
                 video.rating,  # Note: pytube does not expose like/dislike counts directly
                 video.length, 
                 video.video_id,
                 video_url,
                 playlist_url
             ]
         )
     except Exception as e:
         print(f"Error processing video {video_url}: {e}")
         continue
    
    • Try Block: Attempts to create a YouTube object using the video URL.

    • Extract Video Data: If successful, it extracts various details about the video such as the title, views, rating, duration, video ID, video URL, and playlist URL.

    • Append to Result: Adds the extracted data as a list to the result list.

    • Except Block: If there's an error processing a video, it prints an error message and continues with the next video.
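The try/except above skips a video entirely as soon as any one field fails. A possible variation, sketched here on the assumption that partial rows are acceptable, reads each attribute separately and records None for the ones that fail. The names safe_attr and video_to_row are hypothetical helpers, and the stand-in object with made-up values replaces a real YouTube instance so the snippet runs offline.

```python
from types import SimpleNamespace

# Attribute names read from the video object, in column order.
FIELDS = ["title", "views", "rating", "length", "video_id"]


def safe_attr(obj, name, default=None):
    """Read obj.<name>, returning `default` if the attribute is missing or raises."""
    try:
        return getattr(obj, name)
    except Exception:
        return default


def video_to_row(playlist_title, video, video_url, playlist_url):
    """Build one result row in the same column order as the loop above."""
    details = [safe_attr(video, name) for name in FIELDS]
    return [playlist_title, *details, video_url, playlist_url]


# Stand-in object with made-up values; a real run would pass YouTube(video_url).
fake_video = SimpleNamespace(title="Example video", views=100, length=65, video_id="abc123")
row = video_to_row("Demo", fake_video, "https://example.com/v", "https://example.com/p")
print(row)
```

The stand-in has no rating attribute, so that position in the row comes back as None while the other fields are kept.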

  8. Create DataFrame:

     df = pd.DataFrame(
         result, 
         columns=['Playlist', 'Title', 'Views', 'Rating', 'Duration', 'VideoID', 'URL', 'PlaylistURL']
     )
    

    This line converts the result list into a pandas DataFrame with specified column names.
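Once the rows are in a DataFrame, ordinary pandas operations apply. The sketch below uses two made-up rows (not real statistics) with the same column names to show sorting by views and totalling views and duration per playlist.

```python
import pandas as pd

columns = ['Playlist', 'Title', 'Views', 'Rating', 'Duration', 'VideoID', 'URL', 'PlaylistURL']

# Made-up rows for illustration only; real rows come from the loop above.
sample_rows = [
    ['Demo', 'Video A', 150, None, 300, 'id_a', 'https://example.com/a', 'https://example.com/p'],
    ['Demo', 'Video B', 900, None, 120, 'id_b', 'https://example.com/b', 'https://example.com/p'],
]

df = pd.DataFrame(sample_rows, columns=columns)

# Most-viewed videos first.
top = df.sort_values('Views', ascending=False).reset_index(drop=True)

# Total views and total duration (in seconds) per playlist.
totals = df.groupby('Playlist')[['Views', 'Duration']].sum()

print(top[['Title', 'Views']])
print(totals)
```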

  9. Print DataFrame:

     print(df)
    

    This line prints the DataFrame, displaying the collected video data.
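pytube reports video.length in seconds, so the Duration column prints as raw integers. A small formatter can make the output easier to read; format_duration is a hypothetical helper name, not something pytube or pandas provides.

```python
def format_duration(seconds):
    """Convert a length in seconds to an H:MM:SS string."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


print(format_duration(65))    # 0:01:05
print(format_duration(3725))  # 1:02:05
```

It could be applied to the whole column with df['Duration'].apply(format_duration) before printing.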

  10. Complete Code Block

    !pip install pytube pandas
    
    from pytube import Playlist, YouTube
    import pandas as pd
    
    all_playlist_urls = {
        'DAX Tutorials': 'https://www.youtube.com/watch?v=So6vr3mTHsA&list=PL03Lfvph34mezSKrmQS3Q9ZcZKZgbF7E3&pp=iAQB',
    }
    
    result = []
    
    for playlist_title, playlist_url in all_playlist_urls.items():
        playlist = Playlist(playlist_url)
    
        for video_url in playlist.video_urls:
            try:
                video = YouTube(video_url)
                result.append(
                    [
                        playlist_title,
                        video.title, 
                        video.views, 
                        video.rating,  # Note: pytube does not expose like/dislike counts directly
                        video.length, 
                        video.video_id,
                        video_url,
                        playlist_url
                    ]
                )
            except Exception as e:
                print(f"Error processing video {video_url}: {e}")
                continue
    
    df = pd.DataFrame(
        result, 
        columns=['Playlist', 'Title', 'Views', 'Rating', 'Duration', 'VideoID', 'URL', 'PlaylistURL']
    )
    print(df)
    

Summary

The script fetches details for all videos in the specified YouTube playlists and stores the information in a pandas DataFrame. The pytube library retrieves the video data directly from YouTube, without the YouTube Data API or an API key. If fetching a video fails, the script prints the error and moves on to the next video. The final DataFrame contains columns for the playlist title, video title, views, rating, duration, video ID, video URL, and playlist URL.

Notes:

Fetching Likes and Dislikes: pytube does not directly provide like and dislike counts. If you need this data, consider using the official YouTube Data API v3 with proper authentication and access.
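As a rough sketch of that route, the YouTube Data API v3 videos endpoint can be queried with part=statistics to read like counts. The helper names below are hypothetical, the API key is something you would create in your own Google Cloud project, and the sample payload is hand-written so the parsing step can run offline. As far as we know, dislike counts are only returned to the video's owner, so the sketch reads likes only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_stats_url(video_ids, api_key):
    """Build a videos.list request URL asking only for the statistics part."""
    params = {"part": "statistics", "id": ",".join(video_ids), "key": api_key}
    return f"{API_URL}?{urlencode(params)}"


def parse_like_counts(payload):
    """Map video ID -> like count (None when the count is hidden or absent)."""
    counts = {}
    for item in payload.get("items", []):
        likes = item.get("statistics", {}).get("likeCount")
        counts[item["id"]] = int(likes) if likes is not None else None
    return counts


def fetch_like_counts(video_ids, api_key):
    """Call the API and return like counts. Needs network access and a valid key."""
    with urlopen(build_stats_url(video_ids, api_key)) as response:
        return parse_like_counts(json.load(response))


# Offline demonstration with a hand-written payload (made-up numbers).
sample_payload = {"items": [{"id": "abc123", "statistics": {"viewCount": "100", "likeCount": "7"}}]}
print(parse_like_counts(sample_payload))
```

The VideoID column from the DataFrame could supply the IDs, for example fetch_like_counts(df['VideoID'].tolist(), api_key), with the result merged back in as an extra column.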
