ARCHIVAL NOTICE

YouTube's front end has changed significantly and now requires JavaScript rendering of pages for web scraping. Because of this, and because the project was primarily an example of a somewhat advanced Scrapy spider, I am no longer maintaining this code base.

As such, I won't be accepting pull requests or continuing development.

If you have an alternative project filling this niche, I will happily link it here:

[YouTube API V3](https://developers.google.com/youtube/v3/docs?hl=en#Playlists)

Youtube History Scraper

This tool can be used to scrape YouTube watch history after the history API was changed to only allow fetching the last two weeks' worth of data. A project may exist that polls that API periodically to log history from a certain time onwards, but zvodd's scraper exports all history. The original was written in Python 2.7 and had no date parser; Python 3 support and a date parser were added in this fork. This project is purely experimental and has no error handling.

Privacy Notice: This tool only exports data locally and does not send your information elsewhere.

Dependencies

Use pip to install the dependencies below. Note that pywin32's pip support is experimental; see the pywin32 repository for details.

Required

Python 3
- scrapy
- lxml

Optional

Python 3
- sqlalchemy
- pywin32 (for Windows users only)

Usage

Prerequisites

The spider needs your YouTube session cookie to export your history. A template for the request headers is already provided in youtube_request_headers.txt; the only field you need to fill in is cookie. To obtain it:

1. Open a web browser and sign in to your YouTube account.
2. Open the developer tools by pressing Ctrl+Shift+I (the shortcut may vary by browser).
3. Open the Network tab and enable Preserve Log or a similar option.
4. Go to the YouTube history page.
5. In the Network tab, find the log entry for the history page. It should be a GET request.
6. In the Request Headers section, switch to the raw header view if your browser offers one, then copy the value of the Cookie field into youtube_request_headers.txt in the repository. The value is a very long string.
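Before running the spider, it can help to confirm that the copied cookie survived the copy and paste. The following is a minimal sketch, not the project's own loader: it assumes the headers file holds one `name: value` pair per line, and the helper names `parse_raw_headers` and `cookie_pairs` are hypothetical.

```python
def parse_raw_headers(text):
    """Parse 'name: value' lines into a dict keyed by lower-cased header name.

    Assumption: one header per line. Blank lines, lines without a colon,
    and HTTP/2 pseudo-headers such as ':authority' are skipped.
    """
    headers = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        name, _, value = line.partition(":")
        name = name.strip().lower()
        if not name:
            # Line started with ':' (pseudo-header); ignore it.
            continue
        headers[name] = value.strip()
    return headers


def cookie_pairs(cookie_header):
    """Split a Cookie header value into a {name: value} dict."""
    pairs = {}
    for chunk in cookie_header.split(";"):
        name, sep, value = chunk.strip().partition("=")
        if sep and name:
            pairs[name] = value
    return pairs


# Placeholder values only; a real cookie string is much longer.
sample = "accept-language: en-US\ncookie: SID=abc123; HSID=def=456\n"
headers = parse_raw_headers(sample)
cookies = cookie_pairs(headers.get("cookie", ""))
```

To check your own file, pass the contents of youtube_request_headers.txt to `parse_raw_headers` and verify that the `cookie` entry is present and non-empty.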

Running scrapy

To run the scraper and export the data as a CSV, open a terminal/shell of your choice and run the following:

	shell> scrapy crawl yth_spider -o history.csv -L ERROR

Note: This may take a while.

The -L ERROR argument sets the log level so that only errors are printed. Feel free to open an issue if you encounter any. Once the crawl finishes, import the CSV into a tool of your choice.

Output Format

The CSV is output in the following format:

Channel Name	Channel URL	Date	Description	Time (in seconds)	Title	Video URL
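As an illustration of working with the exported file, the sketch below totals watch time per channel using only the standard library. It assumes the CSV header row uses exactly the column names listed above; the real export may use different keys, so adjust the names if needed. The function name `watch_time_by_channel` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def watch_time_by_channel(csv_file):
    """Sum the 'Time (in seconds)' column per 'Channel Name'.

    Assumption: the header row matches the column names in this README.
    Rows with a missing or non-numeric duration count as 0 seconds.
    """
    totals = Counter()
    for row in csv.DictReader(csv_file):
        try:
            seconds = int(row.get("Time (in seconds)") or 0)
        except ValueError:
            seconds = 0
        totals[row.get("Channel Name") or ""] += seconds
    return totals


# Made-up sample data standing in for history.csv.
sample = io.StringIO(
    "Channel Name,Channel URL,Date,Description,Time (in seconds),Title,Video URL\n"
    "Channel A,https://example.invalid/a,2019-01-01,first,120,Video 1,https://example.invalid/1\n"
    "Channel A,https://example.invalid/a,2019-01-02,second,180,Video 2,https://example.invalid/2\n"
    "Channel B,https://example.invalid/b,2019-01-02,third,,Video 3,https://example.invalid/3\n"
)
totals = watch_time_by_channel(sample)
```

For a real export, open the file with `open("history.csv", newline="", encoding="utf-8")` and pass the file object to the function.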

Known Issues

Dates for videos watched within the last week are exported as weekday names (Sunday, Monday, etc.) rather than calendar dates. Suggestions for parsing these would be appreciated. For now, change them manually.
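One possible way to handle the weekday names is to resolve each one to the most recent matching date before the day the scrape was run. This is a sketch under assumptions, not part of the project: it assumes English weekday names, assumes a name never refers to the scrape day itself (so a name matching the scrape day's weekday means seven days earlier), and uses the hypothetical helper `resolve_weekday`.

```python
from datetime import date, timedelta

# Index matches date.weekday(): Monday is 0, Sunday is 6.
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_weekday(name, scraped_on):
    """Return the latest date before `scraped_on` that falls on weekday `name`.

    Returns None if `name` is not an English weekday name, so real dates
    in the same column can be left untouched by the caller.
    """
    key = name.strip().lower()
    if key not in WEEKDAYS:
        return None
    days_back = (scraped_on.weekday() - WEEKDAYS.index(key)) % 7
    if days_back == 0:
        # Same weekday as the scrape day: assume it means a week earlier.
        days_back = 7
    return scraped_on - timedelta(days=days_back)


# Example: a scrape run on Wednesday, 9 January 2019.
scraped_on = date(2019, 1, 9)
sunday = resolve_weekday("Sunday", scraped_on)
monday = resolve_weekday("Monday", scraped_on)
```

The result is only correct if `scraped_on` is the actual day the CSV was exported, so record that date alongside the file.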

Contribution

Any contribution is welcome, whether it be feedback, an issue, or a pull request.

License

I'd be happy to license this code under the MIT License, but the original repository doesn't have a license, so I have no right to do so. The original author has sole ownership of the code and reserves the right to have this fork taken down. A license will be added in the future if possible.
