Scrape a web page for pdf files and download them all locally.

scottgriv/python-pdf_web_scraper

Python PDF Web Scraper

A simple Python script that scrapes web pages for PDF files and downloads them to a local directory.


Getting Started

  1. Clone this repository.
  2. Install Python.
  3. Install Pip.
  4. Install the dependencies by running pip install beautifulsoup4 and pip install urllib3 in your terminal.
  5. Set the web page URL and the output folder path in the main.py file here:
# Define your URL
url = "https://yourWebsiteURL"

# If the folder does not exist, the script creates it automatically
folder_location = r'/YOUR/OUTPUT/FILE/PATH'
  6. Run the script: python main.py
  7. The PDF files are downloaded to the folder set in folder_location.
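
For orientation, the sketch below illustrates the general kind of logic such a script follows: fetch a page, collect the links that point at PDF files, create the output folder if needed, and save each file. It is not the repository's main.py. It deliberately uses only the Python standard library (html.parser and urllib) in place of beautifulsoup4 and urllib3, and the names PdfLinkParser, find_pdf_links, and download_pdfs are hypothetical, chosen for this illustration only.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class PdfLinkParser(HTMLParser):
    """Collects absolute URLs of <a href="..."> links that point at PDF files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        # Check the path only, so a query string does not hide the extension.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            self.pdf_urls.append(absolute)


def find_pdf_links(html, base_url):
    """Return the PDF URLs found in an HTML string, in document order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_urls


def download_pdfs(url, folder_location):
    """Download every PDF linked from `url` into `folder_location`."""
    # Create the output folder if it does not exist yet.
    os.makedirs(folder_location, exist_ok=True)
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    for pdf_url in find_pdf_links(html, url):
        filename = os.path.basename(urlparse(pdf_url).path)
        target = os.path.join(folder_location, filename)
        with urlopen(pdf_url) as pdf, open(target, "wb") as out:
            out.write(pdf.read())
```

Under these assumptions, calling download_pdfs(url, folder_location) with the two values from step 5 would perform the whole run. find_pdf_links can be tried on its own against any HTML string without network access.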

License

This project is released under the terms of The Unlicense, which allows you to use, modify, and distribute the code as you see fit.

  • The Unlicense removes traditional copyright restrictions, giving you the freedom to use the code in any way you choose.
  • For more details, see the LICENSE file in this repository.

Credits

Author: Scott Grivner
Email: scott.grivner@gmail.com
Website: scottgrivner.dev
Reference: Main Branch