
For this project, I chose to scrape content from the Student News Daily website. Using Python and a handful of robust modules, I automated the extraction of articles, pulling key data such as headlines, publication dates, and article summaries directly from the site.


jharishav99/Web-Scrapping


Here’s a brief overview of how each module contributed to the project's success:

1️⃣ Beautiful Soup: This module made parsing HTML and XML documents a breeze. It allowed me to navigate the HTML structure of each web page and extract the specific data elements I needed.

2️⃣ LXML Parser: Known for its speed and efficiency, LXML was instrumental in handling the parsing tasks, especially when dealing with large volumes of data. Its robustness ensured that my scraper could handle the intricacies of the website's markup.

3️⃣ Requests module: This module facilitated seamless HTTP requests, enabling my scraper to fetch web pages from Student News Daily without hassle. It managed the communication between my Python script and the web server flawlessly.
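To make the division of labor concrete, here is a minimal sketch of the parsing step. The HTML snippet and the CSS class names (`article`, `headline`, `date`, `summary`) are placeholders I made up for illustration; the real Student News Daily markup uses its own structure, so the selectors would need to be adapted.

```python
from bs4 import BeautifulSoup

# Placeholder markup standing in for a fetched page
# (a real run would get this HTML via requests.get(url).text).
sample_html = """
<div class="article">
  <h2 class="headline">Example Headline</h2>
  <span class="date">June 1, 2024</span>
  <p class="summary">A short summary of the article.</p>
</div>
"""

def parse_articles(html):
    """Extract headline, date, and summary from each article block."""
    try:
        soup = BeautifulSoup(html, "lxml")         # fast lxml backend
    except Exception:
        soup = BeautifulSoup(html, "html.parser")  # stdlib fallback
    articles = []
    for block in soup.select("div.article"):
        articles.append({
            "headline": block.select_one(".headline").get_text(strip=True),
            "date": block.select_one(".date").get_text(strip=True),
            "summary": block.select_one(".summary").get_text(strip=True),
        })
    return articles

if __name__ == "__main__":
    print(parse_articles(sample_html))
```

The `try`/`except` keeps the scraper usable even on machines where lxml is not installed, at the cost of slower parsing with the standard-library backend.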

By combining these modules, I created a streamlined web scraping tool that not only gathers information efficiently but also respects the website's protocols and ensures ethical data extraction practices.
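One way to respect a site's protocols is to honor its robots.txt rules and crawl delay. Below is a small sketch using Python's standard-library `urllib.robotparser`; the robots.txt content shown is invented for illustration (a real scraper would download the site's actual robots.txt first).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real scraper would fetch
# the site's own robots.txt before crawling any pages.
robots_txt = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

def fetch_allowed(url, user_agent="*"):
    """Check a URL against the parsed robots rules before requesting it."""
    return rp.can_fetch(user_agent, url)

# Seconds to wait between requests, per the Crawl-delay directive.
delay = rp.crawl_delay("*")

if __name__ == "__main__":
    print(fetch_allowed("https://www.studentnewsdaily.com/archive/"))
    print(fetch_allowed("https://www.studentnewsdaily.com/private/x"))
```

Calling `fetch_allowed` before every request, and sleeping `delay` seconds between requests, keeps the scraper within the rules the site publishes.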
