Job Offer Scraping tool + MongoDB integration

Hunting for jobs manually on websites can be tedious. Data scraping can help.

I wrote this code because I'm having a tough time landing a job myself. Fingers crossed, this will make things a bit easier down the road.

Note:

To make the script use a different URL, just edit config.env. Keep in mind that the code is heavily tailored to the pracuj.pl site. I suggest pasting a URL with the search criteria already applied (job title, location, etc.) so the scraper does not collect irrelevant data.
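
As a rough illustration of the config.env step, here is a minimal sketch of reading a search URL from such a file. It assumes the file holds plain KEY=VALUE lines and that the key is named URL; both are assumptions, and the real key name in this repo may differ. The hand-written parser and the helper names `load_env` and `get_search_url` are hypothetical, and main.py may well use a library for this instead.

```python
from pathlib import Path


def load_env(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, because URLs contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_search_url(path: str = "config.env", key: str = "URL") -> str:
    """Return the configured URL; the key name "URL" is an assumption."""
    values = load_env(path)
    if key not in values:
        raise KeyError(f"{key} not found in {path}")
    return values[key]
```

Calling `get_search_url()` from the repository root would then return whatever filtered search URL was pasted into config.env.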

Written in Python 3.10 using Selenium, pandas and spaCy.

This is a Python script that automates the task of collecting job offers from a specific website (in this case pracuj.pl). It not only grabs basic details like job titles and company names but also extracts job requirements. All collected information is stored in a MongoDB database.

Download the newest chromedriver before running the Chrome webdriver. Chromedriver is updated often, so the copy provided with this repo may be outdated.
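
A minimal sketch of starting Chrome with an explicit chromedriver path, assuming Selenium 4 is installed. The helper names `pick_driver_path` and `build_driver` are hypothetical and not taken from main.py. When no path is given, recent Selenium releases (4.6 and later) try to locate or download a matching driver on their own, which may make the bundled chromedriver.exe unnecessary.

```python
from pathlib import Path


def pick_driver_path(candidate: str = "chromedriver.exe"):
    """Return the local driver path if the file exists, otherwise None."""
    path = Path(candidate)
    return str(path) if path.is_file() else None


def build_driver(driver_path=None, headless: bool = True):
    """Create a Chrome webdriver; Selenium is imported lazily on first use."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    if headless:
        # Run Chrome without opening a visible window.
        options.add_argument("--headless=new")
    # Without an explicit path, Selenium resolves the driver itself.
    service = Service(executable_path=driver_path) if driver_path else Service()
    return webdriver.Chrome(service=service, options=options)
```

A typical call would be `driver = build_driver(pick_driver_path())`, followed by `driver.quit()` once scraping is finished.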

After running for a while, it sorts and prints all the requirements found, from the most common to the rarest.
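
The ranking step can be sketched with `collections.Counter` from the standard library. This is an illustration under assumptions, not the code from main.py: the function name `rank_requirements` is hypothetical, and counting each requirement at most once per offer (after lower-casing) is a design choice made here so that one repetitive offer does not inflate the totals.

```python
from collections import Counter


def rank_requirements(offers):
    """Return (requirement, offer_count) pairs, most common first.

    `offers` is an iterable of requirement lists, one list per job offer.
    """
    counts = Counter()
    for requirements in offers:
        # Normalise case and whitespace, and de-duplicate within one offer.
        unique = {r.strip().lower() for r in requirements if r.strip()}
        counts.update(unique)
    return counts.most_common()


def print_ranking(offers):
    """Print one requirement per line with its offer count."""
    for requirement, count in rank_requirements(offers):
        print(f"{count:>4}  {requirement}")
```

Passing the per-offer requirement lists to `print_ranking` prints them from the most frequent to the rarest; the order among requirements with equal counts is not guaranteed.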

It then stores the data in MongoDB.
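
A minimal sketch of the storage step using pymongo, assuming a MongoDB instance is reachable. The connection URI, the database and collection names, and the `url` field used as a unique key are all assumptions made for illustration; the actual schema in main.py may differ. Upserting on the offer URL is a choice made here so that re-running the scraper updates existing offers instead of duplicating them.

```python
def connect(uri="mongodb://localhost:27017", db_name="job_offers",
            collection_name="offers"):
    """Return a collection handle; all default names here are hypothetical."""
    from pymongo import MongoClient  # third-party: pip install pymongo

    client = MongoClient(uri)
    return client[db_name][collection_name]


def save_offers(collection, offers):
    """Upsert each offer keyed on its URL and return how many were written."""
    saved = 0
    for offer in offers:
        # $set overwrites the stored fields; upsert inserts if no match exists.
        collection.update_one({"url": offer["url"]}, {"$set": offer}, upsert=True)
        saved += 1
    return saved
```

With a list of dictionaries holding, for example, `url`, `title`, `company` and `requirements`, a run would end with `save_offers(connect(), offers)`.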

Libraries Used:

Selenium: Clicks around the web page for us.
BeautifulSoup: Reads the web page's code to pick out the details we want.
spaCy: Looks through the job description to find important keywords.
MongoDB: This is where we keep all the job data we collect.

Note: Make sure to update the URL and adjust the code if needed to match the target website structure or specific requirements.
