Web-Scraping-IMDb

This is an example of web scraping with Scrapy, a Python package.

1. Install the latest version of Python on your computer.

2. After installing Python, open a terminal and install Scrapy with the following command:

$ C:\Users\YOU> pip install Scrapy
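
To confirm that the install worked before moving on, a short check like the one below can help. It is a minimal sketch and not part of this repository; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the named top-level package can be found on the import path."""
    # find_spec returns None when no such top-level package exists.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed("scrapy"):
        print("Scrapy is installed.")
    else:
        print("Scrapy was not found; try running pip install Scrapy again.")
```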

3. Clone this repository anywhere on your computer.

4. In a terminal, cd into the project directory, then navigate to the directory that contains the settings and pipelines files:

$ C:\Users\YOU\Desktop\MovieSpider>
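
Scrapy locates a project by looking for a scrapy.cfg file in the current directory and then in its parent directories, so the crawl command should work from any directory inside the project. The sketch below mimics that lookup to help confirm you are in the right place. It is not Scrapy's own code, and the helper name `find_project_root` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Walk upward from `start` and return the first directory containing scrapy.cfg."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        if (directory / "scrapy.cfg").is_file():
            return directory
    # No scrapy.cfg in `start` or any of its parents.
    return None


if __name__ == "__main__":
    root = find_project_root(Path.cwd())
    print(root if root else "No scrapy.cfg found in or above the current directory.")
```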

5. Run the spider with the following command:

$ C:\Users\YOU\Desktop\MovieSpider> scrapy crawl greatspider
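
If you would rather start the crawl from a Python script than type the command by hand, one option is to call the same command through `subprocess`. This is a sketch under assumptions: the helper names `build_crawl_command` and `run_crawl` are hypothetical, and `scrapy` must be on your PATH.

```python
import subprocess
from typing import List, Optional


def build_crawl_command(spider: str, output: Optional[str] = None) -> List[str]:
    """Build the argument list for `scrapy crawl`, optionally adding -o <file>."""
    command = ["scrapy", "crawl", spider]
    if output is not None:
        command += ["-o", output]
    return command


def run_crawl(spider: str, output: Optional[str] = None,
              project_dir: str = ".") -> subprocess.CompletedProcess:
    """Run the crawl inside `project_dir`; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_crawl_command(spider, output),
                          cwd=project_dir, check=True)
```

For example, `run_crawl("greatspider")` runs the same command as above, provided the script is started from inside the project directory or `project_dir` points to it.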

6. The spider will crawl the website and display the results in the terminal.

7. Optionally, write the results to a file in a format such as JSON or XML:

$ C:\Users\YOU\Desktop\MovieSpider> scrapy crawl greatspider -o movies.json

$ C:\Users\YOU\Desktop\MovieSpider> scrapy crawl greatspider -o movies.xml
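
Once movies.json exists, you can read it back in Python. Scrapy's JSON export is a single JSON array with one object per scraped item. The sketch below assumes only that; the helper name `load_items` is hypothetical, and the field names inside each item depend on the spider.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_items(path: str) -> List[Dict[str, Any]]:
    """Read a Scrapy JSON export, which is one JSON array of item objects."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `for item in load_items("movies.json"): print(item)` prints every scraped item. Note that `-o` appends to an existing file, so running the JSON command twice with the same file name leaves a file that is no longer valid JSON. Delete the old file before crawling again.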

Congratulations!

You have successfully scraped your first website on the Internet!

fredricksimi/Web-Scraping-IMDb
