Skip to content

colmex/frontera_example

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Example frontera project

This is an example frontera project, combining scrapy's quotes example and frontera's cluster-setup example. (links: https://doc.scrapy.org/en/latest/intro/tutorial.html and http://frontera.readthedocs.io/en/v0.7.1/topics/cluster-setup.html

How to run it

First off open a bunch (7 should be enough) of terminal windows (tmux is awesome for this).

First the backend

Run the following cmd to get hbase, kafka, and zookeeper running. It will also create the HBase namespace.

docker-compose -f docker-compose-backend.yml up -d 
echo "waiting for hbase startup (30 seconds)"
sleep 30
echo "create_namespace 'crawler'" | docker exec -i docker_hbase /hbase/bin/hbase shell

Next the Storage workers

In a separate terminal window run python -m frontera.worker.db --config frontier.dbworker --no-incoming In another separate terminal window run python -m frontera.worker.db --no-batches --config frontier.dbworker

Next the strategy workers

In a separate terminal windows run python -m frontera.worker.strategy --config frontier.sworker --partition-id 0 In another separate terminal window run python -m frontera.worker.strategy --config frontier.sworker --partition-id 1

Next the crawlers

In separate terminal windows run:

  • scrapy crawl quotes -L INFO -s SPIDER_PARTITION_ID=0
  • scrapy crawl quotes -L INFO -s SPIDER_PARTITION_ID=1

Releases

No releases published

Packages

No packages published

Languages