How do I use this scrapper? #168
I'm still new to Python, but this is my understanding. Information is pulled using a schema; the supported ones are "recipe" and "webpage" (line 5, _schemaorg.py). AbstractScraper uses decorators (_decorators.py) to attempt to find the information based on the schema template. If there is no schema data, it falls back to the site-specific scraper found in the scrapers folder. This is why "allrecipes" only has the host defined: the website is designed with schema markup in mind. For your case, you would import the scraper, assign the recipe to a variable, and call the methods on that.
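To make the schema mechanism described above concrete, here is a minimal, self-contained sketch of pulling a schema.org Recipe out of a page's JSON-LD block. The function name and the regex-based parsing are illustrative assumptions, not the library's actual implementation, which is considerably more robust:

```python
import json
import re

def extract_recipe_schema(html):
    """Return the first schema.org Recipe object found in JSON-LD blocks,
    or None. Simplified illustration; not the library's real parser."""
    for match in re.finditer(
        r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
        html, re.DOTALL,
    ):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue
        # JSON-LD may hold a single object or a list of them.
        candidates = data if isinstance(data, list) else [data]
        for item in candidates:
            if isinstance(item, dict) and item.get("@type") == "Recipe":
                return item
    return None

page = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe",
 "name": "Pancakes", "recipeIngredient": ["2 eggs", "1 cup flour"]}
</script></head><body>...</body></html>"""

recipe = extract_recipe_schema(page)
print(recipe["name"])              # -> Pancakes
print(recipe["recipeIngredient"])  # -> ['2 eggs', '1 cup flour']
```

A site whose pages carry this markup needs almost no custom code, which is why a scraper defining only its host can still return full recipe data.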
For allrecipes, it's hidden in plain sight, but it works the same way as marmiton.py. To save time with the initial scraping, I'd suggest going to the archive in this early-days issue, #9. For the search based on ingredients, I suggest looking at tools such as Solr and Elasticsearch, and if you are using Django, you can further simplify your work with django-haystack. I'm closing the issue, but feel free to reopen it if anything arises again.
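Before reaching for Solr or Elasticsearch, the shape of ingredient-based lookup can be sketched with a small in-memory inverted index. The data and function below are hypothetical illustrations, not part of the library:

```python
from collections import defaultdict

# Sample scraped output; in practice this would come from the database.
recipes = [
    {"title": "Pancakes", "ingredients": ["egg", "flour", "milk"]},
    {"title": "Omelette", "ingredients": ["egg", "cheese"]},
    {"title": "Bread", "ingredients": ["flour", "water", "yeast"]},
]

# Inverted index: ingredient -> set of recipe indices containing it.
index = defaultdict(set)
for i, recipe in enumerate(recipes):
    for ingredient in recipe["ingredients"]:
        index[ingredient].add(i)

def find_recipes(ingredients):
    """Return titles of recipes containing every requested ingredient."""
    matching = set(range(len(recipes)))
    for ingredient in ingredients:
        matching &= index.get(ingredient, set())
    return sorted(recipes[i]["title"] for i in matching)

print(find_recipes(["egg"]))           # -> ['Omelette', 'Pancakes']
print(find_recipes(["egg", "flour"]))  # -> ['Pancakes']
```

Solr and Elasticsearch add what this sketch lacks at scale: tokenization, fuzzy matching, and relevance ranking, which matter once ingredient strings come from messy real-world recipes.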
Hi! I'm new to web scraping and Python. I want to create a simple project where people can look up recipes based on ingredients, and I think this scraper could help, but I'm not sure how to approach it.
I cloned the code locally and have been trying to go through it, and this is what I sort of understand:
I noticed some scrapers, like allrecipes, only have a host method implemented. I guess my question is: are these scrapers incomplete, or is there something I'm not seeing?
As for the ingredients, my idea is to run this scraper through all the available keys (websites), scrape most of the recipes, and store the JSON in a database. Then I would create queries looking for the recipes that match the ingredients.
Is this a good use case?
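The scrape-and-store half of the plan above can be sketched with the standard-library sqlite3 module. The table layout and the hard-coded sample record are assumptions for illustration; in a real pipeline the record would be built from the scraper's method calls:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for persistence
conn.execute("CREATE TABLE recipes (url TEXT PRIMARY KEY, data TEXT)")

def store_recipe(url, recipe_dict):
    """Persist one scraped recipe as a JSON blob keyed by its URL."""
    conn.execute(
        "INSERT OR REPLACE INTO recipes VALUES (?, ?)",
        (url, json.dumps(recipe_dict)),
    )

# Hypothetical sample data standing in for real scraper output.
store_recipe(
    "https://example.com/pancakes",
    {"title": "Pancakes", "ingredients": ["egg", "flour"]},
)

rows = conn.execute("SELECT data FROM recipes").fetchall()
stored = [json.loads(r[0]) for r in rows]
print(stored[0]["title"])  # -> Pancakes
```

Keying on the URL makes re-runs idempotent, so the crawl can be repeated to pick up new recipes without duplicating old ones.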