Request new functions and clarification of doubts #15

Open
Alex87a opened this issue Apr 21, 2024 · 1 comment

Comments

@Alex87a

Alex87a commented Apr 21, 2024

Good evening! I came across this repository while looking into Amazon scraping. I haven't had the chance to test what your program can do yet, so I apologize if my questions are obvious.

I read the description of what this project can do, and I have a doubt about the MongoDB implementation. From what I understand, it is simply a database in which the data scraped from Amazon is stored. The question at this point is, once the information has been extracted from Amazon, does the actual scraping take place in the database or does it continue to do so on the official website? Because I would like to understand if the IP address could be banned (even if you have implemented the user-agent).

Next, I wanted to ask you if you have ever considered the possibility of implementing Telegram API to build a Bot, through which scrap offers can be posted on a channel or in private. Maybe it's time-consuming and laborious to implement, but I just wanted to know if you've ever considered this as an idea.

Thank you in advance, and I wish you a good evening!

@sushil-rgb
Owner

Hey @Alex87a, sorry for the late response, and thank you for reaching out. To answer your questions:

The question at this point is, once the information has been extracted from Amazon, does the actual scraping take place in the database or does it continue to do so on the official website?

The script scrapes the live Amazon website first and then stores the resulting datasets in MongoDB. No scraping takes place in the database itself.
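As a rough illustration of that order of operations (not the repository's actual code), here is a minimal sketch. The function name `scrape_then_store`, the `scrape_page` callable, and the database and collection names in the comment are all hypothetical. The only assumption about the store is that it has an `insert_many` method, as a pymongo collection does.

```python
from typing import Any, Callable, Dict, Iterable, List


def scrape_then_store(
    scrape_page: Callable[[str], List[Dict[str, Any]]],
    urls: Iterable[str],
    collection: Any,
) -> int:
    """Scrape every URL live, then write the results to the store in one go.

    `scrape_page` is a hypothetical callable that requests one page from the
    live site and returns a list of product records (plain dicts).
    `collection` is anything with an `insert_many(documents)` method.
    """
    records: List[Dict[str, Any]] = []
    for url in urls:
        # This is the only step that touches the website.
        records.extend(scrape_page(url))

    # The database is written to only after scraping has finished.
    # pymongo rejects an empty list, so skip the call when nothing was found.
    if records:
        collection.insert_many(records)
    return len(records)


# With pymongo installed, the collection could be obtained like this
# (the database and collection names are made up for the example):
#   from pymongo import MongoClient
#   collection = MongoClient("mongodb://localhost:27017")["amazon"]["products"]
```

Because the store is passed in as a parameter, the same function can be tried out with a simple in-memory stand-in before pointing it at a real MongoDB instance.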

Because I would like to understand if the IP address could be banned (even if you have implemented the user-agent).

It's possible for your IP to be banned, so I have implemented a random time interval between each request to reduce the chance of this happening. So far, my IP has not been banned by Amazon, but I do occasionally receive a 503 error from their server.
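For anyone curious what a random interval between requests can look like, here is a minimal sketch under stated assumptions. It is not taken from the repository: `polite_get`, the `fetch` callable, and the default delay values are hypothetical, and the retry with exponential backoff on a 503 is an added assumption rather than something the scraper is confirmed to do.

```python
import random
import time
from typing import Callable, Tuple


def polite_get(
    fetch: Callable[[str], Tuple[int, str]],
    url: str,
    min_delay: float = 2.0,
    max_delay: float = 6.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Fetch `url` with a random pause before each request.

    `fetch` is a hypothetical callable returning (status_code, body); in a
    real scraper it would wrap the HTTP client. A 503 response is retried up
    to `max_retries` times with a growing wait (an assumption, see above).
    """
    for attempt in range(max_retries + 1):
        # Random pause so requests are not evenly spaced.
        sleep(random.uniform(min_delay, max_delay))
        status, body = fetch(url)
        if status == 200:
            return body
        if status != 503:
            raise RuntimeError(f"unexpected status {status} for {url}")
        # 503: back off for longer before the next attempt.
        if attempt < max_retries:
            sleep(min_delay * 2 ** attempt)
    raise RuntimeError(f"gave up on {url} after {max_retries + 1} attempts")
```

Passing `sleep` as a parameter keeps the function easy to test without actually waiting. None of this guarantees that an IP will not be banned; it only spaces the requests out.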

Next, I wanted to ask you if you have ever considered the possibility of implementing Telegram API to build a Bot, through which scrap offers can be posted on a channel or in private

I have been thinking about creating a bot that can make a call from the webhook and download the scraped data in a spreadsheet format. However, I am also considering using the Discord API, as I have already created a Discord bot that fetches product information from an Amazon product. Unfortunately, I haven't had the time to create a fully-fledged bot app yet.

I'd encourage you to try the scraper and run it yourself. It's not perfect, but it does the job accurately.


2 participants