Giacomo Stelluti Scala edited this page Jan 25, 2020 · 9 revisions

Introduction

PickAll is a .NET class library that aims to make it easy to scrape search engines and similar web sites. The guiding philosophy behind its design is to:

  • gather a limited number of results (essentially a URL and a description) from multiple sources
  • post-process them in a predefined chain of steps (e.g. to order them and remove duplicates)
  • optionally produce more data during search or post-processing (e.g. the full text of a web page)
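
The first two points can be pictured as a small pipeline. The following is a minimal, self-contained sketch of that idea only; it does not use PickAll's API, and every name in it (Result, PipelineSketch, Gather, PostProcess, the in-memory sources) is hypothetical and introduced purely for illustration. It assumes C# 9 or later for record types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical result type: essentially a URL and a description.
public sealed record Result(string Url, string Description);

public static class PipelineSketch
{
    // Gather a limited number of results from each of several sources.
    public static IEnumerable<Result> Gather(
        IEnumerable<Func<string, IEnumerable<Result>>> sources,
        string query,
        int limitPerSource) =>
        sources.SelectMany(source => source(query).Take(limitPerSource));

    // Run the results through a predefined chain of steps, in order.
    public static IEnumerable<Result> PostProcess(
        IEnumerable<Result> results,
        IEnumerable<Func<IEnumerable<Result>, IEnumerable<Result>>> steps) =>
        steps.Aggregate(results, (current, step) => step(current));

    // Example step: keep the first result seen for each URL.
    public static IEnumerable<Result> RemoveDuplicates(IEnumerable<Result> results) =>
        results.GroupBy(r => r.Url, StringComparer.OrdinalIgnoreCase)
               .Select(group => group.First());

    // Example step: sort results by URL.
    public static IEnumerable<Result> OrderByUrl(IEnumerable<Result> results) =>
        results.OrderBy(r => r.Url, StringComparer.OrdinalIgnoreCase);
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-ins for real search engines (assumption for the sketch).
        var sources = new Func<string, IEnumerable<Result>>[]
        {
            q => new[]
            {
                new Result("https://b.example/1", $"{q} from source A"),
                new Result("https://a.example/1", $"{q} from source A"),
            },
            q => new[]
            {
                new Result("https://a.example/1", $"{q} from source B"),
            },
        };

        var steps = new Func<IEnumerable<Result>, IEnumerable<Result>>[]
        {
            PipelineSketch.RemoveDuplicates,
            PipelineSketch.OrderByUrl,
        };

        var gathered = PipelineSketch.Gather(sources, "quantum physics", limitPerSource: 10);
        foreach (var result in PipelineSketch.PostProcess(gathered, steps))
        {
            Console.WriteLine($"{result.Url}: {result.Description}");
        }
    }
}
```

Modelling each source and each step as a plain function keeps the chain easy to reorder or extend. The library's real types and method names may differ from this sketch.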

All code snippets are in C#, except where otherwise stated.

Why use it

  • It's batteries included and easy to use.
  • It spares you the tedium of web scraping implementation details.
  • It's highly configurable and fully extensible.
  • The robust and performant AngleSharp library is part of the PickAll kernel.
  • It's fun! Pick data from the web and play with it in F# scripts.

What you can do

  • Develop your own specialized web scraping program or library.
  • Develop a new service for the community.
  • Show results correlated to user input in a web application.
  • Gather data and archive it in a database for data mining.
  • Gather data and process it via NLP or similar technologies.