WebsiteTechMiner-py ⛏

A little Python project to automate gathering website profiling data from "BuiltWith" & "Wappalyzer" for tech stack information, technographic data, website reports, website tech lookups, website architecture lookups, etc.

Uses of WebsiteTechMiner

  • 👁️ Data Privacy Activities
    • Vendor Discovery for Websites
    • Risk Management
    • Data Privacy Read-Ahead Material for Privacy Assessments
  • 🖥️ Cyber Security Activities
    • Reconnaissance
    • OSINT
  • 🗺️ Other Discovery Activities
    • Business Intelligence
    • Marketing Activities
    • Competition Analysis

Generated Data Fields:

[ domain , tech_profiler_tool_used , category , technology_name , description (if one exists) ]

All data is exported into the CSV file designated in the config file.
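As a rough illustration of that export step, the sketch below appends rows with the five fields listed above to a CSV file. It is not the project's actual code: the function name `append_rows`, the `FIELDS` constant, and the choice to leave a missing description as an empty cell are all assumptions made for the example.

```python
import csv
from pathlib import Path

# Column order taken from the "Generated Data Fields" list above.
FIELDS = [
    "domain",
    "tech_profiler_tool_used",
    "category",
    "technology_name",
    "description",
]


def append_rows(csv_path, rows):
    """Append result rows (dicts) to csv_path.

    Hypothetical helper: the header is written only when the file is new
    or empty, and a row without a description gets an empty cell.
    """
    path = Path(csv_path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS, restval="")
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
```

Calling `append_rows("results.csv", [{"domain": "example.com", "tech_profiler_tool_used": "Wappalyzer", "category": "CMS", "technology_name": "WordPress"}])` would add one row with an empty description column; the file name here is only a placeholder for whatever path the config file designates.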

Contributions

  • Contributions are welcome! 😁 Just fork my repo and make a pull request.

Getting Started

⬇ Installation

  • Use Git or download this repo
  • Git
    • Open cmd or your terminal of choice
    • cd to the folder you want to git clone to
    • git clone https://github.com/cybersader/WebsiteTechMiner-py.git
  • Download
    • Simply download this repo, as is.

Requirements

  • Python dependencies:
    • Make sure you've cloned or downloaded the project (see Installation above)
    • cd into the project folder
    • pip ships with Python, so if you don't have Python yet, install it first: https://www.python.org/downloads/
    • pip install -r requirements.txt

✉ TempMail for Accounts

DO NOT be fraudulent

  • I'm not going to design any solution that fraudulently auto-generates temporary accounts and emails.
  • If you need to process a very large number of URLs, then please purchase plans from these tech lookup services.

Setting up Wappalyzer

Setting up BuiltWith

💵 Buying API Credits

Usage

WebsiteTechMiner-py currently has 2 options:
  • -s, "single" (analyze a single domain)
  • -b, "bulk" (analyze a list of domains using a CSV file)
    • The domains in the CSV can be laid out in rows, columns, or a combination of the two (for example, when exported from Excel); the layout doesn't matter.
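A minimal sketch of how the two options above could be declared with Python's standard `argparse` module. This is an assumption about the shape of the command line, not the project's real parser: the `build_parser` function, the `single` and `bulk` attribute names, and the decision to make the two flags mutually exclusive are all hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for the -s and -b options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="WebsiteTechMiner.py")
    # Assumption: exactly one of the two modes must be chosen per run.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument(
        "-s", dest="single", metavar="DOMAIN",
        help="analyze a single domain",
    )
    mode.add_argument(
        "-b", dest="bulk", metavar="CSV_FILE",
        help="analyze a list of domains from a CSV file",
    )
    return parser
```

With this sketch, `build_parser().parse_args(["-s", "example.com"])` yields a namespace whose `single` attribute is `"example.com"` and whose `bulk` attribute is `None`.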

Single Website Lookup

command:

python WebsiteTechMiner.py -s example.com

Bulk Website Lookup

⚠🛑⚠🛑⚠🛑⚠🛑

  • Be careful running this:
    • If you don't have a paid plan, then you will quickly go over your limits.
    • This is not recommended unless you have a high limit for API credits with:
      • Wappalyzer, BuiltWith

command:

python WebsiteTechMiner.py -b example_website_list.csv
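Because the domains in the bulk file may sit in rows, columns, or both, one way to read them is to walk every cell and keep the non-empty ones. The sketch below shows that idea under stated assumptions; `read_domains` is a hypothetical helper, and the lowercasing, de-duplication, and `utf-8-sig` encoding (to tolerate the byte-order mark that Excel can add) are choices made for the example rather than documented behavior of the tool.

```python
import csv


def read_domains(csv_path):
    """Collect domains from every cell of a CSV file, in reading order.

    Hypothetical helper: blank cells are skipped and repeated domains
    are kept only once so the same lookup is not paid for twice.
    """
    domains = []
    seen = set()
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.reader(handle):
            for cell in row:
                domain = cell.strip().lower()
                if domain and domain not in seen:
                    seen.add(domain)
                    domains.append(domain)
    return domains
```

Counting the returned list before starting a run gives a quick estimate of how many lookups each service would be asked for.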

💎 Future Developments

🐛 Bugs & TODOs

  • Stop WTM if you run out of API credits for all tools
  • More detailed, more accurate error messages
  • Multiple API tokens in config file or some csv file
  • More fields from APIs to csv
  • Ability to use flags for fields
  • Unlimited domains on command line
  • http and https flags
  • Default command with domains after
  • Add throttling for when Wappalyzer and BuiltWith start dropping requests
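For the throttling item, one common approach is to retry a failed request after an exponentially growing pause. The sketch below is only one possible shape for that feature, not something the project implements today: `call_with_backoff`, its parameters, and the decision to retry on any exception are assumptions, and `request` stands for whatever zero-argument callable performs the lookup.

```python
import time


def call_with_backoff(request, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call request() and retry with exponential backoff on failure.

    Hypothetical helper: waits base_delay, then 2x, then 4x, and so on
    between attempts, and re-raises the last error once max_attempts
    calls have failed.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the waiting can be swapped out in tests; in normal use the default `time.sleep` applies.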

🌐 Discovery

  • Recursive Subdomain discovery option
  • Connected website discovery
  • Risk Management
    • Assumed PI discovery
      • OneTrust Vendorpedia API
      • Other Vendor Risk Management DBs & APIs
    • Security Risk Score Attribution
    • Other additional information to pull in from external sources
      • Policies
      • Available Data Processing Agreement links?
