    Repositories list

    • urlnormalizer

      Public archive
      Normalize URLs, works with Python 2 and 3
      Python
      MIT License
      2520 Updated May 5, 2021
    • aleph-helm-charts

      Public archive
      Helm charts for Aleph
      Smarty
      0200 Updated Jan 14, 2021
    • convert-document

      Public archive
      A docker container for LibreOffice and unoconv, used to generate PDF files from office-type documents.
      Python
      MIT License
      476710 Updated Jan 6, 2021
    • datacommons

      Public archive
      A fleet of Memorious scrapers for crawling various open data sources
      Python
      41570 Updated Sep 24, 2020
    • extract-entities

      Public archive
      Service for extracting named entities from text fragments
      Python
      MIT License
      0100 Updated Mar 4, 2019
    • recognize-text

      Public archive
      A Tesseract 4 gRPC service container for optical character recognition
      Python
      MIT License
      0500 Updated Mar 4, 2019
    • storagelayer

      Public archive
      Content-addressable storage for files across S3 and local file systems
      Python
      MIT License
      4700 Updated Dec 22, 2018
    • deduper

      Public archive
      A minimal Flask app to let folks deduplicate possible matches generated by the company enrichment process
      Python
      0100 Updated Nov 21, 2018
    • platform

      Public archive
      Docker base image for Aleph and Ingestors
      Makefile
      2300 Updated Jul 15, 2018
    • exactitude

      Public archive
      Parsing and normalising for identifying text data (emails, domains, phone numbers, dates). Combines external libraries into a coherent API.
      Python
      MIT License
      2900 Updated Apr 30, 2018
    • aleph-ui

      Public archive
      Front-end application for the Aleph data search engine, based on React/Redux and the Aleph API.
      JavaScript
      18100 Updated Dec 12, 2017
    • xref

      Public archive
      Scripts to cross-reference lists of things via the Aleph API
      Python
      1000 Updated Jun 20, 2017
    • Simple fuzzy entity data indexer
      Python
      0400 Updated Nov 14, 2016
    • krauler

      Public
      Replaced with alephdata/memorious
      Python
      MIT License
      4500 Updated Sep 23, 2016
    • entityman

      Public archive
      Natural language entity extractor
      Java
      0000 Updated Jul 14, 2016
    • Store a bunch of documents alongside their metadata for later consumption.
      Python
      MIT License
      2100 Updated Feb 16, 2016
    • osoba

      Public
      Python
      0000 Updated Jan 21, 2016
    • spindle

      Public
      Front-end application for the loom graph pipeline
      Python
      1700 Updated Jan 6, 2016
    • schema

      Public
      JSON Schema for OCCRP data
      HTML
      0500 Updated Dec 16, 2015
    • loom

      Public
      Weaving SQL databases into graph data.
      Python
      GNU Affero General Public License v3.0
      0900 Updated Dec 9, 2015
    • Meta-search work at the Moldova hackathon
      JavaScript
      Other
      0100 Updated Dec 5, 2015
    • ouija

      Public
      The Ouija Board is a data browser for SQL tables.
      Python
      0000 Updated Nov 18, 2015
    • Scrapers for Moldovan public data sets
      Python
      0000 Updated Oct 22, 2015
    • Public Person Profiles
      JavaScript
      0300 Updated Oct 20, 2015