Skip to content

Latest commit

 

History

History
42 lines (29 loc) · 1.65 KB

README.md

File metadata and controls

42 lines (29 loc) · 1.65 KB

Bridge

Bridge the communication gap in dataset lifecycle .

Bridge is a data lifecycle management tool that allows:

  • Users to ensure data quality
  • Orgs to set enforcable programmatic data contracts across teams (data, dev, infra etc.)
  • Seamless validation and monitoring across varios stages of data pipelines

Bridge is available as a command line tool to simply enforce contracts.

Python sdk

Bridge can be easily customized to build, enforce and any custom dataset. For this, we provide a python interface.

Introduction

Bridge consists of 3 main components:

  • Contract : A contract can be seen as a check that analyses an aspect of the given data. It can accept both mandatory and optional arguments.

  • Result : A result object is the result of enforcing a contract

  • Executor : This is the engine that parses contracts from data-contract language( json temporarily ) to python and enforces the checks

Detailed python docs coming soon.

Installation

Git clone the repo and run pip install bridgeai

CLI coming soon

Contract Language

Currently I'm using json to write and ship contracts. You can use the following structure

{
"contract1": {"param1": value, "param2": value},
"contract2": {"param1": value, "param2": value}
}

See a live example in example/ folder

Screenshot 2023-03-07 at 2 44 48 AM

This is a very early protype stage weekend project that implements a simple concept. Some(most) things might not work as expected. I'll work on it some more if I get some nice feedback.