SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Updated Sep 29, 2024 - Java
ingestr is a CLI tool to copy data between any databases with a single command seamlessly.
Concurrent and multi-stage data ingestion and data processing with Elixir
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Pravega - Streaming as a new software defined storage primitive
Orbital automates integration between data sources (APIs, databases, queues, and functions). BFFs, API composition, and ETL pipelines that adapt as your specs change.
Use SQL to build ELT pipelines on a data lakehouse.
A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰
The Data Engineering Book - a data engineering book by Thai people, for Thai people
Apache Paimon Rust: the Rust implementation of Apache Paimon.
Squirrel dataset hub
Sample code for the AWS Big Data Blog post "Building a scalable streaming data processor with Amazon Kinesis Data Streams on AWS Fargate"
OpenKit Java Reference Implementation
Enables custom tracing of Java applications in Dynatrace
Download and warehouse historical trading data
The Data Integration Library project provides a library of generic components based on a multi-stage architecture for data ingress and egress.
Enables custom tracing of Python applications in Dynatrace