Clickhouse compatibility: MergeTree engine #28

akvlad opened this issue Aug 13, 2024 · 1 comment


akvlad commented Aug 13, 2024

What

Desired functionality.

A client sends a request:

create_table: experiment
fields:
  a: UInt64
  b: String
  c: Float64
engine: Merge
order_by: 
  - a
timestamp:
  field: a
  precision: ms
partition_by:
  - b
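
One way to read the request above is as a table definition that gets validated before anything is written to disk. The following is only a minimal sketch under stated assumptions: `TableDef`, `parse_create_table`, the list of allowed precisions (only `ms` appears in this issue) and the list of allowed types are hypothetical names and values chosen for illustration, and the request is assumed to be already decoded into a dict.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: only "ms" appears in the issue; the other precisions are illustrative.
ALLOWED_PRECISIONS = {"s", "ms", "us", "ns"}
# Assumption: only the three column types used in the example request.
ALLOWED_TYPES = {"UInt64", "String", "Float64"}


@dataclass
class TableDef:
    """Hypothetical in-memory form of a create_table request."""
    name: str
    fields: dict
    engine: str
    order_by: list
    partition_by: list = field(default_factory=list)
    timestamp_field: Optional[str] = None
    timestamp_precision: Optional[str] = None


def parse_create_table(req: dict) -> TableDef:
    """Validate a decoded create_table request and return a TableDef."""
    fields = dict(req["fields"])
    for col, typ in fields.items():
        if typ not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type {typ!r} for column {col!r}")

    order_by = list(req.get("order_by") or [])
    partition_by = list(req.get("partition_by") or [])
    ts = req.get("timestamp") or {}

    # Every column named in order_by, partition_by and timestamp must be declared.
    referenced = order_by + partition_by + ([ts["field"]] if ts else [])
    for col in referenced:
        if col not in fields:
            raise ValueError(f"unknown column {col!r}")

    if ts and ts.get("precision") not in ALLOWED_PRECISIONS:
        raise ValueError(f"unsupported timestamp precision {ts.get('precision')!r}")

    return TableDef(
        name=req["create_table"],
        fields=fields,
        engine=req["engine"],
        order_by=order_by,
        partition_by=partition_by,
        timestamp_field=ts.get("field"),
        timestamp_precision=ts.get("precision"),
    )
```

Rejecting unknown columns at creation time keeps later inserts and merges from having to re-check the sort and partition keys.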

Then the client periodically sends further requests like this one:

POST /query?query=INSERT INTO experiment FORMAT JSONEachRow

{ "a": 1, "b": "asdad", "c": 1.2}
{ "a": 2, "b": "asdad", "c": 1.2}
{ "a": 3, "b": "asdad", "c": 1.2}
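
The insert request above could be handled roughly as follows. This is a sketch, not the project's implementation: `parse_insert_target` and `parse_json_each_row` are hypothetical helpers, the body is assumed to carry one JSON object per line, type coercion is deliberately strict, and columns missing from a row are assumed to become `None`.

```python
import json
import re

# Matches only the simplest form: INSERT INTO <table> FORMAT JSONEachRow
INSERT_RE = re.compile(
    r"^\s*INSERT\s+INTO\s+(\w+)\s+FORMAT\s+JSONEachRow\s*$", re.IGNORECASE
)
UINT64_MAX = 2**64 - 1


def _to_uint64(v):
    if isinstance(v, bool) or not isinstance(v, int) or not 0 <= v <= UINT64_MAX:
        raise ValueError(f"not a UInt64: {v!r}")
    return v


def _to_string(v):
    if not isinstance(v, str):
        raise ValueError(f"not a String: {v!r}")
    return v


def _to_float64(v):
    if isinstance(v, bool) or not isinstance(v, (int, float)):
        raise ValueError(f"not a Float64: {v!r}")
    return float(v)


COERCERS = {"UInt64": _to_uint64, "String": _to_string, "Float64": _to_float64}


def parse_insert_target(query: str) -> str:
    """Return the table name from a simple JSONEachRow insert query."""
    m = INSERT_RE.match(query)
    if m is None:
        raise ValueError(f"unsupported query: {query!r}")
    return m.group(1)


def parse_json_each_row(body: str, fields: dict) -> list:
    """Parse a JSONEachRow body into rows typed according to `fields`."""
    rows = []
    for line_no, line in enumerate(body.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between objects
        obj = json.loads(line)
        if not isinstance(obj, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        unknown = set(obj) - set(fields)
        if unknown:
            raise ValueError(f"line {line_no}: unknown columns {sorted(unknown)}")
        # Assumption: a column missing from the row is stored as None.
        rows.append(
            {col: COERCERS[typ](obj[col]) if col in obj else None
             for col, typ in fields.items()}
        )
    return rows
```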

Desired result.

A directory /tmp/experiment exists on the server's disk.

It contains many Parquet files.

Periodically these Parquet files are merged into bigger ones, sorted according to the order_by key.

The maximum size of a merged file is 4GB.
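
The merge behaviour described above might be planned along these lines. This is a rough sketch under assumptions not stated in the issue: "4GB" is read as 4 GiB, the size of a merged file is approximated by the sum of its inputs, each input file is already sorted by the order_by key, and `plan_merges` and `merge_sorted_runs` are hypothetical names. Rows are plain dicts here rather than real Parquet data.

```python
import heapq
from operator import itemgetter

MAX_MERGED_BYTES = 4 * 1024**3  # assumption: "4GB" read as 4 GiB


def plan_merges(file_sizes, max_bytes=MAX_MERGED_BYTES):
    """Greedily group files into batches whose total size stays within max_bytes.

    file_sizes: list of (name, size_in_bytes) pairs.
    Returns a list of batches (lists of names). Batches holding a single
    file are dropped, since there is nothing to merge.
    """
    batches, current, current_size = [], [], 0
    # Smallest files first, so many small files collapse into one bigger file.
    for name, size in sorted(file_sizes, key=itemgetter(1)):
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return [batch for batch in batches if len(batch) > 1]


def merge_sorted_runs(runs, order_by):
    """k-way merge of row lists, each already sorted by the order_by columns."""
    key = itemgetter(*order_by)
    return list(heapq.merge(*runs, key=key))
```

A small usage example with made-up file names and sizes:

```python
plan_merges([("1.parquet", 3), ("2.parquet", 2), ("3.parquet", 4)], max_bytes=5)
# groups 2.parquet and 1.parquet; 3.parquet stays unmerged

merge_sorted_runs([[{"a": 1}, {"a": 3}], [{"a": 2}]], order_by=["a"])
# rows come back ordered by "a"
```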

Why

We can migrate the qryn writer part to DuckDB-supported data storage with little to no changes.

To do that, we should support the simplest ClickHouse-style insert queries.

Let's start with simple table creation and a JSONEachRow insert function.

In the future we can optimize the engine for the time-series data we store.
