Epic: Realtime Ingestion Improvements #1642

Closed
himanshug opened this issue Aug 19, 2015 · 7 comments

@himanshug
Contributor

This issue tracks the related efforts aimed at improving realtime ingestion.
Here is a wishlist of items we should try to solve. I understand that Tranquility already solves some of these (probably not for Kafka-based ingestion), but we should make sure they are not broken as we make changes and that Kafka-based ingestion supports them as well.

  1. Exactly-once semantics: No duplicate events should be introduced and no events should be dropped (except when an input event is malformed/unparseable). This relates directly to both the "No window period" and "FirehoseV2" proposals.
  2. We should be able to scale realtime ingestion by adding more nodes.
  3. Realtime ingestion and querying of that data should continue to succeed in the event of some process/node failures. Today this works only for standalone realtime nodes. With ingestion done via "tasks", a node dying can lead to data loss (replication of ingestion tasks should potentially solve this).
  4. Query result consistency: A given query should return exactly the same result regardless of which realtime nodes/replicants served it.
  5. No-downtime upgrade: Realtime ingestion and queries should continue to make progress during an upgrade.
  6. Operational simplicity: For example, Kafka ingestion should automatically handle Kafka partition addition/removal.

(Realtime delta ingestion, i.e. the ability to ingest late events as they arrive, would probably come as a side effect of item 1.)
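
To make item 1 concrete, here is a minimal sketch of one common way to get exactly-once behavior: record the last committed offset per partition together with the published rows, and skip anything at or below that offset when messages are replayed after a failure. This is only an illustration, not the design from the linked proposals. `SegmentStore`, `parse`, `ingest`, and the `"timestamp,value"` message format are all hypothetical names and assumptions, and the in-memory `commit` stands in for what would need to be a single atomic transaction in a real system.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentStore:
    """Hypothetical stand-in for durable storage; not a Druid class."""
    rows: list = field(default_factory=list)
    # Highest offset already committed per partition (absent = nothing yet).
    committed: dict = field(default_factory=dict)

    def commit(self, partition, batch):
        """Publish rows and advance the partition offset in one step.

        `batch` is a list of (offset, row) pairs. Rows whose offset was
        already committed are skipped, so replays add no duplicates.
        Returns the number of rows actually added.
        """
        last = self.committed.get(partition, -1)
        fresh = [(offset, row) for offset, row in batch if offset > last]
        if not fresh:
            return 0
        self.rows.extend(row for _, row in fresh)
        self.committed[partition] = max(offset for offset, _ in fresh)
        return len(fresh)


def parse(raw):
    """Parse an assumed "timestamp,value" string; None if unparseable."""
    try:
        ts, value = raw.split(",")
        return {"timestamp": int(ts), "value": int(value)}
    except ValueError:
        return None


def ingest(store, partition, messages):
    """Ingest (offset, raw) pairs, dropping only unparseable events."""
    batch = []
    for offset, raw in messages:
        event = parse(raw)
        if event is not None:
            batch.append((offset, event))
    return store.commit(partition, batch)


store = SegmentStore()
first = ingest(store, 0, [(0, "1,10"), (1, "not-an-event"), (2, "2,20")])
# Simulated restart: offset 2 is delivered again along with a new message.
second = ingest(store, 0, [(2, "2,20"), (3, "3,30")])
```

In this sketch the replayed message at offset 2 is ignored, and only the malformed message at offset 1 is dropped. The key assumption is that rows and offsets are committed together; if they were stored separately, a crash between the two writes could reintroduce duplicates or loss.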

Related Refs:
https://groups.google.com/forum/#!msg/druid-development/kHgHTgqKFlQ/fXvtsNxWzlMJ (No window period proposal)
https://groups.google.com/forum/#!msg/druid-development/9HB9hCcqvuI/L59RgsloZfoJ (FirehoseV2 proposal)
https://docs.google.com/document/d/1PUG3crI2jiPa_u926R0KrkZVM7t706rXp1IuUxVXB5E/edit?usp=sharing (doc covering design details for both above)
https://groups.google.com/forum/#!searchin/druid-development/tier/druid-development/1I3CmxlOipM/e3-SpWqG170J (Task Tiering proposal)

Related PRs:
#1609 (kafka simple consumer based firehose and initial FirehoseV2 updates)
#1639 (new plumber)

Related Issues:
#401 (log management for long-running tasks)
#1513 (preemption for indexing service locks)
#1514 (aggregatorFactories in segment metadata)
#1515 (AllocateSegmentAction)
#1516 (ElasticShardSpec)
#1517 (user-friendly Hadoop-based re-indexing/compaction)

@drcrallen
Contributor

https://groups.google.com/forum/#!searchin/druid-development/tier/druid-development/1I3CmxlOipM/e3-SpWqG170J could fit here also?

That directly applies to 5, 2 (and maybe 6?) on the list.

@himanshug
Contributor Author

@drcrallen added

@himanshug
Contributor Author

I have created a document at https://docs.google.com/document/d/1PUG3crI2jiPa_u926R0KrkZVM7t706rXp1IuUxVXB5E/edit?usp=sharing to capture various design details of the Kafka/Tranquility ingestion work. It was created with input from @gianm and is still under active development. Feel free to discuss here.

@gianm
Contributor

gianm commented Sep 18, 2015

I updated the doc with some thoughts and preliminary code around push-based/tranquility ingestion.

@gianm
Contributor

gianm commented Oct 28, 2015

A couple of tangentially related things.

#1881 - Restorable indexing tasks (PR) - so middleManagers can be restarted similarly to realtime nodes
#1884 - Rack-aware availabilityGroup assignment (issue) - suggestion from @himanshug, to make rolling restarts batchable

@gianm
Contributor

gianm commented Jan 18, 2016

Updated the Google doc with the current state of the Kafka ingestion work.

@gianm
Contributor

gianm commented Mar 15, 2017

Work based on this proposal was released a couple of releases ago. Circling back and closing this.

http://druid.io/docs/latest/development/extensions-core/kafka-ingestion.html

@gianm gianm closed this as completed Mar 15, 2017
seoeun25 pushed a commit to seoeun25/incubator-druid that referenced this issue Jan 10, 2020