0.14.0-incubating release notes #7126
Comments
I think this is largely complete now, please let me know if there's anything I should correct or add.
Maybe you're aware of this, but the web-consoles doc page linked from the first entry doesn't seem to exist yet (I don't just mean the link is broken: I mean there's no web-consoles.md in master or 0.14-incubating branches). cc @vogievetsky Also I think it might be fun to have a screen shot of the web console in the release notes!
Ah, the new page would be "management-uis", it was changed during PR review but I haven't updated these notes to reflect that yet. A screen shot sounds like a good idea, thanks!
The notes talk about 'Maintenance mode for Historicals', which have been renamed recently (#7154).
In the processing buffer sizing section above, it says: processingBufferSize = max(processingBufferSize, 1GB) Shouldn't it rather be a min() operation? I imagine that the buffer size is supposed to be capped at 1 Gig.
@glasser quick note: there has also been an entire page of docs added just for the new console: https://github.com/apache/incubator-druid/blob/master/docs/content/operations/druid-console.md
What is the state of this release? I.e. when it will be downloadable? It's no longer marked as WIP and on the release page it's no longer tagged as a RC, so is it going to be marked as the current stable release soon?
@sascha-coenen Thanks, it should be min() there. @trtg 0.14.0 is released now, the vote passed yesterday but we needed to wait ~24 hours for the artifacts to propagate across mirrors |
Apache Druid 0.14.0-incubating contains over 200 new features, performance/stability/documentation improvements, and bug fixes from 54 contributors. Major new features and improvements include:
The full list of changes is here: https://github.com/apache/incubator-druid/pulls?q=is%3Apr+is%3Amerged+milestone%3A0.14.0
Documentation for this release is at: http://druid.io/docs/0.14.0-incubating/
Highlights
New web console
Druid has a new web console that provides functionality that was previously split between the coordinator and overlord consoles.
The new console allows the user to manage datasources, segments, tasks, data processes (Historicals and MiddleManagers), and coordinator dynamic configuration. The user can also run SQL and native Druid queries within the console.
For more details, please see http://druid.io/docs/0.14.0-incubating/operations/management-uis.html
Added by @vogievetsky in #6923.
Kinesis indexing service
Druid now supports ingestion from Kinesis streams, provided by the new druid-kinesis-indexing-service core extension.
Please see http://druid.io/docs/0.14.0-incubating/development/extensions-core/kinesis-ingestion.html for details.
Added by @jsun98 in #6431.
Decommissioning mode for Historicals
Historical processes can now be put into a "decommissioning" mode, where the coordinator will no longer consider the Historical process as a target for segment replication. The coordinator will also move segments off the decommissioning Historical.
This is controlled via Coordinator dynamic configuration. For more details, please see http://druid.io/docs/0.14.0-incubating/configuration/index.html#dynamic-configuration.
Added by @egor-ryashin in #6349.
Published segment cache on Broker
The Druid Broker now has the ability to maintain a cache of published segments via polling the Coordinator, which can significantly improve response time for metadata queries on the sys.segments system table.
Please see http://druid.io/docs/0.14.0-incubating/querying/sql.html#retrieving-metadata for details.
Added by @surekhasaharan in #6901
Bloom filter aggregator and expression
A new aggregator for constructing Bloom filters at query time and support for performing Bloom filter checks within Druid expressions have been added to the druid-bloom-filter extension.
Please see http://druid.io/docs/0.14.0-incubating/development/extensions-core/bloom-filter.html for details.
Added by @clintropolis in #6904 and #6397
Updated Parquet extension
druid-parquet-extensions has been moved into the core extension set from the contrib extensions and now supports flattening and int96 values.
Please see http://druid.io/docs/0.14.0-incubating/development/extensions-core/parquet.html for details.
Added by @clintropolis in #6360
Force push down option for nested GroupBy queries
Outer query execution for nested GroupBy queries can now be pushed down to Historical processes; previously, the outer queries would always be executed on the Broker.
Please see #5471 for details.
Added by @samarthjain in #5471.
Better segment handoff and retention rule handling
Segment handoff will now ignore segments that would be dropped by a datasource's retention rules, avoiding ingestion failures caused by issue #5868.
Period load rules will now include the future by default.
A new "Period Drop Before" rule has been added. Please see http://druid.io/docs/0.14.0-incubating/operations/rule-configuration.html#period-drop-before-rule for details.
Added by @QiuMM in #6676, #6414, and #6415.
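As a rough illustration of the rule changes above, the sketch below builds a small retention rule chain as JSON. The rule type names (loadByPeriod, dropBeforeByPeriod), the includeFuture field, and the _default_tier tier name are assumptions taken from the linked rule-configuration documentation rather than from these notes, and the helper functions are hypothetical.

```python
import json


def period_load_rule(period, tiered_replicants, include_future=True):
    """Build a period load rule (assumed type name: loadByPeriod).

    include_future defaults to True to mirror the new default behavior.
    """
    return {
        "type": "loadByPeriod",
        "period": period,
        "includeFuture": include_future,
        "tieredReplicants": dict(tiered_replicants),
    }


def period_drop_before_rule(period):
    """Build a "Period Drop Before" rule (assumed type name: dropBeforeByPeriod)."""
    return {"type": "dropBeforeByPeriod", "period": period}


# Keep the most recent month (and the future), drop everything older.
rules = [
    period_load_rule("P1M", {"_default_tier": 2}),
    period_drop_before_rule("P1M"),
]

rules_json = json.dumps(rules, indent=2)
```

The resulting JSON is only a payload sketch; how it is submitted to the Coordinator is left out here.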
Automatically kill MapReduce jobs when Hadoop ingestion tasks are killed
Druid will now automatically terminate MapReduce jobs created by Hadoop batch ingestion tasks when the ingestion task is killed.
Added by @ankit0811 in #6828.
DogStatsD tag support for statsd-emitter
The statsd-emitter extension now supports DogStatsD-style tags. Please see http://druid.io/docs/0.14.0-incubating/development/extensions-contrib/statsd.html
Added by @deiwin in #6605, with support for constant tags added by @glasser in #6791.
New API for retrieving all lookup specs
A new API for retrieving all lookup specs for all tiers has been added. Please see http://druid.io/docs/0.14.0-incubating/querying/lookups.html#get-all-lookups for details.
Added by @jihoonson in #7025.
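A minimal sketch of calling the new API from a client follows. The endpoint path /druid/coordinator/v1/lookups/config/all is an assumption based on the linked lookups documentation, and the function names and the example Coordinator address are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumed endpoint path; check the linked lookups documentation.
ALL_LOOKUPS_PATH = "/druid/coordinator/v1/lookups/config/all"


def all_lookups_url(coordinator_base_url):
    """Return the URL for fetching all lookup specs across all tiers."""
    return coordinator_base_url.rstrip("/") + ALL_LOOKUPS_PATH


def fetch_all_lookups(coordinator_base_url, timeout=10.0):
    """GET the lookup specs and decode the JSON response body."""
    with urlopen(all_lookups_url(coordinator_base_url), timeout=timeout) as resp:
        return json.load(resp)
```

For example, all_lookups_url("http://localhost:8081") yields the full request URL for a Coordinator assumed to be listening on port 8081.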
New compaction options
Auto-compaction now supports the maxRowsPerSegment option. Please see http://druid.io/docs/0.14.0-incubating/design/coordinator.html#compacting-segments for details.
The compaction task now supports a new segmentGranularity option, deprecating the older keepSegmentGranularity option for controlling the segment granularity of compacted segments. Please see the segmentGranularity table in http://druid.io/docs/0.14.0-incubating/ingestion/compaction.html for more information on these properties.
Added by @jihoonson in #6758 and #6780.
More efficient cachingCost segment balancing strategy
The cachingCost Coordinator segment balancing strategy will now only consider Historical processes for balancing decisions. Previously the strategy would unnecessarily consider active worker tasks as well, which are not targets for segment replication.
Added by @QiuMM in #6879.
New metrics:
jvm/heapAlloc/bytes, added by @egor-ryashin in #6710 (Added an allocation rate metric #6604).
query/count, added by @QiuMM in #6473 (QueryCountStatsMonitor: emit query/count).
sqlQuery/bytes and sqlQuery/time, added by @gaodayue in #6302 (Add SQL id, request logs, and metrics).
ingest/kafka/maxLag and ingest/kafka/avgLag, added by @QiuMM in #6587 (emit maxLag/avgLag in KafkaSupervisor).
task/success/count, task/failed/count, task/running/count, task/pending/count, and task/waiting/count, added by @QiuMM in #6657 (Add TaskCountStatsMonitor to monitor task count stats).
New interfaces for extension developers
RequestLogEvent
It is now possible to control the fields in RequestLogEvent, emitted by EmittingRequestLogger. Please see #6477 for details. Added by @leventov.
Custom TLS certificate checks
An extension point for custom TLS certificate checks has been added. Please see http://druid.io/docs/0.14.0-incubating/operations/tls-support.html#custom-tls-certificate-checks for details. Added by @jon-wei in #6432.
Kafka Indexing Service no longer experimental
The Kafka Indexing Service extension has been moved out of experimental status.
SQL Enhancements
Enhancements to dsql
The dsql command line client now supports CLI history, basic autocomplete, and specifying query timeouts in the query context.
Added in #6929 by @gianm.
Add SQL id, request logs, and metrics
SQL queries now have an ID, and native queries executed as part of a SQL query will have the associated SQL query ID in the native query's request logs. SQL queries will now be logged in the request logs.
Two new metrics, sqlQuery/time and sqlQuery/bytes, are now emitted for SQL queries.
Please see http://druid.io/docs/0.14.0-incubating/configuration/index.html#request-logging and http://druid.io/docs/0.14.0-incubating/querying/sql.html#sql-metrics for details.
Added by @gaodayue in #6302
More SQL aggregator support
The following aggregators are now supported in SQL:
Added by @jon-wei in #6951 and @clintropolis in #6502
Other SQL enhancements
Added by @gianm.
Updating from 0.13.0-incubating and earlier
Kafka ingestion downtime when upgrading
Due to the issue described in #6958, existing Kafka indexing tasks can be terminated unnecessarily during a rolling upgrade of the Overlord. The terminated tasks will be restarted by the Overlord and will function correctly after the initial restart.
Parquet extension changes
The druid-parquet-extensions extension has been moved from contrib to core. When deploying 0.14.0-incubating, please ensure that your extensions-contrib directory does not have any older versions of the Parquet extension.
Additionally, there are now two styles of Parquet parsers in the extension:
parquet-avro: Converts Parquet to Avro, and then parses the Avro representation. This was the existing parser prior to 0.14.0-incubating.
parquet: A new parser that parses the Parquet format directly. Only this new parser supports int96 values.
Prior to 0.14.0-incubating, specifying a parquet type parser would have a task use the Avro-converting parser. In 0.14.0-incubating, to continue using the Avro-converting parser, you will need to update your ingestion specs to use parquet-avro instead.
The inputFormat field in the inputSpec for tasks using Parquet input must also match the choice of parser:
parquet: org.apache.druid.data.input.parquet.DruidParquetInputFormat
parquet-avro: org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat
Please see http://druid.io/docs/0.14.0-incubating/development/extensions-core/parquet.html for details.
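The pairing rule above can be checked mechanically before submitting a task. The sketch below is a hypothetical pre-flight helper, not part of Druid. It assumes the Avro-converting parser pairs with DruidParquetAvroInputFormat, as in the linked extension documentation, and it takes the parser type and inputFormat as plain strings rather than walking a full ingestion spec.

```python
# Assumed parser-type to inputFormat pairing, per the extension docs.
PARQUET_INPUT_FORMATS = {
    "parquet": "org.apache.druid.data.input.parquet.DruidParquetInputFormat",
    "parquet-avro": "org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat",
}


def input_format_matches(parser_type, input_format):
    """Return True if inputFormat is the one expected for the parser type."""
    expected = PARQUET_INPUT_FORMATS.get(parser_type)
    if expected is None:
        raise ValueError(f"not a Parquet parser type: {parser_type!r}")
    return input_format == expected


def migrate_parser_type(pre_0_14_parser_type):
    """Map a pre-0.14.0 parser type to the name that keeps the old behavior.

    Before 0.14.0-incubating, "parquet" meant the Avro-converting parser,
    which is now selected with "parquet-avro".
    """
    if pre_0_14_parser_type == "parquet":
        return "parquet-avro"
    return pre_0_14_parser_type
```

A spec migrated with migrate_parser_type keeps the Avro-converting behavior; switching to the new direct parser instead means leaving the type as parquet and using the matching inputFormat.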
Running Druid with non-2.8.3 Hadoop
If you plan to use Druid 0.14.0-incubating with Hadoop versions other than 2.8.3, you may need to do the following:
Follow "Tip #3: Use specific versions of Hadoop libraries".
Rebuild Druid by changing hadoop.compile.version in the main Druid pom.xml and then following the standard build instructions.
Other Behavior changes
Old task cleanup
Old task entries in the metadata storage will now be cleaned up automatically together with their task logs. Please see http://druid.io/docs/0.14.0-incubating/configuration/index.html#task-logging and #6592 for details.
Automatic processing buffer sizing
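The druid.processing.buffer.sizeBytes property has new default behavior if it is not set. Druid will now automatically choose a value for the processing buffer size using the following formula:
processingBufferSize = totalDirectMemory / (numMergeBuffers + numProcessingThreads + 1)
processingBufferSize = min(processingBufferSize, 1GB)
Where:
totalDirectMemory: The direct memory limit for the JVM, specified by -XX:MaxDirectMemorySize.
numMergeBuffers: The value of druid.processing.numMergeBuffers.
numProcessingThreads: The value of druid.processing.numThreads.
At most, Druid will use 1GB for the automatically chosen processing buffer size. The processing buffer size can still be specified manually.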
Please see #6588 for details.
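To estimate what Druid would pick on a given host, the automatic sizing can be sketched as below. This is an approximation under stated assumptions, not Druid's implementation: it assumes the direct memory limit is divided by the number of merge buffers plus processing threads plus one, that the result is capped at 1GB, and that 1GB means 1024^3 bytes. The function name is hypothetical.

```python
ONE_GB = 1024 ** 3  # assumption: "1GB" is interpreted as 2**30 bytes


def auto_processing_buffer_size(total_direct_memory, num_merge_buffers,
                                num_processing_threads):
    """Estimate the automatically chosen processing buffer size in bytes.

    total_direct_memory:    JVM direct memory limit (-XX:MaxDirectMemorySize)
    num_merge_buffers:      druid.processing.numMergeBuffers
    num_processing_threads: druid.processing.numThreads
    """
    if total_direct_memory <= 0:
        raise ValueError("total_direct_memory must be positive")
    if num_merge_buffers < 0 or num_processing_threads < 0:
        raise ValueError("buffer and thread counts must be non-negative")
    share = total_direct_memory // (num_merge_buffers + num_processing_threads + 1)
    # The automatic value never exceeds the 1GB cap.
    return min(share, ONE_GB)
```

With 4GB of direct memory, 2 merge buffers, and 7 processing threads, each buffer gets one tenth of the limit; with 64GB and the same counts, the cap applies and the result is 1GB.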
Retention rules now include the future by default
Please be aware that new retention rules will now include the future by default. Please see #6414 for details.
Property changes
Segment announcing
The druid.announcer.type property used for choosing between Zookeeper or HTTP-based segment management/discovery has been moved to druid.serverview.type. If you were using http prior to 0.14.0-incubating, you will need to update your configs to use the new druid.serverview.type.
Please see the following for details:
fix missing property in JsonTypeInfo of SegmentWriteOutMediumFactory
The druid.peon.defaultSegmentWriteOutMediumFactory.@type property has been fixed. The property is now druid.peon.defaultSegmentWriteOutMediumFactory.type without the "@".
Please see #6656 for details.
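The two property renames in this section could be applied to an existing runtime.properties file with a small script. The sketch below is a hypothetical helper that assumes a plain key rename is sufficient and that values carry over unchanged; review the result against the linked issues before deploying.

```python
# Old key -> new key, as described in the "Property changes" section.
RENAMED_PROPERTIES = {
    "druid.announcer.type": "druid.serverview.type",
    "druid.peon.defaultSegmentWriteOutMediumFactory.@type":
        "druid.peon.defaultSegmentWriteOutMediumFactory.type",
}


def migrate_properties(text):
    """Rename changed keys in runtime.properties-style text.

    Comments, blank lines, and unrelated properties pass through unchanged.
    """
    migrated = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            new_key = RENAMED_PROPERTIES.get(key.strip())
            if new_key is not None:
                line = f"{new_key}={value}"
        migrated.append(line)
    return "\n".join(migrated)
```

For example, a line reading druid.announcer.type=http comes back as druid.serverview.type=http.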
Deprecations
Approximate Histogram aggregator
The ApproximateHistogram aggregator has been deprecated; it is a distribution-dependent algorithm without formal error bounds and has significant accuracy issues.
The DataSketches quantiles aggregator should be used instead for quantile and histogram use cases.
Please see Histogram and Quantiles Aggregators for details.
Cardinality/HyperUnique aggregator
The Cardinality and HyperUnique aggregators have been deprecated in favor of the DataSketches HLL aggregator and Theta Sketch aggregator. These aggregators have better accuracy and performance characteristics.
Please see Count Distinct Aggregators for details.
Query Chunk Period
The chunkPeriod query context configuration is now deprecated, along with the associated query/intervalChunk/time metric. Please see #6591 for details.
keepSegmentGranularity for Compaction
The keepSegmentGranularity option for compaction tasks has been deprecated. Please see #6758 and the segmentGranularity table in http://druid.io/docs/0.14.0-incubating/ingestion/compaction.html for more information on these properties.
Interface changes for extension developers
SegmentId class
Druid now uses a SegmentId class instead of plain Strings to represent segment IDs. Please see #6370 for details.
Added by @leventov.
druid-api, druid-common, java-util moved to druid-core
The druid-api, druid-common, and java-util modules have been moved into druid-core. Please update your dependencies accordingly if your project depended on these libraries.
Please see #6443 for details.
Credits
Thanks to everyone who contributed to this release!
@a2l007
@AlexanderSaydakov
@anantmf
@ankit0811
@asdf2014
@awelsh93
@benhopp
@Caroline1000
@clintropolis
@dclim
@deiwin
@DiegoEliasCosta
@drcrallen
@dyf6372
@Dylan1312
@egor-ryashin
@elloooooo
@evans
@FaxianZhao
@gaodayue
@gianm
@glasser
@Guadrado
@hate13
@hoesler
@hpandeycodeit
@janeklb
@jihoonson
@jon-wei
@jorbay-au
@jsun98
@justinborromeo
@kamaci
@leventov
@lxqfy
@mirkojotic
@navkumar
@niketh
@patelh
@pzhdfy
@QiuMM
@rcgarcia74
@richardstartin
@robertervin
@samarthjain
@seoeun25
@Shimi
@surekhasaharan
@taiii
@thomask
@VincentNewkirk
@vogievetsky
@yunwan
@zhaojiandong