Removes support for Hadoop 2 #14763

tejaswini-imply · 2023-08-07T06:40:19Z

Release note
Hadoop 2 support was formally deprecated in Druid 27.0.0. This change removes Hadoop 2 support.

abhishekagarwal87 · 2023-08-07T09:25:16Z

.github/workflows/cron-job-its.yml

@@ -112,7 +112,6 @@ jobs:
    strategy:
      fail-fast: false
      matrix:


Can this line be removed as well?

abhishekagarwal87 · 2023-08-07T09:27:47Z

examples/quickstart/tutorial/hadoop3/docker/hdfs-site.xml

Did we move these files or simply delete them?

Deleted the original `tutorial/hadoop/docker/*` files and moved the `tutorial/hadoop3/docker/*` files to the `tutorial/hadoop/docker/` folder.

ektravel · 2023-08-07T17:13:06Z

docs/configuration/index.md

@@ -1534,7 +1534,7 @@ Additional peon configs include:
 |`druid.indexer.task.baseDir`|Base temporary working directory.|`System.getProperty("java.io.tmpdir")`|
 |`druid.indexer.task.baseTaskDir`|Base temporary working directory for tasks.|`${druid.indexer.task.baseDir}/persistent/task`|
 |`druid.indexer.task.batchProcessingMode`| Batch ingestion tasks have three operating modes to control construction and tracking for intermediary segments: `OPEN_SEGMENTS`, `CLOSED_SEGMENTS`, and `CLOSED_SEGMENT_SINKS`. `OPEN_SEGMENTS` uses the streaming ingestion code path and performs a `mmap` on intermediary segments to build a timeline to make these segments available to realtime queries. Batch ingestion doesn't require intermediary segments, so the default mode, `CLOSED_SEGMENTS`, eliminates `mmap` of intermediary segments. `CLOSED_SEGMENTS` mode still tracks the entire set of segments in heap. The `CLOSED_SEGMENTS_SINKS` mode is the most aggressive configuration and should have the smallest memory footprint. It eliminates in-memory tracking and `mmap` of intermediary segments produced during segment creation. `CLOSED_SEGMENTS_SINKS` mode isn't as well tested as other modes so is currently considered experimental. You can use `OPEN_SEGMENTS` mode if problems occur with the 2 newer modes. |`CLOSED_SEGMENTS`|
-|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client:2.8.5|
+|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client-api:3.3.6,org.apache.hadoop:hadoop-client-runtime:3.3.6|


Suggested change

|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client-api:3.3.6,org.apache.hadoop:hadoop-client-runtime:3.3.6|

|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|`org.apache.hadoop:hadoop-client-api:3.3.6`, `org.apache.hadoop:hadoop-client-runtime:3.3.6`|

ektravel · 2023-08-07T17:13:50Z

docs/configuration/index.md

@@ -1605,7 +1605,7 @@ then the value from the configuration below is used:
 |`druid.worker.numConcurrentMerges`|Maximum number of segment persist or merge operations that can run concurrently across all tasks.|`druid.worker.capacity` / 2, rounded down|
 |`druid.indexer.task.baseDir`|Base temporary working directory.|`System.getProperty("java.io.tmpdir")`|
 |`druid.indexer.task.baseTaskDir`|Base temporary working directory for tasks.|`${druid.indexer.task.baseDir}/persistent/tasks`|
-|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client:2.8.5|
+|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client-api:3.3.6,org.apache.hadoop:hadoop-client-runtime:3.3.6|


Suggested change

|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|org.apache.hadoop:hadoop-client-api:3.3.6,org.apache.hadoop:hadoop-client-runtime:3.3.6|

|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|`org.apache.hadoop:hadoop-client-api:3.3.6`, `org.apache.hadoop:hadoop-client-runtime:3.3.6`|

ektravel · 2023-08-07T17:27:44Z

docs/operations/other-hadoop.md

@@ -89,7 +89,7 @@ classloader.
 2. Batch ingestion uses jars from `hadoop-dependencies/` to submit Map/Reduce jobs (location customizable via the
 `druid.extensions.hadoopDependenciesDir` runtime property; see [Configuration](../configuration/index.md#extensions)).

-`hadoop-client:2.8.5` is the default version of the Hadoop client bundled with Druid for both purposes. This works with
+`hadoop-client-api:3.3.6, hadoop-client-runtime:3.3.6` is the default version of the Hadoop client bundled with Druid for both purposes. This works with


Suggested change

`hadoop-client-api:3.3.6, hadoop-client-runtime:3.3.6` is the default version of the Hadoop client bundled with Druid for both purposes. This works with

The default version of the Hadoop client bundled with Druid is 3.3.6. This works with
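
For context, a minimal sketch of how a deployment might pin these coordinates explicitly rather than relying on the default. It assumes that list-valued runtime properties are written as JSON arrays, the way `druid.extensions.loadList` is; the file named in the comment is illustrative and not taken from this PR.

```properties
# Hypothetical excerpt from a MiddleManager runtime.properties file.
# Assumption: list-valued properties are written as JSON arrays.
druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client-api:3.3.6","org.apache.hadoop:hadoop-client-runtime:3.3.6"]
```

Per the property description above, this default applies only to HadoopIndexTasks that do not request a particular version.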

ektravel

Left a few suggestions for the docs.

tejaswini-imply · 2023-08-08T06:58:16Z

Thanks @abhishekagarwal87, @ektravel for the review. I have addressed your comments in the latest commit.

ektravel

Docs look good.

Several dependabot ignore directives are no longer relevant. Unpin them to ensure we again get timely updates via dependabot.
* Support for Hadoop 2 was dropped as part of #14763
* Guava was upgraded to 31 as part of #14767
* Calcite was upgraded to 1.35 as part of #14510

remove support for hadoop2

d65aaed

github-actions bot added the Area - Documentation label Aug 7, 2023

abhishekagarwal87 reviewed Aug 7, 2023


abhishekagarwal87 added the Release Notes label Aug 7, 2023

ektravel reviewed Aug 7, 2023


minute corrections

c907639

abhishekagarwal87 approved these changes Aug 8, 2023


ektravel approved these changes Aug 8, 2023


nit: spelling

0855293

abhishekagarwal87 added the Design Review label Aug 9, 2023

clintropolis approved these changes Aug 9, 2023


abhishekagarwal87 merged commit a45b25f into apache:master Aug 9, 2023
79 of 80 checks passed

LakshSingla added this to the 28.0 milestone Oct 12, 2023

LakshSingla mentioned this pull request Nov 4, 2023

[DRAFT] 28.0.0 release notes #15326

Closed

xvrl mentioned this pull request Dec 5, 2023

unpin guava related dependabot dependencies #15494

Merged


