HDFS-17577.Add Support for CreateFlag.NO_LOCAL_WRITE in File Creation to Manage Disk Space and Network Load in Labeled YARN Nodes #6935

Open
wants to merge 9 commits into
base: trunk

liangyu-1
Contributor

… to Manage Disk Space and Network Load in Labeled YARN Nodes

Description of PR

As described in HDFS-17577
I am currently using Apache Flink to write files into Hadoop. The Flink application runs on a labeled YARN queue. During operation, it has been observed that the local disks on these labeled nodes get filled up quickly, and the network load is significantly high. This issue arises because Hadoop prioritizes writing files to the local node first, and the number of these labeled nodes is quite limited.
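The local-first behavior described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the NameNode's actual block placement policy: the class `PlacementSketch`, the method `chooseFirstReplica`, and the host names are hypothetical, and the fallback to the local node when no other node exists is an assumption made to reflect that the flag is advisory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Simplified model of first-replica selection; NOT the real HDFS placement policy. */
final class PlacementSketch {

    /**
     * Picks the node that receives the first replica of a block.
     * Without noLocalWrite, the writer's own host wins whenever it is a DataNode.
     * With noLocalWrite, the writer's host is excluded if any other node exists.
     */
    static String chooseFirstReplica(List<String> dataNodes, String writerHost,
                                     boolean noLocalWrite, Random rng) {
        if (dataNodes.isEmpty()) {
            throw new IllegalArgumentException("no DataNodes available");
        }
        if (!noLocalWrite && dataNodes.contains(writerHost)) {
            return writerHost; // local-first: every block lands on the writer's node
        }
        List<String> candidates = new ArrayList<>(dataNodes);
        candidates.removeIf(writerHost::equals); // no-op if the writer is not a DataNode
        if (candidates.isEmpty()) {
            return writerHost; // assumption: fall back to local rather than fail
        }
        return candidates.get(rng.nextInt(candidates.size()));
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("labeled-1", "dn-a", "dn-b");
        Random rng = new Random();
        System.out.println(chooseFirstReplica(nodes, "labeled-1", false, rng));
        System.out.println(chooseFirstReplica(nodes, "labeled-1", true, rng));
    }
}
```

With only a few labeled hosts acting as writers, the first branch concentrates first replicas on those hosts, which matches the imbalance reported in this PR.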

The current behavior leads to inefficient disk space utilization and high network traffic on these few labeled nodes, which could affect the performance and reliability of the application. As shown in the picture, the host I circled has an average net_bytes_sent rate of 1.2GB/s while the others average only 50MB/s. This imbalance in network load and disk usage nearly brought down the whole cluster.
[Figure: per-host net_bytes_sent, with the high-traffic host circled]
 

Implementation:
I added a configuration option, dfs.client.write.no_local_write, that applies CreateFlag.NO_LOCAL_WRITE during file creation in Hadoop's file system APIs. This gives applications like Flink running in labeled queues the flexibility to opt for non-local writes when necessary.
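As a rough illustration of the idea, the sketch below derives the set of create flags from a boolean configuration entry. It is a self-contained model and not the patch itself: the nested `Flag` enum stands in for Hadoop's `CreateFlag`, `java.util.Properties` stands in for Hadoop's configuration object, and the `false` default is an assumption. Only the key name comes from this PR description.

```java
import java.util.EnumSet;
import java.util.Properties;

/** Model of a config-driven create flag; stand-in types, not Hadoop classes. */
final class NoLocalWriteConfigSketch {

    /** Stand-in for the relevant members of org.apache.hadoop.fs.CreateFlag. */
    enum Flag { CREATE, OVERWRITE, NO_LOCAL_WRITE }

    // Key name as given in the PR description; the default value is assumed.
    static final String NO_LOCAL_WRITE_KEY = "dfs.client.write.no_local_write";

    /** Builds the flags a client would send when creating a file. */
    static EnumSet<Flag> effectiveFlags(Properties conf, boolean overwrite) {
        EnumSet<Flag> flags = overwrite
                ? EnumSet.of(Flag.CREATE, Flag.OVERWRITE)
                : EnumSet.of(Flag.CREATE);
        if (Boolean.parseBoolean(conf.getProperty(NO_LOCAL_WRITE_KEY, "false"))) {
            flags.add(Flag.NO_LOCAL_WRITE); // opt in to non-local writes
        }
        return flags;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(NO_LOCAL_WRITE_KEY, "true");
        // prints [CREATE, OVERWRITE, NO_LOCAL_WRITE]
        System.out.println(effectiveFlags(conf, true));
    }
}
```

Keeping the switch off by default would leave existing clients unaffected unless they set the key.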

How was this patch tested?

I rebuilt the whole hadoop-hdfs-client module and tested it using Flink on the labeled YARN queue. The distribution of disk usage across the nodes in the cluster is more even, and the network load has also improved.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

… to Manage Disk Space and Network Load in Labeled YARN Nodes
@liangyu-1 changed the title from 'HDFS-17577.Add Support for CreateFlag.NO_LOCAL_WRITE in File Creation to Manage Disk Space and Network Load in Labeled YARN Nodes"' to 'HDFS-17577.Add Support for CreateFlag.NO_LOCAL_WRITE in File Creation to Manage Disk Space and Network Load in Labeled YARN Nodes' on Jul 10, 2024
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
-1 ❌ mvninstall 2m 13s /branch-mvninstall-root.txt root in trunk failed.
-1 ❌ compile 0m 21s /branch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.txt hadoop-hdfs-client in trunk failed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.
-1 ❌ compile 0m 23s /branch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkPrivateBuild-1.8.0_412-8u412-ga-1~20.04.1-b08.txt hadoop-hdfs-client in trunk failed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08.
-0 ⚠️ checkstyle 0m 21s /buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt The patch fails to run checkstyle in hadoop-hdfs-client
-1 ❌ mvnsite 2m 13s /branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in trunk failed.
-1 ❌ javadoc 0m 33s /branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.txt hadoop-hdfs-client in trunk failed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.
-1 ❌ javadoc 0m 22s /branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkPrivateBuild-1.8.0_412-8u412-ga-1~20.04.1-b08.txt hadoop-hdfs-client in trunk failed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08.
-1 ❌ spotbugs 0m 22s /branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in trunk failed.
+1 💚 shadedclient 4m 36s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
-1 ❌ mvninstall 0m 22s /patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in the patch failed.
-1 ❌ compile 0m 22s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.txt hadoop-hdfs-client in the patch failed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.
-1 ❌ javac 0m 22s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.txt hadoop-hdfs-client in the patch failed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.
-1 ❌ compile 0m 22s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkPrivateBuild-1.8.0_412-8u412-ga-1~20.04.1-b08.txt hadoop-hdfs-client in the patch failed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08.
-1 ❌ javac 0m 22s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkPrivateBuild-1.8.0_412-8u412-ga-1~20.04.1-b08.txt hadoop-hdfs-client in the patch failed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 19s /buildtool-patch-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt The patch fails to run checkstyle in hadoop-hdfs-client
-1 ❌ mvnsite 0m 22s /patch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in the patch failed.
-1 ❌ javadoc 0m 12s /patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.txt hadoop-hdfs-client in the patch failed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.
-1 ❌ javadoc 0m 23s /patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkPrivateBuild-1.8.0_412-8u412-ga-1~20.04.1-b08.txt hadoop-hdfs-client in the patch failed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08.
-1 ❌ spotbugs 0m 22s /patch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in the patch failed.
+1 💚 shadedclient 3m 33s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 22s /patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in the patch failed.
+0 🆗 asflicense 0m 18s ASF License check generated no output?
14m 27s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/1/artifact/out/Dockerfile
GITHUB PR #6935
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux a459709db3d4 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / de67a94
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/1/testReport/
Max. process+thread count 51 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/1/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
-1 ❌ mvninstall 0m 22s /branch-mvninstall-root.txt root in trunk failed.
-1 ❌ compile 0m 22s /branch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.txt hadoop-hdfs-client in trunk failed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.
-1 ❌ compile 0m 22s /branch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkPrivateBuild-1.8.0_412-8u412-ga-1~20.04.1-b08.txt hadoop-hdfs-client in trunk failed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08.
-0 ⚠️ checkstyle 0m 18s /buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt The patch fails to run checkstyle in hadoop-hdfs-client
-1 ❌ mvnsite 0m 22s /branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in trunk failed.
-1 ❌ javadoc 0m 22s /branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.txt hadoop-hdfs-client in trunk failed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.
-1 ❌ javadoc 0m 22s /branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkPrivateBuild-1.8.0_412-8u412-ga-1~20.04.1-b08.txt hadoop-hdfs-client in trunk failed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08.
-1 ❌ spotbugs 0m 20s /branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in trunk failed.
+1 💚 shadedclient 2m 27s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
-1 ❌ mvninstall 0m 22s /patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in the patch failed.
-1 ❌ compile 0m 22s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.txt hadoop-hdfs-client in the patch failed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.
-1 ❌ javac 0m 22s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.txt hadoop-hdfs-client in the patch failed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.
-1 ❌ compile 0m 22s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkPrivateBuild-1.8.0_412-8u412-ga-1~20.04.1-b08.txt hadoop-hdfs-client in the patch failed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08.
-1 ❌ javac 0m 22s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-client-jdkPrivateBuild-1.8.0_412-8u412-ga-1~20.04.1-b08.txt hadoop-hdfs-client in the patch failed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 19s /buildtool-patch-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt The patch fails to run checkstyle in hadoop-hdfs-client
-1 ❌ mvnsite 0m 22s /patch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in the patch failed.
-1 ❌ javadoc 0m 21s /patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.txt hadoop-hdfs-client in the patch failed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2.
-1 ❌ javadoc 0m 22s /patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkPrivateBuild-1.8.0_412-8u412-ga-1~20.04.1-b08.txt hadoop-hdfs-client in the patch failed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08.
-1 ❌ spotbugs 0m 22s /patch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in the patch failed.
+1 💚 shadedclient 4m 7s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 20s /patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in the patch failed.
+0 🆗 asflicense 0m 22s ASF License check generated no output?
11m 36s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/2/artifact/out/Dockerfile
GITHUB PR #6935
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 6368fd79930f 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / f84aa72
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/2/testReport/
Max. process+thread count 51 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/2/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@Hexiaoqiao
Contributor

Thanks @liangyu-1 for your report and PR. What about invoking the following interface and setting the flag to CreateFlag.IGNORE_CLIENT_LOCALITY?

https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1275-L1284

@liangyu-1
Contributor Author

> Thanks @liangyu-1 for your report and PR. What about invoking the following interface and setting the flag to CreateFlag.IGNORE_CLIENT_LOCALITY?
>
> https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1275-L1284

@Hexiaoqiao Thanks for your reply. I think it's a good idea, and I have resubmitted my code so that it invokes that interface and replaces the create flag. CreateFlag.IGNORE_CLIENT_LOCALITY helps spread writes more evenly across the cluster, and invoking the interface improves code scalability.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 6m 35s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 33m 25s trunk passed
+1 💚 compile 0m 37s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 0m 32s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 0m 22s trunk passed
+1 💚 mvnsite 0m 36s trunk passed
+1 💚 javadoc 0m 33s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 29s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 28s trunk passed
+1 💚 shadedclient 22m 17s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 28s the patch passed
+1 💚 compile 0m 28s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 0m 28s the patch passed
+1 💚 compile 0m 24s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 0m 24s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 11s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 1 new + 44 unchanged - 0 fixed = 45 total (was 44)
+1 💚 mvnsite 0m 28s the patch passed
+1 💚 javadoc 0m 20s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 21s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 22s the patch passed
+1 💚 shadedclient 22m 33s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 52s hadoop-hdfs-client in the patch passed.
+1 💚 asflicense 0m 23s The patch does not generate ASF License warnings.
95m 49s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/3/artifact/out/Dockerfile
GITHUB PR #6935
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux dd0560c46629 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 685073a
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/3/testReport/
Max. process+thread count 551 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 33m 17s trunk passed
+1 💚 compile 0m 30s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 0m 29s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 0m 19s trunk passed
+1 💚 mvnsite 0m 32s trunk passed
+1 💚 javadoc 0m 30s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 25s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 25s trunk passed
+1 💚 shadedclient 22m 59s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 26s the patch passed
+1 💚 compile 0m 31s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 0m 31s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 12s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 1 new + 44 unchanged - 0 fixed = 45 total (was 44)
+1 💚 mvnsite 0m 24s the patch passed
+1 💚 javadoc 0m 21s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 22s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 37s the patch passed
+1 💚 shadedclient 22m 10s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 43s hadoop-hdfs-client in the patch passed.
+1 💚 asflicense 0m 22s The patch does not generate ASF License warnings.
89m 14s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/4/artifact/out/Dockerfile
GITHUB PR #6935
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux fec31dc95e30 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a9ed293
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/4/testReport/
Max. process+thread count 551 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@slfan1989
Contributor

@liangyu-1 Can we add a corresponding unit test for this? We need to fix the checkstyle issue.

… to Manage Disk Space and Network Load in Labeled YARN Nodes
@liangyu-1
Contributor Author

liangyu-1 commented Jul 16, 2024

> @liangyu-1 Can we add a corresponding unit test for this? We need to fix the checkstyle issue.

Hi @slfan1989, I have just added a unit test on the result of DFSClient.getConf().getNoLocalWrite().

DFSClient has no interface for setting the DataNode address, so this unit test can only verify that the CreateFlag.NO_LOCAL_WRITE flag is added when we create a new HDFS file.

Thanks

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 6m 45s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 34m 54s trunk passed
+1 💚 compile 0m 33s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 0m 32s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 0m 18s trunk passed
+1 💚 mvnsite 0m 33s trunk passed
+1 💚 javadoc 0m 30s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 26s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 30s trunk passed
+1 💚 shadedclient 27m 30s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 27s the patch passed
+1 💚 compile 0m 30s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 0m 30s the patch passed
+1 💚 compile 0m 24s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 0m 24s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 11s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 7 new + 44 unchanged - 0 fixed = 51 total (was 44)
+1 💚 mvnsite 0m 26s the patch passed
+1 💚 javadoc 0m 21s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 21s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
-1 ❌ spotbugs 1m 17s /patch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in the patch failed.
-1 ❌ shadedclient 4m 58s patch has errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 6s /patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-client in the patch failed.
+0 🆗 asflicense 0m 7s ASF License check generated no output?
79m 41s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/5/artifact/out/Dockerfile
GITHUB PR #6935
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux de10559d44a9 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 273ce0e
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/5/testReport/
Max. process+thread count 552 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@Hexiaoqiao
Contributor

Sorry for the late response. I don't understand why this PR is needed or what it is trying to do. When I suggested invoking that interface with the flag set to CreateFlag.IGNORE_CLIENT_LOCALITY, I meant that the current implementation already supports skipping the local node when writing data to HDFS. Please check again. Thanks.

@liangyu-1
Contributor Author

liangyu-1 commented Jul 17, 2024

> Sorry for the late response. I don't understand why this PR is needed or what it is trying to do. When I suggested invoking that interface with the flag set to CreateFlag.IGNORE_CLIENT_LOCALITY, I meant that the current implementation already supports skipping the local node when writing data to HDFS. Please check again. Thanks.

@Hexiaoqiao Thanks for your reply. I think I understand your suggestion now: you mean that I can call that interface and pass the flag there.

But in my scenario I am using Flink's FileSystem API. I have read its source code, and it calls FileSystem.create(Path f). So if I want to use CreateFlag.IGNORE_CLIENT_LOCALITY in Hadoop, I have to change the source code of the Flink filesystem API and rebuild the whole Flink project.

I think the same will happen with most computation engines, because most of them call FileSystem.create(Path f) directly. That would cause a lot of extra work.

With my PR, I can solve the problem just by adding a Hadoop configuration entry, which is much more convenient.
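The trade-off argued here can be sketched with stand-in types. Everything below is hypothetical and only models the shape of the problem: `SketchFileSystem` is not Hadoop's `FileSystem`, `engineWrite` represents engine code that only calls the one-argument `create`, and the `noLocalWrite` constructor argument represents a value that would come from client configuration.

```java
import java.util.EnumSet;

/** Models why a client-side switch avoids changing engine call sites. */
final class CallSiteSketch {

    /** Stand-in for the relevant create flags. */
    enum Flag { CREATE, OVERWRITE, NO_LOCAL_WRITE }

    /** Stand-in file system whose one-argument create() chooses flags itself. */
    static final class SketchFileSystem {
        private final boolean noLocalWrite; // would be read from configuration

        SketchFileSystem(boolean noLocalWrite) {
            this.noLocalWrite = noLocalWrite;
        }

        /** The overload an engine typically calls: no flags at the call site. */
        EnumSet<Flag> create(String path) {
            EnumSet<Flag> flags = EnumSet.of(Flag.CREATE, Flag.OVERWRITE);
            if (noLocalWrite) {
                flags.add(Flag.NO_LOCAL_WRITE);
            }
            return create(path, flags);
        }

        /** Explicit-flags overload; returns the flags it would send, for inspection. */
        EnumSet<Flag> create(String path, EnumSet<Flag> flags) {
            return EnumSet.copyOf(flags);
        }
    }

    /** Engine code that is assumed to be hard to modify. */
    static EnumSet<Flag> engineWrite(SketchFileSystem fs) {
        return fs.create("/tmp/part-0");
    }

    public static void main(String[] args) {
        System.out.println(engineWrite(new SketchFileSystem(false)));
        System.out.println(engineWrite(new SketchFileSystem(true)));
    }
}
```

In this model `engineWrite` is identical in both runs; only the file system's construction differs. A caller that controls its own code could instead use the explicit-flags overload, which is the route suggested earlier in the thread.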

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 19s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 34s trunk passed
+1 💚 compile 0m 37s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 0m 30s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 0m 22s trunk passed
+1 💚 mvnsite 0m 35s trunk passed
+1 💚 javadoc 0m 34s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 28s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 27s trunk passed
+1 💚 shadedclient 20m 54s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 26s the patch passed
+1 💚 compile 0m 30s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 0m 30s the patch passed
+1 💚 compile 0m 24s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 0m 24s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 13s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 7 new + 44 unchanged - 0 fixed = 51 total (was 44)
+1 💚 mvnsite 0m 29s the patch passed
+1 💚 javadoc 0m 23s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 23s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 23s the patch passed
+1 💚 shadedclient 20m 43s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 50s hadoop-hdfs-client in the patch passed.
-1 ❌ asflicense 0m 26s /results-asflicense.txt The patch generated 1 ASF License warnings.
85m 39s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/6/artifact/out/Dockerfile
GITHUB PR #6935
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux b64ff602f484 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / e2a0007
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/6/testReport/
Max. process+thread count 554 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 18s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 37s trunk passed
+1 💚 compile 0m 37s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 0m 30s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 0m 21s trunk passed
+1 💚 mvnsite 0m 36s trunk passed
+1 💚 javadoc 0m 33s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 27s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 26s trunk passed
+1 💚 shadedclient 20m 51s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 26s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 0m 29s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 14s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 7 new + 44 unchanged - 0 fixed = 51 total (was 44)
+1 💚 mvnsite 0m 27s the patch passed
+1 💚 javadoc 0m 19s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 23s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 29s the patch passed
+1 💚 shadedclient 20m 55s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 48s hadoop-hdfs-client in the patch passed.
+1 💚 asflicense 0m 27s The patch does not generate ASF License warnings.
85m 39s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/7/artifact/out/Dockerfile
GITHUB PR #6935
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 08e246daaa61 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / d14880b
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/7/testReport/
Max. process+thread count 551 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6935/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@Hexiaoqiao
Contributor

But in my scenario, I am using Flink's FileSystem API, and from reading its source code I found that it calls FileSystem.create(Path f), which means that if I want to use CreateFlag.IGNORE_CLIENT_LOCALITY in Hadoop, I have to change the source code of the Flink filesystem API and rebuild the whole Flink project.

I think this will also happen in most computation engines, because most of them call FileSystem.create(Path f) directly. This would cause too much extra work.

Got it, but I am sorry to disagree with your opinion. A flexible interface already exists, but the upstream system does not invoke it, so we should push the upstream system to update. On the other hand, a client-level configuration like the one in this PR would affect every write made through that client, which may not be what users expect. In short, I suggest proposing this and submitting a PR on the Flink side. Thanks again.

@liangyu-1
Contributor Author

But in my scenario, I am using Flink's FileSystem API, and from reading its source code I found that it calls FileSystem.create(Path f), which means that if I want to use CreateFlag.IGNORE_CLIENT_LOCALITY in Hadoop, I have to change the source code of the Flink filesystem API and rebuild the whole Flink project.
I think this will also happen in most computation engines, because most of them call FileSystem.create(Path f) directly. This would cause too much extra work.

Got it, but I am sorry to disagree with your opinion. A flexible interface already exists, but the upstream system does not invoke it, so we should push the upstream system to update. On the other hand, a client-level configuration like the one in this PR would affect every write made through that client, which may not be what users expect. In short, I suggest proposing this and submitting a PR on the Flink side. Thanks again.

@Hexiaoqiao, this does not only happen in Flink, but also in other engines such as Spark. If I only submit a PR on the Flink side, the other engines' APIs (such as Spark and Spark Structured Streaming) will not be able to use this feature, and we would need to rebuild the whole computation project whenever we adopt a new computation engine.

@ayushtkn
Member

This does not only happen in Flink, but also in other engines such as Spark. If I only submit a PR on the Flink side, the other engines' APIs (such as Spark and Spark Structured Streaming) will not be able to use this feature, and we would need to rebuild the whole computation project whenever we adopt a new computation engine.

That ain't our concern. We provide an interface to do things; if those engines want to leverage that functionality, they can do it that way.

That those engines can't update their code, or that there are multiple clients, doesn't justify changing the Hadoop-side code.
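
For readers following the thread, the trade-off being argued can be sketched roughly as below. This is a minimal model under stated assumptions, not Hadoop code: `Flag` is a stand-in for `org.apache.hadoop.fs.CreateFlag`, and `Client` with its `clientWideNoLocalWrite` switch is a hypothetical stand-in for a DFS client carrying the configuration option proposed in this PR. It only shows why a client-level setting reaches every writer on that client, while a per-call flag reaches only the caller that passes it.

```java
import java.util.EnumSet;

public class CreateFlagSketch {
    // Stand-in for org.apache.hadoop.fs.CreateFlag; only the names used in this thread.
    enum Flag { CREATE, OVERWRITE, NO_LOCAL_WRITE, IGNORE_CLIENT_LOCALITY }

    // Hypothetical client. clientWideNoLocalWrite models the config switch proposed in this PR.
    static final class Client {
        private final boolean clientWideNoLocalWrite;

        Client(boolean clientWideNoLocalWrite) {
            this.clientWideNoLocalWrite = clientWideNoLocalWrite;
        }

        // Models FileSystem.create(Path f): the caller passes no flags,
        // so only the defaults (create + overwrite) apply.
        EnumSet<Flag> create(String path) {
            return create(path, EnumSet.of(Flag.CREATE, Flag.OVERWRITE));
        }

        // Models the flag-taking overload. In Hadoop the analogous call is
        // FileSystem.create(Path, FsPermission, EnumSet<CreateFlag>, int, short, long, Progressable).
        EnumSet<Flag> create(String path, EnumSet<Flag> flags) {
            EnumSet<Flag> effective = EnumSet.copyOf(flags);
            if (clientWideNoLocalWrite) {
                // The client-wide switch silently applies to every caller.
                effective.add(Flag.NO_LOCAL_WRITE);
            }
            return effective;
        }
    }

    public static void main(String[] args) {
        // Per-call approach: only the writer that asks for the flag gets it.
        Client plain = new Client(false);
        System.out.println(plain.create("/logs/a"));
        System.out.println(plain.create("/logs/b",
                EnumSet.of(Flag.CREATE, Flag.OVERWRITE, Flag.NO_LOCAL_WRITE)));

        // Client-wide approach: every writer on this client gets the flag.
        Client configured = new Client(true);
        System.out.println(configured.create("/logs/c"));
    }
}
```

The per-call form is what the reviewers point to as the existing interface; the client-wide form is what the PR proposes, and the sketch makes visible that callers of the plain `create(path)` overload have no way to opt out of it.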
