
HADOOP-19248. Protobuf code generate and replace should happen together #6975

Merged · 1 commit into apache:trunk · Aug 28, 2024

Conversation

@pan3793 (Member) commented Aug 2, 2024

Description of PR

As part of HADOOP-16596, Hadoop switched from the vanilla protobuf to the shaded one, so the code generated by protobuf must be modified (package names replaced) before compiling and generating Javadoc.

Currently, protobuf code generation happens in generate-sources (and other similar phases), while the replacement happens in process-sources. This is fine for compilation but causes problems for Javadoc.

$ mvn clean install -DskipTests
$ mvn javadoc:javadoc

The above commands won't trigger the replacement, so the final generated Javadoc refers to vanilla protobuf.
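For readers unfamiliar with the replace step: it is essentially a textual relocation of package references in the generated sources. A minimal sketch in Python of that idea (the shaded package name org.apache.hadoop.thirdparty.protobuf follows Hadoop's thirdparty shading convention and is an assumption here; the real rewrite is performed by the replacer Maven plugin, not this script):

```python
import re

# Sketch of what the replace step does to protobuf-generated sources:
# rewrite references to the vanilla protobuf package so they point at
# the shaded (relocated) copy.
VANILLA = r"\bcom\.google\.protobuf\b"
SHADED = "org.apache.hadoop.thirdparty.protobuf"

def relocate(source: str) -> str:
    # Word boundaries keep the rewrite from touching longer identifiers
    # that merely start with the same prefix.
    return re.sub(VANILLA, SHADED, source)

print(relocate("private com.google.protobuf.GeneratedMessageV3 msg;"))
```

If this rewrite never runs, the generated Javadoc (and any later compile of those sources) still refers to com.google.protobuf, which is exactly the symptom shown below.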


Javadoc (Java 8) also complains:

[INFO] --- javadoc:3.0.1:javadoc (default-cli) @ hadoop-common ---
[INFO] 
ExcludePrivateAnnotationsStandardDoclet
100 warnings
[WARNING] Javadoc Warnings
[WARNING] /Users/chengpan/Projects/apache-hadoop/hadoop-common-project/hadoop-common/target/generated-sources/java/org/apache/hadoop/ipc/protobuf/RpcHeaderProtos.java:3887: error: cannot find symbol
[WARNING] com.google.protobuf.GeneratedMessageV3 implements
[WARNING] ^
[WARNING] symbol:   class GeneratedMessageV3
[WARNING] location: package com.google.protobuf

Things seem worse on Java 17, where the error fails the build immediately.

How was this patch tested?

Tested with Java 8, macOS aarch64.

➜  apache-hadoop git:(trunk) mvn clean install -DskipTests
➜  apache-hadoop git:(trunk) mvn javadoc:javadoc | grep '@ hadoop-common '                 
[INFO] >>> javadoc:3.0.1:javadoc (default-cli) > generate-sources @ hadoop-common >>>
[INFO] --- antrun:1.7:run (create-testdirs) @ hadoop-common ---
[INFO] --- protobuf:0.5.1:compile (src-compile-protoc) @ hadoop-common ---
[INFO] --- build-helper:1.9:add-source (add-source-legacy-protobuf) @ hadoop-common ---
[INFO] <<< javadoc:3.0.1:javadoc (default-cli) < generate-sources @ hadoop-common <<<
[INFO] --- javadoc:3.0.1:javadoc (default-cli) @ hadoop-common ---
➜  apache-hadoop git:(HADOOP-19248) mvn javadoc:javadoc | grep '@ hadoop-common '
[INFO] >>> javadoc:3.0.1:javadoc (default-cli) > generate-sources @ hadoop-common >>>
[INFO] --- antrun:1.7:run (create-testdirs) @ hadoop-common ---
[INFO] --- protobuf:0.5.1:compile (src-compile-protoc) @ hadoop-common ---
[INFO] --- replacer:1.5.3:replace (replace-generated-sources) @ hadoop-common ---
[INFO] --- replacer:1.5.3:replace (replace-sources) @ hadoop-common ---
[INFO] --- build-helper:1.9:add-source (add-source-legacy-protobuf) @ hadoop-common ---
[INFO] <<< javadoc:3.0.1:javadoc (default-cli) < generate-sources @ hadoop-common <<<
[INFO] --- javadoc:3.0.1:javadoc (default-cli) @ hadoop-common ---

The Javadoc warnings disappeared and the generated code is now correct.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@pan3793 (Member Author) commented Aug 2, 2024

There is no issue with mvn package -Pdocs -DskipTests, but I suppose mvn javadoc:javadoc should also work, because it is listed here:

* Build javadocs : mvn javadoc:javadoc

@pan3793 (Member Author) commented Aug 2, 2024

@hadoop-yetus commented

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 31s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 48m 55s trunk passed
+1 💚 compile 0m 20s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 0m 20s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 mvnsite 0m 26s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 23s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 shadedclient 86m 34s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 13s the patch passed
+1 💚 compile 0m 12s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 0m 12s the patch passed
+1 💚 compile 0m 12s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 0m 12s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 mvnsite 0m 15s the patch passed
+1 💚 javadoc 0m 13s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 0m 13s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 shadedclient 37m 7s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 16s hadoop-project in the patch passed.
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
127m 23s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6975/1/artifact/out/Dockerfile
GITHUB PR #6975
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint
uname Linux cc0f6c476722 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 9c51b6c
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6975/1/testReport/
Max. process+thread count 691 (vs. ulimit of 5500)
modules C: hadoop-project U: hadoop-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6975/1/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@slfan1989 (Contributor) commented Aug 2, 2024

cc @steveloughran @slfan1989 @aajisaka

@pan3793 Thank you for your contribution! Personally, I'm not sure whether this modification is feasible. I noticed differences between these two phases on Maven's lifecycle page; I need other members to help verify.

https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html

Phase | Description
generate-sources | generate any source code for inclusion in compilation.
process-sources | process the source code, for example to filter any values.

@pan3793 (Member Author) commented Aug 5, 2024

also cc @vinayakumarb

@steveloughran (Contributor) commented

Looking at the Maven side, I'd consider the replacement part of the generation. The problem we have is that within a lifecycle phase I am not sure we can guarantee the order in which operations happen; having specific phases is designed to eliminate that ambiguity. As it is: even if it currently works for Maven locally, do we have any guarantees that it will continue to work?

@pan3793 (Member Author) commented Aug 7, 2024

@steveloughran I understand your concerns: without explicitly declared dependencies between executions, the final execution set and order will always seem fragile.

do we have any guarantees that it will continue to work?

According to https://maven.apache.org/plugins/maven-javadoc-plugin/javadoc-mojo.html, mvn javadoc:javadoc would

Invokes the execution of the following lifecycle phase prior to executing itself: generate-sources.

I think the answer is yes, assuming all executions within a lifecycle phase run sequentially in their declared order. (I didn't find such a statement in the Maven documentation, but it seems to hold.)
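For reference, Maven's 3.0.3 release notes state that plugin executions bound to the same phase run in the order they are declared in the POM. A hedged pom.xml sketch of what "generate and replace together" could look like (plugin coordinates, execution ids, and paths are illustrative, not copied from hadoop-project):

```xml
<!-- Illustrative sketch only: both executions bound to generate-sources.
     With Maven 3.0.3+ they run in declaration order, so protoc runs
     first and the replacer rewrites its output within the same phase. -->
<plugin>
  <groupId>org.xolstice.maven.plugins</groupId>
  <artifactId>protobuf-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>src-compile-protoc</id>
      <phase>generate-sources</phase>
      <goals><goal>compile</goal></goals>
    </execution>
  </executions>
</plugin>
<plugin>
  <groupId>com.google.code.maven-replacer-plugin</groupId>
  <artifactId>replacer</artifactId>
  <executions>
    <execution>
      <id>replace-generated-sources</id>
      <phase>generate-sources</phase> <!-- previously process-sources -->
      <goals><goal>replace</goal></goals>
      <configuration>
        <basedir>${project.build.directory}/generated-sources/java</basedir>
        <includes><include>**/*.java</include></includes>
        <replacements>
          <replacement>
            <token>com\.google\.protobuf</token>
            <value>org.apache.hadoop.thirdparty.protobuf</value>
          </replacement>
        </replacements>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Because mvn javadoc:javadoc forks only generate-sources, binding the replacer there (rather than to process-sources) is what guarantees the Javadoc sees the relocated package names.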

@steveloughran (Contributor) left a comment


I'm +1 for this. If it causes problems, we can see what can be done, but this does at least allow progress.

@slfan1989 (Contributor) commented

LGTM +1.

I'm currently organizing the JDK 17 build scripts and will verify this functionality again. Let's see if there are any other comments from other members (I'll wait an additional 1-2 days).

@pan3793 Thanks for the contribution! @steveloughran Thanks for the review!

@pan3793 (Member Author) commented Aug 26, 2024

Kindly ping @slfan1989, can we move this ahead?

@steveloughran steveloughran merged commit 0aab1a2 into apache:trunk Aug 28, 2024
1 of 3 checks passed
@steveloughran (Contributor) commented

@pan3793 merged... sorry, I thought you had the permissions to do this yourself.

Please can you do a PR for branch-3.4, push it up to see what Yetus does there, and I'll merge it afterwards. Thanks.

@slfan1989 (Contributor) commented

@steveloughran Thanks for reviewing the PR! @pan3793 Thanks for your contribution!

Pan is a highly experienced developer with extensive community development experience. Although he is not yet a Hadoop committer, he has great potential to become one in the future.

pan3793 added a commit to pan3793/hadoop that referenced this pull request Aug 29, 2024
steveloughran pushed a commit that referenced this pull request Aug 30, 2024
KeeProMise pushed a commit to KeeProMise/hadoop that referenced this pull request Sep 9, 2024
Hexiaoqiao pushed a commit to Hexiaoqiao/hadoop that referenced this pull request Sep 12, 2024
This pull request was closed.