
[WIP] Add paddle backend in devel #4157

Draft · wants to merge 46 commits into base: devel

Conversation

@HydrogenSulfate (Contributor) commented Sep 23, 2024

Add water/se_e2_a train/test/lammps

TODO:

Summary by CodeRabbit

  • New Features

    • Enhanced Paddle support, enabling users to utilize the PaddlePaddle library for model training and inference.
    • Introduced a new DPAtomicModel class for atomic predictions of physical properties, improving model flexibility and accuracy.
    • Added functionality to locate and manage Paddle dependencies within the environment.
  • Bug Fixes

    • Improved handling of Paddle library detection and version management.
  • Documentation

    • Updated documentation to reflect new functionalities and improved clarity on Paddle support.
  • Chores

    • Streamlined pre-commit configurations by disabling unused hooks.

@coderabbitai bot (Contributor) commented Sep 23, 2024

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (4)
  • doc/freeze/freeze.md
  • doc/install/install-from-source.md
  • source/CMakeLists.txt
  • source/install/build_cc_pd.sh

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The changes introduce PaddlePaddle support within the DeepMD framework, involving updates to the README.md to reflect new backend compatibility and the addition of the DPAtomicModel class in dp_atomic_model.py. The new class provides atomic predictions for physical properties and includes methods for model management, serialization, and neighbor selection. Overall, these modifications enhance the framework's functionality by integrating Paddle alongside existing backends like TensorFlow and PyTorch.

Changes

Changed files:
  • README.md — Updated to include Paddle as a supported backend alongside TensorFlow and PyTorch, with version notes reflecting this change.
  • deepmd/pd/model/atomic_model/dp_atomic_model.py — Introduced the DPAtomicModel class for atomic predictions, including methods for managing atomic data and model state.


@HydrogenSulfate HydrogenSulfate marked this pull request as draft September 23, 2024 07:34
@github-advanced-security bot left a comment


CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 7

Outside diff range and nitpick comments (9)
deepmd/pd/infer/inference.py (2)

20-22: Consider removing or explaining the commented-out code.

The commented-out code block appears to be checking the paddle version. If this code is no longer needed, it's best to remove it. If it's intended for future use, consider adding a comment explaining its purpose and when it might be uncommented.

Tools
GitHub Check: CodeQL

[notice] 20-21: Commented-out code
This comment appears to contain commented-out code.


63-66: TODO: Implement JIT compilation or remove if not needed.

The JIT compilation is currently not implemented and raises a NotImplementedError. If JIT compilation is planned for future implementation, consider adding a TODO comment explaining the plan. If it's not needed, you may want to remove the if JIT: block entirely.

Would you like me to create a GitHub issue to track the implementation of JIT compilation?

backend/read_env.py (1)

Line range hint 1-150: Summary: Paddle backend support implementation

The changes to backend/read_env.py successfully introduce Paddle support in a manner consistent with the existing TensorFlow and PyTorch implementations. The modifications include:

  1. Adding necessary imports for Paddle-related functions.
  2. Implementing Paddle support in the get_argument_from_env function.
  3. Updating the return value of get_argument_from_env to include the Paddle version.

While the overall implementation is good, there are a couple of minor issues that need to be addressed:

  1. A variable assignment error where pt_version is used instead of pd_version.
  2. Potential use of an uninitialized pd_version variable when Paddle is disabled.

Once these issues are resolved, the Paddle backend support will be properly integrated into the build configuration process.
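Both issues above stem from the version variable only being assigned inside the Paddle-enabled branch. A minimal sketch of the combined fix — the function name, flag, and version string below are illustrative, assuming the Paddle branch mirrors the existing TensorFlow/PyTorch pattern in `get_argument_from_env`:

```python
def get_paddle_arguments(paddle_enabled: bool):
    """Mimics the Paddle branch of get_argument_from_env (names are illustrative)."""
    cmake_args = []
    # Bind pd_version unconditionally so it is defined even when Paddle
    # support is disabled, avoiding the "potentially uninitialized" path.
    pd_version = None
    if paddle_enabled:
        # In the real code this would come from find_paddle(); assigning
        # to pd_version (not pt_version) fixes the variable-name mix-up.
        pd_version = "2.6.1"  # placeholder version string
        cmake_args.append("-DENABLE_PADDLE=ON")
    return cmake_args, pd_version
```

With this shape, the return statement at the end of the function can safely include `pd_version` regardless of whether the Paddle branch executed.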

deepmd/pd/cxx_op.py (2)

75-76: Fix typographical errors in error message

In lines 75-76, there are typographical errors:

  • "execuating" should be "executing".
  • "you may consider to install" should be "you may consider installing".

Apply this diff to correct the typos:

- "PyPI, you may consider to install deepmd-kit execuating "
+ "PyPI, you may consider installing deepmd-kit by executing "

80-80: Correct typo in error message

In line 80, the word "inconsitent" should be corrected to "inconsistent".

Apply this diff to correct the typo:

- "This deepmd-kit package is inconsitent with Paddle "
+ "This deepmd-kit package is inconsistent with Paddle "
deepmd/pd/entrypoints/main.py (4)

388-392: Remove or uncomment the commented-out code

The lines appear to contain commented-out code for handling .pdmodel files. If this code is no longer needed, consider removing it to improve readability. If it's planned for future implementation, add a comment explaining its purpose.

Tools
GitHub Check: CodeQL

[notice] 388-392: Commented-out code
This comment appears to contain commented-out code.


452-457: Remove or uncomment the commented-out code

Similar to earlier, these lines contain commented-out code related to .pdmodel files. To enhance code maintainability, remove unused code or provide context if it's intended for future use.

Tools
GitHub Check: CodeQL

[notice] 452-457: Commented-out code
This comment appears to contain commented-out code.


577-580: Simplify conditional assignment using a ternary operator

You can streamline the assignment of FLAGS by using a ternary operator, making the code more concise.

Apply this diff to refactor the code:

-if not isinstance(args, argparse.Namespace):
-    FLAGS = parse_args(args=args)
-else:
-    FLAGS = args
+FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args
Tools
Ruff

577-580: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)


85-87: Consider removing commented-out import statements

The lines contain commented-out import statements. If these imports are no longer necessary, remove them to clean up the code. If they are kept for future reference, consider adding a comment explaining their intended use.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 3068869 and 8d53aec.

Files selected for processing (7)
  • backend/find_paddle.py (1 hunks)
  • backend/read_env.py (3 hunks)
  • deepmd/backend/paddle.py (1 hunks)
  • deepmd/pd/init.py (1 hunks)
  • deepmd/pd/cxx_op.py (1 hunks)
  • deepmd/pd/entrypoints/main.py (1 hunks)
  • deepmd/pd/infer/inference.py (1 hunks)
Files skipped from review as they are similar to previous changes (2)
  • backend/find_paddle.py
  • deepmd/backend/paddle.py
Additional context used
GitHub Check: CodeQL
backend/read_env.py

[failure] 150-150: Potentially uninitialized local variable
Local variable 'pd_version' may be used before it is initialized.

deepmd/pd/entrypoints/main.py

[notice] 108-108: Unused local variable
Variable local_rank is not used.


[notice] 369-369: Unused local variable
Variable extra_files is not used.


[notice] 388-392: Commented-out code
This comment appears to contain commented-out code.


[notice] 452-457: Commented-out code
This comment appears to contain commented-out code.


[warning] 561-565: Unreachable code
This statement is unreachable.

deepmd/pd/infer/inference.py

[notice] 20-21: Commented-out code
This comment appears to contain commented-out code.

Ruff
deepmd/pd/__init__.py

12-12: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

deepmd/pd/cxx_op.py

51-51: Yoda condition detected

Rewrite as pd_cxx11_abi_flag != PD_CXX11_ABI_FLAG

(SIM300)


67-67: Yoda condition detected

Rewrite as pd_py_version != PD_VERSION

(SIM300)
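For context on the SIM300 findings above: a "Yoda condition" places the constant on the left of the comparison. Both orderings are logically equivalent, so the rewrite is purely a readability change. A standalone illustration (the values here are made up; only the variable names mirror the findings):

```python
# Constants on the left trigger Ruff SIM300 ("Yoda condition").
PD_CXX11_ABI_FLAG = 0
pd_cxx11_abi_flag = 1

yoda = PD_CXX11_ABI_FLAG != pd_cxx11_abi_flag     # flagged style
natural = pd_cxx11_abi_flag != PD_CXX11_ABI_FLAG  # suggested rewrite

# Equivalent results; only the reading order changes.
assert yoda == natural
```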

deepmd/pd/entrypoints/main.py

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


369-369: Local variable extra_files is assigned to but never used

Remove assignment to unused variable extra_files

(F841)


577-580: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)

Additional comments not posted (9)
deepmd/pd/__init__.py (3)

1-7: LGTM: License and import section looks good.

The SPDX license identifier is correctly specified, and the import of customized operations is wrapped in a try block for proper error handling. The comment is clear and informative.


15-15: LGTM: PaddlePaddle import is correct.

The import statement for PaddlePaddle is standard and correctly implemented.


17-19: Please provide more context for PaddlePaddle configurations.

The code enables eager execution and all primitive operations for PaddlePaddle. While this might be necessary for your use case, it's important to consider the following:

  1. Enabling these features globally might have performance implications.
  2. The purpose and necessity of these configurations are not entirely clear from the comment.

Could you please provide more information on:

  1. Why these configurations are necessary?
  2. Are there any performance considerations we should be aware of?
  3. Is it possible to make these configurations more targeted or optional?

To better understand the impact of these configurations, we can search for their usage across the codebase:

This will help us understand if these configurations are used consistently across the project.

deepmd/pd/infer/inference.py (2)

1-18: LGTM: Imports and license identifier look good.

The necessary modules and functions are imported correctly, and the SPDX license identifier is properly placed at the top of the file.


25-62: LGTM: Tester class implementation looks solid.

The Tester class is well-structured and handles both single-task and multi-task scenarios appropriately. The initialization process, including loading and preparing the model state dictionary, is implemented clearly with proper error checking for multi-task cases.

backend/read_env.py (2)

16-18: LGTM: New imports for Paddle support.

The new import statements for Paddle-related functions are correctly placed and consistent with the file's purpose.


150-150: LGTM: Updated return statement.

The addition of pd_version to the return tuple is consistent with the function's purpose and follows the existing pattern for other backend versions.

Tools
GitHub Check: CodeQL

[failure] 150-150: Potentially uninitialized local variable
Local variable 'pd_version' may be used before it is initialized.

deepmd/pd/cxx_op.py (1)

48-49: Confirm hardcoded CXX11 ABI flag

At lines 48-49, pd_cxx11_abi_flag is hardcoded to 0, and the line that dynamically retrieves the ABI flag using paddle.compiled_with_cxx11_abi() is commented out. Please verify if hardcoding this value is intentional. If possible, consider uncommenting the line to obtain the actual ABI flag from Paddle, ensuring the check reflects the runtime environment.

deepmd/pd/entrypoints/main.py (1)

561-565: Unreachable code after NotImplementedError in change_bias

The code after raise NotImplementedError is unreachable in the change_bias function. To avoid confusion, remove the unreachable code or adjust the control flow to ensure the code is reachable if intended.

Apply this diff to address the issue:

+if False:  # Placeholder for future implementation
    output_path = (
        FLAGS.output
        if FLAGS.output is not None
        else FLAGS.INPUT.replace(".pdparams", "_updated.pdparams")
    )

Or remove the code entirely if it's not needed.

Likely invalid or redundant comment.

Tools
GitHub Check: CodeQL

[warning] 561-565: Unreachable code
This statement is unreachable.

Comment on lines +9 to +13
__all__ = [
"ENABLE_CUSTOMIZED_OP",
]
except Exception as e:
__all__ = []

Improve exception handling and remove unused variable.

While the conditional definition of __all__ is good practice, there are a couple of improvements we can make:

  1. Consider catching more specific exceptions rather than using a broad except Exception.
  2. The variable e is assigned but never used, which is flagged by the static analysis tool.

Here's a suggested improvement:

 try:
     from deepmd.pd.cxx_op import (
         ENABLE_CUSTOMIZED_OP,
     )

     __all__ = [
         "ENABLE_CUSTOMIZED_OP",
     ]
-except Exception as e:
+except ImportError:
     __all__ = []

This change catches only ImportError, which is more specific to the operation being performed, and removes the unused e variable.

Tools
Ruff

12-12: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

# Initialize DDP
local_rank = os.environ.get("LOCAL_RANK")
if local_rank is not None:
local_rank = int(local_rank)

Variable local_rank is assigned but never used

The variable local_rank is converted to an integer but is not used afterward. Consider utilizing it if necessary or removing the assignment.

Apply this diff to fix the issue:

-local_rank = int(local_rank)
Tools
GitHub Check: CodeQL

[notice] 108-108: Unused local variable
Variable local_rank is not used.

InputSpec([-1, -1, -1], dtype="int32", name="nlist"),
],
)
extra_files = {}

Remove unused variable extra_files

The variable extra_files is assigned but never used. If it's not needed, consider removing it to improve code clarity.

Apply this diff to fix the issue:

-extra_files = {}
Tools
Ruff

369-369: Local variable extra_files is assigned to but never used

Remove assignment to unused variable extra_files

(F841)

GitHub Check: CodeQL

[notice] 369-369: Unused local variable
Variable extra_files is not used.

elif stat_file_path_single is not None:
if not Path(stat_file_path_single).exists():
if stat_file_path_single.endswith((".h5", ".hdf5")):
with h5py.File(stat_file_path_single, "w") as f:

Remove unused variable f

The variable f is assigned but never used in the with statement. To eliminate the unused variable warning, remove the variable assignment.

Apply this diff to fix the issue:

-with h5py.File(stat_file_path_single, "w") as f:
+with h5py.File(stat_file_path_single, "w"):
Tools
Ruff

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)

Comment on lines +559 to +565
raise NotImplementedError
# for .json
output_path = (
FLAGS.output
if FLAGS.output is not None
else FLAGS.INPUT.replace(".pdparams", "_updated.pdparams")
)

Unreachable code after raise NotImplementedError

The code following raise NotImplementedError is unreachable and will not be executed. To maintain code cleanliness, consider removing or restructuring this code.

Apply this diff to address the issue:

-raise NotImplementedError
-# for .json
-output_path = (
-    FLAGS.output
-    if FLAGS.output is not None
-    else FLAGS.INPUT.replace(".pdparams", "_updated.pdparams")
-)

If this code is intended for future use, consider commenting it out with an explanatory note.

Tools
GitHub Check: CodeQL

[warning] 561-565: Unreachable code
This statement is unreachable.

Comment on lines 448 to 461
if FLAGS.INPUT.endswith(".pdparams"):
old_state_dict = paddle.load(FLAGS.INPUT)
model_state_dict = copy.deepcopy(old_state_dict.get("model", old_state_dict))
model_params = model_state_dict["_extra_state"]["model_params"]
# elif FLAGS.INPUT.endswith(".pdmodel"):
# old_model = paddle.jit.load(FLAGS.INPUT[: -len(".pdmodel")])
# model_params_string = old_model.get_model_def_script()
# model_params = json.loads(model_params_string)
# old_state_dict = old_model.state_dict()
# model_state_dict = old_state_dict
else:
raise RuntimeError(
"The model provided must be a checkpoint file with a .pd extension"
# "or a frozen model with a .pdparams extension"

Inconsistent file extension handling in change_bias function

The function checks if FLAGS.INPUT ends with .pdparams, but the error message references a .pd extension. Ensure consistency in file extension handling and that error messages accurately reflect the expected input.

Apply this diff to correct the error message:

-raise RuntimeError(
-    "The model provided must be a checkpoint file with a .pd extension"
-)
+raise RuntimeError(
+    "The model provided must be a checkpoint file with a .pdparams extension"
+)
Tools
GitHub Check: CodeQL

[notice] 452-457: Commented-out code
This comment appears to contain commented-out code.

@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 8

Outside diff range and nitpick comments (7)
deepmd/pd/model/model/make_model.py (6)

94-94: Remove or restore the commented-out decorator @paddle.jit.export

The decorator @paddle.jit.export is commented out. If it's no longer needed, consider removing it to improve code readability. If it's required for JIT compilation, consider restoring it.

Apply this diff to remove the commented-out decorator:

-# @paddle.jit.export

298-307: Eliminate unnecessary commented-out code

There is a block of commented-out code that seems obsolete. Removing it will enhance code cleanliness and maintainability.

Apply this diff:

-###
-### type checking would not pass jit, convert to coord prec anyway
-###
-# for vv, kk in zip([fparam, aparam], ["frame", "atomic"]):
-#     if vv is not None and self.reverse_precision_dict[vv.dtype] != input_prec:
-#         log.warning(
-#           f"type of {kk} parameter {self.reverse_precision_dict[vv.dtype]}"
-#           " does not match"
-#           f" that of the coordinate {input_prec}"
-#         )

340-341: Simplify dictionary key checks by removing unnecessary .keys() method

In Python, you can check for the presence of a key directly without using the .keys() method. This simplifies the code and enhances readability.

Apply this diff to simplify the code:

-for kk in odef.keys():
+for kk in odef:
-if kk not in model_ret.keys():
+if kk not in model_ret:
Tools
Ruff

340-340: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


341-341: Use key not in dict instead of key not in dict.keys()

Remove .keys()

(SIM118)


366-366: Correct typo in docstring

There's a typographical error in the docstring: "nothong" should be "nothing".

Apply this diff:

-                it does nothong
+                it does nothing

377-377: Fix typo in comment

The word "effact" should be corrected to "effect" to accurately convey the intended meaning.

Apply this diff:

-                May have side effact on the efficiency.
+                May have side effect on the efficiency.

497-497: Remove or restore the commented-out decorator @paddle.jit.export

The decorator @paddle.jit.export is commented out. If it's not needed, consider removing it. If it's essential for JIT compilation, consider uncommenting it.

Apply this diff:

-# @paddle.jit.export
deepmd/pd/model/descriptor/se_atten.py (1)

400-404: Simplify conditional assignment with ternary operator

The if-else block can be simplified using a ternary operator for conciseness and readability.

Apply this diff to implement the improvement:

-            if callable(merged):
-                # only get data for once
-                sampled = merged()
-            else:
-                sampled = merged
+            sampled = merged() if callable(merged) else merged
Tools
Ruff

400-404: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block

(SIM108)

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 8d53aec and 50092c6.

Files selected for processing (4)
  • backend/dynamic_metadata.py (1 hunks)
  • backend/read_env.py (3 hunks)
  • deepmd/pd/model/descriptor/se_atten.py (1 hunks)
  • deepmd/pd/model/model/make_model.py (1 hunks)
Additional context used
GitHub Check: CodeQL
backend/read_env.py

[failure] 150-150: Potentially uninitialized local variable
Local variable 'pd_version' may be used before it is initialized.

deepmd/pd/model/descriptor/se_atten.py

[notice] 474-474: Unused local variable
Variable nall is not used.

Ruff
deepmd/pd/model/descriptor/se_atten.py

69-69: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


86-86: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


400-404: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block

(SIM108)


424-424: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


474-474: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)
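The repeated B006 findings above refer to Python's mutable-default-argument pitfall: a default list or dict is created once at function definition time and shared across all calls. A minimal illustration of the problem and the `None`-default fix Ruff recommends (function names here are made up for the example):

```python
# B006: the default list is evaluated once and shared across calls,
# so state leaks between invocations.
def append_bad(item, bucket=[]):
    bucket.append(item)
    return bucket


# Recommended fix: default to None and create the list inside the
# function body, so each call without an argument gets a fresh list.
def append_good(item, bucket=None):
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```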

deepmd/pd/model/model/make_model.py

340-340: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


341-341: Use key not in dict instead of key not in dict.keys()

Remove .keys()

(SIM118)

Additional comments not posted (4)
backend/read_env.py (3)

16-18: LGTM: Import statement for Paddle support.

The import statement for find_paddle is correctly placed and follows the existing import style. This addition is necessary for the new Paddle support functionality.


150-150: LGTM: Addition of pd_version to return tuple.

The addition of pd_version to the return tuple of get_argument_from_env function is correct and consistent with how TensorFlow and PyTorch versions are returned.

Tools
GitHub Check: CodeQL

[failure] 150-150: Potentially uninitialized local variable
Local variable 'pd_version' may be used before it is initialized.


156-156: LGTM: Update of set_scikit_build_env function.

The update to the set_scikit_build_env function correctly unpacks the new pd_version from the get_argument_from_env return tuple. This change is consistent with the modifications made to get_argument_from_env.

deepmd/pd/model/model/make_model.py (1)

583-599: Ensure consistency in method parameters between forward and forward_common

The forward method directly calls forward_common with the same parameters. Verify that any additional needed parameters are passed and that the method signatures remain consistent.

Run the following script to check for consistency in method signatures:

@@ -36,7 +36,7 @@ def dynamic_metadata(
settings: Optional[Dict[str, object]] = None,
):
assert field in ["optional-dependencies", "entry-points", "scripts"]
-    _, _, find_libpython_requires, extra_scripts, tf_version, pt_version = (
+    _, _, find_libpython_requires, extra_scripts, tf_version, pt_version, pd_version = (

💡 Codebase verification

⚠️ Potential issue

Unused pd_version Variable Detected

The variable pd_version is unpacked in backend/dynamic_metadata.py at line 39 but is not used within the dynamic_metadata function. This may lead to unused variable warnings and could indicate incomplete implementation.

  • If pd_version is intended to be used:
    • Update the function to utilize pd_version where necessary.
    • For example, incorporate pd_version in the optional-dependencies section:
      elif field == "optional-dependencies":
          optional_dependencies = pyproject["tool"]["deepmd_build_backend"][
              "optional-dependencies"
          ]
          optional_dependencies["lmp"].extend(find_libpython_requires)
          optional_dependencies["ipi"].extend(find_libpython_requires)
          return {
              **optional_dependencies,
              **get_tf_requirement(tf_version),
              **get_pt_requirement(pt_version),
              **get_pd_requirement(pd_version),  # Add this line
          }
  • If pd_version is not needed:
    • Remove pd_version from the unpacking to avoid unused variable warnings:
      _, _, find_libpython_requires, extra_scripts, tf_version, pt_version = (
Analysis chain

Address inconsistency and unused variable

  1. There's an inconsistency between the AI-generated summary and the actual code. The summary mentions a change in the function signature, but this is not reflected in the provided code. Please clarify if there are missing changes or if the summary is incorrect.

  2. The addition of pd_version suggests the introduction of Paddle support, which aligns with the PR objectives. However, pd_version is currently not used in the rest of the function.

To address these issues:

  1. If pd_version is intended to be a parameter, update the function signature:
def dynamic_metadata(
    field: str,
    settings: Optional[Dict[str, object]] = None,
    pd_version: Optional[str] = None,
):
  1. Implement the usage of pd_version in the function body, possibly in the optional-dependencies section:
elif field == "optional-dependencies":
    optional_dependencies = pyproject["tool"]["deepmd_build_backend"][
        "optional-dependencies"
    ]
    optional_dependencies["lmp"].extend(find_libpython_requires)
    optional_dependencies["ipi"].extend(find_libpython_requires)
    return {
        **optional_dependencies,
        **get_tf_requirement(tf_version),
        **get_pt_requirement(pt_version),
        **get_pd_requirement(pd_version),  # Add this line
    }
  1. If pd_version is not needed, remove it from the unpacking to avoid unused variable warnings.

To ensure pd_version is used consistently across the codebase, run the following script:

This will help verify the usage of pd_version and identify any Paddle-related functions that might need to be implemented or updated.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for occurrences of pd_version and potential Paddle-related functions

# Search for pd_version
echo "Occurrences of pd_version:"
rg 'pd_version' --type python

# Search for potential Paddle-related functions
echo "\nPotential Paddle-related functions:"
rg 'get_pd_requirement|paddle' --type python

Length of output: 286


Script:

#!/bin/bash
# Description: Check for occurrences of pd_version and potential Paddle-related functions

# Search for pd_version in Python files
echo "Occurrences of pd_version:"
rg 'pd_version' --type py

# Search for potential Paddle-related functions in Python files
echo "\nPotential Paddle-related functions:"
rg 'get_pd_requirement|paddle' --type py

Length of output: 167111
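The suggested fix relies on Python's `**`-unpacking merge, where each backend contributes its own extras and later entries win on key collisions. A self-contained sketch with made-up requirement dicts (`get_pd_requirement` and the pinned versions are hypothetical):

```python
optional_dependencies = {"lmp": ["lammps"], "ipi": ["ipi"]}
tf_requirement = {"tf": ["tensorflow>=2.7"]}   # e.g. get_tf_requirement(tf_version)
pt_requirement = {"pt": ["torch>=2.0"]}        # e.g. get_pt_requirement(pt_version)
pd_requirement = {"pd": ["paddlepaddle>=3.0"]} # hypothetical get_pd_requirement(pd_version)

# Each dict is unpacked into one merged mapping of optional extras.
merged = {
    **optional_dependencies,
    **tf_requirement,
    **pt_requirement,
    **pd_requirement,
}
```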

Comment on lines 126 to 137
if os.environ.get("DP_ENABLE_PADDLE", "0") == "1":
pd_install_dir, _ = find_paddle()
pt_version = get_pt_version(pd_install_dir)
cmake_args.extend(
[
"-DENABLE_PADDLE=ON",
f"-DCMAKE_PREFIX_PATH={pd_install_dir}",
]
)
else:
cmake_args.append("-DENABLE_PADDLE=OFF")
pd_version = None

⚠️ Potential issue

Fix variable assignment for Paddle version.

The implementation for Paddle support looks good overall, but there's still an issue to address:

On line 128, pt_version is assigned instead of pd_version. This appears to be a copy-paste error that was not fixed in the previous review.

Please apply the following change:

 if os.environ.get("DP_ENABLE_PADDLE", "0") == "1":
     pd_install_dir, _ = find_paddle()
-    pt_version = get_pt_version(pd_install_dir)
+    pd_version = get_pt_version(pd_install_dir)
     cmake_args.extend(
         [
             "-DENABLE_PADDLE=ON",
             f"-DCMAKE_PREFIX_PATH={pd_install_dir}",
         ]
     )
 else:
     cmake_args.append("-DENABLE_PADDLE=OFF")
     pd_version = None

This change will correctly assign the Paddle version. Note that the call is still to a helper named get_pt_version; if a Paddle-specific get_pd_version helper exists, it should be used here instead.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if os.environ.get("DP_ENABLE_PADDLE", "0") == "1":
pd_install_dir, _ = find_paddle()
pt_version = get_pt_version(pd_install_dir)
cmake_args.extend(
[
"-DENABLE_PADDLE=ON",
f"-DCMAKE_PREFIX_PATH={pd_install_dir}",
]
)
else:
cmake_args.append("-DENABLE_PADDLE=OFF")
pd_version = None
if os.environ.get("DP_ENABLE_PADDLE", "0") == "1":
pd_install_dir, _ = find_paddle()
pd_version = get_pt_version(pd_install_dir)
cmake_args.extend(
[
"-DENABLE_PADDLE=ON",
f"-DCMAKE_PREFIX_PATH={pd_install_dir}",
]
)
else:
cmake_args.append("-DENABLE_PADDLE=OFF")
pd_version = None
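The corrected branch can be isolated as a small function for clarity; in this sketch `find_paddle` and `get_pd_version` are stand-ins for the real helpers, and their signatures are assumptions:

```python
import os

def paddle_cmake_args(find_paddle, get_pd_version):
    """Env-gated Paddle build flags: DP_ENABLE_PADDLE=1 turns the backend on."""
    cmake_args = []
    if os.environ.get("DP_ENABLE_PADDLE", "0") == "1":
        pd_install_dir, _ = find_paddle()
        pd_version = get_pd_version(pd_install_dir)  # pd_, not pt_
        cmake_args += [
            "-DENABLE_PADDLE=ON",
            f"-DCMAKE_PREFIX_PATH={pd_install_dir}",
        ]
    else:
        cmake_args.append("-DENABLE_PADDLE=OFF")
        pd_version = None
    return cmake_args, pd_version
```

Keeping the assignment in one place makes the copy-paste error (`pt_version` vs `pd_version`) easy to spot and test.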

axis=-1,
)

if True: # TODO: Fix controlflow + backward in PIR static graph

⚠️ Potential issue

Revisit the use of the if True statement

Using if True: is redundant and may hinder code clarity. The comment suggests there is a TODO related to fixing control flow and backward compatibility in the PIR static graph. Consider addressing the TODO or removing the redundant if True: block.

Apply this diff to remove the redundant if True::

-if True:  # TODO: Fix controlflow + backward in PIR static graph
+# TODO: Fix controlflow + backward in PIR static graph
Committable suggestion


Suggested change
if True: # TODO: Fix controlflow + backward in PIR static graph
# TODO: Fix controlflow + backward in PIR static graph

Comment on lines +38 to +39
RESERVED_PRECISON_DICT,
)

⚠️ Potential issue

Correct the typo in the imported variable RESERVED_PRECISON_DICT

There's a possible typo in the variable name RESERVED_PRECISON_DICT. It should likely be RESERVED_PRECISION_DICT. Correcting this will prevent confusion and potential errors.

Apply this diff:

-    RESERVED_PRECISON_DICT,
+    RESERVED_PRECISION_DICT,

Also, update all occurrences of RESERVED_PRECISON_DICT in the code:

-        self.reverse_precision_dict = RESERVED_PRECISON_DICT
+        self.reverse_precision_dict = RESERVED_PRECISION_DICT
Committable suggestion


Suggested change
RESERVED_PRECISON_DICT,
)
RESERVED_PRECISION_DICT,
)

nframes, nloc, nnei = nlist.shape
atype = extended_atype[:, :nloc]
nb = nframes
nall = extended_coord.reshape([nb, -1, 3]).shape[1]

⚠️ Potential issue

Remove unused variable nall

The local variable nall is assigned but never used afterward, which may cause confusion.

Apply this diff to fix the issue:

-            nall = extended_coord.reshape([nb, -1, 3]).shape[1]
Committable suggestion


Suggested change
nall = extended_coord.reshape([nb, -1, 3]).shape[1]
Tools
Ruff

474-474: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)

GitHub Check: CodeQL

[notice] 474-474: Unused local variable
Variable nall is not used.


def reinit_exclude(
self,
exclude_types: List[Tuple[int, int]] = [],

⚠️ Potential issue

Avoid using mutable default arguments

The parameter exclude_types in reinit_exclude method uses a mutable default argument []. It's advisable to set the default to None and initialize within the method to prevent unexpected behaviors.

Apply this diff to fix the issue:

-        def reinit_exclude(
-            self,
-            exclude_types: List[Tuple[int, int]] = [],
-        ):
+        def reinit_exclude(
+            self,
+            exclude_types: Optional[List[Tuple[int, int]]] = None,
+        ):
+            if exclude_types is None:
+                exclude_types = []
Committable suggestion


Suggested change
exclude_types: List[Tuple[int, int]] = [],
def reinit_exclude(
self,
exclude_types: Optional[List[Tuple[int, int]]] = None,
):
if exclude_types is None:
exclude_types = []
Tools
Ruff

424-424: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)
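The danger behind B006 is that a default argument is evaluated once, at function definition time, so a mutable default is shared across all calls. A minimal illustration (function names are made up):

```python
def append_bad(item, acc=[]):
    # The default list is created once, when the function is defined,
    # so every call without `acc` mutates the same shared object.
    acc.append(item)
    return acc

def append_good(item, acc=None):
    # A fresh list is created on each call instead.
    if acc is None:
        acc = []
    acc.append(item)
    return acc

append_bad(1)
append_bad(2)  # the shared default now holds [1, 2]
```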

temperature=None,
smooth: bool = True,
type_one_side: bool = False,
exclude_types: List[Tuple[int, int]] = [],

⚠️ Potential issue

Avoid using mutable default arguments

The parameter exclude_types uses a mutable default argument []. Using mutable default arguments can lead to unexpected behaviors. It's recommended to use None as the default value and initialize the list within the function.

Apply this diff to fix the issue:

-            exclude_types: List[Tuple[int, int]] = [],
+            exclude_types: Optional[List[Tuple[int, int]]] = None,

And initialize inside the __init__ method:

+            if exclude_types is None:
+                exclude_types = []

Committable suggestion was skipped due to low confidence.

Tools
Ruff

86-86: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

rcut_smth: float,
sel: Union[List[int], int],
ntypes: int,
neuron: list = [25, 50, 100],

⚠️ Potential issue

Avoid using mutable default arguments

The parameter neuron uses a mutable default argument [25, 50, 100]. Using mutable default arguments can lead to unexpected behaviors because default argument values are evaluated only once at function definition time. It's best practice to set default arguments to None and initialize them within the function.

Apply this diff to fix the issue:

-            neuron: list = [25, 50, 100],
+            neuron: Optional[List[int]] = None,

And initialize inside the __init__ method:

+            if neuron is None:
+                neuron = [25, 50, 100]

Don't forget to import Optional from typing if not already imported.

Committable suggestion was skipped due to low confidence.

Tools
Ruff

69-69: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (8)
deepmd/pd/entrypoints/main.py (5)

243-342: Consider refactoring train function for improved modularity.

The train function is well-implemented with good error handling and logging. However, it's quite long and handles multiple responsibilities. Consider breaking it down into smaller, more focused functions to improve readability and maintainability. For example, you could create separate functions for:

  1. Configuration processing
  2. Multi-task setup
  3. Fine-tuning setup
  4. Neighbor statistics calculation

This refactoring would make the main train function more concise and easier to understand.

On a positive note, the error handling and logging throughout the function are commendable, providing clear feedback and traceability.


345-375: LGTM: freeze function is well-implemented.

The freeze function is concise and effectively freezes the model for inference. The use of Paddle's static graph API and the informative log message at the end are good practices.

Consider adding a try-except block to handle potential errors during the model freezing process, which could provide more informative error messages to the user.

Example:

try:
    paddle.jit.save(
        model,
        path=FLAGS.output,
        skip_prune_program=True,
    )
    suffix = "json" if PIR_ENABLED.lower() in ["true", "1"] else "pdmodel"
    log.info(
        f"Paddle inference model has been exported to: {FLAGS.output}.{suffix}(.pdiparams)"
    )
except Exception as e:
    log.error(f"Error occurred while freezing the model: {str(e)}")
    raise

377-439: LGTM: show function is well-implemented with good error handling.

The show function effectively displays various attributes of the model, handling both single-task and multi-task models. The error handling and informative output are commendable.

There's commented-out code for handling .pdmodel files (lines 384-387). If this functionality is planned for future implementation, consider adding a TODO comment to make it clear for other developers.

Example TODO comment:

# TODO: Implement support for .pdmodel files
# elif FLAGS.INPUT.split(".")[-1] == "pdmodel":
#     model_params_string = paddle.jit.load(
#         FLAGS.INPUT[: -len(".pdmodel")]
#     ).model_def_script
#     model_params = json.loads(model_params_string)
🧰 Tools
🪛 GitHub Check: CodeQL

[notice] 383-387: Commented-out code
This comment appears to contain commented-out code.


572-603: LGTM: main function is well-structured.

The main function serves as a clear entry point for the script, effectively handling different commands. The structure is concise and easy to understand.

As suggested by the static analysis tool, you can use a ternary operator for the FLAGS assignment to make the code more concise:

Apply this diff to use a ternary operator:

-    if not isinstance(args, argparse.Namespace):
-        FLAGS = parse_args(args=args)
-    else:
-        FLAGS = args
+    FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args
🧰 Tools
🪛 Ruff

573-576: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)


1-606: Overall, the implementation is solid with room for some improvements.

The deepmd/pd/entrypoints/main.py file provides a comprehensive set of functionalities for model management in the DeepMD framework using Paddle. The code is generally well-structured with good error handling, logging, and type hinting.

Areas for improvement:

  1. Some functions (e.g., get_trainer, train, change_bias) are quite long and could benefit from further modularization to enhance readability and maintainability.
  2. There are a few instances of commented-out code that could be removed or turned into TODO comments if the functionality is planned for future implementation.
  3. Address the unused variable local_rank in the get_trainer function.

The use of type hints, clear error messages, and comprehensive logging are commendable practices that enhance the code's robustness and maintainability.

🧰 Tools
🪛 Ruff

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


573-576: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)

🪛 GitHub Check: CodeQL

[notice] 108-108: Unused local variable
Variable local_rank is not used.


[notice] 383-387: Commented-out code
This comment appears to contain commented-out code.


[notice] 447-452: Commented-out code
This comment appears to contain commented-out code.


[warning] 557-561: Unreachable code
This statement is unreachable.

deepmd/pd/model/atomic_model/dp_atomic_model.py (3)

233-234: Optimize by removing unused variables

In the forward_atomic method, variables rot_mat, g2, h2, and sw are returned from self.descriptor(...) but are not used afterwards. Removing these unused variables can improve readability and performance.

Apply this diff to remove unused variables:

-            descriptor, rot_mat, g2, h2, sw = self.descriptor(
+            descriptor, _, _, _, _ = self.descriptor(

Alternatively, if the descriptor returns only the needed value:

-            descriptor, rot_mat, g2, h2, sw = self.descriptor(
+            descriptor = self.descriptor(

Adjust self.descriptor accordingly if possible.


259-260: Align parameter names in documentation

In the compute_or_load_stat method, the documentation refers to sampled, but the parameter is named sampled_func. For consistency and clarity, the documentation should use the actual parameter name.

Apply this diff to correct the parameter name in the documentation:

-            When `sampled` is provided, all the statistics parameters will be calculated (or re-calculated for update),
+            When `sampled_func` is provided, all the statistics parameters will be calculated (or re-calculated for update),

299-300: Simplify method documentation

In the get_dim_fparam method, the documentation can be streamlined for clarity.

Apply this diff to simplify the documentation:

-            """Get the number (dimension) of frame parameters of this atomic model."""
+            """Get the dimension of frame parameters."""
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 50092c6 and 892fd80.

📒 Files selected for processing (2)
  • deepmd/pd/entrypoints/main.py (1 hunks)
  • deepmd/pd/model/atomic_model/dp_atomic_model.py (1 hunks)
🧰 Additional context used
🪛 Ruff
deepmd/pd/entrypoints/main.py

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


573-576: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)

🪛 GitHub Check: CodeQL
deepmd/pd/entrypoints/main.py

[notice] 108-108: Unused local variable
Variable local_rank is not used.


[notice] 383-387: Commented-out code
This comment appears to contain commented-out code.


[notice] 447-452: Commented-out code
This comment appears to contain commented-out code.


[warning] 557-561: Unreachable code
This statement is unreachable.

deepmd/pd/model/atomic_model/dp_atomic_model.py

[warning] 60-60: Overwriting attribute in super-class or sub-class
Assignment overwrites attribute type_map, which was previously defined in superclass BaseAtomicModel.

🔇 Additional comments (2)
deepmd/pd/entrypoints/main.py (2)

1-90: LGTM: Imports and global setup look good.

The imports and global setup are well-organized and follow best practices. The use of type hints and the separation of standard library imports from third-party imports is commendable.


206-240: LGTM: SummaryPrinter class is well-implemented.

The SummaryPrinter class is well-structured, extends BaseSummaryPrinter appropriately, and provides clear, focused methods for retrieving backend information. The implementation follows good object-oriented design principles.

Comment on lines +442 to +568
f"For multitask model, the model branch must be in the 'model_dict'! "
f"Available options are : {list(model_params['model_dict'].keys())}."
)
log.info(f"Changing out bias for model {model_branch}.")
model = training.get_model_for_wrapper(model_params)
type_map = (
model_params["type_map"]
if not multi_task
else model_params["model_dict"][model_branch]["type_map"]
)
model_to_change = model if not multi_task else model[model_branch]
if FLAGS.INPUT.endswith(".pd"):
wrapper = ModelWrapper(model)
wrapper.set_state_dict(old_state_dict["model"])
else:
# for .pdparams
model.set_state_dict(old_state_dict)

if FLAGS.bias_value is not None:
# use user-defined bias
assert model_to_change.model_type in [
"ener"
], "User-defined bias is only available for energy model!"
assert (
len(FLAGS.bias_value) == len(type_map)
), f"The number of elements in the bias should be the same as that in the type_map: {type_map}."
old_bias = model_to_change.get_out_bias()
bias_to_set = paddle.to_tensor(
FLAGS.bias_value, dtype=old_bias.dtype, place=old_bias.place
).reshape(old_bias.shape)
model_to_change.set_out_bias(bias_to_set)
log.info(
f"Change output bias of {type_map!s} "
f"from {to_numpy_array(old_bias).reshape(-1)!s} "
f"to {to_numpy_array(bias_to_set).reshape(-1)!s}."
)
updated_model = model_to_change
else:
# calculate bias on given systems
if FLAGS.datafile is not None:
with open(FLAGS.datafile) as datalist:
all_sys = datalist.read().splitlines()
else:
all_sys = expand_sys_str(FLAGS.system)
data_systems = process_systems(all_sys)
data_single = DpLoaderSet(
data_systems,
1,
type_map,
)
mock_loss = training.get_loss(
{"inference": True}, 1.0, len(type_map), model_to_change
)
data_requirement = mock_loss.label_requirement
data_requirement += training.get_additional_data_requirement(model_to_change)
data_single.add_data_requirement(data_requirement)
nbatches = FLAGS.numb_batch if FLAGS.numb_batch != 0 else float("inf")
sampled_data = make_stat_input(
data_single.systems,
data_single.dataloaders,
nbatches,
)
updated_model = training.model_change_out_bias(
model_to_change, sampled_data, _bias_adjust_mode=bias_adjust_mode
)

if not multi_task:
model = updated_model
else:
model[model_branch] = updated_model

if FLAGS.INPUT.endswith(".pd"):
output_path = (
FLAGS.output
if FLAGS.output is not None
else FLAGS.INPUT.replace(".pd", "_updated.pd")
)
wrapper = ModelWrapper(model)
if "model" in old_state_dict:
old_state_dict["model"] = wrapper.state_dict()
old_state_dict["model"]["_extra_state"] = model_state_dict["_extra_state"]
else:
old_state_dict = wrapper.state_dict()
old_state_dict["_extra_state"] = model_state_dict["_extra_state"]
paddle.save(old_state_dict, output_path)
else:
raise NotImplementedError
# for .json
output_path = (
FLAGS.output
if FLAGS.output is not None
else FLAGS.INPUT.replace(".pdparams", "_updated.pdparams")
)
model = paddle.jit.to_static(model)
paddle.jit.save(
model,
output_path,
{},
)
log.info(f"Saved model to {output_path}")

🛠️ Refactor suggestion

⚠️ Potential issue

Consider refactoring change_bias function and addressing incomplete functionality.

The change_bias function is well-implemented with good error handling and type checking. However, it's quite long and could benefit from further modularization. Consider breaking it down into smaller, more focused functions to improve readability and maintainability.

There's commented-out code and a NotImplementedError for handling .json files (lines 447-452 and 555-567). If this functionality is planned for future implementation, consider adding a TODO comment. If not, it might be better to remove the commented-out code to improve code clarity.

Additionally, there's unreachable code after the NotImplementedError (lines 557-561). This code should be removed or placed before the raise NotImplementedError statement if it's intended to be used in the future.

Apply this diff to address the unreachable code:

-        raise NotImplementedError
-        # for .json
-        output_path = (
-            FLAGS.output
-            if FLAGS.output is not None
-            else FLAGS.INPUT.replace(".pdparams", "_updated.pdparams")
-        )
+        # TODO: Implement support for .json files
+        raise NotImplementedError("Support for .json files is not yet implemented")

Committable suggestion was skipped due to low confidence.

🧰 Tools
🪛 GitHub Check: CodeQL

[notice] 447-452: Commented-out code
This comment appears to contain commented-out code.


[warning] 557-561: Unreachable code
This statement is unreachable.

Comment on lines 93 to 203

# Initialize DDP
local_rank = os.environ.get("LOCAL_RANK")
if local_rank is not None:
local_rank = int(local_rank)
assert dist.is_nccl_available()
dist.init_process_group(backend="nccl")

def prepare_trainer_input_single(
model_params_single, data_dict_single, rank=0, seed=None
):
training_dataset_params = data_dict_single["training_data"]
validation_dataset_params = data_dict_single.get("validation_data", None)
validation_systems = (
validation_dataset_params["systems"] if validation_dataset_params else None
)
training_systems = training_dataset_params["systems"]
training_systems = process_systems(training_systems)
if validation_systems is not None:
validation_systems = process_systems(validation_systems)

# stat files
stat_file_path_single = data_dict_single.get("stat_file", None)
if rank != 0:
stat_file_path_single = None
elif stat_file_path_single is not None:
if not Path(stat_file_path_single).exists():
if stat_file_path_single.endswith((".h5", ".hdf5")):
with h5py.File(stat_file_path_single, "w") as f:
pass
else:
Path(stat_file_path_single).mkdir()
stat_file_path_single = DPPath(stat_file_path_single, "a")

# validation and training data
# avoid the same batch sequence among devices
rank_seed = (seed + rank) % (2**32) if seed is not None else None
validation_data_single = (
DpLoaderSet(
validation_systems,
validation_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
if validation_systems
else None
)
train_data_single = DpLoaderSet(
training_systems,
training_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
return (
train_data_single,
validation_data_single,
stat_file_path_single,
)

rank = dist.get_rank() if dist.is_available() and dist.is_initialized() else 0
data_seed = config["training"].get("seed", None)
if not multi_task:
(
train_data,
validation_data,
stat_file_path,
) = prepare_trainer_input_single(
config["model"],
config["training"],
rank=rank,
seed=data_seed,
)
else:
train_data, validation_data, stat_file_path = {}, {}, {}
for model_key in config["model"]["model_dict"]:
(
train_data[model_key],
validation_data[model_key],
stat_file_path[model_key],
) = prepare_trainer_input_single(
config["model"]["model_dict"][model_key],
config["training"]["data_dict"][model_key],
rank=rank,
seed=data_seed,
)

trainer = training.Trainer(
config,
train_data,
stat_file_path=stat_file_path,
validation_data=validation_data,
init_model=init_model,
restart_model=restart_model,
finetune_model=finetune_model,
force_load=force_load,
shared_links=shared_links,
finetune_links=finetune_links,
init_frz_model=init_frz_model,
)
return trainer

🛠️ Refactor suggestion

⚠️ Potential issue

Consider refactoring get_trainer for improved readability and maintainability.

The get_trainer function is well-structured but quite long. Consider breaking it down into smaller, more focused functions to improve readability and maintainability. For example, you could create separate functions for preparing single-task and multi-task inputs.

Additionally, there's an unused variable local_rank on line 108. If it's not needed, you can remove it to improve code clarity.

Apply this diff to remove the unused variable:

-    local_rank = int(local_rank)
📝 Committable suggestion


Suggested change
def get_trainer(
config,
init_model=None,
restart_model=None,
finetune_model=None,
force_load=False,
init_frz_model=None,
shared_links=None,
finetune_links=None,
):
multi_task = "model_dict" in config.get("model", {})
# Initialize DDP
local_rank = os.environ.get("LOCAL_RANK")
if local_rank is not None:
local_rank = int(local_rank)
assert dist.is_nccl_available()
dist.init_process_group(backend="nccl")
def prepare_trainer_input_single(
model_params_single, data_dict_single, rank=0, seed=None
):
training_dataset_params = data_dict_single["training_data"]
validation_dataset_params = data_dict_single.get("validation_data", None)
validation_systems = (
validation_dataset_params["systems"] if validation_dataset_params else None
)
training_systems = training_dataset_params["systems"]
training_systems = process_systems(training_systems)
if validation_systems is not None:
validation_systems = process_systems(validation_systems)
# stat files
stat_file_path_single = data_dict_single.get("stat_file", None)
if rank != 0:
stat_file_path_single = None
elif stat_file_path_single is not None:
if not Path(stat_file_path_single).exists():
if stat_file_path_single.endswith((".h5", ".hdf5")):
with h5py.File(stat_file_path_single, "w") as f:
pass
else:
Path(stat_file_path_single).mkdir()
stat_file_path_single = DPPath(stat_file_path_single, "a")
# validation and training data
# avoid the same batch sequence among devices
rank_seed = (seed + rank) % (2**32) if seed is not None else None
validation_data_single = (
DpLoaderSet(
validation_systems,
validation_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
if validation_systems
else None
)
train_data_single = DpLoaderSet(
training_systems,
training_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
return (
train_data_single,
validation_data_single,
stat_file_path_single,
)
rank = dist.get_rank() if dist.is_available() and dist.is_initialized() else 0
data_seed = config["training"].get("seed", None)
if not multi_task:
(
train_data,
validation_data,
stat_file_path,
) = prepare_trainer_input_single(
config["model"],
config["training"],
rank=rank,
seed=data_seed,
)
else:
train_data, validation_data, stat_file_path = {}, {}, {}
for model_key in config["model"]["model_dict"]:
(
train_data[model_key],
validation_data[model_key],
stat_file_path[model_key],
) = prepare_trainer_input_single(
config["model"]["model_dict"][model_key],
config["training"]["data_dict"][model_key],
rank=rank,
seed=data_seed,
)
trainer = training.Trainer(
config,
train_data,
stat_file_path=stat_file_path,
validation_data=validation_data,
init_model=init_model,
restart_model=restart_model,
finetune_model=finetune_model,
force_load=force_load,
shared_links=shared_links,
finetune_links=finetune_links,
init_frz_model=init_frz_model,
)
return trainer
def get_trainer(
config,
init_model=None,
restart_model=None,
finetune_model=None,
force_load=False,
init_frz_model=None,
shared_links=None,
finetune_links=None,
):
multi_task = "model_dict" in config.get("model", {})
# Initialize DDP
local_rank = os.environ.get("LOCAL_RANK")
if local_rank is not None:
assert dist.is_nccl_available()
dist.init_process_group(backend="nccl")
def prepare_trainer_input_single(
model_params_single, data_dict_single, rank=0, seed=None
):
training_dataset_params = data_dict_single["training_data"]
validation_dataset_params = data_dict_single.get("validation_data", None)
validation_systems = (
validation_dataset_params["systems"] if validation_dataset_params else None
)
training_systems = training_dataset_params["systems"]
training_systems = process_systems(training_systems)
if validation_systems is not None:
validation_systems = process_systems(validation_systems)
# stat files
stat_file_path_single = data_dict_single.get("stat_file", None)
if rank != 0:
stat_file_path_single = None
elif stat_file_path_single is not None:
if not Path(stat_file_path_single).exists():
if stat_file_path_single.endswith((".h5", ".hdf5")):
with h5py.File(stat_file_path_single, "w") as f:
pass
else:
Path(stat_file_path_single).mkdir()
stat_file_path_single = DPPath(stat_file_path_single, "a")
# validation and training data
# avoid the same batch sequence among devices
rank_seed = (seed + rank) % (2**32) if seed is not None else None
validation_data_single = (
DpLoaderSet(
validation_systems,
validation_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
if validation_systems
else None
)
train_data_single = DpLoaderSet(
training_systems,
training_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
return (
train_data_single,
validation_data_single,
stat_file_path_single,
)
rank = dist.get_rank() if dist.is_available() and dist.is_initialized() else 0
data_seed = config["training"].get("seed", None)
if not multi_task:
(
train_data,
validation_data,
stat_file_path,
) = prepare_trainer_input_single(
config["model"],
config["training"],
rank=rank,
seed=data_seed,
)
else:
train_data, validation_data, stat_file_path = {}, {}, {}
for model_key in config["model"]["model_dict"]:
(
train_data[model_key],
validation_data[model_key],
stat_file_path[model_key],
) = prepare_trainer_input_single(
config["model"]["model_dict"][model_key],
config["training"]["data_dict"][model_key],
rank=rank,
seed=data_seed,
)
trainer = training.Trainer(
config,
train_data,
stat_file_path=stat_file_path,
validation_data=validation_data,
init_model=init_model,
restart_model=restart_model,
finetune_model=finetune_model,
force_load=force_load,
shared_links=shared_links,
finetune_links=finetune_links,
init_frz_model=init_frz_model,
)
return trainer
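
The per-rank seed derivation in `prepare_trainer_input_single` (`rank_seed = (seed + rank) % (2**32)`) is what keeps batch sequences distinct across devices while staying reproducible per rank. A minimal stdlib-only sketch of the idea (the helper name and sample sizes here are hypothetical, not from the PR):

```python
import random

def rank_seed(seed, rank):
    # offset the base seed by the device rank, kept in 32-bit range
    return (seed + rank) % (2**32) if seed is not None else None

base_seed = 12345
# each rank shuffles with its own derived seed, so the batch order
# typically differs across ranks but is reproducible for a given rank
order_rank0 = random.Random(rank_seed(base_seed, 0)).sample(range(8), 8)
order_rank1 = random.Random(rank_seed(base_seed, 1)).sample(range(8), 8)
order_rank0_again = random.Random(rank_seed(base_seed, 0)).sample(range(8), 8)
```

The modulo keeps the derived seed inside the 32-bit range that many RNG seeding APIs accept, and `seed=None` propagates through unchanged so unseeded runs stay unseeded.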
🧰 Tools
🪛 Ruff

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)

🪛 GitHub Check: CodeQL

[notice] 108-108: Unused local variable
Variable local_rank is not used.

):
super().__init__(type_map, **kwargs)
ntypes = len(type_map)
self.type_map = type_map
Contributor

⚠️ Potential issue

Avoid overwriting type_map attribute from superclass

The assignment self.type_map = type_map overwrites the type_map attribute already set in the superclass BaseAtomicModel during super().__init__(type_map, **kwargs). This could lead to unintended side effects if the superclass relies on the original type_map.

Consider removing the redundant assignment or ensure that overwriting type_map does not cause any issues.

🧰 Tools
🪛 GitHub Check: CodeQL

[warning] 60-60: Overwriting attribute in super-class or sub-class
Assignment overwrites attribute type_map, which was previously defined in superclass BaseAtomicModel.
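
To make the hazard concrete, here is a contrived sketch (not the DeePMD classes; `MASKED_TOKEN` is invented for illustration) where the superclass normalizes the value it stores and a re-assignment in the subclass silently discards that normalization:

```python
class Base:
    def __init__(self, type_map):
        # the superclass may store a processed form of the argument
        self.type_map = list(type_map) + ["MASKED_TOKEN"]

class Sub(Base):
    def __init__(self, type_map):
        super().__init__(type_map)
        # re-assigning here throws away whatever Base set up
        self.type_map = type_map

base_view = Base(["O", "H"]).type_map  # includes the superclass's extra entry
sub_view = Sub(["O", "H"]).type_map    # superclass normalization lost
```

If the superclass does no processing, the re-assignment is merely redundant; the point of the warning is that this invariant is easy to break later without noticing.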

Comment on lines 72 to 74
def string_to_array(s: str) -> int:
return [ord(c) for c in s]

Contributor

⚠️ Potential issue

Correct the return type annotation of string_to_array function

The string_to_array function is annotated to return int, but it actually returns a List[int]. This mismatch may lead to confusion or type-checking issues.

Apply this diff to correct the return type annotation:

-def string_to_array(s: str) -> int:
+def string_to_array(s: str) -> List[int]:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def string_to_array(s: str) -> int:
return [ord(c) for c in s]
def string_to_array(s: str) -> List[int]:
return [ord(c) for c in s]
Note: The suggestion includes the correct type annotation as per the diff snippet. However, to ensure that the `List` type is properly recognized, you may need to import it from the `typing` module at the beginning of the file if it's not already imported:

```python
from typing import List
```

Comment on lines +117 to +120
self.fitting_net.output_def()
if self.fitting_net is not None
else self.coord_denoise_net.output_def()
)
Contributor

⚠️ Potential issue

Handle potential None values in fitting_output_def

In the fitting_output_def method, if both self.fitting_net and self.coord_denoise_net are None, an AttributeError may be raised. Consider adding a check to handle this scenario gracefully.

Apply this diff to add error handling:

         def fitting_output_def(self) -> FittingOutputDef:
             """Get the output def of the fitting net."""
-            return (
-                self.fitting_net.output_def()
-                if self.fitting_net is not None
-                else self.coord_denoise_net.output_def()
-            )
+            if self.fitting_net is not None:
+                return self.fitting_net.output_def()
+            elif self.coord_denoise_net is not None:
+                return self.coord_denoise_net.output_def()
+            else:
+                raise ValueError("Both fitting_net and coord_denoise_net are None.")
📝 Committable suggestion


Suggested change
self.fitting_net.output_def()
if self.fitting_net is not None
else self.coord_denoise_net.output_def()
)
def fitting_output_def(self) -> FittingOutputDef:
"""Get the output def of the fitting net."""
if self.fitting_net is not None:
return self.fitting_net.output_def()
elif self.coord_denoise_net is not None:
return self.coord_denoise_net.output_def()
else:
raise ValueError("Both fitting_net and coord_denoise_net are None.")
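
The fail-fast variant can be exercised in isolation; this sketch stands in for the two optional networks with plain objects (all names hypothetical, not the DeePMD classes):

```python
class _Net:
    def __init__(self, name):
        self.name = name

    def output_def(self):
        return f"{self.name}-output-def"

class Model:
    def __init__(self, fitting_net=None, coord_denoise_net=None):
        self.fitting_net = fitting_net
        self.coord_denoise_net = coord_denoise_net

    def fitting_output_def(self):
        # prefer the fitting net, fall back to the denoise net,
        # and raise a clear error instead of an AttributeError
        if self.fitting_net is not None:
            return self.fitting_net.output_def()
        elif self.coord_denoise_net is not None:
            return self.coord_denoise_net.output_def()
        else:
            raise ValueError("Both fitting_net and coord_denoise_net are None.")

fit_def = Model(fitting_net=_Net("fit")).fitting_output_def()
denoise_def = Model(coord_denoise_net=_Net("dn")).fitting_output_def()
try:
    Model().fitting_output_def()
    raised = False
except ValueError:
    raised = True
```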

deepmd/pd/model/atomic_model/dp_atomic_model.py (outdated comment, resolved)
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between b3a6408 and 72241ea.

📒 Files selected for processing (5)
  • backend/dynamic_metadata.py (3 hunks)
  • backend/find_paddle.py (1 hunks)
  • backend/read_env.py (5 hunks)
  • deepmd/pd/entrypoints/main.py (1 hunks)
  • deepmd/pd/model/atomic_model/dp_atomic_model.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • backend/dynamic_metadata.py
  • backend/find_paddle.py
  • backend/read_env.py
🧰 Additional context used
🪛 Ruff
deepmd/pd/entrypoints/main.py

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


573-576: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)

🪛 GitHub Check: CodeQL
deepmd/pd/entrypoints/main.py

[notice] 108-108: Unused local variable
Variable local_rank is not used.


[notice] 383-387: Commented-out code
This comment appears to contain commented-out code.


[notice] 447-452: Commented-out code
This comment appears to contain commented-out code.


[warning] 557-561: Unreachable code
This statement is unreachable.

deepmd/pd/model/atomic_model/dp_atomic_model.py

[warning] 60-60: Overwriting attribute in super-class or sub-class
Assignment overwrites attribute type_map, which was previously defined in superclass BaseAtomicModel.

Comment on lines +282 to +289
if self.pair_excl is not None:
pair_exclude_types = self.pair_excl.get_exclude_types()
for sample in sampled:
sample["pair_exclude_types"] = list(pair_exclude_types)
if self.atom_excl is not None:
atom_exclude_types = self.atom_excl.get_exclude_types()
for sample in sampled:
sample["atom_exclude_types"] = list(atom_exclude_types)
Contributor

⚠️ Potential issue

Potential AttributeError: Uninitialized attributes self.pair_excl and self.atom_excl

In the compute_or_load_stat method, the attributes self.pair_excl and self.atom_excl are accessed within wrapped_sampler, but they are not initialized in the __init__ method or elsewhere in the class. This could lead to an AttributeError at runtime.

Consider initializing these attributes in the __init__ method or adding checks before accessing them. Here's a possible fix:

+    def __init__(
+        self,
+        descriptor,
+        fitting,
+        type_map: List[str],
+        **kwargs,
+    ):
+        super().__init__(type_map, **kwargs)
+        # Initialize attributes
+        self.pair_excl = None
+        self.atom_excl = None

Or modify the conditional checks:

             @functools.lru_cache
             def wrapped_sampler():
                 sampled = sampled_func()
-                if self.pair_excl is not None:
+                if hasattr(self, 'pair_excl') and self.pair_excl is not None:
                     pair_exclude_types = self.pair_excl.get_exclude_types()
                     for sample in sampled:
                         sample["pair_exclude_types"] = list(pair_exclude_types)
-                if self.atom_excl is not None:
+                if hasattr(self, 'atom_excl') and self.atom_excl is not None:
                     atom_exclude_types = self.atom_excl.get_exclude_types()
                     for sample in sampled:
                         sample["atom_exclude_types"] = list(atom_exclude_types)

Committable suggestion was skipped due to low confidence.
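
Either fix can be demonstrated with a stripped-down stand-in for the model (names hypothetical): initializing the attributes to `None` makes the later `is not None` checks safe, while the `hasattr` guard tolerates objects that never defined them at all:

```python
class ModelWithInit:
    def __init__(self):
        # declare the optional exclusion handlers up front
        self.pair_excl = None
        self.atom_excl = None

    def exclude_types(self):
        out = {}
        if self.pair_excl is not None:
            out["pair"] = self.pair_excl.get_exclude_types()
        if self.atom_excl is not None:
            out["atom"] = self.atom_excl.get_exclude_types()
        return out

class ModelWithoutInit:
    def exclude_types(self):
        out = {}
        # hasattr guards against attributes that were never created
        if hasattr(self, "pair_excl") and self.pair_excl is not None:
            out["pair"] = self.pair_excl.get_exclude_types()
        if hasattr(self, "atom_excl") and self.atom_excl is not None:
            out["atom"] = self.atom_excl.get_exclude_types()
        return out

safe_a = ModelWithInit().exclude_types()     # no AttributeError
safe_b = ModelWithoutInit().exclude_types()  # no AttributeError
```

Initializing in `__init__` is usually preferable because it documents the attribute's existence in one place instead of scattering `hasattr` checks through the class.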

Comment on lines +229 to +230
if self.do_grad_r() or self.do_grad_c():
extended_coord.stop_gradient = False
Contributor

💡 Codebase verification

Missing Definitions for do_grad_r() and do_grad_c() Methods in DPAtomicModel

The methods do_grad_r() and do_grad_c() are not defined within the DPAtomicModel class or its superclass BaseAtomicModel. This absence will lead to an AttributeError when these methods are invoked.

  • File: deepmd/pd/model/atomic_model/dp_atomic_model.py
  • Lines: 229-230
🔗 Analysis chain

Verify the existence of do_grad_r() and do_grad_c() methods

The methods self.do_grad_r() and self.do_grad_c() are called, but they are not defined within the DPAtomicModel class. If these methods are not implemented in the superclass or elsewhere, this will result in an AttributeError.

Please ensure that do_grad_r() and do_grad_c() are properly defined or inherited. To verify their existence, you can run the following script:

This script searches for the definitions of do_grad_r and do_grad_c in all Python files. Ensure that these methods exist and are accessible to avoid runtime errors.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if methods `do_grad_r` and `do_grad_c` are defined in the class hierarchy.

# Search for the method definitions in the codebase
ast-grep --lang python --pattern $'def do_grad_r(self):\n    $$$'
ast-grep --lang python --pattern $'def do_grad_c(self):\n    $$$'

Length of output: 131


Script:

#!/bin/bash
# Description: Search for definitions of `do_grad_r` and `do_grad_c` methods across the entire Python codebase.

# Search for `do_grad_r` method definitions
rg 'def do_grad_r\s*\(' --type python

# Search for `do_grad_c` method definitions
rg 'def do_grad_c\s*\(' --type python

# Additionally, find the class hierarchy for DPAtomicModel to check inheritance
rg 'class DPAtomicModel\s*\(.*\):' --type python -A 5

Length of output: 228


Script:

#!/bin/bash
# Description: Search for definitions of `do_grad_r` and `do_grad_c` methods across all Python files, and identify the superclass of DPAtomicModel.

# Search for `do_grad_r` method definitions
rg 'def do_grad_r\s*\(' -g "*.py"

# Search for `do_grad_c` method definitions
rg 'def do_grad_c\s*\(' -g "*.py"

# Find the superclass of DPAtomicModel
rg 'class DPAtomicModel\s*\(([^)]+)\):' -g "*.py"

Length of output: 886


Script:

#!/bin/bash
# Description: Verify that `BaseAtomicModel` defines `do_grad_r` and `do_grad_c` methods.

# Find the file(s) where `BaseAtomicModel` is defined
rg 'class BaseAtomicModel\s*\(.*\):' -g "*.py"

# Assume the previous command outputs the file paths; replace <file_path> accordingly
# Search for `do_grad_r` method within `BaseAtomicModel`
rg 'def do_grad_r\s*\(' -g "*.py" --context 10

# Search for `do_grad_c` method within `BaseAtomicModel`
rg 'def do_grad_c\s*\(' -g "*.py" --context 10

Length of output: 13262
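
Until the methods are located or added, the failure mode is a bare runtime `AttributeError` at the call site; an explicit guard (purely illustrative, not the upstream fix) turns it into a clearer error and still works once a subclass supplies the flags:

```python
class BaseAtomicModel:
    pass

class DPAtomicModel(BaseAtomicModel):
    def needs_coord_gradient(self):
        # resolve the flags dynamically and fail with a clear message
        do_grad_r = getattr(self, "do_grad_r", None)
        do_grad_c = getattr(self, "do_grad_c", None)
        if do_grad_r is None or do_grad_c is None:
            raise NotImplementedError(
                "do_grad_r()/do_grad_c() must be provided by the model "
                "or its base class before gradients can be requested"
            )
        return do_grad_r() or do_grad_c()

class WithFlags(DPAtomicModel):
    def do_grad_r(self):
        return False

    def do_grad_c(self):
        return True

try:
    DPAtomicModel().needs_coord_gradient()
    missing = False
except NotImplementedError:
    missing = True

ok = WithFlags().needs_coord_gradient()
```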

Comment on lines +447 to +452
# elif FLAGS.INPUT.endswith(".json"):
# old_model = paddle.jit.load(FLAGS.INPUT[: -len(".json")])
# model_params_string = old_model.get_model_def_script()
# model_params = json.loads(model_params_string)
# old_state_dict = old_model.state_dict()
# model_state_dict = old_state_dict
Contributor

⚠️ Potential issue

Eliminate commented-out code or implement the intended functionality

The commented-out code in lines 447-452 suggests a planned feature for handling .json model files. If this functionality is not going to be implemented soon, consider removing the commented code. Otherwise, include a TODO comment to indicate future work.

Apply this diff to remove the commented-out code:

-# elif FLAGS.INPUT.endswith(".json"):
-#     old_model = paddle.jit.load(FLAGS.INPUT[: -len(".json")])
-#     model_params_string = old_model.get_model_def_script()
-#     model_params = json.loads(model_params_string)
-#     old_state_dict = old_model.state_dict()
-#     model_state_dict = old_state_dict
📝 Committable suggestion


Suggested change
# elif FLAGS.INPUT.endswith(".json"):
# old_model = paddle.jit.load(FLAGS.INPUT[: -len(".json")])
# model_params_string = old_model.get_model_def_script()
# model_params = json.loads(model_params_string)
# old_state_dict = old_model.state_dict()
# model_state_dict = old_state_dict
🧰 Tools
🪛 GitHub Check: CodeQL

[notice] 447-452: Commented-out code
This comment appears to contain commented-out code.

Comment on lines +383 to +387
# elif FLAGS.INPUT.split(".")[-1] == "pdmodel":
# model_params_string = paddle.jit.load(
# FLAGS.INPUT[: -len(".pdmodel")]
# ).model_def_script
# model_params = json.loads(model_params_string)
Contributor

⚠️ Potential issue

Remove commented-out code for clarity

The code between lines 383-387 is commented out. If this code is no longer needed, consider removing it to improve code readability. If it's intended for future implementation, add a descriptive comment explaining its purpose.

Apply this diff to remove the commented-out code:

-# elif FLAGS.INPUT.split(".")[-1] == "pdmodel":
-#     model_params_string = paddle.jit.load(
-#         FLAGS.INPUT[: -len(".pdmodel")]
-#     ).model_def_script
-#     model_params = json.loads(model_params_string)
📝 Committable suggestion


Suggested change
# elif FLAGS.INPUT.split(".")[-1] == "pdmodel":
# model_params_string = paddle.jit.load(
# FLAGS.INPUT[: -len(".pdmodel")]
# ).model_def_script
# model_params = json.loads(model_params_string)
🧰 Tools
🪛 GitHub Check: CodeQL

[notice] 383-387: Commented-out code
This comment appears to contain commented-out code.

Comment on lines +573 to +576
if not isinstance(args, argparse.Namespace):
FLAGS = parse_args(args=args)
else:
FLAGS = args
Contributor

🛠️ Refactor suggestion

Simplify the if-else block using a ternary operator

The if-else block in the main function can be simplified using a ternary operator for improved readability.

Apply this diff to refactor the code:

 def main(args: Optional[Union[List[str], argparse.Namespace]] = None):
-    if not isinstance(args, argparse.Namespace):
-        FLAGS = parse_args(args=args)
-    else:
-        FLAGS = args
+    FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args
📝 Committable suggestion


Suggested change
if not isinstance(args, argparse.Namespace):
FLAGS = parse_args(args=args)
else:
FLAGS = args
def main(args: Optional[Union[List[str], argparse.Namespace]] = None):
FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args
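
Both forms accept either raw argv or an already-parsed namespace; a quick check with a stubbed parser (the stub and its `--name` flag are hypothetical, not the real CLI):

```python
import argparse

def parse_args(args=None):
    # minimal stand-in for the real CLI parser
    parser = argparse.ArgumentParser()
    parser.add_argument("--name", default="dp")
    return parser.parse_args(args=args)

def main(args=None):
    # ternary form suggested by the linter: parse unless already parsed
    FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args
    return FLAGS

from_list = main(["--name", "water"])
from_ns = main(argparse.Namespace(name="water"))
```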
🧰 Tools
🪛 Ruff

573-576: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)

@github-actions github-actions bot added the Docs label Sep 29, 2024
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
README.md (3)

22-22: Approved with a minor suggestion for clarity.

The addition of Paddle as a supported backend is consistent with the PR objectives and enhances the framework's compatibility. Well done!

For improved clarity, consider rephrasing slightly:

- **interfaced with multiple backends**, including TensorFlow, PyTorch and Paddle, the most popular deep learning frameworks, making the training process highly automatic and efficient.
+ **interfaced with multiple backends**, including TensorFlow, PyTorch, and Paddle (three of the most popular deep learning frameworks), making the training process highly automatic and efficient.

This change adds a comma for proper list formatting and clarifies that these are examples of popular frameworks without implying they are the only popular ones.


75-77: Approved with suggestions for additional information.

The addition of version 3 highlights is appropriate and consistent with the PR objectives. However, to provide more value to readers, consider expanding on the new features:

#### v3

- Multiple backends supported. Add PyTorch and Paddle backend.
- The DPA-2 model.
+ - Multiple backends supported: Added PyTorch and Paddle backends, expanding compatibility and user choice.
+ - Introduced the DPA-2 model, enhancing [brief description of improvements or capabilities].
+ - [Any other significant features or improvements in v3]

This expansion would give readers a clearer understanding of the advancements in version 3. Please add any other significant features or improvements that are part of this major version update.


Line range hint 1-146: Overall assessment of README.md changes

The updates to the README.md file accurately reflect the addition of the Paddle backend and introduce the highlights of version 3. These changes align well with the PR objectives and effectively communicate the enhancements to potential users.

To further improve the documentation:

  1. Consider expanding on the capabilities and benefits of the new Paddle backend.
  2. Provide more details about the DPA-2 model and its advantages.
  3. If there are any other significant features or improvements in version 3, include them in the highlights section.

These additions would give readers a more comprehensive understanding of the new version's capabilities and improvements.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 72241ea and f1cae59.

📒 Files selected for processing (2)
  • README.md (2 hunks)
  • deepmd/pd/model/atomic_model/dp_atomic_model.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/pd/model/atomic_model/dp_atomic_model.py
