
[WIP] Add paddle backend in devel #4157

Draft · wants to merge 46 commits into base: devel

Conversation

@HydrogenSulfate (Contributor) commented Sep 23, 2024

Add water/se_e2_a train/test/lammps

TODO:

Summary by CodeRabbit

  • New Features

    • Enhanced Paddle support, enabling users to utilize the PaddlePaddle library for model training and inference.
    • Introduced a new DPAtomicModel class for atomic predictions of physical properties, improving model flexibility and accuracy.
    • Added functionality to locate and manage Paddle dependencies within the environment.
  • Bug Fixes

    • Improved handling of Paddle library detection and version management.
  • Documentation

    • Updated documentation to reflect new functionalities and improved clarity on Paddle support.
  • Chores

    • Streamlined pre-commit configurations by disabling unused hooks.

@coderabbitai bot (Contributor) commented Sep 23, 2024

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (4)
  • doc/freeze/freeze.md
  • doc/install/install-from-source.md
  • source/CMakeLists.txt
  • source/install/build_cc_pd.sh

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

The changes introduce PaddlePaddle support within the DeepMD framework, involving updates to the README.md to reflect new backend compatibility and the addition of the DPAtomicModel class in dp_atomic_model.py. The new class provides atomic predictions for physical properties and includes methods for model management, serialization, and neighbor selection. Overall, these modifications enhance the framework's functionality by integrating Paddle alongside existing backends like TensorFlow and PyTorch.

Changes

Changed files:
  • README.md — Updated to include Paddle as a supported backend alongside TensorFlow and PyTorch, with version notes reflecting this change.
  • deepmd/pd/model/atomic_model/dp_atomic_model.py — Introduced the DPAtomicModel class for atomic predictions, including methods for managing atomic data and model state.


@HydrogenSulfate HydrogenSulfate marked this pull request as draft September 23, 2024 07:34
@github-advanced-security bot left a comment


CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 7

Outside diff range and nitpick comments (9)
deepmd/pd/infer/inference.py (2)

20-22: Consider removing or explaining the commented-out code.

The commented-out code block appears to be checking the paddle version. If this code is no longer needed, it's best to remove it. If it's intended for future use, consider adding a comment explaining its purpose and when it might be uncommented.

Tools
GitHub Check: CodeQL

[notice] 20-21: Commented-out code
This comment appears to contain commented-out code.


63-66: TODO: Implement JIT compilation or remove if not needed.

The JIT compilation is currently not implemented and raises a NotImplementedError. If JIT compilation is planned for future implementation, consider adding a TODO comment explaining the plan. If it's not needed, you may want to remove the if JIT: block entirely.

Would you like me to create a GitHub issue to track the implementation of JIT compilation?

backend/read_env.py (1)

Line range hint 1-150: Summary: Paddle backend support implementation

The changes to backend/read_env.py successfully introduce Paddle support in a manner consistent with the existing TensorFlow and PyTorch implementations. The modifications include:

  1. Adding necessary imports for Paddle-related functions.
  2. Implementing Paddle support in the get_argument_from_env function.
  3. Updating the return value of get_argument_from_env to include the Paddle version.

While the overall implementation is good, there are a couple of minor issues that need to be addressed:

  1. A variable assignment error where pt_version is used instead of pd_version.
  2. Potential use of an uninitialized pd_version variable when Paddle is disabled.

Once these issues are resolved, the Paddle backend support will be properly integrated into the build configuration process.
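Both issues above stem from the version variable only being assigned inside the Paddle-enabled branch. A minimal sketch of the combined fix — the function name, flag, and version string below are illustrative, assuming the Paddle branch mirrors the existing TensorFlow/PyTorch pattern in `get_argument_from_env`:

```python
def get_paddle_arguments(paddle_enabled: bool):
    """Mimics the Paddle branch of get_argument_from_env (names are illustrative)."""
    cmake_args = []
    # Bind pd_version unconditionally so it is defined even when Paddle
    # support is disabled, avoiding the "potentially uninitialized" path.
    pd_version = None
    if paddle_enabled:
        # In the real code this would come from find_paddle(); assigning
        # to pd_version (not pt_version) fixes the variable-name mix-up.
        pd_version = "2.6.1"  # placeholder version string
        cmake_args.append("-DENABLE_PADDLE=ON")
    return cmake_args, pd_version
```

With this shape, the return statement at the end of the function can safely include `pd_version` regardless of whether the Paddle branch executed.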

deepmd/pd/cxx_op.py (2)

75-76: Fix typographical errors in error message

In lines 75-76, there are typographical errors:

  • "execuating" should be "executing".
  • "you may consider to install" should be "you may consider installing".

Apply this diff to correct the typos:

- "PyPI, you may consider to install deepmd-kit execuating "
+ "PyPI, you may consider installing deepmd-kit by executing "

80-80: Correct typo in error message

In line 80, the word "inconsitent" should be corrected to "inconsistent".

Apply this diff to correct the typo:

- "This deepmd-kit package is inconsitent with Paddle "
+ "This deepmd-kit package is inconsistent with Paddle "
deepmd/pd/entrypoints/main.py (4)

388-392: Remove or uncomment the commented-out code

The lines appear to contain commented-out code for handling .pdmodel files. If this code is no longer needed, consider removing it to improve readability. If it's planned for future implementation, add a comment explaining its purpose.

Tools
GitHub Check: CodeQL

[notice] 388-392: Commented-out code
This comment appears to contain commented-out code.


452-457: Remove or uncomment the commented-out code

Similar to earlier, these lines contain commented-out code related to .pdmodel files. To enhance code maintainability, remove unused code or provide context if it's intended for future use.

Tools
GitHub Check: CodeQL

[notice] 452-457: Commented-out code
This comment appears to contain commented-out code.


577-580: Simplify conditional assignment using a ternary operator

You can streamline the assignment of FLAGS by using a ternary operator, making the code more concise.

Apply this diff to refactor the code:

-if not isinstance(args, argparse.Namespace):
-    FLAGS = parse_args(args=args)
-else:
-    FLAGS = args
+FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args
Tools
Ruff

577-580: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)


85-87: Consider removing commented-out import statements

The lines contain commented-out import statements. If these imports are no longer necessary, remove them to clean up the code. If they are kept for future reference, consider adding a comment explaining their intended use.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 3068869 and 8d53aec.

Files selected for processing (7)
  • backend/find_paddle.py (1 hunks)
  • backend/read_env.py (3 hunks)
  • deepmd/backend/paddle.py (1 hunks)
  • deepmd/pd/init.py (1 hunks)
  • deepmd/pd/cxx_op.py (1 hunks)
  • deepmd/pd/entrypoints/main.py (1 hunks)
  • deepmd/pd/infer/inference.py (1 hunks)
Files skipped from review as they are similar to previous changes (2)
  • backend/find_paddle.py
  • deepmd/backend/paddle.py
Additional context used
GitHub Check: CodeQL
backend/read_env.py

[failure] 150-150: Potentially uninitialized local variable
Local variable 'pd_version' may be used before it is initialized.

deepmd/pd/entrypoints/main.py

[notice] 108-108: Unused local variable
Variable local_rank is not used.


[notice] 369-369: Unused local variable
Variable extra_files is not used.


[notice] 388-392: Commented-out code
This comment appears to contain commented-out code.


[notice] 452-457: Commented-out code
This comment appears to contain commented-out code.


[warning] 561-565: Unreachable code
This statement is unreachable.

deepmd/pd/infer/inference.py

[notice] 20-21: Commented-out code
This comment appears to contain commented-out code.

Ruff
deepmd/pd/__init__.py

12-12: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

deepmd/pd/cxx_op.py

51-51: Yoda condition detected

Rewrite as pd_cxx11_abi_flag != PD_CXX11_ABI_FLAG

(SIM300)


67-67: Yoda condition detected

Rewrite as pd_py_version != PD_VERSION

(SIM300)
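For context on the SIM300 findings above: a "Yoda condition" places the constant on the left of the comparison. Both orderings are logically equivalent, so the rewrite is purely a readability change. A standalone illustration (the values here are made up; only the variable names mirror the findings):

```python
# Constants on the left trigger Ruff SIM300 ("Yoda condition").
PD_CXX11_ABI_FLAG = 0
pd_cxx11_abi_flag = 1

yoda = PD_CXX11_ABI_FLAG != pd_cxx11_abi_flag     # flagged style
natural = pd_cxx11_abi_flag != PD_CXX11_ABI_FLAG  # suggested rewrite

# Equivalent results; only the reading order changes.
assert yoda == natural
```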

deepmd/pd/entrypoints/main.py

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


369-369: Local variable extra_files is assigned to but never used

Remove assignment to unused variable extra_files

(F841)


577-580: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)

Additional comments not posted (9)
deepmd/pd/__init__.py (3)

1-7: LGTM: License and import section looks good.

The SPDX license identifier is correctly specified, and the import of customized operations is wrapped in a try block for proper error handling. The comment is clear and informative.


15-15: LGTM: PaddlePaddle import is correct.

The import statement for PaddlePaddle is standard and correctly implemented.


17-19: Please provide more context for PaddlePaddle configurations.

The code enables eager execution and all primitive operations for PaddlePaddle. While this might be necessary for your use case, it's important to consider the following:

  1. Enabling these features globally might have performance implications.
  2. The purpose and necessity of these configurations are not entirely clear from the comment.

Could you please provide more information on:

  1. Why these configurations are necessary?
  2. Are there any performance considerations we should be aware of?
  3. Is it possible to make these configurations more targeted or optional?

To better understand the impact of these configurations, we can search for their usage across the codebase:

This will help us understand if these configurations are used consistently across the project.

deepmd/pd/infer/inference.py (2)

1-18: LGTM: Imports and license identifier look good.

The necessary modules and functions are imported correctly, and the SPDX license identifier is properly placed at the top of the file.


25-62: LGTM: Tester class implementation looks solid.

The Tester class is well-structured and handles both single-task and multi-task scenarios appropriately. The initialization process, including loading and preparing the model state dictionary, is implemented clearly with proper error checking for multi-task cases.

backend/read_env.py (2)

16-18: LGTM: New imports for Paddle support.

The new import statements for Paddle-related functions are correctly placed and consistent with the file's purpose.


150-150: LGTM: Updated return statement.

The addition of pd_version to the return tuple is consistent with the function's purpose and follows the existing pattern for other backend versions.

Tools
GitHub Check: CodeQL

[failure] 150-150: Potentially uninitialized local variable
Local variable 'pd_version' may be used before it is initialized.

deepmd/pd/cxx_op.py (1)

48-49: Confirm hardcoded CXX11 ABI flag

At lines 48-49, pd_cxx11_abi_flag is hardcoded to 0, and the line that dynamically retrieves the ABI flag using paddle.compiled_with_cxx11_abi() is commented out. Please verify if hardcoding this value is intentional. If possible, consider uncommenting the line to obtain the actual ABI flag from Paddle, ensuring the check reflects the runtime environment.

deepmd/pd/entrypoints/main.py (1)

561-565: Unreachable code after NotImplementedError in change_bias

The code after raise NotImplementedError is unreachable in the change_bias function. To avoid confusion, remove the unreachable code or adjust the control flow to ensure the code is reachable if intended.

Apply this diff to address the issue:

+if False:  # Placeholder for future implementation
    output_path = (
        FLAGS.output
        if FLAGS.output is not None
        else FLAGS.INPUT.replace(".pdparams", "_updated.pdparams")
    )

Or remove the code entirely if it's not needed.

Likely invalid or redundant comment.

Tools
GitHub Check: CodeQL

[warning] 561-565: Unreachable code
This statement is unreachable.

Comment on lines +9 to +13
__all__ = [
"ENABLE_CUSTOMIZED_OP",
]
except Exception as e:
__all__ = []

Improve exception handling and remove unused variable.

While the conditional definition of __all__ is good practice, there are a couple of improvements we can make:

  1. Consider catching more specific exceptions rather than using a broad except Exception.
  2. The variable e is assigned but never used, which is flagged by the static analysis tool.

Here's a suggested improvement:

 try:
     from deepmd.pd.cxx_op import (
         ENABLE_CUSTOMIZED_OP,
     )

     __all__ = [
         "ENABLE_CUSTOMIZED_OP",
     ]
-except Exception as e:
+except ImportError:
     __all__ = []

This change catches only ImportError, which is more specific to the operation being performed, and removes the unused e variable.

Tools
Ruff

12-12: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

# Initialize DDP
local_rank = os.environ.get("LOCAL_RANK")
if local_rank is not None:
local_rank = int(local_rank)

Variable local_rank is assigned but never used

The variable local_rank is converted to an integer but is not used afterward. Consider utilizing it if necessary or removing the assignment.

Apply this diff to fix the issue:

-local_rank = int(local_rank)
Tools
GitHub Check: CodeQL

[notice] 108-108: Unused local variable
Variable local_rank is not used.

InputSpec([-1, -1, -1], dtype="int32", name="nlist"),
],
)
extra_files = {}

Remove unused variable extra_files

The variable extra_files is assigned but never used. If it's not needed, consider removing it to improve code clarity.

Apply this diff to fix the issue:

-extra_files = {}
Tools
Ruff

369-369: Local variable extra_files is assigned to but never used

Remove assignment to unused variable extra_files

(F841)

GitHub Check: CodeQL

[notice] 369-369: Unused local variable
Variable extra_files is not used.

elif stat_file_path_single is not None:
if not Path(stat_file_path_single).exists():
if stat_file_path_single.endswith((".h5", ".hdf5")):
with h5py.File(stat_file_path_single, "w") as f:

Remove unused variable f

The variable f is assigned but never used in the with statement. To eliminate the unused variable warning, remove the variable assignment.

Apply this diff to fix the issue:

-with h5py.File(stat_file_path_single, "w") as f:
+with h5py.File(stat_file_path_single, "w"):
Tools
Ruff

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)

Comment on lines +559 to +565
raise NotImplementedError
# for .json
output_path = (
FLAGS.output
if FLAGS.output is not None
else FLAGS.INPUT.replace(".pdparams", "_updated.pdparams")
)

Unreachable code after raise NotImplementedError

The code following raise NotImplementedError is unreachable and will not be executed. To maintain code cleanliness, consider removing or restructuring this code.

Apply this diff to address the issue:

-raise NotImplementedError
-# for .json
-output_path = (
-    FLAGS.output
-    if FLAGS.output is not None
-    else FLAGS.INPUT.replace(".pdparams", "_updated.pdparams")
-)

If this code is intended for future use, consider commenting it out with an explanatory note.

Tools
GitHub Check: CodeQL

[warning] 561-565: Unreachable code
This statement is unreachable.

Comment on lines 448 to 461
if FLAGS.INPUT.endswith(".pdparams"):
old_state_dict = paddle.load(FLAGS.INPUT)
model_state_dict = copy.deepcopy(old_state_dict.get("model", old_state_dict))
model_params = model_state_dict["_extra_state"]["model_params"]
# elif FLAGS.INPUT.endswith(".pdmodel"):
# old_model = paddle.jit.load(FLAGS.INPUT[: -len(".pdmodel")])
# model_params_string = old_model.get_model_def_script()
# model_params = json.loads(model_params_string)
# old_state_dict = old_model.state_dict()
# model_state_dict = old_state_dict
else:
raise RuntimeError(
"The model provided must be a checkpoint file with a .pd extension"
# "or a frozen model with a .pdparams extension"

Inconsistent file extension handling in change_bias function

The function checks if FLAGS.INPUT ends with .pdparams, but the error message references a .pd extension. Ensure consistency in file extension handling and that error messages accurately reflect the expected input.

Apply this diff to correct the error message:

-raise RuntimeError(
-    "The model provided must be a checkpoint file with a .pd extension"
-)
+raise RuntimeError(
+    "The model provided must be a checkpoint file with a .pdparams extension"
+)
Tools
GitHub Check: CodeQL

[notice] 452-457: Commented-out code
This comment appears to contain commented-out code.

@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 8

Outside diff range and nitpick comments (7)
deepmd/pd/model/model/make_model.py (6)

94-94: Remove or restore the commented-out decorator @paddle.jit.export

The decorator @paddle.jit.export is commented out. If it's no longer needed, consider removing it to improve code readability. If it's required for JIT compilation, consider restoring it.

Apply this diff to remove the commented-out decorator:

-# @paddle.jit.export

298-307: Eliminate unnecessary commented-out code

There is a block of commented-out code that seems obsolete. Removing it will enhance code cleanliness and maintainability.

Apply this diff:

-###
-### type checking would not pass jit, convert to coord prec anyway
-###
-# for vv, kk in zip([fparam, aparam], ["frame", "atomic"]):
-#     if vv is not None and self.reverse_precision_dict[vv.dtype] != input_prec:
-#         log.warning(
-#           f"type of {kk} parameter {self.reverse_precision_dict[vv.dtype]}"
-#           " does not match"
-#           f" that of the coordinate {input_prec}"
-#         )

340-341: Simplify dictionary key checks by removing unnecessary .keys() method

In Python, you can check for the presence of a key directly without using the .keys() method. This simplifies the code and enhances readability.

Apply this diff to simplify the code:

-for kk in odef.keys():
+for kk in odef:
-if kk not in model_ret.keys():
+if kk not in model_ret:
Tools
Ruff

340-340: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


341-341: Use key not in dict instead of key not in dict.keys()

Remove .keys()

(SIM118)


366-366: Correct typo in docstring

There's a typographical error in the docstring: "nothong" should be "nothing".

Apply this diff:

-                it does nothong
+                it does nothing

377-377: Fix typo in comment

The word "effact" should be corrected to "effect" to accurately convey the intended meaning.

Apply this diff:

-                May have side effact on the efficiency.
+                May have side effect on the efficiency.

497-497: Remove or restore the commented-out decorator @paddle.jit.export

The decorator @paddle.jit.export is commented out. If it's not needed, consider removing it. If it's essential for JIT compilation, consider uncommenting it.

Apply this diff:

-# @paddle.jit.export
deepmd/pd/model/descriptor/se_atten.py (1)

400-404: Simplify conditional assignment with ternary operator

The if-else block can be simplified using a ternary operator for conciseness and readability.

Apply this diff to implement the improvement:

-            if callable(merged):
-                # only get data for once
-                sampled = merged()
-            else:
-                sampled = merged
+            sampled = merged() if callable(merged) else merged
Tools
Ruff

400-404: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block

(SIM108)

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 8d53aec and 50092c6.

Files selected for processing (4)
  • backend/dynamic_metadata.py (1 hunks)
  • backend/read_env.py (3 hunks)
  • deepmd/pd/model/descriptor/se_atten.py (1 hunks)
  • deepmd/pd/model/model/make_model.py (1 hunks)
Additional context used
GitHub Check: CodeQL
backend/read_env.py

[failure] 150-150: Potentially uninitialized local variable
Local variable 'pd_version' may be used before it is initialized.

deepmd/pd/model/descriptor/se_atten.py

[notice] 474-474: Unused local variable
Variable nall is not used.

Ruff
deepmd/pd/model/descriptor/se_atten.py

69-69: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


86-86: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


400-404: Use ternary operator sampled = merged() if callable(merged) else merged instead of if-else-block

(SIM108)


424-424: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


474-474: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)
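The repeated B006 findings above refer to Python's mutable-default-argument pitfall: a default list or dict is created once at function definition time and shared across all calls. A minimal illustration of the problem and the `None`-default fix Ruff recommends (function names here are made up for the example):

```python
# B006: the default list is evaluated once and shared across calls,
# so state leaks between invocations.
def append_bad(item, bucket=[]):
    bucket.append(item)
    return bucket


# Recommended fix: default to None and create the list inside the
# function body, so each call without an argument gets a fresh list.
def append_good(item, bucket=None):
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```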

deepmd/pd/model/model/make_model.py

340-340: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


341-341: Use key not in dict instead of key not in dict.keys()

Remove .keys()

(SIM118)

Additional comments not posted (4)
backend/read_env.py (3)

16-18: LGTM: Import statement for Paddle support.

The import statement for find_paddle is correctly placed and follows the existing import style. This addition is necessary for the new Paddle support functionality.


150-150: LGTM: Addition of pd_version to return tuple.

The addition of pd_version to the return tuple of get_argument_from_env function is correct and consistent with how TensorFlow and PyTorch versions are returned.

Tools
GitHub Check: CodeQL

[failure] 150-150: Potentially uninitialized local variable
Local variable 'pd_version' may be used before it is initialized.


156-156: LGTM: Update of set_scikit_build_env function.

The update to the set_scikit_build_env function correctly unpacks the new pd_version from the get_argument_from_env return tuple. This change is consistent with the modifications made to get_argument_from_env.

deepmd/pd/model/model/make_model.py (1)

583-599: Ensure consistency in method parameters between forward and forward_common

The forward method directly calls forward_common with the same parameters. Verify that any additional needed parameters are passed and that the method signatures remain consistent.

Run the following script to check for consistency in method signatures:

@@ -36,7 +36,7 @@ def dynamic_metadata(
settings: Optional[Dict[str, object]] = None,
):
assert field in ["optional-dependencies", "entry-points", "scripts"]
-    _, _, find_libpython_requires, extra_scripts, tf_version, pt_version = (
+    _, _, find_libpython_requires, extra_scripts, tf_version, pt_version, pd_version = (

💡 Codebase verification

⚠️ Potential issue

Unused pd_version Variable Detected

The variable pd_version is unpacked in backend/dynamic_metadata.py at line 39 but is not used within the dynamic_metadata function. This may lead to unused variable warnings and could indicate incomplete implementation.

  • If pd_version is intended to be used:
    • Update the function to utilize pd_version where necessary.
    • For example, incorporate pd_version in the optional-dependencies section:
      elif field == "optional-dependencies":
          optional_dependencies = pyproject["tool"]["deepmd_build_backend"][
              "optional-dependencies"
          ]
          optional_dependencies["lmp"].extend(find_libpython_requires)
          optional_dependencies["ipi"].extend(find_libpython_requires)
          return {
              **optional_dependencies,
              **get_tf_requirement(tf_version),
              **get_pt_requirement(pt_version),
              **get_pd_requirement(pd_version),  # Add this line
          }
  • If pd_version is not needed:
    • Remove pd_version from the unpacking to avoid unused variable warnings:
      _, _, find_libpython_requires, extra_scripts, tf_version, pt_version = (
Analysis chain

Address inconsistency and unused variable

  1. There's an inconsistency between the AI-generated summary and the actual code. The summary mentions a change in the function signature, but this is not reflected in the provided code. Please clarify if there are missing changes or if the summary is incorrect.

  2. The addition of pd_version suggests the introduction of Paddle support, which aligns with the PR objectives. However, pd_version is currently not used in the rest of the function.

To address these issues:

  1. If pd_version is intended to be a parameter, update the function signature:
def dynamic_metadata(
    field: str,
    settings: Optional[Dict[str, object]] = None,
    pd_version: Optional[str] = None,
):
  1. Implement the usage of pd_version in the function body, possibly in the optional-dependencies section:
elif field == "optional-dependencies":
    optional_dependencies = pyproject["tool"]["deepmd_build_backend"][
        "optional-dependencies"
    ]
    optional_dependencies["lmp"].extend(find_libpython_requires)
    optional_dependencies["ipi"].extend(find_libpython_requires)
    return {
        **optional_dependencies,
        **get_tf_requirement(tf_version),
        **get_pt_requirement(pt_version),
        **get_pd_requirement(pd_version),  # Add this line
    }
  1. If pd_version is not needed, remove it from the unpacking to avoid unused variable warnings.

To ensure pd_version is used consistently across the codebase, run the following script:

This will help verify the usage of pd_version and identify any Paddle-related functions that might need to be implemented or updated.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for occurrences of pd_version and potential Paddle-related functions

# Search for pd_version
echo "Occurrences of pd_version:"
rg 'pd_version' --type python

# Search for potential Paddle-related functions
echo "\nPotential Paddle-related functions:"
rg 'get_pd_requirement|paddle' --type python

Length of output: 286


Script:

#!/bin/bash
# Description: Check for occurrences of pd_version and potential Paddle-related functions

# Search for pd_version in Python files
echo "Occurrences of pd_version:"
rg 'pd_version' --type py

# Search for potential Paddle-related functions in Python files
echo "\nPotential Paddle-related functions:"
rg 'get_pd_requirement|paddle' --type py

Length of output: 167111
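The suggested fix relies on Python's `**`-unpacking merge, where each backend contributes its own extras and later entries win on key collisions. A self-contained sketch with made-up requirement dicts (`get_pd_requirement` and the pinned versions are hypothetical):

```python
optional_dependencies = {"lmp": ["lammps"], "ipi": ["ipi"]}
tf_requirement = {"tf": ["tensorflow>=2.7"]}   # e.g. get_tf_requirement(tf_version)
pt_requirement = {"pt": ["torch>=2.0"]}        # e.g. get_pt_requirement(pt_version)
pd_requirement = {"pd": ["paddlepaddle>=3.0"]} # hypothetical get_pd_requirement(pd_version)

# Each dict is unpacked into one merged mapping of optional extras.
merged = {
    **optional_dependencies,
    **tf_requirement,
    **pt_requirement,
    **pd_requirement,
}
```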

Comment on lines 126 to 137
if os.environ.get("DP_ENABLE_PADDLE", "0") == "1":
pd_install_dir, _ = find_paddle()
pt_version = get_pt_version(pd_install_dir)
cmake_args.extend(
[
"-DENABLE_PADDLE=ON",
f"-DCMAKE_PREFIX_PATH={pd_install_dir}",
]
)
else:
cmake_args.append("-DENABLE_PADDLE=OFF")
pd_version = None

⚠️ Potential issue

Fix variable assignment for Paddle version.

The implementation for Paddle support looks good overall, but there's still an issue to address:

On line 128, pt_version is assigned instead of pd_version. This appears to be a copy-paste error that was not fixed in the previous review.

Please apply the following change:

 if os.environ.get("DP_ENABLE_PADDLE", "0") == "1":
     pd_install_dir, _ = find_paddle()
-    pt_version = get_pt_version(pd_install_dir)
+    pd_version = get_pt_version(pd_install_dir)
     cmake_args.extend(
         [
             "-DENABLE_PADDLE=ON",
             f"-DCMAKE_PREFIX_PATH={pd_install_dir}",
         ]
     )
 else:
     cmake_args.append("-DENABLE_PADDLE=OFF")
     pd_version = None

This change will correctly assign the Paddle version. Note that the call is still to a helper named get_pt_version; if a Paddle-specific get_pd_version helper exists, it should be used here instead.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if os.environ.get("DP_ENABLE_PADDLE", "0") == "1":
pd_install_dir, _ = find_paddle()
pt_version = get_pt_version(pd_install_dir)
cmake_args.extend(
[
"-DENABLE_PADDLE=ON",
f"-DCMAKE_PREFIX_PATH={pd_install_dir}",
]
)
else:
cmake_args.append("-DENABLE_PADDLE=OFF")
pd_version = None
if os.environ.get("DP_ENABLE_PADDLE", "0") == "1":
pd_install_dir, _ = find_paddle()
pd_version = get_pt_version(pd_install_dir)
cmake_args.extend(
[
"-DENABLE_PADDLE=ON",
f"-DCMAKE_PREFIX_PATH={pd_install_dir}",
]
)
else:
cmake_args.append("-DENABLE_PADDLE=OFF")
pd_version = None
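The corrected branch can be isolated as a small function for clarity; in this sketch `find_paddle` and `get_pd_version` are stand-ins for the real helpers, and their signatures are assumptions:

```python
import os

def paddle_cmake_args(find_paddle, get_pd_version):
    """Env-gated Paddle build flags: DP_ENABLE_PADDLE=1 turns the backend on."""
    cmake_args = []
    if os.environ.get("DP_ENABLE_PADDLE", "0") == "1":
        pd_install_dir, _ = find_paddle()
        pd_version = get_pd_version(pd_install_dir)  # pd_, not pt_
        cmake_args += [
            "-DENABLE_PADDLE=ON",
            f"-DCMAKE_PREFIX_PATH={pd_install_dir}",
        ]
    else:
        cmake_args.append("-DENABLE_PADDLE=OFF")
        pd_version = None
    return cmake_args, pd_version
```

Keeping the assignment in one place makes the copy-paste error (`pt_version` vs `pd_version`) easy to spot and test.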

axis=-1,
)

if True: # TODO: Fix controlflow + backward in PIR static graph

⚠️ Potential issue

Revisit the use of the if True statement

Using if True: is redundant and may hinder code clarity. The comment suggests there is a TODO related to fixing control flow and backward compatibility in the PIR static graph. Consider addressing the TODO or removing the redundant if True: block.

Apply this diff to remove the redundant if True::

-if True:  # TODO: Fix controlflow + backward in PIR static graph
+# TODO: Fix controlflow + backward in PIR static graph
Committable suggestion


Suggested change
if True: # TODO: Fix controlflow + backward in PIR static graph
# TODO: Fix controlflow + backward in PIR static graph

Comment on lines +38 to +39
RESERVED_PRECISON_DICT,
)

⚠️ Potential issue

Correct the typo in the imported variable RESERVED_PRECISON_DICT

There's a possible typo in the variable name RESERVED_PRECISON_DICT. It should likely be RESERVED_PRECISION_DICT. Correcting this will prevent confusion and potential errors.

Apply this diff:

-    RESERVED_PRECISON_DICT,
+    RESERVED_PRECISION_DICT,

Also, update all occurrences of RESERVED_PRECISON_DICT in the code:

-        self.reverse_precision_dict = RESERVED_PRECISON_DICT
+        self.reverse_precision_dict = RESERVED_PRECISION_DICT
Committable suggestion


Suggested change
RESERVED_PRECISON_DICT,
)
RESERVED_PRECISION_DICT,
)

nframes, nloc, nnei = nlist.shape
atype = extended_atype[:, :nloc]
nb = nframes
nall = extended_coord.reshape([nb, -1, 3]).shape[1]

⚠️ Potential issue

Remove unused variable nall

The local variable nall is assigned but never used afterward, which may cause confusion.

Apply this diff to fix the issue:

-            nall = extended_coord.reshape([nb, -1, 3]).shape[1]
Committable suggestion


Suggested change
nall = extended_coord.reshape([nb, -1, 3]).shape[1]
Tools
Ruff

474-474: Local variable nall is assigned to but never used

Remove assignment to unused variable nall

(F841)

GitHub Check: CodeQL

[notice] 474-474: Unused local variable
Variable nall is not used.


def reinit_exclude(
self,
exclude_types: List[Tuple[int, int]] = [],

⚠️ Potential issue

Avoid using mutable default arguments

The parameter exclude_types in reinit_exclude method uses a mutable default argument []. It's advisable to set the default to None and initialize within the method to prevent unexpected behaviors.

Apply this diff to fix the issue:

-        def reinit_exclude(
-            self,
-            exclude_types: List[Tuple[int, int]] = [],
-        ):
+        def reinit_exclude(
+            self,
+            exclude_types: Optional[List[Tuple[int, int]]] = None,
+        ):
+            if exclude_types is None:
+                exclude_types = []
Committable suggestion


Suggested change
exclude_types: List[Tuple[int, int]] = [],
def reinit_exclude(
self,
exclude_types: Optional[List[Tuple[int, int]]] = None,
):
if exclude_types is None:
exclude_types = []
Tools
Ruff

424-424: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)
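The danger behind B006 is that a default argument is evaluated once, at function definition time, so a mutable default is shared across all calls. A minimal illustration (function names are made up):

```python
def append_bad(item, acc=[]):
    # The default list is created once, when the function is defined,
    # so every call without `acc` mutates the same shared object.
    acc.append(item)
    return acc

def append_good(item, acc=None):
    # A fresh list is created on each call instead.
    if acc is None:
        acc = []
    acc.append(item)
    return acc

append_bad(1)
append_bad(2)  # the shared default now holds [1, 2]
```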

temperature=None,
smooth: bool = True,
type_one_side: bool = False,
exclude_types: List[Tuple[int, int]] = [],

⚠️ Potential issue

Avoid using mutable default arguments

The parameter exclude_types uses a mutable default argument []. Using mutable default arguments can lead to unexpected behaviors. It's recommended to use None as the default value and initialize the list within the function.

Apply this diff to fix the issue:

-            exclude_types: List[Tuple[int, int]] = [],
+            exclude_types: Optional[List[Tuple[int, int]]] = None,

And initialize inside the __init__ method:

+            if exclude_types is None:
+                exclude_types = []

Committable suggestion was skipped due to low confidence.

Tools
Ruff

86-86: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

rcut_smth: float,
sel: Union[List[int], int],
ntypes: int,
neuron: list = [25, 50, 100],

⚠️ Potential issue

Avoid using mutable default arguments

The parameter neuron uses a mutable default argument [25, 50, 100]. Using mutable default arguments can lead to unexpected behaviors because default argument values are evaluated only once at function definition time. It's best practice to set default arguments to None and initialize them within the function.

Apply this diff to fix the issue:

-            neuron: list = [25, 50, 100],
+            neuron: Optional[List[int]] = None,

And initialize inside the __init__ method:

+            if neuron is None:
+                neuron = [25, 50, 100]

Don't forget to import Optional from typing if not already imported.

Committable suggestion was skipped due to low confidence.

Tools
Ruff

69-69: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (8)
deepmd/pd/entrypoints/main.py (5)

243-342: Consider refactoring train function for improved modularity.

The train function is well-implemented with good error handling and logging. However, it's quite long and handles multiple responsibilities. Consider breaking it down into smaller, more focused functions to improve readability and maintainability. For example, you could create separate functions for:

  1. Configuration processing
  2. Multi-task setup
  3. Fine-tuning setup
  4. Neighbor statistics calculation

This refactoring would make the main train function more concise and easier to understand.

On a positive note, the error handling and logging throughout the function are commendable, providing clear feedback and traceability.


345-375: LGTM: freeze function is well-implemented.

The freeze function is concise and effectively freezes the model for inference. The use of Paddle's static graph API and the informative log message at the end are good practices.

Consider adding a try-except block to handle potential errors during the model freezing process, which could provide more informative error messages to the user.

Example:

try:
    paddle.jit.save(
        model,
        path=FLAGS.output,
        skip_prune_program=True,
    )
    suffix = "json" if PIR_ENABLED.lower() in ["true", "1"] else "pdmodel"
    log.info(
        f"Paddle inference model has been exported to: {FLAGS.output}.{suffix}(.pdiparams)"
    )
except Exception as e:
    log.error(f"Error occurred while freezing the model: {str(e)}")
    raise

377-439: LGTM: show function is well-implemented with good error handling.

The show function effectively displays various attributes of the model, handling both single-task and multi-task models. The error handling and informative output are commendable.

There's commented-out code for handling .pdmodel files (lines 384-387). If this functionality is planned for future implementation, consider adding a TODO comment to make it clear for other developers.

Example TODO comment:

# TODO: Implement support for .pdmodel files
# elif FLAGS.INPUT.split(".")[-1] == "pdmodel":
#     model_params_string = paddle.jit.load(
#         FLAGS.INPUT[: -len(".pdmodel")]
#     ).model_def_script
#     model_params = json.loads(model_params_string)
🧰 Tools
🪛 GitHub Check: CodeQL

[notice] 383-387: Commented-out code
This comment appears to contain commented-out code.


572-603: LGTM: main function is well-structured.

The main function serves as a clear entry point for the script, effectively handling different commands. The structure is concise and easy to understand.

As suggested by the static analysis tool, you can use a ternary operator for the FLAGS assignment to make the code more concise:

Apply this diff to use a ternary operator:

-    if not isinstance(args, argparse.Namespace):
-        FLAGS = parse_args(args=args)
-    else:
-        FLAGS = args
+    FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args
🧰 Tools
🪛 Ruff

573-576: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)


1-606: Overall, the implementation is solid with room for some improvements.

The deepmd/pd/entrypoints/main.py file provides a comprehensive set of functionalities for model management in the DeepMD framework using Paddle. The code is generally well-structured with good error handling, logging, and type hinting.

Areas for improvement:

  1. Some functions (e.g., get_trainer, train, change_bias) are quite long and could benefit from further modularization to enhance readability and maintainability.
  2. There are a few instances of commented-out code that could be removed or turned into TODO comments if the functionality is planned for future implementation.
  3. Address the unused variable local_rank in the get_trainer function.

The use of type hints, clear error messages, and comprehensive logging are commendable practices that enhance the code's robustness and maintainability.

🧰 Tools
🪛 Ruff

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


573-576: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)

🪛 GitHub Check: CodeQL

[notice] 108-108: Unused local variable
Variable local_rank is not used.


[notice] 383-387: Commented-out code
This comment appears to contain commented-out code.


[notice] 447-452: Commented-out code
This comment appears to contain commented-out code.


[warning] 557-561: Unreachable code
This statement is unreachable.

deepmd/pd/model/atomic_model/dp_atomic_model.py (3)

233-234: Optimize by removing unused variables

In the forward_atomic method, variables rot_mat, g2, h2, and sw are returned from self.descriptor(...) but are not used afterwards. Removing these unused variables can improve readability and performance.

Apply this diff to remove unused variables:

-            descriptor, rot_mat, g2, h2, sw = self.descriptor(
+            descriptor, _, _, _, _ = self.descriptor(

Alternatively, if the descriptor returns only the needed value:

-            descriptor, rot_mat, g2, h2, sw = self.descriptor(
+            descriptor = self.descriptor(

Adjust self.descriptor accordingly if possible.


259-260: Align parameter names in documentation

In the compute_or_load_stat method, the documentation refers to sampled, but the parameter is named sampled_func. For consistency and clarity, the documentation should use the actual parameter name.

Apply this diff to correct the parameter name in the documentation:

-            When `sampled` is provided, all the statistics parameters will be calculated (or re-calculated for update),
+            When `sampled_func` is provided, all the statistics parameters will be calculated (or re-calculated for update),

299-300: Simplify method documentation

In the get_dim_fparam method, the documentation can be streamlined for clarity.

Apply this diff to simplify the documentation:

-            """Get the number (dimension) of frame parameters of this atomic model."""
+            """Get the dimension of frame parameters."""
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 50092c6 and 892fd80.

📒 Files selected for processing (2)
  • deepmd/pd/entrypoints/main.py (1 hunks)
  • deepmd/pd/model/atomic_model/dp_atomic_model.py (1 hunks)
🧰 Additional context used
🪛 Ruff
deepmd/pd/entrypoints/main.py

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


573-576: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)

🪛 GitHub Check: CodeQL
deepmd/pd/entrypoints/main.py

[notice] 108-108: Unused local variable
Variable local_rank is not used.


[notice] 383-387: Commented-out code
This comment appears to contain commented-out code.


[notice] 447-452: Commented-out code
This comment appears to contain commented-out code.


[warning] 557-561: Unreachable code
This statement is unreachable.

deepmd/pd/model/atomic_model/dp_atomic_model.py

[warning] 60-60: Overwriting attribute in super-class or sub-class
Assignment overwrites attribute type_map, which was previously defined in superclass BaseAtomicModel.

🔇 Additional comments (2)
deepmd/pd/entrypoints/main.py (2)

1-90: LGTM: Imports and global setup look good.

The imports and global setup are well-organized and follow best practices. The use of type hints and the separation of standard library imports from third-party imports is commendable.


206-240: LGTM: SummaryPrinter class is well-implemented.

The SummaryPrinter class is well-structured, extends BaseSummaryPrinter appropriately, and provides clear, focused methods for retrieving backend information. The implementation follows good object-oriented design principles.

Comment on lines +442 to +568
f"For multitask model, the model branch must be in the 'model_dict'! "
f"Available options are : {list(model_params['model_dict'].keys())}."
)
log.info(f"Changing out bias for model {model_branch}.")
model = training.get_model_for_wrapper(model_params)
type_map = (
model_params["type_map"]
if not multi_task
else model_params["model_dict"][model_branch]["type_map"]
)
model_to_change = model if not multi_task else model[model_branch]
if FLAGS.INPUT.endswith(".pd"):
wrapper = ModelWrapper(model)
wrapper.set_state_dict(old_state_dict["model"])
else:
# for .pdparams
model.set_state_dict(old_state_dict)

if FLAGS.bias_value is not None:
# use user-defined bias
assert model_to_change.model_type in [
"ener"
], "User-defined bias is only available for energy model!"
assert (
len(FLAGS.bias_value) == len(type_map)
), f"The number of elements in the bias should be the same as that in the type_map: {type_map}."
old_bias = model_to_change.get_out_bias()
bias_to_set = paddle.to_tensor(
FLAGS.bias_value, dtype=old_bias.dtype, place=old_bias.place
).reshape(old_bias.shape)
model_to_change.set_out_bias(bias_to_set)
log.info(
f"Change output bias of {type_map!s} "
f"from {to_numpy_array(old_bias).reshape(-1)!s} "
f"to {to_numpy_array(bias_to_set).reshape(-1)!s}."
)
updated_model = model_to_change
else:
# calculate bias on given systems
if FLAGS.datafile is not None:
with open(FLAGS.datafile) as datalist:
all_sys = datalist.read().splitlines()
else:
all_sys = expand_sys_str(FLAGS.system)
data_systems = process_systems(all_sys)
data_single = DpLoaderSet(
data_systems,
1,
type_map,
)
mock_loss = training.get_loss(
{"inference": True}, 1.0, len(type_map), model_to_change
)
data_requirement = mock_loss.label_requirement
data_requirement += training.get_additional_data_requirement(model_to_change)
data_single.add_data_requirement(data_requirement)
nbatches = FLAGS.numb_batch if FLAGS.numb_batch != 0 else float("inf")
sampled_data = make_stat_input(
data_single.systems,
data_single.dataloaders,
nbatches,
)
updated_model = training.model_change_out_bias(
model_to_change, sampled_data, _bias_adjust_mode=bias_adjust_mode
)

if not multi_task:
model = updated_model
else:
model[model_branch] = updated_model

if FLAGS.INPUT.endswith(".pd"):
output_path = (
FLAGS.output
if FLAGS.output is not None
else FLAGS.INPUT.replace(".pd", "_updated.pd")
)
wrapper = ModelWrapper(model)
if "model" in old_state_dict:
old_state_dict["model"] = wrapper.state_dict()
old_state_dict["model"]["_extra_state"] = model_state_dict["_extra_state"]
else:
old_state_dict = wrapper.state_dict()
old_state_dict["_extra_state"] = model_state_dict["_extra_state"]
paddle.save(old_state_dict, output_path)
else:
raise NotImplementedError
# for .json
output_path = (
FLAGS.output
if FLAGS.output is not None
else FLAGS.INPUT.replace(".pdparams", "_updated.pdparams")
)
model = paddle.jit.to_static(model)
paddle.jit.save(
model,
output_path,
{},
)
log.info(f"Saved model to {output_path}")

🛠️ Refactor suggestion

⚠️ Potential issue

Consider refactoring change_bias function and addressing incomplete functionality.

The change_bias function is well-implemented with good error handling and type checking. However, it's quite long and could benefit from further modularization. Consider breaking it down into smaller, more focused functions to improve readability and maintainability.

There's commented-out code and a NotImplementedError for handling .json files (lines 447-452 and 555-567). If this functionality is planned for future implementation, consider adding a TODO comment. If not, it might be better to remove the commented-out code to improve code clarity.

Additionally, there's unreachable code after the NotImplementedError (lines 557-561). This code should be removed or placed before the raise NotImplementedError statement if it's intended to be used in the future.

Apply this diff to address the unreachable code:

-        raise NotImplementedError
-        # for .json
-        output_path = (
-            FLAGS.output
-            if FLAGS.output is not None
-            else FLAGS.INPUT.replace(".pdparams", "_updated.pdparams")
-        )
+        # TODO: Implement support for .json files
+        raise NotImplementedError("Support for .json files is not yet implemented")

Committable suggestion was skipped due to low confidence.

🧰 Tools
🪛 GitHub Check: CodeQL

[notice] 447-452: Commented-out code
This comment appears to contain commented-out code.


[warning] 557-561: Unreachable code
This statement is unreachable.

Comment on lines 93 to 203

# Initialize DDP
local_rank = os.environ.get("LOCAL_RANK")
if local_rank is not None:
local_rank = int(local_rank)
assert dist.is_nccl_available()
dist.init_process_group(backend="nccl")

def prepare_trainer_input_single(
model_params_single, data_dict_single, rank=0, seed=None
):
training_dataset_params = data_dict_single["training_data"]
validation_dataset_params = data_dict_single.get("validation_data", None)
validation_systems = (
validation_dataset_params["systems"] if validation_dataset_params else None
)
training_systems = training_dataset_params["systems"]
training_systems = process_systems(training_systems)
if validation_systems is not None:
validation_systems = process_systems(validation_systems)

# stat files
stat_file_path_single = data_dict_single.get("stat_file", None)
if rank != 0:
stat_file_path_single = None
elif stat_file_path_single is not None:
if not Path(stat_file_path_single).exists():
if stat_file_path_single.endswith((".h5", ".hdf5")):
with h5py.File(stat_file_path_single, "w") as f:
pass
else:
Path(stat_file_path_single).mkdir()
stat_file_path_single = DPPath(stat_file_path_single, "a")

# validation and training data
# avoid the same batch sequence among devices
rank_seed = (seed + rank) % (2**32) if seed is not None else None
validation_data_single = (
DpLoaderSet(
validation_systems,
validation_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
if validation_systems
else None
)
train_data_single = DpLoaderSet(
training_systems,
training_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
return (
train_data_single,
validation_data_single,
stat_file_path_single,
)

rank = dist.get_rank() if dist.is_available() and dist.is_initialized() else 0
data_seed = config["training"].get("seed", None)
if not multi_task:
(
train_data,
validation_data,
stat_file_path,
) = prepare_trainer_input_single(
config["model"],
config["training"],
rank=rank,
seed=data_seed,
)
else:
train_data, validation_data, stat_file_path = {}, {}, {}
for model_key in config["model"]["model_dict"]:
(
train_data[model_key],
validation_data[model_key],
stat_file_path[model_key],
) = prepare_trainer_input_single(
config["model"]["model_dict"][model_key],
config["training"]["data_dict"][model_key],
rank=rank,
seed=data_seed,
)

trainer = training.Trainer(
config,
train_data,
stat_file_path=stat_file_path,
validation_data=validation_data,
init_model=init_model,
restart_model=restart_model,
finetune_model=finetune_model,
force_load=force_load,
shared_links=shared_links,
finetune_links=finetune_links,
init_frz_model=init_frz_model,
)
return trainer

🛠️ Refactor suggestion

⚠️ Potential issue

Consider refactoring get_trainer for improved readability and maintainability.

The get_trainer function is well-structured but quite long. Consider breaking it down into smaller, more focused functions to improve readability and maintainability. For example, you could create separate functions for preparing single-task and multi-task inputs.

Additionally, there's an unused variable local_rank on line 108. If it's not needed, you can remove it to improve code clarity.

Apply this diff to remove the unused variable:

-    local_rank = int(local_rank)
📝 Committable suggestion


Suggested change
def get_trainer(
config,
init_model=None,
restart_model=None,
finetune_model=None,
force_load=False,
init_frz_model=None,
shared_links=None,
finetune_links=None,
):
multi_task = "model_dict" in config.get("model", {})
# Initialize DDP
local_rank = os.environ.get("LOCAL_RANK")
if local_rank is not None:
local_rank = int(local_rank)
assert dist.is_nccl_available()
dist.init_process_group(backend="nccl")
def prepare_trainer_input_single(
model_params_single, data_dict_single, rank=0, seed=None
):
training_dataset_params = data_dict_single["training_data"]
validation_dataset_params = data_dict_single.get("validation_data", None)
validation_systems = (
validation_dataset_params["systems"] if validation_dataset_params else None
)
training_systems = training_dataset_params["systems"]
training_systems = process_systems(training_systems)
if validation_systems is not None:
validation_systems = process_systems(validation_systems)
# stat files
stat_file_path_single = data_dict_single.get("stat_file", None)
if rank != 0:
stat_file_path_single = None
elif stat_file_path_single is not None:
if not Path(stat_file_path_single).exists():
if stat_file_path_single.endswith((".h5", ".hdf5")):
with h5py.File(stat_file_path_single, "w") as f:
pass
else:
Path(stat_file_path_single).mkdir()
stat_file_path_single = DPPath(stat_file_path_single, "a")
# validation and training data
# avoid the same batch sequence among devices
rank_seed = (seed + rank) % (2**32) if seed is not None else None
validation_data_single = (
DpLoaderSet(
validation_systems,
validation_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
if validation_systems
else None
)
train_data_single = DpLoaderSet(
training_systems,
training_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
return (
train_data_single,
validation_data_single,
stat_file_path_single,
)
rank = dist.get_rank() if dist.is_available() and dist.is_initialized() else 0
data_seed = config["training"].get("seed", None)
if not multi_task:
(
train_data,
validation_data,
stat_file_path,
) = prepare_trainer_input_single(
config["model"],
config["training"],
rank=rank,
seed=data_seed,
)
else:
train_data, validation_data, stat_file_path = {}, {}, {}
for model_key in config["model"]["model_dict"]:
(
train_data[model_key],
validation_data[model_key],
stat_file_path[model_key],
) = prepare_trainer_input_single(
config["model"]["model_dict"][model_key],
config["training"]["data_dict"][model_key],
rank=rank,
seed=data_seed,
)
trainer = training.Trainer(
config,
train_data,
stat_file_path=stat_file_path,
validation_data=validation_data,
init_model=init_model,
restart_model=restart_model,
finetune_model=finetune_model,
force_load=force_load,
shared_links=shared_links,
finetune_links=finetune_links,
init_frz_model=init_frz_model,
)
return trainer
def get_trainer(
config,
init_model=None,
restart_model=None,
finetune_model=None,
force_load=False,
init_frz_model=None,
shared_links=None,
finetune_links=None,
):
multi_task = "model_dict" in config.get("model", {})
# Initialize DDP
local_rank = os.environ.get("LOCAL_RANK")
if local_rank is not None:
assert dist.is_nccl_available()
dist.init_process_group(backend="nccl")
def prepare_trainer_input_single(
model_params_single, data_dict_single, rank=0, seed=None
):
training_dataset_params = data_dict_single["training_data"]
validation_dataset_params = data_dict_single.get("validation_data", None)
validation_systems = (
validation_dataset_params["systems"] if validation_dataset_params else None
)
training_systems = training_dataset_params["systems"]
training_systems = process_systems(training_systems)
if validation_systems is not None:
validation_systems = process_systems(validation_systems)
# stat files
stat_file_path_single = data_dict_single.get("stat_file", None)
if rank != 0:
stat_file_path_single = None
elif stat_file_path_single is not None:
if not Path(stat_file_path_single).exists():
if stat_file_path_single.endswith((".h5", ".hdf5")):
with h5py.File(stat_file_path_single, "w") as f:
pass
else:
Path(stat_file_path_single).mkdir()
stat_file_path_single = DPPath(stat_file_path_single, "a")
# validation and training data
# avoid the same batch sequence among devices
rank_seed = (seed + rank) % (2**32) if seed is not None else None
validation_data_single = (
DpLoaderSet(
validation_systems,
validation_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
if validation_systems
else None
)
train_data_single = DpLoaderSet(
training_systems,
training_dataset_params["batch_size"],
model_params_single["type_map"],
seed=rank_seed,
)
return (
train_data_single,
validation_data_single,
stat_file_path_single,
)
rank = dist.get_rank() if dist.is_available() and dist.is_initialized() else 0
data_seed = config["training"].get("seed", None)
if not multi_task:
(
train_data,
validation_data,
stat_file_path,
) = prepare_trainer_input_single(
config["model"],
config["training"],
rank=rank,
seed=data_seed,
)
else:
train_data, validation_data, stat_file_path = {}, {}, {}
for model_key in config["model"]["model_dict"]:
(
train_data[model_key],
validation_data[model_key],
stat_file_path[model_key],
) = prepare_trainer_input_single(
config["model"]["model_dict"][model_key],
config["training"]["data_dict"][model_key],
rank=rank,
seed=data_seed,
)
trainer = training.Trainer(
config,
train_data,
stat_file_path=stat_file_path,
validation_data=validation_data,
init_model=init_model,
restart_model=restart_model,
finetune_model=finetune_model,
force_load=force_load,
shared_links=shared_links,
finetune_links=finetune_links,
init_frz_model=init_frz_model,
)
return trainer
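
The per-rank seed derivation in `prepare_trainer_input_single` (`rank_seed = (seed + rank) % (2**32)`) is what keeps batch sequences distinct across devices while staying reproducible per rank. A minimal stdlib-only sketch of the idea (the helper name and sample sizes here are hypothetical, not from the PR):

```python
import random

def rank_seed(seed, rank):
    # offset the base seed by the device rank, kept in 32-bit range
    return (seed + rank) % (2**32) if seed is not None else None

base_seed = 12345
# each rank shuffles with its own derived seed, so the batch order
# typically differs across ranks but is reproducible for a given rank
order_rank0 = random.Random(rank_seed(base_seed, 0)).sample(range(8), 8)
order_rank1 = random.Random(rank_seed(base_seed, 1)).sample(range(8), 8)
order_rank0_again = random.Random(rank_seed(base_seed, 0)).sample(range(8), 8)
```

The modulo keeps the derived seed inside the 32-bit range that many RNG seeding APIs accept, and `seed=None` propagates through unchanged so unseeded runs stay unseeded.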
🧰 Tools
🪛 Ruff

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)

🪛 GitHub Check: CodeQL

[notice] 108-108: Unused local variable
Variable local_rank is not used.

):
super().__init__(type_map, **kwargs)
ntypes = len(type_map)
self.type_map = type_map
Contributor

⚠️ Potential issue

Avoid overwriting type_map attribute from superclass

The assignment self.type_map = type_map overwrites the type_map attribute already set in the superclass BaseAtomicModel during super().__init__(type_map, **kwargs). This could lead to unintended side effects if the superclass relies on the original type_map.

Consider removing the redundant assignment or ensure that overwriting type_map does not cause any issues.

🧰 Tools
🪛 GitHub Check: CodeQL

[warning] 60-60: Overwriting attribute in super-class or sub-class
Assignment overwrites attribute type_map, which was previously defined in superclass BaseAtomicModel.
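
To make the hazard concrete, here is a contrived sketch (not the DeePMD classes; `MASKED_TOKEN` is invented for illustration) where the superclass normalizes the value it stores and a re-assignment in the subclass silently discards that normalization:

```python
class Base:
    def __init__(self, type_map):
        # the superclass may store a processed form of the argument
        self.type_map = list(type_map) + ["MASKED_TOKEN"]

class Sub(Base):
    def __init__(self, type_map):
        super().__init__(type_map)
        # re-assigning here throws away whatever Base set up
        self.type_map = type_map

base_view = Base(["O", "H"]).type_map  # includes the superclass's extra entry
sub_view = Sub(["O", "H"]).type_map    # superclass normalization lost
```

If the superclass does no processing, the re-assignment is merely redundant; the point of the warning is that this invariant is easy to break later without noticing.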

Comment on lines 72 to 74
def string_to_array(s: str) -> int:
return [ord(c) for c in s]

Contributor

⚠️ Potential issue

Correct the return type annotation of string_to_array function

The string_to_array function is annotated to return int, but it actually returns a List[int]. This mismatch may lead to confusion or type-checking issues.

Apply this diff to correct the return type annotation:

-def string_to_array(s: str) -> int:
+def string_to_array(s: str) -> List[int]:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def string_to_array(s: str) -> int:
return [ord(c) for c in s]
def string_to_array(s: str) -> List[int]:
return [ord(c) for c in s]
Note: The suggestion includes the correct type annotation as per the diff snippet. However, to ensure that the `List` type is properly recognized, you may need to import it from the `typing` module at the beginning of the file if it's not already imported:

```python
from typing import List
```

Comment on lines +117 to +120
self.fitting_net.output_def()
if self.fitting_net is not None
else self.coord_denoise_net.output_def()
)
Contributor

⚠️ Potential issue

Handle potential None values in fitting_output_def

In the fitting_output_def method, if both self.fitting_net and self.coord_denoise_net are None, an AttributeError may be raised. Consider adding a check to handle this scenario gracefully.

Apply this diff to add error handling:

         def fitting_output_def(self) -> FittingOutputDef:
             """Get the output def of the fitting net."""
-            return (
-                self.fitting_net.output_def()
-                if self.fitting_net is not None
-                else self.coord_denoise_net.output_def()
-            )
+            if self.fitting_net is not None:
+                return self.fitting_net.output_def()
+            elif self.coord_denoise_net is not None:
+                return self.coord_denoise_net.output_def()
+            else:
+                raise ValueError("Both fitting_net and coord_denoise_net are None.")
📝 Committable suggestion


Suggested change
self.fitting_net.output_def()
if self.fitting_net is not None
else self.coord_denoise_net.output_def()
)
def fitting_output_def(self) -> FittingOutputDef:
"""Get the output def of the fitting net."""
if self.fitting_net is not None:
return self.fitting_net.output_def()
elif self.coord_denoise_net is not None:
return self.coord_denoise_net.output_def()
else:
raise ValueError("Both fitting_net and coord_denoise_net are None.")
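
The fail-fast variant can be exercised in isolation; this sketch stands in for the two optional networks with plain objects (all names hypothetical, not the DeePMD classes):

```python
class _Net:
    def __init__(self, name):
        self.name = name

    def output_def(self):
        return f"{self.name}-output-def"

class Model:
    def __init__(self, fitting_net=None, coord_denoise_net=None):
        self.fitting_net = fitting_net
        self.coord_denoise_net = coord_denoise_net

    def fitting_output_def(self):
        # prefer the fitting net, fall back to the denoise net,
        # and raise a clear error instead of an AttributeError
        if self.fitting_net is not None:
            return self.fitting_net.output_def()
        elif self.coord_denoise_net is not None:
            return self.coord_denoise_net.output_def()
        else:
            raise ValueError("Both fitting_net and coord_denoise_net are None.")

fit_def = Model(fitting_net=_Net("fit")).fitting_output_def()
denoise_def = Model(coord_denoise_net=_Net("dn")).fitting_output_def()
try:
    Model().fitting_output_def()
    raised = False
except ValueError:
    raised = True
```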

deepmd/pd/model/atomic_model/dp_atomic_model.py (outdated comment, resolved)
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between b3a6408 and 72241ea.

📒 Files selected for processing (5)
  • backend/dynamic_metadata.py (3 hunks)
  • backend/find_paddle.py (1 hunks)
  • backend/read_env.py (5 hunks)
  • deepmd/pd/entrypoints/main.py (1 hunks)
  • deepmd/pd/model/atomic_model/dp_atomic_model.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • backend/dynamic_metadata.py
  • backend/find_paddle.py
  • backend/read_env.py
🧰 Additional context used
🪛 Ruff
deepmd/pd/entrypoints/main.py

132-132: Local variable f is assigned to but never used

Remove assignment to unused variable f

(F841)


573-576: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)

🪛 GitHub Check: CodeQL
deepmd/pd/entrypoints/main.py

[notice] 108-108: Unused local variable
Variable local_rank is not used.


[notice] 383-387: Commented-out code
This comment appears to contain commented-out code.


[notice] 447-452: Commented-out code
This comment appears to contain commented-out code.


[warning] 557-561: Unreachable code
This statement is unreachable.

deepmd/pd/model/atomic_model/dp_atomic_model.py

[warning] 60-60: Overwriting attribute in super-class or sub-class
Assignment overwrites attribute type_map, which was previously defined in superclass BaseAtomicModel.

Comment on lines +282 to +289
if self.pair_excl is not None:
pair_exclude_types = self.pair_excl.get_exclude_types()
for sample in sampled:
sample["pair_exclude_types"] = list(pair_exclude_types)
if self.atom_excl is not None:
atom_exclude_types = self.atom_excl.get_exclude_types()
for sample in sampled:
sample["atom_exclude_types"] = list(atom_exclude_types)
Contributor

⚠️ Potential issue

Potential AttributeError: Uninitialized attributes self.pair_excl and self.atom_excl

In the compute_or_load_stat method, the attributes self.pair_excl and self.atom_excl are accessed within wrapped_sampler, but they are not initialized in the __init__ method or elsewhere in the class. This could lead to an AttributeError at runtime.

Consider initializing these attributes in the __init__ method or adding checks before accessing them. Here's a possible fix:

+    def __init__(
+        self,
+        descriptor,
+        fitting,
+        type_map: List[str],
+        **kwargs,
+    ):
+        super().__init__(type_map, **kwargs)
+        # Initialize attributes
+        self.pair_excl = None
+        self.atom_excl = None

Or modify the conditional checks:

             @functools.lru_cache
             def wrapped_sampler():
                 sampled = sampled_func()
-                if self.pair_excl is not None:
+                if hasattr(self, 'pair_excl') and self.pair_excl is not None:
                     pair_exclude_types = self.pair_excl.get_exclude_types()
                     for sample in sampled:
                         sample["pair_exclude_types"] = list(pair_exclude_types)
-                if self.atom_excl is not None:
+                if hasattr(self, 'atom_excl') and self.atom_excl is not None:
                     atom_exclude_types = self.atom_excl.get_exclude_types()
                     for sample in sampled:
                         sample["atom_exclude_types"] = list(atom_exclude_types)

Committable suggestion was skipped due to low confidence.
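
Either fix can be demonstrated with a stripped-down stand-in for the model (names hypothetical): initializing the attributes to `None` makes the later `is not None` checks safe, while the `hasattr` guard tolerates objects that never defined them at all:

```python
class ModelWithInit:
    def __init__(self):
        # declare the optional exclusion handlers up front
        self.pair_excl = None
        self.atom_excl = None

    def exclude_types(self):
        out = {}
        if self.pair_excl is not None:
            out["pair"] = self.pair_excl.get_exclude_types()
        if self.atom_excl is not None:
            out["atom"] = self.atom_excl.get_exclude_types()
        return out

class ModelWithoutInit:
    def exclude_types(self):
        out = {}
        # hasattr guards against attributes that were never created
        if hasattr(self, "pair_excl") and self.pair_excl is not None:
            out["pair"] = self.pair_excl.get_exclude_types()
        if hasattr(self, "atom_excl") and self.atom_excl is not None:
            out["atom"] = self.atom_excl.get_exclude_types()
        return out

safe_a = ModelWithInit().exclude_types()     # no AttributeError
safe_b = ModelWithoutInit().exclude_types()  # no AttributeError
```

Initializing in `__init__` is usually preferable because it documents the attribute's existence in one place instead of scattering `hasattr` checks through the class.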

Comment on lines +229 to +230
if self.do_grad_r() or self.do_grad_c():
extended_coord.stop_gradient = False
Contributor

💡 Codebase verification

Missing Definitions for do_grad_r() and do_grad_c() Methods in DPAtomicModel

The methods do_grad_r() and do_grad_c() are not defined within the DPAtomicModel class or its superclass BaseAtomicModel. This absence will lead to an AttributeError when these methods are invoked.

  • File: deepmd/pd/model/atomic_model/dp_atomic_model.py
  • Lines: 229-230
🔗 Analysis chain

Verify the existence of do_grad_r() and do_grad_c() methods

The methods self.do_grad_r() and self.do_grad_c() are called, but they are not defined within the DPAtomicModel class. If these methods are not implemented in the superclass or elsewhere, this will result in an AttributeError.

Please ensure that do_grad_r() and do_grad_c() are properly defined or inherited. To verify their existence, you can run the following script:

This script searches for the definitions of do_grad_r and do_grad_c in all Python files. Ensure that these methods exist and are accessible to avoid runtime errors.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if methods `do_grad_r` and `do_grad_c` are defined in the class hierarchy.

# Search for the method definitions in the codebase
ast-grep --lang python --pattern $'def do_grad_r(self):\n    $$$'
ast-grep --lang python --pattern $'def do_grad_c(self):\n    $$$'

Length of output: 131


Script:

#!/bin/bash
# Description: Search for definitions of `do_grad_r` and `do_grad_c` methods across the entire Python codebase.

# Search for `do_grad_r` method definitions
rg 'def do_grad_r\s*\(' --type python

# Search for `do_grad_c` method definitions
rg 'def do_grad_c\s*\(' --type python

# Additionally, find the class hierarchy for DPAtomicModel to check inheritance
rg 'class DPAtomicModel\s*\(.*\):' --type python -A 5

Length of output: 228


Script:

#!/bin/bash
# Description: Search for definitions of `do_grad_r` and `do_grad_c` methods across all Python files, and identify the superclass of DPAtomicModel.

# Search for `do_grad_r` method definitions
rg 'def do_grad_r\s*\(' -g "*.py"

# Search for `do_grad_c` method definitions
rg 'def do_grad_c\s*\(' -g "*.py"

# Find the superclass of DPAtomicModel
rg 'class DPAtomicModel\s*\(([^)]+)\):' -g "*.py"

Length of output: 886


Script:

#!/bin/bash
# Description: Verify that `BaseAtomicModel` defines `do_grad_r` and `do_grad_c` methods.

# Find the file(s) where `BaseAtomicModel` is defined
rg 'class BaseAtomicModel\s*\(.*\):' -g "*.py"

# Assume the previous command outputs the file paths; replace <file_path> accordingly
# Search for `do_grad_r` method within `BaseAtomicModel`
rg 'def do_grad_r\s*\(' -g "*.py" --context 10

# Search for `do_grad_c` method within `BaseAtomicModel`
rg 'def do_grad_c\s*\(' -g "*.py" --context 10

Length of output: 13262
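
Until the methods are located or added, the failure mode is a bare runtime `AttributeError` at the call site; an explicit guard (purely illustrative, not the upstream fix) turns it into a clearer error and still works once a subclass supplies the flags:

```python
class BaseAtomicModel:
    pass

class DPAtomicModel(BaseAtomicModel):
    def needs_coord_gradient(self):
        # resolve the flags dynamically and fail with a clear message
        do_grad_r = getattr(self, "do_grad_r", None)
        do_grad_c = getattr(self, "do_grad_c", None)
        if do_grad_r is None or do_grad_c is None:
            raise NotImplementedError(
                "do_grad_r()/do_grad_c() must be provided by the model "
                "or its base class before gradients can be requested"
            )
        return do_grad_r() or do_grad_c()

class WithFlags(DPAtomicModel):
    def do_grad_r(self):
        return False

    def do_grad_c(self):
        return True

try:
    DPAtomicModel().needs_coord_gradient()
    missing = False
except NotImplementedError:
    missing = True

ok = WithFlags().needs_coord_gradient()
```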

Comment on lines +447 to +452
# elif FLAGS.INPUT.endswith(".json"):
# old_model = paddle.jit.load(FLAGS.INPUT[: -len(".json")])
# model_params_string = old_model.get_model_def_script()
# model_params = json.loads(model_params_string)
# old_state_dict = old_model.state_dict()
# model_state_dict = old_state_dict
Contributor

⚠️ Potential issue

Eliminate commented-out code or implement the intended functionality

The commented-out code in lines 447-452 suggests a planned feature for handling .json model files. If this functionality is not going to be implemented soon, consider removing the commented code. Otherwise, include a TODO comment to indicate future work.

Apply this diff to remove the commented-out code:

-# elif FLAGS.INPUT.endswith(".json"):
-#     old_model = paddle.jit.load(FLAGS.INPUT[: -len(".json")])
-#     model_params_string = old_model.get_model_def_script()
-#     model_params = json.loads(model_params_string)
-#     old_state_dict = old_model.state_dict()
-#     model_state_dict = old_state_dict
📝 Committable suggestion


Suggested change
# elif FLAGS.INPUT.endswith(".json"):
# old_model = paddle.jit.load(FLAGS.INPUT[: -len(".json")])
# model_params_string = old_model.get_model_def_script()
# model_params = json.loads(model_params_string)
# old_state_dict = old_model.state_dict()
# model_state_dict = old_state_dict
🧰 Tools
🪛 GitHub Check: CodeQL

[notice] 447-452: Commented-out code
This comment appears to contain commented-out code.

Comment on lines +383 to +387
# elif FLAGS.INPUT.split(".")[-1] == "pdmodel":
# model_params_string = paddle.jit.load(
# FLAGS.INPUT[: -len(".pdmodel")]
# ).model_def_script
# model_params = json.loads(model_params_string)
Contributor

⚠️ Potential issue

Remove commented-out code for clarity

The code between lines 383-387 is commented out. If this code is no longer needed, consider removing it to improve code readability. If it's intended for future implementation, add a descriptive comment explaining its purpose.

Apply this diff to remove the commented-out code:

-# elif FLAGS.INPUT.split(".")[-1] == "pdmodel":
-#     model_params_string = paddle.jit.load(
-#         FLAGS.INPUT[: -len(".pdmodel")]
-#     ).model_def_script
-#     model_params = json.loads(model_params_string)
📝 Committable suggestion


Suggested change
# elif FLAGS.INPUT.split(".")[-1] == "pdmodel":
# model_params_string = paddle.jit.load(
# FLAGS.INPUT[: -len(".pdmodel")]
# ).model_def_script
# model_params = json.loads(model_params_string)
🧰 Tools
🪛 GitHub Check: CodeQL

[notice] 383-387: Commented-out code
This comment appears to contain commented-out code.

Comment on lines +573 to +576
if not isinstance(args, argparse.Namespace):
FLAGS = parse_args(args=args)
else:
FLAGS = args
Contributor

🛠️ Refactor suggestion

Simplify the if-else block using a ternary operator

The if-else block in the main function can be simplified using a ternary operator for improved readability.

Apply this diff to refactor the code:

 def main(args: Optional[Union[List[str], argparse.Namespace]] = None):
-    if not isinstance(args, argparse.Namespace):
-        FLAGS = parse_args(args=args)
-    else:
-        FLAGS = args
+    FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args
📝 Committable suggestion


Suggested change
if not isinstance(args, argparse.Namespace):
FLAGS = parse_args(args=args)
else:
FLAGS = args
def main(args: Optional[Union[List[str], argparse.Namespace]] = None):
FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args
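
Both forms accept either raw argv or an already-parsed namespace; a quick check with a stubbed parser (the stub and its `--name` flag are hypothetical, not the real CLI):

```python
import argparse

def parse_args(args=None):
    # minimal stand-in for the real CLI parser
    parser = argparse.ArgumentParser()
    parser.add_argument("--name", default="dp")
    return parser.parse_args(args=args)

def main(args=None):
    # ternary form suggested by the linter: parse unless already parsed
    FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args
    return FLAGS

from_list = main(["--name", "water"])
from_ns = main(argparse.Namespace(name="water"))
```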
🧰 Tools
🪛 Ruff

573-576: Use ternary operator FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args instead of if-else-block

Replace if-else-block with FLAGS = parse_args(args=args) if not isinstance(args, argparse.Namespace) else args

(SIM108)

@github-actions github-actions bot added the Docs label Sep 29, 2024
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
README.md (3)

22-22: Approved with a minor suggestion for clarity.

The addition of Paddle as a supported backend is consistent with the PR objectives and enhances the framework's compatibility. Well done!

For improved clarity, consider rephrasing slightly:

- **interfaced with multiple backends**, including TensorFlow, PyTorch and Paddle, the most popular deep learning frameworks, making the training process highly automatic and efficient.
+ **interfaced with multiple backends**, including TensorFlow, PyTorch, and Paddle (three of the most popular deep learning frameworks), making the training process highly automatic and efficient.

This change adds a comma for proper list formatting and clarifies that these are examples of popular frameworks without implying they are the only popular ones.


75-77: Approved with suggestions for additional information.

The addition of version 3 highlights is appropriate and consistent with the PR objectives. However, to provide more value to readers, consider expanding on the new features:

#### v3

- Multiple backends supported. Add PyTorch and Paddle backend.
- The DPA-2 model.
+ - Multiple backends supported: Added PyTorch and Paddle backends, expanding compatibility and user choice.
+ - Introduced the DPA-2 model, enhancing [brief description of improvements or capabilities].
+ - [Any other significant features or improvements in v3]

This expansion would give readers a clearer understanding of the advancements in version 3. Please add any other significant features or improvements that are part of this major version update.


Line range hint 1-146: Overall assessment of README.md changes

The updates to the README.md file accurately reflect the addition of the Paddle backend and introduce the highlights of version 3. These changes align well with the PR objectives and effectively communicate the enhancements to potential users.

To further improve the documentation:

  1. Consider expanding on the capabilities and benefits of the new Paddle backend.
  2. Provide more details about the DPA-2 model and its advantages.
  3. If there are any other significant features or improvements in version 3, include them in the highlights section.

These additions would give readers a more comprehensive understanding of the new version's capabilities and improvements.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 72241ea and f1cae59.

📒 Files selected for processing (2)
  • README.md (2 hunks)
  • deepmd/pd/model/atomic_model/dp_atomic_model.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/pd/model/atomic_model/dp_atomic_model.py
