-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
【PIR Dist Op Reg No.28】 reg dgc #62781
Conversation
你的PR提交成功,感谢你对开源项目的贡献! |
5b3b1bd
to
2dc683d
Compare
@@ -454,6 +454,13 @@ | |||
optional : in_accum, in_state, out_scale, out_accum, out_state | |||
inplace : (scale -> out_scale, in_accum -> out_accum, in_state -> out_state) | |||
|
|||
- op : dgc | |||
args : (Tensor u, Tensor v, Tensor grad, Tensor param, Tensor current_step, Tensor nranks, float[] sparsity, , float m=0.9, bool use_nesterov=true, float rampup_begin_step=0.0, float rampup_step=0.0, float regular_coeff=0.0, int regular_type=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
多打了逗号
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
感谢已修改,请问一下新的报错是注册算子的时候还有地方写错了吗
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
从代码来看,
Paddle/paddle/fluid/operators/CMakeLists.txt
Lines 134 to 135 in 65126fa
if (WITH_DGC) | |
op_library(dgc_op DEPS dgc) |
这里应该是对应流水线没有开启WITH_DGC选项,需要修改CMakeLists,
在第30行前加入这样的判断:
if(NOT WITH_DGC)
list(REMOVE_ITEM TEST_INTERP_CASES test_dgc_translate)
endif()
多谢,已修改 |
这里这样修改后似乎还是遇到老问题 |
if(NOT WITH_DGC) | ||
list(REMOVE_ITEM TEST_INTERP_CASES test_dgc_translator) | ||
endif() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
加错地方了,加在test/ir/pir/translator/CMakeLists.txt里
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
抱歉,不知道怎么加在那边了
@kangguangli 这个麻烦再看一下,还是原来的问题 |
现在的报错是在翻译过程中报错:
这个错误是说翻译的时候在旧IR算子中没有发现 param 这个输入,一般是由于param为空导致的,原因是 旧IR下有些输入没有正确的声明optional,后续遇到此类问题,可以在算子的yaml里添加下面的声明试试。
|
谢谢,已这样修改,但有点困惑,哪里可以看到旧IR算子中缺少这个输入呢 |
从报错提示中可以看出旧IR中缺少这个参数,如果对相关代码感兴趣,可以看看这个目录下的代码paddle/fluid/ir_adaptor/translator,PFCC仓库应该也有一份文档。 另外,PR模版现在有变更,麻烦按照 https://github.com/PaddlePaddle/Paddle/wiki/PULL-REQUEST-TEMPLATE--REFERENCE 修改一下,我们的任务应该属于 执行架构(Execute Infrastructure)和 框架本身开发的改动。 |
多谢已修改 |
args : (Tensor u, Tensor v, Tensor grad, Tensor param, Tensor current_step, Tensor nranks, float[] sparsity, float m=0.9, bool use_nesterov=true, float rampup_begin_step=0.0, float rampup_step=0.0, float regular_coeff=0.0, int regular_type=0) | ||
output : Tensor(u_out), Tensor(v_out), Tensor(encode_grad), Tensor(grad_out), Tensor(k), Tensor(gather_buff) | ||
kernel : | ||
func : dgc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func : dgc | |
func : dgc | |
param: [u, v, ...] |
在kernel里面需要指定传给dgc_kernel参数,顺序和kernel参数一致,其他地方不用修改
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
多谢,已修改,请问一下这里为什么要添加呢
@kangguangli 请问这个pr模版有什么问题吗,按照格式修改过了? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR模版需要把原来的注释部分删除
output : Tensor(u_out), Tensor(v_out), Tensor(encode_grad), Tensor(grad_out), Tensor(k), Tensor(gather_buff) | ||
kernel : | ||
func : dgc | ||
param : [u, v, grad, param, current_step, nranks, sparsity, m, use_nesterov, rampup_begin_step, rampup_step, regular_coeff, regular_type] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个顺序和kernel顺序不完全一致,再和paddle/phi/kernels/gpu/dgc_kernel.cu里的DGCKernel对照下
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改,因为这里我看sparsity没有提到默认值所以提前了一下
f5be4ad
to
68bf75c
Compare
@kangguangli 这个麻烦review一下,windows ci挂了 |
@@ -0,0 +1,77 @@ | |||
# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2023 -> 2024
* feat(pir): reg dgc * feat(pir): dgc * feat(pir): dgc * feat(pir): reg dgc * feat(pir): reg dgc * feat(pir): reg dgc * feat(pir): reg dgc * feat(pir): reg dgc * feat(pir): reg dgc
* feat(pir): reg dgc * feat(pir): dgc * feat(pir): dgc * feat(pir): reg dgc * feat(pir): reg dgc * feat(pir): reg dgc * feat(pir): reg dgc * feat(pir): reg dgc * feat(pir): reg dgc
PR Category
Execute Infrastructure
PR Types
Devs
Description
#60436
注册算子dgc