Skip to content
This repository has been archived by the owner on Jul 5, 2024. It is now read-only.

Investigate error log on ci light-test padding_len overflow #1507

Closed
hero78119 opened this issue Jun 30, 2023 · 9 comments · Fixed by #1515
Closed

Investigate error log on ci light-test padding_len overflow #1507

hero78119 opened this issue Jun 30, 2023 · 9 comments · Fixed by #1515
Assignees
Labels
T-bug Type: bug

Comments

@hero78119
Copy link
Member

hero78119 commented Jun 30, 2023

What command(s) is the bug in?

on ci light test of all the exiting pull requests.

Describe the bug

Can find this error log in all the existing pull request ci light test.

For example https://github.com/privacy-scaling-explorations/zkevm-circuits/actions/runs/5413925634/jobs/9840226199?pr=1500

Light test is pass, however this error shows a concern.
It seems only happened when one command matching more > 1 tests. Maybe related to recent modification of evm circuit cache.

[2023-06-29T15:26:28Z ERROR zkevm_circuits::witness::rw] RwMap::padding_len overflow, target_len: 41, rows_len: 42
[2023-06-29T15:26:28Z ERROR zkevm_circuits::witness::rw] RwMap::padding_len overflow, target_len: 41, rows_len: 42
[2023-06-29T15:26:28Z ERROR zkevm_circuits::witness::rw] RwMap::padding_len overflow, target_len: 41, rows_len: 42
[2023-06-29T15:26:28Z ERROR zkevm_circuits::witness::rw] RwMap::padding_len overflow, target_len: 41, rows_len: 42
[2023-06-29T15:26:28Z ERROR zkevm_circuits::witness::rw] RwMap::padding_len overflow, target_len: 41, rows_len: 42
[2023-06-29T15:26:28Z ERROR zkevm_circuits::witness::rw] RwMap::padding_len overflow, target_len: 41, rows_len: 42
[2023-06-29T15:26:28Z ERROR zkevm_circuits::witness::rw] RwMap::padding_len overflow, target_len: 34, rows_len: 35
[2023-06-29T15:26:28Z ERROR zkevm_circuits::witness::rw] RwMap::padding_len overflow, target_len: 34, rows_len: 35
[2023-06-29T15:26:28Z ERROR zkevm_circuits::witness::rw] RwMap::padding_len overflow, target_len: 34, rows_len: 35
[2023-06-29T15:26:29Z ERROR zkevm_circuits::witness::rw] RwMap::padding_len overflow, target_len: 34, rows_len: 35

Concrete steps to reproduce the bug. If it's able reproduce via testool, please share test_id from jenkins report

Can reproduce this error locally on main branch

RUST_LOG=DEBUG cargo test --features test --package zkevm-circuits --lib -- evm_circuit::execution::create::test::test_create_address_collision_error

This error not dedicated to create. Any command matching > 1 tests can reproduce

@hero78119 hero78119 added the T-bug Type: bug label Jun 30, 2023
@hero78119
Copy link
Member Author

Hi @davidnevadoc do you think is it related to the recent change #1445 ?

@hero78119
Copy link
Member Author

also + @ed255 maybe have any quick answer :)

@hero78119 hero78119 changed the title Fix ci light-test padding_len overflow Investigate error log on ci light-test padding_len overflow Jun 30, 2023
@ed255
Copy link
Member

ed255 commented Jun 30, 2023

Hi @davidnevadoc do you think is it related to the recent change #1445 ?

Yes, I think this comes directly related to this discussion #1445 (comment)

My guess is that there's a check that is too "overprotective" and logs an error even though the parameters are safe. The other option is that we're skipping enabling some necessary rows (and thus skipping their constraints), which could mean a soundness issue. So I believe this needs some investigation. I would also suggest to avoid logging errors without panicking in such situations; if the circuit parameters not enough to hold the full witness, we should panic.

@davidnevadoc
Copy link
Contributor

Yes, this was introduced in #1445 .
I was a bit on the fence about this issue because I too got this error. However, I think it is due to what @ed255 mentioned: An overprotective check.
By having max_rws +3 here


the error will go away. But I could not logically explain such a magic number so I left the +1 which is what I think makes sense according to the specs.

@hero78119 hero78119 mentioned this issue Jul 3, 2023
6 tasks
@hero78119
Copy link
Member Author

hero78119 commented Jul 3, 2023

Yes, this was introduced in #1445 . I was a bit on the fence about this issue because I too got this error. However, I think it is due to what @ed255 mentioned: An overprotective check. By having max_rws +3 here

the error will go away. But I could not logically explain such a magic number so I left the +1 which is what I think makes sense according to the specs.

Thanks for the hint. After I applied the max_rws +3 commit 360ad81 on word-lo-hi the error indeed gone.

However agree that we should investigate what's the root cause for why magic number + 2

github-merge-queue bot pushed a commit that referenced this issue Jul 3, 2023
### Description

This pr is to complete milestone `word-lo-hi` back to main branch.

Since `word-lo-hi` branch is out-of-sync with main branch due to few
features merged to main recently, in order to save review effort to
review 1 pr instead of 2 pr, a new branch `word-lo-hi-merge-main` is
created which covers
- [x] based on latest word-lo-hi branch
- [x] remove `_word` suffix and remove deprecated table field
- [x] also sync main and resolve conflict
- [x] Fix problem on main branch: applied a workaround `max_rwc + 1 + 2`
magic number mention in another issue
#1507
- [x] ~~investigate super circuit degree 9 -> 12~~
#1499
After fixed, degree 9 -> 10

With this pr, reviewer just need to review once.

### Issue Link


#1509

### Type of change

- [x] Breaking change (fix or feature that would cause existing
functionality to not work as expected)

### Contents

to complete milestone `word-lo-hi`

---------

Co-authored-by: Chih Cheng Liang <chihchengliang@gmail.com>
Co-authored-by: adria0.eth <5526331+adria0@users.noreply.github.com>
Co-authored-by: adria0 <adria0@nowhere>
Co-authored-by: Eduard S <eduardsanou@posteo.net>
Co-authored-by: Han <tinghan0110@gmail.com>
Co-authored-by: Paul <108982045+ChengYueJia@users.noreply.github.com>
Co-authored-by: Steven <asongala@163.com>
Co-authored-by: Rohit Narurkar <rohit.narurkar@protonmail.com>
Co-authored-by: KimiWu <leonhartwu@gmail.com>
Co-authored-by: Raphael <raphael.r.toledo@gmail.com>
Co-authored-by: Zhang Zhuo <mycinbrin@gmail.com>
Co-authored-by: naure <naure@users.noreply.github.com>
Co-authored-by: Aurélien Nicolas <info@nau.re>
Co-authored-by: Enrico Bottazzi <85900164+enricobottazzi@users.noreply.github.com>
Co-authored-by: Raphael <contact@raphael-toledo.com>
Co-authored-by: Wu Sung-Ming <wusm@sea.com>
Co-authored-by: Leo Lara <leolara@users.noreply.github.com>
@hero78119 hero78119 self-assigned this Jul 4, 2023
@hero78119
Copy link
Member Author

Yes, this was introduced in #1445 . I was a bit on the fence about this issue because I too got this error. However, I think it is due to what @ed255 mentioned: An overprotective check. By having max_rws +3 here

the error will go away. But I could not logically explain such a magic number so I left the +1 which is what I think makes sense according to the specs.

Hi @davidnevadoc , after L377, there is send_end_block which still push 2 RW op plus context lookup, so if wanna double confirm if we set max_rws = max_rws + 1 but not counting rw op happened in set_end_block is it still correct?

@hero78119
Copy link
Member Author

Tried to fixed in PR #1515

@davidnevadoc
Copy link
Contributor

there is send_end_block which still push 2 RW op plus context lookup

Oh there it is!! You are right, I missed those and that is why I left the +1. Now that we know where the other 2 come from I'm comfortable having the +3 👍

@hero78119
Copy link
Member Author

there is send_end_block which still push 2 RW op plus context lookup

Oh there it is!! You are right, I missed those and that is why I left the +1. Now that we know where the other 2 come from I'm comfortable having the +3 👍

Actually +1 + 1 (overall +2) work in this case, 😄 the pr should fixed it generally 👆

github-merge-queue bot pushed a commit that referenced this issue Jul 6, 2023
### Description

Fixed rws padding logic and padding error handling.

### Issue Link


#1507

### Type of change

- [x] Bug fix (non-breaking change which fixes an issue)

### Contents

- more documentation on padding logic
- fix max_rws calculation
- de-duplicated startop in end_block.tx by skip second one.
- panic instead of confused error log in padding calculation function.
@ChihChengLiang ChihChengLiang linked a pull request Jul 7, 2023 that will close this issue
1 task
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
T-bug Type: bug
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants