Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[meta] Potential fix for issue #801 #802

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion meta/saiserialize.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1268,7 +1268,8 @@ std::string sai_serialize_acl_resource_list(
return j.dump();
}

json arr = json::array();
json::array_t a;
json arr(a);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain why this fixes the existing issue? also if you grep this file by json::array() its used in 7 places, so it would require to be fixed in all of them

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line, "json arr = json::array();" calls constructor, "basic_json(init, false, value_t::array);" in file json.hpp:1615, whereas, "json::array_t a; json arr(a);" invokes different constructor "basic_json(const array_t& val)".

Here, the issue appears to be stack corruption, where passed parameter seems to have a value different from the one passed by the calling method. We see this issue only in this flow on API "sai_serialize_acl_resource_list".
When modified the constructor call to pass only 2 arg and assign 3 arg from default value as below, issue is not seen.
"basic_json(init, false);" in file json.hpp:1615

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this is actual root cause, since the same exact line is used in different serialization methods in this file as well, and if they don't cause same error, then this could potentially by compiler bug or some memory alignment, for example same constructor us used in sai_serialize_json_fdb_event_notification_data and sai_serialize_fdb_event_ntf or sai_serialize_port_oper_status_ntf. So if you don't hit this error anywhere else, then it ether means that you don't receive those notifications in your case (if error is persistent), or the error is related to some compiler/optimization. Also, this error is not triggered on x86/64 platform, only on armhf, which would suggest armhf platform is flawed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will you be able to see if you will hit this error without -O2 enabled ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, do you have idea which actual line in json.hpp is causing this error? maybe it's better to fix this in json.hpp rather than here, if many different components may use json.hpp and this array constructor

Copy link
Contributor

@tahmed-dev tahmed-dev Mar 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @lguohan, it is not clear to me from the stack backtrace why boost v 1.71 is at play here. @kcudnik are we using API across libraries that are using boost? My understanding is the swss-common had compile time dependency on boost smart pointers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

boost was recently added as dependency as am' aware, and sairedis library is not currently using any boost bindings

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was in azure pipeline build docker. Wondering if @rajkumar38 has mismatched binaries by any chance???

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was in azure pipeline build docker. Wondering if @rajkumar38 has mismatched binaries by any chance???

I believe all packages are installed in build docker using apt-get command and I don't see any package mismatch (verified with dpkg -l in build docker env)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will you be able to see if you will hit this error without -O2 enabled ?

With changes from PR 6532, I recompiled libswsscommon, libsairedis and libsaimetadata debs with O2 flag disabled. With this, I'm not seeing the crash.
Issue is seen when PR 6532 and O2 flag enabled together.
I don't think any compiler settings changed recently and is not clear how PR 6532 is causing the regression.


for (uint32_t i = 0; i < aclresource.count; ++i)
{
Expand Down