-
Notifications
You must be signed in to change notification settings - Fork 186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reimplement PCRE2_UNREACHABLE() assertions with a slightly safer approach #490
base: master
Are you sure you want to change the base?
Conversation
One comment: the
I think the implementation in this PR only meets point 1 but not point 2; but if that is not correct, please mention. |
You are correct, there is no production implementation and therefore there is no optimization hint either (except for DEBUG builds by the mentioned attribute that is expected from abort(), but that of course is implementation dependent). Interestingly enough from all the places where it was used, the only place where it would make a difference was the function fixed in #489, and the difference wouldn't be good for our users. If the assertion was ever reached, and their library was built with a recent enough compiler in an x86 system, they will get a crash, instead of getting an error, because as you clearly pointed out, the "return" call itself would be removed at compile time; was that behaviour what was expected from that "optimization"? |
In answer to that question, what was expected was that if we assert a certain path is unreachable, that path should really be unreachable, and because the compiler is told that is unreachable, it can (in some cases) generate better code. For example, if you tell the compiler that the If the unreachable path is really unreachable, then there will be no crashes in production (because it will never be reached). On the other hand, if we are not able to guarantee, through whatever method, that our "unreachable" code is really unreachable... that is a problem. 😦 Since PCRE2 has significant installed base, it would be nice if the codebase was at a high enough level of quality that we could have full confidence that the "unreachable" paths are really unreachable. In that case, they could safely benefit from However, if the codebase is not at that level of quality yet, then you are very right that In any case, thanks very much for looking into this issue. Another thought: It would be nice if, in production builds, all instances of |
Just one suggestion: I think the assertion failure message in this PR would be easier to understand if it said something like "Execution must not reach this point" or "Execution reached unexpected point". |
Reaching an unreachable code is always a big problem, since you cannot write tests to see what happens. Hence whatever we do, we probably have issues. I don't mind unreachable in matching code, because its purpose is high performance. For pattern compilation, we could use errors, although the problem is the same: we cannot test them. |
I would prefer no unreachables in production, as they are additional code, their behaviour changes from compiler to compiler, and even between versions of the same compiler and have bugs, but agree that since they are meant to be really unreachable (assuming the original selection was done carefully enough) they could be used to maybe improve performance. Anyway, as requested by Alex updated the series to also modify behaviour in non debug builds and cover both use cases and updated the PR subject to reflect that. |
Btw I would prefer an assertion before all internal errors. When I develop code, and get those errors, it is too hard to find their sources. |
Carlo, how would this be:
Just a suggestion. |
Suggestion: Aside from PCRE2_ASSERT, can we add another macro which expands to a failing assertion (abort) in debug builds, but This macro must only be used in functions which return |
sure, fork away and go ahead, I promise not to squash this until all discussion is settled and review your code to suggest improvements. note though that having different assertion per type of function is going to be messy, since each function has different ways to report those errors and even some have none, so IMHO might be easier to open code each as needed using |
src/pcre2_util.h
Outdated
#else | ||
#define PCRE2_UNREACHABLE() do {} while(0) | ||
#warning "missing implementation of PCRE2_UNREACHABLE()" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a strong requirement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know and I didn't like it either, but how else can we let people know that their compiler might silently continue running code after reaching something labeled as PCRE2_UNREACHABLE()
.
This whole funny business with #489 where we could reasonable think it was ok to delete the fail safe code twice in a month, makes me think this is too confusing and needs at least a stronger "heads up".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW removed it from the latest revisions, because it was too painful to have it when building on anything outside the 3 compilers we have implementations for, but that means we also have to be extra careful in the opposite direction, because something like:
case THIS:
do_good();
PCRE2_UNREACHABLE();
case THAT:
do_bad();
is perfectly valid code and will build cleanly on those 3 compilers we have in CI, but result in a dangerous fall through when built by a different compiler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Carlo, just a thought... Aside from gcc, clang, and MS Visual C++, do you know any other C compilers which are commonly used to build PCRE2? If so, I can do a bit of research and see if they have a similar built-in.
I understand that even so, the issue will still remain that some users, somewhere may use a compiler for which PCRE2_UNREACHABLE is not implemented. But it wouldn't be bad to have implementations for as many compilers as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you know any other C compilers which are commonly used to build PCRE2?
Oracle's (AKA Sun) Studio, the latest release doesn't have this (with the C++ version implemented as a NOP internally, so it will even be detected and used but still behave as erratically as described above). IBM's xlc and DEC's for the VMS are also likely common.
The important point though is not that we "could" add support for them, but the fact that we are opening ourselves to issues like this by using this builtins() that are specifically described as triggering "undefined behaviour" and so will need to be more careful going forward on our code review (I guess) or have to invent something else to protect us from that.
The part we seem to be having difficulty getting to an understanding is that there are cases were some code that would be normally unreachable and might be at a specific point when audited, could be still affected by future changes and in those cases, removing apparent dead code that was there to be a safety net for those cases (either explicitly or even worst, implicitly through dead code elimination by a compiler) is not a good idea and need to be handled differently (at least in non debug builds)
this was implemented in the last revision, but with the caveat that we need the exception from the previous point in at least one case (again only needed for the non debug builds, although it could also be applied there for consistency)
returning PCRE2_ERROR_INTERNAL might not be what is required on all places were this assert was introduced originally, it would also seem like a lot of work for something that could be done without the intermediate step and accomplished at the end of this series if you just did the promised audit for each entry over it, or are you implying that none of those asserts were audited previously and are therefore unsafe? as stated in the subject, my end objective is not to remove all those assertions but just make sure that it is clear how to use them going forward (even by changing their names so there is no longer confusion) and their use is safe from unexpected side effects in all the places were they are currently being used. |
Sure, let's do that... it was just an idea. |
2 occurrences of Recommend that Some occurrences in pcre2_compile.c are definitely safe. But as Zoltan mentioned in this thread, compilation is less performance-sensitive than matching... so maybe it's better to use I do think that the name Other options include: |
Fair enough, then will post a v4 that would use FWIW as per the "audit", I was expecting commits not comments, which might be better to hold until that update is pushed, so no unnecessary conflicts are created; presume you are ok with my "POC" audit in patch 4?, your comment would seem to imply you agree but it will be easier if the comment would be in the code instead; will hold othewise pushing an update if you would rather see changes so I can include them with it. |
Sure, I will provide some commits then. Yes, patch 4 looks good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I usually prefer one patch.
Very nice patch! Shall we land it? |
Will mark as ready, as soon as I complete testing |
src/pcre2_auto_possess.c
Outdated
@@ -456,6 +456,7 @@ switch(c) | |||
{ | |||
/* Early return if there is not enough space. This should never | |||
happen, since all clists are shorter than 5 character now. */ | |||
PCRE2_DEBUG_UNREACHABLE(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs more than just an assert based on a comment IMHO, and changes not only here.
the data this is using is generated from the Unicode tables by maint/GenerateUcd.py so the comment is not completely correct and could change when Unicode is updated.
not sure what the "correct" thing to do should be but it at least includes:
- GenerateUcd.py is made aware of this limit and either prints a warning if exceeded or refuses to create the table with a suitable sugestion on what to do next.
- The limit is increased or a test is added showing what the behaviour is and this "stops" being "dead code"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in any case, included a little more information that could be used to start a fix for this and that could be also used IMHO as a good example on how to use these macros.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that Carlo has added a MAX_LIST variable in GenerateUcd.py, though it doesn't seem to be used yet in that script.
Would adding a check around line 851 that throws an exception if len(s) > (MAX_LIST - 3) be a good idea? Then anyone who is regenerating the property tables will notice if this is a problem.
The reason for comparing len(s) to MAX_LIST - 3 is because two slots in list are used for other purposes, and it needs to be terminated with the special value NOTACHAR, so the number of slots available for codepoints are MAX_LIST - 3.
If this ever happens, simply increasing the value of MAX_LIST in pcre2_auto_possess.c and GenerateUcd.py is all that is needed... then the stack-allocated arrays used to store those values will get bigger and everything will be OK.
I'm just thinking more about this... If we adjust GenerateUcd.py so that it crashes if ever ucd_caseless_sets are too big... do we still need the fail-safe in pcre2_auto_possess.c? Would it be enough to simply assert PCRE2_ASSERT(clist_dest < list + MAX_LIST)?
This is not simply a technical issue, but more of a "project policy" issue.
This whole time, there was (apparently) a bug in |
6e81c9f
to
11bfded
Compare
@alexdowad: do you have any comments on the additional points raised?, I think it would be nice to at least document them a little better if you want to keep them in this PR; otherwise I am tempted to (as you suggested) remove them and then you could pick them up and reintroduce them one by one after more careful consideration. FWIW, I am trusting all the original ones (and maybe some of the new ones that got lost in the sea of changes) were audited and are safe, otherwise. |
I'll be happy to go through and add more comments. Apologies, but my life has become a bit busy in the last couple of days, so it will be appreciated if you can allow 48 hours or so.
I would be happy to make a full pass through again and double-check all occurrences of PCRE2_{,DEBUG_}UNREACHABLE. |
Sorry, there is no need to worry or apologize about that, we are volunteers anyway and you can take as much time as needed. |
Opened a PR improving comments in this PR a bit more; see PR 3 in carenas/debugass. |
I have read through the entire PR diff again. Aside from a couple small improvements to comments, I don't personally see anything else to improve right now. I think the rationale for Aside from There a number of other UNREACHABLE and DEBUG_UNREACHABLE assertions in this PR which are not specifically commented. For most of them, the function has obviously been designed to exit from inside an "infinite loop", and if ever the statement after that loop is reached, then something which the author did not anticipate must have happened. Is there anything else in particular which we should say in the comments? I think |
Ah, one more thing... should I fuzz this PR for a while to see if any of the assertions are triggered? |
7f73cb5
to
b6da3c5
Compare
The original asserts weren't very useful in debug mode as they were lacking information on where they were being triggered and were also unreliable and dangerous as they could result in important code being removed and trigger crashes (even in non debug mode). Instead of implementing one generic assert for both modes, build a more useful one for each one, so PCRE2_UNREACHABLE() could be also used in non debug builds to help with optimization. Reinstate all original assertions to use the new versions, which will have the sideeffect of fixing indentation issues introduced in the original, and include additional asserts that were provided as the original ones were being audited for safety. Note that during such audit the use of the original asserts might had been refactored so it also includes all those relevant code changes. While at it, update cmake and CI to help with testing as well as other documentation. Co-authored-by: Alex Dowad <alexinbeijing@gmail.com>
Reimplement the original assertions to use different versions for debug and non debug builds.
Non debug asserts use compiler specific builtins that are not safe to invoke, and therefore their use has been audited to paths that should NEVER be executed.
Co-Authored-by: Alex Dowad alexinbeijing@gmail.com