Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various cpp fixes for Sun /usr/lib/cpp compatibility #12

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

richlowe
Copy link
Contributor

This contains a chunk of fixes to improve compatible with Sun cpp (or
correctness in general). It started out as a fix for TritonDataCenter/smartos-live#171
but grew as more problems became apparent.

There are still at least 2 issues outstanding:

  1. As documented in the first non-HACK commit, our treatment of spaces
    in macros is, while considerably better, still not strictly compatible
    in a couple of ways that might be important

  2. #if conditions are not macro-expanded prior to being processed, such
    that #if SOME_MACRO(5) is a syntax error. This is proving particularly
    frustrating to fix.

This will definitely require more testing on the SmartOS side to be really sure
it's safe, I'd recommend a complete platform build, testing of any SDC
bits that use dtrace -C, at least. I'd also probably suggest running
test.sh | squash.py over a SmartOS /usr/include and checking for
anything that would affect the parseability of the result. (this is why
I left them in the source).

- test.sh: run /usr/lib/cpp and ./cpp against either specified headers,
  or all (most) headers in /usr/include, and report on differences.

- squash.py: coalesce the results of test.sh (expected on stdin), so
  each diff hunk is reported only once (line numbers are ignored,
  context is not)
This is a dubiously useful hack, but invaluable when debugging the
preprocessor.

If CPP_DEBUG_DEFINITIONS is in the environment, output to stderr at
every macro definition its name, and the defining file and line number,
such that the operative definition of common macros (such as __P) can be
determined.
Sun cpp removes all leading and trailing space from a macro pasting, and
compress whitespace in the macro body to a single space character.

There is some deviation from this in the Sun implementation which we
don't duplicate.

1) The presence of comments in the macro body affect the minimization of
   runs of spaces.

2) When newlines are encountered in the parameter list of a macro
   invocation, Sun cpp inserts that many newlines prior to any of the
   pasted text, and then in the pasted text pastes those newlines as
   (minimized) spaces.  Escaped new-lines are de-escaped, and otherwise
   treated similarly (in effect, the \ is removed).
We should only error about an unterminated macro parameter list if a
parameter list was begun.  If we never even seen the _first_
parenthesis, there is nothing to terminate.

Previously, we would set the parenthesis level to -1 when expecting
parameters (such that when we saw the opening paren it became the 0th
level), but when checking for unterminated expansion we would strictly
compare to 0, and thus flag a macro which needed parameters but lacked
them as having an unterminated parameter list.
… state

Previously, we would parse a macro foo() presuming that we would always
see the opening parenthesis indicating the begining of the parameter
list.  If we saw 'foo' we would consume the token _following_ 'foo'
presuming it would be the parenthesis, and if it was not, would not
paste it to the output.
<inet/tcp.h> requires considerably more defines to preprocess than the
4,000 we previously allowed.  Allow an (equally arbitrary) 16,000
symbols instead.
The previous implementation would parse 0x7ff as 0x755 (etc).
@richlowe
Copy link
Contributor Author

This is work I've been talking to @rmustacc about. It needs more testing (and probably more effort) before it's really suitable to merge.

@rmustacc
Copy link
Contributor

rmustacc commented Oct 6, 2013

@jclulow I heard that you were going to be testing some of these changes, should I be doing that instead?

@richlowe
Copy link
Contributor Author

richlowe commented Oct 7, 2013

I'm more concerned with review than testing. Making sure SDC isn't broken, etc, is important, but the scripts I attached and their results (and just how plainly wrong some of the prior behaviour was) suggest it's pretty unlikely. The major concern I have is that the code is so interestingly structured (because here on the 11/34, we don't have much memory to play with!), that I could have screwed it entirely and have it still just about work right :\

I think that fixing cpp like this is probably the right thing to do short-term, but I pretty firmly believe that -- given just how broken this cpp is and was -- that switching to an ANSI cpp while perhaps wrong is actually not any worse. I think the damage to dtrace -C would be real, but minimal (and probably defensible). The real problem would be how much pain it caused as.

plitc added a commit to plitc/illumos-extra that referenced this pull request Nov 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants