Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the terminal parser/adapter and related classes use modern, safe structures #3956

Merged
merged 19 commits into from
Dec 19, 2019

Conversation

miniksa
Copy link
Member

@miniksa miniksa commented Dec 13, 2019

Summary of the Pull Request

Refactors parsing/adapting libraries and consumers to use safer and/or more consistent mechanisms for passing information.

PR Checklist

  • I work here
  • Tests still pass
  • Am a core contributor.

Detailed Description of the Pull Request / Additional comments

This is in support of hopefully turning audit mode on to more projects. If I turned it on, it would immediately complain about certain classes of issues like pointer and size, pointer math, etc. The changes in this refactoring will eliminate those off the top.

Additionally, this has caught a bunch of comments all over the VT classes that weren't updated to match the parameters lists.

Additionally, this has caught a handful of member variables on classes that were completely unused (and now gone).

Additionally, I'm killing almost all hungarian and shortening variable names. I'm only really leaving 'p' for pointers.

Additionally, this is vaguely in support of a future where we can have "infinite scrollback" in that I'm moving things to size_t across the board. I know it's a bit of a memory cost, but all the casting and moving between types is error prone and unfun to save a couple bytes.

Validation Steps Performed

  • build it
  • run all the tests

@miniksa
Copy link
Member Author

miniksa commented Dec 13, 2019

@DHowett-MSFT, what's your opinion on using .front() and .back() to refer to two elements in a sequence when we've detected size is 2 instead of .at()? I've been doing that because I believe front and back are noexcept while at can throw and it might make more functions noexcept.

(I've also been doing just front for things with 1 element only.)

@DHowett-MSFT
Copy link
Contributor

I'm surprised front and back are noexcept; what happens if they're called on a container of size 0? Anyway, sure, if we can be assured that they're only size 2.

@miniksa
Copy link
Member Author

miniksa commented Dec 13, 2019

Also, if you all hate std::basic_string_view<T> as a way of wrapping an arbitrary array-like thing in a read-only iterator... we can propose a replacement. I thought about gsl::span<const T> except that for some confounded reason they use ptrdiff_t as their fundamental type which conflicts with most of STL at size_t.

@DHowett-MSFT
Copy link
Contributor

#ifndef __cplusplus_20_whatever
namespace std {
    template <typename T> using span = basic_string_view<T>;
}

@DHowett-MSFT
Copy link
Contributor

NAK on my own suggestion

…ng string views everywhere, size_ts over smaller sizes in prep for massive buffers, and so on and so on.
@miniksa
Copy link
Member Author

miniksa commented Dec 16, 2019

Turns out when you mess up StateMachine::ProcessString, you get >750 test failures.

Then it turns out when you fix it, you get down to 4 test failures in two fixes.

@zadjii-msft zadjii-msft added Area-CodeHealth Issues related to code cleanliness, linting, rules, warnings, errors, static analysis, etc. Area-VT Virtual Terminal sequence support Product-Conhost For issues in the Console codebase labels Dec 16, 2019
@miniksa
Copy link
Member Author

miniksa commented Dec 16, 2019

Audit warnings:

Module Before After
TerminalParser 409 258
TerminalAdapter (NC) (NC)
TerminalInput 103 103
Host 1722 1711
TerminalCore 291 275

(NC) = cannot compile

My goal was TerminalParser and TerminalCore here. Mild to moderate success.

Curious about Adapter, but it refuses to compile under audit mode.

@miniksa
Copy link
Member Author

miniksa commented Dec 16, 2019

Copy link
Member Author

@miniksa miniksa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of comments. Some are a few things I found on my own, but most of them are to guide other reviewers into the actual zones where things changed functionality a bit, not just a simple find/replace of a variable name or type.

src/terminal/adapter/InteractDispatch.cpp Show resolved Hide resolved
src/terminal/parser/ft_fuzzwrapper/echoDispatch.cpp Outdated Show resolved Hide resolved
src/terminal/parser/stateMachine.hpp Show resolved Hide resolved
@miniksa miniksa marked this pull request as ready for review December 17, 2019 18:09
@miniksa miniksa self-assigned this Dec 17, 2019
Comment on lines +1313 to +1317
auto wchIter = _run.cbegin();
while (wchIter < _run.cend() - 1)
{
ProcessCharacter(*pwch);
ProcessCharacter(*wchIter);
wchIter++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a fancy way to write for(const auto wch : _run), right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, it's the -1 that gets it innit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except it's more like for (const auto wch : _run.substr(0, _run.size() - 1)) as it leaves the last one off.

…ture and it matches up with the other unlimited parsing philosophy changes in this PR.
@DHowett-MSFT
Copy link
Contributor

nit: before you merge make sure this PR's title is imperative (and completes the sentence "This commit will...")

@miniksa miniksa changed the title Terminal parser/adapter and related classes use modern, safe structures Make the terminal parser/adapter and related classes use modern, safe structures Dec 19, 2019
@miniksa
Copy link
Member Author

miniksa commented Dec 19, 2019

@j4james, my apologies in advance for now destabilizing all of your outstanding PRs. I'm going to go through them now to make sure we make progress on them. If you can just match the new style going forward, that's great. If you want me to take on the ownership of migrating the style of them because it's my fault this changed and partially my fault we let yours not get reviewed for a while, let me know and I'm happy to make branches that merge this appropriately into your work.

@miniksa miniksa merged commit 6f667f4 into master Dec 19, 2019
@j4james
Copy link
Collaborator

j4james commented Dec 19, 2019

@j4james, my apologies in advance for now destabilizing all of your outstanding PRs. I'm going to go through them now to make sure we make progress on them. If you can just match the new style going forward, that's great. If you want me to take on the ownership of migrating the style of them because it's my fault this changed and partially my fault we let yours not get reviewed for a while, let me know and I'm happy to make branches that merge this appropriately into your work.

No worries. I love the new style and I'll be happy to migrate my PRs. I need to work tomorrow, though, so will take a look on the weekend.

@miniksa miniksa deleted the dev/miniksa/gardening3 branch December 31, 2019 18:39
@miniksa miniksa restored the dev/miniksa/gardening3 branch January 8, 2020 18:04
ghost pushed a commit that referenced this pull request Jan 8, 2020
## Summary of the Pull Request
When refactoring the `StateMachine::ProcessString` algorithm to use safer structures, I made an off-by-one error when attempting to simplify the loop. 

## References
- Introduced in #3956 

## PR Checklist
* [x] Closes #4116
* [x] I work here.
* [x] Tests added/passed
* [x] No documentation
* [x] I'm a core contributor.

## Detailed Description of the Pull Request / Additional comments
The algorithm in use exploited holding onto some pointers and sizes as it rotated around the loop to call back as member variables in the pass-through function `FlushToTerminal`. 

As a part of the refactor, I adjusted to persisting a `std::wstring_view` of the currently processing string instead of pointer/size. I also attempted to simplify the loop at the same time as both the individual and group branches were performing some redundant operations in respect to updating the "run" length. 

Turns out, I made a mistake here. I wrote it so it worked correctly for the bottom half where we transition from bulk printing to an escape but then I messed up the top case.

## Validation Steps Performed
- [x] Manual validation of the exact command given in the bug report.
- [x] Wrote automated tests to validate both paths through the `ProcessString` loop that work with the `_run` variable.
@j4james j4james mentioned this pull request Apr 1, 2020
5 tasks
ghost pushed a commit that referenced this pull request Apr 1, 2020
## Summary of the Pull Request

This PR clamps the parameter values in the VT `StateMachine` parser to 32767, which was the initial limit prior to PR #3956. This fixes a number of overflow bugs (some of which could cause the app to crash), since much of the code is not prepared to handle values outside the range of a `short`.

## References

#3956 - the PR where the cap was changed to the range of `size_t`
#4254 - one example of a crash caused by the higher range

## PR Checklist
* [x] Closes #5160
* [x] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/Terminal) and sign the CLA
* [x] Tests added/passed
* [ ] Requires documentation to be updated
* [ ] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx

## Detailed Description of the Pull Request / Additional comments

The DEC STD 070 reference recommends supporting up to at least 16384 for parameter values, so 32767 should be more than enough for any standard VT sequence. It might be nice to increase the limit to 65535 at some point, since that is the cap used by both XTerm and VTE. However, that is not essential, since there are very few situations where you'd even notice the difference. For now, 32767 is the safest choice for us, since anything greater than that has the potential to overflow and crash the app in a number of places.

## Validation Steps Performed

I had to make a couple of modifications to the range checks in the `OutputEngineTest`, more or less reverting to the pre-#3956 behavior, but after that all of the unit tests passed as expected.

I manually confirmed that this fixes the hanging test case from #5160, as well as overflow issues in the cursor operations, and crashes in `IL` and `DL` (see #4254 (comment)).
@DHowett-MSFT DHowett-MSFT deleted the dev/miniksa/gardening3 branch May 5, 2020 02:05
DHowett added a commit that referenced this pull request Jul 14, 2020
We were using std::basic_string_view as a stand-in for std::span so that
we could change over all at once when C++20 dropped with full span
support. That day's not here yet, but as of 54a7fce we're using GSL 3,
whose span is C++20-compliant.

This commit replaces every instance of basic_string_view that was not
referring to an actual string with a span of the appropriate type.

I moved the `const` qualifier into span's `T` because while
`basic_string_view.at()` returns `const T&`, `span.at()` returns `T&`
(without the const). I wanted to maintain the invariant that members of
the span were immutable.

* Mechanical Changes
   * `sv.at(x)` -> `gsl::at(sp, x)`

I had to replace a `std::basic_string<>` with a `std::vector>` in
ConImeInfo, and I chose to replace a manual array walk in
ScreenInfoUiaProviderBase with a ranged-for. Please review those
specifically.

This will almost certainly cause a code size regression in Windows
because I'm blowing out all the PGO counts. Whoops.

Related: #3956, #975.
DHowett added a commit that referenced this pull request Jul 15, 2020
We were using std::basic_string_view as a stand-in for std::span so that
we could change over all at once when C++20 dropped with full span
support. That day's not here yet, but as of 54a7fce we're using GSL 3,
whose span is C++20-compliant.

This commit replaces every instance of basic_string_view that was not
referring to an actual string with a span of the appropriate type.

I moved the `const` qualifier into span's `T` because while
`basic_string_view.at()` returns `const T&`, `span.at()` returns `T&`
(without the const). I wanted to maintain the invariant that members of
the span were immutable.

* Mechanical Changes
   * `sv.at(x)` -> `gsl::at(sp, x)`
   * `sv.c{begin,end}` -> `sp.{begin,end}` (span's iterators are const)

I had to replace a `std::basic_string<>` with a `std::vector<>` in
ConImeInfo, and I chose to replace a manual array walk in
ScreenInfoUiaProviderBase with a ranged-for. Please review those
specifically.

This will almost certainly cause a code size regression in Windows
because I'm blowing out all the PGO counts. Whoops.

Related: #3956, #975.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-CodeHealth Issues related to code cleanliness, linting, rules, warnings, errors, static analysis, etc. Area-VT Virtual Terminal sequence support Issue-Task It's a feature request, but it doesn't really need a major design. Needs-Second It's a PR that needs another sign-off Product-Conhost For issues in the Console codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants