Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unneeded VT-specific control character handling #4289

Merged
8 commits merged into from
Jan 29, 2020

Conversation

j4james
Copy link
Collaborator

@j4james j4james commented Jan 18, 2020

Summary of the Pull Request

This PR removes all of the VT-specific functionality from the WriteCharsLegacy function that dealt with control characters, since those controls are now handled in the state machine when in VT mode. It also removes most of the control character handling from the Terminal::_WriteBuffer method for the same reason.

References

This is a followup to PR #4171

PR Checklist

Detailed Description of the Pull Request / Additional comments

There are four changes to the WriteCharsLegacy implementation:

  1. The TAB character had special case handling in VT mode which is now no longer required. This fixes a bug in the Python REPL editor (when run from a cmd shell in Windows Terminal), which would prevent you tabbing past the end of the line. It also fixes Backspacing a tab character in conpty causes visual corruption #3971.

  2. Following on from point 1, the WC_NONDESTRUCTIVE_TAB flag could also now be removed. It only ever applied in VT mode, in which case the TAB character isn't handled in WriteCharsLegacy, so there isn't a need for a non-destructive version.

  3. There used to be special case handling for a BS character at the beginning of the line when in VT mode, and that is also no longer required. This fixes an edge-case bug which would prevent a glyph being output for code point 8 when ENABLE_PROCESSED_OUTPUT was disabled.

  4. There was quite a lot of special case handling for control characters in the "end-of-line wrap" implementation, which is no longer required. This fixes a bug which would prevent "low ASCII" characters from wrapping when output at the end of a line.

Then in the Terminal::_WriteBuffer implementation, I've simply removed all control character handling, except for LF. The Terminal is always in VT mode, so the control characters are always handled by the state machine. The exception for the LF character is simply because it doesn't have a proper implementation yet, so it still passes the character through to _WriteBuffer. That will get cleaned up eventually, but I thought that could wait for a later PR.

Finally, with the removal of the VT mode handling in WriteCharsLegacy, there was no longer a need for the SCREEN_INFORMATION::InVTMode method to be publicly accessible. That has now been made private.

Validation Steps Performed

I've only tested manually, making sure the conhost and Windows Terminal still basically work, and confirming that the above-mentioned bugs are fixed by these changes.

Copy link
Member

@zadjii-msft zadjii-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scares me, especially the bits about the delayed EOL wrap bits, but I think that this will work right.

@zadjii-msft
Copy link
Member

@msftbot make sure @miniksa signs off on this

@ghost ghost added the AutoMerge Marked for automatic merge by the bot when requirements are met label Jan 19, 2020
@ghost
Copy link

ghost commented Jan 19, 2020

Hello @zadjii-msft!

Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:

  • I'll only merge this pull request if it's approved by @miniksa

If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".

@miniksa
Copy link
Member

miniksa commented Jan 19, 2020

I'll get to this on Tuesday probably. We have Monday off and I'm watching the NFC game right now like @zadjii-msft should be given the Packers are still in it. :P

@zadjii-msft zadjii-msft added the Needs-Second It's a PR that needs another sign-off label Jan 19, 2020
@zadjii-msft
Copy link
Member

zadjii-msft commented Jan 19, 2020

Ahaha yea, I was just filling the time between the games 😝

༼ つ ◕_◕ ༽つ TAKE MY ENERGY PACKERS ༼ つ ◕_◕ ༽つ

EDIT: oof

@miniksa
Copy link
Member

miniksa commented Jan 22, 2020

@zadjii-msft, oof indeed.

Also, sorry, Tuesday turned out to be Performance Review day, so I spent all day procrastinating on and writing it.

Looking now. I intend to double check the changes against the historical versions of WriteCharsLegacy from conhostv1 to make sure we're only tampering with things that I've added in the course of the last several years for VT reasons.

@j4james
Copy link
Collaborator Author

j4james commented Jan 23, 2020

Looking now. I intend to double check the changes against the historical versions of WriteCharsLegacy from conhostv1 to make sure we're only tampering with things that I've added in the course of the last several years for VT reasons.

I was hoping you'd be able to do that! Ideally we could eventually get this back to exactly how it was in the v1 conhost, once we've got a dedicated stream writer for VT mode (i.e. #780).

But if you have any concerns about these changes, I was going to suggest maybe splitting it up into separates PRs? The most important change is probably the first commit, because that fixes #3971. The rest are really edge-case bugs that are unlikely to be encountered in practice.

Copy link
Member

@miniksa miniksa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are mostly notes to myself. I'm still reviewing and I don't want GH to choke and lose them somehow.

src/cascadia/TerminalCore/Terminal.cpp Outdated Show resolved Hide resolved
src/host/_stream.cpp Show resolved Hide resolved
src/host/outputStream.cpp Show resolved Hide resolved
@miniksa
Copy link
Member

miniksa commented Jan 23, 2020

But if you have any concerns about these changes, I was going to suggest maybe splitting it up into separates PRs? The most important change is probably the first commit, because that fixes #3971. The rest are really edge-case bugs that are unlikely to be encountered in practice.

I'm not concerned about splitting them up. @zadjii-msft and @DHowett-MSFT are probably rightfully afraid of this area, but making a big mess all at once was my signature thing in this space a few years ago. So just give me a bit to walk through. I'd rather work the whole problem at once instead of chasing 5 PRs.

@miniksa
Copy link
Member

miniksa commented Jan 23, 2020

These are mostly notes to myself. I'm still reviewing and I don't want GH to choke and lose them somehow.

Also I will likely finish in the morning because I have to go pick up the kiddo from daycare soon.

@ghost ghost removed the AutoMerge Marked for automatic merge by the bot when requirements are met label Jan 23, 2020
@miniksa
Copy link
Member

miniksa commented Jan 27, 2020

Yes, yes, @zadjii-msft, I hear you.

@miniksa
Copy link
Member

miniksa commented Jan 27, 2020

OK, just looking through WriteCharsLegacy callers.... I've taken some notes.

It looks like:

Usages of WriteCharsLegacy

The API is only used in 2 ways, flags wise:

  1. WC_LIMIT_BACKSPACE | WC_NONDESTRUCTIVE_TAB | WC_DELAY_EOL_WRAP - 1 use
  2. WC_DESTRUCTIVE_BACKSPACE | WC_KEEP_CURSOR_VISIBLE | WC_ECHO - all the other uses
  • ** NOTE on 2: One of them has a slight variance, readDataCooked:724. This one optionally sets WC_KEEP_CURSOR_VISIBLE. But I'm 95% sure that is pointless and it could just set it all the time to no ill effect or be worked around otherwise.

And the usages only fall into two categories:

  • A. Supporting stream writing operations - 2 uses
  • B. Supporting cooked input lines - all other uses

Inventory of callers:

Callsite Flags Use Type
_outputstream.cpp:81 1 A
_stream.cpp:939 2 A
readDataCooked:844 2 B
readDataCooked:782 2 B
readDataCooked:724 2 B
readDataCooked:611 2 B
readDataCooked:529 2 B
commandNumberPopup.cpp:69 2 B
commandNumberPopup.cpp:38 2 B
Cmdline.cpp:984 2 B
Cmdline.cpp:934 2 B
Cmdline.cpp:891 2 B
Cmdline.cpp:310 2 B
Cmdline.cpp:474 2 B
Cmdline.cpp:509 2 B
Cmdline.cpp:545 2 B
Cmdline.cpp:1031 2 B
Cmdline.cpp:269 2 B
Cmdline.cpp:572 2 B
Cmdline.cpp:599 2 B
Cmdline.cpp:1086 2 B

Commentary

  • If the type 2 flags calls become all that's left, there's really no point in passing the flags at all, testing the flags for these, or having any variation at all left in WriteCharsLegacy around them because the behavior won't vary.
  • All type B usages are probably due for some kind of refactor anyway at some point because the way they manipulate the buffer is always weird and mostly unsafe. We tried to protect it slightly with the readDataCooked class but that work never finished.

Actions

Part 1

  • Finish this PR (+/- the move of the final \n operation into a function)
  • Move terminus of conhost.exe parser (_outputstream.cpp:81) from calling WriteCharsLegacy into calling TextBuffer::Write since that's where it will end up getting to anyway after running through metric tons of code to realize it has no C0s anymore and just needs to call TextBuffer::Write anyway.
  • Clean up terminus of wt.exe parser (Terminal::_WriteBuffer) before the TextBuffer::Write operation.
  • Evaluate the following things and ensure they're either implemented in TextBuffer::Write or into a new function/options on that function to handle the needs of both conhost.exe and wt.exe uniformly:
  1. Notifying the accessibility framework of what changed
  2. Moving the cursor on their behalf instead of making them manage it
  3. Updating the way they scroll to either notify them or have them look up the final cursor (if they care) and adjust their viewport over the buffer
  • Stop accepting flags on WriteCharsLegacy and just make it always use those three flags and take out any switches on those to simplify it so we can better understand how to finally get rid of it entirely when getting around to improving the cooked read lines.

Part 2

This focuses on the 2A item, _stream.cpp:939. That's the method WriteChars.

  • After Part 1, WriteChars is the last non-input-line-related caller of WriteCharsLegacy
  • The only caller of WriteChars is DoWriteConsole, a method that used to have a bunch of other Do* friends that I've killed over the years in favor of more organized ApiDispatchers and friends.
  • The only remaining caller of DoWriteConsole is WriteConsoleWImplHelper*
    (The * is because there's actually a second caller, WriteData::Notify, but that's simply the wait-routine servicer of WriteConsoleWImplHelper... a.k.a. it's the one who dispatches that call "later" if it turns out that the buffer was blocked at the time the call was first dispatched and will be called later when the block is released. This is the "click on the screen to make a selection stops the clients from writing" thing.)
  • WriteConsoleWImplHelper is responsible for the three public APIs:
  1. WriteConsoleW
  2. WriteConsoleA
  3. WriteFile (when the HANDLE is a console handle).
  • My theory on these is that what the public operations want here is similar enough to what the VT ones want that once we have the C0 commands implemented here and have beefed up TextBuffer::Write (or another function that looks very close to it, the stream writer), that we could feed them directly into a VT parser in a mode that only pulls out the relevant C0s and dispatches them to an Adapter implementation that performs things like a destructive backspace and then feeds the final result into the TextBuffer::Write.

Conclusion

  • Doing Part 1 is what we need to satisfy the "Credible stream writing" for the Terminal product wt.exe.
  • Doing Part 2 is all conhost.exe work and would be great for maintainability as we'd be platforming everything on the same things.
  • Making a less crappy cooked read suite could be a Part 3, is conhost.exe-centric, and I'm not sure how much real value it has besides letting us kill WriteCharsLegacy once and for all. I might try to justify it on the buffer insecurity of the cmdline.cpp and cookedReadData.cpp accesses to memory locations.

cc: #780

@j4james
Copy link
Collaborator Author

j4james commented Jan 28, 2020

@miniksa: Regarding the notes above, I think there are couple of things you may be overlooking in part 1, which we'd need to address before we could replace WriteCharsLegacy with TextBuffer::Write.

  1. Delayed EOL wrap
  2. The option to not wrap at all when DECAWM is reset
  3. The wrapping/scrolling margins (initially DECSTBM, but we'll eventually want to support DECSLRM too)

The first two are probably not that big a deal, but I can see the margin handling becoming a bit of a nightmare again.

@miniksa
Copy link
Member

miniksa commented Jan 28, 2020

@miniksa Michael Niksa FTE: Regarding the notes above, I think there are couple of things you may be overlooking in part 1, which we'd need to address before we could replace WriteCharsLegacy with TextBuffer::Write.

  1. Delayed EOL wrap
  2. The option to not wrap at all when DECAWM is reset
  3. The wrapping/scrolling margins (initially DECSTBM, but we'll eventually want to support DECSLRM too)

The first two are probably not that big a deal, but I can see the margin handling becoming a bit of a nightmare again.

I wrote it out hoping you would spot considerations I missed like that and we could discuss further. Thanks! Here's my thoughts on each:

For 1, I think delayed EOL wrap should maybe be the default behavior there and the true legacy ways (the ones coming in through the API) should force-advance and then backspace once to get the non-delayed behavior.

For 2, there's an alternate function of TextBuffer::WriteLine that always stops when it reaches the right bounds. TextBuffer::Write calls that in a loop. I'd think things like DECAWM could use that conditionally.

For 3, One of the parameters to TextBuffer::Write, in one of its incarnations, is a bounding rectangle that defaults to the whole buffer. Perhaps when DECSTBM or DECSLRM are set, the Write caller passes in a bounding rectangle with those to the optional parameter?

Any of that sound reasonable? Am I still blind to something?

@j4james
Copy link
Collaborator Author

j4james commented Jan 28, 2020

For 1, I think delayed EOL wrap should maybe be the default behavior there and the true legacy ways (the ones coming in through the API) should force-advance and then backspace once to get the non-delayed behavior.

I'm not sure about this. How would you force-advance? You couldn't write out something like a space, because you need it to be non-destructive. And you couldn't just use CR/LF to move to the next line because you assumedly want the line to be marked as "wrapped" for resizing purposes.

I suspect it might be easier in the end to have a flag for this. Although maybe that's not necessary if the legacy code never goes this route. Maybe it's fine if the delayed wrap is always applied here?

For 2, there's an alternate function of TextBuffer::WriteLine that always stops when it reaches the right bounds. TextBuffer::Write calls that in a loop. I'd think things like DECAWM could use that conditionally.

Conceptually I like this, but I'm not certain it's the right approach. As I understand it, WriteLine stops writing when it reaches the end of the line, but it then leaves the "unused" content unwritten. For DECAWM, we actually need it to consume all of the content, just overwriting the last character every time. I suppose that could be achieved by calling WriteLine repeatedly until everything is consumed, but that feels a bit messy.

For 3, One of the parameters to TextBuffer::Write, in one of its incarnations, is a bounding rectangle that defaults to the whole buffer. Perhaps when DECSTBM or DECSLRM are set, the Write caller passes in a bounding rectangle with those to the optional parameter?

An optional bounding rectangle sounds good. However, bear in mind that it's not really equivalent to the default buffer boundaries. If you're constrained by the margin boundaries, then a scroll event requires that you shift part of the buffer content, whereas a regular scroll would move the viewport origin or cycle the buffer.

Also, if you've started off outside the margin boundaries, then you're not constrained by them, but you are constrained by the viewport boundary. Only in that case, you don't scroll when you reach the bottom of the viewport, but I think you do still wrap at the right viewport boundary. It's complicated.

Also, I expect we'll want to keep at least some of this functionality in a separate method so it can be reused by other operations. In particular linefeeds need to implement the exact same margin/scrolling behaviour as wrapped text.

That said, as long as you're aware of all the issues, I don't think it's necessary we figure out the perfect plan in advance. It often helps to just dive in and start writing something, and if you don't like how it turns out, you can always start again. Eventually you'll settle on the solution you find least offensive. At least, that's what I do. ;)

@miniksa
Copy link
Member

miniksa commented Jan 29, 2020

For 1, I think delayed EOL wrap should maybe be the default behavior there and the true legacy ways (the ones coming in through the API) should force-advance and then backspace once to get the non-delayed behavior.

I'm not sure about this. How would you force-advance? You couldn't write out something like a space, because you need it to be non-destructive. And you couldn't just use CR/LF to move to the next line because you assumedly want the line to be marked as "wrapped" for resizing purposes.

Hm, I was thinking that it might not need to be non-destructive. For instance, the case where we're writing to the bottom right corner of the buffer, the next operation once it leaves the right edge is to go down a line and "rotate the circular buffer" which destroys the contents of the line that's about to swap in.

But I suppose if we're NOT in the absolute bottom-right corner of the buffer, only the right edge of a viewport, that we wouldn't necessarily want to destroy the value in the 0th column of the next line. So everywhere except X=Max,Y=Max it would have to be non-destructive.

I suspect it might be easier in the end to have a flag for this. Although maybe that's not necessary if the legacy code never goes this route. Maybe it's fine if the delayed wrap is always applied here?

Yeah perhaps it's best that it's either a flag or something then.

For 2, there's an alternate function of TextBuffer::WriteLine that always stops when it reaches the right bounds. TextBuffer::Write calls that in a loop. I'd think things like DECAWM could use that conditionally.

Conceptually I like this, but I'm not certain it's the right approach. As I understand it, WriteLine stops writing when it reaches the end of the line, but it then leaves the "unused" content unwritten. For DECAWM, we actually need it to consume all of the content, just overwriting the last character every time. I suppose that could be achieved by calling WriteLine repeatedly until everything is consumed, but that feels a bit messy.

Nnnnn. More flags. Or something. I don't like the idea of adding more and more and more flags and would hope we could just find a systematic way out of this. Perhaps the initial naive implementation just calls it repeatedly until everything is consumed. Then we optimize later. The assorted TextBuffer::Write* methods already leave a bit to be desired in terms of optimization. They could be performing look-aheads to bulk insert data when it matches certain common patterns that they aren't (and that the original legacy algorithms did to some degree). But it doesn't seem to be that painful right now, performance wise... or at least not that I'm aware of.

For 3, One of the parameters to TextBuffer::Write, in one of its incarnations, is a bounding rectangle that defaults to the whole buffer. Perhaps when DECSTBM or DECSLRM are set, the Write caller passes in a bounding rectangle with those to the optional parameter?

An optional bounding rectangle sounds good. However, bear in mind that it's not really equivalent to the default buffer boundaries. If you're constrained by the margin boundaries, then a scroll event requires that you shift part of the buffer content, whereas a regular scroll would move the viewport origin or cycle the buffer.

Also, if you've started off outside the margin boundaries, then you're not constrained by them, but you are constrained by the viewport boundary. Only in that case, you don't scroll when you reach the bottom of the viewport, but I think you do still wrap at the right viewport boundary. It's complicated.

Ah, yes. Perhaps the passed bounding rectangle for writing is also the range that is used for the TextBuffer::ScrollRows operation. If we're writing in a mode where we "Write to the end of the iterator content" not a mode where "we write until we run out of space", then the TextBuffer::Write* entrypoint would just call ScrollRows on itself with the parameters of the bounding rectangle until it ran out of space.

Also, I expect we'll want to keep at least some of this functionality in a separate method so it can be reused by other operations. In particular linefeeds need to implement the exact same margin/scrolling behaviour as wrapped text.

That said, as long as you're aware of all the issues, I don't think it's necessary we figure out the perfect plan in advance. It often helps to just dive in and start writing something, and if you don't like how it turns out, you can always start again. Eventually you'll settle on the solution you find least offensive. At least, that's what I do. ;)

Yes. Yes. Generally I've had the most success just mucking about with it piece by piece and evolving it over time. Let's just make progress here. If you're OK without 100% pre-planning, so am I.

Copy link
Member

@miniksa miniksa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do this, @j4james! Approved.

@miniksa
Copy link
Member

miniksa commented Jan 29, 2020

I'm going to try a re-run on the thing hoping that it remerges against master. But if not, we need to merge master in to see if the flakiness subsides.

I'd click the button, but I royally screwed up your branch last time, @j4james.

@miniksa
Copy link
Member

miniksa commented Jan 29, 2020

Quick. I'm merging it before the system looks at it funny and fails again.

@miniksa miniksa added AutoMerge Marked for automatic merge by the bot when requirements are met and removed Needs-Second It's a PR that needs another sign-off labels Jan 29, 2020
@ghost
Copy link

ghost commented Jan 29, 2020

Hello @miniksa!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost ghost merged commit c69757e into microsoft:master Jan 29, 2020
@j4james j4james deleted the cleanup-vt-control-chars branch February 1, 2020 12:56
@ghost
Copy link

ghost commented Feb 13, 2020

🎉Windows Terminal Preview v0.9.433.0 has been released which incorporates this pull request.:tada:

Handy links:

@DHowett-MSFT
Copy link
Contributor

🎉 Once again, thanks for the contribution!

This pull request was included in a set of conhost changes that was just
released with Windows Insider Build 19603.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AutoMerge Marked for automatic merge by the bot when requirements are met
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Backspacing a tab character in conpty causes visual corruption
4 participants