Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use file descriptor 0 to read stdin #198

Closed
wants to merge 1 commit into from

Conversation

espadrine
Copy link
Contributor

@espadrine espadrine commented Nov 11, 2020

On Linux, when Node.js (at least, v15.2.0) does a readFileSync of
/dev/tty, it hangs.

The addition of /dev/tty comes from 9bcf45b,
which converts to ESM.
It also switches from reading /dev/stdin, to:

  1. on Windows, reading file descriptor 0;
  2. on other platforms, reading /dev/tty
    (which presumably works on macOS, but not Linux).

However, based on the Node.js documentation,
reading file descriptor 0 always reads stdin on all platforms,
thanks to a compatibility layer in Node.js.

Subtly, for Linux specifically, using /dev/stdin
instead of the zero file descriptor, enables the following use-case:
just enter commonmark on the console, causing it to wait for prompt.
You can then type text followed by Control-D
to have that text be the input.

| File       | Linux           | macOS           | Windows   |
|============|=================|=================|===========|
|   |  Pipe  | ✓               | ✓               | ✓         |
| 0 |--------|-----------------|-----------------|-----------|
|   | Prompt | EAGAIN          | EAGAIN          | TypeError |
|~~~~~~~~~~~~|~~~~~~~~~~~~~~~~~|~~~~~~~~~~~~~~~~~|~~~~~~~~~~~|
|            | ✓               | ✓               | ENOENT    |
| /dev/stdin |-----------------|-----------------|-----------|
|            | ✓               | EAGAIN          | ENOENT    |
|~~~~~~~~~~~~|~~~~~~~~~~~~~~~~~|~~~~~~~~~~~~~~~~~|~~~~~~~~~~~|
|            | (input ignored) | (input ignored) | ENOENT    |
| /dev/tty   |-----------------|-----------------|-----------|
|            | ✓               | ✓               | ENOENT    |

Thus the optimal choice is to use /dev/stdin for Linux,
and file descriptor zero for the others.

@jgm
Copy link
Member

jgm commented Nov 11, 2020

On macOS with 0 I get

Error: EAGAIN: resource temporarily unavailable, read
    at Object.readSync (fs.js:590:3)
    at tryReadSync (fs.js:364:20)
    at Proxy.readFileSync (fs.js:401:19)
    at Object.<anonymous> (/Users/jgm/src/commonmark.js/bin/commonmark:83:19)
    at Generator.next (<anonymous>) {
  errno: -35,
  syscall: 'read',
  code: 'EAGAIN'
}

@espadrine
Copy link
Contributor Author

I need to investigate that.

I tried on my macOS machine, and was successful doing this:

$ node -v
v15.2.0
$ npm install
$ echo test | bin/commonmark
<p>test</p>

I am on Catalina 10.15.7.

Which node version and mac version are you on?

@jgm
Copy link
Member

jgm commented Nov 15, 2020

node v14.13.1
Catalina

Note: piping input works fine (as in your test). But if I just do bin/commonmark, I get the output noted above instead of havig it wait for input.

@espadrine
Copy link
Contributor Author

espadrine commented Nov 15, 2020

Ah, thanks! The plot deepens considerably: the same is true on Linux.

When piping (echo test | bin/commonmark):

File Linux macOS Windows
0
/dev/stdin ENOENT
/dev/tty (input ignored) (input ignored) ENOENT

When entering it with bin/commonmark and waiting for the prompt:

File Linux macOS Windows
0 EAGAIN EAGAIN TypeError
/dev/stdin EAGAIN ENOENT
/dev/tty ENOENT

(By input ignored, I mean that the text sent through the echo is not read;
the commands waits for manual input and Control-D.
Which breaks use as a piping system.)

It is surprising to me that /dev/stdin does what we would expect in both cases,
but only on Linux. On macOS, there is no solution that addresses both cases.

Given the data, I would personally favour /dev/stdin,
since it does the right thing on macOS on the more useful scenario (piping).
What do you think?


Edit: I added Windows information. On Windows, from what I can see, only 0 is useful.

@espadrine
Copy link
Contributor Author

(Patch modified to use /dev/stdin for Linux, and 0 for macOS and Windows.)

@jgm
Copy link
Member

jgm commented Nov 16, 2020

I'm not too happy with a solution that doesn't work for both cases on macOS. It's pretty hard to believe that there's not a solution for this in node, though I don't really write JavaScript (except for this project), so I don't know.

@jgm
Copy link
Member

jgm commented Nov 16, 2020

This seems to work in both cases on macOS:

if (process.stdin.isTTY) {
  inps.push(fs.readFileSync('/dev/tty', 'utf-8'));
} else {
  inps.push(fs.readFileSync('/dev/stdin', 'utf-8'));
}

@espadrine
Copy link
Contributor Author

espadrine commented Nov 22, 2020

I updated the code to be inspired by your suggestion.

It now works on pipes and TTY both on Linux and macOS.
It only works on pipes on Windows (except WSL, obviously, where I expect it works on both).

I also made the code platform-independent.

@jgm
Copy link
Member

jgm commented Nov 22, 2020

Great. Do I understand correctly that the situation is no worse off than before on Windows, since running bin/commonmark and then typing the content didn't work before either?

Surely there has to be a way to make that last case work in Windows as well?

@espadrine
Copy link
Contributor Author

Do I understand correctly that the situation is no worse off than before on Windows, since running bin/commonmark and then typing the content didn't work before either?

Yes, that is correct.

There might be a way to achieve it, but from my understanding, the Windows ecosystem is not built for it.
You have cmd.exe which is a clunky command line system with a limited feature set anyway,
and Powershell which has pretty different principles than the ones you expect from a Linux-like experience.

I think most developers don’t make heavy use of the shell on Windows, instead using IDEs.
See this 2018 article from Microsoft indicating they did not have PTY back then:

Windows (currently) does not provide Terminals a way to supply the communication pipes via which it wants to communicate with a Command-Line app. 3rd party Terminals are forced to create an off-screen Console, and to send it user-input and scrape its output, redrawing the output on the 3rd party Console’s own display!

It feels like the way forward from Microsoft’s point of view may be WSL2,
which virtualizes a Linux bash session with a deep Windows integration.
But the paint seems fresh still.

@jgm
Copy link
Member

jgm commented Nov 23, 2020

There might be a way to achieve it, but from my understanding, the Windows ecosystem is not built for it.

I develop a command line application (pandoc) which works fine in this way on Windows (either typing the content or as a pipe). This is just using the standard Haskell functions for IO with stdin. So I know it's not technically impossible. It's a serious shortcoming in node if it's not possible to do this in node; but I'm skeptical that it's not possible and I think further investigation could reveal a way.

@espadrine
Copy link
Contributor Author

espadrine commented Nov 23, 2020

Could you clarify what happens with Pandoc?

I installed it, opened cmd.exe (and also tried the same with Powershell), and did:

C:\>pandoc
test
^D

but nothing happened.

@jgm
Copy link
Member

jgm commented Nov 23, 2020

You have to use ^Z instead of ^D on Windows command shell.

On Linux, when Node.js (at least, v15.2.0) does a readFileSync of
/dev/tty, it hangs.

The addition of /dev/tty comes from 9bcf45b,
which converts to ESM.
It also switches from reading /dev/stdin, to:

1. on Windows, reading file descriptor 0;
2. on other platforms, reading /dev/tty
   (which presumably works on macOS, but not Linux).

However, based on the [Node.js documentation][fd0],
reading file descriptor 0 always reads stdin on all platforms,
thanks to a compatibility layer in Node.js.

Subtly, for Linux specifically, using /dev/stdin
instead of the zero file descriptor, enables the following use-case:
just enter `commonmark` on the console, causing it to wait for prompt.
You can then type text followed by Control-D
to have that text be the input.

    | File       | Linux           | macOS           | Windows   |
    |============|=================|=================|===========|
    |   |  Pipe  | ✓               | ✓               | ✓         |
    | 0 |--------|-----------------|-----------------|-----------|
    |   | Prompt | EAGAIN          | EAGAIN          | TypeError |
    |~~~~~~~~~~~~|~~~~~~~~~~~~~~~~~|~~~~~~~~~~~~~~~~~|~~~~~~~~~~~|
    |            | ✓               | ✓               | ENOENT    |
    | /dev/stdin |-----------------|-----------------|-----------|
    |            | ✓               | EAGAIN          | ENOENT    |
    |~~~~~~~~~~~~|~~~~~~~~~~~~~~~~~|~~~~~~~~~~~~~~~~~|~~~~~~~~~~~|
    |            | (input ignored) | (input ignored) | ENOENT    |
    | /dev/tty   |-----------------|-----------------|-----------|
    |            | ✓               | ✓               | ENOENT    |

Thus we split the decision in a platform-independent way:

1. When in a TTY situation, use /dev/tty.
   Except on Windows, which does not have a PTY,
   so we manually read lines until we hit ^Z (which is EOF on Windows).
2. Otherwise, use file descriptor 0, which works across all platforms.

[fd0]: https://nodejs.org/api/process.html#process_process_stdin_fd
@espadrine
Copy link
Contributor Author

espadrine commented Nov 23, 2020

From what I can see of the GHC source, they seem to handle ^Z manually.

I pushed a commit that did the same.
It requires a bit more explicit handling than Haskell, as Node.js explicitly doesn't support Ctrl+Z on Windows.

Comment on lines +92 to +94
const doc = parser.parse(inp);
const rendered = renderer.render(doc);
if (!options.time) { process.stdout.write(rendered); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this belong here? Aren't we going to fall through to line 109 and expect inps to contain an array of strings?
I am clueless about how async stuff works in JS, but is there a way to do this synchronously to make the code more straightforward?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will fall through, but with a files array that is empty (a requirement of the top if condition), so the rest of the code prints nothing.

There is no built-in way to do things synchronously with readline.
Node.js encourages asynchronous operation as it was its main philosophical tenet.
The fs API is an exemption to the rule.

There are external libraries that I think can do that, such as readline-sync.
It seems to do some gnarly background-process spawning.

We can also convert the whole file to asynchronous operation, which is a more significant patch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. It would be nice to avoid duplicating the parsing and rendering code like this -- if it takes making the whole thing async, maybe that's the best approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, maybe it's not worth the effort. I can just merge this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think this is not good as it is. You fall through and then end up parsing another empty document and printing the result of rendering it. I'd want to avoid that.

@jgm jgm mentioned this pull request Dec 4, 2020
@jgm
Copy link
Member

jgm commented Dec 4, 2020

PR #203, now merged, rolls back the esm changes that also included the changes to input handling that prompted your PR. I cannot reconstruct why that input handling change was included with the esm change, but there must have been a reason.

Anyway, the current code is back to using /dev/stdin on all platforms.

I've verified that this works on macos and linux, both as a pipe and typing input on the terminal.

It would be good to check Windows, if you can.

@espadrine
Copy link
Contributor Author

espadrine commented Dec 5, 2020

I just checked on Windows. Unsurprisingly given the previous tests, neither piping nor TTY works on Windows.

image

Since this patch is bitrot and my use-case is now handled, I will close this pull request.

@espadrine espadrine closed this Dec 5, 2020
@jgm
Copy link
Member

jgm commented Dec 5, 2020

OK, good to know.

jgm added a commit that referenced this pull request Dec 5, 2020
Use file descriptor 0 instead of '/dev/stdin'.
Note that this allows piping but doesn't handle
the case where users run `bin/commonmark` and
enter input directly.  See #198 for some relevant
discussion.
@jgm
Copy link
Member

jgm commented Dec 5, 2020

I've added some code that uses 0 instead of /dev/stdin on windows.
This should give us support for the piping case, at least (if that test is still valid after the esm changes). I'm not sure I want to add a bunch of hairy async code to support the direct input case on windows, but thank you for showing me what would be needed.

@espadrine
Copy link
Contributor Author

I agree with this compromise, as I mentioned.

After all, the Windows ecosystem barely has a true TTY; and writing a Markdown document at the console is a poor experience that I expect nobody prefers to a text editor.
Adding code complexity for an unmet use-case seems like a suboptimal tradeoff.

I can confirm that piping now works on Windows.

@jgm
Copy link
Member

jgm commented Dec 5, 2020

Thanks for confirming, and for all of the above!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants