Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime, internal/poll: don't ignore error in netpoll #30624

Closed
mikioh opened this issue Mar 6, 2019 · 5 comments
Closed

runtime, internal/poll: don't ignore error in netpoll #30624

mikioh opened this issue Mar 6, 2019 · 5 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@mikioh
Copy link
Contributor

mikioh commented Mar 6, 2019

As mentioned in #30426, there is various underlying stuff that doesn't support event notification mechanisms supported by the package runtime, like epoll, kqueue. The current runtime-integrated network poller is not designed for accommodating such stuff; simply, it covers up an error value of each event in the function runtime.netpoll and expects subsequent, mostly IO, system calls to capture the error state on the underlying stuff. Unfortunately, this is far from perfect if the underlying stuff doesn't return an error in the event registration but returns an error in the event scanning.

It's better to forward an error value on each event of event scanning to poll waiters in the package internet/poll for avoiding unnecessary confusion.

/cc @ianlancetaylor

@ianlancetaylor
Copy link
Contributor

What code needs to change? We already return the error from runtime_pollOpen. I don't know where else we would get an error that is specific to a particular file descriptor.

@mikioh
Copy link
Contributor Author

mikioh commented Mar 6, 2019

Just adding a new field to runtime.pollDesc to notify an error state on epoll_pwait or kevent, EPOLLERR in epoll_event.events or EV_ERROR in kevent.flags and kevent.data, and making netpollcheckerr handle the new error value for poll waiters in the package internal/poll might be enough [not tested].

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/166497 mentions this issue: runtime, internal/poll, net: report event scanning error on read event

@julieqiu julieqiu added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 11, 2019
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/167777 mentions this issue: runtime, internal/poll: report only critical event scanning error

gopherbot pushed a commit that referenced this issue Mar 19, 2019
This change makes the runtime-integrated network poller report only
critical event scanning errors.

In the previous attempt, CL 166497, we treated any combination of error
events as event scanning errors and it caused false positives in event
waiters because platform-dependent event notification mechanisms allow
event originators to use various combination of events.

To avoid false positives, this change makes the poller treat an
individual error event as a critical event scanning error by the
convention of event notification mechanism implementations.

Updates #30624.
Fixes #30817.
Fixes #30840.

Change-Id: I906c9e83864527ff73f636fd02bab854d54684ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/167777
Run-TryBot: Mikio Hara <mikioh.public.networking@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/167782 mentions this issue: runtime: disable event scanning error reporting on solaris

gopherbot pushed a commit that referenced this issue Mar 20, 2019
It seems like we need to pay special attention to capturing error
condition on the event port of SmartOS. The previous attempt CL 167777
works on Oracle Solaris but doesn't work on SmartOS for the uncertain
reason. It's better to disable the reporting for now.

Updates #30624.
Fixes #30840.

Change-Id: Ieca5dac4fceb7e8c9cb4db149bb4c2e79691588c
Reviewed-on: https://go-review.googlesource.com/c/go/+/167782
Run-TryBot: Mikio Hara <mikioh.public.networking@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
nsd20463 added a commit to mistsys/tuntap that referenced this issue Sep 17, 2019
running the tuntap in go 1.13 resulted Interface.Read() returning
an "not pollable" error (from the runtime's poll.ErrNotPollable).
This, it turns out, is due to /dev/net/tun in linux not being pollable
(in the epoll sense) until after the TUNSETIFF ioctl has been done.

The right fix, done here, is to open /dev/net/tun as a raw file
descriptor, and ioctl it before constructing an *os.File which gets
added to the poll set when Read() is called.

 See golang/go#30426
 and golang/go#30624

and go source code commit a5fdd58c84b6b0a1ae5a53faebc0550024e3a066
which adds ErrNotPollable and exposes this error which otherwise
was getting silently thrown away.

This code works properly on the AP, too (master branch, using go 1.12.9,
but it should work a long way back)
lukedirtwalker added a commit to lukedirtwalker/scion that referenced this issue Nov 19, 2019
Recently we saw SIG acceptance tests fail more often that usual.
Since the Go1.13 change (scionproto#3362) we saw in some SIGs problems with the TUN device.
Investigating further we came across golang/go#30624.
From there we saw that water library is very old and still uses os.OpenFile,
which was replaced in other similar libs to mitigate afformentioned Go issue.
lukedirtwalker added a commit to scionproto/scion that referenced this issue Nov 19, 2019
Recently we saw SIG acceptance tests fail more often that usual.
Since the Go1.13 change (#3362) we saw in some SIGs problems with the TUN device.
Investigating further we came across golang/go#30624.
From there we saw that water library is very old and still uses os.OpenFile,
which was replaced in other similar libs to mitigate afformentioned Go issue.
@golang golang locked and limited conversation to collaborators Mar 18, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants