-
Notifications
You must be signed in to change notification settings - Fork 17.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime, internal/poll: don't ignore error in netpoll #30624
Comments
What code needs to change? We already return the error from |
Just adding a new field to runtime.pollDesc to notify an error state on epoll_pwait or kevent, EPOLLERR in epoll_event.events or EV_ERROR in kevent.flags and kevent.data, and making netpollcheckerr handle the new error value for poll waiters in the package internal/poll might be enough [not tested]. |
Change https://golang.org/cl/166497 mentions this issue: |
Change https://golang.org/cl/167777 mentions this issue: |
This change makes the runtime-integrated network poller report only critical event scanning errors. In the previous attempt, CL 166497, we treated any combination of error events as event scanning errors and it caused false positives in event waiters because platform-dependent event notification mechanisms allow event originators to use various combination of events. To avoid false positives, this change makes the poller treat an individual error event as a critical event scanning error by the convention of event notification mechanism implementations. Updates #30624. Fixes #30817. Fixes #30840. Change-Id: I906c9e83864527ff73f636fd02bab854d54684ea Reviewed-on: https://go-review.googlesource.com/c/go/+/167777 Run-TryBot: Mikio Hara <mikioh.public.networking@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change https://golang.org/cl/167782 mentions this issue: |
It seems like we need to pay special attention to capturing error condition on the event port of SmartOS. The previous attempt CL 167777 works on Oracle Solaris but doesn't work on SmartOS for the uncertain reason. It's better to disable the reporting for now. Updates #30624. Fixes #30840. Change-Id: Ieca5dac4fceb7e8c9cb4db149bb4c2e79691588c Reviewed-on: https://go-review.googlesource.com/c/go/+/167782 Run-TryBot: Mikio Hara <mikioh.public.networking@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
running the tuntap in go 1.13 resulted Interface.Read() returning an "not pollable" error (from the runtime's poll.ErrNotPollable). This, it turns out, is due to /dev/net/tun in linux not being pollable (in the epoll sense) until after the TUNSETIFF ioctl has been done. The right fix, done here, is to open /dev/net/tun as a raw file descriptor, and ioctl it before constructing an *os.File which gets added to the poll set when Read() is called. See golang/go#30426 and golang/go#30624 and go source code commit a5fdd58c84b6b0a1ae5a53faebc0550024e3a066 which adds ErrNotPollable and exposes this error which otherwise was getting silently thrown away. This code works properly on the AP, too (master branch, using go 1.12.9, but it should work a long way back)
Recently we saw SIG acceptance tests fail more often that usual. Since the Go1.13 change (scionproto#3362) we saw in some SIGs problems with the TUN device. Investigating further we came across golang/go#30624. From there we saw that water library is very old and still uses os.OpenFile, which was replaced in other similar libs to mitigate afformentioned Go issue.
Recently we saw SIG acceptance tests fail more often that usual. Since the Go1.13 change (#3362) we saw in some SIGs problems with the TUN device. Investigating further we came across golang/go#30624. From there we saw that water library is very old and still uses os.OpenFile, which was replaced in other similar libs to mitigate afformentioned Go issue.
As mentioned in #30426, there is various underlying stuff that doesn't support event notification mechanisms supported by the package runtime, like epoll, kqueue. The current runtime-integrated network poller is not designed for accommodating such stuff; simply, it covers up an error value of each event in the function runtime.netpoll and expects subsequent, mostly IO, system calls to capture the error state on the underlying stuff. Unfortunately, this is far from perfect if the underlying stuff doesn't return an error in the event registration but returns an error in the event scanning.
It's better to forward an error value on each event of event scanning to poll waiters in the package internet/poll for avoiding unnecessary confusion.
/cc @ianlancetaylor
The text was updated successfully, but these errors were encountered: