-
Notifications
You must be signed in to change notification settings - Fork 697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Elasticsearch HTTP error handling #4922
Conversation
Jenkins test this please |
Jenkins test this please java.lang.IllegalStateException: JENKINS-37121: something already locked /var/lib/jenkins/workspace/cloud-on-k8s-pr |
StatusCode: response.StatusCode, | ||
} | ||
// We may need to read the body multiple times, read the full body and store it as an array of bytes. | ||
body, err := ioutil.ReadAll(response.Body) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would put the defer response.Body.Close()
statement here (right after we decide to use it, don't forget to close it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing something, the response body is always read by body, err := ioutil.ReadAll(response.Body)
, there is no condition. This is why the defer response.Body.Close()
is right before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think by convention people tend to put the defer
statement immediately after the line that requires it (effective go example, uber go style example), but I really don't mind much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the situation is a bit different here since the stream is already opened. The purpose is to be sure that it is effectively closed when we exit the function and no one else will read from it later.
} | ||
// Reset the response body to the original unread state. It allows a caller to read again the body if necessary, | ||
// for example in the case of a 408. | ||
response.Body = ioutil.NopCloser(bytes.NewBuffer(body)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we read that body in case of 408? Or just check the status code?
The code flow here gets a bit tricky. I'm wondering if it may be simpler to always return either response, nil
or nil, err
for any http request we make. So that function that creates the APIError
would read the response body, and then we don't care about that response body anymore, and there's no way to retrieve it (it is still stored in the error object if we need it).
Checking whether the api call we made returned a 408 would be a call to a helper function like client.IsTimeout(err)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we read that body in case of 408? Or just check the status code?
In case of a 408, and if the path is /_cluster/health
, then the error is later skipped, and baseClient.request(..)
attempts to read again the body from the response.
I think I wanted to be conservative with this PR, keeping the responsibility of reading the "common" errors here, while delegating the more specific ones to the callers. This is the current flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I tend to feel like this is more complicated than it should be, but I'm fine with keeping that PR small and focused on the problem it solves.
var errorResponse ErrorResponse | ||
if err := json.Unmarshal(body, &errorResponse); err != nil { | ||
// Only log at the debug level since it is expected to not be able to parse all types of errors. | ||
// Some errors, like 408 on /_cluster/health may return a different body structure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we know and expect 408 will return a different body, should we just not parse the error body in case of 408?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's tricky because we don't know how to parse the body in case of a "custom" error here.
For now it's the responsibility of baseClient.request(..,responseObj interface{},skipErrFunc func(error) bool)
to decide:
- If an API error can be skipped (using
skipErrFunc
) and should not prevent the body to be parsed. In the case of a408
, the 2 conditions are:- path is
/_cluster/health
- code is 408
- path is
- Unmarshall the response using the struct provided by the caller (
responseObj interface{}
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I tend to feel like this is more complicated than it should be, but I'm fine with keeping that PR small and focused on the problem it solves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR fixes #4918.
It also removes
response *http.Response
from theAPIError
object, which is currently preventingError()
from being called multiple times:This is for example the case when an
APIError
is aggregated inzapcode.encodeError
:This is the reason why the error message may contain
unknown
in the error cause: