Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Heartbeat] Correctly store HTTP bodies with validation #14223

Merged
merged 10 commits into from
Oct 29, 2019

Conversation

andrewvc
Copy link
Contributor

@andrewvc andrewvc commented Oct 24, 2019

Currently, when using an HTTP body validator (either regexp or JSON) will break the storage of HTTP bodies with response.include_body (introduced in #13022).

The root cause is that both validation and reading the body for inclusion in the event share the same ReadCloser provided by *http.Response.

This patch looks at both validation and body settings to determine how much of the body to read, reads that much, then passes that to the validation and body inclusion code as a []byte.

Resolves #13751

@andrewvc andrewvc added bug in progress Pull request is currently in progress. Heartbeat Team:obs-ds-hosted-services Label for the Observability Hosted Services team labels Oct 24, 2019
@andrewvc andrewvc requested a review from a team as a code owner October 24, 2019 18:11
@andrewvc andrewvc self-assigned this Oct 24, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/uptime (:uptime)

@andrewvc andrewvc added review and removed in progress Pull request is currently in progress. labels Oct 24, 2019
@andrewvc andrewvc requested a review from ruflin October 24, 2019 18:56
}

if len(config.RecvJSON) > 0 {
jsonChecks, err := checkJSON(config.RecvJSON)
if err != nil {
return nil, err
return multiValidator{}, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't like too much that even on error, we return an empty object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, what would you propose? The alternative would be making multiValidator a pointer type *multiValidator so we can still return nil. Seems unnecessary to use a pointer type IMHO given that it's initialized once at creation and never mutated. I'd call this a bit of a smell in golang TBH.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I normally would go with *multiValidator but also see your point. Keep it.

"github.com/stretchr/testify/require"

"github.com/elastic/beats/libbeat/common"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sorting 😆

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm (slowly) getting better!

// maxBufferBodyBytes sets a hard limit on how much we're willing to buffer for any reason internally.
// since we must buffer the whole body for body validators this is effectively a cap on that.
// 100MiB out to be enough for everybody.
const maxBufferBodyBytes = 100 * 1024 * 1024
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would probably make sense to document this max value somewhere in the docs. At least 1 user will hit it :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

// then closes the body (which closes the connection). It doesn't return any errors
// but does log them. During an error case the return values will be (nil, -1).
// The maxBytes params controls how many bytes will be returned in a string, not how many will be read.
// We always read the full response here since we want to time downloading the full thing.
// This may return a nil body if the response is not valid UTF-8
func readResp(resp *http.Response, maxSampleBytes int) (bodySample string, bodySize int64, hashStr string, err error) {
if resp == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed anymore because?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
eventext.MergeEventFields(event, common.MapStr{"http": response})
// Download the body, close the response body, then attach all fields
err := handleRespBody(event, resp, responseConfig, errReason)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now handled inside the validator, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite, it's handled inside the processBody method mostly now, but now that function is less side-effecty, and doesn't modify the event, it just returns the new HTTP fields. https://github.com/elastic/beats/pull/14223/files#diff-759945e9c6b1377331063825c8e427daR247 is where the fields are finally merged now, which IMHO is an easier to understand flow.

@ruflin
Copy link
Member

ruflin commented Oct 28, 2019

Forgot to add: Changelog missing.

@andrewvc
Copy link
Contributor Author

@ruflin pushed some updates. I believe I've addressed all comments now.

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Did not test it manually.

@@ -511,7 +511,8 @@ Under `check.response`, specify these options:

*`status`*:: The expected status code. 4xx and 5xx codes are considered `down` by default. Other codes are considered `up`.
*`headers`*:: The required response headers.
*`body`*:: A list of regular expressions to match the the body output. Only a single expression needs to match.
*`body`*:: A list of regular expressions to match the the body output. Only a single expression needs to match. HTTP response
bodies of up to 100MiB are supported.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this sound like 100MiB is crazy for a Body, but well ...

@andrewvc andrewvc merged commit 6c6b396 into elastic:master Oct 29, 2019
@andrewvc andrewvc deleted the store-body-after-validate branch October 29, 2019 13:59
andrewvc added a commit to andrewvc/beats that referenced this pull request Oct 29, 2019
Currently, when using an HTTP body validator (either regexp or JSON) will break the storage of HTTP bodies with response.include_body (introduced in elastic#13022).

The root cause is that both validation and reading the body for inclusion in the event share the same ReadCloser provided by *http.Response.

This patch looks at both validation and body settings to determine how much of the body to read, reads that much, then passes that to the validation and body inclusion code as a []byte.

Resolves elastic#13751

(cherry picked from commit 6c6b396)
andrewvc added a commit that referenced this pull request Oct 31, 2019
)

Currently, when using an HTTP body validator (either regexp or JSON) will break the storage of HTTP bodies with response.include_body (introduced in #13022).

The root cause is that both validation and reading the body for inclusion in the event share the same ReadCloser provided by *http.Response.

This patch looks at both validation and body settings to determine how much of the body to read, reads that much, then passes that to the validation and body inclusion code as a []byte.

Resolves #13751

(cherry picked from commit 6c6b396)
jorgemarey pushed a commit to jorgemarey/beats that referenced this pull request Jun 8, 2020
Currently, when using an HTTP body validator (either regexp or JSON) will break the storage of HTTP bodies with response.include_body (introduced in elastic#13022).

The root cause is that both validation and reading the body for inclusion in the event share the same ReadCloser provided by *http.Response.

This patch looks at both validation and body settings to determine how much of the body to read, reads that much, then passes that to the validation and body inclusion code as a []byte.

Resolves elastic#13751
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
… (elastic#14310)

Currently, when using an HTTP body validator (either regexp or JSON) will break the storage of HTTP bodies with response.include_body (introduced in elastic#13022).

The root cause is that both validation and reading the body for inclusion in the event share the same ReadCloser provided by *http.Response.

This patch looks at both validation and body settings to determine how much of the body to read, reads that much, then passes that to the validation and body inclusion code as a []byte.

Resolves elastic#13751

(cherry picked from commit 46643b0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Heartbeat review Team:obs-ds-hosted-services Label for the Observability Hosted Services team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Heartbeat] Body on_error does not trigger on validation failures
4 participants