Commit

Improve aws-s3 gzip file detection to avoid false negatives (#29969) (#29974)

Directly check the byte stream for the gzip magic number and deflate
compression type. Avoid using http.DetectContentType because it returns
the first match it finds while checking many signatures.

Closes #29968

(cherry picked from commit 61a7d36)

Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
mergify[bot] and andrewkroh committed Jan 24, 2022
1 parent c6bec41 commit 599f0c3
Showing 2 changed files with 9 additions and 9 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.next.asciidoc
@@ -53,6 +53,12 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
 *Filebeat*

 - aws-s3: Stop trying to increase SQS message visibility after ReceiptHandleIsInvalid errors. {pull}29480[29480]
+- Fix handling of IPv6 addresses in netflow flow events. {issue}19210[19210] {pull}29383[29383]
+- Fix `sophos` KV splitting and syslog header handling {issue}24237[24237] {pull}29331[29331]
+- Undo deletion of endpoint config from cloudtrail fileset in {pull}29415[29415]. {pull}29450[29450]
+- Make Cisco ASA and FTD modules conform to the ECS definition for event.outcome and event.type. {issue}29581[29581] {pull}29698[29698]
+- ibmmq: Fixed `@timestamp` not being populated with correct values. {pull}29773[29773]
+- aws-s3: Improve gzip detection to avoid false negatives. {issue}29968[29968]

 *Heartbeat*

12 changes: 3 additions & 9 deletions x-pack/filebeat/input/awss3/s3_objects.go
@@ -15,7 +15,6 @@ import (
 	"fmt"
 	"io"
 	"io/ioutil"
-	"net/http"
 	"reflect"
 	"strings"
 	"time"
@@ -375,18 +374,13 @@ func s3ObjectHash(obj s3EventV2) string {
 // stream without consuming it. This makes it convenient for code executed after this function call
 // to consume the stream if it wants.
 func isStreamGzipped(r *bufio.Reader) (bool, error) {
-	// Why 512? See https://godoc.org/net/http#DetectContentType
-	buf, err := r.Peek(512)
+	buf, err := r.Peek(3)
 	if err != nil && err != io.EOF {
 		return false, err
 	}

-	switch http.DetectContentType(buf) {
-	case "application/x-gzip", "application/zip":
-		return true, nil
-	default:
-		return false, nil
-	}
+	// gzip magic number (1f 8b) and the compression method (08 for DEFLATE).
+	return bytes.HasPrefix(buf, []byte{0x1F, 0x8B, 0x08}), nil
 }

 // s3Metadata returns a map containing the selected S3 object metadata keys.
