-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fluentd: Fix bug that caused lines to be dropped when containing non utf-8 characters #5107
Conversation
When a log line in hash format contains non UTF-8 characters fluentd would drop the complete line because it failed to convert the line in key-value format. By forcing UTF-8 encoding and replacing non UTF-8 characters with empty strings the log line will not be dropped but only contain the valid UTF-8 characters. Fixes #5099 Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
@cyriltovena is bumping the fluentd plugin version here the right thing to do? 7fd4c40 |
Correct ! |
@@ -278,6 +278,8 @@ def record_to_line(record) | |||
when :key_value | |||
formatted_labels = [] | |||
record.each do |k, v| | |||
# Remove non UTF-8 characters by force-encoding the string and replacing said chars with empty string | |||
v = v.encode('utf-8', invalid: :replace, replace: '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we replace with another special character may be ? Not sure if we can use empty string , it will not tell users there was an issue there ? WDYT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would � work ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can let Ruby decide how to replace the invalid characters, like so:
v = v.encode('utf-8', invalid: :replace)
@@ -4,7 +4,7 @@ $LOAD_PATH.push File.expand_path('lib', __dir__) | |||
|
|||
Gem::Specification.new do |spec| | |||
spec.name = 'fluent-plugin-grafana-loki' | |||
spec.version = '1.2.16' | |||
spec.version = '1.2.17' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 this will do !
Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
@@ -122,7 +139,7 @@ | |||
expect(body[:streams][0]['stream'].empty?).to eq true | |||
expect(body[:streams][0]['values'].count).to eq 1 | |||
expect(body[:streams][0]['values'][0][0]).to eq '1546270458000000000' | |||
expect(body[:streams][0]['values'][0][1]).to eq 'message="' + content[0] + '" stream="stdout"' | |||
expect(body[:streams][0]['values'][0][1]).to eq 'message="' + content[0] + '" stream=stdout' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was failing since c298514793ddf49ec8d15972b35cdae5f225b5d2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thank you so much @chaudum |
* Fix encoding error in fluentd client This change fixes a bug that is caused by incorrectly trying to encode any values other than string. This bug was a regression introduced with #5107 and caused `NoMethodError` errors. Fixes #5099 Signed-off-by: Christian Haudum <christian.haudum@gmail.com> * Bump fluent-plugin-grafana-loki to version 1.2.18 Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
What this PR does / why we need it:
When a log line in hash format contains non UTF-8 characters fluentd
would drop the complete line because it failed to convert the line in
key-value format.
By forcing UTF-8 encoding and replacing non UTF-8 characters with empty
strings the log line will not be dropped but only contain the valid
UTF-8 characters.
Which issue(s) this PR fixes:
Fixes #5099
Special notes for your reviewer:
Checklist
CHANGELOG.md
about the changes.