lowercase processor plugin #3890

Closed

@ada-foss
Contributor

My use case requires converting tag values from upper case to lower case (or vice versa). The IIS logs I've been working with preserve the character case originally used in each request, which means statistics for the same URI are not aggregated together by the basicstats aggregator.

I tried building and using the regex processor, #3839, but regex in Go doesn't support the case transformations available in Perl with \L and \U. Everything else about the plugin was a perfect fit, so I copied its structure and want to credit @44px for most of the hard work. The plugin just calls strings.ToLower on the tags and fields specified.
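
As a rough illustration of the approach described above, the core of such a processor amounts to calling strings.ToLower on selected tag values. This is only a sketch: a plain map stands in for a Telegraf metric's tags, and `lowercaseTags` is a hypothetical name, not the plugin's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// lowercaseTags lowercases the values of the named tag keys in place.
// Keys that are not present in tags are skipped.
// Hypothetical helper: a plain map stands in for a metric's tag set.
func lowercaseTags(tags map[string]string, keys []string) {
	for _, k := range keys {
		if v, ok := tags[k]; ok {
			tags[k] = strings.ToLower(v)
		}
	}
}

func main() {
	tags := map[string]string{"01_uriStem": "/API/Healthcheck", "02_method": "GET"}
	lowercaseTags(tags, []string{"01_uriStem"})
	fmt.Println(tags["01_uriStem"], tags["02_method"]) // prints: /api/healthcheck GET
}
```

Only the listed keys are touched, so tags such as the HTTP method keep their original case.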

Configuration Example

[[processors.lowercase]]
  namepass = ["iis_log"]
  [[processors.lowercase.tags]]
    key = "01_uriStem"

Source metric from logparser

iis_log,host=MYMACHINE,02_method=GET,01_uriStem=/api/Healthcheck,03_response=200 port="7314",protocolVersion="HTTP/1.1",serverName="MYHOST",log_timestamp="2018-03-08 00:02:59",clientIP="fe80::beef:beef:feeb:9071%12",bytesSent=436i,requestHost="THEIRHOST:7314",referer="-",userAgent="Mozilla/5.0+(Windows+NT;+Windows+NT+6.3;+en-GB)+WindowsPowerShell/5.1.14409.1012",serverIP="fe80::beef:beef:beef:9071%12",timetaken=1781i,username="-",serviceName="W3SVC11",win32response="0",uriQuery="-",bytesReceived=178i,subresponse="0" 1521122773998868100
iis_log,02_method=GET,01_uriStem=/API/healthcheck,03_response=200,host=MYMACHINE uriQuery="-",username="-",bytesSent=436i,bytesReceived=178i,clientIP="fe80::beef:beef:feeb:9071%12",serverName="MYHOST",serverIP="fe80::beef:beef:beef:9071%12",userAgent="Mozilla/5.0+(Windows+NT;+Windows+NT+6.3;+en-GB)+WindowsPowerShell/5.1.14409.1012",subresponse="0",log_timestamp="2018-03-08 00:03:00",protocolVersion="HTTP/1.1",serviceName="W3SVC11",port="7314",timetaken=1781i,referer="-",requestHost="THEIRHOST:7314",win32response="0" 1521122773998868101

Results

iis_log,host=LAPTOP-A5GCJ6G9,03_response=200,02_method=GET,01_uriStem=/api/healthcheck uriQuery="-",requestHost="THEIRHOST:7314",timetaken=1781i,bytesReceived=178i,userAgent="Mozilla/5.0+(Windows+NT;+Windows+NT+6.3;+en-GB)+WindowsPowerShell/5.1.14409.1012",serverName="MYHOST",bytesSent=436i,username="-",clientIP="fe80::beef:beef:feeb:9071%12",log_timestamp="2018-03-08 00:02:59",subresponse="0",serverIP="fe80::beef:beef:beef:9071%12",port="7314",win32response="0",referer="-",serviceName="W3SVC11",protocolVersion="HTTP/1.1" 1521122905846447300
iis_log,02_method=GET,03_response=200,host=LAPTOP-A5GCJ6G9,01_uriStem=/api/healthcheck subresponse="0",serviceName="W3SVC11",protocolVersion="HTTP/1.1",serverName="MYHOST",bytesSent=436i,port="7314",serverIP="fe80::beef:beef:beef:9071%12",bytesReceived=178i,clientIP="fe80::beef:beef:feeb:9071%12",username="-",uriQuery="-",win32response="0",log_timestamp="2018-03-08 00:03:00",userAgent="Mozilla/5.0+(Windows+NT;+Windows+NT+6.3;+en-GB)+WindowsPowerShell/5.1.14409.1012",requestHost="THEIRHOST:7314",timetaken=1781i,referer="-" 1521122905846447301

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

@danielnelson
Contributor

Thanks for the pull request. I'd like to see how much interest there is in this plugin before moving forward with merging it into master. Can everyone who needs this give it a thumbs up and add a description of how you would use it?

@ada-foss
Contributor Author

OK, that's a fair plan. I don't know if this is helpful, but here's my use case in a bit more detail. I have logs that look a bit like this (I built a file of 1000 records as an easier-on-the-eye example):

GET /api/Query 14
GET /api/Query 70
GET /api/Query 44
GET /api/Query 97
GET /API/query 38
GET /API/query 6 
GET /API/query 56
GET /api/Query 9 

And a configuration that looks like this

[[outputs.file]]
  files = ["/Program Files/Telegraf/telegraf.influx"]
  data_format = "influx"

[[aggregators.basicstats]]
  period = "10s"
  drop_original = true
  stats = ["count"]

[[inputs.logparser]]
  files = ["/Program Files/Telegraf/simple.log"]
  from_beginning = true
  tagexclude = ["path","host"]

  [inputs.logparser.grok]
    patterns = ['%{WORD:method:tag} %{URIPATH:uri_stem:tag} %{NUMBER:timetaken:int}']
    measurement = "iis_log"

Curiously I think there's something wrong with the pattern because it only produces 899 points from the file. I don't think it's a problem for the example though.

Because case does not match I get this

iis_log,method=GET,uri_stem=/API/query timetaken_count=448 1521213720000000000
iis_log,uri_stem=/api/Query,method=GET timetaken_count=451 1521213720000000000

But adding the lowercase plugin which looks like this

[[processors.lowercase]]
  namepass = ["iis_log"]
  [[processors.lowercase.tags]]
    key = "uri_stem"

Gives me what I want which is this (apart from 101 missing points, but since it's just an example I don't care)

iis_log,method=GET,uri_stem=/api/query timetaken_count=899 1521213900000000000
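
To make the effect on aggregation concrete, here is a minimal sketch of why case variants split the count and why lowercasing merges them. It does not use Telegraf's aggregator; `countByStem` is a hypothetical stand-in that simply counts occurrences per tag value.

```go
package main

import (
	"fmt"
	"strings"
)

// countByStem counts requests per uri_stem value. When normalize is true,
// each stem is lowercased first so that case variants share one series.
// Hypothetical stand-in for grouping by tag value in an aggregator.
func countByStem(stems []string, normalize bool) map[string]int {
	counts := make(map[string]int)
	for _, s := range stems {
		if normalize {
			s = strings.ToLower(s)
		}
		counts[s]++
	}
	return counts
}

func main() {
	stems := []string{"/api/Query", "/API/query", "/api/Query", "/API/query", "/api/Query"}
	fmt.Println(len(countByStem(stems, false))) // prints: 2
	fmt.Println(countByStem(stems, true))       // prints: map[/api/query:5]
}
```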

@danielnelson
Contributor

Right now this plugin is very specialized. I wonder if we should make this processor extendable so that we could add more common string processing functions later, which I think would help move this processor into the goldilocks zone:

[[processors.strings]]
  namepass = ["foo"]

  [[processors.strings.lowercase]]
    tag = "method"
    # defaults to value, apply operation to key or value
    convert = "value"
    result_key = "method_lower"

  ## imaginary operation we could add later:
  [[processors.strings.snakecase]]
    field = "NetworkPacketIn"
    convert = "key"
- foo,method=GET NetworkPacketIn=42i
+ foo,method=GET,method_lower=get network_packet_in=42i

Other string functions I can think of are upper, title, camelcase, casefold.
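
One way such an extendable processor could be organized is a table that maps an operation name to a plain string function. The sketch below is an assumption about the design, not code from the PR: `operations` and `snakeCase` are hypothetical names, and the snake-case conversion is deliberately naive.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// snakeCase inserts "_" before every upper-case rune except the first and
// lowercases the result. Deliberately naive: a run of capitals such as
// "HTTP" is split letter by letter.
func snakeCase(s string) string {
	var b strings.Builder
	for i, r := range s {
		if unicode.IsUpper(r) {
			if i > 0 {
				b.WriteByte('_')
			}
			r = unicode.ToLower(r)
		}
		b.WriteRune(r)
	}
	return b.String()
}

// operations maps a hypothetical operation name to its string function.
// Adding a new operation means adding one entry here.
var operations = map[string]func(string) string{
	"lowercase": strings.ToLower,
	"uppercase": strings.ToUpper,
	"snakecase": snakeCase,
}

func main() {
	fmt.Println(operations["lowercase"]("GET"))             // prints: get
	fmt.Println(operations["snakecase"]("NetworkPacketIn")) // prints: network_packet_in
}
```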

@ada-foss
Contributor Author

I've taken a look and I don't see why not. I imagined that you might make a separate plugin for different functions before I really understood how the plugins worked but it doesn't seem so hard now. I concentrated on lowercase because it was one way to solve my problem at the time.

Shall I adapt it to work like that but just include lowercase and one other function (to show how it can be extended) for now?

@danielnelson
Contributor

@bsmaldon Yeah that would be great, thanks

@Kaacz

Kaacz commented Jul 3, 2018

WOW, I need an uppercase plugin (to unify hostname tags). :)
Explanation: the supplier of the platform sends some server metrics to my Telegraf through his Telegraf. But he doesn't use the local hostname and overrides the "host" tag with the name of the functionality plus a number (for example, application server number 6 = "as06"). I need to convert it in my Telegraf to "VXAS06LOCALITY" (prefix + uppercase + suffix).
PS: about the prefix/suffix: maybe it is possible to do it in the socket/tcp4 input configuration ...
Thanks for the lowercase plugin, I will try to modify it for my case... but I am old and untouched by Go. :)
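
For readers in the same position, the transformation described here (prefix + uppercase + suffix) is small in Go. This is only a sketch of the string handling, with `decorateHost` as a hypothetical helper name; wiring it into a Telegraf processor is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// decorateHost upper-cases host and wraps it in a fixed prefix and suffix.
// Hypothetical helper illustrating the use case described above.
func decorateHost(prefix, host, suffix string) string {
	return prefix + strings.ToUpper(host) + suffix
}

func main() {
	fmt.Println(decorateHost("VX", "as06", "LOCALITY")) // prints: VXAS06LOCALITY
}
```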

@danielnelson
Contributor

Some other handy functions might be trim, trim_left and trim_right.
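
In Go these would map onto the standard strings package fairly directly. The sketch below assumes "trim" means stripping whitespace; the operation names and the `trimOps` table are hypothetical, while the strings and unicode calls are standard library functions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimOps maps hypothetical operation names to whitespace-trimming functions.
var trimOps = map[string]func(string) string{
	"trim":       strings.TrimSpace,
	"trim_left":  func(s string) string { return strings.TrimLeftFunc(s, unicode.IsSpace) },
	"trim_right": func(s string) string { return strings.TrimRightFunc(s, unicode.IsSpace) },
}

func main() {
	s := " /api/query "
	// Iterate over a fixed slice so the output order is stable.
	for _, name := range []string{"trim", "trim_left", "trim_right"} {
		fmt.Printf("%s: %q\n", name, trimOps[name](s))
	}
}
```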

@ada-foss
Contributor Author

I've picked this up again, but now that I look at it I'm not sure what the rationale for the convert = "<key|value>" option is. Surely if you know at configuration time which key you want to transform, you can simply set result_key to the key name you want and use tagexclude to drop the old tag? It might be nice to have a null processor that lets you manipulate tags and fields like that, with a drop_original option or something. Anyway, I'm proceeding with a mockup that leaves that option out unless I hear otherwise.

@danielnelson
Contributor

Sounds good, we will probably add a standalone processor for renames and then this plugin can focus on string manipulation.

@ada-foss
Contributor Author

I'm opening a new pull request for this because I rebased but don't know how to update this pull request. The updated pull request number is #4476.

@ada-foss closed this Jul 26, 2018

@danielnelson
Contributor

Okay, no problem. (The way you can do this is by force pushing over the old branch: git push myfork mybranch -f)
