Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON output support with --json/-J option #160

Draft
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

bonkf
Copy link

@bonkf bonkf commented Nov 22, 2019

This PR revamps #141 and fixes all the complaints. It adds support for outputting pretty-printed JSON with variable indentation (--json=n), outputting compact non-pretty-printed JSON (--json=0) and outputting the --vcount and --outage options' results in JSON format.

Unfortunately choosing a custom indentation or disabling pretty-printing with -J 4 or -J 0 does not work, because the parser can't tell the difference between an option's optarg and a non-option argument (e.g. an IP address).

Seems like the parser accepts -J4 or -J0.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.6%) to 80.131% when pulling 3923fa5 on Reperator:json-output into 12961d5 on schweikert:develop.

@schweikert
Copy link
Owner

Thanks for the patch! I will need some time to review it / integrate it, but I just wanted to already say that it's much appreciated :-)

@schweikert
Copy link
Owner

Considering that the JSON output is going to be an interface and we won‘t be able to change it later, we should make sure that the schema is how we want to keep it long term. Seeing the current output:

fping --json -n -A -c3 -p100 dns.google.com                                  
{
  "hosts": {
    "dns.google.com (2001:4860:4860::8844)": {
      "xmt": 3,
      "rcv": 3,
      "loss_percentage": 0,
      "min": 3.32,
      "avg": 3.32,
      "max": 3.33
    }
  }
}

I see a couple of issues:

  1. The schema changes depending on the options that you use. If for example -C3 is used instead of -c3, then the values are the rtt measurements as array instead of the keys „xmt“, etc. We probably should use a consistent schema independently of what options are used, and just fill what is relevant.
  2. Using -n and -A combined forces the user to parse the output still (to get the name and address). They probably should be keys instead.
  3. „loss_percentage“ is quite long compared to the other keys. I also wonder whether this is needed at all (it is reduntant)
  4. Maybe the key "hosts" could be renamed "summary" so that in the future also other output could be intergrated (the output that comes without -q, which --json currently implies)

How the output could look like:

{
  "summary": {
    "dns.google.com“: {
      "addr": "2001:4860:4860::8844",
      "xmt": 3,
      "rcv": 3,
      "min": 3.32,
      "avg": 3.32,
      "max": 3.33,
      "rtt": [ 3.32, 3.32, 3.33 ]
    }
  }
}

Let me know what you think. I can also work on making these changes.

@bonkf
Copy link
Author

bonkf commented Jan 1, 2020

  1. The schema changes depending on the options that you use. If for example -C3 is used instead of -c3, then the values are the rtt measurements as array instead of the keys „xmt“, etc. We probably should use a consistent schema independently of what options are used, and just fill what is relevant.

Good point. I think it best if we were to always include every possible field and null out everything that wouldn't be included in non-JSON output, e.g. something like

{
  "summary": {
    "dns.google.com": {
      "addr": "2001:4860:4860::8844",
      "host": "dns.google.com",
      "xmt": 3,
      "rcv": 3,
      "outage": null,
      "min": 3.32,
      "avg": 3.32,
      "max": 3.33,
      "rtt": null
    }
  }
}

or

{
  "summary": {
    "2001:4860:4860::8844": {
      "addr": "2001:4860:4860::8844",
      "host": null,
      "xmt": null,
      "rcv": null,
      "outage": null,
      "min": null,
      "avg": null,
      "max": null,
      "rtt": [
        3.32,
        3.32,
        3.33
      ]
    }
  }
}

OTOH this is kind of wasteful since we can easily do the summary calculations even if we print all rtts anyway. Opinions?

  1. Using -n and -A combined forces the user to parse the output still (to get the name and address). They probably should be keys instead.

I'll add an addr key and a host key that include address and hostname respectively. I'll only include the hostname if the user provided it or if we've done rDNS resolution due to -n.

  1. „loss_percentage“ is quite long compared to the other keys. I also wonder whether this is needed at all (it is reduntant)

Technically most keys are redundant as long as we include the rtts. I'll throw out return_percentage and loss_percentage and keep the others.

  1. Maybe the key "hosts" could be renamed "summary" so that in the future also other output could be intergrated (the output that comes without -q, which --json currently implies)

Will do.

Let me know what you think. I can also work on making these changes.

No worries, I'll take care of it.

Edit: Do you want me to squash my commits?

Edit 2: Another thing I noticed is the issue of duplicate keys. Suppose a user runs fping -A -n -J 127.0.0.1 localhost or even simpler fping -A -n -J localhost localhost. Both summary entries would have the key localhost. JSON allows this, but I can imagine many parsers failing on this, especially if the JSON is supposed to be parsed as a programming language structure such as a hash table or an object. The problem is that identical keys may only become apparent after we have already done rDNS resolution. For now I'll just output duplicate keys; if this becomes a problem we may have to fail if we were about to output duplicate keys, or assign duplicate keys identifiers such as localhost (1), or instead of summary being an object we could make it an array. This would alleviate these problems but necessitates iterating over the array to find a specific entry (and we would still have entries that have identical addresses and hostnames).

@jasoncdavis
Copy link

Reperator, thanks for picking this back up! I'm encouraged to see others found value in the original suggestion. I've been using fping 4.2 with my original patch suggestions, but I look forward to your fix-ups and enhancements!

@mmihalev
Copy link

Any news about this? It will be really awesome to have json output!

@bkuker
Copy link

bkuker commented Aug 11, 2020

My two cents is that the schema needs work. The point about the schema changing is good, but in my opinion the "null" suggestion is not quite right. Json consuming code that has to distinguish between a field being null and a field not being present is almost always going to have bugs.

" We probably should use a consistent schema independently of what options are used, and just fill what is relevant." is a good approach, and allows for additions to the schema in the future while preserving backwards compatibility with any JSON consumer that is not going to be confused by new keys (Another good practice IMHO)

@bonkf
Copy link
Author

bonkf commented Aug 12, 2020

I'm sorry, I completely forgot about this. I'll start working on it today; I hope I can get it finished soon-ish.

Json consuming code that has to distinguish between a field being null and a field not being present is almost always going to have bugs.

Agreed, I'll keep the schema consistent across options.

@mmihalev
Copy link

What about using numeric keys instead of IP address as a key? This should eliminate the issue with repeated IPs?

@bonkf
Copy link
Author

bonkf commented Aug 12, 2020

In that case we can just use JSON arrays. My main gripe with that is that the API for JSON objects is much nicer than the API for JSON arrays in many languages (hashtables vs. lists).

@mmihalev
Copy link

In that case we can just use JSON arrays. My main gripe with that is that the API for JSON objects is much nicer than the API for JSON arrays in many languages (hashtables vs. lists).

I think this should not be an issue. Every JSON consumer can work with arrays with ease. And all the API's that I know about are using json arrays in their json responses.
P.S.: Maybe RTT's responses should also be array? If number of pings (-c) is greater than 1 we can include all the RTTs inside an array?

@bkuker
Copy link

bkuker commented Aug 12, 2020

Not a big deal, I am already running fping and parsing the string results into json and passing them further along to other software... This would apply to yaml or xml too, I do the conversion on more of a line by line basis and emit a stream of json documents.
In my case those documents have three different schemas, "progress" generated once per ping per IP, "result" generate once per IP, and "verbose result" which extends the result schema and has the verbose results, big surprise.

If this were of interest, https://en.wikipedia.org/wiki/JSON_streaming has some suggestions on outputting multiple json objects over a stream.

I can share an example as a different straw man to beat on.

//Ignore insonsistent numbers, I edited by hand for readability.
//Progress Results
{"ip": "8.8.8.8", "seq": 0, "size": 96, "rtt": 20.47, "rttAvg": 20.47, "loss": 0}
{"ip": "8.8.4.4", "seq": 0, "size": 96, "rtt": 23.73, "rttAvg": 23.73, "loss": 0}
{"ip": "8.8.8.8", "seq": 1, "size": 96, "rtt": 21.27, "rttAvg": 21.00, "loss": 0}
{"ip": "8.8.4.4", "seq": 1, "size": 96, "rtt": 24,00, "rttAvg": 23.53, "loss": 0}
{"ip": "8.8.4.4", "seq": 2, "size": 96, "rtt": 21.27, "rttAvg": 22.47, "loss": 0}
{"ip": "8.8.8.8", "seq": 2, "size": 96, "rtt": 22.73, "rttAvg": 21.27, "loss": 0}
{"ip": "8.8.8.8", "seq": 3, "size": 96, "rtt": 22.50, "rttAvg": 21.47, "loss": 0}
//lost #3 for 8.8.4.4
{"ip": "8.8.8.8", "seq": 4, "size": 96, "rtt": 22.73, "rttAvg": 22.00, "loss": 0}
{"ip": "8.8.4.4", "seq": 4, "size": 96, "rtt": 25.53, "rttAvg": 24.27, "loss": 0.2}

//Result (or verbose result. not both)
{"ip": "8.8.8.8", "sent": 5, "received": 5, "loss": 0, "rtt": {"min": 21.29, "avg": 22.29, "max": 23.89}}

//VerboseResult (note null in rtts array for lost packet)
{"ip": "8.8.4.4", "sent": 5, "received": 4, "loss": 0.20, "rtt": {"min": 21.45, "avg": 24.34, "max": 26.36}, "jitter": {"min": 0.266, "avg": 2.95, "max": 5.91}, "rtts": [23.43,null,21.17,26.76,25.26]}
{"ip": "8.8.8.8", "sent": 5, "received": 5, "loss": 0, "rtt": {"min": 20.62, "avg": 22.16, "max": 22.53}, "jitter": {"min": 0.1875, "avg": 0.949, "max": 1.719}, "rtts": [20.86,21.18,22.95,22.56,22.49]}

@mmihalev
Copy link

mmihalev commented Aug 12, 2020

Not a big deal, I am already running fping and parsing the string results into json and passing them further along to other software... This would apply to yaml or xml too, I do the conversion on more of a line by line basis and emit a stream of json documents.
In my case those documents have three different schemas, "progress" generated once per ping per IP, "result" generate once per IP, and "verbose result" which extends the result schema and has the verbose results, big surprise.

If this were of interest, https://en.wikipedia.org/wiki/JSON_streaming has some suggestions on outputting multiple json objects over a stream.

I can share an example as a different straw man to beat on.

//Ignore insonsistent numbers, I edited by hand for readability.
//Progress Results
{"ip": "8.8.8.8", "seq": 0, "size": 96, "rtt": 20.47, "rttAvg": 20.47, "loss": 0}
{"ip": "8.8.4.4", "seq": 0, "size": 96, "rtt": 23.73, "rttAvg": 23.73, "loss": 0}
{"ip": "8.8.8.8", "seq": 1, "size": 96, "rtt": 21.27, "rttAvg": 21.00, "loss": 0}
{"ip": "8.8.4.4", "seq": 1, "size": 96, "rtt": 24,00, "rttAvg": 23.53, "loss": 0}
{"ip": "8.8.4.4", "seq": 2, "size": 96, "rtt": 21.27, "rttAvg": 22.47, "loss": 0}
{"ip": "8.8.8.8", "seq": 2, "size": 96, "rtt": 22.73, "rttAvg": 21.27, "loss": 0}
{"ip": "8.8.8.8", "seq": 3, "size": 96, "rtt": 22.50, "rttAvg": 21.47, "loss": 0}
//lost #3 for 8.8.4.4
{"ip": "8.8.8.8", "seq": 4, "size": 96, "rtt": 22.73, "rttAvg": 22.00, "loss": 0}
{"ip": "8.8.4.4", "seq": 4, "size": 96, "rtt": 25.53, "rttAvg": 24.27, "loss": 0.2}

//Result (or verbose result. not both)
{"ip": "8.8.8.8", "sent": 5, "received": 5, "loss": 0, "rtt": {"min": 21.29, "avg": 22.29, "max": 23.89}}

//VerboseResult (note null in rtts array for lost packet)
{"ip": "8.8.4.4", "sent": 5, "received": 4, "loss": 0.20, "rtt": {"min": 21.45, "avg": 24.34, "max": 26.36}, "jitter": {"min": 0.266, "avg": 2.95, "max": 5.91}, "rtts": [23.43,null,21.17,26.76,25.26]}
{"ip": "8.8.8.8", "sent": 5, "received": 5, "loss": 0, "rtt": {"min": 20.62, "avg": 22.16, "max": 22.53}, "jitter": {"min": 0.1875, "avg": 0.949, "max": 1.719}, "rtts": [20.86,21.18,22.95,22.56,22.49]}

I think showing the "progress" per IP per ping is a must! Either by adding this into the final response or outputting it while running. Right now the proposed implementation will imply the -q flag which is going to hide per-target/per-ping results

@bkuker
Copy link

bkuker commented Aug 12, 2020

IMHO though I could be over-complicating, flags would allow you to choose between multiple json document output like I suggested, or a single json output with all the results, possibly including progress.
This could be as simple as either one json object per line, or starting with a [ adding a , after each json object per line, and ending with ]:

[
{"ip": "8.8.8.8", "seq": 0, "size": 96, "rtt": 20.47, "rttAvg": 20.47, "loss": 0}
,
{"ip": "8.8.4.4", "seq": 0, "size": 96, "rtt": 23.73, "rttAvg": 23.73, "loss": 0}
,
{"ip": "8.8.8.8", "seq": 1, "size": 96, "rtt": 21.27, "rttAvg": 21.00, "loss": 0}
,
{"ip": "8.8.4.4", "seq": 1, "size": 96, "rtt": 24,00, "rttAvg": 23.53, "loss": 0}
,
..... output omitted
]

But it could be more complex:

{
  progress: [ ...array of progress... ],
  results: { ...hash of results... },
  options: { .. I love when programs output the options they were run with in structured output like this... }
}

fping --jsonStream | someCommandThatProcessesOneLineAtATimeWhereEachLineIsAValidJsonObject

fping --jsonArray > myValidJsonFile.json

@bonkf
Copy link
Author

bonkf commented Aug 24, 2020

I'm a bit busy at work at the moment so it might take some time until I can get back to this PR, sorry. :(

@dimir
Copy link

dimir commented Sep 25, 2020

Thank you, @Reperator, this is really useful feature for us, talking from Zabbix perspective! I'd agree with @bkuker that it makes more sense to not list tags that have no values.

OK here are my 2 cents. I have grouped the output in targets, within those there is data for each packet and summary. Also, I have added some ideas about error handling:

{
    "error": "in case of error",
    "targets": {
        "dns.google.com": {
            "addr": "8.8.8.8",
            "packets": [
                {
                    "bytes": 64,
                    "rtt": 3.29,
                    "avg": 3.35,
                    "loss": 11,
                    "source": "4.4.4.4"
                },
                {
                    "error": "timed out"
                }
            ],
            "summary": {
                "min": 21.29,
                "avg": 22.29,
                "max": 23.89
            }
        },
        "7.7.7.7": {
            "addr": "7.7.7.7",
            "packets": [
                {
                    "error": "timed out"
                },
                {
                    "error": "timed out"
                }
            ],
            "summary": {}
        }
    }
}

@stefan-ahrefs
Copy link

anyone still working on this or it was forgotten?

@bonkf
Copy link
Author

bonkf commented Mar 4, 2022

I'm not working on it anymore, I left my old company a year ago and don't have the time to look into this again.

@auerswal
Copy link
Collaborator

I'll convert this to a "draft" since it does not apply currently, and work on it has stopped.

@auerswal auerswal marked this pull request as draft August 31, 2024 12:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants