Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flexible EBT Performance Metric Schema #136395

Merged
merged 65 commits into from
Aug 3, 2022

Conversation

lizozom
Copy link
Contributor

@lizozom lizozom commented Jul 14, 2022

Summary

In order to simplify the consumption of performance telemetry, we’d like to standardize the shape of events. This requires considering the following aspects:

  • Performance - amount and size of events being sent
  • Discoverability - how easy it is to find different events, dimensions and metrics
  • Queriability - we would like to have the ability to write complex queries on top of telemetry data; for example getting all dashboard load events that took over 10 seconds for dashboards that contain less than 5 panels.
  • Flexibility - Teams should be able to flexibly design their event structure without having to define custom mappings (except for edge cases). I.e. a team can decide that the first free field is always used for that page load.
  • Ease of visualization - it should be easy to visualize the data
  • Index field explosion - having too many field names in the index, damages the performance when visualizing the data.

Balancing these requirements required certain compromises that led us to the proposed solution:

Performance metric events are registered once (on server and on client) using the event_type: performance_metric and a standardized base schema:

  • Event name - We decided to report all metric events under a single event_type, so that we can avoid registering it multiple times. This requires us to add a special standard field for eventName.
  • Standardized fields - these fields represent common values that might be reported for performance. This can include duration for time based reports, status, heapSize, memorySize, etc. Standardized fields will be mapped as is to the index, so they can be visualized.
  • Free fields - A limited set of numbered free key-value pairs that can be used to report additional use case specific fields. Teams may use them to create their own internal sub-schema. For example, a team may choose to use key1 to always report the loading time of a certain component that is always present on the page. Free fields will be mapped, if present, to the index, so they can be visualized. If a free field becomes popular, it can be promoted to a standardized field, freeing up the free field for other use.
  • Non standard fields - additional fields may be added to an event, by extending them to the meta object. However those won’t be automatically mapped to the index. Since the properties field is a Flattened Field, this means those fields will be searchable by default, but an additional field mapping would be required to visualize them.

For example:

{
   "event_type": "performance_metric",
   "properties": {
      “eventName”: “dashboard_load”,
      “meta”: {
          “fullscreen”: 1,
      }
      "duration" : 3500,
      "key1"  : "time_to_render",
      "value1": 3000,
      "key2"  : "time_to_data",
      "value2": 2000,
      "key3"  : "...",
      "value3": "...",
      "key4"  : "...",
      "value4": "...",
      "key5"  : "...",
      "value5": "...",
   }
}

These events will then be mapped as implemented in https://github.com/elastic/telemetry/pull/1459

Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine but I left a few comments.

src/core/public/utils/events.ts Outdated Show resolved Hide resolved
* Side Public License, v 1.
*/

// TODO: replace with kibana-loaded
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover?

src/core/utils/events.ts Outdated Show resolved Hide resolved
src/plugins/dashboard/public/plugin.tsx Outdated Show resolved Hide resolved
@suchcodemuchwow suchcodemuchwow requested a review from a team July 14, 2022 15:16
@lizozom lizozom changed the title use flexible performancce event schema use flexible performance event schema Jul 14, 2022
Copy link
Member

@afharo afharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Echoing what we commented on Slack: I'm mostly concerned with the breaking changes we are introducing. Especially when we are still discussing if this is the event structure and any potential wrappers/API-helpers we may want to provide in the future.

While in phase 1 we don't need these API-helpers, we may introduce them in future phases to improve the Developer Experience, so, IMO, we should think about what event structure and implementation will give us the highest level of flexibility to adapt to new requirements as we find them in the journey.

src/core/utils/events.ts Outdated Show resolved Hide resolved
src/core/public/utils/index.ts Outdated Show resolved Hide resolved
src/core/server/server.ts Outdated Show resolved Hide resolved
src/core/public/utils/events.ts Outdated Show resolved Hide resolved
src/core/public/utils/events.ts Outdated Show resolved Hide resolved
@lizozom
Copy link
Contributor Author

lizozom commented Jul 19, 2022

@afharo I resovled both backward compatibility problems and used underscores to help the indexer index the event names correctly.
Also moved code into a package.

@suchcodemuchwow suchcodemuchwow changed the title [EBT][performance] use flexible metric \ performance event schema Flexible EBT Performance Metric Schema Aug 2, 2022
Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did another pass after your most recent changes. I left a few comments but consider all of them only nits. The PR looks good me. Please feel free to address my nits but from my end it does not require an additional review cycle.

Copy link
Member

@afharo afharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some nits.

We'll discuss tomorrow in our meeting how to best approach the keyN/valueN vs meta 😇

Copy link
Member

@afharo afharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
core 357 362 +5
dashboard 377 382 +5
total +10

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/ebt-tools - 11 +11

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
core 136.4KB 136.4KB +16.0B
dashboard 372.3KB 374.5KB +2.2KB
total +2.2KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
core 344.8KB 346.5KB +1.7KB
dashboard 66.4KB 66.0KB -417.0B
total +1.3KB
Unknown metric groups

API count

id before after diff
@kbn/ebt-tools - 19 +19

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @lizozom @suchcodemuchwow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting performance release_note:skip Skip the PR/issue when compiling release notes telemetry Issues related to the addition of telemetry to a feature v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance Event Schema