Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add usage collection for savedObject tagging #83160

Merged
merged 9 commits into from
Nov 20, 2020

Conversation

pgayvallet
Copy link
Contributor

@pgayvallet pgayvallet commented Nov 11, 2020

Summary

Fix #81847

Add a new saved_objects_tagging usage collector to monitor tagging usages.

Data structure:

{
  usedTags: number;
  taggedObjects: number;
  types: Record<string, { usedTags: number; taggedObjects: number;}> 
}

Sample data:

{
  usedTags: 3;
  taggedObjects: 7;
  types: {
    dashboard: {
       usedTags: 3;
       taggedObjects: 5;
    },
    visualization: {
       usedTags: 1;
       taggedObjects: 2;
    },
  }
}

Checklist

@pgayvallet pgayvallet marked this pull request as ready for review November 11, 2020 18:31
@pgayvallet pgayvallet added Feature:Saved Object Tagging Saved Objects Tagging feature release_note:skip Skip the PR/issue when compiling release notes Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v7.11.0 v8.0.0 labels Nov 11, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-platform (Team:Platform)

@pgayvallet pgayvallet requested review from a team November 11, 2020 18:32
Copy link
Member

@Bamieh Bamieh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Mergable after removing the esClient check or we can remove it once we fix the bug.

Copy link
Contributor

@TinaHeiligers TinaHeiligers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now that the esClient check is removed.

@afharo
Copy link
Member

afharo commented Nov 18, 2020

FYI: the missing esClient issue will be gone once #83546 is merged :)

taggedObjects: { type: 'integer' },

types: {
dashboard: perTypeSchema,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we collect tags usage data for more than just these two saved object types. I think we should try to avoid having this kind of dynamic fields in our schema and it will also increase the field count on our telemetry cluster.

What about:

perSavedObjectType: {
  usedTags: { type: 'integer' },
  taggedObjects: { type: 'integer' },
  savedObjectType: {type: 'string'},
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perSavedObjectType: { ...

Should we use a nested or array type in that case?

That would work. However I would be sure that is would be alright to the way telemetry consumes / uses the data.

When I look at the other collectors, none seems to be used nested fields, which is why I'm asking:

export const applicationUsageSchema = {
// OSS
dashboards: commonSchema,
dev_tools: commonSchema,
discover: commonSchema,
home: commonSchema,
kibana: commonSchema, // It's a forward app so we'll likely never report it
management: commonSchema,

note that this could only be a design flaw when the collector was implemented. @elastic/kibana-telemetry?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! This is a legacy issue that no longer applies: in the past, before having the schema implementation, the indexer in the remote service indexed the data in the same format it received it from Kibana. This meant that we had to keep the mappings in mind when developing the collectors (and nested was never an option because Kibana does not natively support viz with that type yet).
New collectors/fields shouldn't have those contraints anymore because the new indexer actually reshapes the payloads and splits the info into multiple indices based on the different needs to consume the information by our PMs and Analysts.
If we foresee the number of keys growing dynamically, I think the array approach would be best.

cc @mindbat so he can share his thoughts from the indexer-maintainer POV 🙂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New collectors/fields shouldn't have those contraints anymore because the new indexer actually reshapes the payloads and splits the info into multiple indices based on the different needs to consume the information by our PMs and Analysts.

@afharo: We're not sending Kibana data to Telemetry Next yet and the legacy indexer doesn't reshape data by default. We would still need to update the indexer to consume the new data, reshape it and index it into a custom extended index.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should we stick with the static, per type, mappings for now?

Copy link
Contributor

@rudolf rudolf Nov 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we just index data as-is, can you elaborate why we couldn't use an array field type for the mappings?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and nested was never an option because Kibana does not natively support viz with that type yet

I missed @afharo's comment and I assume this applies to array fields too #58175

It makes sense that we want to do some processing inside Telemetry Next, but I think it would be much more flexible if Kibana had more control around which documents are created. If telemetry is uploaded as NDJSON then usage collectors could return several documents instead of always having a single large payload that needs to be handled and split by the server. This is probably a big effort, but maintaining a static schema for dynamic data like applications and saved object types that change every release feels like a maintenance nightmare.

So I guess for this PR we just need to add all the known saved object types to the mappings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I guess for this PR we just need to add all the known saved object types to the mappings.

We only need to add types that currently supports the tagging feature. That would just be dashboard, visualizations and maps. Will add maps. Further additions should only be done when the app/type implements the tagging feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 7b6d76e: added map to the schema, and updated the readme to state that developers should update the schema when adding tagging support to additional SO types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rudolf

Even if we just index data as-is, can you elaborate why we couldn't use an array field type for the mappings?

Without any transformations on the remote service indexer, we'd end up with data mapped as an array in elasticsearch and face the usual problem of not being able to meaningfully query or visualize that data. It's a situation we're trying to fix with ui_metric right now, for example.

@@ -1841,6 +1817,30 @@
}
}
},
"infraops": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elastic/kibana-telemetry Out of curiosity, any idea why the order of fields are changing here? Would be nice if we could make it stable to avoid unnecessary diffs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Bamieh do you have any idea how we can prevent reordering fields in the json files?

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Distributable file count

id before after diff
default 42957 42962 +5

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@pgayvallet pgayvallet merged commit 22e494e into elastic:master Nov 20, 2020
pgayvallet added a commit to pgayvallet/kibana that referenced this pull request Nov 20, 2020
* add so tagging usage collection

* update telemetry mappings

* fix types

* remove check on esClient presence

* update schema and README
gmmorris added a commit to gmmorris/kibana that referenced this pull request Nov 20, 2020
* master: (38 commits)
  [ML] Data frame analytics: Adds functionality to map view (elastic#83710)
  Add usage collection for savedObject tagging (elastic#83160)
  [SECURITY_SOLUTION] 145: Advanced Policy Tests (elastic#82898)
  [APM] Service overview transactions table (elastic#83429)
  [ML] Fix Single Metric Viewer not loading if job is metric with no partition (elastic#83880)
  do not export types from 3rd party modules as 'type' (elastic#83803)
  [Fleet] Allow to send SETTINGS action (elastic#83707)
  Fixes Failing test: Chrome X-Pack UI Functional Tests.x-pack/test/functional_with_es_ssl/apps/triggers_actions_ui/details·ts - Actions and Triggers app Alert Details Alert Instances renders the active alert instances (elastic#83478)
  [Uptime]Reduce chart height on monitor detail page (elastic#83777)
  [APM] Prefer `APIReturnType` over `PromiseReturnType` (elastic#83843)
  [Observability] Fix telemetry for Observability Overview (elastic#83847)
  [Alerting] Adds generic UI for the definition of conditions for Action Groups (elastic#83278)
  ensure workload agg doesnt run until next interval when it fails (elastic#83632)
  [ILM] Policy form should not throw away data (elastic#83077)
  [Monitoring] Stop collecting Kibana Usage in bulkUploader (elastic#83546)
  [TSVB] fix wrong imports (elastic#83798)
  [APM] Correlations UI POC (elastic#82256)
  list all the refs in  tsconfig.json (elastic#83678)
  Bump jest (and related packages) to v26.6.3 (elastic#83724)
  Functional tests - stabilize reporting tests for cloud execution (elastic#83787)
  ...
pgayvallet added a commit that referenced this pull request Nov 21, 2020
* Add usage collection for savedObject tagging (#83160)

* add so tagging usage collection

* update telemetry mappings

* fix types

* remove check on esClient presence

* update schema and README

* fix mappings for backport
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Saved Object Tagging Saved Objects Tagging feature release_note:skip Skip the PR/issue when compiling release notes Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v7.11.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Usage Analytics on Saved Object Tags
8 participants