Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(metadata-jobs): improve consumer logging #10173

Merged
merged 1 commit into from
Mar 31, 2024

Conversation

darnaut
Copy link
Collaborator

@darnaut darnaut commented Mar 31, 2024

Currently consumer logs do not record any events processed, making it difficult to analyze, troubleshoot or gain insights from event processing. This change improves logging so that consumer logs record all events processed.

Additionally, the timer measuring how long ingestAspectsToLocalDB takes no longer includes time spent in emitMCL or when there are no aspects in a batch.

For reference, new log output for an MCL event:

INFO  c.l.m.k.MetadataChangeLogProcessor - Got MCL event key: urn:li:dataHubExecutionRequest:modeldoc-generated, topic: MetadataChangeLog_Versioned_v1, partition: 0, offset: 20613, value size: 64766, timestamp: 1711856258850
INFO  c.l.m.k.MetadataChangeLogProcessor - Invoking MCL hooks for urn: urn:li:dataHubExecutionRequest:modeldoc-generated, aspect name: dataHubExecutionRequestResult, entity type: dataHubExecutionRequest, change type: UPSERT
INFO  c.l.m.k.MetadataChangeLogProcessor - Invoking MCL hook FormAssignmentHook for urn: urn:li:dataHubExecutionRequest:modeldoc-generated
INFO  c.l.m.k.MetadataChangeLogProcessor - Invoking MCL hook IngestionSchedulerHook for urn: urn:li:dataHubExecutionRequest:modeldoc-generated
INFO  c.l.m.k.MetadataChangeLogProcessor - Invoking MCL hook UpdateIndicesHook for urn: urn:li:dataHubExecutionRequest:modeldoc-generated
INFO  c.l.m.s.e.update.ESBulkProcessor - Added request id: WfsoJ2g66rUx73/uwkiemA==, operation type: UPDATE, index: system_metadata_service_v1
INFO  c.l.m.s.e.update.ESBulkProcessor - Added request id: urn%3Ali%3AdataHubExecutionRequest%3Amodeldoc-generated, operation type: UPDATE, index: datahubexecutionrequestindex_v2
INFO  c.l.m.k.MetadataChangeLogProcessor - Invoking MCL hook IncidentsSummaryHook for urn: urn:li:dataHubExecutionRequest:modeldoc-generated
INFO  c.l.m.k.MetadataChangeLogProcessor - Invoking MCL hook EntityChangeEventGeneratorHook for urn: urn:li:dataHubExecutionRequest:modeldoc-generated
INFO  c.l.m.k.MetadataChangeLogProcessor - Invoking MCL hook SiblingAssociationHook for urn: urn:li:dataHubExecutionRequest:modeldoc-generated
INFO  c.l.m.k.MetadataChangeLogProcessor - Successfully completed MCL hooks for urn: urn:li:dataHubExecutionRequest:modeldoc-generated
INFO  c.l.m.s.e.update.BulkListener - Successfully fed bulk request 27. Number of events: 8 Took time ms: 1022

MCP event:

INFO  c.l.m.k.MetadataChangeProposalsProcessor - Got MCP event key: urn:li:dataset:(urn:li:dataPlatform:snowflake,long_tail_companions.analytics.worldclass_solutions,PROD), topic: MetadataChangeProposal_v1, partition: 0, offset: 1703, value size: 283, timestamp: 1711856256171
INFO  c.l.m.entity.EntityServiceImpl - Ingesting aspects batch to database, items: [ChangeItemImpl{changeType=UPSERT, urn=urn:li:dataset:(urn:li:dataPlatform:snowflake,long_tail_companions.analytics.worldclass_solutions,PROD), aspectName='siblings', recordTemplate={siblings=[urn:li:dataset:(urn:li:dataPlatform:dbt,long_tail_companions.analytics.worldclass_solutions,PROD)], primary=false}, systemMetadata={lastObserved=1711856256173, runId=no-run-id-provided}}]
INFO  c.l.m.entity.EntityServiceImpl - Ingesting aspect with name siblings, urn urn:li:dataset:(urn:li:dataPlatform:snowflake,long_tail_companions.analytics.worldclass_solutions,PROD)
INFO  c.l.m.entity.EntityServiceImpl - Ingestion of aspects batch to database took 2 ms
INFO  c.l.m.entity.EntityServiceImpl - Skipped producing MCL for ingested aspect siblings, urn urn:li:dataset:(urn:li:dataPlatform:snowflake,long_tail_companions.analytics.worldclass_solutions,PROD). Aspect has not changed.
INFO  c.l.m.k.MetadataChangeProposalsProcessor - Successfully processed MCP event urn: urn:li:dataset:(urn:li:dataPlatform:snowflake,long_tail_companions.analytics.worldclass_solutions,PROD)

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added product PR or Issue related to the DataHub UI/UX community-contribution PR or Issue raised by member(s) of DataHub Community labels Mar 31, 2024
@darnaut darnaut force-pushed the darnaut-metadata-jobs-logging branch from c4c9d3d to 2fca5dc Compare March 31, 2024 14:19
@darnaut darnaut merged commit c3257d8 into master Mar 31, 2024
41 checks passed
@darnaut darnaut deleted the darnaut-metadata-jobs-logging branch March 31, 2024 15:07
sleeperdeep pushed a commit to sleeperdeep/datahub that referenced this pull request Jun 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community-contribution PR or Issue raised by member(s) of DataHub Community product PR or Issue related to the DataHub UI/UX
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants