This repository has been archived by the owner on Jan 15, 2024. It is now read-only.


architecture.md

Architecture

Overview

The Jaeger S3 GRPC plugin uses Amazon S3 to store spans and makes them queryable using Amazon Athena.

flowchart LR
  spans --> jaeger-collector

  jaeger-collector -- write as parquet --> s3["AWS S3 bucket"]

  jaeger-query --> athena["AWS Athena"] --> s3

  style spans stroke-dasharray: 5 5


Writing

Incoming spans are converted to Parquet, a columnar storage format, and partitioned for more efficient querying, following the recommendations from AWS.

By default, a new Parquet file is opened every 60s and incoming spans are streamed into it. We found that 60s is a good compromise between creating files large enough for efficient querying and making spans available soon enough to feel close to real time. If you have different needs, you can adjust the s3.bufferDuration configuration value.

Querying

While Athena is a great fully managed query engine, queries usually take seconds rather than milliseconds.

To still provide a pleasant user experience, we fetch past Athena queries and their results and use them as a query cache, which improves response times and reduces costs.