Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[new feature] promtail: Add a Promtail stage for probabilistic sampling #7127

Merged
merged 38 commits into from
Mar 17, 2023

Conversation

liguozhong
Copy link
Contributor

@liguozhong liguozhong commented Sep 10, 2022

What this PR does / why we need it:

The sampling stage can be directly sampled.
The implementation of sampling is to use the algorithm in jaeger go client

pipeline_stages:
- sampling:
     rate: 0.1

or it can be used with match for precise sampling.

pipeline_stages:
- json:
    expressions:
      app:
- match:
    pipeline_name: "app2"
    selector: "{app=\"poki\"}"
    stages:
    - sampling:
        rate: 0.1

Which issue(s) this PR fixes:
Fixes #6654

Special notes for your reviewer:

The promtail 'rate' stage is also used with the 'match' stage for log filtering.This design makes the code very clean.
Other log agents vector have log filtering built into the sampling operator, which I think is too complicated
https://vector.dev/docs/reference/configuration/transforms/sample/

 transforms:
  my_transform_id:
    type: sample
    inputs:
      - my-source-or-transform-id
    exclude: null
    rate: 10

'rate' stage review suggestions .
#5051
image

Checklist

  • Documentation added
  • Tests updated
  • Is this an important fix or new feature? Add an entry in the CHANGELOG.md.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md

@liguozhong liguozhong requested a review from a team as a code owner September 10, 2022 00:05
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

}, nil
}

// dropStage applies Label matchers to determine if the include stages should be run
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs correction :)

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

cfg: cfg,
dropCount: getDropCountMetric(registerer),
samplingBoundary: samplingBoundary,
pool: sync.Pool{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am assuming the reference implementation is using a sync.Pool for storing rand generators since the generator created by NewSource is not safe for concurrent use. And as an added benefit generating random number from this pool would make use of generators created with different seeds.

But in this stage, we have a single go routine that's generating the random numbers which means we don't necessarily need a sync.Pool maybe we should drop this

@stale
Copy link

stale bot commented Nov 2, 2022

Hi! This issue has been automatically marked as stale because it has not had any
activity in the past 30 days.

We use a stalebot among other tools to help manage the state of issues in this project.
A stalebot can be very useful in closing issues in a number of cases; the most common
is closing issues or PRs where the original reporter has not responded.

Stalebots are also emotionless and cruel and can close issues which are still very relevant.

If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry.

We regularly sort for closed issues which have a stale label sorted by thumbs up.

We may also:

  • Mark issues as revivable if we think it's a valid issue but isn't something we are likely
    to prioritize in the future (the issue will still remain closed).
  • Add a keepalive label to silence the stalebot if the issue is very common/popular/important.

We are doing our best to respond, organize, and prioritize all issues but it can be a challenging task,
our sincere apologies if you find yourself at the mercy of the stalebot.

@stale stale bot added the stale A stale issue or PR that will automatically be closed. label Nov 2, 2022
# Conflicts:
#	clients/pkg/logentry/stages/stage.go
@stale stale bot removed the stale A stale issue or PR that will automatically be closed. label Feb 9, 2023
@jeschkies
Copy link
Contributor

Nice. Could you update the documentation?

@liguozhong
Copy link
Contributor Author

done.

Copy link
Contributor

@JStickler JStickler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Docs squad} Couple of suggestions and questions for the documentation.

docs/sources/clients/promtail/stages/sampling.md Outdated Show resolved Hide resolved
docs/sources/clients/promtail/stages/sampling.md Outdated Show resolved Hide resolved
docs/sources/clients/promtail/stages/sampling.md Outdated Show resolved Hide resolved
liguozhong and others added 4 commits February 10, 2023 19:00
Co-authored-by: J Stickler <julie.stickler@grafana.com>
Co-authored-by: J Stickler <julie.stickler@grafana.com>
@liguozhong
Copy link
Contributor Author

thanks,I modified the document to 2 examples

@jeschkies
Copy link
Contributor

I think @JStickler needs to approve 🙂

Copy link
Contributor

@JStickler JStickler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Docs squad]

@liguozhong
Copy link
Contributor Author

thanks

@MasslessParticle
Copy link
Contributor

Awesome, thanks for the contribution!

@MasslessParticle MasslessParticle merged commit 0295fd4 into grafana:main Mar 17, 2023
@filip-kolodziej-synerise

PR was merged 4m ago, any plans for merging it to the release branch as well?

@PrayagS
Copy link

PrayagS commented Aug 8, 2023

PR was merged 4m ago, any plans for merging it to the release branch as well?

+1. Looking forward to utilizing this feature. With the limit stage, it's more of a hit-and-miss thing to get the percentage right.

grafanabot added a commit that referenced this pull request Sep 1, 2023
MasslessParticle added a commit that referenced this pull request Sep 1, 2023
…Promtail stage for probabilistic sampling (#10425)

Add PR #7127 to release notes for next release

Co-authored-by: Travis Patterson <travis.patterson@grafana.com>
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
… Add a Promtail stage for probabilistic sampling (grafana#10425)

Add PR grafana#7127 to release notes for next release

Co-authored-by: Travis Patterson <travis.patterson@grafana.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
add-to-release-notes size/L type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a Promtail stage for probabilistic sampling
8 participants