
Design doc: object store scrape #2107

Closed

Conversation

adityacs
Contributor

@adityacs adityacs commented May 21, 2020

What this PR does / why we need it:
This PR is to start the discussion on how we should implement the object store scraping feature described in #2045.

Thanks to @cyriltovena for his help and guidance.

Also, thanks to @Punkoivan, some of the points are taken from his comments here #2045

@codecov-commenter

codecov-commenter commented May 21, 2020

Codecov Report

Merging #2107 into master will increase coverage by 0.00%.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master    #2107   +/-   ##
=======================================
  Coverage   61.21%   61.22%           
=======================================
  Files         146      146           
  Lines       11187    11187           
=======================================
+ Hits         6848     6849    +1     
+ Misses       3792     3789    -3     
- Partials      547      549    +2     
| Impacted Files | Coverage Δ |
| --- | --- |
| pkg/promtail/targets/tailer.go | 73.86% <0.00%> (-4.55%) ⬇️ |
| pkg/promtail/targets/filetarget.go | 68.67% <0.00%> (-1.81%) ⬇️ |
| pkg/logql/evaluator.go | 90.69% <0.00%> (+0.58%) ⬆️ |
| pkg/logql/vector.go | 87.50% <0.00%> (+18.75%) ⬆️ |


### Where should we keep the positions.yaml file?

Since the files we scrape live remotely in an object store, should we keep the `positions.yaml` in the same place? Keeping `positions.yaml` in the object store would let us recover the scraping state when promtail runs in a Docker container and the container crashes. However, the `user` reading files from the object store would then also need write access to it, and granting write access may not be possible where people don't want to risk the files being modified.
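For illustration, a positions file for object-store targets might track an offset per object key, roughly like the sketch below (the layout and field names are hypothetical, not promtail's current format):

```yaml
# Hypothetical positions file for object-store targets.
positions:
  "s3://my-bucket/alb-logs/2020/05/21/log-1.gz":
    offset: 4096
    last_modified: "2020-05-21T10:15:00Z"
```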
Contributor

You gotta be very careful with this due to the data consistency model in object storage systems. You may get stale data on a read after write, meaning you scrape the same data twice. Potentially consider a highly consistent system like consul/etcd, similar to how Loki uses these for its distributor/ingester ring.

This would also enable usage of locking/leasing so that multiple promtail instances could coordinate scraping. I think this is pretty important since it's possible during rolling updates that you end up with multiple promtail instances running, and this could also lead to duplicated scrapes/data.
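As an illustration of the locking/leasing the comment suggests, a minimal sketch using the Consul Go client; the key name and setup are assumptions, not part of the proposal:

```go
package main

import (
	"log"

	consul "github.com/hashicorp/consul/api"
)

func main() {
	// Connect to a local Consul agent at the default address.
	client, err := consul.NewClient(consul.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Acquire a lock so only one promtail instance scrapes a given bucket/prefix.
	lock, err := client.LockKey("promtail/objectstore/my-bucket/lock") // hypothetical key
	if err != nil {
		log.Fatal(err)
	}
	stopCh := make(chan struct{})
	if _, err := lock.Lock(stopCh); err != nil {
		log.Fatal(err)
	}
	defer lock.Unlock()

	// ... scrape objects and update positions while holding the lock ...
}
```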

Contributor Author

@adityacs adityacs May 22, 2020

Very valid point. Especially with S3, we should be very careful with every operation we do, like LIST, PUT, and GET, because of eventual consistency issues.

Contributor

I think this is fine for now if we just skip out-of-order entries and try to always snapshot the position file when terminating, and from time to time. This file should just be used for resuming/starting.

Contributor

True. You still need to think about concurrency though.


### How should we support reading of compressed files?

If the files in the object store are stored in a compressed format, promtail should support reading the compressed data, and we should make sure we cover most of the compression algorithms we already support for chunks (Snappy, LZ4, Gzip).
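To make the idea concrete, a minimal sketch of reading a gzip-compressed stream line by line in Go; the file name is a stand-in for the object's body reader, and other codecs such as Snappy or LZ4 would plug in behind the same `io.Reader`:

```go
package main

import (
	"bufio"
	"compress/gzip"
	"fmt"
	"io"
	"log"
	"os"
)

// readCompressedLines decompresses a gzip stream and prints each line.
func readCompressedLines(r io.Reader) error {
	gz, err := gzip.NewReader(r)
	if err != nil {
		return err
	}
	defer gz.Close()

	scanner := bufio.NewScanner(gz)
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
	return scanner.Err()
}

func main() {
	f, err := os.Open("access.log.gz") // stand-in for the object store reader
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	if err := readCompressedLines(f); err != nil {
		log.Fatal(err)
	}
}
```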
Contributor

@cyriltovena cyriltovena May 26, 2020

I'd like to have this addition, but to keep it simple we should add it afterwards. We could also add it for static config too. But let's not add this in the first MVP. It can still stay in this doc though.


Agreed on this, we should add features one by one.


### How should we identify and watch newly added files?

To identify and start tailing newly added files, we should list the files in the object store at some desired interval. How frequently should we do this? Once we add a file to the `watch`, to identify any changes to it we should frequently check the file's modified `datetime`. The list of files might be large, each watch will run in a `goroutine`, and each `goroutine` will make frequent `http` requests to the object store. We should make sure this doesn't affect promtail's performance.
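A rough sketch of such a polling loop; `listObjects` and `startTail` are placeholders for whatever object-store client and tailing code end up being used:

```go
package main

import (
	"log"
	"time"
)

// watchBucket polls for new objects at a configurable interval and starts a
// tailer for every key it has not seen before.
func watchBucket(interval time.Duration, listObjects func() ([]string, error), startTail func(key string)) {
	seen := map[string]bool{}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for range ticker.C {
		keys, err := listObjects()
		if err != nil {
			log.Printf("list failed, will retry next tick: %v", err)
			continue
		}
		for _, key := range keys {
			if !seen[key] {
				seen[key] = true
				go startTail(key) // each watched object gets its own goroutine
			}
		}
	}
}

func main() {
	watchBucket(time.Minute,
		func() ([]string, error) { return []string{"2020/05/21/log-1.gz"}, nil }, // dummy listing
		func(key string) { log.Println("tailing", key) },
	)
}
```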
Contributor

I'm not worried about goroutines. I think the frequency should be configurable.

Contributor Author

Just a thought: since this is client-side code, we should make sure we don't consume a lot of CPU cycles.

@mrmikemarshall mrmikemarshall Jun 12, 2020

S3 is event-driven and can be configured to trigger an event when updates occur. By setting up a Lambda in combination with SQS, promtail can be notified that new content has arrived. I'm ingesting data from S3 into Loki via promtail today using a similar architecture, which eliminates polling S3 for changes. I'm interested in your proposed feature as I would use it for much more than load balancer logs.

Contributor Author

Hi @mrmikemarshall
Thanks for the input. We definitely want to consider other options like SQS. However, SQS doesn't support FIFO for S3 event notifications, and we might get out-of-order entries because of this. Also, as @Punkoivan mentioned below, AWS doesn't have any plan to support FIFO.

Contributor Author

In our preliminary version, we just want to keep things simple. Later, we can plan to support both general polling and SQS.


> There is no appending of lines to files in the object store

In the `positions.yaml`, along with the `offset` we should also store the file's modified `datetime`. If we see that the date has changed, we should reset the offset and tail the contents from the beginning. This is the same behaviour on a local filesystem: if a file is `opened`, `hpcloud/tail` will tail the file from the beginning.
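A sketch of that resume-or-restart decision; the `position` struct and the size-based heuristic discussed later in the thread are illustrative, not promtail's actual types:

```go
package main

import (
	"fmt"
	"time"
)

// position is a hypothetical record of where we stopped reading an object.
type position struct {
	Offset       int64
	LastModified time.Time
}

// nextOffset decides where to start reading an object, given its current
// size and modification time as reported by the object store.
func nextOffset(p position, objSize int64, objModified time.Time) int64 {
	if !objModified.After(p.LastModified) {
		return p.Offset // unchanged since last read: resume
	}
	if objSize < p.Offset {
		return 0 // object was replaced with something smaller: start over
	}
	return p.Offset // assume content was appended: resume from the old offset
}

func main() {
	p := position{Offset: 4096, LastModified: time.Now().Add(-time.Hour)}
	fmt.Println(nextOffset(p, 8192, time.Now())) // prints 4096 (treated as an append)
}
```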
Contributor

What's the problem with resuming from the previous offset when a file is modified? I think it's fine to expect a file to be updated with bigger content.

Contributor Author

There are two scenarios here. One, the file is just appended with more content and uploaded. Two, the full content of the file is modified. How will we identify which happened? If the whole file is modified, then we are expected to tail from the beginning.

Contributor

Can you do some research to see how this works for AWS load balancer logs, which are the essence of this feature? I wonder if they just drop files.

I'm OK with not supporting updates in the first version and leaving that for later. I don't think file updates are a big requirement.

/cc @Punkoivan


> If we see that the date is modified we should reset the offset and tail the contents from the beginning

I was thinking that the modified time will also be updated when content is just appended.

We can store the timestamp of the last line in the file and then compare it to the appended/replaced line's timestamp.
Pseudo code:

if previous_line_ts > new_line_ts, then the content was fully replaced
if previous_line_ts < new_line_ts, then content was added

That's based on the statement "Loki doesn't support out-of-order messages". In other words, even if we have new lines with an older timestamp, they will be rejected.

I assume it's a bad approach that will lead to a huge overhead, especially when we have hundreds of files, but it might help in developing a good approach :)


> How should we support reading of compressed files?

It would be good to support most of the compression algorithms we already support for chunks (Snappy, LZ4, Gzip).
Contributor

Let's add this after, it will be super easy.

Contributor Author

Sure


> Where should we keep the `positions.yaml` file?

I think this should be a user choice. One thing we need to consider is that there is an API limit per account per region, and if we store the `positions.yaml` in the object store itself, we may end up making a lot of API calls and getting `RequestLimitExceeded` very frequently. Backoff and retry should still be implemented in case we get this error even when making API calls less frequently. So placing and updating the `positions.yaml` should be left to the user, who can decide, based on the API usage of their setup, whether to keep it locally or remotely.
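A minimal sketch of the backoff-and-retry idea around a throttled call, assuming a sentinel error for the throttling response; the error value, limits, and delays here are illustrative:

```go
package main

import (
	"errors"
	"log"
	"math/rand"
	"time"
)

// errRequestLimitExceeded stands in for the throttling error returned by the store client.
var errRequestLimitExceeded = errors.New("RequestLimitExceeded")

// withBackoff retries fn with exponential backoff and jitter while the object
// store keeps throttling us.
func withBackoff(fn func() error) error {
	delay := 500 * time.Millisecond
	for attempt := 0; attempt < 5; attempt++ {
		err := fn()
		if err == nil || !errors.Is(err, errRequestLimitExceeded) {
			return err
		}
		jitter := time.Duration(rand.Int63n(int64(delay) / 2))
		log.Printf("throttled, retrying in %v", delay+jitter)
		time.Sleep(delay + jitter)
		delay *= 2
	}
	return errors.New("giving up after repeated throttling")
}

func main() {
	err := withBackoff(func() error {
		// stand-in for a PUT of positions.yaml to the object store
		return errRequestLimitExceeded
	})
	log.Println(err)
}
```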
Contributor

I would start with storing the file on disk first; with Docker and Kubernetes it's pretty simple to ensure the file is always mounted and kept safe.

We can store the file in the object store in a later iteration. The main reason is that running multiple promtails will not work, at least not in the first iteration, so I don't see any big advantage in saving it in the cloud, except to run a stateless workload, which again is not a big deal nowadays (e.g. StatefulSet, volume mount, etc.).


> How should we identify and watch newly added files?

The limit on API calls should also be considered when syncing newly added files. To start tailing newly added files, we should do a `ListObjects` at a desired interval. We should make sure we don't keep this interval too small.
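For example, the interval could be exposed directly in the scrape configuration; the block and `sync_period` field below are hypothetical names, shown only to illustrate the idea:

```yaml
# Hypothetical object-store scrape config: how often to call ListObjects.
s3:
  bucket: my-alb-logs
  sync_period: 1m   # keep this well above a few seconds to stay under API limits
```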
Contributor

All good points!


## path to store positions.yaml
filename: /path/to/positions.yaml

Contributor

I think we should support a glob to limit the path in the bucket, for two reasons (see the sketch after this list):

  1. You can use this to scale horizontally, i.e. run multiple promtails, each with a different path. (That's cool until we have something better, like a ring to shard files across multiple promtails.)
  2. You may not want to read and send from all files in the bucket.
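A minimal sketch of what a path-scoped config might look like; the `bucket` and `path` fields are hypothetical, not an agreed format:

```yaml
# Hypothetical fields, shown only to illustrate the glob idea.
s3:
  bucket: my-alb-logs
  path: "AWSLogs/*/elasticloadbalancing/*/*.log.gz"   # limit which keys this promtail scrapes
```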

Contributor Author

Agree

Contributor

@cyriltovena cyriltovena left a comment

Apart from a couple of comments, this proposal looks good.

When you start working on this, please don't try to do it all at once; let's focus on an MVP, i.e. sending text logs from the object store to Loki with the position file on disk.

Let's add the rest later. This will help ease reviews and also let us tackle problems one at a time.

@adityacs
Contributor Author

adityacs commented May 27, 2020

Will update the doc with the list of features we would try to achieve in the first iteration

@pull-request-size pull-request-size bot added size/L and removed size/M labels Jun 3, 2020
Comment on lines +24 to +25
Instead of frequently doing `ListObjects`, we can consider using S3 event notifications with `SQS`, through which we can learn about newly added files and then fetch only those.

Contributor Author

@cyriltovena WDYT?


Let me add a few words, please.
SQS doesn't support FIFO for event notifications from S3. That means we will have "out of order" errors in Loki.

We're using S3, SQS, fluentd, and Loki now and sometimes get this error. AWS has no plan to support FIFO for S3 notifications; we asked them.

Contributor Author

> SQS doesn't support FIFO for event notifications from S3. That means we will have "out of order" errors in Loki.

Nice point.

> AWS has no plan to support FIFO for S3 notifications, we asked them

This is very good information. It's also mentioned here https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html

Thanks @Punkoivan


### How to fetch the file and read the contents?

Here we have two options. One, get the file with a `reader` and iterate over the lines of the file; but if the files we fetch are very large, doing a `GetObject` will consume more memory, which might be a serious performance issue. Two, download the file locally and then tail it the same way we do other log files; this would help reduce memory consumption when fetching large files. We should also consider implementing multipart download to make downloading large files faster.
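A sketch of the second option using the AWS SDK's `s3manager` downloader, which splits the object into parts and fetches them concurrently; the bucket, key, part size, and local path are placeholders:

```go
package main

import (
	"log"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-east-1")}))

	// Concurrent multipart download of a single large object.
	downloader := s3manager.NewDownloader(sess, func(d *s3manager.Downloader) {
		d.PartSize = 16 * 1024 * 1024 // 16 MiB parts; tune as needed
		d.Concurrency = 4
	})

	f, err := os.Create("/tmp/alb.log.gz") // local copy that would then be tailed
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	n, err := downloader.Download(f, &s3.GetObjectInput{
		Bucket: aws.String("my-alb-logs"),     // placeholder bucket
		Key:    aws.String("2020/05/21/1.gz"), // placeholder key
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("downloaded %d bytes", n)
}
```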
Contributor Author

@cyriltovena Need your inputs here

Comment on lines +103 to +107
1. We will use Cortex's object client code to list and read objects
2. `positions.yaml` will be kept locally
3. If the object is modified, we will resume reading from the previously known position. If the modified object's size is less than the previously known position, we read from the beginning.
4. We will support scraping logs only from S3 in the first iteration.
5. Only plain files and `gzip` files will be supported.
Contributor Author

@cyriltovena This is the first iteration plan. WDYT?

Contributor Author

ELB logs are published to S3 every 5 minutes by default, possibly in more than one file depending on traffic. So I think it's OK if we don't support reading modified files in the first iteration.

FYR: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-access-logs.html

@stale

stale bot commented Jul 12, 2020

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale A stale issue or PR that will automatically be closed. label Jul 12, 2020
@adityacs adityacs added the keepalive An issue or PR that will be kept alive and never marked as stale. label Jul 13, 2020
@stale stale bot removed the stale A stale issue or PR that will automatically be closed. label Jul 13, 2020
@CLAassistant

CLAassistant commented Apr 20, 2021

CLA assistant check
All committers have signed the CLA.

@sandstrom

sandstrom commented May 25, 2021

@cyriltovena Just curious, any progress on this? Would be useful for accessing AWS ALB logs.


Also, as an aside, those "Sign our SLA" bots are pretty annoying. I mostly skip contributing to projects that require them.

If you really need it, make it dead simple: ask the user to comment on the PR and acknowledge that they agree, e.g. "Please add a comment to this thread saying 'I agree to the SLA per grafana.com/sla'", and be done with it.

@kavirajk
Contributor

Closing this as there has been no activity for a long time. Feel free to send a new PR if anyone wants to revive this work.

@kavirajk kavirajk closed this Mar 18, 2022
@sandstrom

@kavirajk You'd probably want to update your automation, since this isn't done. 😄


@sandstrom

This may be relevant for any repo watchers: #5065
