store: Distributed filesystem S3-like object storage implementation #326
Hey, can I have more details on this?
Currently, data can be uploaded to object storage first, and then you can write a straightforward tool to download the data to a local HD, but I am lacking the context of what you want to achieve.
Sorry about the late reply. Hopefully this helps clear it up. My goal is to back up the TSDB data to a local system rather than GCS or AWS. We have two main datacenters and several smaller locations. At our main datacenters we purchased some large storage systems for long-term storage of all of our Prometheus data. We were originally going to federate all data upwards to these machines, but the scrape times became too high to get a detailed picture of the data. A breakdown of what I am proposing:
Prometheus Scrape Server:
On each Prometheus scrape server we would run the Thanos backup (and probably query access). As data is added to Prometheus, it would be backed up to the long-term storage systems via Thanos, allowing us to make the scrape servers almost completely stateless. On the Prometheus storage servers we would have the backup and gateway components running so that individual nodes can do calculations against the larger dataset. We would also have the query layer, which would provide an interface for Grafana.
Hmm, so I think there are some minor misunderstandings, but overall what I get from this is that you would like a "Filesystem Provider" or something like that, right? That is, to be able to upload metrics to your NAS server (or something like that) and to query metrics from there in the same way. Something like this is achievable by "simply" implementing another adapter for our Bucket interface. See this short tutorial on how to do it (currently in a PR): https://github.com/improbable-eng/thanos/blob/71634c354202d96f83ea22c3c1e1f194701b368e/docs/object_stores.md It is up to you how you would like to implement this; maybe iSCSI or NFS would help here. I am not sure how the consistency model would look, or the latency and cost, but technically it is doable. Regarding potential misunderstandings:
Thanos Store does not back up anything. It just uses "read" bucket operations.
I don't understand that part. If you set up the Thanos sidecar, it gives you query access (it exposes the StoreAPI that is used by the Thanos query component) and optional backup logic. This indeed makes Prometheus almost stateless even in its current form (GCS/S3 backup), but "almost" means there are still ~3h of fresh data that need to be available.
So my thought was to have Prometheus scrape systems with SSDs and then have Thanos push that data, using its backup logic, to a centralized store (S3/GCS/NFS) so that individual Prometheus servers don't hold that data locally. That's the data I was hoping to back up to a filesystem. My other option is to set up something that provides an S3/GCS-compatible interface but saves to the filesystem, but I was hoping that solution could be natively provided. I will look at making a provider that writes to NFS. Thank you
Does the new title of this issue make sense to you @jdfalk? (:
Yup that's perfect.
Hi, as discussed on Slack, this is something we might also like to have! Is implementing GetRange as easy as doing a Seek on the io object? Thanks!
Cool, any volunteers? (: I won't have time for this currently, and I don't have any filesystem-like system available right now.
@jdfalk If you're interested in setting up something like that until this issue is resolved, I recommend you check out Minio. It provides an S3-compatible API and, in its simplest form, just writes objects as files and directories on a local filesystem. Getting started is as easy as running a single `minio server` command against a local directory.
If instead you already have a NAS set up, Minio can act as an S3-to-NAS gateway.
It can also act as a gateway in front of other object stores, such as Azure or GCS, if needed.
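As a concrete sketch of the two modes described above (these invocations reflect Minio's documentation around the time of this thread; the paths are placeholders, and the gateway subcommand has changed in later releases):

```shell
# Serve a local directory over an S3-compatible API (simplest mode).
minio server /data

# Or front an existing NAS mount as an S3 gateway.
minio gateway nas /mnt/nas-volume
```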
I have seen this, but I had a hard time working out what Minio actually provides via the gateway. Good to know!
That's pretty slick. I wonder how it will scale with 10-20k machines dumping data into Prometheus and Thanos pushing that much to it. I will have to run some load tests, but thanks, that's a promising solution. Really, I wouldn't even be opposed to Thanos just writing all the blocks as-is to the long-term storage system's filesystem so it looks like a giant Prometheus folder, and then having it perform compaction, maintenance, etc. My biggest goal is ensuring the data is replicated, backed up, and queryable from our remote datacenters.
I think you'll find the receive component able to do what you want. It's still a WIP, and you can track progress here: #1093 In your use case, I guess you could have Prometheus remote-writing to the Thanos Receive component in your main datacenter/storage system. Would that be alright?
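On the Prometheus side, that setup would just be a standard `remote_write` section pointing at the receiver. A hypothetical fragment (the hostname is a placeholder, and the default receive port/path may differ by Thanos version):

```yaml
# prometheus.yml (fragment) -- receiver address is illustrative
remote_write:
  - url: "http://thanos-receive.example.internal:19291/api/v1/receive"
```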
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
We would like to be able to have the Thanos instances back up to a central Thanos instance and have it write the data to the local HD.
i.e. prom1/thanos1 -> thanosstorage1 -> /var/lib/thanos/feddata