store: Distributed filesystem S3-like object storage implementation #326

Closed
jdfalk opened this issue May 8, 2018 · 13 comments

jdfalk (Contributor) commented May 8, 2018

We would like the Thanos instances to back up to a central Thanos instance, which would write the data to its local HD.

i.e. prom1/thanos1 -> thanosstorage1 -> /var/lib/thanos/feddata

bwplotka (Member) commented May 8, 2018

Hey, can I have more details on this?

  • What do you mean by Thanos instance? Which component exactly?
  • What is a central Thanos instance?
  • What data exactly do you have in mind? TSDB blocks?
  • Can you share what problem/goal you want to solve? Do you want to use a local HD as "object storage" (i.e. browse the data using the Thanos store gateway)?

Currently, data can be uploaded to object storage first, and then you could write a straightforward tool to download it to a local HD, but I'm lacking context on what you want to achieve.
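
For example, such a one-off download tool could be sketched roughly like this in Go (the Bucket interface below is a simplified stand-in for an object-store client, not our exact objstore API):

```go
// Rough sketch of a "download everything" tool. Bucket is an assumed,
// minimal read-only view of an object store, NOT the exact Thanos interface.
package bucketcopy

import (
	"context"
	"io"
	"os"
	"path/filepath"
)

type Bucket interface {
	// Iter calls f with the name of every object under dir.
	Iter(ctx context.Context, dir string, f func(name string) error) error
	// Get returns a reader for the named object.
	Get(ctx context.Context, name string) (io.ReadCloser, error)
}

// DownloadAll copies every object in bkt into dstDir, mapping object names to file paths.
func DownloadAll(ctx context.Context, bkt Bucket, dstDir string) error {
	return bkt.Iter(ctx, "", func(name string) error {
		rc, err := bkt.Get(ctx, name)
		if err != nil {
			return err
		}
		defer rc.Close()

		dst := filepath.Join(dstDir, filepath.FromSlash(name))
		if err := os.MkdirAll(filepath.Dir(dst), 0755); err != nil {
			return err
		}
		f, err := os.Create(dst)
		if err != nil {
			return err
		}
		defer f.Close()

		_, err = io.Copy(f, rc)
		return err
	})
}
```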

jdfalk (Contributor, Author) commented May 14, 2018

Sorry about the late reply. Hopefully this helps clear it up.

My goal is to back up the TSDB data to a local system rather than to GCS or AWS. We have two main datacenters and several smaller locations. At our main datacenters we purchased some large storage systems for long-term storage of all our Prometheus data. We were originally going to federate all data upwards to these machines, but the scrape times became too high to get a detailed picture of the data.

A breakdown of what I am proposing:
Prometheus Storage Server:

  • Thanos Backup / Store Gateway — Saving TSDB data to the local system for long-term storage.
  • Thanos Query Layer / Thanos Query Access — Point Grafana towards the query layer.
  • Thanos Compactor — Compact storage.

Prometheus Scrape Server:

  • Thanos Backup saving to Prometheus Storage Server
  • Thanos Query Access — Not sure if that’s necessary as all data would be migrated to the Prometheus Storage Server.

On each Prometheus scrape server we would run the Thanos backup (and probably query access). As data is added to Prometheus, it would be backed up to the long-term storage systems via Thanos, allowing us to make the scrape servers almost completely stateless.

On the Prometheus storage servers we would have the backup and gateway running so that individual nodes can do calculations against the larger dataset. We would also have the query layer, which would provide an interface for Grafana.

bwplotka (Member) commented

Hmm, I think there are some minor misunderstandings, but overall what I get from this is that you would like to have a "Filesystem Provider" or something like that, right? That is, to be able to upload metrics to your NAS server (or something like it) and query metrics from there in the same way?

Something like this is achievable by "simply" implementing another adapter for our Bucket interface. See this short tutorial on how to do it (currently in a PR): https://github.com/improbable-eng/thanos/blob/71634c354202d96f83ea22c3c1e1f194701b368e/docs/object_stores.md

It is up to you how you would like to implement this; maybe iSCSI or NFS would help here, but the GetRange operation (getting an arbitrary range of bytes from an object) is basically non-trivial.
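
For a plain filesystem, the range read itself is mostly a seek plus a limited reader; a rough Go sketch (the names and signature are illustrative only, not necessarily our exact Bucket interface):

```go
// Rough sketch of GetRange for a filesystem-backed bucket; rootDir plays the
// role of the bucket. Names and signatures are illustrative, not the real ones.
package fsbucket

import (
	"context"
	"io"
	"os"
	"path/filepath"
)

type Bucket struct {
	rootDir string
}

// GetRange returns a reader over length bytes of the object, starting at off.
// A negative length is treated as "read until the end of the object".
func (b *Bucket) GetRange(_ context.Context, name string, off, length int64) (io.ReadCloser, error) {
	f, err := os.Open(filepath.Join(b.rootDir, filepath.FromSlash(name)))
	if err != nil {
		return nil, err
	}
	if _, err := f.Seek(off, io.SeekStart); err != nil {
		f.Close()
		return nil, err
	}
	if length < 0 {
		return f, nil // caller reads to EOF and closes the file
	}
	// Limit the reader to length bytes while keeping Close bound to the underlying file.
	return struct {
		io.Reader
		io.Closer
	}{io.LimitReader(f, length), f}, nil
}
```

The non-trivial part is doing this efficiently and with sane consistency over a network filesystem, which is where the NFS/iSCSI question comes in.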

I'm not sure what the consistency model would look like here, or the latency and cost, but technically it is doable.

Regarding potential misunderstandings:
Thanos Query does not care about object storage; it only cares about the gRPC StoreAPIs it has access to, so I'm not sure Thanos Query is relevant in this particular discussion.

On the Prometheus storage servers we would have the backup and gateway

Thanos Store does not back up anything. It just uses "read" bucket operations.

On each Prometheus scrape server we would run the Thanos backup (and probably query access.)

I don't understand that part; if you set up the Thanos sidecar, it gives you query access (it exposes the StoreAPI used by the Thanos Query components) and optional backup logic.

This indeed makes Prometheus almost stateless even in its current form (GCS/S3 backup), but "almost" means there are still ~3h of fresh data that need to be available locally.

jdfalk (Contributor, Author) commented May 15, 2018

So my thought was to have Prometheus scrape systems with SSDs, and then have Thanos push that data, using its backup logic, to a centralized store (S3/GCS/NFS) so that individual Prometheus servers don't keep that data locally. That's the data I was hoping to back up to a filesystem. My other option is to set up something that provides an S3/GCS-compatible interface but saves to the filesystem, but I was hoping that solution could be natively provided. I will look at making a provider that writes to NFS.
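
For a rough starting point, the write side of such a provider could look something like this in Go (the function name and layout are just an illustration, not an existing Thanos interface):

```go
// Rough sketch of the "write" side of a filesystem provider: stream an object
// into a file under a mounted path (e.g. an NFS mount). Illustrative only.
package fsupload

import (
	"io"
	"io/ioutil"
	"os"
	"path/filepath"
)

// upload writes the object `name` under rootDir, using a temp file plus rename
// so readers never observe a partially written object.
func upload(rootDir, name string, r io.Reader) error {
	dst := filepath.Join(rootDir, filepath.FromSlash(name))
	if err := os.MkdirAll(filepath.Dir(dst), 0755); err != nil {
		return err
	}
	tmp, err := ioutil.TempFile(filepath.Dir(dst), ".upload-")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // best-effort cleanup; harmless after a successful rename

	if _, err := io.Copy(tmp, r); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), dst)
}
```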

Thank you

bwplotka changed the title from "Allow Local Storage Backup" to "store: Distributed filesystem S3-like object storage implementation" on May 16, 2018
bwplotka (Member) commented

Does the new title of this issue make sense to you @jdfalk? (:

jdfalk (Contributor, Author) commented May 16, 2018 via email

dupondje commented

Hi,

As discussed on Slack, this is something we might also like to have!
We have redundant SAN systems, so we would like to have data stored there instead of on GCS or S3.

Is implementing GetRange just as easy as doing a Seek on the io object?

Thanks!

bwplotka (Member) commented

Cool, any volunteers? (: I won't have time for this currently, and I don't have any filesystem-like system available right now.

BenoitKnecht commented

My other option is to set up something that provides an S3/GCS-compatible interface but saves to the filesystem, but I was hoping that solution could be natively provided.

@jdfalk If you're interested in setting up something like that until this issue is resolved, I recommend you check out Minio. It provides an S3-compatible API and, in its simplest form, just writes objects as files and directories on a local filesystem.

Getting started is as easy as running

docker run -p 9000:9000 --name minio -v /mnt/data:/data minio/minio server /data

If instead you already have a NAS set up, Minio can act as an S3-to-NAS gateway:

minio gateway nas /path/to/nfs-volume

It can also act as a gateway in front of other object stores, such as Azure or GCS, if needed.

bwplotka (Member) commented

I have seen this, but had a hard time figuring out what Minio actually provides via the gateway. Good to know!

jdfalk (Contributor, Author) commented Jun 14, 2018

That's pretty slick. I wonder how it will scale with 10-20k machines dumping data into Prometheus and Thanos pushing that much to it. I will have to run some load tests, but thanks, that's a promising solution.

Really, I wouldn't even be opposed to Thanos just writing all the blocks as-is to the long-term storage system's filesystem so it looks like a giant Prometheus folder, and then having it perform compaction, maintenance, etc. My biggest goal is ensuring the data is replicated, backed up, and queryable from our remote datacenters.

adrien-f (Member) commented

I think you'll find the receive component able to do what you want; it's still WIP and you can track progress here: #1093

In your use case, I guess you could have Prometheus remote-writing to the Thanos Receive component in your main datacenter/storage system. Would that be alright?

stale bot commented Jan 11, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label on Jan 11, 2020
stale bot closed this as completed on Jan 18, 2020