monasca-persister database #505

Open
zhangjianweibj opened this issue Mar 4, 2019 · 6 comments

Comments

@zhangjianweibj
Contributor

Hello, I see that monasca-persister has finished its work on supporting Cassandra as a storage backend. Can Monasca use this feature? At present, the open-source version of InfluxDB has no HA or clustering features, so it is not suitable for a production environment.

@witekest
Member

witekest commented Mar 5, 2019

You can achieve HA with InfluxDB using InfluxDB-relay. Another, even better option is to deploy several InfluxDB instances and configure the persisters in different Kafka consumer groups. Measurements are then duplicated for every consumer group.
You can also achieve HA at the filesystem level (e.g. using GlusterFS).
For horizontal scaling you would have to do sharding.

Cassandra support is fully functional. At SUSE we offer both InfluxDB and Cassandra as options.
The InfluxDB implementation is simpler and better suited for an open-source project, so it has wider community adoption.
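
The consumer-group replication described above can be illustrated with a small in-memory model. This is only a sketch of the offset semantics, not Kafka or monasca-persister code: `TopicLog`, `run_demo`, and the group names `influxdb-a` / `influxdb-b` are hypothetical, and a real topic only keeps messages for its configured retention period.

```python
from collections import defaultdict


class TopicLog:
    """In-memory stand-in for a single-partition Kafka topic (illustrative only)."""

    def __init__(self):
        self.messages = []
        # Each consumer group tracks its own position in the log.
        self.committed = defaultdict(int)  # group id -> next offset to read

    def produce(self, message):
        self.messages.append(message)

    def poll(self, group_id):
        """Return everything the group has not consumed yet, then commit."""
        start = self.committed[group_id]
        batch = self.messages[start:]
        self.committed[group_id] = len(self.messages)
        return batch


def run_demo():
    topic = TopicLog()
    # One persister group per InfluxDB instance (hypothetical names).
    stores = {"influxdb-a": [], "influxdb-b": []}

    topic.produce({"name": "cpu.idle_perc", "value": 93.0})
    # Both groups are up; each one reads the message independently.
    for group in stores:
        stores[group].extend(topic.poll(group))

    # The persister for influxdb-b is down: only group a consumes this one.
    topic.produce({"name": "cpu.idle_perc", "value": 91.5})
    stores["influxdb-a"].extend(topic.poll("influxdb-a"))

    # Group b comes back and resumes from its own committed offset.
    topic.produce({"name": "cpu.idle_perc", "value": 90.0})
    for group in stores:
        stores[group].extend(topic.poll(group))
    return stores


stores = run_demo()
print(stores["influxdb-a"] == stores["influxdb-b"])  # True
```

Because every group keeps its own offset, a persister that was offline catches up on the messages it missed instead of skipping them, as long as they are still retained in the topic.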

@matrixik
Member

matrixik commented Mar 5, 2019

Take into account that I have never used Monasca with Cassandra myself.
Tests for the persister run on every change, but I don't know whether anyone is actively using the Python version of the persister with Cassandra. I believe SUSE is using Cassandra, but with the old Java version of the persister.
You should probably expect lower performance than with InfluxDB. Probably not with a small amount of data, but if I remember correctly some people complained that Cassandra slows down even more with a large amount of stored and incoming data. So to actually evaluate it you would need to test it yourself.

Sorry, I can't give you a better answer.

@zhangjianweibj
Contributor Author

@witekest @matrixik OK, thank you very much. At present we use influxdb-relay as the HA component, with three InfluxDB pods as the backend, but we find it is not a reliable solution. If we kill a pod, monasca-persister may reach the bad one (at present influxdb-relay cannot take a bad backend out of service; it forwards requests to the backends at random), and influxdb-relay returns a 204 code. Then monasca-persister crashes. On the other hand, if an InfluxDB pod is down for a certain time and then starts again, it has lost many metrics. Since users get metrics from the Monasca API, they may get different metrics at different times.

Now we plan to use Cassandra as storage, and the Dockerfile is finished, but we do not know the CQL needed to create the tables that monasca-persister uses. Can you help us with the detailed CQL for table creation? Thank you very much.

@zhangjianweibj
Contributor Author

At present we have written CQL based on monasca-persister's conf.py, but it has some mistakes.

create schema monasca with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };

create table monasca.dimensions_metrics (
    region text,
    tenant_id text,
    dimension_name text,
    dimension_value text,
    metric_name text,
    updated_at timestamp,
    primary key ((region, tenant_id, dimension_name, dimension_value), metric_name)
);

create table monasca.alarm_state_history (
    tenant_id text,
    alarm_id text,
    metric text,
    old_state text,
    new_state text,
    sub_alarms text,
    reason text,
    reason_data text,
    time_stamp timestamp,
    primary key ((tenant_id, alarm_id, old_state, new_state, sub_alarms, reason, reason_data), metric)
);

create table monasca.measurements (
    tenant_id text,
    region text,
    bucket_start timestamp,
    metric_name text,
    dimensions text,
    time_stamp timestamp,
    value float,
    value_meta text,
    primary key ((tenant_id, region, bucket_start, metric_name, dimensions), time_stamp)
);

@witekest
Member

You can find the schema in the DevStack plugin.

@witekest
Member

Yes, an InfluxDB instance that was offline or crashed for a longer time should be removed from the query pool until it has recovered. If you use Kafka consumer groups for measurement replication, you won't lose any measurements! They will be cached in Kafka until the messages can be consumed again.
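
Removing an instance from the query pool until it has recovered could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not part of Monasca or influxdb-relay: `QueryPool`, its methods, and the pod names are hypothetical, and how an instance gets marked down or recovered (health checks, consumer lag) is left out.

```python
import random


class QueryPool:
    """Tracks which backends may serve reads (illustrative, not a Monasca API)."""

    def __init__(self, backends):
        self.healthy = {name: True for name in backends}

    def mark_down(self, name):
        # Called when an instance crashed or is still catching up.
        self.healthy[name] = False

    def mark_recovered(self, name):
        # Called once the instance has consumed its backlog again.
        self.healthy[name] = True

    def pick(self, rng=random):
        """Choose a backend for a query, skipping instances marked down."""
        candidates = sorted(n for n, ok in self.healthy.items() if ok)
        if not candidates:
            raise RuntimeError("no healthy backend available")
        return rng.choice(candidates)


pool = QueryPool(["influxdb-0", "influxdb-1", "influxdb-2"])
pool.mark_down("influxdb-1")
picks = {pool.pick() for _ in range(100)}
print("influxdb-1" in picks)  # False: the instance is excluded while marked down
```

Reads then never hit an instance with missing data, which avoids users seeing different metrics depending on which backend answered.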
