Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to metadata cache expiration on full metadata refresh #4677

Merged
merged 4 commits into from
Apr 15, 2024

Conversation

emasab
Copy link
Collaborator

@emasab emasab commented Apr 8, 2024

Metadata cache was cleared on full metadata
refresh, leading to unnecessary refreshes and
occasional UNKNOWN_TOPIC_OR_PART errors.
Solved by updating cache for existing or
hinted entries instead of clearing them.
Happening since 2.1.0

@emasab emasab force-pushed the dev_fix_metadata_cache_cleared_full_request branch from fa6ea32 to 6f3c1b5 Compare April 8, 2024 19:00
@emasab emasab force-pushed the dev_metadata_cache_topic_id branch from 4d7eab9 to 8ea3b9e Compare April 11, 2024 09:17
@emasab emasab force-pushed the dev_fix_metadata_cache_cleared_full_request branch 2 times, most recently from b27ccc6 to d4d40c1 Compare April 11, 2024 13:47
Base automatically changed from dev_metadata_cache_topic_id to dev_kip848 April 11, 2024 15:12
@emasab emasab force-pushed the dev_fix_metadata_cache_cleared_full_request branch from d4d40c1 to d9540da Compare April 11, 2024 16:34
src/rdkafka_metadata.c Show resolved Hide resolved
rd_kafka_t *rk,
const rd_kafka_metadata_topic_t *mdt,
const rd_kafka_metadata_topic_internal_t *mdit,
rd_bool_t propagate,
rd_bool_t include_metadata,
rd_kafka_metadata_broker_internal_t *brokers,
size_t broker_cnt);
size_t broker_cnt,
rd_bool_t only_existing);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better name maybe update_only_existing and let's explain what it means in the doc.

Copy link
Collaborator Author

@emasab emasab Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you search for only_ the code convention in this repo is to avoid repeating the action in variable names, for example in rd_kafka_buf_write_topic_partitions we have only_invalid_offsets not write_only_invalid_offsets.

Okey for the doc

src/rdkafka_metadata_cache.c Show resolved Hide resolved
rkmce = rd_kafka_metadata_cache_find(rk, mdt->topic, 0);
if (!rkmce)
return 0;
} else if (unlikely(!mdt->topic)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not checking only_existing part.

on full metadata refresh

Metadata cache was cleared on full metadata
refresh, leading to unnecessary refreshes and
occasional `UNKNOWN_TOPIC_OR_PART` errors.
Solved by updating cache for existing or
hinted entries instead of clearing them.
Happening since 2.1.0
@emasab emasab force-pushed the dev_fix_metadata_cache_cleared_full_request branch from d9540da to a159193 Compare April 14, 2024 10:54
Copy link
Member

@pranavrth pranavrth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!. Just a couple of minor comments.

@@ -35,15 +35,6 @@ def read_scenario_conf(scenario):
return parser.load(f)


# FIXME: merge in trivup
class KafkaBrokerApp(KafkaBrokerAppOrig):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is expected right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it was added back during a rebase

int main_0146_metadata_mock(int argc, char **argv) {
TEST_SKIP_MOCK_CLUSTER(0);

do_test_metadata_persists_in_cache("range");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not round robin case?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference can be because of taking the eager path instead of the incremental one, but roundrobin also takes the eager path, as "range", so it's not important for this test.

@emasab emasab merged commit 590f18c into dev_kip848 Apr 15, 2024
2 checks passed
@emasab emasab deleted the dev_fix_metadata_cache_cleared_full_request branch April 15, 2024 07:38
emasab added a commit that referenced this pull request Apr 15, 2024
Metadata cache was cleared on full metadata
refresh, leading to unnecessary refreshes and
occasional `UNKNOWN_TOPIC_OR_PART` errors.
Solved by updating cache for existing or
hinted entries instead of clearing them.
Happening since 2.1.0
emasab added a commit that referenced this pull request Apr 18, 2024
Metadata cache was cleared on full metadata
refresh, leading to unnecessary refreshes and
occasional `UNKNOWN_TOPIC_OR_PART` errors.
Solved by updating cache for existing or
hinted entries instead of clearing them.
Happening since 2.1.0
anchitj pushed a commit that referenced this pull request Jun 10, 2024
Metadata cache was cleared on full metadata
refresh, leading to unnecessary refreshes and
occasional `UNKNOWN_TOPIC_OR_PART` errors.
Solved by updating cache for existing or
hinted entries instead of clearing them.
Happening since 2.1.0
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants