feat: improve compaction job state management #3519

Merged
13 commits merged from feat/v2/compaction-planning into main on Aug 28, 2024

Conversation

@aleks-p (Contributor) commented on Aug 26, 2024

The main change here is how we update the state of compaction jobs, in particular when workers poll with state updates and ask for new jobs.

The current implementation intermixes persistence (boltdb) and in-memory state updates. This creates a few cases where an error can leave the two storage layers in an inconsistent state.

The new implementation does everything in memory first and constructs a list of items that need to be durably stored. If we fail to durably store something, or otherwise end up in an unexpected state while persisting, the application will panic.
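
A rough sketch of that flow, for illustration only; apart from `pollStateUpdate` and `writeToDb`, which appear in the snippets quoted below, the helper and type names are hypothetical:

```go
// Sketch only: apply all changes to the in-memory state while collecting
// everything that has to be stored durably, then persist that set in one
// boltdb transaction. If persistence fails after the in-memory state has
// already changed, the two layers can no longer be reconciled, so we panic.
func (m *metastoreState) applyStatusUpdates(updates []*compactionJobStatusUpdate) {
	stateUpdate := &pollStateUpdate{} // jobs and blocks that must be written to boltdb
	for _, u := range updates {
		m.applyInMemory(u, stateUpdate) // hypothetical helper: in-memory changes only, no I/O
	}
	if err := m.writeToDb(stateUpdate); err != nil {
		panic(err)
	}
}
```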

Bonus:

  • handle failed compaction jobs (with max retries)
  • prioritize jobs by compaction level before lease expiry (see the sketch after this list)
  • unit tests for compaction job creation and state management
  • convert some flags to config variables
  • rename "job pre queue" to "job block queue"
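
As an illustration of the prioritization item above, a minimal sketch of how the queue ordering could be expressed; the type and field names here are hypothetical, not the actual queue implementation:

```go
// Hypothetical ordering for the in-memory job queue: jobs at lower compaction
// levels are handed out first; within the same level, the job whose lease
// expires soonest goes first.
type jobEntry struct {
	name            string
	compactionLevel uint32
	leaseExpiresAt  int64 // unix nanos
}

func less(a, b *jobEntry) bool {
	if a.compactionLevel != b.compactionLevel {
		return a.compactionLevel < b.compactionLevel
	}
	return a.leaseExpiresAt < b.leaseExpiresAt
}
```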

@aleks-p requested review from a team as code owners on August 26, 2024 11:40
```go
case compactorv1.CompactionStatus_COMPACTION_STATUS_IN_PROGRESS:
	// refresh the job's lease and record it as updated so it gets persisted
	m.compactionJobQueue.update(statusUpdate.JobName, raftAppendedAtNanos, statusUpdate.RaftLogIndex)
	stateUpdate.updatedJobs = append(stateUpdate.updatedJobs, job.Name)
case compactorv1.CompactionStatus_COMPACTION_STATUS_FAILURE:
```
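
For context, a sketch of what the failure branch could do given the max-retries behavior described in the PR description; the field and method names (Failures, JobMaxFailures, evict, release, deletedJobs) are illustrative, not the actual code:

```go
case compactorv1.CompactionStatus_COMPACTION_STATUS_FAILURE:
	// Hypothetical handling: count the failure and put the job back in the
	// queue until the retry limit is hit, then delete it so failed jobs
	// do not accumulate.
	job.Failures++
	if job.Failures >= m.compactionConfig.JobMaxFailures {
		m.compactionJobQueue.evict(job.Name)
		stateUpdate.deletedJobs = append(stateUpdate.deletedJobs, job.Name)
	} else {
		m.compactionJobQueue.release(job.Name)
		stateUpdate.updatedJobs = append(stateUpdate.updatedJobs, job.Name)
	}
```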
@aleks-p (Contributor, author) commented on Aug 26, 2024:
This is still missing a failure reason from the worker; I am considering adding it so that we can persist it and surface it in tooling. However, after reaching the max number of failures we currently delete the job to avoid accumulating jobs, so the failure reason would only be observable temporarily.

We could implement a job retention policy for this (we might need one anyway), but I would do that separately.


```go
func (m *metastoreState) writeToDb(sTable *pollStateUpdate) error {
	return m.db.boltdb.Update(func(tx *bbolt.Tx) error {
		for shard, blocks := range sTable.newBlocks {
			// ...
```
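
For completeness, a sketch of what the body of that loop might look like; the bucket naming, key choice, and value encoding are assumptions rather than the actual storage layout (uses `fmt` and `google.golang.org/protobuf/proto`, and assumes the block metadata is a protobuf message):

```go
// Hypothetical continuation of the loop above: store each new block's
// metadata under a per-shard bucket, keyed by block ID.
bucket, err := tx.CreateBucketIfNotExists([]byte(fmt.Sprintf("blocks/%d", shard)))
if err != nil {
	return err
}
for _, block := range blocks {
	value, err := proto.Marshal(block)
	if err != nil {
		return err
	}
	if err := bucket.Put([]byte(block.Id), value); err != nil {
		return err
	}
}
```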
A collaborator commented:
We need to ensure atomicity of operations within a shard. Consider a case:

  • We remove source blocks.
  • We handle query that targets the blocks we just have removed.
  • We add compacted blocks.

It's not very apparent to me how we're handling this; I suspect the query would return incomplete results.

@aleks-p (Contributor, author) replied:
Good point!

Currently, from what I can see, the read path reads data directly from memory. 9442699 locks the shards mutex while replacing source blocks with compacted ones, so things should be a bit better. Ideally we would lock only a single shard, but that makes the code more complex, and swapping the blocks should be quite fast anyway.
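
For reference, the swap described in that commit could look roughly like this; a sketch only, with placeholder field and type names (shardsMutex, shards, blocks, blockMeta) rather than the actual metastore structures:

```go
// Illustrative sketch: remove the source blocks and add the compacted blocks
// while holding the shard map lock, so a concurrent query never observes the
// sources gone but the compacted blocks not yet added.
func (m *metastoreState) replaceBlocks(shard uint32, sourceIDs []string, compacted []*blockMeta) {
	m.shardsMutex.Lock()
	defer m.shardsMutex.Unlock()

	s := m.shards[shard]
	for _, id := range sourceIDs {
		delete(s.blocks, id) // drop the blocks that were compacted away
	}
	for _, b := range compacted {
		s.blocks[b.id] = b // add the blocks produced by the compaction job
	}
}
```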

@aleks-p force-pushed the feat/v2/compaction-planning branch from 4c5630e to 9442699 on August 27, 2024 19:40
@aleks-p merged commit 32621d5 into main on Aug 28, 2024
18 checks passed
@aleks-p deleted the feat/v2/compaction-planning branch on August 28, 2024 11:02