This repository has been archived by the owner on Feb 17, 2024. It is now read-only.

Log indexing #139

Open
chriso opened this issue Jun 29, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@chriso
Contributor

chriso commented Jun 29, 2023

No description provided.

@chriso chriso added the enhancement New feature or request label Jun 29, 2023
@gernest

gernest commented Oct 10, 2023

Some notes I gathered relevant to this feature while working on #232

  • The only fields relevant to index are (*Record).Time, (*Record).FunctionID, and (*Record).FunctionCall. I don't see any merit in indexing FunctionCall: will someone query logs by the value of a syscall argument?
  • The actual log data, (*Record).FunctionCall, is not structured and uses a custom encoding that differs between features. Indexing should be context-aware (features should be responsible for indexing their own data).
  • (*Record).Offset is not stored in the segment; it is set dynamically while reading. This limits how much you can skip when querying batches: an inverted index that finds the batches with relevant logs may still force you to read the full batch and filter the relevant logs in memory.
  • Record is coupled to syscalls.
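
To illustrate the Offset point above, here is a minimal sketch of a batch-level inverted index keyed on function ID. The Record and Batch types and the BuildIndex and Query helpers are hypothetical, simplified stand-ins and not the project's actual API. The index only narrows the search down to candidate batches; each candidate still has to be read in full and filtered in memory.

```go
package main

import "fmt"

// Record is a hypothetical, simplified stand-in for the real Record type.
type Record struct {
	FunctionID   int
	FunctionCall []byte // opaque, feature-specific encoding (not indexed here)
}

// Batch is a hypothetical group of records stored together in a segment.
type Batch struct {
	Records []Record
}

// BuildIndex maps each function ID to the indexes of the batches that
// contain at least one record with that ID. It stores no record offsets,
// because this sketch assumes offsets are only known at read time.
func BuildIndex(batches []Batch) map[int][]int {
	index := make(map[int][]int)
	for i, b := range batches {
		seen := make(map[int]bool)
		for _, r := range b.Records {
			if !seen[r.FunctionID] {
				seen[r.FunctionID] = true
				index[r.FunctionID] = append(index[r.FunctionID], i)
			}
		}
	}
	return index
}

// Query returns all records with the given function ID. The index lets it
// skip whole batches, but every candidate batch is scanned completely.
func Query(batches []Batch, index map[int][]int, functionID int) []Record {
	var out []Record
	for _, i := range index[functionID] {
		for _, r := range batches[i].Records {
			if r.FunctionID == functionID {
				out = append(out, r)
			}
		}
	}
	return out
}

func main() {
	batches := []Batch{
		{Records: []Record{{FunctionID: 1}, {FunctionID: 2}}},
		{Records: []Record{{FunctionID: 2}}},
		{Records: []Record{{FunctionID: 1}, {FunctionID: 1}, {FunctionID: 3}}},
	}
	index := BuildIndex(batches)
	fmt.Println("candidate batches:", index[1])
	fmt.Println("matching records:", len(Query(batches, index, 1)))
}
```

In this sketch the second batch is skipped entirely for function ID 1, while the first and third are fully scanned even though only some of their records match.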

As the current API stands, indexing timestamps ((*Record).Time) may make sense and would allow commands to accept --start-ts and --end-ts. When reading logs, we could skip batches that have no records in the requested time range.

Note: these notes may be incorrect; they come from my limited time hacking on something different.


2 participants