Potential for missed log messages between dump + clear commands #958

Open
jphickey opened this issue Oct 20, 2020 · 3 comments

@jphickey
Contributor

Describe the bug
This issue was initially described in #956 but isolated to a separate ticket for discussion/triage.

The ES syslog "dump" and "clear" are separate commands, so any message written in the window between them can be lost: there is no way to guarantee that nothing was logged during that time.

To Reproduce

  1. Log messages are written
  2. Dump command issued
  3. More log messages are written
  4. Clear command issued.

Logs written in (3) above are lost, as they are not in the dump file but they will be cleared by the clear command.
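
For illustration, the sequence above can be replayed against a toy in-memory log. This is only a sketch: `DemoLog` and the `demo_log_*` functions are hypothetical stand-ins, not the cFE ES data structures or API, and everything runs in a single thread so the interleaving is fixed by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_LOG_SIZE 256

/* Hypothetical stand-in for the ES syslog; not the cFE data structure. */
typedef struct {
    char   data[DEMO_LOG_SIZE];
    size_t used;
} DemoLog;

/* Append a message; returns 0 on success, -1 if it does not fit. */
int demo_log_write(DemoLog *log, const char *msg)
{
    size_t len;

    assert(log != NULL && msg != NULL);
    len = strlen(msg);
    if (len > DEMO_LOG_SIZE - log->used) {
        return -1;
    }
    memcpy(log->data + log->used, msg, len);
    log->used += len;
    return 0;
}

/* "Dump": copy the current contents to out (NUL-terminated). */
size_t demo_log_dump(const DemoLog *log, char *out, size_t out_size)
{
    size_t n;

    assert(log != NULL && out != NULL && out_size > 0);
    n = log->used < out_size - 1 ? log->used : out_size - 1;
    memcpy(out, log->data, n);
    out[n] = '\0';
    return n;
}

/* "Clear": discard everything currently in the log. */
void demo_log_clear(DemoLog *log)
{
    assert(log != NULL);
    log->used = 0;
}

/* Replays steps 1-4; returns 1 if the step-3 message is in neither
 * the dump nor the log afterwards. */
int demo_message_lost(void)
{
    DemoLog log = { {0}, 0 };
    char    dump[DEMO_LOG_SIZE];

    demo_log_write(&log, "first\n");        /* step 1 */
    demo_log_dump(&log, dump, sizeof dump); /* step 2 */
    demo_log_write(&log, "second\n");       /* step 3 */
    demo_log_clear(&log);                   /* step 4 */

    return strstr(dump, "second") == NULL && log.used == 0;
}
```

In this sketch `demo_message_lost()` reports the loss because "second" arrives after the dump copy was taken but before the clear.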

Expected behavior
There should be a command structure that ensures no messages get lost.

Reporter Info
Joseph Hickey, Vantage Systems, Inc., generalized from comment at #956 (comment)

@jphickey jphickey added the CCB:Ready Ready for discussion at the Configuration Control Board (CCB) label Oct 20, 2020
@jphickey
Contributor Author

Marking for discussion to determine if/how we should close this hole.

@jphickey jphickey added the bug label Oct 20, 2020
@jsjoberg01

I would recommend one of the following:
a. A shadow copy taken prior to Dump, with all writing occurring on the other copy. This would not need locking if the index were word-sized.
b. Use the implementation for #956 instead of ReadStart.

Locking from ReadStart through to the Clear could block the entire system, making it a single point that can stall everything (bad).
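
A rough sketch of how option (a) might look, to make the idea concrete. All names here (`ShadowLog`, `shadow_log_*`) are hypothetical and nothing below is taken from the cFE source. It is single-threaded for clarity: a real version would need the index flip to be an atomic word-sized store and would have to account for writers that are mid-write when the flip happens.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHADOW_LOG_SIZE 256

/* Hypothetical two-buffer log: writers use data[active], the dump
 * reads the other ("frozen") buffer. */
typedef struct {
    char     data[2][SHADOW_LOG_SIZE];
    size_t   used[2];
    unsigned active; /* 0 or 1 */
} ShadowLog;

/* Append to whichever buffer is currently active. */
int shadow_log_write(ShadowLog *log, const char *msg)
{
    unsigned i;
    size_t   len;

    assert(log != NULL && msg != NULL);
    i   = log->active;
    len = strlen(msg);
    if (len > SHADOW_LOG_SIZE - log->used[i]) {
        return -1;
    }
    memcpy(log->data[i] + log->used[i], msg, len);
    log->used[i] += len;
    return 0;
}

/* Flip the active index and return the index of the buffer that is now
 * frozen. Assumes the previous dump has ended, so the buffer being
 * switched to is empty. */
unsigned shadow_log_begin_dump(ShadowLog *log)
{
    unsigned frozen;

    assert(log != NULL);
    frozen = log->active;
    assert(log->used[frozen ^ 1u] == 0);
    log->active = frozen ^ 1u; /* the single word-sized update */
    return frozen;
}

/* Copy the frozen buffer out (NUL-terminated); writers are unaffected. */
size_t shadow_log_copy_out(const ShadowLog *log, unsigned frozen,
                           char *out, size_t out_size)
{
    size_t n;

    assert(log != NULL && out != NULL && out_size > 0 && frozen < 2);
    n = log->used[frozen] < out_size - 1 ? log->used[frozen] : out_size - 1;
    memcpy(out, log->data[frozen], n);
    out[n] = '\0';
    return n;
}

/* Clear only the frozen buffer, which holds exactly what was dumped. */
void shadow_log_end_dump(ShadowLog *log, unsigned frozen)
{
    assert(log != NULL && frozen < 2);
    log->used[frozen] = 0;
}
```

With this layout the clear only ever touches the buffer that was dumped, so a message written during the dump lands in the other buffer and survives.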

@astrogeco
Contributor

CCB 2020-10-21

Low priority, we need to make sure to report this behavior.
Potential solution is to have the dump command also trigger a reset.
Keep tickets open.
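
One way to read "dump also triggers a reset" is a single operation that snapshots the log and resets it in the same step, so there is no gap for a message to fall into. The sketch below assumes that reading; `SnapLog` and `snap_log_*` are hypothetical names, not cFE functions, and the locking is indicated only by comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SNAP_LOG_SIZE 256

/* Hypothetical stand-in for the syslog buffer. */
typedef struct {
    char   data[SNAP_LOG_SIZE];
    size_t used;
} SnapLog;

/* Append a message; returns 0 on success, -1 if it does not fit. */
int snap_log_write(SnapLog *log, const char *msg)
{
    size_t len;

    assert(log != NULL && msg != NULL);
    len = strlen(msg);
    if (len > SNAP_LOG_SIZE - log->used) {
        return -1;
    }
    memcpy(log->data + log->used, msg, len);
    log->used += len;
    return 0;
}

/* Copy the log into snapshot (NUL-terminated) and reset it in one step.
 * Returns the number of bytes copied, or -1 (log untouched) if the
 * snapshot buffer is too small. */
int snap_log_dump_and_reset(SnapLog *log, char *snapshot, size_t snapshot_size)
{
    size_t n;

    assert(log != NULL && snapshot != NULL);
    if (snapshot_size <= log->used) {
        return -1;
    }
    /* A real implementation would take its log lock here ... */
    n = log->used;
    memcpy(snapshot, log->data, n);
    snapshot[n] = '\0';
    log->used = 0;
    /* ... and release it here, before any slow file I/O on snapshot. */
    return (int)n;
}
```

The lock would be held only for the memory copy, not for the file write, which avoids the long ReadStart-to-Clear lock discussed earlier in the thread. Anything logged after the call stays in the log for the next dump.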

@astrogeco astrogeco added CCB-20201021 and removed CCB:Ready Ready for discussion at the Configuration Control Board (CCB) labels Oct 21, 2020