[enhancement](memtracker) Optimize query memory accuracy #11740

Merged: 1 commit, Aug 16, 2022

@xinyiZzz xinyiZzz commented Aug 12, 2022

Proposed changes

Issue Number: close #11738

Problem summary

motivation

Make the value reported by the query mem tracker consistent with the physical memory the query actually uses.

problem causes

Currently, only the virtual memory used by a query can be tracked through the tcmalloc hook. When memory is allocated but not fully used, the recorded virtual memory is larger than the physical memory.

At present, the main cause is that PODArray does not memset its memory to 0 when allocating, and blocks allocated through PODArray in places such as VOlapScanNode::_free_blocks are usually kept for reuse and are not fully used.
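The gap described above can be illustrated with a minimal sketch. `RawBuffer` and `overcount_after` are hypothetical names, not Doris code; the buffer only mimics a PODArray-like container that reserves memory without zeroing it, so an allocator hook would count the full reservation while only the written bytes are known to have been touched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a PODArray-like buffer (not Doris code):
// it reserves memory without zeroing it, so untouched bytes are never written.
class RawBuffer {
public:
    explicit RawBuffer(std::size_t capacity)
            : _data(static_cast<char*>(std::malloc(capacity))), _capacity(capacity) {}
    ~RawBuffer() { std::free(_data); }
    RawBuffer(const RawBuffer&) = delete;
    RawBuffer& operator=(const RawBuffer&) = delete;

    // Append one byte; only bytes written here are known to be touched.
    bool push_back(char c) {
        if (_data == nullptr || _size == _capacity) return false;
        _data[_size++] = c;
        return true;
    }

    // What an allocation hook would count.
    std::size_t allocated_bytes() const { return _capacity; }
    // What was actually written.
    std::size_t used_bytes() const { return _size; }

private:
    char* _data;
    std::size_t _capacity;
    std::size_t _size = 0;
};

// Returns how many bytes a hook-based tracker would over-count
// after `writes` single-byte inserts into a buffer of `capacity` bytes.
inline std::size_t overcount_after(std::size_t capacity, std::size_t writes) {
    assert(writes <= capacity);
    RawBuffer buf(capacity);
    for (std::size_t i = 0; i < writes; ++i) {
        if (!buf.push_back('x')) break;
    }
    return buf.allocated_bytes() - buf.used_bytes();
}
```

For example, reserving 1 MiB and writing 4 KiB leaves most of the reservation counted but unused, which is the kind of discrepancy the PR attributes to reused blocks.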

Fix

The query mem tracker now records only the peak memory used by PODArray and MemPool.
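One way to picture "recording only the peak memory used" is the sketch below. It is an illustration under assumptions, not the code merged in this PR: `PeakTracker` and `TrackedBlock` are hypothetical classes, and the block reports to the tracker only when its used bytes exceed the highest value it has reported so far.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical tracker (not the Doris MemTracker API): keeps the current
// and the peak number of bytes reported to it.
class PeakTracker {
public:
    void consume(std::int64_t bytes) {
        _current += bytes;
        _peak = std::max(_peak, _current);
    }
    void release(std::int64_t bytes) {
        assert(bytes <= _current);
        _current -= bytes;
    }
    std::int64_t current() const { return _current; }
    std::int64_t peak() const { return _peak; }

private:
    std::int64_t _current = 0;
    std::int64_t _peak = 0;
};

// Hypothetical reusable block: capacity is kept across clear(), but the
// tracker only hears about growth beyond the block's previous high-water mark.
class TrackedBlock {
public:
    TrackedBlock(PeakTracker* tracker, std::int64_t capacity)
            : _tracker(tracker), _capacity(capacity) {}
    ~TrackedBlock() { _tracker->release(_reported); }

    void append(std::int64_t bytes) {
        assert(_used + bytes <= _capacity);
        _used += bytes;
        if (_used > _reported) {
            _tracker->consume(_used - _reported);
            _reported = _used;
        }
    }
    // Reuse the block: memory is kept, nothing new is reported.
    void clear() { _used = 0; }

private:
    PeakTracker* _tracker;
    std::int64_t _capacity;
    std::int64_t _used = 0;
    std::int64_t _reported = 0;
};

// Reuses one 1024-byte block three times and returns the tracked peak.
inline std::int64_t peak_for_reused_block() {
    PeakTracker tracker;
    {
        TrackedBlock block(&tracker, 1024);
        block.append(100);
        block.clear();
        block.append(300);
        block.clear();
        block.append(50);
    }
    assert(tracker.current() == 0);
    return tracker.peak();
}
```

In this sketch the tracker ends up with a peak of 300 bytes, the largest amount ever used, rather than the 1024 bytes that were reserved.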

Checklist(Required)

  1. Does it affect the original behavior:
    • Yes
    • No
    • I don't know
  2. Has unit tests been added:
    • Yes
    • No
    • No Need
  3. Has document been added or modified:
    • Yes
    • No
    • No Need
  4. Does it need to update dependencies:
    • Yes
    • No
  5. Are there any changes that cannot be rolled back:
    • Yes (If Yes, please explain WHY)
    • No

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@xinyiZzz xinyiZzz force-pushed the mem_tracker_factor_v2_fix8_push branch from addbcda to dd54cb5 Compare August 12, 2022 17:53

@yiguolei yiguolei left a comment

LGTM

@yiguolei yiguolei merged commit 2a1803c into apache:master Aug 16, 2022
yiguolei pushed a commit that referenced this pull request Mar 26, 2023
#11740 solved the problem of query memory statistics being higher than the actual physical memory, which arose because PODArray does not memset its memory to 0 when allocating and the query mem tracker counts virtual memory.

However, in extreme cases such as CSV load, frequent PODArray inserts cause performance problems, so this commit reverts part of #11740 and part of #12820.

There has been no feedback so far on the accuracy of the query mem tracker, so it will not receive further attention for now.
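The commit message does not say exactly why frequent inserts are slow. One plausible reading, offered here as an assumption, is that tracking used bytes means updating a shared counter on the insert path instead of once per allocation. The sketch below uses a hypothetical `CountingTracker` and a simple doubling growth policy (also an assumption) to compare how many tracker updates each approach makes.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical shared counter standing in for a query-level tracker.
struct CountingTracker {
    std::atomic<std::size_t> bytes{0};
    std::size_t updates = 0;  // number of tracker calls, the cost being illustrated

    void consume(std::size_t n) {
        bytes.fetch_add(n, std::memory_order_relaxed);
        ++updates;
    }
};

// Report every inserted element: exact used bytes, one update per insert.
inline std::size_t updates_per_insert(std::size_t rows, std::size_t elem_size) {
    CountingTracker tracker;
    for (std::size_t i = 0; i < rows; ++i) {
        tracker.consume(elem_size);
    }
    assert(tracker.bytes.load() == rows * elem_size);
    return tracker.updates;
}

// Report only when the (assumed) doubling capacity grows:
// allocated bytes, one update per reallocation.
inline std::size_t updates_per_allocation(std::size_t rows, std::size_t elem_size) {
    CountingTracker tracker;
    std::size_t capacity = 0;
    for (std::size_t i = 0; i < rows; ++i) {
        if (i == capacity) {
            const std::size_t new_capacity = (capacity == 0) ? 1 : capacity * 2;
            tracker.consume((new_capacity - capacity) * elem_size);
            capacity = new_capacity;
        }
    }
    return tracker.updates;
}
```

Under these assumptions, inserting 1000 elements costs 1000 tracker updates in the per-insert scheme and 11 in the per-allocation scheme, which is the kind of trade-off between accuracy and insert cost that the partial revert accepts.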
gnehil pushed a commit to gnehil/doris that referenced this pull request Apr 21, 2023
…8010

mongo360 pushed a commit to mongo360/doris that referenced this pull request Jul 12, 2023
…8010

xinyiZzz added a commit to xinyiZzz/incubator-doris that referenced this pull request Jul 28, 2023
…8010

Successfully merging this pull request may close these issues.

[Enhancement] Improve query memory tracking accuracy