Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature-wip] (memory tracker) (step1) Refactor impl of MemTracker, and related use #8322

Merged
merged 5 commits into from
Mar 11, 2022

Conversation

xinyiZzz
Copy link
Contributor

@xinyiZzz xinyiZzz commented Mar 3, 2022

Proposed changes

Issue Number: close #7196 (step 1/3)

Problem Summary:

Modify the implementation of MemTracker:

  1. Simplify a lot of useless logic;
  2. Added MemTrackerTaskPool, as the ancestor of all query and import trackers, This is used to track the local memory usage of all tasks executing;
  3. Add cosume/release cache, trigger a cosume/release when the memory accumulation exceeds the parameter mem_tracker_consume_min_size_bytes;
  4. Add a new memory leak detection mode (Experimental feature), throw an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and print the accurate statistical value in HTTP, the parameter memory_leak_detection
  5. Added Virtual MemTracker, cosume/release will not sync to parent. It will be used when introducing TCMalloc Hook to record memory later, to record the specified memory independently;
  6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later;
  7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env;
  8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.;

Modify where MemTracker is used:

  1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code;
  2. Added trackers for global objects such as ChunkAllocator and StorageEngine;
  3. Added more fine-grained trackers such as ExprContext;
  4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode;
  5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;

Checklist(Required)

  1. Does it affect the original behavior: (Yes)
  2. Has unit tests been added: (No)
  3. Has document been added or modified: (Yes)
  4. Does it need to update dependencies: (Yes)
  5. Are there any changes that cannot be rolled back: (No)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@xinyiZzz xinyiZzz changed the title refactor_mem_tracker [refactor] Impl of MemTracker, and related use Mar 3, 2022
@@ -244,7 +244,7 @@ void HashTable::grow_node_array() {
_alloc_list.push_back(_current_nodes);
_end_list.push_back(_current_nodes + alloc_size);

_mem_tracker->Consume(alloc_size);
_mem_tracker->consume(alloc_size);
Copy link
Contributor

@yiguolei yiguolei Mar 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many codes call mem_tracker->somsume(size_t), Is there a better method to do this automatically?
For example, we may rewrite the memory allocators and track the memory usage during allocate, like in clickhouse:

/// Implementation of std::allocator interface that tracks memory with MemoryTracker.
/// NOTE We already plug MemoryTracker into new/delete operators. So, everything works even with default allocator.
/// But it is enabled only if jemalloc is used (to obtain the size of the allocation on call to delete).
/// And jemalloc is disabled for builds with sanitizers. In these cases memory was not always tracked.

template
struct AllocatorWithMemoryTracking

Copy link
Contributor Author

@xinyiZzz xinyiZzz Mar 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yiguolei I've done something similar with reference to clickhouse to automatically track memory usage via TCMalloc Hook
PR: #7198
But this PR is too big, so I was advised to split the commit =_=, I will push other codes next week.

Different from Jemalloc, overloading the TCMalloc new/delete operator needs to invade the source code of TCMalloc, so it is implemented by Hook;

Copy link
Contributor

@yiguolei yiguolei Mar 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are too many places call memory tracker and we may lost to add mem_tracker.consume in some node or thread. If we are following clickhouse's method, I think it do not need to call memory tracker everywhere and call mem tracker.consume so many times.

We could attach the running thread to a conext, the context maybe a query context or a compaction context or olap scanner. And rewrite the new and delete method to update the memory tracker automatically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are too many places call memory tracker and we may lost to add mem_tracker.consume in some node or thread. If we are following clickhouse's method, I think it do not need to call memory tracker everywhere and call mem tracker.consume so many times.

We could attach the running thread to a conext, the context maybe a query context or a compaction context or olap scanner. And rewrite the new and delete method to update the memory tracker automatically.

This is done by following next PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are too many places call memory tracker and we may lost to add mem_tracker.consume in some node or thread. If we are following clickhouse's method, I think it do not need to call memory tracker everywhere and call mem tracker.consume so many times.

We could attach the running thread to a conext, the context maybe a query context or a compaction context or olap scanner. And rewrite the new and delete method to update the memory tracker automatically.

You are right, I have already implemented it like this. In #7198 mentioned above,
you can see the original design document last year: https://shimo.im/docs/DT6JXDRkdTvdyV3G

be/src/exec/es/es_scroll_parser.cpp Show resolved Hide resolved
be/src/exprs/expr_context.cpp Outdated Show resolved Hide resolved
be/src/olap/task/engine_alter_tablet_task.cpp Outdated Show resolved Hide resolved
be/src/olap/task/engine_batch_load_task.h Outdated Show resolved Hide resolved
@xinyiZzz xinyiZzz force-pushed the refactor_mem_tracker branch 12 times, most recently from 8c5f3a6 to c39621c Compare March 9, 2022 03:36
be/src/exec/exec_node.cpp Show resolved Hide resolved
@@ -244,6 +244,7 @@ class OlapScanNode : public ScanNode {
TResourceInfo* _resource_info;

int64_t _buffered_bytes;
std::shared_ptr<MemTracker> _scanner_mem_tracker;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment for this mem tracker

Copy link
Contributor Author

@xinyiZzz xinyiZzz Mar 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, Count the memory consumption of Rowset Reader and Tablet Reader in OlapScanner
_scanner_mem_tracker replaces the previous runtime_state->fragment_mem_tracker(), both of which are also used to count the memory of OlapScanner.
The difference is that,

  1. The parent of runtime_state->fragment_mem_tracker() is process_mem_tracker, which is used in segment_reader of segmentV1 to determine whether OlapScanner exceeds the mem limit.
  2. The parent of _scanner_mem_tracker is query_mem_tracker, which removes the logic of OlapScanner mem limit exceed in segment_reader.

be/src/exec/partitioned_hash_table.cc Outdated Show resolved Hide resolved
be/src/exprs/expr_context.h Show resolved Hide resolved
be/src/exprs/expr_context.cpp Outdated Show resolved Hide resolved
be/src/runtime/memory/chunk_allocator.cpp Show resolved Hide resolved
be/src/runtime/memory/chunk_allocator.cpp Outdated Show resolved Hide resolved
be/src/runtime/plan_fragment_executor.h Show resolved Hide resolved
be/src/service/doris_main.cpp Show resolved Hide resolved
be/src/util/mem_info.h Show resolved Hide resolved
@yiguolei
Copy link
Contributor

I think maybe we should split the memory tracker to a small pr, for example we may first implement only Query Memory tracker. It is more useful and important.

@xinyiZzz
Copy link
Contributor Author

I think maybe we should split the memory tracker to a small pr, for example we may first implement only Query Memory tracker. It is more useful and important.

It is also possible to split the refactoring of the mem tracker into multiple small PRs, but it takes time to deal with the conflicting parts.

Let’s merge this first, the pr of Hook tcmalloc new/delete later depends on this ==, waiting for push

Next pr for help review~

Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@morningman morningman merged commit e17aef9 into apache:master Mar 11, 2022
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added approved Indicates a PR has been approved by one committer. reviewed labels Mar 11, 2022
@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@morningman morningman added dev/backlog waiting to be merged in future dev branch and removed reviewed labels Mar 11, 2022
@xiaokang
Copy link
Contributor

@xinyiZzz When I test for #8451 , I encounter a memory limit problem.

The problem is that, after the long query, as specified in the test steps of #8451 , is finished, a simple query 'select count() from tableA' will raise memory limit error.

I guess it's related to this pr, since the problem is not present before I merge the new MemTracker code.

The following is mysql client error message.

ERROR 1105 (HY000): errCode = 2, detailMessage = Memory exceed limit. fragment=4f5f114582e7429d-a630eec6e0e45384, details=New partitioned aggregation, while getting next from child 0., on backend=172.16.44.107. Memory left in process limit=8589934592.00 GB. current tracker <label=RuntimeState:inst

The following is some related logs of be.

I0312 10:58:07.931836 2320 plan_fragment_executor.cpp:76] PlanFragmentExecutor::prepare|pthread_id=140354754955008|backend_num=1|instance_id=35895a325c6943dc
-872eced6dfcb8c91|query_id=35895a325c6943dc-872eced6dfcb8c90
I0312 10:58:07.936765 2203 fragment_mgr.cpp:459] PlanFragmentExecutor::_exec_actual|pthread_id=140355728508672|instance_id=35895a325c6943dc-872eced6dfcb8c91|
query_id=35895a325c6943dc-872eced6dfcb8c90
I0312 10:58:07.936780 2203 plan_fragment_executor.cpp:213] PlanFragmentExecutor::open, using query memory limit: 7.59 GB|mem_limit=8147483648|instance_id=358
95a325c6943dc-872eced6dfcb8c91|query_id=35895a325c6943dc-872eced6dfcb8c90
W0312 10:58:07.936826 2203 status.h:260] warning: Status msg truncated, OK: Memory exceed limit. fragment=35895a325c6943dc-872eced6dfcb8c91, details=New part
itioned aggregation, while getting next from child 0., on backend=172.16.44.107. Memory left in process limit=8589934592.00 GB. current tracker <label=Runtime
State:instance:35895a325c6943dc-872eced6dfcb8c91, used=18432, limit=8147483648, failed alloc size=-1.00 B>. If query, can change the limit by session variable
exec_mem_limit. precise_code:1
W0312 10:58:07.943392 2203 mem_tracker.cpp:290] Memory exceed limit. fragment=35895a325c6943dc-872eced6dfcb8c91, details=New partitioned aggregation, while g
etting next from child 0., on backend=172.16.44.107. Memory left in process limit=8589934592.00 GB. current tracker <label=RuntimeState:instance:35895a325c694
3dc-872eced6dfcb8c91, used=18432, limit=8147483648, failed alloc size=-1.00 B>. If query, can change the limit by session variable exec_mem_limit.
MemTracker log_usage Label: queryId=35895a325c6943dc-872eced6dfcb8c90, Limit: 7.59 GB, Total: 19.00 KB, Peak: 19.00 KB, Exceeded: false
MemTracker log_usage Label: RuntimeState:instance:35895a325c6943dc-872eced6dfcb8c92, Limit: 7.59 GB, Total: 1.00 KB, Peak: 1.00 KB, Exceeded: false
MemTracker log_usage Label: RuntimeFilterMgr, Limit: -1.00 B, Total: 0, Peak: 0, Exceeded: false
MemTracker log_usage Label: RuntimeState:instance:35895a325c6943dc-872eced6dfcb8c91, Limit: 7.59 GB, Total: 18.00 KB, Peak: 18.00 KB, Exceeded: false
MemTracker log_usage Label: RuntimeFilterMgr, Limit: -1.00 B, Total: 0, Peak: 0, Exceeded: false
MemTracker log_usage Label: ExecNode:AGGREGATION_NODE (id=1), Limit: -1.00 B, Total: 1.00 KB, Peak: 1.00 KB, Exceeded: false
MemTracker log_usage Label: DataStreamSender:35895a325c6943dc-872eced6dfcb8c91, Limit: -1.00 B, Total: 16.00 KB, Peak: 16.00 KB, Exceeded: false
W0312 10:58:07.944548 2203 fragment_mgr.cpp:231] Got error while opening fragment 35895a325c6943dc-872eced6dfcb8c91: Memory limit exceeded: Memory exceed lim
it. fragment=35895a325c6943dc-872eced6dfcb8c91, details=New partitioned aggregation, while getting next from child 0., on backend=172.16.44.107. Memory left i
n process limit=8589934592.00 GB. current tracker <label=RuntimeState:inst
I0312 10:58:07.944562 2316 internal_service.cpp:187] cancel fragment, fragment_instance_id=35895a325c6943dc-872eced6dfcb8c92, reason: 3
I0312 10:58:07.944604 2316 plan_fragment_executor.cpp:602] PlanFragmentExecutor::cancel|instance_id=35895a325c6943dc-872eced6dfcb8c92|query_id=35895a325c6943
dc-872eced6dfcb8c90
W0312 10:58:07.944797 2148 fragment_mgr.cpp:231] Got error while opening fragment 35895a325c6943dc-872eced6dfcb8c92: Cancelled: Cancelled
I0312 10:58:07.944806 2203 plan_fragment_executor.cpp:673] Close() fragment_instance_id=35895a325c6943dc-872eced6dfcb8c91

zhengte pushed a commit to zhengte/incubator-doris that referenced this pull request Mar 15, 2022
…pache#8322)

Modify the implementation of MemTracker:
1. Simplify a lot of useless logic;
2. Added MemTrackerTaskPool, as the ancestor of all query and import trackers, This is used to track the local memory usage of all tasks executing;
3. Add cosume/release cache, trigger a cosume/release when the memory accumulation exceeds the parameter mem_tracker_consume_min_size_bytes;
4. Add a new memory leak detection mode (Experimental feature), throw an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and print the accurate statistical value in HTTP, the parameter memory_leak_detection
5. Added Virtual MemTracker, cosume/release will not sync to parent. It will be used when introducing TCMalloc Hook to record memory later, to record the specified memory independently;
6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later;
7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env;
8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.;

Modify where MemTracker is used:
1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code;
2. Added trackers for global objects such as ChunkAllocator and StorageEngine;
3. Added more fine-grained trackers such as ExprContext;
4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode;
5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;
@xinyiZzz
Copy link
Contributor Author

@xinyiZzz When I test for #8451 , I encounter a memory limit problem.

The problem is that, after the long query, as specified in the test steps of #8451 , is finished, a simple query 'select count() from tableA' will raise memory limit error.

I guess it's related to this pr, since the problem is not present before I merge the new MemTracker code.

@xiaokang Sorry I just saw,
The memory should be independent between queries, I will try to find the cause of this problem.

Another problem, after this pr, there may be cases where large queries that were successful before, now fail, because this pr implements Query-level memory limits.

exec_mem_limit is used to limit the memory of the entire query, which is its original meaning. And in the past, exec_mem_limit is actually the memory limit at the Fragment Instance level, not the query.

@xinyiZzz xinyiZzz changed the title [refactor] Impl of MemTracker, and related use [refactor] (memory) Impl of MemTracker, and related use Apr 1, 2022
@xinyiZzz xinyiZzz changed the title [refactor] (memory) Impl of MemTracker, and related use [feature-wip] (memory tracker) (step1) Refactor impl of MemTracker, and related use Apr 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. area/memory-consumption dev/backlog waiting to be merged in future dev branch kind/improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Refactored memory statistics framework MemTracker
4 participants