-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feature-wip] (memory tracker) (step1) Refactor impl of MemTracker, and related use #8322
Conversation
@@ -244,7 +244,7 @@ void HashTable::grow_node_array() { | |||
_alloc_list.push_back(_current_nodes); | |||
_end_list.push_back(_current_nodes + alloc_size); | |||
|
|||
_mem_tracker->Consume(alloc_size); | |||
_mem_tracker->consume(alloc_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many codes call mem_tracker->somsume(size_t), Is there a better method to do this automatically?
For example, we may rewrite the memory allocators and track the memory usage during allocate, like in clickhouse:
/// Implementation of std::allocator interface that tracks memory with MemoryTracker.
/// NOTE We already plug MemoryTracker into new/delete operators. So, everything works even with default allocator.
/// But it is enabled only if jemalloc is used (to obtain the size of the allocation on call to delete).
/// And jemalloc is disabled for builds with sanitizers. In these cases memory was not always tracked.
template
struct AllocatorWithMemoryTracking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yiguolei I've done something similar with reference to clickhouse to automatically track memory usage via TCMalloc Hook
PR: #7198
But this PR is too big, so I was advised to split the commit =_=, I will push other codes next week.
Different from Jemalloc, overloading the TCMalloc new/delete operator needs to invade the source code of TCMalloc, so it is implemented by Hook;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are too many places call memory tracker and we may lost to add mem_tracker.consume in some node or thread. If we are following clickhouse's method, I think it do not need to call memory tracker everywhere and call mem tracker.consume so many times.
We could attach the running thread to a conext, the context maybe a query context or a compaction context or olap scanner. And rewrite the new and delete method to update the memory tracker automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are too many places call memory tracker and we may lost to add mem_tracker.consume in some node or thread. If we are following clickhouse's method, I think it do not need to call memory tracker everywhere and call mem tracker.consume so many times.
We could attach the running thread to a conext, the context maybe a query context or a compaction context or olap scanner. And rewrite the new and delete method to update the memory tracker automatically.
This is done by following next PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are too many places call memory tracker and we may lost to add mem_tracker.consume in some node or thread. If we are following clickhouse's method, I think it do not need to call memory tracker everywhere and call mem tracker.consume so many times.
We could attach the running thread to a conext, the context maybe a query context or a compaction context or olap scanner. And rewrite the new and delete method to update the memory tracker automatically.
You are right, I have already implemented it like this. In #7198 mentioned above,
you can see the original design document last year: https://shimo.im/docs/DT6JXDRkdTvdyV3G
2354b40
to
16eb56c
Compare
8c5f3a6
to
c39621c
Compare
@@ -244,6 +244,7 @@ class OlapScanNode : public ScanNode { | |||
TResourceInfo* _resource_info; | |||
|
|||
int64_t _buffered_bytes; | |||
std::shared_ptr<MemTracker> _scanner_mem_tracker; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment for this mem tracker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, Count the memory consumption of Rowset Reader and Tablet Reader in OlapScanner
_scanner_mem_tracker replaces the previous runtime_state->fragment_mem_tracker(), both of which are also used to count the memory of OlapScanner.
The difference is that,
- The parent of runtime_state->fragment_mem_tracker() is process_mem_tracker, which is used in segment_reader of segmentV1 to determine whether OlapScanner exceeds the mem limit.
- The parent of _scanner_mem_tracker is query_mem_tracker, which removes the logic of OlapScanner mem limit exceed in segment_reader.
15b88ac
to
4975690
Compare
I think maybe we should split the memory tracker to a small pr, for example we may first implement only Query Memory tracker. It is more useful and important. |
It is also possible to split the refactoring of the mem tracker into multiple small PRs, but it takes time to deal with the conflicting parts. Let’s merge this first, the pr of Hook tcmalloc new/delete later depends on this ==, waiting for push Next pr for help review~ |
4975690
to
5f07101
Compare
5f07101
to
4239191
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR approved by at least one committer and no changes requested. |
PR approved by anyone and no changes requested. |
@xinyiZzz When I test for #8451 , I encounter a memory limit problem. The problem is that, after the long query, as specified in the test steps of #8451 , is finished, a simple query 'select count() from tableA' will raise memory limit error. I guess it's related to this pr, since the problem is not present before I merge the new MemTracker code. The following is mysql client error message.
The following is some related logs of be.
|
…pache#8322) Modify the implementation of MemTracker: 1. Simplify a lot of useless logic; 2. Added MemTrackerTaskPool, as the ancestor of all query and import trackers, This is used to track the local memory usage of all tasks executing; 3. Add cosume/release cache, trigger a cosume/release when the memory accumulation exceeds the parameter mem_tracker_consume_min_size_bytes; 4. Add a new memory leak detection mode (Experimental feature), throw an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and print the accurate statistical value in HTTP, the parameter memory_leak_detection 5. Added Virtual MemTracker, cosume/release will not sync to parent. It will be used when introducing TCMalloc Hook to record memory later, to record the specified memory independently; 6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later; 7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env; 8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.; Modify where MemTracker is used: 1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code; 2. Added trackers for global objects such as ChunkAllocator and StorageEngine; 3. Added more fine-grained trackers such as ExprContext; 4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode; 5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;
@xiaokang Sorry I just saw, Another problem, after this pr, there may be cases where large queries that were successful before, now fail, because this pr implements Query-level memory limits. exec_mem_limit is used to limit the memory of the entire query, which is its original meaning. And in the past, exec_mem_limit is actually the memory limit at the Fragment Instance level, not the query. |
Proposed changes
Issue Number: close #7196 (step 1/3)
Problem Summary:
Modify the implementation of MemTracker:
Modify where MemTracker is used:
Checklist(Required)
Further comments
If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...