-
Notifications
You must be signed in to change notification settings - Fork 11.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lldb/DWARF] Search fallback to the manual index in GetFullyQualified… #102123
Conversation
…Type This is needed to ensure we find a type if its definition is in a CU that wasn't indexed. This can happen if the definition is in some precompiled code (e.g. the c++ standard library) which was built with different flags than the rest of the binary.
@llvm/pr-subscribers-lldb Author: Pavel Labath (labath) Changes…Type This is needed to ensure we find a type if its definition is in a CU that wasn't indexed. This can happen if the definition is in some precompiled code (e.g. the c++ standard library) which was built with different flags than the rest of the binary. Full diff: https://github.com/llvm/llvm-project/pull/102123.diff 2 Files Affected:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
index 7e66b3dccf97f..32d8a92305aaf 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
@@ -371,6 +371,7 @@ void DebugNamesDWARFIndex::GetFullyQualifiedType(
!ProcessEntry(entry, callback))
return;
}
+ m_fallback.GetFullyQualifiedType(context, callback);
}
bool DebugNamesDWARFIndex::SameParentChain(
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/mixed-debug-names-complete-type-search.test b/lldb/test/Shell/SymbolFile/DWARF/x86/mixed-debug-names-complete-type-search.test
new file mode 100644
index 0000000000000..71da8fad00165
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/mixed-debug-names-complete-type-search.test
@@ -0,0 +1,35 @@
+REQUIRES: lld, python
+
+RUN: split-file %s %t
+RUN: %clang --target=x86_64-pc-linux -g -gpubnames -c %t/file1.c -o %t-1.o
+RUN: %clang --target=x86_64-pc-linux -g -gno-pubnames -c %t/file2.c -o %t-2.o
+RUN: llvm-dwarfdump %t-1.o --debug-names | FileCheck %s --check-prefix=PUBNAMES
+RUN: llvm-dwarfdump %t-2.o --debug-names | FileCheck %s --check-prefix=NOPUBNAMES
+RUN: ld.lld %t-1.o %t-2.o -o %t.out
+RUN: %lldb %t.out -s %t/commands -o exit | FileCheck %s
+
+// Precondition check: The first file should contain a debug_names index, but no
+// entries for MYSTRUCT.
+PUBNAMES: Name Index @ 0x0 {
+PUBNAMES-NOT: MYSTRUCT
+
+// The second file should not contain an index.
+NOPUBNAMES-NOT: Name Index
+
+// Starting from the variable in the first file, we should be able to find the
+// declaration of the type in the first unit, and then match that with the
+// definition in the second unit.
+CHECK: (lldb) script
+CHECK: struct MYSTRUCT {
+CHECK-NEXT: int x;
+CHECK-NEXT: }
+
+#--- commands
+script lldb.target.FindFirstGlobalVariable("struct_ptr").GetType().GetPointeeType()
+#--- file1.c
+struct MYSTRUCT *struct_ptr;
+#--- file2.c
+struct MYSTRUCT {
+ int x;
+};
+struct MYSTRUCT struct_;
|
LGTM. |
Isn't this going to degrade the performance of all negative queries? |
Sort of yes (*), but only so much as it is necessary to make it correct, and the level of slowdown depends on how much non-indexed code you have. The fallback index will only index those units that aren't covered by the debug_names index, so if debug_names covers everything, the fallback is a noop -- if there's one non-indexed unit, you only pay the cost of indexing that unit. (*) except not really. The most expensive fallback operation is building the manual index (afterwards, it may even be faster than debug_names), and this happens first time that anything queries it. Since all of the other debug_names methods are already calling the fallback index, it will most likely be already built by the time we get to this point and the fallback call should be fast. |
Got it, this is what I was hoping that would happen! Thanks for explaining it. (future ideas) This makes me wonder if a better design would have been to make whoever owns this instances of Index classes have instead a collection of these indices, instead of having an implementation of an index have to remember to call a fallback mechanism. |
The current setup makes sense to me, but I guess that's expected as I'm the one who created it. I can also imagine something like what you propose, but it doesn't seem like a clear win to me. These objects are owned by SymbolFileDWARF, and we probably don't want to have it do the work of juggling these (it has a lot on its plate already), so it would probably have to be a separate object (basically another implementation of the "dwarf index" interface). That would be a lot of boilerplate (though maybe we could use some template trickery to reduce it). We'd also need to come up with a less ad-hoc way communicate which things are supposed to be indexed by who, but we also wouldn't want to make it too generic (== more code), since there are basically only three index configurations we care about (and these could easily be reduced to two). |
I guess what I was proposing was more about deleting code than adding it. We must always have at most one index, but we have a lot of code allowing for the situation of two indices (manual & {apple, dwarf}). |
The main issue that this covers is when you have N .o files compiled with .debug_names, and M .o files compiled without it, and then you link N+M into a final executable. The linker will simply concatenate all of the .o files section together, but the linker doesn't know anything about DWARF, so any of the M files that didn't have indexes, won't have .debug_names indexes. So the fallback stuff is needed here. The fallback doesn't re-index existing files, it just manually indexes any files that didn't have an accelerator table. So we must have this code because of the way things are compiled and linked on linux. |
Okay. I think I understand now what you meant -- which was to have each compile unit to store some sort of a reference to "its" index (is that right)? If that's the case, then I don't think that could work. While it's true that every CU is covered by exactly one index, a single index can (and usually does) cover more than one CU, and there isn't a good way to partition it along CU boundaries. The apple indexes such shove everything into a single table, which is indexed by name (a hash of it). There's no way to efficiently query only for entries belonging to a single unit. With per-CU debug_names indexes, we could do that, but we also have multi-cu indexes, where we'd have the same problem. (For manual indexes, we could do what we want, but we'd need to be careful to not regress performance, as we currently get a large speed boost from the paralelization of the indexing step). On top of that, this isn't really a good match for the way indexes are used. When querying the index, we usually don't know which CU we want to look at (if we did, we may not even need the index). We just have a name, and we want the CU (and the DIE within that CU) which defines that name. Even if we created something like this, it would mean that each query would basically become a loop over all CUs to see which one contains the thing we're looking for. |
llvm#102123) …Type This is needed to ensure we find a type if its definition is in a CU that wasn't indexed. This can happen if the definition is in some precompiled code (e.g. the c++ standard library) which was built with different flags than the rest of the binary.
…ualified… (llvm#102123)" The test appears to be flaky. Revert it while I investigate. This reverts commits 7027cc6 and 21ef272.
…Qualified… (llvm#102123)" This reverts commit 38b67c5. I reverted the wrong patch -- sorry :(
…Type
This is needed to ensure we find a type if its definition is in a CU that wasn't indexed. This can happen if the definition is in some precompiled code (e.g. the c++ standard library) which was built with different flags than the rest of the binary.