Produce idata content for dll/dylib import on windows #30027

Closed
SlugFiller opened this issue Nov 24, 2015 · 27 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. O-windows Operating system: Windows

Comments

@SlugFiller

From what I can tell, the kind="dylib" link hint is mostly ignored on Windows, as rustc expects a matching import library to be found. I suggest that it should instead produce an appropriate idata section in the LLVM code.

For example, for the following:

#[cfg(windows)]
#[link(name = "kernel32", kind="dylib")]
#[allow(non_snake_case)]
extern "system" {
    fn GetStdHandle(nStdHandle: u32) -> *mut u8;
    fn WriteConsoleA(hConsoleOutput: *mut u8, lpBuffer: *const u8, nNumberOfCharsToWrite: u32, lpNumberOfCharsWritten: *mut u32, lpReserved: *const u8) -> i32;
}

Rust should produce the following (assuming a 32-bit target; for a 64-bit target, a bunch of the i32s below should be replaced with i64s):

@__ImageBase = external global i8

define internal i8* @GetStdHandle(i32 %nStdHandle) unnamed_addr alwaysinline nounwind {
    %lpaddr = bitcast i8** getelementptr ([3 x i8*]* @.import.kernel32.ptr, i32 0, i32 0) to i8*(i32 )**
    %addr = load i8*(i32)** %lpaddr
    %ret = call cc 64 i8* %addr(i32 %nStdHandle)
    ret i8* %ret
}

define internal i32 @WriteConsoleA(i8* %hConsoleOutput, i8* %lpBuffer, i32 %nNumberOfCharsToWrite, i32* %lpNumberOfCharsWritten, i8* %lpReserved) unnamed_addr alwaysinline nounwind {
    %lpaddr = bitcast i8** getelementptr ([3 x i8*]* @.import.kernel32.ptr, i32 0, i32 1) to i32(i8*, i8*, i32, i32*, i8*)**
    %addr = load i32(i8*, i8*, i32, i32*, i8*)** %lpaddr
    %ret = call cc 64 i32 %addr(i8* %hConsoleOutput, i8* %lpBuffer, i32 %nNumberOfCharsToWrite, i32* %lpNumberOfCharsWritten, i8* %lpReserved)
    ret i32 %ret
}

%.win32.image_import_descriptor = type {
    i8*, ; Characteristics
    i32, ; TimeDateStamp
    i32, ; ForwarderChain
    i8*, ; DLL Name
    i8* ; FirstThunk
}

@.import.kernel32.dllname = private constant [16 x i8] c"KERNEL32.DLL\00\00\00\00", section ".idata$7"

@.import.kernel32.func.GetStdHandle = private constant [18 x i8] c"\00\00GetStdHandle\00\00\00\00", section ".idata$6"
@.import.kernel32.func.WriteConsoleA = private constant [19 x i8] c"\00\00WriteConsoleA\00\00\00\00", section ".idata$6"

@.import.kernel32.desc = private global [3 x i8*] [
    i8* inttoptr (i32 sub(i32 ptrtoint([18 x i8]* @.import.kernel32.func.GetStdHandle to i32), i32 ptrtoint(i8* @__ImageBase to i32)) to i8*),
    i8* inttoptr (i32 sub(i32 ptrtoint([19 x i8]* @.import.kernel32.func.WriteConsoleA to i32), i32 ptrtoint(i8* @__ImageBase to i32)) to i8*),
    i8* null
], section ".idata$4"
@.import.kernel32.ptr = private global [3 x i8*] [
    i8* inttoptr (i32 sub(i32 ptrtoint([18 x i8]* @.import.kernel32.func.GetStdHandle to i32), i32 ptrtoint(i8* @__ImageBase to i32)) to i8*),
    i8* inttoptr (i32 sub(i32 ptrtoint([19 x i8]* @.import.kernel32.func.WriteConsoleA to i32), i32 ptrtoint(i8* @__ImageBase to i32)) to i8*),
    i8* null
], section ".idata$3"

@.dllimport = appending constant [2 x %.win32.image_import_descriptor] [
    %.win32.image_import_descriptor {
        i8* inttoptr (i32 sub(i32 ptrtoint([3 x i8*]* @.import.kernel32.desc to i32), i32 ptrtoint(i8* @__ImageBase to i32)) to i8*),
        i32 0,
        i32 0,
        i8* inttoptr (i32 sub(i32 ptrtoint([16 x i8]* @.import.kernel32.dllname to i32), i32 ptrtoint(i8* @__ImageBase to i32)) to i8*),
        i8* inttoptr (i32 sub(i32 ptrtoint([3 x i8*]* @.import.kernel32.ptr to i32), i32 ptrtoint(i8* @__ImageBase to i32)) to i8*)
    },
    %.win32.image_import_descriptor {
        i8* null,
        i32 0,
        i32 0,
        i8* null,
        i8* null
    }
], section ".idata$2"
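For illustration, the `.idata$6` strings above follow the PE hint/name layout: a 2-byte hint, the ASCII symbol name, and a NUL terminator. Below is a minimal Rust sketch of building such an entry. The function name `hint_name_entry` is hypothetical, and the sketch pads only to an even length (which, as far as I understand the PE format, is all that is required), whereas the IR above uses extra trailing NULs.

```rust
/// Build one hint/name table entry: a 2-byte little-endian hint, the
/// ASCII symbol name, a NUL terminator, and one extra zero byte when
/// needed so that the entry has an even length.
/// (Hypothetical helper for illustration; not part of rustc.)
pub fn hint_name_entry(hint: u16, name: &str) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(2 + name.len() + 2);
    bytes.extend_from_slice(&hint.to_le_bytes());
    bytes.extend_from_slice(name.as_bytes());
    bytes.push(0); // NUL terminator
    if bytes.len() % 2 != 0 {
        bytes.push(0); // pad to an even size
    }
    bytes
}
```

A hint of 0, as in the IR above, just means the loader gets no shortcut into the DLL's export name table and falls back to a lookup by name.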
@nagisa
Member

nagisa commented Nov 24, 2015

So… you basically want to have import .lib generated on the fly by rustc?

This, at very least, would be problematic with extern blocks that have overlapping definitions.

@SlugFiller
Author

Using the right markers for the globals, it's possible to make LLVM handle that gracefully, by concatenating the necessary elements. However, a better solution would be for rust to do overlap checking prior to generating the LLVM code, and generate a single import block for the entire program, thus avoiding any duplicate imports.
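A minimal sketch of the overlap check described here, assuming the compiler can collect every extern block as a DLL name plus a list of symbols. The `ExternBlock` type and `merge_imports` function are hypothetical, and treating DLL names as case-insensitive is an assumption of this sketch.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One `extern` block as the compiler might see it: a DLL name plus the
/// symbols declared in that block. (Hypothetical type for illustration.)
pub struct ExternBlock<'a> {
    pub dll: &'a str,
    pub symbols: &'a [&'a str],
}

/// Merge all blocks into one table keyed by lowercased DLL name, so each
/// (dll, symbol) pair appears once no matter how many blocks declare it.
pub fn merge_imports(blocks: &[ExternBlock<'_>]) -> BTreeMap<String, BTreeSet<String>> {
    let mut table: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for block in blocks {
        // Assumption: "kernel32" and "KERNEL32" name the same DLL.
        let symbols = table.entry(block.dll.to_ascii_lowercase()).or_default();
        for sym in block.symbols {
            symbols.insert((*sym).to_string());
        }
    }
    table
}
```

With a table like this, a single import descriptor per DLL could be generated for the whole program, rather than one per extern block.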

At any rate, avoiding duplicate creation of imports can't be much more challenging than avoiding duplicate reification of generics.

@apasel422 added the O-windows Operating system: Windows label Nov 25, 2015
@alexcrichton
Member

Could you elaborate on the rationale for this? Do other compilers do this? e.g. MSVC or clang?

@SlugFiller
Author

MSVC comes with a (proprietary) tool for producing a .lib from a .def, and a .def from a .dll, if you're missing the former. Most other compilers targeting Windows lack this facility and have to rely on MSVC's toolchain.

The rationale is twofold:

  • First, removing a dependency on a specific toolchain. For instance, a file with the appropriate idata section could be compiled with LLVM and linked with LLD without needing an additional runtime or libraries.
  • Second, without this, the kind="dylib" hint is superfluous, since the behavior is identical to kind="static", at least when targeting Windows (I haven't tested how the behavior is when targeting Mac or Linux yet)

@retep998
Member

The behaviors of kind="dylib" and kind="static" are not the same. dylib causes the library to be passed to the linker, while static causes rustc to bundle the library into the .rlib. There's also a proposal to apply dllimport based on whether the library the definitions come from is dylib or static.

@alexcrichton
Member

@SlugFiller could you elaborate on the LLD aspect of this? From what I understand import libraries are basically a "linker thing" in the sense that the MSVC linker is the only one that consumes them. The GNU linker in the MinGW toolchain, for example, I believe doesn't use the same strategy? (or at least encodes it differently)

Along those lines I would expect that LLD would basically take care of itself and not require a lot of manual interaction with these sorts of imports and whatnot. I'm not entirely sure about the story here, though, as I've played around very little with LLD.

@SlugFiller
Author

@alexcrichton MinGW is first and foremost a set of header files and libraries mimicking those available in MSVC, starting with those import libraries. The rest of the toolchain is simply a port of the GCC toolchain. So, I can guarantee you that MinGW consumes import libraries, and is bundled with "off-brand" replicas of import libraries for common Microsoft DLLs. As far as I know, they should be cross-compatible/interchangeable with the Microsoft ones, although there's always a chance that tiny differences would cause incompatibilities if trying to mix-and-match. Avoiding that minefield is reason enough to avoid dependency on them entirely.

LLD, at its core, is just a linker, but it does successfully consume import libraries; its changelog keeps up-to-date compatibility reports for them. A recent patch (dated Jun 17) mentions a dependency on an MSVC tool (lib.exe) to produce import libraries. To my knowledge, LLD is not bundled with any import libraries, but can consume both those bundled with MSVC and those bundled with MinGW. This does mean it still has an external dependency on one of those two.

As I've demonstrated above, import libraries can be created manually through LLVM, thereby eliminating dependency on any external toolchains.

Note that this is not a "linker thing", but a "compiler thing": Most compilers output DLL imports simply as static imports, indistinguishable from dependencies on static libraries. True DLL imports require a matching idata section. The linker can link such an idata section into the program, without needing any special awareness of its contents. In particular, without the idata section, the compiler "erases" the data of which DLL any given import must come from. This information is stored in the idata section.
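To make the "which DLL does this import come from" record concrete, here is a hedged Rust mirror of the import descriptor from the IR above. The field names and the `to_rva` helper are illustrative choices, not an existing API. As far as I understand the PE format, all five descriptor fields are 32-bit values in both 32-bit and 64-bit images, and the pointer-like ones are RVAs (offsets from the image base) rather than absolute addresses.

```rust
use std::convert::TryFrom;

/// Rust mirror of one 20-byte PE import directory entry.
/// (Field names are illustrative; compare the IR struct above.)
#[repr(C)]
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct ImportDescriptor {
    pub import_lookup_table_rva: u32,  // "Characteristics" in the IR
    pub time_date_stamp: u32,
    pub forwarder_chain: u32,
    pub name_rva: u32,                 // RVA of the NUL-terminated DLL name
    pub import_address_table_rva: u32, // "FirstThunk" in the IR
}

/// Convert an absolute address inside the image to an RVA.
/// Returns None if the address is below the image base or the
/// offset does not fit in 32 bits.
pub fn to_rva(address: u64, image_base: u64) -> Option<u32> {
    address
        .checked_sub(image_base)
        .and_then(|offset| u32::try_from(offset).ok())
}
```

An all-zero descriptor, which is what `ImportDescriptor::default()` produces, corresponds to the terminating entry at the end of the `@.dllimport` array.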

@retep998
Member

If this can be done with no runtime overhead over the linker resolving imports from static libraries, and if it can be done without needing the compiler to look at the DLLs itself, and if it doesn't further overload the meaning of kind=dylib but instead uses its own opt-in way of specifying it, then I'd be in favor of something like this.

@SlugFiller
Author

@retep998 I don't understand what you mean by "runtime overhead". It doesn't use LoadLibrary, if that's what you mean. The idata section is how all Windows programs must define their import tables. It's part of the PE format. Import libraries must use the same method (Although import libraries skip any inlining optimizations LLVM can offer for the import stubs)

The compiler doesn't need to look in the DLLs. The information provided in the link attribute and extern section is sufficient, as I've demonstrated above. Adding an extra attribute for the ordinal hint could also help, but is not necessary.

For not overloading the meaning, a possibility is to use a different type of kind, like kind=dll.

@retep998
Member

retep998 commented Dec 1, 2015

Well if it ends up with the same PE imports as using import libraries, and all it needs is the information in the extern section with that link attribute, then by all means go for it. This might need an RFC though since it does add a new kind (or some other syntax to specify the usage of this feature).

@ahicks92
Contributor

If I understand what this does correctly (it removes the need to have the import library on Windows), I really want this.
Getting users to have the import libraries in the right places is painful in basically any language. Not to mention getting the users to have the right import libraries for the toolchain they're using, etc. You can probably build a lot of stuff from source, but at least one of my projects requires CMake and 6 or so Python packages. On top of that, it uses Libsndfile (can only be built under MinGW) but must itself be built under VS (because it's audio and it needs the Wasapi interfaces). This setup is perhaps atypical, but it's all for very good reasons. The net result is that getting it working with a build script is tricky, as is providing an import library to everyone who needs one.
Or, if we had this, I could just not bother and keep giving out binary dlls all day long and everyone would be happy. This would simplify my life greatly.
It might also help the Sdl2 bindings, which currently instruct you to copy a library to a MinGW directory and then to put a DLL in your cargo project. Either those instructions or some other instructions elsewhere further specify that you have to be careful to not accidentally use the Visual Studio Sdl or it will all explode.

@retep998
Member

retep998 commented Mar 1, 2016

@SlugFiller I really wish you'd create an RFC for this. It's such a genuinely useful thing to have.

@retep998
Member

retep998 commented Mar 9, 2016

Just as a note, whatever the proposed syntax is, it would need to have a way to specify an ordinal instead of a symbol name, as some DLLs have symbols that are only identifiable by their ordinal rather than their name.

@alexchandel

👍 👍 👍 This would be absolutely amazing for Windows targeting. It would remove so much hell in compiling/cross-compiling to windows, including dealing with msvc/mingw32.

@alexchandel

Is it certain this would require an RFC? It uses the existing dylib kind, and it more closely matches the meaning of dylib on other platforms, since Windows doesn't really distinguish between linking static and dynamic libraries.

@retep998
Member

I'd much rather this used a new kind or new syntax rather than changing the behavior of dylib. One example of how changing the behavior of dylib could cause breakage is when linking to a DLL which only provides a function via an ordinal rather than a symbol name (yes this does happen quite often). The import library thus serves an important role in correlating the symbol name to the ordinal. Also since many people currently use kind=dylib to link to static libraries (since kind=static does undesirable bundling), making dylib do this instead of linking to an import library would cause those situations to break as well.

@alexchandel

Ok, maybe like kind="dll"?

@Ericson2314
Contributor

Ericson2314 commented Aug 16, 2016

Since nobody has mentioned it yet: dlltool is the binutils way of getting something for the linker when only a DLL is present. There's also dllwrap, but I don't know anything about it.

@retep998
Member

retep998 commented Aug 16, 2016

All an import library really is, is a mapping from a symbol such as _foo@4 to a symbol in a dll such as foo in BAR.dll. A .def file is one way of explicitly specifying this mapping, and both lib and dlltool can create an import library from that .def. The only issue is that lib doesn't support mapping from one symbol name to a different symbol name, so on 32bit it only supports cdecl _foo -> foo, which means it is completely unusable for stdcall.
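The decorated-to-undecorated mapping mentioned here can be sketched in Rust as follows. This is a minimal illustration assuming the usual 32-bit x86 conventions (a leading underscore, plus `@` and the argument size in bytes for stdcall); the function names are hypothetical.

```rust
/// Decorated 32-bit stdcall name: underscore prefix, then "@" and the
/// total size of the arguments in bytes (e.g. `_foo@4`).
pub fn decorate_stdcall(name: &str, arg_bytes: u32) -> String {
    format!("_{}@{}", name, arg_bytes)
}

/// Decorated 32-bit cdecl name: underscore prefix only (e.g. `_foo`).
pub fn decorate_cdecl(name: &str) -> String {
    format!("_{}", name)
}

/// Recover the undecorated export name from either decorated form.
pub fn undecorate(decorated: &str) -> &str {
    let s = decorated.strip_prefix('_').unwrap_or(decorated);
    match s.rfind('@') {
        // Only strip the suffix if "@" is followed by one or more digits.
        Some(at)
            if at + 1 < s.len()
                && s[at + 1..].chars().all(|c| c.is_ascii_digit()) =>
        {
            &s[..at]
        }
        _ => s,
    }
}
```

For the `WriteConsoleA` declaration at the top of the thread, five 4-byte arguments would give the stdcall name `_WriteConsoleA@20` on a 32-bit target.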

Anyway, to make rustc be able to emit this sort of information without losing out on anything, all the user needs to be able to do is specify these things:

  1. The DLL name. Can be specified via #[link(name = "foo", kind = "dll")] which ends up being foo.dll.
  2. The symbol name in the DLL. Defaults to the literal name of the function/static. The question here is for 32-bit, whether to link to the decorated version or the non-decorated version. By default a dll created without a .def will export decorated symbols, while a dll with a .def will export undecorated symbols, and the import library will provide the mapping from decorated to undecorated. I believe that the predominant use of this functionality will be to link to system libraries, which fall into the latter group, so I'd prefer the default to be undecorated, and if you use #[link_name] it will use the decorated form instead, unless you prefix it with \x01 in which case it will be undecorated again.
  3. The ordinal in the DLL. Some symbols are only exported from the dll via ordinal, and not by name, so it is critical that there be a way to specify the ordinal. Some sort of #[ordinal] or #[link_ordinal] attribute would suffice.
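The name-versus-ordinal choice in items 2 and 3 maps onto a single bit of each import lookup table entry. Here is a hedged Rust sketch for the 32-bit (PE32) case: the `ImportBy` type and both functions are hypothetical, while the high-bit ordinal flag and 16-bit ordinal field are my understanding of the PE format.

```rust
/// How a symbol is looked up in the DLL. (Hypothetical type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ImportBy {
    /// By name: the entry holds the RVA of a hint/name table entry.
    Name { hint_name_rva: u32 },
    /// By ordinal: the entry holds the 16-bit ordinal number.
    Ordinal(u16),
}

/// High bit of a PE32 import lookup entry: set means "import by ordinal".
pub const ORDINAL_FLAG32: u32 = 0x8000_0000;

/// Encode one PE32 import lookup table entry.
/// Returns None for a name RVA that collides with the ordinal flag.
pub fn encode_entry32(import: ImportBy) -> Option<u32> {
    match import {
        ImportBy::Ordinal(n) => Some(ORDINAL_FLAG32 | u32::from(n)),
        ImportBy::Name { hint_name_rva } => {
            if (hint_name_rva & ORDINAL_FLAG32) != 0 {
                None
            } else {
                Some(hint_name_rva)
            }
        }
    }
}

/// Decode a PE32 import lookup table entry back into an `ImportBy`.
pub fn decode_entry32(entry: u32) -> ImportBy {
    if (entry & ORDINAL_FLAG32) != 0 {
        ImportBy::Ordinal((entry & 0xFFFF) as u16)
    } else {
        ImportBy::Name { hint_name_rva: entry }
    }
}
```

An ordinal attribute along the lines proposed in item 3 would only need to supply the `u16`; the DLL name from item 1 lives in the import descriptor, not in these entries.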

Have I missed anything? As far as I can tell this should be sufficient.

@retep998
Member

Also to clarify how important ordinals are, there are 13,907 exports in the Windows SDK only accessible via ordinal (out of a total of 145,802 exports cataloged by me).

@rkarp
Contributor

rkarp commented Jan 28, 2018

Interestingly, the windows-gnu toolchain can link DLLs directly, without the need for any import libraries. You can test this by renaming libkernel32.a in the ...lib\rustlib\x86_64-pc-windows-gnu\lib subdirectory of your toolchain. You'll get a ld: cannot find -lkernel32 error when linking. If you then copy kernel32.dll from c:\windows\system32 to lib the build will work again. I've also tried linking other 3rd party DLLs and it works without any problem, the DLLs just have to be on the library search path for the linker.

So actually it seems Rust could theoretically just add system32 to the library search path for all builds and dispose of all the Windows import libraries in the toolchain folder for windows-gnu targets. Though I don't know if that would have other unintended side effects, such as not being able to use exports by ordinal.

@GabrielMajeri
Contributor

@rkarp this has been brought up before. It doesn't work if you're cross-compiling (since you don't have those DLLs), or you're targeting a newer version of Windows (so your DLLs lack some specific export), or you need an API exported through an ordinal.

@est31
Member

est31 commented Jan 29, 2018

👍 to this. This will help lower the barrier to entry for Rust development on Windows, because once we ship LLD and figure out a solution for the CRT, you won't need to install anything beyond the Rust toolchain itself. It will also make cross-compiling to Windows from other platforms easier.

@Memnarch

Wait a sec, it's that easy to add idata in LLVM? I have been searching for this for way too long, as I want to write a little toy Pascal frontend for LLVM, and (Object) Pascal naturally comes with this way of specifying imports (internalname, dllname, externalname).

Seems that ticket helped accidentally, too :D

@retep998
Member

There is an RFC open for this feature: rust-lang/rfcs#2627

@crlf0710
Member

cc #58713, maybe close this as a duplicate.

@retep998
Member

Given the RFC has been accepted and a tracking issue has been created, please direct all further discussion there.

#58713
