.rodata bloat with aligned zero sized objects #75524

Open
haraldh opened this issue Aug 14, 2020 · 18 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such

Comments

@haraldh
Contributor

haraldh commented Aug 14, 2020

I tried this code:

#![allow(unused_variables, unused_mut)]

#[repr(align(0x200000))]
pub struct Page2MiB;

pub fn main() {
    // Version a: .rodata size <<<<< 0x200000
    let mut slice = [0u8];

    // Version b: .rodata size > 0x200000
    let mut slice = vec![0u8]; let slice = slice.as_mut_slice();

    let (pre, middle, post) = unsafe { slice.align_to_mut::<Page2MiB>() };
    dbg!(pre.len(), middle.len(), post.len());
}

Edit: which can be further reduced to:

pub fn main() {
    #[repr(align(0x200000))]
    pub struct Aligned;
    dbg!();
    let slice: &mut [Aligned; 0] = &mut [];
    dbg!(slice.as_ptr());
}

I expected to see this happen:

$ rustc main.rs && size -A main | fgrep .rodata
.rodata                   17449   2097152

A reasonably sized .rodata section, as when Version b is commented out.

Instead, this happened:

$ rustc main.rs && size -A main | fgrep .rodata
.rodata                 2114473   2097152

A .rodata section larger than 2 MiB.

With this minimal example, setting an opt-level makes the bloat go away, but with more complex code it does not.

The culprit is:

$ rustc --emit=asm main.rs && egrep -5 'p2align[[:space:]]+21' main.s
	.asciz	"k\000\000\000\000\000\000\000[\004\000\000\035\000\000"
	.size	.L__unnamed_3, 24

	.type	.L__unnamed_19,@object
	.section	.rodata..L__unnamed_19,"a",@progbits
	.p2align	21
.L__unnamed_19:
	.size	.L__unnamed_19, 0

	.type	.L__unnamed_4,@object
	.section	.data.rel.ro..L__unnamed_4,"aw",@progbits

So we have a zero-sized object, but its alignment is still enforced.
I don't know whether this is an LLVM bug.

$ rustc --emit=llvm-ir main.rs && grep 'align 2097152' main.ll
@alloc162 = private unnamed_addr constant <{ [0 x i8] }> zeroinitializer, align 2097152
define internal i64 @"_ZN4core5slice29_$LT$impl$u20$$u5b$T$u5d$$GT$3len17h52545189b0701658E"([0 x %Page2MiB]* noalias nonnull readonly align 2097152 %self.0, i64 %self.1) unnamed_addr #1 {
  %_45 = call i64 @"_ZN4core5slice29_$LT$impl$u20$$u5b$T$u5d$$GT$3len17h52545189b0701658E"([0 x %Page2MiB]* noalias nonnull readonly align 2097152 %middle.0, i64 %middle.1)

Again, zero-sized, but with align 2097152.

Meta

rustc --version --verbose:

$ rustc --version --verbose
rustc 1.45.2 (d3fb005a3 2020-07-31)
binary: rustc
commit-hash: d3fb005a39e62501b8b0b356166e515ae24e2e54
commit-date: 2020-07-31
host: x86_64-unknown-linux-gnu
release: 1.45.2
LLVM version: 10.0

and nightly:

$ rustc +nightly -Vv
rustc 1.47.0-nightly (18f3be770 2020-08-09)
binary: rustc
commit-hash: 18f3be7704a4ec7976fcd1272c728974243d29bd
commit-date: 2020-08-09
host: x86_64-unknown-linux-gnu
release: 1.47.0-nightly
LLVM version: 10.0

@haraldh haraldh added the C-bug Category: This is a bug. label Aug 14, 2020
@RalfJung
Member

RalfJung commented Aug 15, 2020

So, we have an unsized object, but an alignment was forced.

(I assume you mean zero-sized? "unsized" is something else.)
I don't see a bug here. Zero-sized objects have alignment requirements like all others, they are not special in that regard.

For example, a &[i32; 0] must be 4-aligned, otherwise even creating such a reference is UB.
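
As a minimal sketch of that point (not from the thread; the helper name `empty_i32_array_addr` is made up for illustration), the following checks that an empty `i32` array occupies no bytes yet a reference to it still points at a non-null address aligned for `i32`:

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: the address a reference to an empty `[i32; 0]` points at.
fn empty_i32_array_addr() -> usize {
    let empty: &[i32; 0] = &[];
    empty.as_ptr() as usize
}

fn main() {
    let addr = empty_i32_array_addr();
    // The array occupies no bytes...
    assert_eq!(size_of::<[i32; 0]>(), 0);
    // ...but it inherits the alignment of its element type,
    assert_eq!(align_of::<[i32; 0]>(), align_of::<i32>());
    // and the reference points at a non-null, suitably aligned address.
    assert_ne!(addr, 0);
    assert_eq!(addr % align_of::<i32>(), 0);
    println!("&[i32; 0] points at {addr:#x}");
}
```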

@haraldh
Contributor Author

haraldh commented Aug 15, 2020

Huh, so having 2MB zeros on disk for nothing is OK?

@RalfJung
Member

RalfJung commented Aug 15, 2020

Huh, so having 2MB zeros on disk for nothing is OK?

I didn't say that. The object can still occupy zero bytes. But it also has an address, and that address must be appropriately aligned.

I am afraid I don't know the details of the ELF format or the optimizations involved. All I am saying is that "it has size 0 and alignment is forced" is expected behavior. There is possibly a bug elsewhere, but the fact that "alignment is forced" is not a bug. It would be a critical bug if alignment was not enforced! (See e.g. #70143, #70022.)
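
To make "size 0, alignment enforced" concrete, here is a small sketch reusing the `Aligned` type from the reduced example. The helper name `aligned_zst_addr` is an assumption for illustration. It only checks the language-level properties and says nothing about how the linker places the object:

```rust
use std::mem::{align_of, size_of};

// Same zero-sized, 2 MiB-aligned type as in the reduced example.
#[allow(dead_code)]
#[repr(align(0x200000))]
struct Aligned;

/// Hypothetical helper: the address of an empty array of `Aligned`.
fn aligned_zst_addr() -> usize {
    let slice: &mut [Aligned; 0] = &mut [];
    slice.as_ptr() as usize
}

fn main() {
    // The type occupies zero bytes...
    assert_eq!(size_of::<Aligned>(), 0);
    // ...but still carries its 2 MiB alignment requirement,
    assert_eq!(align_of::<Aligned>(), 0x200000);
    // so any reference to it must point at a 2 MiB-aligned address.
    assert_eq!(aligned_zst_addr() % 0x200000, 0);
}
```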

Cc @Amanieu

@haraldh
Contributor Author

haraldh commented Aug 15, 2020

Can't we give it address 0 which is aligned to everything?

@tesuji
Contributor

tesuji commented Aug 15, 2020

Is it null?

@haraldh
Contributor Author

haraldh commented Aug 15, 2020

Except for querying len(), you can't do much with it.

@RalfJung
Member

RalfJung commented Aug 15, 2020

No, address 0 is not a valid address for any object.

@haraldh
Contributor Author

haraldh commented Aug 15, 2020

But yeah... dbg!(middle.as_ptr()); should give an aligned pointer.

@tesuji
Contributor

tesuji commented Aug 15, 2020

The problem with a reference at address zero is that when you get a raw pointer from it,
how do you know whether it is null?
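
A short sketch of why address 0 is reserved (my own illustration, not code from the thread; `dangling_addr` is a hypothetical helper). `Option<&T>` is the same size as `&T` because `None` is encoded as the null address, so no reference may ever be null. `NonNull::dangling()` shows that a non-null, well-aligned address for a zero-sized type can be produced without referring to any object:

```rust
use std::mem::size_of;
use std::ptr::NonNull;

#[allow(dead_code)]
#[repr(align(0x200000))]
struct Aligned;

/// Hypothetical helper: a non-null, well-aligned address for `Aligned`.
fn dangling_addr() -> usize {
    NonNull::<Aligned>::dangling().as_ptr() as usize
}

fn main() {
    // `None` uses the null address as its niche, so references can never be null.
    assert_eq!(size_of::<Option<&Aligned>>(), size_of::<&Aligned>());

    let addr = dangling_addr();
    // The dangling pointer is non-null and satisfies the 2 MiB alignment.
    assert_ne!(addr, 0);
    assert_eq!(addr % 0x200000, 0);
    println!("dangling Aligned pointer: {addr:#x}");
}
```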

@haraldh
Contributor Author

haraldh commented Aug 15, 2020

Ok, this is what it can be reduced to:

pub fn main() {
    #[repr(align(0x200000))]
    pub struct Aligned;
    dbg!();
    let slice: &mut [Aligned; 0] = &mut [];
    dbg!(slice.as_ptr());
}

@tmiasko
Contributor

tmiasko commented Aug 15, 2020

-Clink-args=-Wl,--sort-section=alignment might improve the situation.
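
For a Cargo project (an assumption, since the thread invokes rustc directly), the same linker flag could be passed from a build script. This is only a sketch: the helper name `link_arg_directive` is made up, and it assumes the linker driver accepts `-Wl,` arguments and forwards them to a linker that honours `--sort-section=alignment`:

```rust
// build.rs (sketch): ask Cargo to pass the flag from this comment to the linker.

/// Hypothetical helper: the Cargo directive that forwards the flag.
fn link_arg_directive() -> String {
    // `cargo:rustc-link-arg=FLAG` makes Cargo add `-C link-arg=FLAG` to the rustc call.
    String::from("cargo:rustc-link-arg=-Wl,--sort-section=alignment")
}

fn main() {
    println!("{}", link_arg_directive());
}
```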

@haraldh
Contributor Author

haraldh commented Aug 15, 2020

-Clink-args=-Wl,--sort-section=alignment might improve the situation.

very good, thank you!

@haraldh
Contributor Author

haraldh commented Aug 15, 2020

-Clink-args=-Wl,--sort-section=alignment might improve the situation.

rustc -C opt-level=3 src/main.rs && strip main && ls -hs main
2,3M main
rustc -C opt-level=3 -Clink-args=-Wl,--sort-section=alignment src/main.rs && strip main && ls -hs main
231K main

@haraldh
Contributor Author

haraldh commented Aug 15, 2020

pub fn main() {
    dbg!();
    #[repr(align(536870912))]
    pub struct Aligned;
    let slice: &mut [Aligned; 0] = &mut [];
    dbg!(slice.as_ptr());
}

to the extreme (even with opt-level=s):

rustc -C opt-level=s src/main.rs && strip main && ls -hs main
513M main

@haraldh
Contributor Author

haraldh commented Aug 15, 2020

So I would suggest that at least opt-level=s should pass --sort-section=alignment to the linker.

haraldh added a commit to haraldh/enarx that referenced this issue Aug 18, 2020
haraldh added a commit to haraldh/enarx that referenced this issue Aug 19, 2020
enarxbot pushed a commit to enarx/enarx that referenced this issue Aug 19, 2020
npmccallum pushed a commit to enarx-archive/enarx-keepldr that referenced this issue Sep 2, 2020
@workingjubilee
Member

workingjubilee commented Sep 5, 2020

...wait, what magic happened here exactly? Does telling the linker to sort by alignment seriously just guarantee that all ZSTs overlap the same address in the binary space? If so, there's no obvious reason not to always do it for all ZSTs, is there?
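
One way to see where the saving could come from is a toy model of padding, sketched below. It is not how any real linker is implemented: the section sizes are invented, and `laid_out_size` and `sorted_by_alignment` are hypothetical helpers. It places sections one after another, rounding each start offset up to that section's alignment, and compares the original order with descending alignment order:

```rust
/// (size, alignment) of one input section; alignment must be a power of two.
type Section = (u64, u64);

/// Total size when sections are placed in order, padding each to its alignment.
fn laid_out_size(sections: &[Section]) -> u64 {
    let mut offset = 0u64;
    for &(size, align) in sections {
        // Round the current offset up to the next multiple of `align`.
        offset = (offset + align - 1) & !(align - 1);
        offset += size;
    }
    offset
}

/// Copy of the sections ordered by descending alignment.
fn sorted_by_alignment(sections: &[Section]) -> Vec<Section> {
    let mut sorted = sections.to_vec();
    sorted.sort_by(|a, b| b.1.cmp(&a.1));
    sorted
}

fn main() {
    // Invented inputs: some small data, the zero-sized 2 MiB-aligned object, more data.
    let sections = [(24, 8), (0, 0x200000), (17_000, 16)];

    let unsorted = laid_out_size(&sections);
    let sorted = laid_out_size(&sorted_by_alignment(&sections));

    // The zero-sized object in the middle forces almost 2 MiB of padding before it.
    assert_eq!(unsorted, 2_114_152);
    // Placed first, it sits at offset 0 and needs no padding.
    assert_eq!(sorted, 17_024);
    println!("unsorted: {unsorted} bytes, sorted: {sorted} bytes");
}
```

In this model the zero-sized object does not overlap anything. It is placed first, where the offset is already aligned, so no padding is needed ahead of it.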

@haraldh
Contributor Author

haraldh commented Dec 23, 2022

Still so annoying :-/

@bstrie
Contributor

bstrie commented Dec 23, 2022

Does telling the linker to sort by alignment seriously just guarantee that all ZSTs overlap the same address in the binary space?

Note that I can only realize these benefits with GNU ld; for whatever reason, lld does not do the same despite claiming to support sorting sections by alignment.

@ChrisDenton ChrisDenton added the needs-triage-legacy Old issue that were never triaged. Remove this label once the issue has been sufficiently triaged. label Jul 16, 2023
@jieyouxu jieyouxu removed the needs-triage-legacy Old issue that were never triaged. Remove this label once the issue has been sufficiently triaged. label Feb 18, 2024
@jieyouxu jieyouxu added A-linkage Area: linking into static, shared libraries and binaries C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such labels Feb 18, 2024