Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

split fix android memory kill in split #5940

Merged

Conversation

cre4ture
Copy link
Contributor

@cre4ture cre4ture commented Feb 3, 2024

split allocates quite some memory when reading from "files" with no real end like /dev/zero.
On x86 its ~ 1GB and on x86_x64 its 2GB. This was cause frequent stability issues on android CI.
This change limits the initially read memory heavily to the blksize of the filesystem or 500MB at max if no blksize is specified. This is aligned with a solution used in head

@cre4ture cre4ture force-pushed the fix/stabilize_android_build_memory_kill_split branch 3 times, most recently from dd6a6ca to 29668b4 Compare February 3, 2024 17:23
Copy link

github-actions bot commented Feb 3, 2024

GNU testsuite comparison:

GNU test failed: tests/split/b-chunk. tests/split/b-chunk is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/split/l-chunk. tests/split/l-chunk is passing on 'main'. Maybe you have to rebase?
GNU test error: tests/split/line-bytes. tests/split/line-bytes is passing on 'main'. Maybe you have to rebase?

@tertsdiepraam
Copy link
Member

I've only given this a cursory glance, but we do probably need to at least keep that ---io-blksize argument around. Why did you remove it?

@cre4ture
Copy link
Contributor Author

cre4ture commented Feb 3, 2024

I've only given this a cursory glance, but we do probably need to at least keep that ---io-blksize argument around. Why did you remove it?

I do not see the purpose. Its a hidden (undocumented?) parameter. Who will gona use it? Does GNU have it?
Of course it doesn't hurt much. But I thought automatic blksize is better than manual.

@tertsdiepraam
Copy link
Member

tertsdiepraam commented Feb 3, 2024

Yeah GNU has it, and that's why we have it too. It's also used us some of their tests, so if we want to pass those we need it.

@cre4ture
Copy link
Contributor Author

cre4ture commented Feb 3, 2024

ah, ok, I see. I'll add it again.

@cre4ture
Copy link
Contributor Author

cre4ture commented Feb 3, 2024

How do we know what the purpose of the undocumented argument is about?
It turns out that a empty implementation of that argument makes the tests green.

@tertsdiepraam
Copy link
Member

Yeah it's mostly for testing I believe, but it'd be nice to make it useful if we have it :)

@cre4ture cre4ture force-pushed the fix/stabilize_android_build_memory_kill_split branch 2 times, most recently from 2c7dfd8 to 8d2a917 Compare February 4, 2024 10:59
@cre4ture cre4ture changed the title fix android memory kill in split ci:android fix android memory kill in split Feb 4, 2024
@cre4ture cre4ture changed the title ci:android fix android memory kill in split split fix android memory kill in split Feb 4, 2024
Copy link
Member

@tertsdiepraam tertsdiepraam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple of questions/suggestions about this.

  • It looks like we're only using the result as u64, is that correct? Maybe we could just make it use u64 instead of usize. That'll allow us to get rid of the conversions and simplify the code a bit.
  • I don't think the stdin check is really correct, because we might be piping in a file with a well-defined size. So, I think the check should not just be checking for -.
  • We should think about the naming of the functions and document all the public functions. Maybe sane_blksize? I'd love to get rid of the distinction between the path and metadata versions too, but if that's not possible this is alright.

src/uucore/src/lib/features/fs.rs Outdated Show resolved Hide resolved
@cre4ture
Copy link
Contributor Author

cre4ture commented Feb 4, 2024

Thanks for reviewing and feedback :-)
Response below:

I have a couple of questions/suggestions about this.

  • It looks like we're only using the result as u64, is that correct? Maybe we could just make it use u64 instead of usize. That'll allow us to get rid of the conversions and simplify the code a bit.

I was switching to usize because the value we return should always fit into a usize type.
I'm not yet fully familiar with the rust strategy here. I've seen that in many places there is just the u64 used even though a usize would probably be more efficient. Especially when we consider 32bit systems as well.
So I can change it back to u64 as in the original implementation in head.
But if you could point me to a document which tells more about the general strategy, I would be glad.

  • I don't think the stdin check is really correct, because we might be piping in a file with a well-defined size. So, I think the check should not just be checking for -.

You are right. Thinking about it a bit more I came to the conclution that I can just remove that code as the function will anyway return the DEFAULT when the call to metadata() fails. And the call to metadata() should fail in case of dash input "-".

  • We should think about the naming of the functions and document all the public functions. Maybe sane_blksize? I'd love to get rid of the distinction between the path and metadata versions too, but if that's not possible this is alright.

When removing the special handling for "-", then the path variant will be purely for convenience and can be removed. Is this was you mean?

@tertsdiepraam
Copy link
Member

Great!

I think for code paths like these, u64 vs usize will barely make a difference in performance. The more important thing you mention is that it should indeed fit into usize, so a usize might make sense. However, generally my approach is to avoid conversions as much as possible. If an API from std returns u64, for example, they probably have good reason for that.

The solution for - works, I think, but I do think we need to put a comment on it to say that we can do even better. The thing is that we should actually try to get the metadata of stdin on -. This is a larger issue in the utils we need to think about though. I'd love to improve the handling of - across the utils and make it more consistent. Ideas on that are very welcome :)

I think the path version is adding something useful in the end, so let's keep it.

@cre4ture cre4ture force-pushed the fix/stabilize_android_build_memory_kill_split branch 5 times, most recently from 3cadde7 to 4832118 Compare February 5, 2024 20:49
@cre4ture cre4ture force-pushed the fix/stabilize_android_build_memory_kill_split branch 2 times, most recently from 8e51b86 to b4a038f Compare February 7, 2024 21:54
Copy link
Member

@tertsdiepraam tertsdiepraam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Just a few more questions

src/uucore/src/lib/features/fs.rs Show resolved Hide resolved
src/uu/split/src/split.rs Show resolved Hide resolved
@cre4ture cre4ture force-pushed the fix/stabilize_android_build_memory_kill_split branch from b4a038f to 7e2a965 Compare February 10, 2024 13:20
@cre4ture cre4ture force-pushed the fix/stabilize_android_build_memory_kill_split branch from 7e2a965 to e68312c Compare February 10, 2024 13:33
@cakebaker cakebaker merged commit 826cdbe into uutils:main Feb 12, 2024
53 of 59 checks passed
@cakebaker
Copy link
Contributor

Thanks!

@cre4ture cre4ture deleted the fix/stabilize_android_build_memory_kill_split branch February 17, 2024 09:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants