Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix using System.retry_with_buffer with stack buffer #14615

Conversation

straight-shoota
Copy link
Member

@straight-shoota straight-shoota commented May 23, 2024

retry_with_buffer uses a stack allocated buffer for the happy case of small allocations. But this means the buffer is only valid inside the block. Outside of it, the stack memory may be corrupted because it can get overwritten by other data.

All call sites of retry_with_buffer were not paying attention to this. In all cases the buffer is used as scratch space to put dynamically sized properties of a struct. This data is later used to construct a String instance. This needs to happen within the block of retry_with_buffer to ensure the buffer is valid.

This error hasn't been noticed until I tried a little refactor in #14614. I suppose the code usually still works fine because there are no further calls which could override the stack data.
It only failed in the interpreter, which I guess does something differently. May this need further investigation?

Copy link
Contributor

@ysbaddaden ysbaddaden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn you, dangling pointers 🙈

@BlobCodes
Copy link
Contributor

It only failed in the interpreter, which I guess does something differently. May this need further investigation?

Blocks are inlined in the normal backend, so the StaticArray is on the same stack frame and still valid.

In the interpreter, blocks are an own primitive which seem to behave like a call (variables are instantly discarded on return).


We can also see this with the following code:

def a(&)
  var = 0                      
  yield
  puts "a: #{pointerof(var)}"
end

def b
  var = 0
  puts "b: #{pointerof(var)}"
end

def c
  var = 0
  puts "c: #{pointerof(var)}"

  a {}
  a {}
  b
  b
end

c

When compiling this with LLVM, I get the following results (without optimization):

c: Pointer(Int32)@0x7ffe76f15b84
a: Pointer(Int32)@0x7ffe76f15b80
a: Pointer(Int32)@0x7ffe76f15b7c
b: Pointer(Int32)@0x7ffe76f15b64
b: Pointer(Int32)@0x7ffe76f15b64

But on the interpreter, I get the following:

c: Pointer(Int32)@0x7f68e1400000
a: Pointer(Int32)@0x7f68e1400008
a: Pointer(Int32)@0x7f68e1400008
b: Pointer(Int32)@0x7f68e1400008
b: Pointer(Int32)@0x7f68e1400008

Besides the fact that normal stacks grow down while the interpreter's stack grows up, we can see that the variables defined in all calls of c and b overlap in the interpreter, while in llvm only the normal calls overlap.

@asterite
Copy link
Member

Right, blocks are implemented differently in the interpreter (they work more like how Ruby works)

@beta-ziliani
Copy link
Member

So the real fix should be in the interpreter, right?

@BlobCodes
Copy link
Contributor

So the real fix should be in the interpreter, right?

I think it is reasonable to assume that variables defined in a yielding method may not be available anymore once we're not inside that block anymore (undefined behaviour).

Otherwise, we'd also have to manually inline everything annotated with @[AlwaysInline] in the interpreter. I think the added conplexity just isn't worth it for this small corner case.

@straight-shoota
Copy link
Member Author

@beta-ziliani This is the real fix. At least for this bug. buf must not be expected to be valid outside the block scope.

It's just incidental that the interpreter behaviour helped us notice this bug. It stays a bug even if the behaviour of the interpreter changed.
Whether we should align block execution in the interpreter to a behaviour more similar to compiled code is a different discussion, though.

@beta-ziliani
Copy link
Member

Yes, I see that now; I misunderstand the problem. I agree with @BlobCodes that this falls within the undefined behavior territory, so alignment between the compiler and interpreter is not a goal.

@beta-ziliani beta-ziliani added this to the 1.13.0 milestone May 28, 2024
@straight-shoota straight-shoota merged commit 6244f33 into crystal-lang:master May 29, 2024
61 checks passed
@straight-shoota straight-shoota deleted the fix/retry_with_buffer-stack_buffer branch May 29, 2024 13:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants