Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduce a utility function for converting Vec<T> to Vec<u8> #2898

Closed

Conversation

broccoliSpicy
Copy link
Contributor

@broccoliSpicy broccoliSpicy commented Sep 17, 2024

introduce a utility function for converting Vec to Vec, without copying, inspired by https://github.com/nabijaczleweli/safe-transmute-rs/blob/727ad53d29ae952da9ceb52a214af325cc49979b/src/to_bytes.rs#L235

I need this to convert Vec types like Vec<u16>, Vec<u32>, Vec<u64> to LanceBuffer

some doubts:
should I introduce a panic in transmute_to_bytes_vec when the machine is not little-endian?

It feels scary to do memory management in Rust, std::mem::forget(from);

@github-actions github-actions bot added the enhancement New feature or request label Sep 17, 2024
Copy link
Contributor

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this is safe. According to the rust docs:

T needs to have the same alignment as what ptr was allocated with. (T having a less strict alignment is not sufficient, the alignment really needs to be equal to satisfy the dealloc requirement that memory must be allocated and deallocated with the same layout.)

In other words. The following sequence of events can happen, and is unsafe:

  • User allocates Vec<u32> with an initial capacity of 10. This allocates a 40-byte buffer with a 4-byte alignment layout.
  • User transmutes into Vec<u8>
  • User adds a bunch more bytes. This causes the Vec to reallocate. This causes the Vec to deallocate the 40-byte buffer. The Vec passes a layout with 1-byte alignment to the deallocate method. This is unsafe because the layout does not match the layout originally used to allocate the buffer.

If you need a read-only buffer of u8 then you can use LanceBuffer::reinterpret_vec. There is no way to safely convert from Vec<T> to Vec<u8> (although I do believe you will get away with it 99% of the time).

@westonpace
Copy link
Contributor

Also, if you need a mutable byte buffer, that you sometimes need to read as &[T] then you can use LanceBuffer::Owned and LanceBuffer::borrow_to_typed_slice to get a &[T] view of the buffer when you need it.

@broccoliSpicy
Copy link
Contributor Author

the motivation here is that I want to Construct a new LanceBuffer from types other than Vec<u8>, like LanceBuffer::from(Vec<u16>)

@westonpace
Copy link
Contributor

the motivation here is that I want to Construct a new LanceBuffer from types other than Vec, like LanceBuffer::from(Vec)

Ah, you should be able to use LanceBuffer::reinterpret_vec for that.

@broccoliSpicy
Copy link
Contributor Author

the motivation here is that I want to Construct a new LanceBuffer from types other than Vec, like LanceBuffer::from(Vec)

Ah, you should be able to use LanceBuffer::reinterpret_vec for that.

Yeah, I can do Vec<u16> to arrow UInt16 Array then LanceBuffer, Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants