
switch to a more efficient vector/string representation #8981

Closed
thestinger opened this issue Sep 4, 2013 · 11 comments
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.

@thestinger
Contributor

Vectors and strings are currently represented as *{ length, capacity, data[] }. This means we have to handle trickier uint overflow cases, which increases code size a fair bit. It also means even an empty vector always requires a heap allocation.

C++ represents a vector as { length, capacity, *data }, so a fresh empty vector is simply a null data pointer. Unlike the OptVec optimization, this does not require an extra branch in every method: calling realloc on a null pointer is well-defined, so the growth path can ignore the special case.
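C's guarantee that realloc(NULL, n) behaves exactly like malloc(n) is what makes the null data pointer free. A quick check in modern Rust (not the 2013 dialect below), binding the platform C library directly, which std already links on Unix targets:

```rust
use std::ptr;

// Bind the platform C allocator; std links libc on Unix targets.
extern "C" {
    fn realloc(ptr: *mut u8, size: usize) -> *mut u8;
    fn free(ptr: *mut u8);
}

fn main() {
    unsafe {
        // realloc on a null pointer is defined to act like malloc,
        // so the grow path needs no empty-vector special case.
        let p = realloc(ptr::null_mut(), 16);
        assert!(!p.is_null());
        free(p);
    }
}
```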

This has the drawback of making a move a 3-word copy rather than a 1-word copy. However, passing unique vectors around by-move rather than as slices is not common enough for this to be a relevant performance issue.

Benchmarks on x86_64 Linux (linux 3.11.5, glibc 2.18):

| Comparison | ~[T] | Proposal |
| --- | --- | --- |
| Empty | 29 ns/iter (+/- 1) | 2 ns/iter (+/- 0) |
| Push 10k, starting at 16 capacity | 45528 ns/iter (+/- 2584) | 7611 ns/iter (+/- 407) |
| Push 10k, starting at 10k capacity | 43783 ns/iter (+/- 1429) | 6876 ns/iter (+/- 320) |
| Size | size_of::<uint> | 3 * size_of::<uint> |
| Empty heap allocation size | minimum 2 * size_of::<uint> | n/a |
| 4 element heap allocation size | 2 * size_of::<uint> + 4 * size_of::<T> | 4 * size_of::<T> |

Note that since jemalloc is very clever, at some sizes preallocating has next to no advantage for the proposed vector either.
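(For what it's worth, the { length, capacity, pointer } layout is exactly what Rust's Vec eventually shipped with; a quick size check in modern Rust, not the 2013 dialect used elsewhere in this issue:)

```rust
use std::mem::size_of;

fn main() {
    // { length, capacity, pointer } is three words on the stack,
    // versus one word for the old *{ length, capacity, data[] } box.
    assert_eq!(size_of::<Vec<u8>>(), 3 * size_of::<usize>());
}
```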

Implementation:

#[allow(ctypes)];

#[cfg(test)]
extern mod extra;

use std::mem::size_of;
use std::cast::transmute;
use std::unstable::raw::Slice;
use std::unstable::intrinsics::move_val_init;
use std::ptr::read_ptr;

// redefine these to use `*mut` ...

extern {
    fn malloc(size: uint) -> *mut u8;
    fn realloc(ptr: *mut u8, size: uint) -> *mut u8;
    fn free(ptr: *mut u8);
    fn abort() -> !;
}

#[fixed_stack_segment]
unsafe fn malloc_raw(size: uint) -> *mut u8 {
    let ptr = malloc(size);
    if ptr.is_null() {
        abort()
    }
    ptr
}

#[fixed_stack_segment]
unsafe fn realloc_raw(ptr: *mut u8, size: uint) -> *mut u8 {
    let ptr = realloc(ptr, size);
    if ptr.is_null() {
        abort()
    }
    ptr
}

#[fixed_stack_segment]
unsafe fn free_raw(ptr: *mut u8) {
    free(ptr)
}

struct Vec<T> {
    priv len: uint,
    priv cap: uint,
    priv ptr: *mut T
}

impl<T> Vec<T> {
    #[inline]
    fn new() -> Vec<T> {
        Vec { len: 0, cap: 0, ptr: 0 as *mut T }
    }

    #[inline]
    fn with_capacity(capacity: uint) -> Vec<T> {
        let size = capacity.checked_mul(&size_of::<T>()).unwrap();
        unsafe {
            Vec { len: 0, cap: capacity, ptr: malloc_raw(size) as *mut T }
        }
    }

    #[inline]
    fn shrink_to_fit(&mut self) {
        unsafe {
            self.cap = self.len;
            if self.len == 0 {
                free_raw(self.ptr as *mut u8);
                self.ptr = 0 as *mut T;
            } else {
                self.ptr = realloc_raw(self.ptr as *mut u8, self.cap * size_of::<T>()) as *mut T;
            }
        }
    }

    #[inline]
    fn push(&mut self, value: T) {
        unsafe {
            if self.len == self.cap {
                if self.cap == 0 { self.cap += 2 }
                let old_size = self.cap * size_of::<T>();
                self.cap = self.cap * 2;
                let size = old_size.checked_mul(&2).unwrap();
                self.ptr = realloc_raw(self.ptr as *mut u8, size) as *mut T;
            }

            let end = self.ptr.offset(self.len as int);
            move_val_init(&mut *end, value);
            self.len += 1;
        }
    }

    #[inline]
    fn as_slice<'r>(&'r self) -> &'r [T] {
        let slice = Slice { data: self.ptr as *T, len: self.len };
        unsafe { transmute(slice) }
    }

    #[inline]
    fn as_mut_slice<'r>(&'r mut self) -> &'r mut [T] {
        let slice = Slice { data: self.ptr as *T, len: self.len };
        unsafe { transmute(slice) }
    }
}

#[unsafe_destructor]
impl<T> Drop for Vec<T> {
    fn drop(&mut self) {
        unsafe {
            for x in self.as_mut_slice().mut_iter() {
                read_ptr(x);
            }
            free_raw(self.ptr as *mut u8)
        }
    }
}

#[cfg(test)]
mod bench {
    use std;
    use super::Vec;
    use extra::test::BenchHarness;

    #[bench]
    fn empty(bh: &mut BenchHarness) {
        do bh.iter() {
            let _xs: Vec<int> = Vec::new();
        }
    }

    #[bench]
    fn empty_std(bh: &mut BenchHarness) {
        do bh.iter() {
            let _xs: ~[int] = ~[];
        }
    }

    static size: uint = 10000;

    #[bench]
    fn push(bh: &mut BenchHarness) {
        do bh.iter() {
            let mut xs = Vec::with_capacity(16);
            for i in range(0, size) {
                xs.push(i);
            }
        }
    }

    #[bench]
    fn push_std(bh: &mut BenchHarness) {
        do bh.iter() {
            let mut xs = std::vec::with_capacity(16);
            for i in range(0, size) {
                xs.push(i);
            }
        }
    }

    #[bench]
    fn push_preallocated(bh: &mut BenchHarness) {
        do bh.iter() {
            let mut xs = Vec::with_capacity(size);
            for i in range(0, size) {
                xs.push(i);
            }
        }
    }

    #[bench]
    fn push_preallocated_std(bh: &mut BenchHarness) {
        do bh.iter() {
            let mut xs = std::vec::with_capacity(size);
            for i in range(0, size) {
                xs.push(i);
            }
        }
    }
}

#[test]
fn push() {
    let mut xs = Vec::new();
    for x in std::iter::range_step(0, 100, 5) {
        xs.push(x)
    }
    let expected = std::iter::range_step(0, 100, 5).to_owned_vec();
    assert_eq!(xs.as_slice(), expected.as_slice());
}
@pnkfelix
Member

pnkfelix commented Sep 5, 2013

> It also means even an empty vector requires a heap allocation.

I do not see the fact that we are heap-allocating some storage when making empty vectors as a drawback, at least not a serious one (assuming that empty vectors always have non-zero capacity). When I see an expression constructing an empty vector, I expect some elements to be pushed into it in the (near?) future, so we are just eagerly doing some work that we would do eventually anyway.

(One exception to this that I could imagine would be a sparse collection backed by a vector of vectors, where most of its entries are empty.)

@thestinger
Contributor Author

The issue is that you don't have the option of creating an empty vector without a heap allocation at all, unless you add an extra branch to every operation (indexing, checking the size, etc.) like you end up doing with OptVec. It makes sense to pay that price for the benefits of a small-vector optimization, but not just for empty ones.
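For context, an OptVec-style representation has to branch on every operation because of the enum discriminant. A hypothetical sketch in modern Rust (OptVecSketch and its method are illustrative, not the actual OptVec code):

```rust
// Hypothetical small-vector-style wrapper: the enum forces a branch
// on every operation, which is the cost being discussed here.
enum OptVecSketch<T> {
    Empty,
    Full(Vec<T>),
}

impl<T> OptVecSketch<T> {
    fn len(&self) -> usize {
        match self {
            OptVecSketch::Empty => 0,         // extra branch, even just to read the length
            OptVecSketch::Full(v) => v.len(),
        }
    }
}

fn main() {
    let empty: OptVecSketch<i32> = OptVecSketch::Empty;
    let full = OptVecSketch::Full(vec![1, 2, 3]);
    assert_eq!(empty.len(), 0);
    assert_eq!(full.len(), 3);
}
```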

Before:

let mut xs = ~[]; // allocates space for 4 elements
xs.push(4);

After:

let mut xs = ~[]; // data pointer is null
xs.push(4); // allocates space for 4 elements

The allocation would just happen on the first push instead of at the declaration, so there wouldn't be a performance hit compared to the current situation. You always have the option of using with_capacity or reserve too.
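(This allocate-on-first-push behavior is exactly what Rust's Vec does today; a minimal check in modern Rust:)

```rust
fn main() {
    // Vec::new does not touch the heap: capacity starts at zero.
    let mut xs: Vec<i32> = Vec::new();
    assert_eq!(xs.capacity(), 0);

    // The allocation happens on the first push instead.
    xs.push(4);
    assert!(xs.capacity() >= 1);
}
```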

@pnkfelix
Member

pnkfelix commented Sep 5, 2013

@thestinger When you say "empty vector", you here mean "zero-capacity vector", right?

(update: okay, a later revision to thestinger's comment makes it clear that is what he means.)

@pnkfelix
Member

pnkfelix commented Sep 5, 2013

@thestinger I understand your example, and it matches my understanding from when I wrote my original comment. When I wrote "empty vector", I mean zero-length vectors. I was trying to say "I do not see forcing non-zero-capacity on zero-length vectors as a drawback, because I expect zero-length vectors to be non-zero-length in the near future." (And then I noted one exception where I could see a case motivating support for zero-capacity vectors.)

I guess all I was saying was that I wanted us to apply appropriate weights when doing a cost/benefit analysis of the change you are proposing.

@nikomatsakis
Contributor

This is somewhat orthogonal to DST since we have already decided that there would be special representations for &[] and &str as compared to &T for some other T. This is ok because of monomorphization, though it does mean that one cannot cast a DST like [] to an object (that is &[] cannot be cast to an object &Foo even if [] implements Foo). The problem there is that objects must be manipulable without knowing the precise receiver type, so they must always be a single pointer.

However, thinking on this a bit harder, if we could preserve the invariant that ~T is always 1 word except for objects (that is, ~Foo is two words), we might be able to lift the object restriction, though it'd require some cleverness. The idea is that if you have a type like &[] that you cast to &Foo, we could actually create a temporary and store a pointer to the temporary. This is ok if the lifetime of the &Foo is short enough. Actually that's pretty yucky, because it would mean that you couldn't cast an &'a [] to a &'a Foo for all 'a, since some of those lifetimes might exceed the current stack frame. Moreover it'd require some tricky footwork in trans to deal with the extra deref. So scratch that idea. Rather I'm inclined to say that casting pointer-T to an object requires that T be sized.

Given that restriction, I believe we are free to represent pointer-DST in whatever ways we prefer.

@lilyball
Contributor

@pnkfelix I have written many functions that only sometimes need a vector, and sometimes don't. It's annoying to take the perf hit of heap allocation for the cases where I end up not needing the vector. A common case here is if I need to normalize data, but I want to avoid doing work in the case where the data is already normalized.

@thestinger
Contributor Author

There's no harm in doing the allocation at the first push instead of at initialization. I've verified that the only drop in efficiency is a call to realloc instead of malloc, with jemalloc branching on null immediately. LLVM could be taught to replace realloc calls on null pointers with malloc calls, but the difference isn't measurable.

@thestinger
Contributor Author

I updated the numbers based on the removal of jemalloc. They are likely much more platform-specific now.

@alexcrichton
Member

The consensus from today's meeting is that we should change the representation of owned vectors. We may possibly want to change the representation of slices as well, but that'll be an issue to tackle after this one.

@pnkfelix
Member

pnkfelix commented Nov 7, 2013

accepted for P-backcompat-lang for 1.0

@thestinger
Contributor Author

Superseded by #11875.

flip1995 pushed a commit to flip1995/rust that referenced this issue Jun 30, 2022
…nt_drop_lint, r=flip1995

Add details about how significant drop in match scrutinees can cause deadlocks

Adds more details about how a significant drop in a match scrutinee can cause a deadlock and include link to documentation.

changelog: Add more details to significant drop lint to explicitly show how temporaries in match scrutinees can cause deadlocks.