Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make BytesStreamOutput more efficient #5318

Closed
wants to merge 6 commits into from
Closed

Make BytesStreamOutput more efficient #5318

wants to merge 6 commits into from

Conversation

hhoffstaette
Copy link

Makes BytesStreamOutput more efficient by introducing internal paging via a (currently private) non-recycling PageCacheRecycler. This change brought to light a bug in FsTranslog, which used seek() incorrectly/ambiguously; this is fixed so that it works with both the old and new implementation.

Fix for bug #5159

via a (currently private) non-recycling PageCacheRecycler. This change
brought to light a bug in FsTranslog, which used seek()
incorrectly/ambiguously; this is fixed so that it works with both the
old and new implementation.

Fix for bug #5159
// acquire all other requested pages up front if expectedSize > pageSize
if (expectedSize > pageSize) {
ensureCapacity(expectedSize);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if there is a benefit in pre-allocating those pages compared to requesting them when necessary?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are only preallocating (pre-requesting) when the initial capacity is given by the client. By default we only acquire a single page to get started. Strictly speaking it's not necessary, but clients who announce a capacity goal are expected to use it. Also, once we actually start using a real recycler, this will reduce contention (all pages for one acquire).

@jpountz
Copy link
Contributor

jpountz commented Mar 3, 2014

Part of the code looks very similar to what we have in BigByteArray so I'm wondering if we could somehow share some code between these classes?

byte[] page = pages.get(count / pageSize).v();
int offset = count % pageSize;
page[offset] = b;
count++;
}

@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When doing many small writes, it may help to keep a reference on the current page to not have to recompute it every time?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point - for now I was more concerned about getting it right every time, but adding a currentPage pointer should not be too difficult. Since we will likely revisit this code soon I'd like to keep this as it is for now.

@jpountz
Copy link
Contributor

jpountz commented Mar 3, 2014

I just discussed with @kimchy about this change and it might make sense to use BigArrays internally in order to implement paging, this should make things easier to implement as it already has the logic to do writes that may span across pages, and there is a BigArrays.NON_RECYCLING_INSTANCE static variable that would remove the need for NonePageCacheRecyclerService. @hhoffstaette what do you think?

@hhoffstaette hhoffstaette deleted the bytestream branch March 4, 2014 11:41
jakelandis added a commit that referenced this pull request Mar 5, 2020
This commit fixes ensures that for external builds
(e.g. plugin development) that the REST tests that are
copied are properly filtered to only include the API
by default.

The code prior to this change resulted in including both
the API and tests since the copy.include resulted as an
empty list by default since the stream is empty unless
explicitly configured.

related #52114
fixes #53183
jakelandis added a commit that referenced this pull request Mar 5, 2020
This commit fixes ensures that for external builds
(e.g. plugin development) that the REST tests that are
copied are properly filtered to only include the API
by default.

The code prior to this change resulted in including both
the API and tests since the copy.include resulted as an
empty list by default since the stream is empty unless
explicitly configured.

related #52114
fixes #53183
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants