Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

stream: switch _writableState.buffer to queue #8826

Conversation

chrisdickinson
Copy link

In cases where many small writes are made to a stream lacking _writev, the array data structure backing the WriteReq buffer would greatly increase GC pressure.

Specifically, in the fs.WriteStream case, the clearBuffer routine would only clear a single WriteReq from the buffer before exiting, but would cause the entire backing array to be GC'd. Switching to [].shift lessened pressure, but still the bulk of the time was spent in memcpy.

This replaces that structure with a linked list-backed queue so that adding and removing from the queue is O(1). In the _writev case, collecting the buffer requires an O(N) loop over the buffer, but that was already being performed to collect callbacks, so slowdown should be neglible.

Example code (courtesy @hayes, who originally discovered this slowdown while dealing with log output to bunyan):

var fs = require('fs')
var stream = fs.createWriteStream('/dev/null', {flags: 'w+'})
var buf = new Buffer('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n', 'utf8')
var start = Date.now()
var written = 0
function log() {
  for(var i = 0; i < 1000; i++) {
    ++written
    stream.write(buf, function() {
      --written
    })
  }
  if(Date.now() - start < 1000) {
    return setImmediate(log) 
  }
  setInterval(function() {
    console.log(written, process.memoryUsage())
  }, 1000).unref()
  console.log('done writing')
}
log()

before the change:

before

array.shift()

(this replaces the common c=1 slice case with [].shift()):

before

as queue:

after

I'm working on getting a better "after" flamegraph at the moment.

As a measurable effect: the current version clears ~300 or so WriteReqs/second on my machine; the queue version clears ~50k WriteReqs/second.

cc @wraithan

R=@trevnorris @misterdjules

In cases where many small writes are made to a stream
lacking _writev, the array data structure backing the
WriteReq buffer would greatly increase GC pressure.

Specifically, in the fs.WriteStream case, the
clearBuffer routine would only clear a single WriteReq
from the buffer before exiting, but would cause the
entire backing array to be GC'd. Switching to [].shift
lessened pressure, but still the bulk of the time was
spent in memcpy.

This replaces that structure with a linked list-backed
queue so that adding and removing from the queue is O(1).
In the _writev case, collecting the buffer requires an
O(N) loop over the buffer, but that was already being
performed to collect callbacks, so slowdown should be
neglible.
get: util.deprecate(function() {
return this.getBuffer();
}, '_writableState.buffer is deprecated. Use _writableState.getBuffer() instead.')
});

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is internal/private/undocumented state, I'm not sure we need to do this deprecation warning as we've never really explained to people how to interact with this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool -- I was mostly concerned with supporting projects like vstream that reach into ReadableState for metrics. I figured that if there's one I know about, there's probably more that I don't know about :)

@tjfontaine
Copy link

This looks great, I would love to also get a proposal on what it means for streams consumers to inspect the undocumented readable/writable state, and work up an API around that.

@trevnorris
Copy link

I like how you did this, and I'm cool with this being merged.

chrisdickinson added a commit that referenced this pull request Dec 18, 2014
In cases where many small writes are made to a stream
lacking _writev, the array data structure backing the
WriteReq buffer would greatly increase GC pressure.

Specifically, in the fs.WriteStream case, the
clearBuffer routine would only clear a single WriteReq
from the buffer before exiting, but would cause the
entire backing array to be GC'd. Switching to [].shift
lessened pressure, but still the bulk of the time was
spent in memcpy.

This replaces that structure with a linked list-backed
queue so that adding and removing from the queue is O(1).
In the _writev case, collecting the buffer requires an
O(N) loop over the buffer, but that was already being
performed to collect callbacks, so slowdown should be
neglible.

PR-URL: #8826
Reviewed-by: Timothy J Fontaine <tjfontaine@gmail.com>
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
@chrisdickinson
Copy link
Author

Merged in 9158666.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants