RangeError: Invalid array length #208

Open
AmitMY opened this issue Jun 14, 2022 · 7 comments
Comments

AmitMY commented Jun 14, 2022

I am trying to parse a float array.
Normally this code works, but I now have one huge file (400 MB) that I want to start reading.

    const dataParser = new Parser()
        .array("data", {
            type: "floatle",
            length: dataLength // 82,272,642
        })
        .saveOffset('dataLength');

    const data = dataParser.parse(buffer);

As you can see, I am trying to parse an array with 82 million entries, which is less than the 2147483647 limit in JavaScript. However, I am getting the following error:

Uncaught (in promise) RangeError: Invalid array length
at Array.push ()
at Parser.eval [as compiled]

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Errors/Invalid_array_length

Additional information

Manual experimentation finds that the limit is somewhere between 50,100,000 and 50,500,000.

(related to sign/translate#44)

Owner

keichi commented Jun 14, 2022

Thanks for reporting. I've never really created such huge arrays in JS.

Could you test if you can create an array with the same size (82 million elements) w/o using binary-parser?
Your parser compiles to the following code, but I don't see anything suspicious that would bloat the memory footprint beyond what is expected.

var dataView = new DataView(buffer.buffer, buffer.byteOffset, buffer.length);
var offset = 0;
var vars = {};

vars.data = [];
for (var $tmp0 = 82272642; $tmp0 > 0; $tmp0--) {
    var $tmp1 = dataView.getFloat32(offset, true);
    offset += 4;
    vars.data.push($tmp1);
}
vars.dataLength = offset

return vars;

Author

AmitMY commented Jun 14, 2022

This fails too, on the vars.data.push line.
If I add a log before that push, I see that vars.data.length is 50139473 when it fails.

Code that works:

    data.data = new Float32Array(82272642);
    for (var $tmp0 = 0; $tmp0 < 82272642; $tmp0++) {
        var $tmp1 = dataView.getFloat32(offset, true);
        offset += 4;
        data.data[$tmp0] = $tmp1
    }
    data.dataLength = offset

If I initialize the necessary Float32Array up front, it works fine. It also never needs to realloc.

By the way, this method is 8 times faster than the regular parsing for an array of 27,424,214 elements.
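
A minimal sketch of the two strategies compared above, for anyone who wants to reproduce the difference on a small input. The helper names `readFloatsWithPush` and `readFloatsPreallocated` are hypothetical and not part of binary-parser; the first mirrors the compiled loop quoted earlier, and the second mirrors the preallocated `Float32Array` variant. Both assume little-endian float32 data.

```javascript
// Hypothetical helpers for illustration; not part of binary-parser.

// Mirrors the compiled loop: grows a plain Array one push at a time.
function readFloatsWithPush(bytes, count) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = [];
  for (let i = 0; i < count; i++) {
    out.push(view.getFloat32(i * 4, true)); // true = little-endian
  }
  return out;
}

// Mirrors the workaround: allocates the typed array once, then fills it.
function readFloatsPreallocated(bytes, count) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Float32Array(count);
  for (let i = 0; i < count; i++) {
    out[i] = view.getFloat32(i * 4, true);
  }
  return out;
}

// Small demo input: three little-endian floats written through a DataView.
const demoBuffer = new ArrayBuffer(12);
const demoWriter = new DataView(demoBuffer);
[1.5, -2.25, 3].forEach((value, i) => demoWriter.setFloat32(i * 4, value, true));
const demoBytes = new Uint8Array(demoBuffer);

console.log(readFloatsWithPush(demoBytes, 3));
console.log(readFloatsPreallocated(demoBytes, 3));
```

Both functions return the same values for small inputs; they differ only in how the output storage is allocated, which is what matters at tens of millions of elements.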

Owner

keichi commented Jun 14, 2022

Ok, that makes sense. Reallocs are definitely an overhead, and I guess typed arrays are more compact than normal arrays. But this approach would only work for fixed-length arrays of primitive types. Is that what you are parsing?

Author

AmitMY commented Jun 14, 2022

Yes, the largest arrays that I parse are indeed of fixed sizes (as in, I specify length to be parsed).
The normal behavior is still good for short arrays, I'd imagine.

Owner

keichi commented Jun 14, 2022

It turns out you can directly create a Float32Array from an ArrayBuffer (zero copy).
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Float32Array/Float32Array

Can you try if the following works?

const dataParser = new Parser()
    .buffer("data", {
        length: 82272642 * 4, // length in bytes
        formatter: (buf) => new Float32Array(buf.buffer) // buf is a DataView
    })
    .saveOffset('dataLength');

Author

AmitMY commented Jun 16, 2022

Sorry, I missed your previous message.

If I do what you wrote and add a console.log:

            formatter: (buf) => {
                console.log(buf);
                return new Float32Array(buf.buffer)
            } 

In the console I see that buf is a Uint8Array, followed by an error:

Uncaught (in promise) RangeError: byte length of Float32Array should be a multiple of 4

because buf.buffer has an odd number of bytes

(For completeness' sake: this is not the original large file, just a small-scale test using length: 26578 * 4 and a seek of 1925 to get to the right place in the file.)
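
The error above appears consistent with `buf` being a view into a larger buffer: `buf.buffer` is the whole underlying `ArrayBuffer`, not just the bytes that `buf` covers, so its length need not be a multiple of 4. A possible workaround, sketched below, is to pass the view's own `byteOffset` and length to the `Float32Array` constructor, and to fall back to a copy when the offset is not 4-byte aligned (as with a seek of 1925). The helper name `toFloat32Array` is hypothetical, and the sketch assumes a little-endian platform, since `Float32Array` reads in the platform's byte order.

```javascript
// Hypothetical helper for illustration; not part of binary-parser.
// Assumes `bytes` is a Uint8Array (or Node.js Buffer) of little-endian
// float32 values and that the platform is little-endian.
function toFloat32Array(bytes) {
  if (bytes.byteLength % 4 !== 0) {
    throw new RangeError("expected a byte length that is a multiple of 4");
  }
  const count = bytes.byteLength / 4;
  if (bytes.byteOffset % 4 === 0) {
    // Aligned: zero-copy view over only the bytes of interest.
    return new Float32Array(bytes.buffer, bytes.byteOffset, count);
  }
  // Unaligned: copy into a fresh buffer, which starts at offset 0.
  const copy = new Uint8Array(bytes);
  return new Float32Array(copy.buffer);
}

// Demo: two floats stored at an unaligned offset inside a larger buffer.
const backing = new ArrayBuffer(12);
const writer = new DataView(backing);
writer.setFloat32(1, 1.5, true);
writer.setFloat32(5, -2.25, true);
const unalignedFloats = toFloat32Array(new Uint8Array(backing, 1, 8));
console.log(Array.from(unalignedFloats));

// Demo: the same data at an aligned offset shares the original buffer.
const alignedBacking = new ArrayBuffer(16);
new DataView(alignedBacking).setFloat32(4, 1.5, true);
const alignedFloats = toFloat32Array(new Uint8Array(alignedBacking, 4, 8));
console.log(alignedFloats.buffer === alignedBacking);
```

In the parser definition from the previous comment, this would presumably be used as `formatter: (buf) => toFloat32Array(buf)`; that combination has not been tested in this thread.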

Author

AmitMY commented Jun 24, 2022

Hi @keichi
Is there any plan to support this in the library?
If not, I'll use the custom solution, but it would be nice to at least catch this type of error and point people to this issue or to a fix.
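
One way the suggested error handling could look in user code is sketched below. The wrapper `parseWithLargeArrayHint` is hypothetical and not a binary-parser API; it assumes the engine reports the failure as a `RangeError` whose message contains "invalid array length" (the exact wording may vary between engines).

```javascript
// Hypothetical wrapper for illustration; not part of binary-parser.
// `parseFn` is any function that parses `input`, for example
// (buf) => dataParser.parse(buf).
function parseWithLargeArrayHint(parseFn, input) {
  try {
    return parseFn(input);
  } catch (err) {
    // Assumption: the engine words the error as "Invalid array length".
    if (err instanceof RangeError && /invalid array length/i.test(err.message)) {
      const hinted = new RangeError(
        err.message +
          " (hint: for very large fixed-length numeric arrays, consider" +
          " reading into a preallocated Float32Array instead)"
      );
      hinted.cause = err; // keep the original error for debugging
      throw hinted;
    }
    throw err; // unrelated errors pass through unchanged
  }
}
```

A call such as `parseWithLargeArrayHint((buf) => dataParser.parse(buf), buffer)` would then surface the hint while leaving all other behavior unchanged.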


I tried to write a test for it, but the test passes, so I think contributing a fix here is out of my league:

describe('Large arrays', () => {
  it('should parse large array without error', () => {
    const length = 80_000_000;
    const array = Buffer.from(new Float32Array(length).fill(0).buffer);

    const parser = new Parser()
      .array("data", {
        type: "floatle",
        length
      });

    const buffer = factory(array);
    doesNotThrow(() => parser.parse(buffer));
  });
});
