buffer: fix DoS vector in atob #51670

Closed
65 changes: 34 additions & 31 deletions lib/buffer.js
@@ -23,10 +23,8 @@

 const {
   Array,
-  ArrayFrom,
   ArrayIsArray,
   ArrayPrototypeForEach,
-  ArrayPrototypeIndexOf,
   MathFloor,
   MathMin,
   MathTrunc,
@@ -1255,35 +1253,41 @@ function btoa(input) {
     throw new ERR_MISSING_ARGS('input');
   }
   input = `${input}`;
+  let acc = 0;
   for (let n = 0; n < input.length; n++) {
-    if (input[n].charCodeAt(0) > 0xff)
-      throw lazyDOMException('Invalid character', 'InvalidCharacterError');
+    acc |= StringPrototypeCharCodeAt(input, n);
   }
+  if (acc & ~0xff) {
+    throw lazyDOMException('Invalid character', 'InvalidCharacterError');
+  }
   const buf = Buffer.from(input, 'latin1');
   return buf.toString('base64');
 }

 // Refs: https://infra.spec.whatwg.org/#forgiving-base64-decode
+//       https://infra.spec.whatwg.org/#ascii-whitespace
+// Valid Characters: [\t\n\f\r +/0-9=A-Za-z]
+// Lookup table (-1 = invalid, 0 = valid)
Member

Why did you select -1 and 0, instead of 0 and 1?

Contributor Author

See my comment for line 1308.
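
As an illustration of the -1/0 convention discussed here, the following is a minimal sketch (not code from the PR) that derives an equivalent 128-entry table from the valid-character list quoted in the diff comment. The names `VALID_CHARS` and `buildAllowedTable` are hypothetical and introduced only for this example.

```javascript
// Hypothetical helper, not part of the PR: derive the lookup table from
// the valid-character list [\t\n\f\r +/0-9=A-Za-z] quoted in the diff.
const VALID_CHARS =
  '\t\n\f\r +/=' +
  '0123456789' +
  'ABCDEFGHIJKLMNOPQRSTUVWXYZ' +
  'abcdefghijklmnopqrstuvwxyz';

function buildAllowedTable() {
  // Start with every 7-bit code marked invalid (-1)...
  const table = new Array(128).fill(-1);
  // ...then mark each allowed character as valid (0).
  for (const c of VALID_CHARS) {
    table[c.charCodeAt(0)] = 0;
  }
  return table;
}

const table = buildAllowedTable();
console.log(table[0x3d]); // entry for '='
console.log(table[0x0b]); // entry for vertical tab, which is not allowed
```

In 32-bit integer arithmetic -1 has every bit set and 0 has none, so an entry can be OR'd directly with the character code and tested with a single mask. A 0/1 table would need a separate comparison for the table value.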

+/* eslint-disable no-multi-spaces, indent */
 const kForgivingBase64AllowedChars = [
-  // ASCII whitespace
-  // Refs: https://infra.spec.whatwg.org/#ascii-whitespace
-  0x09, 0x0A, 0x0C, 0x0D, 0x20,
-
-  // Uppercase letters
-  ...ArrayFrom({ length: 26 }, (_, i) => StringPrototypeCharCodeAt('A') + i),
-
-  // Lowercase letters
-  ...ArrayFrom({ length: 26 }, (_, i) => StringPrototypeCharCodeAt('a') + i),
-
-  // Decimal digits
-  ...ArrayFrom({ length: 10 }, (_, i) => StringPrototypeCharCodeAt('0') + i),
-
-  0x2B, // +
-  0x2F, // /
-  0x3D, // =
+  -1, -1, -1, -1, -1, -1, -1, -1,
+  -1,  0,  0, -1,  0,  0, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1, -1, -1, -1, -1, -1,
+   0, -1, -1, -1, -1, -1, -1, -1,
+  -1, -1, -1,  0, -1, -1, -1,  0,
+   0,  0,  0,  0,  0,  0,  0,  0,
+   0,  0, -1, -1, -1,  0, -1, -1,
+  -1,  0,  0,  0,  0,  0,  0,  0,
+   0,  0,  0,  0,  0,  0,  0,  0,
+   0,  0,  0,  0,  0,  0,  0,  0,
+   0,  0,  0, -1, -1, -1, -1, -1,
+  -1,  0,  0,  0,  0,  0,  0,  0,
+   0,  0,  0,  0,  0,  0,  0,  0,
+   0,  0,  0,  0,  0,  0,  0,  0,
+   0,  0,  0, -1, -1, -1, -1, -1,
 ];
-const kEqualSignIndex = ArrayPrototypeIndexOf(kForgivingBase64AllowedChars,
-                                              0x3D);
+/* eslint-enable no-multi-spaces, indent */

 function atob(input) {
   // The implementation here has not been performance optimized in any way and
Member

This might need to be revised.

  // The implementation here has not been performance optimized in any way and

@@ -1298,16 +1302,17 @@ function atob(input) {
   let equalCharCount = 0;

   for (let n = 0; n < input.length; n++) {
-    const index = ArrayPrototypeIndexOf(
-      kForgivingBase64AllowedChars,
-      StringPrototypeCharCodeAt(input, n));
+    const ch = StringPrototypeCharCodeAt(input, n);
+    const val = kForgivingBase64AllowedChars[ch & 0x7f];

-    if (index > 4) {
-      // The first 5 elements of `kForgivingBase64AllowedChars` are
-      // ASCII whitespace char codes.
+    if ((ch | val) & ~0x7f) {
Member

Can you add a comment to what this line is doing?

Contributor Author

@chjj chjj Feb 6, 2024

Yeah, will do. It's some non-obvious bitwise magic. It's ensuring two things: that ch is ASCII and is a valid index of kForgivingBase64AllowedChars, and that val is not -1 (i.e. ch is a valid base64 character).

This line could actually be OR'd onto an accumulator and checked after the loop is complete to remove a branch, but I figure people here would be opposed to it due to the "no optimization of legacy functions" philosophy.

On that note, I was originally considering making this loop entirely branchless, but I'm guessing that definitely wouldn't be acceptable. It would also probably require a whole series of comments.

Member

I understand. I strongly think that we should add comments to the code as well.
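
For readers following this thread, here is a commented sketch of what the check does, along with the accumulator variant the author mentions. It is an illustration under stated assumptions, not the PR's code: `allowed` is a stand-in for `kForgivingBase64AllowedChars`, and `isInvalid` and `hasInvalidChar` are hypothetical helper names.

```javascript
// Stand-in for the PR's table: -1 = invalid, 0 = valid.
const allowed = new Array(128).fill(-1);
for (const c of '\t\n\f\r +/=0123456789' +
                'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz') {
  allowed[c.charCodeAt(0)] = 0;
}

// Hypothetical helper mirroring the reviewed line.
function isInvalid(ch) {
  // Masking with 0x7f keeps the index inside the 128-entry table.
  const val = allowed[ch & 0x7f];
  // ~0x7f selects every bit above the low seven.
  // - ch >= 0x80 has at least one of those bits set: non-ASCII is rejected.
  // - val === -1 has all bits set: ASCII outside the allowed set is rejected.
  // - Otherwise ch <= 0x7f and val === 0, so the result is 0.
  return ((ch | val) & ~0x7f) !== 0;
}

// Accumulator variant: OR the per-character results together and test
// once after the loop instead of branching inside it.
function hasInvalidChar(input) {
  let acc = 0;
  for (let n = 0; n < input.length; n++) {
    const ch = input.charCodeAt(n);
    acc |= (ch | allowed[ch & 0x7f]) & ~0x7f;
  }
  return acc !== 0;
}

console.log(isInvalid(0x41));   // 'A'
console.log(isInvalid(0x141));  // U+0141, which shares its low 7 bits with 'A'
console.log(hasInvalidChar('aGk='));
```

The `ch |` half of the expression matters because the mask alone would let a code unit such as U+0141 reuse the table entry for `A`. Note that `hasInvalidChar` only covers the character-set check; the `=` placement rules in `atob` are separate.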

+      throw lazyDOMException('Invalid character', 'InvalidCharacterError');
+    }
+
+    if (ch > 0x20) {
Member

Can you add a comment to here?
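
One way to read the `ch > 0x20` comparison: once the invalid-character check has passed, the only allowed code points at or below 0x20 are the five ASCII whitespace characters, so the comparison separates whitespace from everything that counts toward the length. The sketch below checks that property against the valid-character list from the diff; `countNonWhitespace` is a hypothetical helper, not code from the PR.

```javascript
// Valid characters listed in the diff: [\t\n\f\r +/0-9=A-Za-z]
const validChars = '\t\n\f\r +/=0123456789' +
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
const asciiWhitespace = '\t\n\f\r ';

// Split the valid characters using the same comparison as the loop.
const atOrBelow0x20 = [];
const above0x20 = [];
for (const c of validChars) {
  (c.charCodeAt(0) > 0x20 ? above0x20 : atOrBelow0x20).push(c);
}

// Hypothetical counter mirroring nonAsciiWhitespaceCharCount.
// It assumes the input has already passed the invalid-character check.
function countNonWhitespace(validInput) {
  let count = 0;
  for (let n = 0; n < validInput.length; n++) {
    if (validInput.charCodeAt(n) > 0x20) count++;
  }
  return count;
}

console.log(atOrBelow0x20.join('') === asciiWhitespace);
console.log(countNonWhitespace(' aG k=\n'));
```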

       nonAsciiWhitespaceCharCount++;

-      if (index === kEqualSignIndex) {
+      if (ch === 0x3d) {
Member

Can you add a comment to what 0x3d represents?

         equalCharCount++;
       } else if (equalCharCount) {
         // The `=` char is only allowed at the end.
@@ -1318,8 +1323,6 @@ function atob(input) {
         // Only one more `=` is permitted after the first equal sign.
         throw lazyDOMException('Invalid character', 'InvalidCharacterError');
       }
-    } else if (index === -1) {
-      throw lazyDOMException('Invalid character', 'InvalidCharacterError');
     }
   }

63 changes: 63 additions & 0 deletions test/parallel/test-btoa-atob.js
@@ -25,6 +25,69 @@ strictEqual(atob([]), '');
 strictEqual(atob({ toString: () => '' }), '');
 strictEqual(atob({ [Symbol.toPrimitive]: () => '' }), '');

+const invalidChar = {
+  name: 'InvalidCharacterError'
+};
+
+// Test the entire 16 bit space for invalid characters.
+for (let i = 0; i <= 0xffff; i++) {
+  switch (i) {
+    case 0x09: // \t
+    case 0x0A: // \n
+    case 0x0C: // \f
+    case 0x0D: // \r
+    case 0x20: // ' '
+    case 0x2B: // +
+    case 0x2F: // /
+    case 0x3D: // =
+      continue;
+  }
+
+  // 0-9
+  if (i >= 0x30 && i <= 0x39)
+    continue;
+
+  // A-Z
+  if (i >= 0x41 && i <= 0x5a)
+    continue;
+
+  // a-z
+  if (i >= 0x61 && i <= 0x7a)
+    continue;
+
+  const ch = String.fromCharCode(i);
+
+  throws(() => atob(ch), invalidChar);
+  throws(() => atob('a' + ch), invalidChar);
+  throws(() => atob('aa' + ch), invalidChar);
+  throws(() => atob('aaa' + ch), invalidChar);
+  throws(() => atob(ch + 'a'), invalidChar);
+  throws(() => atob(ch + 'aa'), invalidChar);
+  throws(() => atob(ch + 'aaa'), invalidChar);
+}
+
+throws(() => btoa('abcd\ufeffx'), invalidChar);
+
+const charset =
+  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
+
+function randomString(size) {
+  let str = '';
+
+  for (let i = 0; i < size; i++)
+    str += charset[Math.random() * charset.length | 0];
+
+  while (str.length & 3)
+    str += '=';
+
+  return str;
+}
+
+for (let i = 0; i < 100; i++) {
+  const str = randomString(200);
+  strictEqual(btoa(atob(str)), str);
+}
+
 throws(() => atob(Symbol()), /TypeError/);
 [
   undefined, false, () => {}, {}, [1],
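
The 16-bit sweep in the test above can also be expressed as a differential check: compare the table-based predicate against an independent reference over every UTF-16 code unit. The following standalone sketch does this with a regular expression as the reference. It assumes a stand-in table named `allowed` rather than Node's internal `kForgivingBase64AllowedChars`, and it covers only the character-set check, not the `=` placement rules.

```javascript
// Stand-in for the PR's table: -1 = invalid, 0 = valid.
const allowed = new Array(128).fill(-1);
for (const c of '\t\n\f\r +/=0123456789' +
                'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz') {
  allowed[c.charCodeAt(0)] = 0;
}

// Independent reference: the valid-character class from the diff comment.
const reference = /^[\t\n\f\r +\/0-9=A-Za-z]$/;

let mismatches = 0;
let validCount = 0;
for (let i = 0; i <= 0xffff; i++) {
  // Same expression as the reviewed line, applied to a single code unit.
  const tableSaysValid = ((i | allowed[i & 0x7f]) & ~0x7f) === 0;
  const regexSaysValid = reference.test(String.fromCharCode(i));
  if (tableSaysValid !== regexSaysValid) mismatches++;
  if (tableSaysValid) validCount++;
}

console.log({ mismatches, validCount });
```

Sweeping the full range rather than only 0x00 to 0x7f exercises the code units whose low seven bits alias a valid table entry.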