Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for TypedArray interop between WASM and JS #268

Closed
wants to merge 1 commit into from

Conversation

kyr0
Copy link

@kyr0 kyr0 commented Sep 17, 2018

As discussed in #263 and #240, I added support for memory mapped interop of all standard TypedArray subtypes. I added tests for every subtype to ensure a safe operation. Type checks ensure the safe construction of TypedArray subtypes in the loader lib.

view;

// https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays
if (typedArrayCtor === Int8Array || typedArrayCtor === Uint8Array || typedArrayCtor === Uint8ClampedArray) {
Copy link
Member

@MaxGraey MaxGraey Sep 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use length * typedArrayCtor.BYTES_PER_ELEMENT. If BYTES_PER_ELEMENT doesn't exist in typedArrayCtor throw "Unsupported TypedArray constructor"

@dcodeIO
Copy link
Member

dcodeIO commented Sep 17, 2018

Just leaving my draft here, that I made before this PR, in case it's useful :)

  /** Allocates a new fixed-size buffer in the module's memory and returns its pointer. */
  function newBuffer(view, length) {
    return view.constructor === Function
      ? newEmptyBuffer(view, length)
      : newCopiedBuffer(view, length);
  }

  /** Allocates a new fixed-size empty buffer of the specified type and returns its pointer. */
  function newEmptyBuffer(ctor, length, unsafe) {
    var size = ctor.BYTES_PER_ELEMENT * length;
    var ptr = exports["memory.allocate"](size);
    if (!unsafe) exports["memory.fill"](ptr, 0, size);
    return ptr;
  }

  /** Copies an existing fixed-size buffer to the module's memory and returns its pointer. */
  function newCopiedBuffer(view, length) {
    if (length === undefined) length = view.length;
    var ctor = view.constructor;
    var ptr = exports["memory.allocate"](ctor.BYTES_PER_ELEMENT * length);
    new ctor(mem, ptr, length).set(view);
    return ptr;
  }

@MaxGraey
Copy link
Member

function newBuffer(view, length, unsafe) {
  return view.constructor === Function
      ? newEmptyBuffer(view, length, unsafe)
      : newCopiedBuffer(view, length);
}

?

@dcodeIO
Copy link
Member

dcodeIO commented Sep 17, 2018

That's just so you can do

newBuffer(new Float32Array([1.1, 1,2, ...]);

or

newBuffer(Float32Array, 10);

@MaxGraey
Copy link
Member

MaxGraey commented Sep 17, 2018

I see. I just add unsafe param, so we can do like this newBuffer(Float32Array, 10, true); without clear buffer area

@MaxGraey
Copy link
Member

MaxGraey commented Sep 17, 2018

And when we need read buffer after procces in js side we just create array from linear buffer:

const ptr = newBuffer(Float32Array, 10);
instance.exports.process(ptr, 10);
const bufF32 = new Float32Array(instance.exports.memory.buffer, ptr, 10);
// read processed result from bufF32

Right?

@dcodeIO
Copy link
Member

dcodeIO commented Sep 17, 2018

Either that or return the respective view as well as proposed in the PR.

@MaxGraey
Copy link
Member

MaxGraey commented Sep 17, 2018

function getBuffer(ctor, ptr, length) {
   checkMem();
   return new ctor(mem, ptr, length);
}

getBuffer(Float32Array, ptr, 10);

Or better:

function newBuffer(view, length, unsafe) {
  const ptr = view.constructor === Function
      ? newEmptyBuffer(view, length, unsafe)
      : newCopiedBuffer(view, length);

  return {
    ptr,
    length,
    array() {
      checkMem();
      return new view.constructor(mem, ptr, length);
    },
    free() {
      exports["memory.free"](ptr);
    }
  };
}
const buffer = newBuffer(Float32Array, 10);
instance.exports.someMutableProcess(buffer.ptr, buffer.length);
const result = buffer.array();
...
buffer.free();

@MaxGraey
Copy link
Member

I prefer create Typed Array lazily and only when we really need read back from memory

@MaxGraey
Copy link
Member

MaxGraey commented Sep 17, 2018

And will be great ability create ArrayBuffer from Pointer class on AssemblyScript but this require extra space (for 4-byte length field) during alloc inside newArray.

@dcodeIO
Copy link
Member

dcodeIO commented Sep 18, 2018

For reference, for a real typed array on the AS side we'd need to do something like this (didn't test, no GC support):

  function computeBufferSize(byteLength) {
    const HEADER_SIZE = 8;
    return 1 << (32 - Math.clz32(byteLength + HEADER_SIZE - 1));
  }

  /** Creates a new typed array in the module's memory and returns its pointer. */
  function newArray(view, length, unsafe) {
    var ctor = view.constructor;
    if (ctor === Function) { // TypedArray constructor
      ctor = view;
      view = null;
    } else { // TypedArray instance
      if (length === undefined) length = view.length;
    }
    var elementSize = ctor.BYTES_PER_ELEMENT;
    if (!elementSize) throw Error("not a typed array");
    var byteLength = elementSize * length;
    var ptr = exports["memory.allocate"](12); // TypedArray header
    var buf = exports["memory.allocate"](computeBufferSize(byteLength)); // ArrayBuffer
    checkMem();
    U32[ ptr      >>> 2] = buf;        // .buffer
    U32[(ptr + 4) >>> 2] = 0;          // .byteOffset
    U32[(ptr + 8) >>> 2] = byteLength; // .byteLength
    U32[ buf      >>> 2] = byteLength; // .byteLength
    U32[(buf + 4) >>> 2] = 0;          // 0
    if (view) {
      new ctor(mem, buf + 8, length).set(view);
    } else if (!unsafe) {
      exports["memory.fill"](buf + 8, 0, byteLength);
    }
    return ptr;
  }

  /** Gets a view on a typed array in the module's memory by its pointer. */
  function getArray(ctor, ptr) {
    var elementSize = ctor.BYTES_PER_ELEMENT;
    if (!elementSize) throw Error("not a typed array");
    checkMem();
    var buf        = U32[ ptr      >>> 2];
    var byteOffset = U32[(ptr + 4) >>> 2];
    var byteLength = U32[(ptr + 8) >>> 2];
    return new ctor(mem, buf + 8 + byteOffset, (byteLength - byteOffset) / elementSize);
  }

@kyr0
Copy link
Author

kyr0 commented Sep 18, 2018

Oh wow, thanks for the lively discussion. Your suggestions look great. I would suggest that I combine them and update the PR. To wrap it up:

Minimal lib/loader extension
newBuffer(ctorOrInstance, length) according to @MaxGraey interface and @dcodeIO impl. in loader. I would add the TS interfaces and add type checks for the constructor case. I think we can omit getBuffer() now, because newBuffer() returns the view like in my original proposal.
I really like the .free() in @MaxGraey response. For the lib I would write new and enhanced tests for all cases.

But I question myself, why do we call it newBuffer() when we essentially always create TypedArray's? When we think from the AS side the TypedArray created in JS is like a Buffer, because it only contains the values without byteLength. But seen from the JS side, the developer would probably wonder? Because newBuffer() essentially returns a TypedArray wrapper object with a view additional informations and a free() method to free the memory taken by the TypedArray.

So from the JS side of view, I would be more happy with newTypedArray.

Minimal AS side datatype extensions
Now from the AS side of view: When the developer created a new or passed in an existing TypedArray, we have a Buffer in AS, because all TypedArray's in AS need the memory layout of ArrayBuffer. Why don't we add a static method to the ArrayBuffer class like fromJSTypedArrayBuffer(). It returns an AS ArrayBuffer. The implementation of @dcodeIO looks awesome for this. Then we could add another static method to AS TypedArray like fromJSTypedArray(ptr, length) returning the correct AS TypedArray subtype in a generic way.

Powerful and easy API

For the developer, the experience would be awesome as this will be possible and has only the overhead decided for:

JS:

const existingTypedArray = createMyTypedArray(); // returns instance of e.g. Float32Array

// existing TypedArray instance case, can omit the length
const typedArrayRef = module.newTypedArray(existingTypedArray);
// or
const typedArrayRef = module.newTypedArray(Float32Array, 255);

module.process(typedArrayRef.ptr, typedArrayRef.length);

typedArrayRef.getTypedArray() // Float32Array instance
typedArrayRef.free() // free memory in AS context

AS:

process(ptr: usize, length: i32): void {

    // high level, super easy; alternative: Use Float32Array.fromJSTypedArray(...) to omit generics
    const myTypedArray: Float32Array = TypedArray.fromJSTypedArray<Float32Array>(ptr, length);

    // or, more low-level
    const myArrayBuffer: ArrayBuffer<f32> = ArrayBuffer.fromJSTypedArrayBuffer<f32>(ptr, length);

    // or almost raw
   const pointerFun: Pointer<f32> = new Pointer<f32>();

   // pointer arithmetic
}

What do you think? I believe it contains all the ideas?

@MaxGraey
Copy link
Member

...
U32[(ptr + 8) >>> 2] = byteLength; // .byteLength
U32[ buf      >>> 2] = byteLength; // .byteLength
...

Isn't this will be:

U32[(ptr + 8) >>> 2] = length; // .length
U32[ buf      >>> 2] = byteLength; // .byteLength

?

@dcodeIO
Copy link
Member

dcodeIO commented Sep 18, 2018

Layout of a typed array differs a bit from a normal array. A typed array computes the length property from its byteLength and type.

@MaxGraey
Copy link
Member

Got it! I thought this usual Array

@MaxGraey
Copy link
Member

MaxGraey commented Sep 18, 2018

Hmm, btw length from Array. Could we drop this and calculate length from ArrayBuffer's byteLength same as we calculate length for TypedArray?

@inline
get length(): i32 {
    return this.buffer_.byteLength >> alignof<T>();
}

EDIT
And the same for TypedArray. Removing byteLength from main header and leave only buffer and byteOffset. And we will header with 8-bytes alignment instead 12 (actually 16). And length calculation will be:

@inline
get length(): i32 {
    return (this.buffer.byteLength - this.byteOffset) >> alignof<T>();
}

@dcodeIO
Copy link
Member

dcodeIO commented Sep 18, 2018

We can't drop it there because a user can modify .length. For instance, the backing buffer might have space for 100 elements, but the length is still 0 until values are pushed to it, which the length property keeps track of.

@MaxGraey
Copy link
Member

Got it, so it actually works as capacity

@dcodeIO
Copy link
Member

dcodeIO commented Sep 18, 2018

As this seems to lead to some confusion, I decided to go ahead with an implementation that creates a proper typed array for use on the AS side here.

So, this should "just work":

// JS side
var ptr = module.newArray(new Int32Array([1, 2, 3]));
module.processArray(ptr);
console.log(module.getArray(ptr));
module.freeArray(ptr); // once not used anymore

// AS side
export function processArray(arr: Int32Array): void {
  ...
}

@kyr0
Copy link
Author

kyr0 commented Sep 19, 2018

Okay.

Edit: I was to fast in my response. It works, I just didn't see the assignment wasn't on the array index location. Maybe the AS code could do:

arr[i] += arr[i]; instead of v += arr[i]; and another test could compare it to [ 2, 4, 6, 8, 10, -2 ] to make sure the mutation + data passing works?

Also, as I mentioned earlier, I find it counter-intuitive that getArray() returns a TypedArray subclass instance. I think there is a significant difference between the datatypes Array and TypedArray in JS context and the current method naming could be irritating.

Also, module.getArray(ptr) requires the constructor function of the TypedArray subclass to be passed -- which means I have to repeat myself -- I find this a little bit counter-intuitive too (but this is up to taste, I guess).

@dcodeIO
Copy link
Member

dcodeIO commented Sep 19, 2018

I find it counter-intuitive that getArray() returns a TypedArray subclass instance

Hmm, might be, yeah. My thought was that all we can do currently is typed arrays anways, but if you think it helps we can rename it to getTypedArray etc.

module.getArray(ptr) requires the constructor function of the TypedArray subclass to be passed

Yeah, but it needs to know the type somehow, because without it doesn't know how to interpret the otherwise untyped buffer. In Wasm memory, instances of classes don't have runtime information that we could use dynamically.

@jbousquie
Copy link

I know that naming things is always a great problem, so just a semantic suggestion : wouldn't be more pertinent to call it newASArray() or ASArray() in order to not get confused with standard JS arrays, knowing its name is written in the code JS side anyway ?
It's a new Array type and it's pure AS, soI would have called it an ASArray in first intention if I didn't know its name.
Only my (newbee) opinion.

@kyr0
Copy link
Author

kyr0 commented Sep 25, 2018

@dcodeIO

Yeah, but it needs to know the type somehow

Did you try my initial PR? It returns the reference to the constructor function of the TypedArray together with it's instance and pointer so you don't need to put it again on JS side.

@jbousquie

It's a new Array type and it's pure AS, soI would have called it an ASArray in first intention if I didn't know its name.

Well, it is a TypedArray used in an unconventional way (to "share" memory with AS) -- but on JS side, it is indeed a standard TypedArray :) That's why I opted for newTypedArray(...) :)

@dcodeIO
Copy link
Member

dcodeIO commented Sep 25, 2018

Did you try my initial PR? It returns the reference to the constructor function of the TypedArray together with it's instance and pointer so you don't need to put it again on JS side.

Yeah, I was just concerned about the case where a typed array is returned from AS. That'd be just a number and it seemed more consistent this way.

@kyr0
Copy link
Author

kyr0 commented Sep 25, 2018

Yeah, I was just concerned about the case where a typed array is returned from AS. That'd be just a number and it seemed more consistent this way.

Oh well, on AS side my PR only differs in passing two parameters because I didn't knew I could just use the TypedArray counterpart as an argument in AS. I also like your proposal much more for the AS side. This is clearly more elegant.

I think the proposed API of my PR (JS side) and yours (JS and AS side) could go hand in hand...

Bringing @jbousquie remark into play... as it is a new use case... like "sharing a TypedArray" between JS and AS we could maybe name the methods for what they do context-wise...

newArray() could be newSharedArray and return an object like ShareDescriptor containing the actual TypedArray array, ptr and type to the constructor of the TypedArray subtype..., the call to AS side could still do the same as it does now and referencing to the constructor again wouldn't be necessary.

It would become:

// shared:
// shared.ptr -> number
// shared.array -> actual instance of Int32Array
// shared.type -> constructor reference (Int32Array)
var shared = module.newSharedArray(new Int32Array([1, 2, 3]));
module.processArray(shared.ptr);
console.log(shared.array); // no need for getArray() anymore
module.freeSharedArray(shared.type, shared.ptr); // once not used anymore

// AS side
export function processArray(arr: Int32Array): void {
    ...
}

But it's up to you. It's working as it is now and from my point of view it's already doing it's job...

@MaxGraey
Copy link
Member

MaxGraey commented Sep 25, 2018

newArray() could be newSharedArray

This could be confused because JS have SharedArrayBuffer which using for threads

@jbousquie
Copy link

newASSharedArray() ?

@dcodeIO
Copy link
Member

dcodeIO commented Oct 30, 2018

Regarding naming: I think newArray is fine as the typed nature of such an array is (currently) implicit anyway. Once GC/reference-types lands, normal arrays won't have to go through a helper anymore anyway I think.

Hence, closing for now, but feel free to discuss :)

@dcodeIO dcodeIO closed this Oct 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants