
Don't place matrices into python files. Generate bundles. #182

Open

KOLANICH opened this issue Sep 16, 2022 · 2 comments

Comments

@KOLANICH

A bundle is a directory with pickle-less .npy files and a Python script __init__.py binding them into a model.
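
For illustration, such a bundle might look like this (file names are hypothetical):

mymodel/
    __init__.py      # the prologue below plus the model-binding code
    conv1_weight.npy
    conv1_bias.npy
    fc_weight.npy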

Feel free to reuse the approach used in https://github.com/UniGrammar/UniGrammarRuntime.py/blob/master/UniGrammarRuntime/ParserBundle.py . It uses urm, which is overkill for your simple use case; in your runtime you can just scan a directory for .npy files and load them all into a dict.

So only the following tiny prologue is needed:

import numpy as np
from pathlib import Path

thisDir = Path(__file__).resolve().parent
weights = {el.stem: np.load(el, mmap_mode="r") for el in thisDir.glob("*.npy")}
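
The generator side is symmetric. A minimal sketch (write_bundle and its signature are illustrative, not onnion's actual API):

import numpy as np
from pathlib import Path

def write_bundle(out_dir: Path, tensors: dict) -> None:
    # Save each tensor as a pickle-free .npy file named after its key.
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, arr in tensors.items():
        np.save(out_dir / f"{name}.npy", np.asarray(arr), allow_pickle=False)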

We can also place an ONNX file there and dispatch at runtime between the plain numpy implementation and an accelerated one.
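
A minimal dispatch sketch (the model.onnx file name and the numpy_impl fallback module are assumptions):

from pathlib import Path

thisDir = Path(__file__).resolve().parent

try:
    import onnxruntime  # accelerated path, if installed

    _session = onnxruntime.InferenceSession(str(thisDir / "model.onnx"))

    def run(inputs: dict):
        return _session.run(None, inputs)
except ImportError:
    from . import numpy_impl  # plain-numpy fallback, hypothetical module

    def run(inputs: dict):
        return numpy_impl.run(inputs)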

The same .npy files can also be accessed from other languages, for example from JS. Their format is damn simple and parsers for it exist.
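
To show just how simple: a version-1.0 .npy file is a magic string, a version, a little-endian header length, an ASCII dict literal describing dtype/shape/order, then the raw buffer. Parsing the header needs no NumPy at all:

import ast
import struct

def read_npy_header(path):
    # Parse the header of a version-1.0 .npy file.
    with open(path, "rb") as f:
        assert f.read(6) == b"\x93NUMPY"
        major, minor = f.read(1)[0], f.read(1)[0]
        assert (major, minor) == (1, 0)  # v1.0 uses a 2-byte header length
        (hlen,) = struct.unpack("<H", f.read(2))
        # e.g. {'descr': '<f4', 'fortran_order': False, 'shape': (3, 4)}
        return ast.literal_eval(f.read(hlen).decode("latin1"))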

JS code can be generated using https://github.com/ksons/jscodegen.py from an Esprima AST (which is just a structure of nested dicts).
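
A sketch of what that could look like, assuming jscodegen exposes a top-level generate() taking the AST dict (check its README for the actual entry point):

import jscodegen  # assumption: generate() accepts an Esprima-style dict AST

# Esprima AST for: var n = 42;
ast = {
    "type": "Program",
    "body": [{
        "type": "VariableDeclaration",
        "kind": "var",
        "declarations": [{
            "type": "VariableDeclarator",
            "id": {"type": "Identifier", "name": "n"},
            "init": {"type": "Literal", "value": 42, "raw": "42"},
        }],
    }],
}
print(jscodegen.generate(ast))  # -> var n = 42;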

But to be honest, there exists a project called TVM, and IMHO the usual use case should be calling this project through TVM.

So it is proposed to do the following:

  1. implement a backend for TVM optimizing the models specifically for inference using plain numpy
  2. add a message that users should usually use TVM
@ishiy1993 (Contributor)

Don't place matrices into python files.

Do you know about the --export-tensor-size option in onnion?
If you want to separate matrices into .npy files, use that option.

the usual use case should be calling this project through TVM.

I think that you should use TVM directly instead of onnion for the usual use case.

@KOLANICH (Author)

Thanks. Looked into it. I think that

  1. the tensor loading code inlined into the file should be replaced with something like the generic solution proposed above, which is more compact.
    with open(f"{graph_name}_tensors.npy", "wb") as f:
        for k in initializer:
            arr = initializer[k]
            if arr.size < export_tensor_size:
                init_code.append(f'    self.initializer["{k}"] = {embed_ndarray(arr)}')
            else:
                np.save(f, arr)  # type: ignore
init_code.append(f'    self.initializer["{k}"] = np.load(f)')

a. It unnecessarily opens the .npy file even when every tensor gets inlined.
b. It generates code as strings, not as an AST.
c. Is it possible to make an injection by supplying an ONNX file with a rogue tensor or graph name? (a sketch addressing a. and c. follows below)
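
A sketch of a variant addressing a. and c.: one .npy file per tensor, created only when actually needed, with names sanitized before they reach a file name or the generated source. export_tensors and out_dir are illustrative names; embed_ndarray is onnion's helper from the snippet above, and the generated lines rely on the thisDir prologue shown earlier:

import re
import numpy as np

def _safe(name: str) -> str:
    # Reduce an arbitrary ONNX name to a filesystem- and identifier-safe token.
    return re.sub(r"[^0-9A-Za-z_]", "_", name)

def export_tensors(graph_name, initializer, export_tensor_size, init_code, out_dir):
    g = _safe(graph_name)
    for k, arr in initializer.items():
        if arr.size < export_tensor_size:
            # {k!r} lets repr() do the quoting, so a rogue key cannot break
            # out of the generated string literal.
            init_code.append(f'    self.initializer[{k!r}] = {embed_ndarray(arr)}')
        else:
            fname = f"{g}_{_safe(k)}.npy"
            np.save(out_dir / fname, arr)  # file created lazily, per tensor
            init_code.append(
                f'    self.initializer[{k!r}] = np.load(thisDir / "{fname}", mmap_mode="r")'
            )

This still builds code as strings (point b. stands), but repr-quoted keys and sanitized file names close the obvious injection vectors.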
