Error on mixing shapes of multiple enumerated shape inputs #2271

Open
0seba opened this issue Jul 9, 2024 · 4 comments
Labels
bug Unexpected behaviour that should be corrected (type)

0seba commented Jul 9, 2024

🐞Describing the bug

I have a program with 3 inputs, all with flexible enumerated shapes; two of them share the same shape. The input shapes are:

  • shapes_1 = [(1, 3, 64, 16), (1, 3, 64, 32)] # last dimension shape varies, called Length1
  • shapes_2 = [(1, 3, 64, 256), (1, 3, 64, 512)] # last dimension shape varies, called Length2

After building, the model runs correctly with Length1=16 and Length2=256, or Length1=32 and Length2=512, but when I try to run with Length1=32 and Length2=256 it fails.

I tried extending the enumerated shapes to cover the different combinations (setting shapes_1=[(1, 3, 64, 16), (1, 3, 64, 32), (1, 3, 64, 32), (1, 3, 64, 16)], and analogously for shapes_2), but it does not work.
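To make the failing cases explicit, here is a small sketch that separates the combinations reported to run (same index in both lists) from the mixed ones. It only reflects the behaviour observed in this report, not documented Core ML semantics; the variable names are illustrative.

```python
from itertools import product

q_lens = [16, 32]     # Length1 values from shapes_1
kv_lens = [256, 512]  # Length2 values from shapes_2

# Combinations reported to run: the same index in both lists.
index_aligned = set(zip(q_lens, kv_lens))

# Combinations one might expect to be valid: every pairing.
all_pairs = set(product(q_lens, kv_lens))

# Pairings that are in the cross product but not index-aligned.
mixed = sorted(all_pairs - index_aligned)

print("index-aligned:", sorted(index_aligned))
print("mixed:", mixed)
```

The mixed list contains both the Length1=32, Length2=256 case mentioned above and the 16/512 case used in the reproduction script.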

Stack Trace

Trace when running from Python

File ~/.pyenv/versions/apple/lib/python3.11/site-packages/coremltools/models/model.py:654, in MLModel.predict(self, data, state)
    651 MLModel._check_predict_data(data)
    653 if self.__proxy__:
--> 654     return self._get_predictions(self.__proxy__,
    655                                  verify_and_convert_input_dict,
    656                                  data,
    657                                  state)
    658 else:   # Error case
    659     if _macos_version() < (10, 13):

File ~/.pyenv/versions/apple/lib/python3.11/site-packages/coremltools/models/model.py:702, in MLModel._get_predictions(proxy, preprocess_method, data, state)
    700     preprocess_method(data)
    701     state = None if state is None else state.__proxy__
--> 702     return proxy.predict(data, state)
    703 else:
    704     assert type(data) == list

RuntimeError: Caught an unknown exception!

To Reproduce

  • Please add a minimal code example that can reproduce the error when running it.
import numpy as np
import coremltools as ct
from coremltools.converters.mil import Builder as mb
import coremltools.converters.mil as mil

A = [16, 32]
B = [256, 512]
# A, B = np.repeat(A, len(B)), B * len(A)

q_seqlens = A
kv_seqlens = B
input_ids_shapes = [(1, 3, 64, seqlen) for seqlen in q_seqlens]
kv_shapes = [(1, 3, 64, seqlen) for seqlen in kv_seqlens]

input_ids_shape_def = mil.input_types.EnumeratedShapes(shapes=input_ids_shapes)
kv_shape_def = mil.input_types.EnumeratedShapes(shapes=kv_shapes)

@mb.program(
    input_specs=[
        mb.TensorSpec(input_ids_shape_def.symbolic_shape, mil.input_types.types.fp16),
        mb.TensorSpec(kv_shape_def.symbolic_shape, mil.input_types.types.fp16),
        mb.TensorSpec(kv_shape_def.symbolic_shape, mil.input_types.types.fp16),
    ],
    opset_version=mil.builder.AvailableTarget.iOS18
)
def prog(
    query, key_cache, value_cache,
):
    scores = mb.matmul(x=query, y=key_cache, transpose_x=True)
    scores = mb.mul(x=scores, y=np.float16(64 ** -0.5))
    weights = mb.softmax(x=scores)
    attention = mb.matmul(x=value_cache, y=weights, transpose_y=True)
    return attention # , key_cache

cml_flex = ct.convert(
    prog,
    compute_units=ct.ComputeUnit.CPU_AND_NE,
    compute_precision=ct.precision.FLOAT16,
    minimum_deployment_target=ct.target.iOS18,
    inputs=[
        ct.TensorType(name="query", shape=ct.EnumeratedShapes(input_ids_shapes)),
        ct.TensorType(name="key_cache", shape=ct.EnumeratedShapes(kv_shapes)),
        ct.TensorType(name="value_cache", shape=ct.EnumeratedShapes(kv_shapes)),
    ],
)

QL = 16
CL = 512

np.random.seed(42)
cml_flex.predict({
    'query': np.random.randn(1, 3, 64, QL).astype(np.float16),
    'key_cache': np.random.randn(1, 3, 64, CL).astype(np.float16),
    'value_cache': np.random.randn(1, 3, 64, CL).astype(np.float16),
})
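A small harness can help when checking every shape combination rather than one at a time. The sketch below is hypothetical: probe_combinations and fake_predict are names introduced here for illustration, and fake_predict only mimics the failure pattern described in this report so that the snippet runs without Core ML.

```python
from itertools import product


def probe_combinations(predict_fn, q_lens, kv_lens):
    """Call predict_fn(q_len, kv_len) for every pairing and record the outcome."""
    results = {}
    for q_len, kv_len in product(q_lens, kv_lens):
        try:
            predict_fn(q_len, kv_len)
            results[(q_len, kv_len)] = "ok"
        except Exception as exc:  # the report shows a RuntimeError from predict
            results[(q_len, kv_len)] = f"{type(exc).__name__}: {exc}"
    return results


# Stand-in for the real model (an assumption, not Core ML behaviour):
# it accepts only the index-aligned pairs, as observed in this issue.
def fake_predict(q_len, kv_len):
    if (q_len, kv_len) not in {(16, 256), (32, 512)}:
        raise RuntimeError("Caught an unknown exception!")
    return {}


report = probe_combinations(fake_predict, [16, 32], [256, 512])
for combo, outcome in report.items():
    print(combo, outcome)
```

With the real model, predict_fn would build the query, key_cache and value_cache arrays for the given lengths, as in the script above, and pass them to cml_flex.predict.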

System environment (please complete the following information):

  • coremltools version: 8.0b1
  • OS (e.g. MacOS version or Linux type): 15.0 beta 2

Additional context

Also, when I uncomment the trailing part of the program's return statement (# , key_cache) so that key_cache is returned as well, running a prediction with different shapes no longer raises an exception, but all inputs are converted to 0s (query becomes an array of only 0s, and the same happens to key_cache and value_cache). I'll report this issue on the Apple Forums and in Feedback Assistant. I wasn't sure whether the first part of the problem is just a conversion issue or an intrinsic Core ML issue, which is why I reported it here as well.

Additional Swift trace

*** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'There is no function in the program library for the provided input=query = MultiArray : Float16 1 × 3 × 64 × 16 array
key_cache = MultiArray : Float16 1 × 3 × 64 × 512 array
value_cache = MultiArray : Float16 1 × 3 × 64 × 512 array
.'
*** First throw call stack:
(
	0   CoreFoundation                      0x00000001998f6920 __exceptionPreprocess + 176
	1   libobjc.A.dylib                     0x00000001993deb1c objc_exception_throw + 76
	2   CoreFoundation                      0x00000001998f6810 +[NSException exceptionWithName:reason:userInfo:] + 0
	3   CoreML                              0x00000001a3745a98 -[MLE5EnumeratedShapeExecutionStreamOperationPool takeOutOperationForFeatures:error:] + 480
	4   CoreML                              0x00000001a3845a78 -[MLE5ExecutionStream setupOperationForInputFeatures:operationPool:error:] + 92
	5   CoreML                              0x00000001a37f58d0 -[MLE5Engine _cleanUpAndReconfigureStream:forInputFeatures:error:] + 108
	6   CoreML                              0x00000001a37f4be8 -[MLE5Engine _predictionFromFeatures:options:completionHandler:] + 256
	7   CoreML                              0x00000001a37f511c -[MLE5Engine submitPredictionRequest:completionHandler:] + 124
	8   CoreML                              0x00000001a37cc780 __62-[MLDelegateModel _submitPredictionRequest:completionHandler:]_block_invoke + 420
	9   libdispatch.dylib                   0x00000001001b0b6c _dispatch_call_block_and_release + 32
	10  libdispatch.dylib                   0x00000001001b28ac _dispatch_client_callout + 20
	11  libdispatch.dylib                   0x00000001001b6110 _dispatch_continuation_pop + 700
	12  libdispatch.dylib                   0x00000001001b50ac _dispatch_async_redirect_invoke + 616
	13  libdispatch.dylib                   0x00000001001ca9b8 _dispatch_root_queue_drain + 404
	14  libdispatch.dylib                   0x00000001001cb5c4 _dispatch_worker_thread2 + 188
	15  libsystem_pthread.dylib             0x000000010024d0c4 _pthread_wqthread + 228
	16  libsystem_pthread.dylib             0x0000000100254cf0 start_wqthread + 8
)
libc++abi: terminating due to uncaught exception of type NSException
@0seba 0seba added the bug Unexpected behaviour that should be corrected (type) label Jul 9, 2024
@YifanShenSZ
Collaborator

Hi @0seba, there appears to be some incorrect deduplication going on in the Core ML framework, so yes, filing an issue on the Apple forum is the right thing to do.

Concretely, I tried your reproduction, and it does indeed error out. Repeating the enumerated shapes in different orders does not help. Only reverting enumerated shapes works (i.e. only one of 16 x 256 or 16 x 512 works...)

@YifanShenSZ
Collaborator

The Core ML framework team has gotten back to us and found the issue to be in coremltools: the protobuf is deduplicated.
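As a rough illustration of why deduplication would defeat the workaround of repeating shapes, consider the sketch below. It assumes each input's shape list is deduplicated independently and in order; that is a guess about the mechanism, not the actual coremltools code. The helper dedupe_preserving_order is hypothetical, and the expanded shapes_2 list is one possible reading of "analogously for shapes_2".

```python
def dedupe_preserving_order(shapes):
    """Drop repeated shapes, keeping the first occurrence (assumed behaviour)."""
    return list(dict.fromkeys(shapes))


# Expanded lists meant to cover all four pairings by index.
shapes_1 = [(1, 3, 64, 16), (1, 3, 64, 32), (1, 3, 64, 32), (1, 3, 64, 16)]
shapes_2 = [(1, 3, 64, 256), (1, 3, 64, 512), (1, 3, 64, 256), (1, 3, 64, 512)]  # assumed

# Last-dimension pairings the expanded lists are meant to express.
intended = [(a[-1], b[-1]) for a, b in zip(shapes_1, shapes_2)]

# Pairings left once each list is deduplicated on its own.
after_dedup = [
    (a[-1], b[-1])
    for a, b in zip(dedupe_preserving_order(shapes_1),
                    dedupe_preserving_order(shapes_2))
]

print("intended:   ", intended)
print("after dedup:", after_dedup)
```

Under this assumption the expanded lists collapse back to the original two entries each, so only the index-aligned pairs survive, which would match the behaviour reported above.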

@YifanShenSZ YifanShenSZ reopened this Jul 24, 2024
@0seba
Author

0seba commented Aug 3, 2024

Thanks @YifanShenSZ , hopefully it is a simple issue and we can get support for these models soon 😅

@YifanShenSZ YifanShenSZ self-assigned this Aug 8, 2024
@YifanShenSZ
Collaborator

This turns out to be more involved. Some progress has been made: we have 2 fixes for our protobuf. We can continue investigating this issue once those protobuf fixes land.

@YifanShenSZ YifanShenSZ removed their assignment Aug 13, 2024