Clip update inputs format #2377

divyashreepathihalli · 2024-03-11T17:08:05Z

Update input format to use dict instead of positional args

    # Positional arguments must come before keyword arguments in Python,
    # so all three arguments are passed positionally here.
    processor = CLIPProcessor(
        224,  # input_resolution
        "path_to_vocab.json",
        "path_to_merges.txt",
    )
    processed_image = processor.process_images(["cat.jpg"])
    processed_text, attention_mask = processor.process_texts(
        ["mountains", "cat on tortoise", "two cats"]
    )
    model = CLIP.from_preset("clip-vit-base-patch16")
    # The model now takes a single dict of named inputs.
    image_logits, text_logits = model(
        {
            "image": processed_image,
            "text": processed_text,
            "attention_mask": attention_mask,
        }
    )
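
As a rough illustration of the calling convention above, the following sketch uses plain Python with no Keras dependency. `unpack_clip_inputs` and `REQUIRED_KEYS` are hypothetical names, not part of KerasCV; only the key names are taken from the example above. It shows how a dict-based call lets inputs be matched by name rather than by position.

```python
# Hypothetical helper, not part of KerasCV: illustrates name-based input lookup.
REQUIRED_KEYS = ("image", "text", "attention_mask")


def unpack_clip_inputs(inputs):
    """Return (image, text, attention_mask) from a dict of named inputs."""
    if not isinstance(inputs, dict):
        raise TypeError(
            f"Expected a dict with keys {REQUIRED_KEYS}, "
            f"got {type(inputs).__name__}"
        )
    missing = [key for key in REQUIRED_KEYS if key not in inputs]
    if missing:
        raise KeyError(f"Missing input keys: {missing}")
    return tuple(inputs[key] for key in REQUIRED_KEYS)


# Key order in the dict does not matter; values are matched by name.
image, text, mask = unpack_clip_inputs(
    {"attention_mask": [1, 1, 0], "text": [5, 7, 0], "image": [[0.0]]}
)
```

With positional arguments, swapping two inputs of compatible shape would go unnoticed; with a dict, a missing or misspelled key fails immediately with a readable message.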

VarunS1997

Looks good! Maybe we can add a bit of a description as to why we are doing this change (so as to help posterity if similar changes are considered elsewhere).

MnifOmar · 2024-03-12T10:52:41Z

!pip install keras-cv==0.8.2 -q
!pip install -U tensorflow -q
!pip install keras-core -q
import keras_cv
from keras_cv.models.stable_diffusion.clip_tokenizer import SimpleTokenizer
from keras_cv.models.stable_diffusion.diffusion_model import DiffusionModel
from keras_cv.models.stable_diffusion.image_encoder import ImageEncoder
from keras_cv.models.stable_diffusion.noise_scheduler import NoiseScheduler
from keras_cv.models.stable_diffusion.text_encoder import TextEncoder
from tensorflow import keras

/usr/local/lib/python3.10/dist-packages/keras_cv/layers/preprocessing_3d/base_augmentation_layer_3d.py in
28
29 @keras.utils.register_keras_serializable(package="keras_cv")
---> 30 class BaseAugmentationLayer3D(keras.__internal__.layers.BaseRandomLayer):
31 """Abstract base layer for data augmentation for 3D perception.
32

AttributeError: module 'tensorflow.keras' has no attribute '__internal__'

diffusion_ft_trainer = Trainer(
    diffusion_model=DiffusionModel(RESOLUTION, RESOLUTION, MAX_PROMPT_LENGTH),
    # Remove the top layer from the encoder, which cuts off the variance and only
    # returns the mean.
    vae=tf.keras.Model(
        image_encoder.input,
        image_encoder.layers[-2].output,
    ),
    noise_scheduler=NoiseScheduler(),
    use_mixed_precision=USE_MP,
)

RuntimeError: Exception encountered when calling SpatialTransformer.call().

Could not automatically infer the output shape / dtype of 'spatial_transformer_2' (of type SpatialTransformer). Either the SpatialTransformer.call() method is incorrect, or you need to implement the SpatialTransformer.compute_output_spec() / compute_output_shape() method. Error encountered:

Exception encountered when calling BasicTransformerBlock.call().

Only input tensors may be passed as positional arguments. The following argument value should be passed as a keyword argument: None (of type <class 'NoneType'>)

Arguments received by BasicTransformerBlock.call():
• inputs=['tf.Tensor(shape=(None, 4096, 320), dtype=float16)', 'tf.Tensor(shape=(None, 77, 768), dtype=float16)']

Arguments received by SpatialTransformer.call():
• args=(['<KerasTensor shape=(None, 64, 64, 320), dtype=float16, sparse=False, name=keras_tensor_288>', '<KerasTensor shape=(None, 77, 768), dtype=float16, sparse=None, name=keras_tensor_278>'],)
• kwargs=<class 'inspect._empty'>

Does anyone have a solution? Nothing seems to work, neither the imports nor loading the model.
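
For a report like the one above, the exact installed versions are usually the first thing needed to reproduce the problem. The sketch below uses only the standard library to print them. The package list is an assumption based on the pip commands in this comment, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Distributions named in the pip commands above, plus keras itself.
PACKAGES = ("keras-cv", "tensorflow", "keras", "keras-core")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Pasting this output alongside the traceback makes it easier to tell whether the installed packages are a combination the library supports.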

Divyashree Sreepathihalli and others added 8 commits February 26, 2024 21:56

enable clip large GPU tests and fix jax broadcast_to error

84625c9

update to use ops

b5d64a0

Merge branch 'keras-team:master' into fix_clip_jax

42a73e4

update model input format

d676c35

update golden values

52f3cc5

Merge branch 'keras-team:master' into clip_update_inputs

8b65b42

update

16a2ca6

code reformat

6203af0

divyashreepathihalli requested review from sampathweb and VarunS1997 March 11, 2024 17:11

remove input tensors

7c918e9

VarunS1997 approved these changes Mar 11, 2024

Update docstring

4c71250

divyashreepathihalli merged commit c8efcd7 into keras-team:master Mar 12, 2024
6 checks passed
