
Support SageMaker Endpoints in chat #197

Merged: 11 commits merged into jupyterlab:main on Jun 2, 2023

Conversation

@dlqqq (Member) commented May 30, 2023

Description

Allows use of language models hosted on SageMaker Endpoints. The only constraint is that both the request and the response of the model must be in JSON.

[Screen recording: Screen.Recording.2023-05-30.at.10.44.09.AM.mov]

Providers may now declare fields, which are keyword arguments expected by the constructor. Each field is declared via a Field object, which is defined as follows:

from typing import Literal, Union

from pydantic import BaseModel

class TextField(BaseModel):
    """A single-line text input rendered in the chat settings UI."""
    type: Literal["text"] = "text"
    key: str    # keyword argument name passed to the provider constructor
    label: str  # human-readable label shown in the UI

class MultilineTextField(BaseModel):
    """A multi-line text input, e.g. for JSON request schemas."""
    type: Literal["text-multiline"] = "text-multiline"
    key: str
    label: str

Field = Union[TextField, MultilineTextField]

The backend stores the values of each field under config.fields.[<model-id>], where <model-id> is the model's global model ID. The backend also automatically reads this config object and passes the fields for a model to the model provider's constructor as keyword arguments.
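
For illustration, the persisted values might look roughly like this. This is a hypothetical sketch: the surrounding structure is not spelled out in this PR, and the endpoint name is a placeholder.

# Hypothetical sketch of the stored config. Global model IDs take the
# form "<provider-id>:<model-id>"; everything else here is illustrative.
config = {
    "fields": {
        "sagemaker-endpoint:my-endpoint": {
            "region_name": "us-east-1",
            "request_schema": '{"text_inputs": "<prompt>"}',
            "response_path": "generated_texts.[0]",
        },
    },
}

# The backend then, in effect, constructs the provider like this:
# SmEndpointProvider(**config["fields"]["sagemaker-endpoint:my-endpoint"], ...)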

In the case of SageMaker Endpoints (SMEP), three additional fields are declared:

class SmEndpointProvider(BaseProvider, SagemakerEndpoint):
    id = "sagemaker-endpoint"
    ...
    fields = [
        TextField(
            key="region_name",
            label="Region name",
        ),
        MultilineTextField(
            key="request_schema",
            label="Request schema",
        ),
        TextField(
            key="response_path",
            label="Response path",
        )
    ]

The constructor of the SMEP provider has been modified to accept two new keyword arguments: request_schema and response_path.

  • request_schema is a JSON string. Any values that match the exact literal "<prompt>" are substituted with the value of the prompt. For example, when using flan-t5-xl on SMEP via SageMaker JumpStart, the request schema should be {"text_inputs":"<prompt>"}.
  • response_path is a JSON path, as defined by the JSONPath specification. For example, when using flan-t5-xl on SMEP, this should be generated_texts.[0]. A sketch of how both values are applied follows this list.
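
To make the behavior concrete, here is a minimal sketch (not this PR's exact implementation) of how a content handler could apply these two values, assuming the jsonpath_ng package for path evaluation:

# Illustrative sketch of applying request_schema and response_path;
# not the exact content handler added in this PR.
import json

from jsonpath_ng import parse  # assumes the jsonpath_ng package

def build_request(request_schema: str, prompt: str) -> bytes:
    """Replace every exact "<prompt>" value in the schema with the prompt."""
    def substitute(value):
        if value == "<prompt>":
            return prompt
        if isinstance(value, dict):
            return {k: substitute(v) for k, v in value.items()}
        if isinstance(value, list):
            return [substitute(v) for v in value]
        return value

    return json.dumps(substitute(json.loads(request_schema))).encode("utf-8")

def extract_response(response_path: str, response_body: bytes) -> str:
    """Evaluate the JSON path against the decoded response body."""
    matches = parse(response_path).find(json.loads(response_body))
    return matches[0].value

body = build_request('{"text_inputs": "<prompt>"}', "Hello!")
# b'{"text_inputs": "Hello!"}'
text = extract_response("generated_texts.[0]", b'{"generated_texts": ["Hi!"]}')
# 'Hi!'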

The change to the SMEP provider constructor should enable usage in magics fairly easily, but unfortunately I cannot finish this before my vacation tomorrow.

Follow-up items

  • Replicate this on SageMaker and verify that you're able to reproduce my results.
  • Add documentation for this.
  • Do additional UI testing and iron out any bugs you discover.
    • The UI state in the Chat Settings component is fairly complex; it should be refactored and decomposed into a larger set of smaller components. However, due to time constraints, this may not happen for the foreseeable future.
  • Replicate this support in magics (see "Cannot call SageMaker Endpoint in magics", #36).
    • There needs to be a standard syntax for passing these additional kwargs/fields to a model provider. Ideally, each field can be specified as a separate argument to the magic, rather than requiring each user to pass an ugly, difficult-to-read JSON blob as a single argument. One possible shape is sketched below this list.
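
Purely as illustration, such a magic syntax might look like the notebook cell below. The flag names and endpoint name are hypothetical; no syntax was settled in this PR.

%%ai sagemaker-endpoint:my-endpoint-name --region-name=us-east-1 --request-schema={"text_inputs":"<prompt>"} --response-path=generated_texts.[0]
What is JupyterLab?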

@dlqqq added the enhancement (New feature or request) label on May 30, 2023
@JasonWeill (Collaborator) left a comment

We should have at least basic info in the user docs to indicate how people can start using models in Jupyter AI via SageMaker.

@dlqqq (Member, Author) commented May 30, 2023

I've added SageMaker Endpoints to the user documentation.

@3coins (Collaborator) commented May 31, 2023

@JasonWeill
Tested successfully; this works now. One thing that tripped me up: when adding the request schema, I copied it from the Studio notebook, and it failed JSON decoding while running the prediction, because the version from Studio was a Python object rather than JSON. Created #202 for adding validation.

Here are a working request schema and response path for the flan-t5-xl model.

Request schema

{"text_inputs":"<prompt>", "max_length":50, "num_return_sequences":3, "top_k":50, "top_p":0.95, "do_sample":true}

Response path

generated_texts.[0]
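
For anyone verifying these values outside Jupyter AI, here is a quick sanity-check sketch using boto3; the endpoint name and region below are placeholders.

import json

import boto3

# Placeholders: substitute your own endpoint name and region.
client = boto3.client("sagemaker-runtime", region_name="us-east-1")

# The request schema above, with "<prompt>" substituted by hand.
body = {
    "text_inputs": "What is JupyterLab?",
    "max_length": 50,
    "num_return_sequences": 3,
    "top_k": 50,
    "top_p": 0.95,
    "do_sample": True,
}

response = client.invoke_endpoint(
    EndpointName="jumpstart-dft-hf-text2text-flan-t5-xl",  # placeholder name
    ContentType="application/json",
    Body=json.dumps(body),
)
result = json.loads(response["Body"].read())

# Equivalent to the response path generated_texts.[0]:
print(result["generated_texts"][0])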

@JasonWeill (Collaborator) commented
UI quibble: I don't love that the headings ("Language model", "Embedding model", etc) are now encased in rectangles with rounded corners. They look like buttons, but they're not operable as buttons.

[Screenshot: section headings rendered as rounded rectangles]

@3coins (Collaborator) commented Jun 2, 2023

@JasonWeill
Updated the headers; here is a screenshot.

[Screenshot: Screen Shot 2023-06-01 at 7.48.56 PM]

@3coins (Collaborator) commented Jun 2, 2023

The CI failure is not related to the changes here.

@3coins merged commit dd12385 into jupyterlab:main on Jun 2, 2023
@dlqqq deleted the custom-schemas branch on June 16, 2023 at 16:23
dbelgrod pushed a commit to dbelgrod/jupyter-ai that referenced this pull request on Jun 10, 2024:
* allow models from registry providers in chat

* support language model fields

* add json content handler for SM Endpoints

* remove console log

* rename variables for clarity

* add user documentation for SageMaker Endpoints

* update docstring

Co-authored-by: Piyush Jain <piyushjain@duck.com>

* remove redundant height attribute

Co-authored-by: Jason Weill <93281816+JasonWeill@users.noreply.github.com>

* fix memo dependencies

* Updated headers for settings panel sections

* Fixing CI failure for check-release

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
Co-authored-by: Jason Weill <93281816+JasonWeill@users.noreply.github.com>