
Support SageMaker Endpoints in chat #197

Merged: 11 commits merged into jupyterlab:main on Jun 2, 2023

Conversation

@dlqqq (Member) commented May 30, 2023

Description

Allows use of language models hosted on SageMaker Endpoints. The only constraint is that both the request and the response of the model must be in JSON.

[Screen recording: Screen.Recording.2023-05-30.at.10.44.09.AM.mov]

Providers may now declare fields, which are keyword arguments expected by the constructor. Each field is declared via a Field object, which is defined as follows:

from typing import Literal, Union

from pydantic import BaseModel

class TextField(BaseModel):
    """A single-line text input rendered in the chat settings UI."""
    type: Literal["text"] = "text"
    key: str    # keyword argument name passed to the provider constructor
    label: str  # human-readable label shown in the UI

class MultilineTextField(BaseModel):
    """A multi-line text input, e.g. for JSON request schemas."""
    type: Literal["text-multiline"] = "text-multiline"
    key: str
    label: str

Field = Union[TextField, MultilineTextField]

The backend stores the values of each field under config.fields.[<model-id>], where <model-id> is the model's global model ID. The backend also automatically reads this config object and passes the fields for a model to the model provider's constructor as keyword arguments.
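
For illustration, the persisted values might look roughly like this. This is a hypothetical sketch: the surrounding structure is not spelled out in this PR, and the endpoint name is a placeholder.

# Hypothetical sketch of the stored config. Global model IDs take the
# form "<provider-id>:<model-id>"; everything else here is illustrative.
config = {
    "fields": {
        "sagemaker-endpoint:my-endpoint": {
            "region_name": "us-east-1",
            "request_schema": '{"text_inputs": "<prompt>"}',
            "response_path": "generated_texts.[0]",
        },
    },
}

# The backend then, in effect, constructs the provider like this:
# SmEndpointProvider(**config["fields"]["sagemaker-endpoint:my-endpoint"], ...)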

In the case of SageMaker Endpoints (SMEP), three additional fields are declared:

class SmEndpointProvider(BaseProvider, SagemakerEndpoint):
    id = "sagemaker-endpoint"
    ...
    fields = [
        TextField(
            key="region_name",
            label="Region name",
        ),
        MultilineTextField(
            key="request_schema",
            label="Request schema",
        ),
        TextField(
            key="response_path",
            label="Response path",
        )
    ]

The constructor of the SMEP provider has been modified to accept two new keyword arguments: request_schema and response_path.

  • request_schema is a JSON string. Any values that match the exact literal "<prompt>" are substituted with the value of the prompt. For example, when using flan-t5-xl on SMEP via SageMaker JumpStart, the request schema should be {"text_inputs":"<prompt>"}.
  • response_path is a JSON path, as defined by the JSONPath specification. For example, when using flan-t5-xl on SMEP, this should be generated_texts.[0]. A sketch of how both values are applied follows this list.
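
To make the behavior concrete, here is a minimal sketch (not this PR's exact implementation) of how a content handler could apply these two values, assuming the jsonpath_ng package for path evaluation:

# Illustrative sketch of applying request_schema and response_path;
# not the exact content handler added in this PR.
import json

from jsonpath_ng import parse  # assumes the jsonpath_ng package

def build_request(request_schema: str, prompt: str) -> bytes:
    """Replace every exact "<prompt>" value in the schema with the prompt."""
    def substitute(value):
        if value == "<prompt>":
            return prompt
        if isinstance(value, dict):
            return {k: substitute(v) for k, v in value.items()}
        if isinstance(value, list):
            return [substitute(v) for v in value]
        return value

    return json.dumps(substitute(json.loads(request_schema))).encode("utf-8")

def extract_response(response_path: str, response_body: bytes) -> str:
    """Evaluate the JSON path against the decoded response body."""
    matches = parse(response_path).find(json.loads(response_body))
    return matches[0].value

body = build_request('{"text_inputs": "<prompt>"}', "Hello!")
# b'{"text_inputs": "Hello!"}'
text = extract_response("generated_texts.[0]", b'{"generated_texts": ["Hi!"]}')
# 'Hi!'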

The change to the SMEP provider constructor should enable usage in magics fairly easily, but unfortunately I cannot finish this before my vacation tomorrow.

Follow-up items

  • Replicate this on SageMaker and verify that you're able to reproduce my results.
  • Add documentation for this.
  • Do additional UI testing and iron out any bugs you discover.
    • The UI state in the Chat Settings component is fairly complex; it should be refactored and decomposed into a larger set of smaller components. However, due to time constraints, this may not happen for the foreseeable future.
  • Replicate this support in magics (see "Cannot call SageMaker Endpoint in magics", #36).
    • There needs to be a standard syntax for passing these additional kwargs/fields to a model provider. Ideally, each field can be specified as a separate argument to the magic, rather than requiring each user to pass an ugly, difficult-to-read JSON blob as a single argument. One possible shape is sketched below this list.
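
Purely as illustration, such a magic syntax might look like the notebook cell below. The flag names and endpoint name are hypothetical; no syntax was settled in this PR.

%%ai sagemaker-endpoint:my-endpoint-name --region-name=us-east-1 --request-schema={"text_inputs":"<prompt>"} --response-path=generated_texts.[0]
What is JupyterLab?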

@dlqqq added the enhancement (New feature or request) label on May 30, 2023
@JasonWeill (Collaborator) left a comment

We should have at least basic info in the user docs to indicate how people can start using models in Jupyter AI via SageMaker.

@dlqqq (Member, Author) commented May 30, 2023

I've added SageMaker Endpoints to the user documentation.

@3coins (Collaborator) commented May 31, 2023

@JasonWeill
Tested successfully; this works now. One thing that tripped me up: when adding the request schema, I copied it from the Studio notebook, and it failed JSON decoding while running the prediction, because the version from Studio was a Python object rather than JSON. Created #202 for adding validation.

Here are a working request schema and response path for the flan-t5-xl model.

Request schema

{"text_inputs":"<prompt>", "max_length":50, "num_return_sequences":3, "top_k":50, "top_p":0.95, "do_sample":true}

Response path

generated_texts.[0]
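
For anyone verifying these values outside Jupyter AI, here is a quick sanity-check sketch using boto3; the endpoint name and region below are placeholders.

import json

import boto3

# Placeholders: substitute your own endpoint name and region.
client = boto3.client("sagemaker-runtime", region_name="us-east-1")

# The request schema above, with "<prompt>" substituted by hand.
body = {
    "text_inputs": "What is JupyterLab?",
    "max_length": 50,
    "num_return_sequences": 3,
    "top_k": 50,
    "top_p": 0.95,
    "do_sample": True,
}

response = client.invoke_endpoint(
    EndpointName="jumpstart-dft-hf-text2text-flan-t5-xl",  # placeholder name
    ContentType="application/json",
    Body=json.dumps(body),
)
result = json.loads(response["Body"].read())

# Equivalent to the response path generated_texts.[0]:
print(result["generated_texts"][0])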

@JasonWeill (Collaborator) commented
UI quibble: I don't love that the headings ("Language model", "Embedding model", etc) are now encased in rectangles with rounded corners. They look like buttons, but they're not operable as buttons.

[Screenshot: section headings rendered as rounded rectangles]

@3coins (Collaborator) commented Jun 2, 2023

@JasonWeill
Updated the headers; here is a screenshot.

[Screenshot: Screen Shot 2023-06-01 at 7.48.56 PM]

@3coins (Collaborator) commented Jun 2, 2023

The CI failure is not related to the changes here.

@3coins merged commit dd12385 into jupyterlab:main on Jun 2, 2023
@dlqqq deleted the custom-schemas branch on June 16, 2023 at 16:23
dbelgrod pushed a commit to dbelgrod/jupyter-ai that referenced this pull request on Jun 10, 2024:
* allow models from registry providers in chat

* support language model fields

* add json content handler for SM Endpoints

* remove console log

* rename variables for clarity

* add user documentation for SageMaker Endpoints

* update docstring

Co-authored-by: Piyush Jain <piyushjain@duck.com>

* remove redundant height attribute

Co-authored-by: Jason Weill <93281816+JasonWeill@users.noreply.github.com>

* fix memo dependencies

* Updated headers for settings panel sections

* Fixing CI failure for check-release

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
Co-authored-by: Jason Weill <93281816+JasonWeill@users.noreply.github.com>