ChatAnthropic streamEvents not working with bindTools #5906
Replies: 2 comments 1 reply
-
The reason no "on_chat_model_stream" events are emitted when using `bindTools` lies in the `_streamResponseChunks` implementation of `ChatAnthropic`. Here is a relevant excerpt:

async *_streamResponseChunks(
messages: BaseMessage[],
options: this["ParsedCallOptions"],
runManager?: CallbackManagerForLLMRun
): AsyncGenerator<ChatGenerationChunk> {
const params = this.invocationParams(options);
const formattedMessages = _formatMessagesForAnthropic(messages);
if (options.tools !== undefined && options.tools.length > 0) {
const { generations } = await this._generateNonStreaming(
messages,
params,
{
signal: options.signal,
}
);
const result = generations[0].message as AIMessage;
const toolCallChunks = result.tool_calls?.map(
(toolCall: ToolCall, index: number) => ({
name: toolCall.name,
args: JSON.stringify(toolCall.args),
id: toolCall.id,
index,
})
);
yield new ChatGenerationChunk({
message: new AIMessageChunk({
content: result.content,
additional_kwargs: result.additional_kwargs,
tool_call_chunks: toolCallChunks,
}),
text: generations[0].text,
});
} else {
const stream = await this.createStreamWithRetry({
...params,
...formattedMessages,
stream: true,
});
let usageData = { input_tokens: 0, output_tokens: 0 };
for await (const data of stream) {
if (options.signal?.aborted) {
stream.controller.abort();
throw new Error("AbortError: User aborted the request.");
}
if (data.type === "message_start") {
// eslint-disable-next-line @typescript-eslint/no-unused-vars
const { content, usage, ...additionalKwargs } = data.message;
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const filteredAdditionalKwargs: Record<string, any> = {};
for (const [key, value] of Object.entries(additionalKwargs)) {
if (value !== undefined && value !== null) {
filteredAdditionalKwargs[key] = value;
}

The `_streamResponseChunks` method handles the tool-call case through a non-streaming code path (`_generateNonStreaming`), yielding a single chunk. To emit "on_chat_model_stream" events, you would need to modify the implementation to include event emission logic within the streaming process. This could involve adding calls to an event emitter or callback manager at appropriate points in the `_streamResponseChunks` method.

For example, in the `_streamIterator` method:

async *_streamIterator(
input: BaseLanguageModelInput,
options?: CallOptions
): AsyncGenerator<OutputMessageType> {
// Subclass check required to avoid double callbacks with default implementation
if (
this._streamResponseChunks ===
BaseChatModel.prototype._streamResponseChunks
) {
yield this.invoke(input, options);
} else {
const prompt = BaseChatModel._convertInputToPromptValue(input);
const messages = prompt.toChatMessages();
const [runnableConfig, callOptions] =
this._separateRunnableConfigFromCallOptions(options);
const inheritableMetadata = {
...runnableConfig.metadata,
...this.getLsParams(callOptions),
};
const callbackManager_ = await CallbackManager.configure(
runnableConfig.callbacks,
this.callbacks,
runnableConfig.tags,
this.tags,
inheritableMetadata,
this.metadata,
{ verbose: this.verbose }
);
const extra = {
options: callOptions,
invocation_params: this?.invocationParams(callOptions),
batch_size: 1,
};
const runManagers = await callbackManager_?.handleChatModelStart(
this.toJSON(),
[messages],
runnableConfig.runId,
undefined,
extra,
undefined,
undefined,
runnableConfig.runName
);
let generationChunk: ChatGenerationChunk | undefined;
try {
for await (const chunk of this._streamResponseChunks(
messages,
callOptions,
runManagers?.[0]
)) {
chunk.message.response_metadata = {
...chunk.generationInfo,
...chunk.message.response_metadata,
};
yield chunk.message as OutputMessageType;
if (!generationChunk) {
generationChunk = chunk;
} else {
generationChunk = generationChunk.concat(chunk);
}
}
} catch (err) {
await Promise.all(
(runManagers ?? []).map((runManager) =>
runManager?.handleLLMError(err)
)
);
throw err;
}
await Promise.all(
(runManagers ?? []).map((runManager) =>
runManager?.handleLLMEnd({
// TODO: Remove cast after figuring out inheritance
generations: [[generationChunk as ChatGeneration]],
})
)
);
}
}

In this example, the callback manager (`runManagers`) is what drives the streaming callback events for each yielded chunk.
Beta Was this translation helpful? Give feedback.
-
What is the recommended way to stream the final output when using ChatAnthropic with tools?
Beta Was this translation helpful? Give feedback.
-
Checked other resources
Commit to Help
Example Code
Description
There is no output because there are no "on_chat_model_stream" events emitted. However if the same code is executed, except the
.bindTools(tools)
is removed from the model's definition, the output is streamed as expected.

System Info
"@langchain/anthropic": "^0.2.2",
"@langchain/community": "^0.2.14",
"@langchain/core": "^0.2.8",
"node": v20.11.1
"npm": 10.2.4
"platform": "linux"
Beta Was this translation helpful? Give feedback.
All reactions