Multiply prefill time by prompt length
rahul-tuli committed Sep 18, 2023
1 parent b1ed606 commit cceeddf
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion examples/chatbot_llm/chatbot.py
@@ -161,7 +161,9 @@ def main(
 print("Bot: ", response.sequences[0])
 if show_tokens_per_sec:
     times = pipeline.timer_manager.times
-    prefill_speed = 1.0 / times["engine_prompt_prefill_single"]
+    prefill_speed = (
+        1.0 * prompt_sequence_length / times["engine_prompt_prefill_single"]
+    )
     generation_speed = 1.0 / times["engine_token_generation_single"]
     print(
         f"[prefill: {prefill_speed:.2f} tokens/sec]",
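
The arithmetic behind this change can be sketched in isolation. The snippet below is a minimal illustration, not the code from chatbot.py: it assumes, as the commit implies, that the `engine_prompt_prefill_single` timing covers one pass over `prompt_sequence_length` prompt tokens, while `engine_token_generation_single` covers a single generated token. The helper `tokens_per_second`, the timing values, the prompt length, and the `[generation: ...]` label are all hypothetical and chosen only for the example.

```python
def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Return throughput in tokens/sec for num_tokens processed in elapsed_seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


# Hypothetical timings in seconds, standing in for pipeline.timer_manager.times.
times = {
    "engine_prompt_prefill_single": 0.5,
    "engine_token_generation_single": 0.25,
}
prompt_sequence_length = 16  # illustrative value, not taken from the source

# Assumed: one prefill pass handles prompt_sequence_length tokens,
# so the token count goes in the numerator.
prefill_speed = tokens_per_second(
    prompt_sequence_length, times["engine_prompt_prefill_single"]
)
# Assumed: one generation step yields one token, so the count is 1.
generation_speed = tokens_per_second(1, times["engine_token_generation_single"])

print(f"[prefill: {prefill_speed:.2f} tokens/sec]")
print(f"[generation: {generation_speed:.2f} tokens/sec]")  # label is illustrative
```

Under these assumptions, the old expression `1.0 / times[...]` would report prefill passes per second rather than tokens per second, understating prefill throughput by a factor of `prompt_sequence_length`.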
