-
Notifications
You must be signed in to change notification settings - Fork 205
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* add talkbot-pc code * fix llm int4 in talkingbot --------- Co-authored-by: VincyZhang <wenxin.zhang@intel.com>
- Loading branch information
1 parent
53fbeb4
commit be2a267
Showing
3 changed files
with
247 additions
and
4 deletions.
There are no files selected for viewing
198 changes: 198 additions & 0 deletions
198
...xtension_for_transformers/neural_chat/examples/talkingbot_pc/build_talkingbot_on_pc.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,198 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# EndToEnd TalkingBot on PC client (Windows)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"> make sure you are running in a conda environment with Python 3.10\n", | ||
"\n", | ||
"[Intel® Extension for Transformers Neural Chat](https://github.com/intel/intel-extension-for-transformers/tree/main/intel_extension_for_transformers/neural_chat) provides a lot of plugins to meet different users' scenarios. In this notebook we will show you how to create a TalkingBot on your local laptop with **Intel CPU** (no GPU needed).\n", | ||
"\n", | ||
"Behind the scene, a TalkingBot is composed of a pipeline of\n", | ||
"1. recognize user's prompt audio and convert to text\n", | ||
"2. text understanding and question answering by Large Language Models\n", | ||
"2. convert answer text to speech\n", | ||
"\n", | ||
"This is a notebook to let you know how to create such a TalkingBot on PC." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Audio To Text" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"!curl -O https://github.com/raw/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from intel_extension_for_transformers.neural_chat.pipeline.plugins.audio.asr import AudioSpeechRecognition" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from IPython.display import Audio\n", | ||
"Audio(r\"./sample.wav\", rate=16000)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"asr = AudioSpeechRecognition(model_name_or_path=\"openai/whisper-tiny\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"text = asr.audio2text(r\"./sample.wav\")\n", | ||
"print(text)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## LLM" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Directly load given int4 model to do inference\n", | ||
"\n", | ||
"Here for quick demo, we just use a given int4 model to generate text. If you want to convert your int4 model manually, please refer to next cell." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from intel_extension_for_transformers.llm.runtime.graph import Model\n", | ||
"model = Model()\n", | ||
"model.bin_file = r\"mpt_q4_0.bin\"\n", | ||
"model.init_from_bin(\"mpt\", model.bin_file, max_new_tokens=32, seed=12)\n", | ||
"prompt = text\n", | ||
"output = model.generate(prompt)\n", | ||
"print(output)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Convert int4 model to do inference" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from intel_extension_for_transformers.transformers import AutoModel, WeightOnlyQuantConfig\n", | ||
"model_name = r\"THUDM/ChatGLM2-6B\"\n", | ||
"woq_config = WeightOnlyQuantConfig(compute_dtype=\"int8\", weight_dtype=\"int4\")\n", | ||
"model = AutoModel.from_pretrained(model_name, quantization_config=woq_config, use_llm_runtime=True, trust_remote_code=True)\n", | ||
"prompt = text\n", | ||
"output = model.generate(prompt, max_new_tokens=32)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Text To Speech" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from intel_extension_for_transformers.neural_chat.pipeline.plugins.audio.tts import TextToSpeech" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"tts = TextToSpeech()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"result_path = tts.text2speech(\"Hello there, I am your Talking Bot!\", \"output.wav\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from IPython.display import Audio\n", | ||
"Audio(result_path, rate=16000)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "itrex", | ||
"language": "python", | ||
"name": "itrex" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.13" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
39 changes: 39 additions & 0 deletions
39
intel_extension_for_transformers/neural_chat/requirements_pc.txt
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
transformers==4.32.1 | ||
peft | ||
fschat | ||
num2words | ||
speechbrain | ||
paddlepaddle | ||
paddlespeech==1.4.1 | ||
shortuuid | ||
gptcache | ||
evaluate | ||
pydub | ||
python-multipart | ||
PyPDF2 | ||
langchain | ||
python-docx | ||
scikit-learn | ||
farm-haystack | ||
librosa | ||
beautifulsoup4 | ||
InstructorEmbedding | ||
chromadb | ||
fastapi | ||
pydantic | ||
starlette | ||
yacs | ||
uvicorn | ||
optimum | ||
optimum-intel | ||
sentence_transformers | ||
unstructured | ||
markdown | ||
rouge_score | ||
openpyxl | ||
numpy==1.23.5 | ||
tiktoken==0.4.0 | ||
lm_eval | ||
--extra-index-url https://download.pytorch.org/whl/cpu | ||
torch | ||
torchaudio |