[NeuralChat] Integrate photoai backend into restful API (#478)

intel · Oct 30, 2023 · d7a1d8d
1 parent d9d0a32
commit d7a1d8d

Showing 19 changed files with 1,702 additions and 15 deletions.
diff --git a/.github/workflows/script/unitTest/run_unit_test_neuralchat.sh b/.github/workflows/script/unitTest/run_unit_test_neuralchat.sh
@@ -73,6 +73,10 @@ function main() {
     apt-get update
     apt-get install ffmpeg -y
     apt-get install lsof
+    apt-get install -y libgl1
+    apt-get install -y libgl1-mesa-glx
+    apt-get install -y libgl1-mesa-dev
+    apt-get install libsm6 libxext6 -y
     wget http://nz2.archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2.19_amd64.deb
     dpkg -i libssl1.1_1.1.1f-1ubuntu2.19_amd64.deb
     python -m pip install --upgrade --force-reinstall torch
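
The packages added in this hunk provide shared libraries that OpenCV typically needs at import time (the backend README mentions `libGL.so.1`). A minimal sketch for checking whether they are visible to the dynamic linker follows. The package-to-library mapping and the helper name `missing_shared_libs` are assumptions made for illustration and are not part of this commit.

```python
import ctypes.util

# Assumed mapping from the apt packages added above to the short library
# names that ctypes.util.find_library expects (no "lib" prefix, no suffix).
REQUIRED_LIBS = {"libgl1": "GL", "libsm6": "SM", "libxext6": "Xext"}


def missing_shared_libs(lib_names):
    """Return the names from lib_names that find_library cannot locate."""
    return [name for name in lib_names if ctypes.util.find_library(name) is None]


if __name__ == "__main__":
    missing = missing_shared_libs(REQUIRED_LIBS.values())
    print("missing libraries:", missing or "none")
```

An empty result suggests the libraries are installed; it does not guarantee that OpenCV itself imports cleanly.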

diff --git a/intel_extension_for_transformers/neural_chat/examples/photo_ai/README.md b/intel_extension_for_transformers/neural_chat/examples/photo_ai/README.md
@@ -0,0 +1,8 @@
+Welcome to Photo AI! This example shows how to deploy the Photo AI system and guides you through setting up both the backend and frontend components. You can deploy it on various platforms, including Intel Xeon Scalable Processors, Habana's Gaudi processors (HPU), Intel Data Center GPU and Client GPU, and Nvidia Data Center GPU and Client GPU.
+
+| Section              | Link                      |
+| ---------------------| --------------------------|
+| Backend Setup        | [Backend README](./backend/README.md) |
+| Frontend Setup       | [Frontend README](./frontend/README.md) |
+
+
diff --git a/intel_extension_for_transformers/neural_chat/examples/photo_ai/backend/README.md b/intel_extension_for_transformers/neural_chat/examples/photo_ai/backend/README.md
@@ -0,0 +1,99 @@
+This README guides you through setting up the backend for a Photo AI demo using the NeuralChat framework. You can deploy it on various platforms, including Intel Xeon Scalable Processors, Habana's Gaudi processors (HPU), Intel Data Center GPU and Client GPU, and Nvidia Data Center GPU and Client GPU.
+
+
+# Setup Environment
+
+
+## Setup Conda
+
+First, you need to install and configure the Conda environment:
+
+```shell
+# Download and install Miniconda
+wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
+bash Miniconda*.sh
+source ~/.bashrc
+```
+
+## Install numactl
+
+Next, install the numactl library:
+
+```shell
+sudo apt install numactl
+```
+
+## Install Python dependencies
+
+Install the following Python dependencies using Conda:
+
+```shell
+conda install astunparse ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses -y
+conda install jemalloc gperftools -c conda-forge -y
+conda install git-lfs -y
+# install libGL.so.1 for opencv
+sudo apt-get update
+sudo apt-get install -y libgl1-mesa-glx
+```
+
+Install other dependencies using pip:
+
+```bash
+pip install -r ../../../requirements.txt
+```
+
+## Install Models
+```shell
+git-lfs install
+# download llama-2 model for NER plugin
+git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
+# download spacy model for NER post process
+python -m spacy download en_core_web_lg
+```
+
+
+# Setup Database
+## Install MySQL
+```shell
+# install mysql
+sudo apt-get install mysql-server
+# start mysql server
+sudo systemctl start mysql
+```
+
+## Create Tables
+```shell
+cd ../../../utils/database/
+# login mysql
+mysql
+source ./init_db_ai_photos.sql
+```
+
+## Create Image Database
+```shell
+mkdir /home/nfs_images
+export IMAGE_SERVER_IP="your.server.ip"
+```
+
+# Configure photoai.yaml
+
+You can customize the configuration file 'photoai.yaml' to match your environment setup. Here's a table to help you understand the configurable options:
+
+| Item               | Value                     |
+| ------------------ | ------------------------- |
+| host               | 127.0.0.1                 |
+| port               | 9000                      |
+| model_name_or_path | "./Llama-2-7b-chat-hf"    |
+| device             | "auto"                    |
+| asr.enable         | true                      |
+| tts.enable         | true                      |
+| ner.enable         | true                      |
+| tasks_list         | ['voicechat', 'photoai']  |
+
+
+# Run the PhotoAI server
+To start the PhotoAI server, use the following command:
+
+```shell
+nohup bash run.sh &
+```
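Because `nohup bash run.sh &` returns immediately, it can be useful to confirm that the server is actually listening before pointing a frontend at it. The sketch below only tests whether a TCP connection succeeds. The helper name `port_is_open` is hypothetical, and the host and port are the values from the README table.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1 and 9000 are the host and port listed in the README table.
    print("server reachable:", port_is_open("127.0.0.1", 9000))
```

A successful connection only shows that something is bound to the port; it says nothing about whether the models have finished loading.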
diff --git a/intel_extension_for_transformers/neural_chat/examples/photo_ai/backend/photoai.py b/intel_extension_for_transformers/neural_chat/examples/photo_ai/backend/photoai.py
@@ -0,0 +1,29 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) 2023 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+from intel_extension_for_transformers.neural_chat import NeuralChatServerExecutor
+
+def main():
+    server_executor = NeuralChatServerExecutor()
+    server_executor(
+        config_file="./photoai.yaml",
+        log_file="./photoai.log")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/intel_extension_for_transformers/neural_chat/examples/photo_ai/backend/photoai.yaml b/intel_extension_for_transformers/neural_chat/examples/photo_ai/backend/photoai.yaml
@@ -0,0 +1,53 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) 2023 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# This is the parameter configuration file for NeuralChat Serving.
+
+#################################################################################
+#                             SERVER SETTING                                    #
+#################################################################################
+host: 0.0.0.0
+port: 9000
+
+model_name_or_path: "meta-llama/Llama-2-7b-chat-hf"
+device: "auto"
+
+asr:
+    enable: true
+    args:
+        device: "cpu"
+        model_name_or_path: "openai/whisper-small"
+        bf16: false
+
+tts:
+    enable: true
+    args:
+        device: "cpu"
+        voice: "default"
+        stream_mode: true
+        output_audio_path: "./output_audio"
+
+ner:
+    enable: true
+    args:
+        device: "cpu"
+        model_path: "./Llama-2-7b-chat-hf"
+        spacy_model: "en_core_web_lg"
+        bf16: true
+
+
+tasks_list: ['voicechat', 'photoai']
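To illustrate how the `enable` flags in this file can be read, the sketch below mirrors the plugin sections as a plain Python dict and lists the ones that are switched on. It is an assumption-laden illustration, not how NeuralChat parses its configuration: the dict is hand-copied (with `args` omitted) and `enabled_plugins` is a hypothetical helper.

```python
# Hand-copied mirror of photoai.yaml; the args sub-sections are omitted.
CONFIG = {
    "host": "0.0.0.0",
    "port": 9000,
    "asr": {"enable": True},
    "tts": {"enable": True},
    "ner": {"enable": True},
    "tasks_list": ["voicechat", "photoai"],
}


def enabled_plugins(config):
    """Return the names of sections that are mappings with enable set to True."""
    return [
        name
        for name, section in config.items()
        if isinstance(section, dict) and section.get("enable") is True
    ]


if __name__ == "__main__":
    print(enabled_plugins(CONFIG))  # prints ['asr', 'tts', 'ner']
```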
diff --git a/intel_extension_for_transformers/neural_chat/examples/photo_ai/backend/run.sh b/intel_extension_for_transformers/neural_chat/examples/photo_ai/backend/run.sh
@@ -0,0 +1,38 @@
+#!/usr/bin/env bash
+# -*- coding: utf-8 -*-
+#
+# Copyright (c) 2023 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Kill any existing photoai process before re-running
+ps -ef | grep 'photoai' | grep -v grep | awk '{print $2}' | xargs -r kill -9
+
+# KMP
+export KMP_BLOCKTIME=1
+export KMP_SETTINGS=1
+export KMP_AFFINITY=granularity=fine,compact,1,0
+
+# OMP
+export OMP_NUM_THREADS=56
+export LD_PRELOAD=${CONDA_PREFIX}/lib/libiomp5.so
+
+# tc malloc
+export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
+
+# environment variables
+export MYSQL_PASSWORD="root"
+export MYSQL_HOST="127.0.0.1"
+export MYSQL_DB="ai_photos"
+
+numactl -l -C 0-55 python -m photoai 2>&1 | tee run.log
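The `MYSQL_*` variables exported above are presumably consumed by the database layer, although that code is not shown in this diff. A hedged sketch of reading them with the same defaults follows. The helper `mysql_settings` is hypothetical and its fallback values are copied from `run.sh`.

```python
import os

# Defaults copied from run.sh. How the backend actually reads these
# variables is not shown in this diff, so this helper is an assumption.
DEFAULTS = {
    "MYSQL_PASSWORD": "root",
    "MYSQL_HOST": "127.0.0.1",
    "MYSQL_DB": "ai_photos",
}


def mysql_settings(env=None):
    """Return the MySQL settings from env, falling back to the run.sh values."""
    env = os.environ if env is None else env
    return {key: env.get(key, default) for key, default in DEFAULTS.items()}


if __name__ == "__main__":
    settings = mysql_settings()
    # Avoid printing the password itself.
    print({k: v for k, v in settings.items() if k != "MYSQL_PASSWORD"})
```

Passing an explicit mapping instead of relying on `os.environ` keeps the helper easy to test.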
diff --git a/intel_extension_for_transformers/neural_chat/examples/photo_ai/frontend b/intel_extension_for_transformers/neural_chat/examples/photo_ai/frontend
diff --git a/intel_extension_for_transformers/neural_chat/pipeline/plugins/ner/ner.py b/intel_extension_for_transformers/neural_chat/pipeline/plugins/ner/ner.py
@@ -26,7 +26,6 @@
     TextIteratorStreamer,
     AutoConfig,
 )
-import intel_extension_for_pytorch as intel_ipex
 from .utils.utils import (
     enforce_stop_tokens,
     get_current_time
@@ -41,11 +40,17 @@ class NamedEntityRecognition():
         Set bf16=True if you want to inference with bf16 model.
     """
 
-    def __init__(self, model_path="./Llama-2-7b-chat-hf/", spacy_model="en_core_web_lg", bf16: bool=False) -> None:
+    def __init__(self, 
+                 model_path="meta-llama/Llama-2-7b-chat-hf", 
+                 spacy_model="en_core_web_lg", 
+                 bf16: bool=False, 
+                 device="cpu") -> None:
         # initialize tokenizer and models
         self.nlp = spacy.load(spacy_model)
         config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
         config.init_device = 'cuda:0' if torch.cuda.is_available() else "cpu"
+        self.device = device
+        self.bf16 = False
         self.tokenizer = AutoTokenizer.from_pretrained(
             model_path,
             use_fast=False if (re.search("llama", model_path, re.IGNORECASE)
@@ -59,9 +64,15 @@ def __init__(self, model_path="./Llama-2-7b-chat-hf/", spacy_model="en_core_web_
             device_map="auto",
             trust_remote_code=True
         )
-        self.bf16 = bf16
+        # make sure ipex is available on current server
+        try:
+            import intel_extension_for_pytorch as intel_ipex
+            self.is_ipex_available = True
+        except ImportError:
+            self.is_ipex_available = False
         # optimize model with ipex if bf16
-        if bf16:
+        if bf16 and self.is_ipex_available:
+            self.bf16 = bf16
             self.model = intel_ipex.optimize(
                 self.model.eval(),
                 dtype=torch.bfloat16,
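
The change above turns `intel_extension_for_pytorch` into an optional dependency: bf16 optimization is applied only when it was requested and the package can be imported. The same decision can be sketched without importing the package at all. `module_available` and `effective_bf16` are hypothetical names used for illustration and do not appear in `ner.py`.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called name can be found."""
    return importlib.util.find_spec(name) is not None


def effective_bf16(requested, ipex_module="intel_extension_for_pytorch"):
    """bf16 takes effect only when requested and the IPEX module is installed."""
    return bool(requested) and module_available(ipex_module)


if __name__ == "__main__":
    print("bf16 in effect:", effective_bf16(True))
```

Unlike the `try`/`except ImportError` in the diff, `find_spec` does not execute the package, so a package that is installed but broken would still count as available here.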

diff --git a/intel_extension_for_transformers/neural_chat/pipeline/plugins/ner/ner_int.py b/intel_extension_for_transformers/neural_chat/pipeline/plugins/ner/ner_int.py
@@ -38,10 +38,11 @@ class NamedEntityRecognitionINT():
     """
 
     def __init__(self, 
-                 model_path="/home/tme/Llama-2-7b-chat-hf/", 
+                 model_path="meta-llama/Llama-2-7b-chat-hf", 
                  spacy_model="en_core_web_lg", 
                  compute_dtype="fp32", 
-                 weight_dtype="int8") -> None:
+                 weight_dtype="int8",
+                 device="cpu") -> None:
         self.nlp = spacy.load(spacy_model)
         config = WeightOnlyQuantConfig(compute_dtype=compute_dtype, weight_dtype=weight_dtype)
         self.tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

diff --git a/intel_extension_for_transformers/neural_chat/requirements.txt b/intel_extension_for_transformers/neural_chat/requirements.txt
@@ -39,6 +39,9 @@ tiktoken==0.4.0
 lm_eval
 accelerate
 cchardet
+pymysql
+deepface
+exifread
 spacy
 neural-compressor==2.3.1
 pymysql
diff --git a/intel_extension_for_transformers/neural_chat/requirements_cpu.txt b/intel_extension_for_transformers/neural_chat/requirements_cpu.txt
@@ -42,3 +42,6 @@ torchaudio==2.1.0
 spacy
 neural-compressor==2.3.1
 pymysql
+deepface
+exifread
+
diff --git a/intel_extension_for_transformers/neural_chat/requirements_hpu.txt b/intel_extension_for_transformers/neural_chat/requirements_hpu.txt
@@ -35,3 +35,8 @@ tiktoken==0.4.0
 lm_eval
 spacy
 neural-compressor==2.3.1
+intel_extension_for_pytorch
+pymysql
+deepface
+exifread
+
diff --git a/intel_extension_for_transformers/neural_chat/server/README.md b/intel_extension_for_transformers/neural_chat/server/README.md
@@ -10,7 +10,7 @@ neuralchat_server help
 ### Start the server
 - Command Line (Recommended)
 
-NeuralChat provides a default chatbot configuration in `./conf/neuralchat.yaml`. User could customize the behavior of this chatbot by modifying the value of these fields in the configuration file to specify which LLM model and plugins to be used.
+NeuralChat provides a default chatbot configuration in `./config/neuralchat.yaml`. Users can customize the behavior of this chatbot by modifying the values of these fields in the configuration file to specify which LLM model and plugins to use.
 
 | Fields                    | Sub Fields               | Default Values                             | Possible Values                  |
 | ------------------------- | ------------------------ | --------------------------------------- | --------------------------------- |
@@ -42,18 +42,27 @@ NeuralChat provides a default chatbot configuration in `./conf/neuralchat.yaml`.
 |                           | args.process             | false                                   | true, false                       |
 | cache                     | enable                   | false                                   | true, false                       |
 |                           | args.config_dir          | "../../pipeline/plugins/caching/cache_config.yaml" | A valid directory path |
-|                           | args.embedding_model_dir | "hkunlp/instructor-large"              | A valid directory path            |
+|                           | args.embedding_model_dir | "hkunlp/instructor-large"              | A valid directory path             |
 | safety_checker            | enable                   | false                                   | true, false                       |
-| tasks_list                |                          | ['textchat', 'retrieval']              | List of task names, including 'textchat', 'voicechat', 'retrieval', 'text2image', 'finetune'                |
+| ner                       | enable                   | false                                   | true, false                       |
+|                           | args.model_path          | "meta-llama/Llama-2-7b-chat-hf"        | A valid directory path of llm model   |
+|                           | args.spacy_model         | "en_core_web_lg"                       | A valid name of downloaded spacy model      |
+|                           | args.bf16                | false                                   | true, false                          |
+| ner_int                   | enable                   | false                                   | true, false                          |
+|                           | args.model_path          | "meta-llama/Llama-2-7b-chat-hf"        | A valid directory path of llm model      |
+|                           | args.spacy_model         | "en_core_web_lg"                       | A valid name of downloaded spacy model   |
+|                           | args.compute_dtype       | "fp32"                                  | "fp32", "int8"                       |
+|                           | args.weight_dtype        | "int8"                                  | "int8", "int4"                       |
+| tasks_list                |                          | ['textchat', 'retrieval']              | List of task names, including 'textchat', 'voicechat', 'retrieval', 'text2image', 'finetune', 'photoai'                |
 
 
 
-First set the service-related configuration parameters, similar to `./conf/neuralchat.yaml`. Set `tasks_list`, which represents the supported tasks included in the service to be started.
+First set the service-related configuration parameters, similar to `./config/neuralchat.yaml`. Set `tasks_list`, which represents the supported tasks included in the service to be started.
 **Note:** If the service can be started normally in the container, but the client access IP is unreachable, you can try to replace the `host` address in the configuration file with the local IP address.
 
 Then start the service:
 ```bash
-neuralchat_server start --config_file ./server/conf/neuralchat.yaml
+neuralchat_server start --config_file ./server/config/neuralchat.yaml
 ```
 
 - Python API
@@ -62,7 +71,7 @@ from neuralchat.server.neuralchat_server import NeuralChatServerExecutor
 
 server_executor = NeuralChatServerExecutor()
 server_executor(
-    config_file="./conf/neuralchat.yaml", 
+    config_file="./config/neuralchat.yaml", 
     log_file="./log/neuralchat.log")
 ```
 

diff --git a/intel_extension_for_transformers/neural_chat/server/config/neuralchat.yaml b/intel_extension_for_transformers/neural_chat/server/config/neuralchat.yaml
@@ -77,17 +77,19 @@ safety_checker:
 ner:
     enable: false
     args:
+        device: "cpu"
         model_path: "meta-llama/Llama-2-7b-chat-hf"
         spacy_model: "en_core_web_lg"
         bf16: False
 
 ner_int:
     enable: false
     args:
+        device: "cpu"
         model_path: "meta-llama/Llama-2-7b-chat-hf"
         spacy_model: "en_core_web_lg"
         compute_dtype: "fp32"
         weight_dtype: "int8"
 
-# task choices = ['textchat', 'voicechat', 'retrieval', 'text2image', 'finetune']
+# task choices = ['textchat', 'voicechat', 'retrieval', 'text2image', 'finetune', 'photoai']
 tasks_list: ['textchat', 'retrieval']
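
A typo in `tasks_list` is easy to make, so a small pre-flight check against the documented task choices may help. The sketch below is an illustration only: `unknown_tasks` is a hypothetical helper, and `TASK_CHOICES` is copied from the comment in `neuralchat.yaml`.

```python
# Task names copied from the "task choices" comment in neuralchat.yaml.
TASK_CHOICES = (
    "textchat",
    "voicechat",
    "retrieval",
    "text2image",
    "finetune",
    "photoai",
)


def unknown_tasks(tasks_list, choices=TASK_CHOICES):
    """Return the entries of tasks_list that are not documented task names."""
    return [task for task in tasks_list if task not in choices]


if __name__ == "__main__":
    print(unknown_tasks(["voicechat", "photo_ai"]))  # prints ['photo_ai']
```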