Support execution of metrics on a remote host #568

Merged 125 commits on Mar 4, 2024
Changes from 105 commits

Commits
73bbc6b
adding service and remote metric
assaftibm Jan 15, 2024
d4eb457
new MetricPipeline metrics.rag.mrr
matanor Jan 18, 2024
3c7f0d2
adjust field names
matanor Jan 18, 2024
f7a5aba
update Perplexity implementation
matanor Jan 18, 2024
5118513
add metrics.rag.context_relevancy
matanor Jan 18, 2024
70d0d4c
fix init
matanor Jan 18, 2024
ce4c52f
a new tool for running metrics on a dataframe
matanor Jan 18, 2024
cc48004
adding more metrics to rag and to evaluate
assaftibm Jan 21, 2024
06ceade
fix answer relevance
assaftibm Jan 22, 2024
0745cee
fix instance score
assaftibm Jan 22, 2024
cfaafc5
update to relative imports, as needed within unitxt
matanor Jan 22, 2024
041e8da
flip the order such that the prediction (e.g. the retrieved context) …
matanor Jan 22, 2024
c43864a
add comments
matanor Jan 22, 2024
bd47f0e
rename to eval_utils.py
matanor Jan 22, 2024
7d81efc
add an import of eval_utils.py
matanor Jan 22, 2024
f005398
save reference scores in a list
matanor Jan 22, 2024
c1379d2
add expected reference_scores to perplexity.py expected outputs
matanor Jan 22, 2024
4d4281f
add context_perplexity
matanor Jan 22, 2024
6d81ee7
add context_perplexity
matanor Jan 22, 2024
9491aa7
add new evaluate_rag example
matanor Jan 22, 2024
cc4e34a
add import of eval_utils
matanor Jan 22, 2024
a7712b0
add comments explaining review questions
matanor Jan 23, 2024
7bf9206
add context_preplexity.json
matanor Jan 23, 2024
df0985a
fix context perplexity
assaftibm Jan 23, 2024
a09caf5
service
assaftibm Jan 25, 2024
d705384
merge
assaftibm Jan 25, 2024
1e4d65e
Merge branch 'main' into service
matanor Jan 25, 2024
0349281
Merge remote-tracking branch 'remotes/origin/main' into service
matanor Jan 29, 2024
8de9fae
move RemoteMetric from src/unitxt/test_utils/metrics.py to src/unitxt…
matanor Jan 30, 2024
82b68a4
update the api of the metric service, and the result returned by the …
matanor Jan 30, 2024
a454748
modify RemoteMetric not to inherit from GlobalMetric. to avoid runnin…
matanor Jan 30, 2024
09b6f18
add log prints to metric service
matanor Feb 1, 2024
bfc6411
support artifact_identifier in Artifact objects
matanor Feb 1, 2024
3f8fb9c
support for remote metrics
matanor Feb 1, 2024
3e12cb0
reorganize code
matanor Feb 1, 2024
cd8c025
add docstrings
matanor Feb 1, 2024
d18da61
add tests for reading the remote metrics config vars from the environ…
matanor Feb 1, 2024
784be4f
update update_instance_scores() and set_global_score()
matanor Feb 1, 2024
ed18abc
add missing return statement in wrap_inner_metric_pipeline_metric()
matanor Feb 4, 2024
af1e6b8
set fixed version to metric service requirements
matanor Feb 4, 2024
5fdb92b
assume the dockerfile runs from unitxt/service/metric
matanor Feb 4, 2024
0728dfb
use same dir imports for the service code
matanor Feb 4, 2024
6da9061
add metric service related command to make file
matanor Feb 4, 2024
14bb1eb
update HF env params location
matanor Feb 4, 2024
44c0e9f
add init_logger()
matanor Feb 4, 2024
db8a4ef
support build_number and release_version in metric service image names
matanor Feb 4, 2024
4e646d1
report request handling time in INFO logging level
matanor Feb 4, 2024
bc09936
update metric service commands to accept only one param tag_name
matanor Feb 4, 2024
fade3da
remove --proxy-headers
matanor Feb 4, 2024
761bc2c
restore --proxy-headers, since removing it did not solve the authenti…
matanor Feb 4, 2024
9d1a1b7
add locking around metric computation
matanor Feb 5, 2024
c656817
run main.py from docker
matanor Feb 5, 2024
875c739
downgrade to cuda11.6.1, to support running with older cuda drivers
matanor Feb 5, 2024
d75a103
use nvidia/cuda:12.1.1-cudnn8-devel-ubuntu20.04
matanor Feb 5, 2024
8a0773e
update ubuntu setup
matanor Feb 5, 2024
e312e36
use 11.8 cuda in image
matanor Feb 6, 2024
623971a
move unitxt imports
matanor Feb 6, 2024
b5c4b1b
update docker using dockerfile that works for another service
matanor Feb 6, 2024
1f33658
remove unitxt from requirements.txt
matanor Feb 6, 2024
dbaca07
restore installation of requirements
matanor Feb 6, 2024
4534e74
use dockerfile from sbert service
matanor Feb 6, 2024
032bd45
restore unitxt to requirements.txt
matanor Feb 6, 2024
f2ea3d1
comment out conda install commands
matanor Feb 6, 2024
3e234f4
fix copying of code
matanor Feb 6, 2024
9af95dc
add installation of cffi to fix "pyo3_runtime.PanicException: Python …
matanor Feb 6, 2024
5644d7a
add conda install of torch 1.12.1
matanor Feb 6, 2024
18db41a
move conda install to start of script
matanor Feb 7, 2024
c88d527
add comment
matanor Feb 7, 2024
2f17333
support GPU usage in compute_batch()
matanor Feb 7, 2024
6d97979
set batch_size to 16 in BertScore
matanor Feb 7, 2024
ffb7931
use latest unitxt in metric service requirements.txt
matanor Feb 7, 2024
7d8e1f7
replace unitxt requirement installation: remove it from the requireme…
matanor Feb 7, 2024
ea3d0b4
explicitly set the device for the Reward metric
matanor Feb 8, 2024
0f6d58e
update prints
matanor Feb 8, 2024
78d0be1
update prints
matanor Feb 8, 2024
170f499
explicitly set device in SentenceBert
matanor Feb 8, 2024
f0f064c
refactor
matanor Feb 8, 2024
d8bdc2b
Merge remote-tracking branch 'remotes/origin/main' into service
matanor Feb 8, 2024
7f42e73
update following changes in service.metrics.client_config
matanor Feb 8, 2024
4985ba7
restore version from main
matanor Feb 8, 2024
ad4b8c8
revert changes to perplexity.py
matanor Feb 8, 2024
69f1821
clean dockerfile
matanor Feb 8, 2024
1a79e6d
clean dockerfile
matanor Feb 8, 2024
47d78ee
add docstrings and comments
matanor Feb 8, 2024
38ed68e
remove use of ApplyMetric
matanor Feb 8, 2024
e428178
add docstrings and explanations
matanor Feb 8, 2024
b1e81eb
remove prints
matanor Feb 8, 2024
df833fd
RemoteMetric must have a main_score
matanor Feb 8, 2024
b9616ab
add start_metrics_http_service()
matanor Feb 8, 2024
6305e2a
add pydantic required for the metric service api
matanor Feb 8, 2024
58104f6
move metric service api from api.py into unitxt: places in metric_uti…
matanor Feb 8, 2024
3a659af
update import of RemoteMetric
matanor Feb 8, 2024
81060d8
restore init of main_score to None, otherwise the main_score is consi…
matanor Feb 8, 2024
ccc2529
add test_remote_service_with_valid_response()
matanor Feb 8, 2024
56bdd4e
move client_config.py functionality into metric_utils.py
matanor Feb 8, 2024
2a3cf13
Merge remote-tracking branch 'remotes/origin/main' into service
matanor Feb 8, 2024
87632b2
Merge remote-tracking branch 'remotes/origin/main' into service
matanor Feb 11, 2024
d57ed6d
remove test_service.py
matanor Feb 11, 2024
6e53912
add doc strings, type hints
matanor Feb 11, 2024
b153a14
add doc strings, small code update
matanor Feb 11, 2024
28523ca
remove type hints that cause an import error
matanor Feb 11, 2024
5af1df1
Merge branch 'main' into service
matanor Feb 11, 2024
1a845d7
Merge remote-tracking branch 'remotes/origin/main' into service
matanor Feb 15, 2024
da473ea
remove pydantic from the service requirements, it is only used in the…
matanor Feb 15, 2024
b91056e
update the BUILD_DIR env parameter
matanor Feb 15, 2024
06d93d0
remove the pydantic dependency
matanor Feb 25, 2024
7501694
move service code into src/unitxt/service.
matanor Feb 25, 2024
da41a1b
use plain dicts for request and response
matanor Feb 25, 2024
7e26d6a
add __init__.py files to new packages
matanor Feb 25, 2024
febca2a
update to run the server module
matanor Feb 25, 2024
b0ed177
remove use of buildx (not needed)
matanor Feb 25, 2024
dba6e9e
Merge remote-tracking branch 'remotes/origin/main' into service
matanor Feb 25, 2024
63c968a
rename metric -> metric_name
matanor Feb 26, 2024
9e5f0c5
restore usage of first_step to disable confidence interval
matanor Feb 26, 2024
c4cf224
rename metric_artifact -> metric
matanor Feb 26, 2024
7dd5f03
disable confidence interval for remote metrics
matanor Feb 26, 2024
a2c950e
add disable_confidence_interval_calculation and set_n_resamples for R…
matanor Feb 26, 2024
3963e41
fix endpoint following rename of 'metric' -> 'metric_name'
matanor Feb 26, 2024
5770822
remove get_env_variable
matanor Feb 26, 2024
933456c
add an option to start the service using a command unitxt-metrics-ser…
matanor Feb 29, 2024
0ccf9c1
add an option to start the service using a command unitxt-metrics-ser…
matanor Feb 29, 2024
a62f638
Merge remote-tracking branch 'remotes/origin/main' into service
matanor Feb 29, 2024
8eaaf52
switch to using the new settings classes
matanor Feb 29, 2024
c3ebf69
add default to remote_metrics setting
matanor Feb 29, 2024
2e4f627
Merge branch 'main' into service
matanor Mar 4, 2024
18 changes: 18 additions & 0 deletions Makefile
@@ -58,3 +58,21 @@ metric:
build:
	format
	pypi

# command: make tag_name=${TAG_NAME} metric-service-build
# example: make tag_name=unitxt-service-metric:b1v0.1 metric-service-build
# Use the unitxt dir as the build context for docker, so the entire codebase
# can be copied into the image. This way the latest code changes are integrated into
# the image, without requiring a formal unitxt release.
metric-service-build:
	cd $(DIR) && docker buildx build --tag $(tag_name) --file $(DIR)/service/metrics/Dockerfile .

# command: make tag_name=${TAG_NAME} metric-service-run-bash
# example: make tag_name=unitxt-service-metric:b1v0.1 metric-service-run-bash
metric-service-run-bash:
	docker run -it $(tag_name) /bin/bash

# command: make tag_name=${TAG_NAME} metric-service-run
# example: make tag_name=unitxt-service-metric:b1v0.1 metric-service-run
metric-service-run:
	docker run -p 8000:8000 --memory=20g $(tag_name)
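The three targets above compose into a typical build-and-run workflow. A sketch, assuming docker is available on the host and reusing the tag name from the examples in the comments:

```shell
# Build the image from the repo root, then start the service on port 8000.
TAG=unitxt-service-metric:b1v0.1
make tag_name=$TAG metric-service-build
make tag_name=$TAG metric-service-run   # service listens on localhost:8000
```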
5 changes: 4 additions & 1 deletion pyproject.toml
@@ -91,4 +91,7 @@ line-ending = "auto"


[tool.ruff.lint.pydocstyle]
convention = "google"

[tool.ruff.flake8-bugbear]
extend-immutable-calls = ["fastapi.Depends", "fastapi.params.Depends", "fastapi.Query", "fastapi.params.Query"]
1 change: 1 addition & 0 deletions requirements/base.rqr
@@ -4,3 +4,4 @@ mecab-python3
absl-py
dpath
ipadic
pydantic==2.6.0
3 changes: 2 additions & 1 deletion requirements/tests.rqr
@@ -3,9 +3,10 @@ transformers
sentence_transformers
ibm-cos-sdk
opendatasets
httpretty~=1.1.4
editdistance
rouge-score
nltk
sacrebleu
scikit-learn
jiwer
87 changes: 87 additions & 0 deletions service/metrics/Dockerfile
@@ -0,0 +1,87 @@
# This dockerfile exemplifies how a unitxt metrics service may be containerized

FROM registry.access.redhat.com/ubi8/ubi:latest

# Disable Red Hat Subscription
RUN sed -i 's/1/0/g' /etc/yum/pluginconf.d/subscription-manager.conf

# Set up conda env vars
ENV CONDA_HOME=/opt/conda \
PATH=/opt/conda/bin:/usr/local/nvidia/bin:$PATH \
CONFIG_DIR=/root/config_dir \
BUILD_DIR=/tmp/unitxt_metric_service \
USE_TF=0 \
USE_TORCH=1

USER root

RUN yum -y update --allowerasing --nobest && yum -y upgrade --allowerasing --nobest && \
yum -y install --allowerasing --nobest \
wget \
bzip2 \
git \
unzip && \
yum clean all

# Install Anaconda
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh && \
chmod 755 ./Miniconda3-py37_4.8.2-Linux-x86_64.sh && \
/bin/bash ./Miniconda3-py37_4.8.2-Linux-x86_64.sh -b -p ${CONDA_HOME} && \
rm -rf ./Miniconda3-py37_4.8.2-Linux-x86_64.sh && \
echo >>"${CONDA_HOME}/conda-meta/pinned" "conda=4.8" && \
conda config --system --set always_yes True && \
conda config --system --set auto_update_conda False && \
conda config --system --set notify_outdated_conda False && \
conda install python~=3.8.0 && \
conda clean -a

ENV LD_LIBRARY_PATH=/usr/local/nvidia/lib64/:/opt/conda/lib/
# Fix versions for deployment on a specific GPU type
# Change the version numbers to match your hardware
RUN conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch

# Non-root user
ENV NONROOT_USER=gpuuser \
NONROOT_UID=1000 \
NONROOT_GID=1000 \
NONROOT_HOME=/home/gpuuser \
PATH="/home/gpuuser/.local/bin:${PATH}"

RUN groupadd -g ${NONROOT_GID} ${NONROOT_USER} && \
useradd -u ${NONROOT_UID} -g ${NONROOT_GID} -G users -m -c "" -e "" -l -s /bin/bash ${NONROOT_USER} && \
mkdir -p /var/run/sshd && mkdir -p ${NONROOT_HOME}/.ssh && \
chown ${NONROOT_USER} ${NONROOT_HOME}/.ssh && \
chmod -R 700 ${NONROOT_HOME}

# Update pip
RUN pip install --upgrade pip

RUN mkdir -p /usr/local/bin/

USER ${NONROOT_USER}

# Copy unitxt into the image
COPY --chown=${NONROOT_USER}:${NONROOT_GID} . /app/unitxt/.

# Install the unitxt metrics service requirements
RUN cat /app/unitxt/service/metrics/requirements.txt
RUN pip3 install -r /app/unitxt/service/metrics/requirements.txt
RUN pip3 install cffi --upgrade

# install unitxt (editable mode, with all extras)
WORKDIR /app/unitxt
RUN pip install -e ".[all]"

WORKDIR /app/unitxt/service/metrics
ENV HF_HOME=/app/hf/misc
ENV HF_DATASETS_CACHE=/app/hf/datasets
ENV TRANSFORMERS_CACHE=/app/hf/models
EXPOSE 8000

ENV PYTHONPATH=/app/unitxt
ENV PYTHONHASHSEED 0

RUN env
RUN pip3 list
RUN conda list
CMD python3 main.py
158 changes: 158 additions & 0 deletions service/metrics/main.py
@@ -0,0 +1,158 @@
import datetime
import logging
import threading
import time
from logging import Formatter, StreamHandler, getLevelName, getLogger
from typing import cast

import torch
import uvicorn
from fastapi import Depends, FastAPI, Request
from fastapi.exceptions import HTTPException
from starlette.responses import JSONResponse
from tokens import verify_token

from src.unitxt.metric_utils import MetricRequest, MetricResponse

"""
This module defines an HTTP server that wraps unitxt metrics.
It accepts requests specifying which metric to run and the data to run it on.
Requests are handled one by one locally, potentially on a GPU.
"""

# init the FastAPI app object
app = FastAPI(version="0.0.1", title="Unitxt Metrics Service")


def init_logger():
    log = getLogger()
    log.setLevel(getLevelName("INFO"))
    log_formatter = Formatter(
        "%(asctime)s [%(levelname)s] %(filename)s %(lineno)d: %(message)s [%(threadName)s]"
    )

    console_handler = StreamHandler()
    console_handler.setFormatter(log_formatter)
    log.handlers = []
    log.addHandler(console_handler)


init_logger()


# for sanity check
@app.get("/", include_in_schema=False)
def read_root():
    return {"Hello": "Unitxt Metrics"}


# for k8s health checks
@app.get("/health", include_in_schema=False)
def health():
    return "OK"


# A lock to ensure only one request uses the GPU at a time
compute_lock = threading.Lock()


# for computing a metric
@app.post("/compute/{metric}", response_model=MetricResponse)
def compute(metric: str, request: MetricRequest, token: dict = Depends(verify_token)):
    # Imports are done here so the service can start even if unitxt is not installed.
    # This is useful for testing: it enables running health checks and sanity checks without unitxt.
    from unitxt.artifact import Artifact
    from unitxt.operator import MultiStreamOperator
    from unitxt.operators import ArtifactFetcherMixin
    from unitxt.stream import MultiStream

    t0 = time.perf_counter()
    try:
        logging.info(f"Request from [{token['sub']}]")
        logging.info(f"Computing metric '{metric}'.")
        logging.info(
            f"MetricRequest contains {len(request.instance_inputs)} input instances"
        )

        start_time = datetime.datetime.now()
        # Allow only a single computation on the GPU at a time; other requests
        # wait on this lock until the current computation is done.
        with compute_lock:
            logging.info("Acquired compute_lock, starting computation ..")
            start_infer_time = datetime.datetime.now()
            # obtain the metric to compute
            metric_artifact: Artifact = ArtifactFetcherMixin.get_artifact(metric)
            metric_artifact: MultiStreamOperator = cast(
                MultiStreamOperator, metric_artifact
            )

            # prepare the input stream
            multi_stream: MultiStream = MultiStream.from_iterables(
                {"test": request.model_dump()["instance_inputs"]}, copying=True
            )

            # apply the metric and obtain the results
            metric_results = list(metric_artifact(multi_stream)["test"])

            infer_time = datetime.datetime.now() - start_infer_time
            wait_time = start_infer_time - start_time
            logging.info(
                f"Computed {len(metric_results)} metric '{metric}' results, "
                f"took: {infer_time!s}, waited: {wait_time!s}"
            )

        metric_response = {
            "instances_scores": [
                metric_result["score"]["instance"] for metric_result in metric_results
            ],
            "global_score": metric_results[0]["score"]["global"],
        }
        return MetricResponse.model_validate(metric_response)
    finally:
        t1 = time.perf_counter()
        logging.info(f"Request for metric '{metric}' handled in [{t1 - t0:.2f}] secs.")


# wrapper for HTTP exceptions that we throw
@app.exception_handler(HTTPException)
async def unicorn_http_exception_handler(_request: Request, exc: HTTPException):
    logging.exception("HTTP Exception raised")
    return JSONResponse(
        status_code=exc.status_code,
        headers=exc.headers,
        content={"message": exc.detail},
    )


# wrapper for unexpected exceptions
@app.exception_handler(Exception)
async def unicorn_exception_handler(_request: Request, exc: Exception):
    logging.exception(f"Unexpected exception raised: {type(exc).__name__}")
    return JSONResponse(
        status_code=500,
        content={"message": "Internal Server Error"},
    )


def print_gpus_status():
    if torch.cuda.is_available():
        logging.info("Using CUDA")
        logging.info(f"CUDNN VERSION: {torch.backends.cudnn.version()}")
        gpu_id = torch.cuda.current_device()
        logging.info(
            f"There are {torch.cuda.device_count()} GPUs available, using GPU {gpu_id}, name: {torch.cuda.get_device_name(gpu_id)}"
        )
        logging.info(
            f"CUDA Device Total Memory [GB]: {torch.cuda.get_device_properties(0).total_memory / 1e9}"
        )
    else:
        logging.info("There are NO GPUs available.")


def start_metrics_http_service():
    print_gpus_status()
    uvicorn.run("main:app", host="0.0.0.0", port=8000, reload=False, log_config=None)


if __name__ == "__main__":
    start_metrics_http_service()
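As a usage sketch, a client of the `/compute/{metric}` endpoint above could be structured as follows. The per-instance fields (`prediction`, `references`, `additional_inputs`), the service URL, and the metric name are illustrative assumptions — the authoritative request shape is `MetricRequest` in unitxt's `metric_utils`:

```python
import json
from urllib import request as urlrequest

SERVICE_URL = "http://localhost:8000"  # placeholder; wherever the service runs


def build_payload(predictions, references):
    # One entry per instance; the field names here are assumed for illustration.
    return {
        "instance_inputs": [
            {"prediction": p, "references": r, "additional_inputs": {}}
            for p, r in zip(predictions, references)
        ]
    }


def compute_remote(metric_name, payload, token):
    # POST /compute/{metric_name} with a bearer token, as verify_token expects.
    req = urlrequest.Request(
        f"{SERVICE_URL}/compute/{metric_name}",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urlrequest.urlopen(req) as resp:
        return json.loads(resp.read())  # a MetricResponse-shaped dict


payload = build_payload(["a cat"], [["a cat", "the cat"]])
```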
3 changes: 3 additions & 0 deletions service/metrics/requirements.txt
@@ -0,0 +1,3 @@
fastapi==0.109.0
uvicorn[standard]==0.27.0.post1
python-jose[cryptography]==3.3.0
83 changes: 83 additions & 0 deletions service/metrics/tokens.py
@@ -0,0 +1,83 @@
import logging
import os
from datetime import datetime, timedelta

from fastapi import Depends, HTTPException
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError, jwt
from starlette import status

# This module handles authorization tokens for the service.
# To generate a master token key, run "openssl rand -hex 32".
# Then, save the value in the environment variable UNITXT_METRICS_MASTER_KEY.
# To create tokens signed with the master key, use create_token(..), as shown in main().

if "UNITXT_METRICS_MASTER_KEY" in os.environ:
    MASTER_KEY = os.environ["UNITXT_METRICS_MASTER_KEY"]
else:
    MASTER_KEY = None

ALGORITHM = "HS256"
ACCESS_TOKEN_EXPIRE_DAYS = 360

log = logging.getLogger("tokens")


class InvalidTokenError(Exception):
    pass


def create_token(name: str):
    assert MASTER_KEY is not None

    # create the token data
    now = datetime.utcnow()
    expires_delta = timedelta(days=ACCESS_TOKEN_EXPIRE_DAYS)
    payload = {
        "iss": "Unitxt Metrics",
        "sub": name,
        "iat": now,
        "exp": now + expires_delta,
    }

    # generate the jwt token and return it
    return jwt.encode(payload, MASTER_KEY, algorithm=ALGORITHM)


def verify_jwt_token(jwt_token):
    try:
        if MASTER_KEY:
            payload = jwt.decode(jwt_token, MASTER_KEY, algorithms=[ALGORITHM])
            if payload["sub"] is None:
                raise InvalidTokenError("Token subject claim is empty")
            return payload
        return {"sub": "Anonymous"}
    except JWTError as e:
        raise InvalidTokenError from e


# This object makes sure that the incoming HTTP request has a header with
# an authorization token (e.g. passed with 'curl -H "Authorization: Bearer {token}"').
# It does NOT check that the token has a valid value (that is done by verify_token(..)).
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")


async def verify_token(token: str = Depends(oauth2_scheme)):
    try:
        return verify_jwt_token(token)
    except InvalidTokenError as e:
        log.exception(e)
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Could not validate credentials",
            headers={"WWW-Authenticate": "Bearer"},
        ) from e


def main():
    name = "unitxt-metrics-service-tester"
    log.info(f"{name}: {create_token(name)}")


if __name__ == "__main__":
    main()
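tokens.py relies on python-jose for HS256 JWTs. To make the mechanics concrete, here is a stdlib-only sketch of what encoding and verifying such a token does under the hood (illustrative only — the service itself uses jose's `jwt.encode`/`jwt.decode`, which also validate registered claims such as `exp`):

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    # base64url without padding, as used in JWTs
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def hs256_encode(payload: dict, key: str) -> str:
    # header and payload are base64url-encoded JSON; the signature is
    # HMAC-SHA256 over "header.payload" using the shared master key.
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    sig = b64url(
        hmac.new(key.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    )
    return f"{header}.{body}.{sig}"


def hs256_verify(token: str, key: str) -> dict:
    header, body, sig = token.split(".")
    expected = b64url(
        hmac.new(key.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    )
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    # restore padding before decoding the claims
    return json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))


token = hs256_encode({"iss": "Unitxt Metrics", "sub": "tester"}, "secret")
claims = hs256_verify(token, "secret")
```

Anyone holding the master key can mint valid tokens, which is why the module keeps it only in an environment variable.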