Update docker ENTRYPOINT to ensure proper argument handling (#962) · fpgmaas/outlines@0f9f854

Update docker ENTRYPOINT to ensure proper argument handling (outlines-dev#962)

## Summary

This PR changes the `ENTRYPOINT` instruction in the Dockerfile from shell
form to exec form so that additional arguments passed to the container via
`docker run` are correctly appended to the entrypoint command.
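
The difference can be reproduced without Docker. The sketch below assumes Docker's documented behavior: a shell-form `ENTRYPOINT` is wrapped as `/bin/sh -c "<command>"`, with any `docker run` arguments appended after the command string. `echo received:` is a hypothetical stand-in for `python3 -m outlines.serve.serve`; it is not part of this PR.

```bash
#!/bin/sh
# Hypothetical stand-in for the server command: `echo received:` prints
# whatever arguments actually reach it.

# Shell form: ENTRYPOINT echo received:
# Docker effectively runs: /bin/sh -c "echo received:" <docker run args>
# The extra argument becomes $0 of sh and never reaches the command.
shell_form=$(/bin/sh -c 'echo received:' --model=microsoft/phi-2)

# Exec form: ENTRYPOINT ["echo", "received:"]
# Docker runs the array directly, with the docker run args appended to argv.
exec_form=$(echo received: --model=microsoft/phi-2)

echo "shell form: $shell_form"   # shell form: received:
echo "exec form:  $exec_form"    # exec form:  received: --model=microsoft/phi-2
```

In the shell-form case the server starts with no arguments at all, which matches the log below where the default model is loaded without any error about an unknown flag.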

### Before the change:

The `--model` argument is not passed to the entrypoint command, so the
default model `facebook/opt-125m` is loaded instead.

```bash
> sudo docker run --runtime=nvidia --gpus all -p 8000:8000 my-outlines-image --model="microsoft/phi-2"

/usr/local/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
INFO 06-12 14:45:46 llm_engine.py:161] Initializing an LLM engine (v0.5.0) with config: model='facebook/opt-125m', speculative_config=None, tokenizer='facebook/opt-125m', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=2048, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), seed=0, served_model_name=facebook/opt-125m)
```

### After the change:

The `--model` argument is correctly passed to the entrypoint command, and
`microsoft/phi-2` is loaded.

```bash
> sudo docker run --runtime=nvidia --gpus all -p 8000:8000 my-outlines-image --model="microsoft/phi-2"

/usr/local/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
INFO 06-12 14:59:17 llm_engine.py:161] Initializing an LLM engine (v0.5.0) with config: model='microsoft/phi-2', speculative_config=None, tokenizer='microsoft/phi-2', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=2048, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), seed=0, served_model_name=microsoft/phi-2)
```

shashankmangla authored and fpgmaas committed Jun 14, 2024

1 parent 7656cab commit 0f9f854

Dockerfile

            
    @@ -14,4 +14,4 @@ RUN --mount=source=.git,target=.git,type=bind \
         pip install --no-cache-dir .[serve]
     
     # https://outlines-dev.github.io/outlines/reference/vllm/
    -ENTRYPOINT python3 -m outlines.serve.serve
    +ENTRYPOINT ["python3", "-m", "outlines.serve.serve"]
