Problems encountered during speculative decoding execution #888

PoHaoYen · 2024-07-31T04:39:00Z

Hi, I attempted to use speculative decoding but encountered some errors. May I ask for your assistance?

I used the parameters from the first example.

python ./examples/speculative_inference.py \
    --model gpt2-xl \
    --draft_model gpt2 \
    --temperature 0.3 \
    --gamma 5 \
    --max_new_tokens 512 \
    --gpu 0

An error occurred during the first execution:
RuntimeError: Expected one of cpu, cuda, ipu, xpu, mkldnn, opengl, opencl, ideep, hip, ve, fpga, ort, xla, lazy, vulkan, mps, meta, hpu, mtia, privateuseone device type at start of device string: gpu
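The message suggests that the literal string "gpu" reached a place where PyTorch expects a device type such as "cuda" or "cpu". As a rough sketch of the kind of mapping that would avoid this, the snippet below turns a `--gpu` index into a PyTorch-style device string. The names `to_device_string` and `is_valid_device_string` are hypothetical helpers, not part of LMFlow or PyTorch, and the set of device types is copied from the error message above, so it may differ on other PyTorch versions.

```python
# Device types copied from the RuntimeError message above (assumption:
# other PyTorch versions may accept a different set).
VALID_DEVICE_TYPES = {
    "cpu", "cuda", "ipu", "xpu", "mkldnn", "opengl", "opencl", "ideep",
    "hip", "ve", "fpga", "ort", "xla", "lazy", "vulkan", "mps", "meta",
    "hpu", "mtia", "privateuseone",
}


def to_device_string(gpu_index, cuda_available=True):
    """Hypothetical helper: map a --gpu index to a device string.

    Falls back to "cpu" when no index is given, the index is negative,
    or the caller reports that CUDA is unavailable.
    """
    if gpu_index is None or gpu_index < 0 or not cuda_available:
        return "cpu"
    return f"cuda:{gpu_index}"


def is_valid_device_string(device):
    """Check that a string looks like "<type>" or "<type>:<index>"."""
    device_type, _, index = device.partition(":")
    return device_type in VALID_DEVICE_TYPES and (index == "" or index.isdigit())


if __name__ == "__main__":
    print(to_device_string(0))             # device string for --gpu 0
    print(is_valid_device_string("gpu"))   # the string from the failing run
```

In a real script, `cuda_available` would typically come from `torch.cuda.is_available()`; it is a plain argument here so the sketch runs without PyTorch installed.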

Then I modified HFDecoderModel in hf_decoder_model.py to use cuda, and the following error occurred:
NotImplementedError: device "cuda" is not supported

On the third attempt, I changed it to use cpu and got the error:
ValueError: The following model_kwargs are not used by the model: ['use_accelerator']

Is there any configuration or environment setting error on my part?

huangzl19 · 2024-09-02T12:39:41Z

I encountered the same problem.
