
Pull requests: huggingface/text-generation-inference


Pull requests list

feat: add release and sha tagged images
#2360 opened Aug 5, 2024 by drbh
WIP: Update ROCM libs
#2358 opened Aug 5, 2024 by mht-sharma Draft
4 tasks done
Add FlashInfer support
#2354 opened Aug 2, 2024 by danieldk
5 tasks
Unsigned integer underflow in max_batch_size
#2352 opened Aug 1, 2024 by maxdebayser
fix: fix num_ln_in_parallel_attn attribute name typo in RWConfig
#2350 opened Aug 1, 2024 by almersawi
3 of 5 tasks
fix EleutherAI/gpt-neox-20b does not work in tgi
#2346 opened Aug 1, 2024 by sywangyi
fix: allocate tmp based on sgmv kernel if available
#2345 opened Jul 31, 2024 by drbh
More fixes trtllm
#2342 opened Jul 31, 2024 by mfuntowicz
add numa to improve cpu inference perf
#2330 opened Jul 30, 2024 by sywangyi
Update vLLM dependency to 0.5.3.post1
#2317 opened Jul 26, 2024 by danieldk Draft
5 tasks
Add model_load_time metric
#2311 opened Jul 26, 2024 by Edwinhr716
2 of 5 tasks
no repeat ngram size ci
#2308 opened Jul 25, 2024 by ErikKaum
Fix missing model id in rocm warmup
#2298 opened Jul 24, 2024 by almersawi
2 of 5 tasks
adding max_token_capacity metric
#2279 opened Jul 22, 2024 by Edwinhr716
2 of 5 tasks
Don't error on OpenAI valid top_p values.
#2231 opened Jul 15, 2024 by ErikKaum
doc: Add metrics documentation and add a 'Reference' section (label: documentation)
#2230 opened Jul 15, 2024 by Hugoch
2 of 5 tasks
feat: Add load tests
#2217 opened Jul 11, 2024 by Hugoch
1 of 5 tasks
added tie_weights support to mlp speculator
#2215 opened Jul 10, 2024 by JRosenkranz
5 tasks