Actions: ggerganov/llama.cpp

Server

5,988 workflow runs

ggml : do not crash when quantizing q4_x_x with an imatrix
Server #6259: Pull request #9192 opened by slaren
August 26, 2024 14:59 In progress sl/fix-q4xx-imatrix
ggml : add SSM Metal kernels (#8546)
Server #6258: Commit fc18425 pushed by ggerganov
August 26, 2024 14:55 42m 38s master
tests : fix compile warnings for unreachable code (#9185)
Server #6257: Commit 879275a pushed by ggerganov
August 26, 2024 13:30 22m 10s master
tokenize : add --show-count-only (token) option
Server #6256: Pull request #9182 synchronize by danbev
August 26, 2024 11:39 1h 19m 35s danbev:show-token-count-only
ggml:Mamba Cuda kernel performance improve
Server #6255: Pull request #9186 opened by piDack
August 26, 2024 09:42 2h 43m 46s piDack:mfalcon_mamba_cuda
tests : fix compile warnings for unreachable code
Server #6254: Pull request #9185 opened by ggerganov
August 26, 2024 09:30 1h 33m 12s gg/tests-fix-unreach
server : update deps (#9183)
Server #6253: Commit e5edb21 pushed by ggerganov
August 26, 2024 09:17 1h 11m 5s master
metal : gemma2 flash attention support (#9159)
Server #6252: Commit 0c41e03 pushed by slaren
August 26, 2024 09:09 31m 37s master
metal : gemma2 flash attention support
Server #6251: Pull request #9159 synchronize by slaren
August 26, 2024 08:51 25m 3s sl/metal-logit-softcap
ggml : remove K_QUANTS_PER_ITERATION macro
Server #6250: Pull request #9034 synchronize by ggerganov
August 26, 2024 06:52 8m 57s gg/remove-k-quants-per-iter
server : update deps
Server #6249: Pull request #9183 opened by ggerganov
August 26, 2024 06:17 28m 18s gg/server-update-deps
llama : fix time complexity of string replacement (#9163)
Server #6248: Commit 436787f pushed by ggerganov
August 26, 2024 06:09 9m 42s master
Threadpool: take 2
Server #6246: Pull request #8672 synchronize by max-krasnyansky
August 26, 2024 04:37 20m 18s CodeLinaro:threadpool
llama : support RWKV v6 models
Server #6245: Pull request #8980 synchronize by MollySophia
August 26, 2024 01:53 9m 30s MollySophia:for-upstream
llama : support RWKV v6 models
Server #6244: Pull request #8980 synchronize by MollySophia
August 26, 2024 01:51 2m 7s MollySophia:for-upstream
llama : support RWKV v6 models
Server #6243: Pull request #8980 synchronize by MollySophia
August 26, 2024 01:32 10m 12s MollySophia:for-upstream
August 25, 2024 22:54 9m 36s
CUDA: fix Gemma 2 numerical issues for FA (#9166)
Server #6237: Commit f91fc56 pushed by JohannesGaessler
August 25, 2024 20:11 9m 19s master
Changes for the existing quant strategies / FTYPEs and new ones
Server #6235: Pull request #8836 synchronize by Nexesenex
August 25, 2024 12:27 10m 1s Nexesenex:patch-1