Issues: THUDM/AgentBench
Issues list
[Bug/Assistance] kg-std issues
bug
Something isn't working
help wanted
Extra attention is needed
#159
opened Aug 5, 2024 by
night-chen
[Bug/Assistance]
bug
Something isn't working
help wanted
Extra attention is needed
#154
opened Jul 30, 2024 by
matinaghaei
Could you please upload the dockerfile?
bug
Something isn't working
help wanted
Extra attention is needed
#152
opened Jul 25, 2024 by
HCHCXY
[Bug/Assistance] A lot of os-std tasks are impossible
bug
Something isn't working
help wanted
Extra attention is needed
#151
opened Jul 25, 2024 by
rjmoss
[Bug/Assistance] how to use local model to replace gpt3.5?
bug
Something isn't working
help wanted
Extra attention is needed
#150
opened Jul 19, 2024 by
lambda7xx
[Bug/Assistance] Evaluating an open-source LLM on card game fails with error INTERACT_FAILED {"detail":"Error: Worker not responding\n"}
bug
Something isn't working
help wanted
Extra attention is needed
#147
opened Jul 9, 2024 by
moon-fall
Problems encountered when deploying a local model via fastchat
bug
Something isn't working
help wanted
Extra attention is needed
#146
opened Jul 4, 2024 by
YinSonglin1997
DBbench-std task with error "Can't connect to MySQL server"
bug
Something isn't working
help wanted
Extra attention is needed
#145
opened Jun 27, 2024 by
realbillbao
urgent - if one of the problems throws an error, why does the overall.json not show up?
bug
Something isn't working
help wanted
Extra attention is needed
#144
opened Jun 21, 2024 by
ishapuri
Would llama3, wizardlm2 and other latest models be tested and published in the leaderboard? Request to add test results for LLMs released in April-May 2024, such as llama3 and wizardlm
enhancement
New feature or request
#136
opened May 11, 2024 by
dercaft
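[Feature] How is the score for each task calculated? For example, the OS task only yields an accuracy, but Table 3 in the paper lists a score for each task. I could not find the mapping between the two in the paper; could you give a hint?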
enhancement
New feature or request
#135
opened May 10, 2024 by
lonerFarea
Is testing with OpenAI's tool_call interface supported?
enhancement
New feature or request
#132
opened Apr 9, 2024 by
Maybewuss
Excellent Job! Well, no offense, it seems LLM-Bench rather than AgentBench in essence.
enhancement
New feature or request
#130
opened Mar 26, 2024 by
Konisberg
[Bug/Assistance] What is going on with "unknown" in mind2web?
bug
Something isn't working
help wanted
Extra attention is needed
#129
opened Mar 24, 2024 by
Tangent-90C
OS std test set results
bug
Something isn't working
help wanted
Extra attention is needed
#128
opened Mar 18, 2024 by
webdxq
[Bug/Assistance] - Reproducing Results on Alfworld (HH) (vs. ReAct paper)
bug
Something isn't working
help wanted
Extra attention is needed
#127
opened Mar 9, 2024 by
ai-nikolai
Benchmark for mistral models
enhancement
New feature or request
#122
opened Mar 1, 2024 by
mingxuan-he
The Card_Game task fails to run
bug
Something isn't working
help wanted
Extra attention is needed
#121
opened Feb 29, 2024 by
yupeijei1997
[Bug/Assistance] When testing the kg-std task, every status in the output file is task limit reached
bug
Something isn't working
help wanted
Extra attention is needed
#115
opened Feb 5, 2024 by
13416157913
ltp fails to start
bug
Something isn't working
help wanted
Extra attention is needed
#110
opened Jan 31, 2024 by
Fu-Dayuan
[Bug/Assistance]
bug
Something isn't working
help wanted
Extra attention is needed
#109
opened Jan 27, 2024 by
ibingzhaoi