OpenCompass v0.2.3
The OpenCompass team is thrilled to announce the release of OpenCompass v0.2.3! This version is packed with new features, crucial fixes, and documentation updates to improve your experience. We're continuously working to enhance OpenCompass, making it more robust and versatile for all users.
🌟 Highlights:
- Enhanced Model Support: Introduction of new models and configurations, including support for the LightllmApi, lmdeploy pytorch engine, and more.
- New Datasets and Benchmarks: Expanding our dataset repository with additions like OpenFinData, lveval benchmark, and an upgrade to Needlebench.
- Documentation and Sync Improvements: Updated dataset pack URLs, fixed documentation errors, and synchronized with internal code for consistency.
Explore the key updates in this release:
🌟 New Features:
📦 Dataset and Benchmark Expansion:
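- Added the OpenFinData dataset and the lveval benchmark, and upgraded the needle-in-a-haystack experiment to Needlebench (#896, #914, #913).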
🛠 Model and API Integrations:
- Added LightllmApi support for input_format and prompt templates, and added get_ppl for TurbomindModel (#888, #945, #878).
- Added support for gemini and configs for deepseek-coder (#931, #943).
- Added support for the lmdeploy pytorch engine and new VLLM model configs (#875, #938).
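For readers unfamiliar with perplexity-based (PPL) evaluation, the sketch below shows the general idea behind a `get_ppl`-style score: each candidate answer is scored by the average negative log-likelihood of its tokens, and the candidate with the lowest score wins. This is an illustration only, not the TurbomindModel or OpenCompass implementation. The function names and the log-probability values are hypothetical, and whether a given `get_ppl` returns the mean negative log-likelihood or its exponential is not stated in these notes (the ranking is the same either way, since `exp` is monotonic).

```python
import math
from typing import Dict, List


def mean_nll(token_logprobs: List[float]) -> float:
    """Average negative log-likelihood per token (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return -sum(token_logprobs) / len(token_logprobs)


def perplexity(token_logprobs: List[float]) -> float:
    """Perplexity is the exponential of the mean negative log-likelihood."""
    return math.exp(mean_nll(token_logprobs))


def pick_lowest_ppl(candidates: Dict[str, List[float]]) -> str:
    """Return the label whose token sequence has the lowest perplexity."""
    return min(candidates, key=lambda label: perplexity(candidates[label]))


# Hypothetical per-token log-probabilities for two candidate answers.
candidates = {
    "A": [-0.2, -0.1, -0.3],
    "B": [-1.5, -2.0, -0.9],
}
best = pick_lowest_ppl(candidates)
```

Here `best` is the label a PPL-style evaluator would select; in practice the log-probabilities would come from the model backend rather than being hard-coded.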
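📖 Documentation and Sync Updates:
- Updated the dataset pack URLs, the README, and the rank link, fixed the Chinese version of the Read the Docs documentation, and synced with internal code (#922, #956, #911, #947, #941, #953).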
🐛 Bug Fixes:
- Fixed configuration and template issues, including the qwen2-beta to qwen1.5 rename and the chatglm2 and moss template configs (#894, #893, #897).
- Fixed IFEval issues, including config bugs and type hints for Python <= 3.8 (#906, #909, #915).
🎉 Welcome New Contributors:
- We're delighted to welcome our new contributors: @xu-song, @x22x22, @yuantao2108, and @fanqiNO1. Your contributions are invaluable to the growth of OpenCompass!
🔗 Full Changelog
- Support LightllmApi input_format by @helloyongyang in #888
- [Fix] rename qwen2-beta -> qwen1.5 by @Leymore in #894
- [Fix] Fix chatglm2 config by @Leymore in #893
- [Fix] Fix moss template config by @xu-song in #897
- Support lmdeploy pytorch engine by @RunningLeon in #875
- [Fix] fix ifeval by @bittersweet1999 in #906
- [Fix] fix ifeval by @jingmingzhuo in #909
- [Fix] Fix type hint in IFEval for python<=3.8 by @Leymore in #915
- [Docs] Update dataset pack urls by @Leymore in #922
- [Sync] update github blacklist by @Leymore in #929
- [Feature] add support for gemini by @bittersweet1999 in #931
- [Feature] Support OpenFinData by @Skyfall-xzz in #896
- [Fix]Fixed the problem of never entering task.run() mode in local scheduling mode. by @x22x22 in #930
- Add VLLM Model Configs by @DseidLi in #938
- [Feature] Upgrade the needle-in-a-haystack experiment to Needlebench by @DseidLi in #913
- [Feature] add lveval benchmark by @yuantao2108 in #914
- [Sync] Sync with internal 2023.03.04 by @Leymore in #941
- [Fix] fix a bug of humanevalplus config by @jingmingzhuo in #944
- [Feature] Add configs of deepseek-coder by @jingmingzhuo in #943
- Fix FinanceIQ_datasets import error by @xu-song in #939
- [Docs] Update rank link in README by @fanqiNO1 in #911
- Support get_ppl for TurbomindModel by @RunningLeon in #878
- Support prompt template for LightllmApi. Update LightllmApi token bucket. by @helloyongyang in #945
- Fix LightllmApi ppl test by @helloyongyang in #951
- [Fix] Chinese version of ReadTheDoc by @tonysy in #947
- [fix] add different temp for different question in mtbench by @bittersweet1999 in #954
- [Sync] Sync with internal codes 2024.03.08 by @Leymore in #953
- [Docs] Update README by @tonysy in #956
- [Misc] Update owners by @Leymore in #961
- [Fix] Use logger.error on failure by @Leymore in #960
- [Sync] Bump version 0.2.3 by @Leymore in #957
For a detailed overview of all changes, check out our Full Changelog.