📄 Prompt Application

Paper List

IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization2024.07.03

Ahmed Frikha, Nassim Walha, K. K. Nakka, Ricardo Mendes, Xue Jiang, etc


Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs2024.06.28

Sukmin Yun, Haokun Lin, Rusiru Thushara, Mohammad Qazim Bhat, Yongxin Wang, etc


OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding2024.06.27

Tao Zhang, Xiangtai Li, Hao Fei, Haobo Yuan, Shengqiong Wu, etc


Adversarial Search Engine Optimization for Large Language Models2024.06.26

Fredrik Nestaas, Edoardo Debenedetti, F. Tramèr


VideoLLM-online: Online Video Large Language Model for Streaming Video2024.06.17

Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, etc


Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs2024.06.14

Rui Yang, Ruomeng Ding, Yong Lin, Huan Zhang, Tong Zhang


Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation2024.06.10

Peize Sun, Yi Jiang, Shoufa Chen, Shilong Zhang, Bingyue Peng, etc


Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences2024.06.07

Patrick Haller, Lena S. Bolliger, Lena A. Jäger


PaCE: Parsimonious Concept Engineering for Large Language Models2024.06.06

Jinqi Luo, Tianjiao Ding, Kwan Ho Ryan Chan, D. Thaker, Aditya Chattopadhyay, etc


Yuan 2.0-M32: Mixture of Experts with Attention Router2024.05.28

Shaohua Wu, Jiangang Luo, Xi Chen, Lingjun Li, Xudong Zhao, etc . - 【arXiv.org】


When Generative AI Meets Workplace Learning: Creating A Realistic & Motivating Learning Experience With A Generative PCA2024.05.24

Andreas Bucher, Birgit Schenk, Mateusz Dolata, Gerhard Schwabe . - 【arXiv.org】


Measuring Impacts of Poisoning on Model Parameters and Embeddings for Large Language Models of Code2024.05.19

Aftab Hussain, Md Rafiqul Islam Rabin, Mohammad Amin Alipour . - 【arXiv.org】


CPS-LLM: Large Language Model based Safe Usage Plan Generator for Human-in-the-Loop Human-in-the-Plant Cyber-Physical System2024.05.19

Ayan Banerjee, Aranyak Maity, Payal Kamboj, Sandeep K. S. Gupta . - 【arXiv.org】


Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling2024.05.10

Lyumanshan Ye, Jiandong Jiang, Danni Chang, Pengfei Liu . - 【arXiv.org】


UniDM: A Unified Framework for Data Manipulation with Large Language Models2024.05.10

Yichen Qian, Yongyi He, Rong Zhu, Jintao Huang, Zhijian Ma, etc . - 【Conference on Machine Learning and Systems】


FlockGPT: Guiding UAV Flocking with Linguistic Orchestration2024.05.09

Artem Lykov, Sausar Karaf, Mikhail Martynov, Valerii Serpiva, A. Fedoseev, etc . - 【arXiv.org】


Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning2024.05.09

Shibo Jie, Yehui Tang, Ning Ding, Zhi-Hong Deng, Kai Han, etc . - 【arXiv.org】


DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model2024.05.07

Zhihong Shao, Damai Dai, Daya Guo, Bo Liu, Zihan Wang . - 【arXiv.org】


QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving2024.05.07

Yujun Lin, Haotian Tang, Shang Yang, Zhekai Zhang, Guangxuan Xiao, etc . - 【arXiv.org】


Study of adoption of artificial intelligence technology-driven natural large language model-based chatbots by firms for customer service interaction2024.05.06

S. Bhattacharyya . - 【Journal of Science and Technology Policy Management】


FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems2024.05.03

Yashar Deldjoo . - 【arXiv.org】


Single and Multi-Hop Question-Answering Datasets for Reticular Chemistry with GPT-4-Turbo2024.05.03

Nakul Rampal, Kaiyu Wang, Matthew Burigana, Lingxiang Hou, Juri Al-Johani, etc . - 【arXiv.org】


What matters when building vision-language models?2024.05.03

Hugo Laurençon, Léo Tronchon, Matthieu Cord, Victor Sanh . - 【arXiv.org】


Analyzing Narrative Processing in Large Language Models (LLMs): Using GPT4 to test BERT2024.05.03

Patrick Krauss, Jannik Hösch, C. Metzner, Andreas K. Maier, Peter Uhrig, etc . - 【arXiv.org】


Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows2024.05.02

Jasmine Y. Shih, Vishal Mohanty, Yannis Katsis, Hariharan Subramonyam . - 【CHI Extended Abstracts】


NumLLM: Numeric-Sensitive Large Language Model for Chinese Finance2024.05.01

Huan-Yi Su, Ke Wu, Yu-Hao Huang, Wu-Jun Li . - 【arXiv.org】


Is Bigger Edit Batch Size Always Better? - An Empirical Study on Model Editing with Llama-3 2024.05.01

Junsang Yoon, Akshat Gupta, G. Anumanchipalli . - 【arXiv.org】


Re-Thinking Inverse Graphics With Large Language Models2024.04.23

Peter Kulits, Haiwen Feng, Weiyang Liu, Victoria Abrevaya, Michael J. Black . - 【arXiv.org】


Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models2024.04.23

Aidan Z. H. Yang, Sophia Kolak, Vincent J. Hellendoorn, Ruben Martins, Claire Le Goues . - 【arXiv.org】


Quantifying Multilingual Performance of Large Language Models Across Languages2024.04.17

Zihao Li, Yucheng Shi, Zirui Liu, Fan Yang, Ninghao Liu, etc . - 【arXiv.org】


Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding2024.04.17

Zezhong Fan, Xiaohan Li, Kaushiki Nag, Chenhao Fang, Topojoy Biswas, etc . - 【The Web Conference】


LLMorpheus: Mutation Testing using Large Language Models2024.04.15

Frank Tip, Jonathan Bell, Max Schäfer . - 【arXiv.org】


Generating consistent PDDL domains with Large Language Models2024.04.11

Pavel Smirnov, F. Joublin, A. Ceravola, Michael Gienger . - 【arXiv.org】


Manipulating Large Language Models to Increase Product Visibility2024.04.11

Aounon Kumar, Himabindu Lakkaraju


High-Dimension Human Value Representation in Large Language Models2024.04.11

Samuel Cahyawijaya, Delong Chen, Yejin Bang, Leila Khalatbari, Bryan Wilie, etc


MetaCheckGPT -- A Multi-task Hallucination Detector Using LLM Uncertainty and Meta-models2024.04.10

Rahul Mehta, Andrew Hoblitzell, Jack O'Keefe, Hyeju Jang, Vasudeva Varma


Halu-NLP at SemEval-2024 Task 6: MetaCheckGPT - A Multi-task Hallucination Detection using LLM uncertainty and meta-models2024.04.10

Rahul Mehta, Andrew Hoblitzell, Jack O'Keefe, Hyeju Jang, Vasudeva Varma . - 【International Workshop on Semantic Evaluation】


From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications2024.04.10

Yongqiang Ma, Lizhi Qin, Jiawei Liu, Yangyang Kang, Yue Zhang, etc


LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding2024.04.08

Chuwei Luo, Yufan Shen, Zhaoqing Zhu, Qi Zheng, Zhi Yu, etc


Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models2024.04.08

Yutao Ouyang, Jinhan Li, Yunfei Li, Zhongyu Li, Chao Yu, etc


Topic-based Watermarks for LLM-Generated Text2024.04.02

Alexander Nemecek, Yuzhou Jiang, Erman Ayday


Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks2024.04.02

Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion


Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference2024.03.29

Jovan Stojkovic, Esha Choukse, Chaojie Zhang, Íñigo Goiri, Josep Torrellas . - 【arXiv.org】


LUQ: Long-text Uncertainty Quantification for LLMs2024.03.29

Caiqi Zhang, Fangyu Liu, Marco Basaldella, Nigel Collier . - 【arXiv.org】


Gecko: Versatile Text Embeddings Distilled from Large Language Models2024.03.29

Jinhyuk Lee, Zhuyun Dai, Xiaoqi Ren, Blair Chen, Daniel Cer, etc . - 【arXiv.org】


WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models2024.03.28

Piotr Molenda, Adian Liusie, Mark J. F. Gales . - 【arXiv.org】


MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model2024.03.27

Yike Wu, Jiatao Zhang, Nan Hu, LanLing Tang, Guilin Qi, etc . - 【arXiv.org】


Comp4D: LLM-Guided Compositional 4D Scene Generation2024.03.25

Dejia Xu, Hanwen Liang, N. Bhatt, Hezhen Hu, Hanxue Liang, etc


MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?2024.03.21

Renrui Zhang, Dongzhi Jiang, Yichi Zhang, Haokun Lin, Ziyu Guo, etc


Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs2024.03.20

Zhihong Sun, Chen Lyu, Bolun Li, Yao Wan, Hongyu Zhang, etc


Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model2024.03.20

Peng Zhou, Jianmin Wang, Chunyan Li, Zixu Wang, Yiping Liu, etc


Towards Robots That Know When They Need Help: Affordance-Based Uncertainty for Large Language Model Planners2024.03.19

James F. Mullen, Dinesh Manocha


ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference2024.03.15

Hyungjun Oh, Kihong Kim, Jaemin Kim, Sungkyun Kim, Junyeol Lee, etc


ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning2024.03.14

Ahmed Masry, Mehrad Shahmohammadi, Md. Rizwan Parvez, Enamul Hoque, Shafiq R. Joty


Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference2024.03.14

Piotr Nawrot, Adrian Łańcucki, Marcin Chochowski, David Tarjan, E. M. Ponti


Towards Proactive Interactions for In-Vehicle Conversational Assistants Utilizing Large Language Models2024.03.14

Huifang Du, Xuejing Feng, Jun Ma, Meng Wang, Shiyu Tao, etc


Simple and Scalable Strategies to Continually Pre-train Large Language Models2024.03.13

Adam Ibrahim, Benjamin Thérien, Kshitij Gupta, Mats L. Richter, Quentin Anthony, etc


LG-Traj: LLM Guided Pedestrian Trajectory Prediction2024.03.12

Pranav Singh Chib, Pravendra Singh


Big City Bias: Evaluating the Impact of Metropolitan Size on Computational Job Market Abilities of Language Models2024.03.12

Charlie Campanella, R. Goot . - 【NLP4HR】


InfiCoder-Eval: Systematically Evaluating the Question-Answering Capabilities of Code Large Language Models2024.03.11

Linyi Li, Shijie Geng, Zhenwen Li, Yibo He, Hao Yu, etc


Naming, Describing, and Quantifying Visual Objects in Humans and LLMs2024.03.11

Alberto Testoni, Juell Sprott, Sandro Pezzelle


LLM4Decompile: Decompiling Binary Code with Large Language Models2024.03.08

Hanzhuo Tan, Qi Luo, Jing Li, Yuqun Zhang


Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs2024.03.08

Arijit Nag, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti


Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference2024.03.07

Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, etc


SaulLM-7B: A pioneering Large Language Model for Law2024.03.06

Pierre Colombo, Telmo Pessoa Pires, Malik Boudiaf, Dominic Culver, Rui Melo, etc


KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection2024.03.04

Yuexin Li, Chengyu Huang, Shumin Deng, Mei Lin Lock, Tri Cao, etc


Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models2024.02.29

Chen Qian, Jie Zhang, Wei Yao, Dongrui Liu, Zhen-fei Yin, etc


Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers2024.02.29

Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, etc


Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models2024.02.29

Soham De, Samuel L. Smith, Anushan Fernando, Aleksandar Botev, George-Cristian Muraru, etc


The All-Seeing Project V2: Towards General Relation Comprehension of the Open World2024.02.29

Weiyun Wang, Yiming Ren, Hao Luo, Tiantong Li, Chenxiang Yan, etc


LeMo-NADe: Multi-Parameter Neural Architecture Discovery with LLMs2024.02.28

Md Hafizur Rahman, Prabuddha Chakraborty


The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits2024.02.27

Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, etc . - 【arXiv.org】


OncoGPT: A Medical Conversational Model Tailored with Oncology Domain Expertise on a Large Language Model Meta-AI (LLaMA)2024.02.26

Fujian Jia, Xin Liu, Lixi Deng, Jiwen Gu, Chunchao Pu, etc . - 【arXiv.org】


API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs2024.02.23

Kinjal Basu, Ibrahim Abdelaziz, Subhajit Chaudhury, Soham Dan, M. Crouse, etc


Genie: Generative Interactive Environments2024.02.23

Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, etc


Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs2024.02.22

Aaditya K. Singh, DJ Strouse . - 【arXiv.org】


Semantic Mirror Jailbreak: Genetic Algorithm Based Jailbreak Prompts Against Open-source LLMs2024.02.21

Xiaoxia Li, Siyuan Liang, Jiyi Zhang, Hansheng Fang, Aishan Liu, etc . - 【arXiv.org】


Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages2024.02.19

Yuan Zhang, Yile Wang, Zijun Liu, Shuo Wang, Xiaolong Wang, etc . - 【arXiv.org】


Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs2024.02.12

Víctor Gallego . - 【arXiv.org】


FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs2024.02.08

Eun Cheol Choi, Emilio Ferrara . - 【arXiv.org】


SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models2024.02.08

Peng Gao, Renrui Zhang, Chris Liu, Longtian Qiu, Siyuan Huang, etc . - 【arXiv.org】


On the Convergence of Zeroth-Order Federated Tuning in Large Language Models2024.02.08

Zhenqing Ling, Daoyuan Chen, Liuyi Yao, Yaliang Li, Ying Shen . - 【arXiv.org】


Large Language Model Meets Graph Neural Network in Knowledge Distillation2024.02.08

Shengxiang Hu, Guobing Zou, Song Yang, Yanglan Gan, Bofeng Zhang, etc . - 【arXiv.org】


Panacea: Pareto Alignment via Preference Adaptation for LLMs2024.02.03

Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Qingfu Zhang, etc . - 【arXiv.org】


Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?2024.01.31

Andreas Opedal, Alessandro Stolfo, Haruki Shirakami, Ying Jiao, Ryan Cotterell, etc . - 【arXiv.org】


InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model2024.01.29

Xiao-wen Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Bin Wang, etc . - 【arXiv.org】


True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning2024.01.25

Weihao Tan, Wentao Zhang, Shanqi Liu, Longtao Zheng, Xinrun Wang, etc . - 【arXiv.org】


ChatQA: Building GPT-4 Level Conversational QA Models2024.01.18

Zihan Liu, Wei Ping, Rajarshi Roy, Peng Xu, Chankyu Lee, etc . - 【arXiv.org】


Beyond Reference-Based Metrics: Analyzing Behaviors of Open LLMs on Data-to-Text Generation2024.01.18

Zdeněk Kasner, Ondřej Dušek . - 【arXiv.org】


DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models2024.01.11

Damai Dai, Chengqi Deng, Chenggang Zhao, R. Xu, Huazuo Gao, etc . - 【arXiv.org】


Can Large Language Models Beat Wall Street? Unveiling the Potential of AI in Stock Selection2024.01.08

G. Fatouros, Konstantinos Metaxas, John Soldatos, D. Kyriazis . - 【Social Science Research Network】


Instruct-Imagen: Image Generation with Multi-modal Instruction2024.01.03

Hexiang Hu, Kelvin C.K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, etc . - 【arXiv.org】


TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones2023.12.28

Zhengqing Yuan, Zhaoxu Li, Lichao Sun . - 【arXiv.org】


MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices2023.12.28

Xiangxiang Chu, Limeng Qiao, Xinyang Lin, Shuang Xu, Yang Yang, etc . - 【arXiv.org】


Generative AI for Math: Part I - MathPile: A Billion-Token-Scale Pretraining Corpus for Math2023.12.28

Zengzhi Wang, Rui Xia, Pengfei Liu . - 【arXiv.org】


WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation2023.12.20

Zhaojian Yu, Xin Zhang, Ning Shang, Yangyu Huang, Can Xu, etc . - 【arXiv.org】


A mathematical perspective on Transformers2023.12.17

Borjan Geshkovski, Cyril Letrouit, Yury Polyanskiy, Philippe Rigollet


Mathematical discoveries from program search with large language models.2023.12.14

Bernardino Romera-Paredes, M. Barekatain, Alexander Novikov, Matej Balog, M. P. Kumar, etc . - 【Nature】


LMDrive: Closed-Loop End-to-End Driving with Large Language Models2023.12.12

Hao Shao, Yuxuan Hu, Letian Wang, Steven L. Waslander, Yu Liu, etc . - 【arXiv.org】


LLM360: Towards Fully Transparent Open-Source LLMs2023.12.11

Zhengzhong Liu, Aurick Qiao, W. Neiswanger, Hongyi Wang, Bowen Tan, etc


From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3"2023.12.11

Takahide Yoshida, A. Masumori, Takashi Ikegami


Control Risk for Potential Misuse of Artificial Intelligence in Science2023.12.11

Jiyan He, Weitao Feng, Yaosen Min, Jingwei Yi, Kunsheng Tang, etc


Sequential Modeling Enables Scalable Learning for Large Vision Models2023.12.01

Yutong Bai, Xinyang Geng, K. Mangalam, Amir Bar, Alan Yuille, etc


MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers2023.11.27

Yawar Siddiqui, A. Alliegro, Alexey Artemov, Tatiana Tommasi, Daniele Sirigatti, etc . - 【arXiv.org】


Minimizing Factual Inconsistency and Hallucination in Large Language Models2023.11.23

I. Muneeswaran, Shreya Saxena, Siva Prasad, M. V. S. Prakash, Advaith Shankar, etc . - 【arXiv.org】


Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents2023.11.20

Zhuosheng Zhang, Yao Yao, Aston Zhang, Xiangru Tang, Xinbei Ma, etc . - 【arXiv.org】


An Embodied Generalist Agent in 3D World2023.11.18

Jiangyong Huang, Silong Yong, Xiaojian Ma, Xiongkun Linghu, Puhao Li, etc . - 【arXiv.org】


Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning2023.11.17

Rohit Girdhar, Mannat Singh, Andrew Brown, Quentin Duval, S. Azadi, etc . - 【arXiv.org】


Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding2023.11.14

Peng Jin, Ryuichi Takanobu, Caiwan Zhang, Xiaochun Cao, Li Yuan . - 【arXiv.org】


SpectralGPT: Spectral Foundation Model2023.11.13

D. Hong, Bing Zhang, Xuyang Li, Yuxuan Li, Chenyu Li, etc . - 【arXiv.org】


Social Motion Prediction with Cognitive Hierarchies2023.11.08

Wentao Zhu, Jason Qin, Yuke Lou, Hang Ye, Xiaoxuan Ma, etc . - 【arXiv.org】


Pre-training LLMs using human-like development data corpus2023.11.08

Khushi Bhardwaj, Raj Sanjay Shah, Sashank Varma . - 【arXiv.org】


mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration2023.11.07

Qinghao Ye, Haiyang Xu, Jiabo Ye, Mingshi Yan, Anwen Hu, etc . - 【arXiv.org】


Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation2023.11.06

Rusheb Shah, Quentin Feuillade--Montixi, Soroush Pour, Arush Tagade, Stephen Casper, etc . - 【arXiv.org】


Ziya2: Data-centric Learning is All LLMs Need2023.11.06

Ruyi Gan, Ziwei Wu, Renliang Sun, Junyu Lu, Xiaojun Wu, etc . - 【arXiv.org】


Levels of AGI: Operationalizing Progress on the Path to AGI2023.11.04

Meredith Ringel Morris, Jascha Narain Sohl-Dickstein, Noah Fiedel, T. Warkentin, Allan Dafoe, etc . - 【arXiv.org】


PILL: Plug Into LLM with Adapter Expert and Attention Gate2023.11.03

Fangyuan Zhang, Tingting Liang, Zhengyuan Wu, Yuyu Yin . - 【arXiv.org】


RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation2023.11.02

Yufei Wang, Zhou Xian, Feng Chen, Tsun-Hsuan Wang, Yian Wang, etc . - 【arXiv.org】


TopicGPT: A Prompt-based Topic Modeling Framework2023.11.02

Chau Minh Pham, Alexander Miserlis Hoyle, Simeng Sun, Mohit Iyyer . - 【arXiv.org】


ChipNeMo: Domain-Adapted LLMs for Chip Design2023.10.31

Mingjie Liu, Teodor-Dumitru Ene, Robert Kirby, Chris Cheng, Nathaniel Pinckney, etc . - 【arXiv.org】


Narratron: Collaborative Writing and Shadow-playing of Children Stories with Large Language Models2023.10.29

Yubo Zhao, Xiying Bao . - 【Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology】


CodeFusion: A Pre-trained Diffusion Model for Code Generation2023.10.26

Mukul Singh, J. Cambronero, Sumit Gulwani, Vu Le, Carina Negreanu, etc


GraphGPT: Graph Instruction Tuning for Large Language Models2023.10.19

Jiabin Tang, Yuhao Yang, Wei Wei, Lei Shi, Lixin Su, etc . - 【arXiv.org】


Creative Robot Tool Use with Large Language Models2023.10.19

Mengdi Xu, Peide Huang, Wenhao Yu, Shiqi Liu, Xilun Zhang, etc . - 【arXiv.org】


MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models2023.10.18

Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, etc . - 【arXiv.org】


Llemma: An Open Language Model For Mathematics2023.10.16

Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster, Marco Dos Santos, Stephen McAleer, etc . - 【arXiv.org】


BiLL-VTG: Bridging Large Language Models and Lightweight Visual Tools for Video-based Texts Generation2023.10.16

Ji Qi, Kaixuan Ji, Jifan Yu, Duokang Wang, Bin Xu, etc . - 【arXiv.org】


JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning2023.10.16

Issey Sukeda, Masahiro Suzuki, Hiroki Sakaji, Satoshi Kodera . - 【arXiv.org】


Table-GPT: Table-tuned GPT for Diverse Table Tasks2023.10.13

Peng Li, Yeye He, Dror Yashar, Weiwei Cui, Song Ge, etc . - 【arXiv.org】


MemGPT: Towards LLMs as Operating Systems2023.10.12

Charles Packer, Vivian Fang, Shishir G. Patil, Kevin Lin, Sarah Wooders, etc


Ferret: Refer and Ground Anything Anywhere at Any Granularity2023.10.11

Haoxuan You, Haotian Zhang, Zhe Gan, Xianzhi Du, Bowen Zhang, etc


Understanding the Effects of RLHF on LLM Generalisation and Diversity2023.10.10

Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, etc


Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading2023.10.08

Howard Chen, Ramakanth Pasunuru, Jason Weston, Asli Celikyilmaz . - 【arXiv.org】


xVal: A Continuous Number Encoding for Large Language Models2023.10.04

Siavash Golkar, Mariel Pettee, Michael Eickenberg, Alberto Bietti, M. Cranmer, etc . - 【arXiv.org】


How FaR Are Large Language Models From Agents with Theory-of-Mind?2023.10.04

Pei Zhou, Aman Madaan, Srividya Pranavi Potharaju, Aditya Gupta, Kevin R. McKee, etc


MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens2023.10.03

Kaizhi Zheng, Xuehai He, Xin Eric Wang . - 【arXiv.org】


PB-LLM: Partially Binarized Large Language Models2023.09.29

Yuzhang Shang, Zhihang Yuan, Qiang Wu, Zhen Dong . - 【arXiv.org】


GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond2023.09.28

Shen Zheng, Yuyu Zhang, Yijie Zhu, Chenguang Xi, Pengyang Gao, etc . - 【arXiv.org】


Chatmap: Large Language Model Interaction with Cartographic Data2023.09.28

Eren Unlu . - 【arXiv.org】


Integration of Large Language Models within Cognitive Architectures for Autonomous Robots2023.09.26

Miguel Ángel González Santamarta, F. J. Lera, Ángel Manuel Guerrero Higueras, Vicente Matellán Olivera . - 【arXiv.org】


Effective Distillation of Table-based Reasoning Ability from LLMs2023.09.22

Bohao Yang, Chen Tang, Kangning Zhao, Chenghao Xiao, Chenghua Lin . - 【arXiv.org】


ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs2023.09.22

Justin Chih-Yao Chen, Swarnadeep Saha, Mohit Bansal . - 【arXiv.org】


Chain-of-Verification Reduces Hallucination in Large Language Models2023.09.20

S. Dhuliawala, M. Komeili, Jing Xu, Roberta Raileanu, Xian Li, etc . - 【arXiv.org】


Kosmos-2.5: A Multimodal Literate Model2023.09.20

Tengchao Lv, Yupan Huang, Jingye Chen, Lei Cui, Shuming Ma, etc . - 【arXiv.org】


DreamLLM: Synergistic Multimodal Comprehension and Creation2023.09.20

Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, etc . - 【arXiv.org】


SwitchGPT: Adapting Large Language Models for Non-Text Outputs2023.09.14

Xinyu Wang, Bohan Zhuang, Qi Wu . - 【arXiv.org】


NExT-GPT: Any-to-Any Multimodal LLM2023.09.11

Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-Seng Chua . - 【arXiv.org】


From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting2023.09.08

Griffin Adams, Alexander R. Fabbri, Faisal Ladhak, Eric Lehman, Noémie Elhadad . - 【arXiv.org】


Large Language Models as Optimizers2023.09.07

Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, etc


DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models2023.09.07

Yung-Sung Chuang, Yujia Xie, Hongyin Luo, Yoon Kim, James R. Glass, etc . - 【arXiv.org】


YaRN: Efficient Context Window Extension of Large Language Models2023.08.31

Bowen Peng, Jeffrey Quesnelle, Honglu Fan, Enrico Shippole . - 【arXiv.org】


MVDream: Multi-view Diffusion for 3D Generation2023.08.31

Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, etc . - 【arXiv.org】


FedLogic: Interpretable Federated Multi-Domain Chain-of-Thought Prompt Selection for Large Language Models2023.08.29

Pengwei Xing, Songtao Lu, Han Yu . - 【arXiv.org】


PE-MED: Prompt Enhancement for Interactive Medical Image Segmentation2023.08.26

Ao Chang, Xing Tao, Xin Yang, Yuhao Huang, Xinrui Zhou, etc . - 【arXiv.org】


DARWIN Series: Domain Specific Large Language Models for Natural Science2023.08.25

Tong Xie, Yuwei Wan, Wei Huang, Yufei Zhou, Yixuan Liu, etc . - 【arXiv.org】


ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation2023.08.22

Jianghao Lin, Rongjie Shan, Chenxu Zhu, Kounianhua Du, Bo Chen, etc . - 【arXiv.org】


SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding2023.08.21

Tianyu Yu, Chengyue Jiang, Chao Lou, Shen Huang, Xiaobin Wang, etc . - 【arXiv.org】


Giraffe: Adventures in Expanding Context Lengths in LLMs2023.08.21

Arka Pal, Deep Karkhanis, Manley Roberts, S. Dooley, Arvind Sundararajan, etc . - 【arXiv.org】


ExpeL: LLM Agents Are Experiential Learners2023.08.20

Andrew Zhao, Daniel Huang, Quentin Xu, Matthieu Lin, Y. Liu, etc . - 【arXiv.org】


Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes2023.08.17

Zehan Wang, Haifeng Huang, Yang Zhao, Ziang Zhang, Zhou Zhao . - 【arXiv.org】


The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation2023.08.14

Patrick Fernandes, Daniel Deutsch, M. Finkelstein, Parker Riley, André F. T. Martins, etc . - 【arXiv.org】


Accelerating LLM Inference with Staged Speculative Decoding2023.08.08

Benjamin Spector, Chris Ré . - 【arXiv.org】


Shepherd: A Critic for Language Model Generation2023.08.08

Tianlu Wang, Ping Yu, Xiaoqing Tan, Sean O'Brien, Ramakanth Pasunuru, etc . - 【arXiv.org】


AgentBench: Evaluating LLMs as Agents2023.08.07

Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, etc . - 【arXiv.org】


Scaling Relationship on Learning Mathematical Reasoning with Large Language Models2023.08.03

Zheng Yuan, Hongyi Yuan, Cheng Li, Guanting Dong, Chuanqi Tan, etc . - 【arXiv.org】


Advancing Beyond Identification: Multi-bit Watermark for Language Models2023.08.01

Kiyoon Yoo, W. Ahn, N. Kwak . - 【arXiv.org】


A Private Watermark for Large Language Models2023.07.30

Aiwei Liu, Leyi Pan, Xuming Hu, Shuang Li, Lijie Wen, etc . - 【arXiv.org】


Robust Distortion-free Watermarks for Language Models2023.07.28

Rohith Kuditipudi, John Thickstun, Tatsunori Hashimoto, Percy Liang . - 【arXiv.org】


Publisher Correction: Large language models encode clinical knowledge.2023.07.27

K. Singhal, Shekoofeh Azizi, Tao Tu, S. S. Mahdavi, Jason Wei, etc . - 【Nature】


Med-Flamingo: a Multimodal Medical Few-shot Learner2023.07.27

Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, C. Zakka, etc . - 【arXiv.org】


CARTIER: Cartographic lAnguage Reasoning Targeted at Instruction Execution for Robots2023.07.21

Nikhil Kakodkar, D. Rivkin, Bobak H. Baghi, F. Hogan, Gregory Dudek . - 【arXiv.org】


ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning2023.07.18

Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Hao-Ran Wei, etc . - 【arXiv.org】


TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT2023.07.17

Liangyu Zha, Junlin Zhou, Liyao Li, Rui Wang, Qingyi Huang, etc . - 【arXiv.org】


MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots2023.07.16

Gelei Deng, Yi Liu, Yuekang Li, Kailong Wang, Ying Zhang, etc


Self-consistency for open-ended generations2023.07.11

Siddhartha Jain, Xiaofei Ma, Anoop Deoras, Bing Xiang . - 【arXiv.org】


LongNet: Scaling Transformers to 1,000,000,000 Tokens2023.07.05

Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, etc . - 【arXiv.org】


Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation2023.07.04

Jian Guan, Minlie Huang . - 【Annual Meeting of the Association for Computational Linguistics】


Math Agents: Computational Infrastructure, Mathematical Embedding, and Genomics2023.07.04

M. Swan, Takashi Kido, Eric Roland, R. P. D. Santos . - 【arXiv.org】


Conformer LLMs - Convolution Augmented Large Language Models2023.07.02

Prateek Verma . - 【arXiv.org】


Inferring the Goals of Communicating Agents from Actions and Instructions2023.06.28

Lance Ying, Tan Zhi-Xuan, Vikash K. Mansinghka, J. Tenenbaum . - 【arXiv.org】


Kosmos-2: Grounding Multimodal Large Language Models to the World2023.06.26

Zhiliang Peng, Wenhui Wang, Li Dong, Y. Hao, Shaohan Huang, etc . - 【arXiv.org】


AudioPaLM: A Large Language Model That Can Speak and Listen2023.06.22

Paul K. Rubenstein, Chulayuth Asawaroengchai, D. Nguyen, Ankur Bapna, Zalán Borsos, etc . - 【arXiv.org】


Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models2023.06.14

Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Kaifeng Bi, Xiaotao Gu, etc . - 【arXiv.org】


XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models2023.06.13

Omkar Thawakar, Abdelrahman M. Shaker, Sahal Shaji Mullappilly, Hisham Cholakkal, R. Anwer, etc . - 【arXiv.org】


Judging LLM-as-a-judge with MT-Bench and Chatbot Arena2023.06.09

Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, etc . - 【arXiv.org】


PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance2023.06.08

Qianqian Xie, Weiguang Han, Xiao Zhang, Yanzhao Lai, Min Peng, etc . - 【arXiv.org】


ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory2023.06.06

Chenxu Hu, Jie Fu, Chenzhuang Du, Simian Luo, J. Zhao, etc . - 【arXiv.org】


The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only2023.06.01

Guilherme Penedo, Quentin Malartic, Daniel Hesslow, Ruxandra-Aimée Cojocaru, Alessandro Cappelli, etc . - 【arXiv.org】


Baselines for Identifying Watermarked Large Language Models2023.05.29

Leonard Tang, Gavin Uberti, Tom Shlomi . - 【arXiv.org】


Undetectable Watermarks for Language Models2023.05.25

Miranda Christ, S. Gunn, Or Zamir . - 【IACR Cryptology ePrint Archive】


Mitigating Temporal Misalignment by Discarding Outdated Facts2023.05.24

Michael J.Q. Zhang, Eunsol Choi


Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective2023.05.24

Guhao Feng, Yuntian Gu, Bohang Zhang, Haotian Ye, Di He, etc


Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering2023.05.24

Avi Caciularu, Matthew E. Peters, Jacob Goldberger, Ido Dagan, Arman Cohan


Context-Aware Transformer Pre-Training for Answer Sentence Selection2023.05.24

Luca Di Liello, Siddhant Garg, Alessandro Moschitti


Gorilla: Large Language Model Connected with Massive APIs2023.05.24

Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez


Visual Programming for Text-to-Image Generation and Evaluation2023.05.24

Jaemin Cho, Abhay Zala, Mohit Bansal


Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model2023.05.24

Zirui Liu, Guanchu Wang, Shaochen Zhong, Zhaozhuo Xu, Daochen Zha, etc


LMs with a Voice: Spoken Language Modeling beyond Speech Tokens2023.05.24

Eliya Nachmani, Alon Levkovitch, Julian Salazar, Chulayuth Asawaroengchai, Soroosh Mariooryad, etc


Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator2023.05.24

Ziwei He, Meng Yang, Minwei Feng, Jingcheng Yin, Xinbing Wang, etc


CSTS: Conditional Semantic Textual Similarity2023.05.24

Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak S. Murahari, Victoria Graf, etc


STAR: Boosting Low-Resource Event Extraction by Structure-to-Text Data Generation with Large Language Models2023.05.24

Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung, P. Jeffrey Brantingham, Nanyun Peng, etc


Contrastive Learning of Sentence Embeddings from Scratch2023.05.24

Junlei Zhang, Zhenzhong Lan, Junxian He


Meta-Learning Online Adaptation of Language Models2023.05.24

Nathan J. Hu, Eric Mitchell, Christopher D. Manning, Chelsea Finn


Who Wrote this Code? Watermarking for Code Generation2023.05.24

Taehyun Lee, Seokhee Hong, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, etc


Reasoning over Hierarchical Question Decomposition Tree for Explainable Question Answering2023.05.24

Jiajie Zhang, Shulin Cao, Tingjia Zhang, Xin Lv, Jiaxin Shi, etc


Understanding Arithmetic Reasoning in Language Models using Causal Mediation Analysis2023.05.24

Alessandro Stolfo, Yonatan Belinkov, Mrinmaya Sachan


Active Learning for Natural Language Generation2023.05.24

Yotam Perlitz, Ariel Gera, Michal Shmueli-Scheuer, Dafna Sheinwald, Noam Slonim, etc


SmartTrim: Adaptive Tokens and Parameters Pruning for Efficient Vision-Language Models2023.05.24

Zekun Wang, Jingchang Chen, Wangchunshu Zhou, Ming Liu, Bing Qin


How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives2023.05.24

Xinpeng Wang, Leonie Weissweiler, Hinrich Schütze, Barbara Plank


ChatAgri: Exploring Potentials of ChatGPT on Cross-linguistic Agricultural Text Classification2023.05.24

Biao Zhao, Weiqiang Jin, Javier Del Ser, Guang Yang


Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models2023.05.24

Gen Luo, Yiyi Zhou, Tianhe Ren, Shengxin Chen, Xiaoshuai Sun, etc . - 【arXiv.org】


Unlocking Temporal Question Answering for Large Language Models Using Code Execution2023.05.24

Xingxuan Li, Liying Cheng, Qingyu Tan, Hwee Tou Ng, Shafiq Joty, etc


Bactrian-X : A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation2023.05.24

Haonan Li, Fajri Koto, Minghao Wu, Alham Fikri Aji, Timothy Baldwin


Injecting Knowledge into Biomedical Pre-trained Models via Polymorphism and Synonymous Substitution2023.05.24

Hongbo Zhang, Xiang Wan, Benyou Wang


LLMDet: A Large Language Models Detection Tool2023.05.24

Kangxi Wu, Liang Pang, Huawei Shen, Xueqi Cheng, Tat-Seng Chua


The Art of SOCRATIC QUESTIONING: Zero-shot Multimodal Reasoning with Recursive Thinking and Self-Questioning2023.05.24

Jingyuan Qi, Zhiyang Xu, Ying Shen, Minqian Liu, Di Jin, etc


Reasoning with Language Model is Planning with World Model2023.05.24

Shibo Hao, Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, etc


Large Language Models are Effective Table-to-Text Generators, Evaluators, and Feedback Providers2023.05.24

Yilun Zhao, Haowei Zhang, Shengyun Si, Linyong Nan, Xiangru Tang, etc


Improving Factuality of Abstractive Summarization without Sacrificing Summary Quality2023.05.24

Tanay Dixit, Fei Wang, Muhao Chen


OverPrompt: Enhancing ChatGPT Capabilities through an Efficient In-Context Learning Approach2023.05.24

Jiazheng Li, Runcong Zhao, Yulan He, Lin Gui


MMNet: Multi-Mask Network for Referring Image Segmentation2023.05.24

Yichen Yan, Xingjian He, Wenxuan Wan, Jing Liu


Tricking LLMs into Disobedience: Understanding, Analyzing, and Preventing Jailbreaks2023.05.24

Abhinav Rao, Sachin Vashistha, Atharva Naik, Somak Aditya, Monojit Choudhury


Editing Commonsense Knowledge in GPT2023.05.24

Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, etc


Cross-lingual Data Augmentation for Document-grounded Dialog Systems in Low Resource Languages2023.05.24

Qi Gou, Zehua Xia, Wen-Hau Du


Trade-Offs Between Fairness and Privacy in Language Modeling2023.05.24

Cleo Matzken, Steffen Eger, Ivan Habernal


Frugal Prompting for Dialog Models2023.05.24

Bishal Santra, Sakya Basak, Abhinandan De, Manish Gupta, Pawan Goyal


M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection2023.05.24

Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, etc


PIVOINE: Instruction Tuning for Open-world Information Extraction2023.05.24

Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, etc


Text encoders are performance bottlenecks in contrastive vision-language models2023.05.24

Amita Kamath, Jack Hessel, Kai-Wei Chang


Privacy Implications of Retrieval-Based Language Models2023.05.24

Yangsibo Huang, Samyak Gupta, Zexuan Zhong, Kai Li, Danqi Chen


Interpretable by Design Visual Question Answering2023.05.24

Xingyu Fu, Ben Zhou, Sihao Chen, Mark Yatskar, D. Roth


Leveraging GPT-4 for Automatic Translation Post-Editing2023.05.24

Vikas Raunak, Amr Sharaf, Hany Hassan Awadallah, Arul Menezes


CAR: Conceptualization-Augmented Reasoner for Zero-Shot Commonsense Question Answering2023.05.24

Weiqi Wang, Tianqing Fang, Wenxuan Ding, Baixuan Xu, Xin Liu, etc


Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers2023.05.24

Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, D. Pan


Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation2023.05.24

Prashant Krishnan, Zilong Wang, Yangkun Wang, Jingbo Shang


Machine Reading Comprehension using Case-based Reasoning2023.05.24

Dung Thai, Dhruv Agarwal, Mudit Chaudhary, R. Das, M. Zaheer, etc


Debiasing Made State-of-the-art: Revisiting the Simple Seed-based Weak Supervision for Text Classification2023.05.24

Chengyu Dong, Zihan Wang, Jingbo Shang


Text Conditional Alt-Text Generation for Twitter Images2023.05.24

Nikita Srivatsan, Sofia Samaniego, Omar Florez, Taylor Berg-Kirkpatrick


SSD-2: Scaling and Inference-time Fusion of Diffusion Language Models2023.05.24

Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov, Marjan Ghazvininejad


UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning2023.05.24

Ahmed Masry, Parsa Kavehzadeh, Xuan Long Do, Enamul Hoque, Shafiq Joty


Trusting Your Evidence: Hallucinate Less with Context-aware Decoding2023.05.24

Weijia Shi, Xiaochuang Han, M. Lewis, Yulia Tsvetkov, Luke Zettlemoyer, etc


In-Context Demonstration Selection with Cross Entropy Difference2023.05.24

Dan Iter, Reid Pryzant, Ruochen Xu, Shuohang Wang, Yang Liu, etc


GlobalBench: A Benchmark for Global Progress in Natural Language Processing2023.05.24

Y. Song, Catherine Cui, Simran Khanuja, Pengfei Liu, Fahim Faisal, etc


The student becomes the master: Matching GPT3 on Scientific Factual Error Correction2023.05.24

Dhananjay Ashok, Atharva Kulkarni, Hai Pham, Barnabás Póczos


PruMUX: Augmenting Data Multiplexing with Model Compression2023.05.24

Yushan Su, Vishvak S. Murahari, Karthik Narasimhan, Kai Li


Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts2023.05.24

Sheng Shen, Le Hou, Yanqi Zhou, Nan Du, Shayne Longpre, etc


A Causal View of Entity Bias in (Large) Language Models2023.05.24

Fei Wang, Wenjie Mo, Yiwei Wang, Wenxuan Zhou, Muhao Chen


Emergent inabilities? Inverse scaling over the course of pretraining2023.05.24

James A. Michaelov, B. Bergen


InteractiveIE: Towards Assessing the Strength of Human-AI Collaboration in Improving the Performance of Information Extraction2023.05.24

Ishani Mondal, Michelle Yuan, N Anandhavelu, Aparna Garimella, Francis Ferraro, etc


Reinforcement Learning finetuned Vision-Code Transformer for UI-to-Code Generation2023.05.24

Davit Soselia, Khalid Saifullah, Tianyi Zhou


KNN-LM Does Not Improve Open-ended Text Generation2023.05.24

Shufan Wang, Yixiao Song, Andrew Drozdov, Aparna Garimella, Varun Manjunatha, etc


Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations2023.05.24

Wenting Zhao, Justin T. Chiu, Claire Cardie, Alexander M. Rush


Language Models with Rationality2023.05.23

Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Kyle Richardson, Hinrich Schütze, etc


A Trip Towards Fairness: Bias and De-Biasing in Large Language Models2023.05.23

Leonardo Ranaldi, Elena Sofia Ruzzetti, Davide Venditti, Dario Onorati, Fabio Massimo Zanzotto


Question Answering as Programming for Solving Time-Sensitive Questions2023.05.23

Xinyu Zhu, Cheng Yang, Bei Chen, Siheng Li, Jian-Guang Lou, etc


PaD: Program-aided Distillation Specializes Large Models in Reasoning2023.05.23

Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xingwei Long, Bowen Zhou


Aligning Large Language Models through Synthetic Feedback2023.05.23

Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, etc


LogicLLM: Exploring Self-supervised Logic-enhanced Training for Large Language Models2023.05.23

Fangkai Jiao, Zhiyang Teng, Shafiq Joty, Bosheng Ding, Aixin Sun, etc


Masked Path Modeling for Vision-and-Language Navigation2023.05.23

Zi-Yi Dou, Feng Gao, Nanyun Peng . - 【arXiv.org】


ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models2023.05.23

Z. Chen, Kun Zhou, Beichen Zhang, Zheng Gong, Wayne Xin Zhao, etc


DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules2023.05.22

Yanchen Liu, William Held, Diyi Yang


Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision2023.05.22

Yucheng Cai, Hong Liu, Zhijian Ou, Y. Huang, Junlan Feng


Sentence Representations via Gaussian Embedding2023.05.22

Shohei Yoda, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda


LM-Switch: Lightweight Language Model Conditioning in Word Embedding Space2023.05.22

Chi Han, Jialiang Xu, Manling Li, Y. Fung, Chenkai Sun, etc


MacLaSa: Multi-Aspect Controllable Text Generation via Efficient Sampling from Compact Latent Space2023.05.22

Hanxing Ding, Liang Pang, Z. Wei, Huawei Shen, Xueqi Cheng, etc


Enhancing Cross-lingual Natural Language Inference by Soft Prompting with Multilingual Verbalizer2023.05.22

Shuang Li, Xuming Hu, Aiwei Liu, Yawen Yang, Fukun Ma, etc


A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches2023.05.22

Zihan Wang, Tianle Wang, Dheeraj Mekala, Jingbo Shang


Keeping Up with the Language Models: Robustness-Bias Interplay in NLI Data and Models2023.05.22

Ioana Baldini, Chhavi Yadav, Payel Das, K. Varshney


To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis2023.05.22

Fuzhao Xue, Yao Fu, Wangchunshu Zhou, Zangwei Zheng, Yang You


Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance2023.05.22

Yue Zhang, Leyang Cui, Deng Cai, Xinting Huang, Tao Fang, etc


InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT2023.05.22

Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuo Wang, etc


Making Language Models Better Tool Learners with Execution Feedback2023.05.22

Shuofei Qiao, Honghao Gui, Huajun Chen, Ningyu Zhang


GPT-SW3: An Autoregressive Language Model for the Nordic Languages2023.05.22

Ariel Ekgren, Amaru Cuba Gyllensten, F. Stollenwerk, Joey Öhman, Tim Isbister, etc


ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination2023.05.22

Dongfang Li, Jindi Yu, Baotian Hu, Zhenran Xu, Min Zhang


Infor-Coef: Information Bottleneck-based Dynamic Token Downsampling for Compact and Efficient language model2023.05.21

Wenxin Tan


Contrastive Learning with Logic-driven Data Augmentation for Logical Reasoning over Text2023.05.21

Qiming Bao, Alex Yuxuan Peng, Zhenyun Deng, Wanjun Zhong, Neset Tan, etc


Retrieving Texts based on Abstract Descriptions2023.05.21

Shauli Ravfogel, Valentina Pyatkin, Amir D. N. Cohen, Avshalom Manevich, Yoav Goldberg


Pruning Pre-trained Language Models with Principled Importance and Self-regularization2023.05.21

Siyu Ren, Kenny Q. Zhu


Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers2023.05.21

Linyuan Gong, Chenyan Xiong, Xiaodong Liu, Payal Bajaj, Yiqing Xie, etc


Pointwise Mutual Information Based Metric and Decoding Strategy for Faithful Generation in Document Grounded Dialogs2023.05.20

Yatin Nandwani, Vineet Kumar, Dinesh Raghu, Sachindra Joshi, L. Lastras


Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning2023.05.20

Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang


LogiCoT: Logical Chain-of-Thought Instruction-Tuning Data Collection with GPT-42023.05.20

Hanmeng Liu, Zhiyang Teng, Leyang Cui, Chaoli Zhang, Qiji Zhou, etc


Self-QA: Unsupervised Knowledge Guided Language Model Alignment2023.05.19

Xuanyu Zhang, Qing Yang


SelfzCoT: a Self-Prompt Zero-shot CoT from Semantic-level to Code-level for a Better Utilization of LLMs2023.05.19

IokTong Lei, ZhiDong Deng . - 【arXiv.org】


Self-Agreement: A Framework for Fine-tuning Language Models to Find Agreement among Diverse Opinions2023.05.19

Shiyao Ding, Takayuki Ito . - 【arXiv.org】


BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases2023.05.19

Xin Liu, Muhammad Khalifa, Lu Wang


STOAT: Structured Data to Analytical Text With Controls2023.05.19

Deepanway Ghosal, Preksha Nema, A. Raghuveer . - 【arXiv.org】


Decouple knowledge from paramters for plug-and-play language modeling2023.05.19

Xin Cheng, Yankai Lin, Xiuying Chen, Dongyan Zhao, Rui Yan . - 【arXiv.org】


Enhancing Personalized Dialogue Generation with Contrastive Latent Variables: Combining Sparse and Dense Persona2023.05.19

Yihong Tang, Bo Wang, Miao Fang, Dongming Zhao, Kun Huang, etc . - 【arXiv.org】


XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters2023.05.19

Xuanyu Zhang, Qing Yang, Dongliang Xu


Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning2023.05.19

Mustafa Safa Ozdayi, Charith S. Peris, Jack G. M. FitzGerald, Christophe Dupuy, Jimit Majmudar, etc . - 【arXiv.org】


RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought2023.05.19

Tianci Xue, Ziqi Wang, Zhenhailong Wang, Chi Han, Pengfei Yu, etc . - 【arXiv.org】


LLM Itself Can Read and Generate CXR Images2023.05.19

Suhyeon Lee, Won Jun Kim, Jong-Chul Ye . - 【arXiv.org】


Post Hoc Explanations of Language Models Can Improve Language Models2023.05.19

Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, etc . - 【arXiv.org】


Federated Foundation Models: Privacy-Preserving and Collaborative Learning for Large Models2023.05.19

Sixing Yu, J. P. Muñoz, A. Jannesari . - 【arXiv.org】


Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning2023.05.19

Po-Nien Kung, Nanyun Peng . - 【arXiv.org】


AutoTrial: Prompting Language Models for Clinical Trial Design2023.05.19

Zifeng Wang, Cao Xiao, Jimeng Sun . - 【arXiv.org】


Democratized Diffusion Language Model2023.05.18

Nikita Balagansky, Daniil Gavrilov . - 【arXiv.org】


Ahead-of-Time P-Tuning2023.05.18

Daniil Gavrilov, Nikita Balagansky . - 【arXiv.org】


SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation2023.05.18

Junkai Zhou, Liang Pang, Huawei Shen, Xueqi Cheng . - 【arXiv.org】


How does the task complexity of masked pretraining objectives affect downstream performance?2023.05.18

Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa . - 【arXiv.org】


Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings2023.05.18

Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Chong Deng, etc . - 【arXiv.org】


ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval2023.05.18

Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen, etc . - 【arXiv.org】


Efficient Prompting via Dynamic In-Context Learning2023.05.18

Wangchunshu Zhou, Yuchen Jiang, Ryan Cotterell, Mrinmaya Sachan . - 【arXiv.org】


LIMA: Less Is More for Alignment2023.05.18

Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, Jiao Sun, etc . - 【arXiv.org】


SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities2023.05.18

Dong Zhang, Shimin Li, Xin Zhang, Jun Zhan, P. Wang, etc . - 【arXiv.org】


The Web Can Be Your Oyster for Improving Large Language Models2023.05.18

Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jingyuan Wang, J. Nie, etc . - 【arXiv.org】


TOME: A Two-stage Approach for Model-based Retrieval2023.05.18

Ruiyang Ren, Wayne Xin Zhao, J. Liu, Huaqin Wu, Ji-rong Wen, etc . - 【arXiv.org】


When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario2023.05.17

Chengcheng Han, Liqing Cui, Renyu Zhu, J. Wang, Nuo Chen, etc . - 【arXiv.org】


Emergent and Predictable Memorization in Large Language Models2023.04.21

Stella Rose Biderman, USVSN Sai Prashanth, Lintang Sutawika, Hailey Schoelkopf, Quentin G. Anthony, etc


Improving Multiparty Interactions with a Robot Using Large Language Models2023.04.19

Prasanth Murali, Ian Steenstra, Hye Sun Yun, Ameneh Shamekhi, T. Bickmore . - 【CHI Extended Abstracts】


Large Language Models Can Be Used to Estimate the Latent Positions of Politicians2023.03.21

Patrick Y. Wu, Joshua A. Tucker, Jonathan Nagler, Solomon Messing


SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks2023.03.01

Kai-Wei Chang, Yu-Kai Wang, Hua Shen, Iu-thing Kang, W. Tseng, etc . - 【ArXiv】


Soft Prompt Guided Joint Learning for Cross-Domain Sentiment Analysis2023.03.01

Jingli Shi, Weihua Li, Quan-wei Bai, Yi Yang, Jianhua Jiang . - 【ArXiv】


EvoPrompting: Language Models for Code-Level Neural Architecture Search2023.02.28

Angelica Chen, David Dohan, David R. So . - 【ArXiv】


More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models2023.02.23

Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, C. Endres, Thorsten Holz, etc . - 【ArXiv】


Grimm in Wonderland: Prompt Engineering with Midjourney to Illustrate Fairytales2023.02.17

M. Ruskov . - 【ArXiv】


LabelPrompt: Effective Prompt-based Learning for Relation Classification2023.02.16

W. Zhang, Xiaoning Song, Zhenhua Feng, Tianyang Xu, Xiaojun Wu . - 【ArXiv】


Prompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech Recognition2023.02.16

Minsu Kim, Hyungil Kim, Y. Ro . - 【ArXiv】


Prompting for Multimodal Hateful Meme Classification2023.02.08

Rui Cao, R. Lee, Wen-Haw Chong, Jing Jiang . - 【Conference on Empirical Methods in Natural Language Processing】


Toxicity Detection with Generative Prompt-based Inference2022.05.24

Yau-Shian Wang, Y. Chang . - 【ArXiv】


Learning to Transfer Prompts for Text Generation2022.05.03

Junyi Li, Tianyi Tang, J. Nie, Ji-rong Wen, Wayne Xin Zhao . - 【North American Chapter of the Association for Computational Linguistics】


RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction2022.03.17

Yew Ken Chia, Lidong Bing, Soujanya Poria, Luo Si . - 【Findings】


QaNER: Prompting Question Answering Models for Few-shot Named Entity Recognition2022.03.03

Andy T. Liu, Wei Xiao, Henghui Zhu, Dejiao Zhang, Shang-Wen Li, etc . - 【ArXiv】


PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts2022.02.02

Stephen H. Bach, Victor Sanh, Zheng Xin Yong, Albert Webson, Colin Raffel, etc . - 【Annual Meeting of the Association for Computational Linguistics】


Few-Shot Bot: Prompt-Based Learning for Dialogue Systems2021.10.15

Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung . - 【ArXiv】


SentiPrompt: Sentiment Knowledge Enhanced Prompt-Tuning for Aspect-Based Sentiment Analysis2021.09.17

Chengxi Li, Feiyu Gao, Jiajun Bu, Lu Xu, Xiang Chen, etc . - 【ArXiv】


LightNER: A Lightweight Tuning Paradigm for Low-resource NER via Pluggable Prompting2021.08.31

Xiang Chen, Lei Li, Shumin Deng, Chuanqi Tan, Changliang Xu, etc . - 【International Conference on Computational Linguistics】


Program Synthesis with Large Language Models2021.08.16

Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, H. Michalewski, etc . - 【ArXiv】


Evaluating Large Language Models Trained on Code2021.07.07

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde, etc . - 【ArXiv】


KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction2021.04.15

Xiang Chen, Ningyu Zhang, Xin Xie, Shumin Deng, etc . - 【The Web Conference】


Language Models as Knowledge Bases?2019.09.01

Fabio Petroni, Tim Rocktäschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, etc . - 【Conference on Empirical Methods in Natural Language Processing】


Leveraging Commonsense Knowledge from Large Language Models for Task and Motion Planning

Yan Ding, Xiaohan Zhang


AdaPrompt: Adaptive Prompt-based Finetuning for Relation Extraction

Xiang Chen, Xin Xie, Ningyu Zhang, Jiahuan Yan, Shumin Deng, etc


SmartMoE: Efficiently Training Sparsely-Activated Models through Combining Offline and Online Parallelization

Mingshu Zhai, Jiaao He, Zixuan Ma, Zan Zong, Runqing Zhang, etc . - 【USENIX Annual Technical Conference】


ChatGPT-Based Learning Platform for Creation of Different Attack Model Signatures and Development of Defense Algorithm for Cyberattack Detection

T. Santhi, K. Srinivasan . - 【IEEE Transactions on Learning Technologies】
