Post-training for Efficient Communication via Convention Formation

Published: arXiv: 2508.06482v1
Authors

Yilun Hua, Evan Wang, Yoav Artzi

Abstract

Humans communicate with increasing efficiency in multi-turn interactions, by adapting their language and forming ad-hoc conventions. In contrast, prior work shows that LLMs do not naturally show this behavior. We develop a post-training process to develop this ability through targeted fine-tuning on heuristically identified demonstrations of convention formation. We evaluate with two new benchmarks focused on this capability. First, we design a focused, cognitively-motivated interaction benchmark that consistently elicits strong convention formation trends in humans. Second, we create a new document-grounded reference completion task that reflects in-the-wild convention formation behavior. Our studies show significantly improved convention formation abilities in post-trained LLMs across the two evaluation methods.

Paper Summary

Key Innovation
This paper proposes a targeted post-training process that teaches LLMs to form conventions spontaneously during multi-turn linguistic interactions. The models are fine-tuned on demonstrations of convention formation that are identified heuristically in human corpora.
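To make the idea of "heuristically identified demonstrations" concrete, here is a minimal sketch of one possible heuristic. It is not the paper's method: the `Mention` type, the stopword list, and the rule itself (a later mention of the same referent that is shorter and reuses a content word) are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    referent_id: str  # hypothetical label for the thing being referred to
    text: str         # the referring expression as written


# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "with", "that", "one", "is", "and", "on", "in"}


def content_words(text: str) -> set[str]:
    """Lowercased words with punctuation stripped, minus stopwords."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return words - STOPWORDS - {""}


def find_convention_examples(mentions: list[Mention]) -> list[tuple[str, str]]:
    """Return (first_mention, later_mention) pairs where the later mention
    is shorter and shares at least one content word with the first."""
    first_seen: dict[str, str] = {}
    pairs: list[tuple[str, str]] = []
    for m in mentions:
        if m.referent_id not in first_seen:
            first_seen[m.referent_id] = m.text
            continue
        first = first_seen[m.referent_id]
        shorter = len(m.text.split()) < len(first.split())
        overlap = content_words(first) & content_words(m.text)
        if shorter and overlap:
            pairs.append((first, m.text))
    return pairs


dialogue = [
    Mention("t1", "the figure that looks like an ice skater with one leg up"),
    Mention("t2", "the rabbit shape"),
    Mention("t1", "the ice skater"),
    Mention("t2", "a completely different long description here"),
]
examples = find_convention_examples(dialogue)
```

In this toy dialogue only the `t1` pair qualifies, because the second `t2` mention is longer than the first. Pairs selected this way could, in principle, serve as fine-tuning demonstrations.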
Practical Impact
Post-trained LLMs communicate more efficiently and adapt their language over the course of a multi-turn interaction. This is relevant to natural language processing applications that involve extended interaction, such as chatbots, virtual assistants, and language translation systems.
Analogy / Intuitive Explanation
Imagine you're playing a game where you have to refer to different objects, and each time you need to decide how best to describe them. At first, you might use a lot of words to explain what an object is, but as you continue playing, you start using shortcuts and abbreviations to make communication more efficient. This is what humans do when they form conventions during linguistic interactions. The post-training process proposed in this paper helps LLMs learn to do the same thing, making their communication more human-like.

In summary, this research addresses the inability of LLMs to form conventions during linguistic interactions, proposes a targeted post-training process to develop this ability, and demonstrates improved communication efficiency as a result.
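The shortening described in the game analogy can be made measurable with a simple word count. The sketch below is only an illustration: the function names and the example messages are invented here, and the paper's benchmarks may use different metrics.

```python
def word_counts(messages: list[str]) -> list[int]:
    """Number of whitespace-separated words in each message."""
    return [len(m.split()) for m in messages]


def length_reduction(messages: list[str]) -> float:
    """Fraction by which the word count drops from the first message
    to the last one (0.0 means no shortening)."""
    counts = word_counts(messages)
    return (counts[0] - counts[-1]) / counts[0]


# Hypothetical descriptions of the same object across three rounds.
rounds = [
    "the one that looks like a person dancing with arms raised",
    "the dancing person",
    "dancer",
]
counts = word_counts(rounds)          # [11, 3, 1]
reduction = length_reduction(rounds)  # 10 / 11, roughly 0.91
```

A speaker who forms conventions shows falling counts across rounds. A speaker who repeats the full description each time shows a reduction near zero.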
Paper Information
Categories: cs.CL, cs.AI, cs.LG
arXiv ID: 2508.06482v1
