From sunblock to softblock: Analyzing the correlates of neology in published writing and on social media

Published: arXiv: 2602.13123v1
Authors

Maria Ryskina, Matthew R. Gormley, Kyle Mahowald, David R. Mortensen, Taylor Berg-Kirkpatrick, Vivek Kulkarni

Abstract

Living languages are shaped by a host of conflicting internal and external evolutionary pressures. While some of these pressures are universal across languages and cultures, others differ depending on the social and conversational context: language use in newspapers is subject to very different constraints than language use on social media. Prior distributional semantic work on English word emergence (neology) identified two factors correlated with creation of new words by analyzing a corpus consisting primarily of historical published texts (Ryskina et al., 2020, arXiv:2001.07740). Extending this methodology to contextual embeddings in addition to static ones and applying it to a new corpus of Twitter posts, we show that the same findings hold for both domains, though the topic popularity growth factor may contribute less to neology on Twitter than in published writing. We hypothesize that this difference can be explained by the two domains favouring different neologism formation mechanisms.

Paper Summary

Problem
The paper asks which factors correlate with the creation of new English words (neology), and whether those factors behave the same way in different contexts. Earlier work identified two such factors using a corpus of mostly historical published texts; this paper tests whether they also hold for social media, where language use is subject to very different constraints.
Key Innovation
The paper extends previous research on neology (Ryskina et al., 2020) to a new corpus of Twitter posts. It uses a more robust estimate of the frequency growth monotonicity measure and adds further metrics to test the demand hypothesis. The authors also use contextual embeddings, which are more suitable for social media data, alongside static ones to analyze the relationship between neology and topic popularity.
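To make the idea of a frequency growth monotonicity measure concrete, here is a minimal sketch that scores how consistently a frequency series rises over time using Spearman rank correlation between time steps and frequencies. This is an illustrative assumption: the summary does not specify the paper's estimator, and the function names and toy frequency series below are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # Convention chosen here: a constant series has no trend.
    return cov / (var_x * var_y) ** 0.5


def growth_monotonicity(frequencies):
    """Hypothetical helper: rank correlation between time index and frequency."""
    time_steps = list(range(len(frequencies)))
    return spearman(time_steps, frequencies)


# Toy (made-up) relative frequencies of a topic's words across time steps.
rising = [1.0, 1.5, 2.2, 2.1, 3.0, 4.4]
flat = [2.0, 2.0, 2.0, 2.0]
falling = [5.0, 4.0, 3.0, 2.0]

for name, series in [("rising", rising), ("flat", flat), ("falling", falling)]:
    print(name, round(growth_monotonicity(series), 3))
```

A score near 1 indicates steady growth, a score near -1 steady decline, and a score near 0 no consistent trend. Using ranks rather than raw values keeps the score insensitive to the scale of the frequencies, which is one plausible reason to prefer a rank-based measure for noisy counts.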
Practical Impact
Knowing which factors correlate with neology, and how their contribution differs between published writing and social media, gives researchers a clearer picture of how speakers adapt and create language in different contexts. These insights can inform language teaching, language policy, and language technology development. Understanding the mechanisms of neology may also help researchers identify linguistic innovations that are useful for communication in diverse contexts.
Analogy / Intuitive Explanation
Imagine a language as a living organism that constantly evolves and adapts to its environment. Neology is like the process of mutation, where new words are created to fill gaps in the language or to describe new concepts. Just as a species may adapt to its environment by developing new traits, language users adapt to their context by creating new words. This research helps us understand how this process of linguistic adaptation occurs in different contexts, such as published writing and social media.
Paper Information
Categories: cs.CL
arXiv ID: 2602.13123v1