How Neural Networks Detect and Interpret Wordplay: New Insights from HSE Researchers

An international team including researchers from the HSE Faculty of Computer Science has presented KoWit-24, an annotated dataset of 2,700 Russian-language Kommersant news headlines containing wordplay. The dataset enables an assessment of how artificial intelligence detects and interprets wordplay. Experiments with five large language models show that even advanced systems still make mistakes, and that interpreting wordplay is more challenging for them than detecting it. The results were presented at the RANLP conference; the paper is available on arXiv, and the dataset and the code for reproducing the experiments are available on GitHub.
Wordplay refers to deliberate use of language that violates linguistic norms in order to attract attention, entertain, or amuse the reader. It is common in Russian news headlines and can take various forms. For example, the headline ‘Osobo bumazhnye persony’ plays on the phrase ‘Osobo vazhnye persony’ (Russian for ‘very important persons’). The word vazhnye (‘important’) is replaced with bumazhnye (‘paper-related’), which rhymes with the original and shifts the meaning toward the topic of paper production. Another example is ‘Kod naklikal,’ the headline of an article about open-source code. It closely resembles ‘kot naplakal,’ an idiom meaning ‘very little,’ thereby creating a humorous ambiguity.
For human readers, such wordplay in headlines is immediately apparent and requires no explanation. However, large language models such as ChatGPT or GigaChat Max are often at a loss: they struggle to detect the wordplay and struggle even more to explain the joke. One reason for this difficulty is the limited range of humour datasets on which LLMs are trained. In most cases, humour in these datasets is represented by canned internet jokes explicitly labelled as ‘jokes,’ which is insufficient for the models to learn why something is funny. In addition, such datasets contain almost no annotation: there are no machine- or human-readable layers of description indicating whether wordplay is present, what type of technique is used, what the headline refers to, and so on.
Researchers from the HSE Faculty of Computer Science, in collaboration with colleagues from IT:U—Interdisciplinary Transformation University Austria—and independent researchers, have created KoWit-24, a dataset dedicated to wordplay. It comprises 2,700 headlines from the Russian business daily Kommersant published between January 2021 and December 2023, along with contextual information: each headline is accompanied by a short description of the news story (the lead) and a summary. For each instance of wordplay, the authors manually annotated the type of technique, identified the anchors—the words that trigger the wordplay—and, where possible, linked the original expressions to relevant Wikipedia articles.
The authors adopted linguist Alan Scott Partington’s definition of wordplay, according to which wordplay occurs when the same expression can be interpreted in at least two ways and this effect is intentional. Wordplay can arise in several ways. One case involves ambiguity inherent in a word or its sound. For example, in the headline ‘Volgu ne mogut zastavit’ tech’ bystree,’ the word Volgu (Volga) refers both to the river and to a federal highway with the same name. Another case involves a slight modification of a well-known phrase or title, in which the author alters the wording while relying on the reader to recognise the original and complete the joke. For instance, ‘Missiya sokratima’ alludes to ‘Missiya nevypolnima,’ the Russian title of the film Mission: Impossible, while the headline itself suggests that a diplomatic mission can be downsized.
The researchers also distinguished ‘nonce words,’ which are coined for a single occasion, and oxymorons, which combine two contradictory meanings. This approach allowed them not only to collect and describe examples but also to compare the performance of different language models.
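The annotation layers described above can be pictured as one record per headline. The following Python sketch is illustrative only: the class name, the field names, and the technique label are assumptions made for this example and are not the actual KoWit-24 schema, and the lead and summary strings are placeholders rather than Kommersant text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HeadlineRecord:
    """One annotated headline. All field names are hypothetical."""
    headline: str
    lead: str                               # short description of the news story
    summary: str
    has_wordplay: bool
    technique: Optional[str] = None         # type of technique (label is illustrative)
    anchors: List[str] = field(default_factory=list)  # words that trigger the wordplay
    original: Optional[str] = None          # expression the headline alludes to, if any
    wikipedia_url: Optional[str] = None     # link to a relevant article, where one exists


def is_consistent(rec: HeadlineRecord) -> bool:
    """Basic sanity check for a record under the assumed layout."""
    if not rec.has_wordplay:
        # A headline without wordplay should carry no wordplay annotation.
        return rec.technique is None and not rec.anchors and rec.original is None
    # A headline with wordplay needs a technique and anchors found in the headline.
    # Naive whitespace split: punctuation attached to a word would break the match.
    words = rec.headline.lower().split()
    return (rec.technique is not None and bool(rec.anchors)
            and all(a.lower() in words for a in rec.anchors))


example = HeadlineRecord(
    headline="Missiya sokratima",
    lead="<lead placeholder>",
    summary="<summary placeholder>",
    has_wordplay=True,
    technique="modified_phrase",            # hypothetical label
    anchors=["sokratima"],                  # assumed anchor, for illustration
    original="Missiya nevypolnima",
)
print(is_consistent(example))
```

Keeping the anchors and the original expression as separate fields mirrors the distinction the article draws between the words that trigger the wordplay and the phrase the reader is expected to recall.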
After annotation, the authors used the dataset to test five LLMs: GPT-4o, YandexGPT-4, GigaChat Lite, GigaChat Max, and Mistral NeMo. Each model was provided with a headline and the corresponding news lead and asked to perform two tasks: first, to determine whether the headline contained wordplay, and second, to interpret it by identifying the original phrase or reference. The researchers compared the effects of two types of prompts: a simple prompt asking whether the headline contained wordplay, and an extended prompt providing a definition along with examples of different wordplay types. The extended prompt improved detection performance for three of the five models. GPT-4o demonstrated the strongest performance in both detection and interpretation. For all models, interpreting the source of the joke proved significantly more difficult than simply detecting the presence of wordplay.
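The two-task setup above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' experimental code: the prompt wording is invented for the example, the article does not say which metrics were used (precision, recall, and F1 are assumed here for detection), and the substring check for interpretation is a crude stand-in for whatever judgement the paper applied. The labels in the toy run are made up and are not experimental results.

```python
from typing import List, Tuple

# Hypothetical prompt wording; the prompts used in the paper may differ.
SIMPLE_PROMPT = "Does this headline contain wordplay? Answer yes or no."
EXTENDED_PROMPT = (
    "Wordplay occurs when the same expression can be interpreted in at least "
    "two ways and this effect is intentional. {examples}\n"
    "Does this headline contain wordplay? Answer yes or no."
)


def build_prompt(headline: str, lead: str, extended: bool, examples: str = "") -> str:
    """Combine the instruction with the headline and its news lead."""
    instruction = EXTENDED_PROMPT.format(examples=examples) if extended else SIMPLE_PROMPT
    return f"{instruction}\nHeadline: {headline}\nLead: {lead}"


def detection_scores(gold: List[bool], pred: List[bool]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the binary 'contains wordplay' decision."""
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def interpretation_matches(gold_original: str, answer: str) -> bool:
    """Assumed proxy: the model's answer mentions the annotated original phrase."""
    return gold_original.lower() in answer.lower()


# Toy labels, invented for illustration only.
gold = [True, True, False, False]
pred = [True, False, True, False]
print(detection_scores(gold, pred))
print(interpretation_matches("Missiya nevypolnima",
                             "The headline alludes to 'Missiya nevypolnima'."))
```

Scoring detection and interpretation separately reflects the article's finding that the two tasks differ in difficulty; a single combined score would hide that gap.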
‘KoWit-24 addresses two key limitations of earlier datasets: it provides context for each headline and includes multi-level annotation. This transforms a collection of examples into a full-fledged “testbed” for AI. It now allows for an objective comparison of models—whether a model can detect wordplay, identify the anchor, and correctly recall the original phrase or reference. Such verifiable metrics not only allow for a more accurate evaluation of current systems but also support their intentional improvement through selection of prompts, training examples, and fact-checking strategies. In the future, we plan to investigate whether this dataset can be used to enhance humour generation,’ says Pavel Braslavski, Associate Professor at the HSE Faculty of Computer Science and co-author of the paper.
In addition, the dataset establishes a common and transparent standard for evaluation, as researchers use the same data and experimental scripts. This reduces variability in the results and helps develop models that better understand natural language, rather than merely following the logical structure of the text.


