The development of large language models has accelerated in recent months. The HUN-REN Hungarian Research Centre for Linguistics has now introduced PULI LlumiX, a model optimised for Hungarian that was continually pre-trained and then fine-tuned for instruction following. Yang et al. (2025) present the model in detail, describing the pre-training and fine-tuning procedures and evaluating its performance across several benchmarks.
PULI LlumiX is built on the Llama-2 architecture and was fine-tuned on 66,000 English and 15,000 Hungarian prompts. The resulting system performed strongly on Hungarian-language benchmarks: it reached 66.98% accuracy on HuCOLA, 70.06% on HuSST, and 74.54% on HuRTE, surpassing earlier Hungarian language models such as PULI Trio and HILANCO-GPTX. It achieved these results without any prior task-specific training; that is, the model solved the tests by relying on its existing linguistic knowledge and general capabilities, without having seen examples from them beforehand. The research also paid special attention to performance on long contexts, examined with a "Needle in a Haystack" type test, which showed that the model can efficiently retrieve relevant information from large amounts of text.
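To make the idea of a "Needle in a Haystack" test concrete, here is a minimal Python sketch of how such a test is commonly set up: one relevant sentence (the needle) is placed at varying depths in filler text of varying length, and the share of trials in which the answer contains the expected fact is counted. This is an illustration only, not the protocol used by Yang et al. (2025). The function names, the filler text, and `toy_answer_fn` (a stand-in for a real model call) are all hypothetical.

```python
from typing import Callable, Sequence


def build_haystack(filler: str, needle: str, total_sentences: int, depth: float) -> str:
    """Insert `needle` among `total_sentences` copies of `filler`.

    `depth` is the relative position: 0.0 puts the needle first, 1.0 puts it last.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    sentences = [filler] * total_sentences
    sentences.insert(int(depth * total_sentences), needle)
    return " ".join(sentences)


def retrieval_accuracy(
    answer_fn: Callable[[str, str], str],
    filler: str,
    needle: str,
    question: str,
    expected: str,
    lengths: Sequence[int],
    depths: Sequence[float],
) -> float:
    """Return the fraction of (length, depth) trials whose answer contains `expected`."""
    hits = 0
    trials = 0
    for n in lengths:
        for d in depths:
            context = build_haystack(filler, needle, n, d)
            reply = answer_fn(context, question)
            hits += int(expected.lower() in reply.lower())
            trials += 1
    return hits / trials


def toy_answer_fn(context: str, question: str) -> str:
    """Hypothetical stand-in for a model call.

    It ignores the question and returns the first sentence that mentions
    'secret code'; a real test would query the language model here.
    """
    for sentence in context.split(". "):
        if "secret code" in sentence:
            return sentence
    return ""


if __name__ == "__main__":
    score = retrieval_accuracy(
        toy_answer_fn,
        filler="The weather was mild that day.",
        needle="The secret code is 4711.",
        question="What is the secret code?",
        expected="4711",
        lengths=[10, 100, 1000],
        depths=[0.0, 0.25, 0.5, 0.75, 1.0],
    )
    print(f"retrieval accuracy: {score:.2f}")
```

In a real evaluation, `toy_answer_fn` would be replaced by a call to the model under test, and the results are usually reported per context length and needle depth rather than as a single number.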
Qualitative analyses complement the quantitative results and also confirm the model's advanced capabilities. According to detailed examinations, PULI LlumiX not only follows Hungarian-language instructions accurately but also adapts to different linguistic registers and handles social contexts appropriately. The analyses also indicate that, through transfer learning, the model acquires significant knowledge from other languages, which improves its Hungarian performance. PULI LlumiX is thus not merely another language model; it could represent a milestone in the development of Hungarian language technology.
Sources:
1.

2. https://rgai.inf.u-szeged.hu/sites/rgai.inf.u-szeged.hu/files/mszny2025 (1).pdf