Corpus Size vs Quality: New Research on the Efficiency of Hungarian Language Models
Hungarian language technology research has reached a significant milestone: a comprehensive study has found that a larger corpus does not necessarily improve performance in morphological analysis. The authors, Andrea Dömötör, Balázs Indig, and Dávid Márk Nemeskey, conducted a detailed analysis of three Hungarian-language corpora of varying