MERGE3: Efficient Evolutionary Merging on Consumer-grade GPUs

Abstract

Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE3, an efficient framework that makes evolutionary merging feasible on a single GPU by reducing fitness computation costs 50× while preserving performance. MERGE3 achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.
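To make the IRT-based performance estimation concrete, here is a minimal sketch of the standard two-parameter logistic (2PL) IRT model that the abstract alludes to. The item parameters, ability value, and function names below are illustrative assumptions, not the paper's actual implementation: the idea is that once items in a reduced evaluation set are calibrated with discrimination and difficulty parameters, a candidate merge's benchmark accuracy can be estimated cheaply from a single ability estimate instead of a full evaluation.

```python
import math

def p_correct(theta, a, b):
    """2PL IRT: probability that a model with ability theta answers an
    item with discrimination a and difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def estimate_accuracy(theta, items):
    """Estimate benchmark accuracy as the mean expected correctness
    over calibrated items, given as (discrimination, difficulty) pairs."""
    return sum(p_correct(theta, a, b) for a, b in items) / len(items)

# Hypothetical calibrated item parameters and an ability estimate for
# one candidate merge (all values are made up for illustration).
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7), (1.0, 1.2)]
theta = 0.4
print(f"estimated accuracy: {estimate_accuracy(theta, items):.3f}")
```

In an evolutionary loop, such a cheap estimator would serve as the fitness function: each candidate merge's ability is inferred from its answers on the small extracted set, and its full-benchmark score is predicted rather than measured.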

Publication
Proceedings of the 42nd International Conference on Machine Learning
Tommaso Mencattini
Research Intern
Adrian R. Minut
PhD Student
Donato Crisostomi
PhD Student

Andrea Santilli
Research Scientist, Nous Research

Emanuele Rodolà
Full Professor