✨ Meet #ResiDual, a novel perspective on the alignment of multimodal latent spaces!
Think of it as a spectral "panning for gold" along the residual stream. It improves text-image alignment by simply amplifying task-related directions! 🌌🔍 https://t.co/UuXoYBBsT5
[1/6] pic.twitter.com/z75Vd7iQYs
— Valentino Maiorca (@ValeMaiorca) November 4, 2024