AI Dynamics

Global AI News Aggregator

HiDrop: Efficient Visual Token Reduction for Multimodal LLMs

What if MLLMs could process visual data much faster without sacrificing performance? Eastern Institute of Technology, Ningbo, with USTC, SJTU, and LMU Munich, presents HiDrop to do just that!

This new framework reduces visual tokens in two ways: it processes them only when active fusion truly begins (Late Injection), and it dynamically prunes them across deeper layers (Concave Pyramid Pruning with Early Exit). This focuses computation where it matters most. HiDrop compresses ~90% of visual tokens, matches original MLLM performance, and accelerates training by 1.72x. A new state of the art for efficient MLLM training and inference!

HiDrop: Hierarchical Vision Token Reduction in MLLMs via Late Injection, Concave Pyramid Pruning, and Early Exit

Paper: arxiv.org/pdf/2602.23699
Code: github.com/EIT-NLP/HiDrop
Our report: mp.weixin.qq.com/s/QKGZ7cFi0…

📬 #PapersAccepted by Jiqizhixin
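To make the three ideas concrete, here is a minimal Python sketch of a per-layer visual-token budget. It is not the authors' implementation: the function name, every parameter, and the power-law decay are assumptions chosen for illustration. In particular, it assumes "concave" means the pyramid's sides bow inward (the token count falls steeply at first and then flattens), and that "early exit" means visual tokens are dropped entirely after a chosen layer. The post does not specify the actual schedule.

```python
def visual_token_schedule(num_layers, num_tokens, inject_layer,
                          exit_layer, min_tokens, power=2.0):
    """Return how many visual tokens each layer processes (hypothetical schedule).

    Layers before `inject_layer` see no visual tokens (late injection),
    layers from `exit_layer` onward see none either (early exit), and the
    layers in between follow a decaying budget (pyramid pruning).
    """
    if not 0 <= inject_layer < exit_layer <= num_layers:
        raise ValueError("need 0 <= inject_layer < exit_layer <= num_layers")
    span = exit_layer - inject_layer
    schedule = []
    for layer in range(num_layers):
        if layer < inject_layer or layer >= exit_layer:
            schedule.append(0)
            continue
        # t runs from 0 at the injection layer to 1 at the last active layer.
        t = (layer - inject_layer) / max(span - 1, 1)
        # Assumed decay: steep drop first, then a long flat tail.
        keep_fraction = (1.0 - t) ** power
        schedule.append(max(min_tokens, round(num_tokens * keep_fraction)))
    return schedule


def retained_fraction(schedule, num_tokens):
    """Share of (layer, visual token) pairs kept versus an unpruned model."""
    return sum(schedule) / (len(schedule) * num_tokens)


# Illustrative numbers only; none of these values come from the paper.
schedule = visual_token_schedule(num_layers=32, num_tokens=576,
                                 inject_layer=8, exit_layer=24, min_tokens=16)
print(schedule)
print(f"retained fraction: {retained_fraction(schedule, 576):.3f}")
```

The retained fraction printed here depends entirely on the made-up settings above, so it should not be read as a reproduction of the ~90% compression figure quoted in the post.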

→ View original post on X — @jiqizhixin, 2026-04-06 10:16 UTC
