Molmo VLM is an interesting open-source family of models from AllenAI that excels at pointing to objects, visual question answering (VQA), and reading analog clock faces, tasks where even models like GPT-4o struggle. Its success lies in the PixMo dataset, a meticulously curated collection built from the ground… pic.twitter.com/N1fakvTy1a
— Satya Mallick (@LearnOpenCV) December 24, 2024