Could multimodal vision-language models (VLMs) help biodiversity researchers retrieve images for their studies? Researchers from MIT CSAIL, @ucl, @inaturalist, @EdinburghUni, and @UMassAmherst designed a performance test to find out. Each VLM's task: locate and reorganize the most
Multimodal VLMs for Biodiversity Image Retrieval Research