I'm impressed with the token efficiency! Representing the image in 258 tokens and still getting all that detail out is really good. I recall that GPT-4 Vision in low-detail mode uses ~100ish tokens and doesn't get detail out that well. Might be an interesting case for a
Token Efficiency in Vision AI Models Comparison