Right now it's constrained on kontext only taking a single image. You add multiple by concatenating images into a single one. The more images you pass in: – the less information gets kept from each one (fewer pixels per image)
– the harder it gets to prompt
– the more likely
Multimodal AI Image Processing Constraints and Trade-offs
By
–
Leave a Reply