Link to the paper: https://github.com/deepseek-ai/DeepSeek-OCR/blob/main/DeepSeek_OCR_paper.pdf
… My "Understanding Multimodal LLMs" article, with more info on how images are fed to LLMs, how cross-attention works, etc.: https://magazine.sebastianraschka.com/p/understanding-multimodal-llms?utm_source=publication-search
…
DeepSeek OCR Paper and Multimodal LLM Article Resources