New blog post: converting 30k @arxiv papers to Markdown using SOTA OCR models to enable chat with paper functionality
— Niels Rogge (@NielsRogge) April 7, 2026
Includes:
> leveraging an open OCR model (Chandra 2 by @datalabto)
> running on GPU infra – @huggingface Jobs
> using Codex with a SKILL.md



