LLM-aided OCR
github.com/Dicklesworthstone/llm_aided_ocr →Tesseract gets it wrong. Let an LLM fix it
WHAT IT SOLVES
Tesseract OCR is free and everywhere, but it mangles characters constantly. Instead of retraining or tuning, this repo hands the messy output to an LLM and says: proofread this
WHY IT'S INTERESTING
Don't rebuild the wheel — patch it
Tesseract writes the rough draft, the LLM does copyediting. Scanned PDF → Tesseract raw text → LLM corrects per-chunk and spits out clean Markdown. The clever bit is smart chunking: you can't stuff a whole page into a context window, so it breaks text semantically, fixes each piece, then stitches it back
Works with local models too
Not locked to a single API — supports both local LLMs and cloud endpoints. For anyone scanning contracts or medical docs, that's not a nice-to-have, it's the whole point. Docker-ready, 63 commits, proper changelog — this isn't a weekend drop-and-forget
TECH GUESS
Python, Tesseract OCR under the hood, OpenAI-compatible API interface, Docker for deployment
Discussion (0)
- No comments yet — be the first.
Related
#028▶ 323Autocomplete for CAD is here
#027▶ 400A Chrome extension that helps founders write cold emails and LinkedIn DMs
#026▶ 499A trading app that went from idea to App Store by vibing
