OCR just got better

15 Jul, 2024 linkedin

🤠 OCR just got better.

Surya: a Python package that detects the position of each line of text in an image / PDF.

TLDR:

It gives you the POSITION of each line of text in the document, NOT WHAT the text is
Works on complex layouts, bad captures (blurry/rotated)
Runs on GPU; the CPU version takes 2 sec / image
Doesn’t work on photos
Free to use commercially for companies <$5M revenue / year
Based on NVIDIA’s SegFormer
Named after the Hindu sun god, who has universal vision

👨‍💻 Example use case: newspaper to markdown. Based on the size and position of the bounding boxes, you infer the structure of the newspaper, run each block independently through OCR, and convert the result to markdown. (Or train an LLM on The New York Times and get sued, cc OpenAI.)
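
Here is a rough sketch of the "infer the structure" step in plain Python. It does not call Surya: `LineBox`, `reading_order`, `to_markdown` and the `ocr` callback are all hypothetical names made up for illustration, and the input is assumed to be line boxes as `(x0, y0, x1, y1)` pixel coordinates. The heuristic is an assumption too: lines much taller than the median are treated as headlines, and the rest are grouped into columns by their left edge.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class LineBox:
    """One detected text line, in pixel coordinates (hypothetical container)."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def height(self) -> float:
        return self.y1 - self.y0


def split_headlines(boxes, ratio=1.5):
    """Treat lines much taller than the median line height as headlines."""
    typical = median(b.height for b in boxes)
    headlines = [b for b in boxes if b.height > ratio * typical]
    body = [b for b in boxes if b.height <= ratio * typical]
    return headlines, body


def group_into_columns(boxes, x_tolerance=15.0):
    """Group lines whose left edges are within x_tolerance of each other."""
    columns = []
    for box in sorted(boxes, key=lambda b: b.x0):
        if columns and abs(box.x0 - columns[-1][0].x0) <= x_tolerance:
            columns[-1].append(box)
        else:
            columns.append([box])
    # Within each column, read top to bottom.
    return [sorted(col, key=lambda b: b.y0) for col in columns]


def reading_order(boxes):
    """Return (kind, box) pairs: headlines first, then columns left to right."""
    if not boxes:
        return []
    headlines, body = split_headlines(boxes)
    ordered = [("headline", b) for b in sorted(headlines, key=lambda b: b.y0)]
    for column in group_into_columns(body):
        ordered.extend(("body", b) for b in column)
    return ordered


def to_markdown(boxes, ocr):
    """Run a caller-supplied OCR function (LineBox -> str) on each line."""
    lines = []
    for kind, box in reading_order(boxes):
        text = ocr(box)
        lines.append(f"# {text}" if kind == "headline" else text)
    return "\n".join(lines)


# Toy page: one wide headline above two columns of two lines each.
page = [
    LineBox(420, 122, 750, 140),
    LineBox(50, 100, 380, 118),
    LineBox(50, 20, 750, 80),
    LineBox(420, 100, 750, 118),
    LineBox(50, 122, 380, 140),
]
# Stand-in for a real OCR call: just echoes the box position.
fake_ocr = lambda b: f"text@{int(b.x0)},{int(b.y0)}"
markdown = to_markdown(page, fake_ocr)
```

A real newspaper needs more than this (multiple articles per page, images, captions), but the idea is the same: the geometry gives you the structure, and the OCR step only fills in the text.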

Thanks Vik Paruchuri !!!

Link in the comments

theo martin
