@dannguyen
Created June 23, 2015 18:42
Trying pytesseract on a mixed-media screengrab like the New York Times homepage

Python-tesseract (pytesseract) is a Python wrapper around Google's Tesseract, an optical character recognition (OCR) program.

Here's a basic snippet:

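from PIL import Image
import pytesseract

# Path to the screengrab to run OCR on
fname = './nytimes.com.201506231100.jpg'
img = Image.open(fname)
# image_to_string() returns the recognized text as a single string
print(pytesseract.image_to_string(img))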

Here it is run against a screengrab of the NYTimes homepage:

[Image: screengrab of nytimes.com]

Not too shabby. It messes up on decorative text, such as the NYT's logo, but it catches SHOP THE NEW COLLECTION in the Cartier ad (though not the word CLOSE):

U.S. INTERNATIONAL

(;,,.,,-H. (7.l1)eNv21n flork Eimes  (;,,.,,.(,,.

!.I.§.

   

Tuesday, June 23, 2015 El Today's Paper I4 Video 87°F Nasdaq 0.00% 1

World US. Politics NewYork Business Opinion Technology Science Health Sports Arts Style Food Travel Magazine Real Estate ALL

Q/we/«
L

(70//er/1'0/'1 P0/'/Lu bi/1-if’//’ ll?/gz/6

> SHOP THE NEW COLLECTION

 

The Opinion Pages

Senate Vote

Puts Trade Bill,
Obama Priority,

on Path to Pass

A crucial vote cleared the way
for legislation to be on the
president’s desk this week
giving him the power to
complete the Trans-Pacific

Partnership.
' 63 Comments

Renewed Campaigns

Against Confederate
Flag Across South

   

OP-DOCS

Op-Docs: ‘Who Sounds
Gay?’

This short
documentary
explores the reasons that some

men sound stereotypically gay,
whether they are or not.

 

~ Brooks: Fracking and the
Franciscans

~ Editorial: Take Down the
Confederate Flag

OP-ED CONTRIBUTOR

Why Women Apologize
and Should Stop

Women say ‘sorry? \, vi"!
too much, whether l,‘ A‘
they mean it or not. Why?

~ Taking Note: Skyrocketing
Extinctions Put Humans at
Risk

~ Op-Ed: A Republican Case for
Obama's Cuba Policy

~ Op-Ed: The Iran Deal’s Fatal
Flaw

A Woman on a $10 Bill? Readers Respond Eimcs

Disney Has No Comment on the Recent Reversal of

Layoffs

-;‘;‘i‘i'r.im’
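Rather than eyeballing the output, one way to check which strings made it through (such as SHOP THE NEW COLLECTION but not CLOSE) is a small helper like the one below. This is only a sketch: `normalize` and `found_phrases` are hypothetical names, not part of pytesseract, and `sample` is a short excerpt pasted from the output above, standing in for the string that `pytesseract.image_to_string(img)` would return.

```python
import re


def normalize(text):
    """Lowercase and collapse runs of whitespace so line breaks don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def found_phrases(ocr_text, phrases):
    """Map each expected phrase to True/False depending on whether it appears.

    Hypothetical helper: plain substring matching on normalized text.
    """
    haystack = normalize(ocr_text)
    return {phrase: normalize(phrase) in haystack for phrase in phrases}


# Excerpt copied from the OCR output above; in practice this would be the
# string returned by pytesseract.image_to_string(img).
sample = """> SHOP THE NEW COLLECTION

The Opinion Pages

Senate Vote

Puts Trade Bill,
Obama Priority,
"""

expected = ["SHOP THE NEW COLLECTION", "Puts Trade Bill, Obama Priority", "CLOSE"]
for phrase, hit in found_phrases(sample, expected).items():
    print(("found  " if hit else "missing"), phrase)
```

Because this is substring matching, a short phrase like CLOSE would also match inside a longer word such as CLOSED, so it works best with multi-word phrases. Collapsing whitespace lets a headline that wraps across several lines still match.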