The following files are here:

PDF Wrangling: A look at various tools for dealing with PDF documents. During the presentation, I will show some example files: a file successfully parsed in Comet Docs; a purposely redundant example from Able2Extract; an experimental text file scanned from my phone, along with what it looks like when parsed; a PDF file of New Jersey crime statistics; and an example of the problems you run into with one program that magically get fixed when using another.

I also have the PowerPoint from a lightning talk I'm giving about standardizing data with z-scores, along with a sample spreadsheet.
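For readers who want to try the idea outside a spreadsheet, here is a minimal Python sketch of z-score standardization. It is not taken from the talk or the sample spreadsheet: the function name `z_scores` and the numbers are made up for illustration, and it assumes the population standard deviation (a sample standard deviation would give slightly different values).

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return each value's distance from the mean, in standard deviations.

    Illustrative helper (hypothetical name). Uses the population
    standard deviation, which is an assumption of this sketch.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every value is identical, so a z-score is undefined.
        raise ValueError("standard deviation is zero; cannot standardize")
    return [(v - mu) / sigma for v in values]


# Invented example data, not from the sample spreadsheet.
rates = [2, 4, 4, 4, 5, 5, 7, 9]
standardized = z_scores(rates)
print(standardized)
```

Because each result is expressed in standard deviations rather than raw units, columns measured on different scales can be compared side by side.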

I will also help Crina Boros teach an SQL class; the exercise is here.

During the conference, I will also be helping Nils Mulvad teach classes on using Document Cloud and Google Refine. You can find his material here.

Please send your comments/complaints/updates to me at rgebeloff@nytimes.com