
Monday, 27 January 2014

Comments

I think you could very likely scan them, and then optical character recognition software would turn them into digital words.

Mike,

There must be a lot of good OCR software out there that can do this for you in a jiffy.

If you can email me the photos of the pages, and if they are clear enough, and depending how many there are, I could put them through the scanner with an OCR (optical character recognition) program and return them to you as Word files. I wouldn't charge, but might not have time if there is a lot of stuff.

Another option might be for you to OCR them, although that requires a lot of proofreading/editing afterwards. Besides OCR programs, there are online services such as the free www.newocr.com.

Well I'm on a roll I guess. Not sure whether you were kidding about getting someone to help with text input??

A Google search for "Scanned document to text" yields several possible options for taking an image of a page and doing OCR to create a file of text. I don't use OCR a lot but I know it works. And there are lots of cheap options for doing the deed - aka free.

Lots faster than tipi-typee.

Again - hope you are on the mend and feeling better.

Actually, probably better you scan the pages, rather than use a camera - just a thought - send as jpgs or whatever.

OCR software has gotten pretty good. I had occasion to use the OCR function built into Acrobat recently and was pleasantly surprised. Some reviews here: http://ocr-software-review.toptenreviews.com/

There is some really excellent scanner software that will convert your printed pages into digital documents. Not sure what is available for Mac.

Damn, too late. I work for sushi... :-)

A few thousand words would only be an hour or so... I'd be happy to volunteer for the site.

Hi, have you thought about using some optical character recognition software? It's pretty good these days and only requires editing to fix the errors.

Consider running the older documents through an OCR (optical character recognition) program and then cleaning up the results. That will be much faster and more efficient.

Why give it to one person? Divide it up and give small sections to a bunch of us.

I'll take a piece.

You are kidding?

OCR is the way to go. I've tossed all sorts of printed stuff into my scanner, scanned to OCR.

With really poor and/or dirty originals, it may get only 90% right. With clean originals, it only has trouble with words not in its dictionary, which may be added.

Moose
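For anyone cleaning up OCR output by hand, the point about words missing from the dictionary suggests one cheap first pass: list every word in the OCR text that is not in a known word list and hand that list to the proofreader. What follows is only a rough sketch in Python, not what any particular OCR program does internally. The function name `flag_unknown_words`, the tiny word list, and the sample sentence are all made up for illustration.

```python
import re

def flag_unknown_words(text, known_words):
    """Return words in text (first occurrence only, in order) not found in known_words."""
    seen = set()
    flagged = []
    # Treat runs of letters and apostrophes as words; digits and punctuation are skipped.
    for word in re.findall(r"[A-Za-z']+", text):
        key = word.lower().strip("'")
        if key and key not in known_words and key not in seen:
            seen.add(key)
            flagged.append(word)
    return flagged

# Hypothetical word list and OCR output, for illustration only.
KNOWN = {"the", "portfolio", "is", "a", "set", "of", "prints"}
sample = "The portfolio is a set of prlnts"
print(flag_unknown_words(sample, KNOWN))
```

A check like this only catches misreadings that produce non-words. An error that turns one real word into another still needs a human reading against the original.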

"it's an extended article about how to pull together a portfolio"

I can't wait to read this!

Either way, I am enjoying your 'repeats'. As others have offered, scan them and I'll OCR them and return them as plain text.

I'm late to the party, but I was going to suggest OCR too: apparently, Google Docs will do it - http://www.labnol.org/internet/perform-ocr-with-google-docs/10059/

Best wishes,

Craig

If you care about the quality of the text, don't OCR it unless you can get it proofread. And by "get it proofread" I mean "get it read by someone who has access to the original, knows how to proofread and who is not you".

I have now read enough ebooks which have been OCR'd and have not been properly proofread to get to the point where I am seriously considering going back to paper[1]. Quite apart from the sadly (because not inherently) dismal typographic quality of ebooks, I find the rate of OCR errors in a lot of them to be depressingly high, especially as you get past the first chapter or so, where the "proofreader" got bored.

And I know (as, I'm sure, do you!) from first-hand experience that proofreading your own text is a hopeless task: you don't spot the erors, even when they are glaring. You need someone else to do it, and someone who won't just read the first hundred words and decide it's all fine.

[1] Actually, because I am interested in typography, I never really left paper of course, any more than I have stopped making prints on paper.
