IWETHEY v. 0.3.0 | TODO

Welcome to IWETHEY!

New Why PDF?
The requirements of your situation are not clear.

PDF allows exact replication of a page on most any system or printer. That's a good thing. However, PDF also generally allows things like a "find" operation for some text and may show the document's structure. If you are just scanning, you will not have this. What you are suggesting sounds like the only thing a PDF document will have is the image of a single scanned page. "Find" would find nothing. So, do you only want the image or are you interested in the text data as well? If you need access to the text (e.g. for Find), character recognition is a required step. The text would then need to be formatted into a PDF file, but the result may not look like the original.

If it's only an image, other file types, say TIFF, would work as well.

What are the requirements?

[link|http://www.transformmag.com/index.shtml|Here is a site] for ideas.

"Let others praise ancient times; I am glad I was born in these." -- Ovid (43 B.C.-A.D. 18)
New 'Cause that's how we're doing it.
I know, I know, but mine is not to ask why,
Mine is just to do or die.

I'm just making it work, and collecting my paycheck.
Gimli's Rules for Surviving in Middle Earth #43: When attempting to destroy an artifact, remember to use somebody else's axe.
New Actually, you still need to answer his questions
If the requirement is based on someone's assumption that PDFs can be searched, then you're going to need to OCR them. If they want exact reproduction and searchability, you're going to need to OCR them, then attach keywords to the image file you save as a PDF.

I've seen dedicated archiving systems that did something like this, except at the time (~1994) they didn't have high-speed automated OCR.
Microsoft offers them the one thing most business people will pay any price for - the ability to say "we had no choice - everyone's doing it that way." -- [link|http://z.iwethey.org/forums/render/content/show?contentid=38978|Andrew Grygus]
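For what it's worth, the decision the posts above keep circling can be written down as a tiny sketch. The function name and the returned strings are made up for illustration; only the three outcomes (image-only, OCR'd text, or image plus attached text) come from the thread.

```python
def choose_archive_pipeline(needs_text_search: bool,
                            needs_exact_appearance: bool) -> str:
    """Pick a scanning pipeline from the two requirements raised in the thread.

    Hypothetical helper: the labels are illustrative, not product names.
    """
    if needs_text_search and needs_exact_appearance:
        # Keep the page image for fidelity, attach OCR output for searching.
        return "scan -> OCR -> image PDF with recognized text attached"
    if needs_text_search:
        # Text is reformatted into a PDF; layout may differ from the original.
        return "scan -> OCR -> text PDF (may not look like the original)"
    # No search needed: a bare image is enough, so PDF and TIFF both work.
    return "scan -> image-only PDF or TIFF"


if __name__ == "__main__":
    print(choose_archive_pipeline(needs_text_search=False,
                                  needs_exact_appearance=True))
```

If nobody will ever run "Find" on the archive, the last branch is the cheap one, which is why the requirements question matters.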
New Acrobat Capture
Scans, OCRs, and indexes text. Very cool.
New We're not searching the documents.
There's a requirement for the documents to be held for 7 years post creation. We are keeping them (loosely) organized - if somebody actually does need a document, then we'll print it out and forward it to them.

We're using the system to clear out a fhuge storeroom of paper. All of our new documents are going into PDF format - so we're just bumping up the pipeline, and not OCRing the old documents for speed purposes.
Gimli's Rules for Surviving in Middle Earth #43: When attempting to destroy an artifact, remember to use somebody else's axe.
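As a rough illustration of the 7-year hold mentioned above, here is a minimal sketch of working out when a scanned document could be dropped. The function names are hypothetical, and the Feb 29 handling (rolling forward to Mar 1) is an assumption, not something stated in the thread.

```python
from datetime import date

RETENTION_YEARS = 7  # from the thread: held for 7 years post creation


def retention_expiry(created: date, years: int = RETENTION_YEARS) -> date:
    """Return the first date on which the document is past its hold period."""
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # created is Feb 29 and the target year is not a leap year.
        # Assumption: roll forward to Mar 1 so the hold is never cut short.
        return date(created.year + years, 3, 1)


def may_discard(created: date, today: date) -> bool:
    """True once the hold period has fully elapsed."""
    return today >= retention_expiry(created)


if __name__ == "__main__":
    print(retention_expiry(date(2003, 5, 14)))
```

Storing the creation date alongside each PDF (even just in the file name) would be enough to drive a check like this.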
New Wakarimashita!

"No man's life, liberty, or property are safe while the legislature is in session." -- Mark Twain
New Acrobat Batch Mode
New And of course, this is when things get interesting...
Today we went over to Ikon office solutions to press some salesflesh. My boss and I went together, felt WAY out of place. Turns out he's looking to address the issues we've got at a much higher level than he'd indicated to me - he's looking at some serious ($20k+) hardware to handle scanning and document retention.

Don't think we're going to get it from Ikon, though - there's a guy there who was more than willing to tell us all about what our local competitor is up to, and we have a feeling he'd do the same for them. :P
Gimli's Rules for Surviving in Middle Earth #43: When attempting to destroy an artifact, remember to use somebody else's axe.
     High-speed document scanners? - (inthane-chan) - (9)
         Here ya go. - (pwhysall)
         Why PDF? - (a6l6e6x) - (7)
             'Cause that's how we're doing it. - (inthane-chan) - (6)
                 Actually, you still need to answer his questions - (drewk) - (5)
                     Acrobat Capture - (deSitter)
                     We're not searching the documents. - (inthane-chan) - (3)
                         Wakarimashita! -NT - (a6l6e6x)
                         Acrobat Batch Mode -NT - (deSitter)
                         And of course, this is when things get interesting... - (inthane-chan)
