Thursday, February 28, 2013

How to hack ScanSnap Organizer to use any PDF document

To avoid clutter, I scan everything: bank statements, business cards, brochures, receipts, anything. Then I OCR them (convert the image into text), store them in the cloud, share them across all my computing devices, and, on my laptop, index them heavily so I can find anything I have ever touched, or so the idea goes. (Hmm... what was the phone number of that car rental company whose brochure I picked up on my last trip to Bali? Just run a search. Done.)

Thus, one of my prides and joys (and nearly my single most expensive device purchase outside of a laptop or a smartphone) is my Fujitsu ScanSnap S1500. This workhorse scanner can scan double-sided at up to 20 pages per minute from a 50-sheet feeder. For someone who scans just about everything, this is a must. And at $500, the ScanSnap is a necessary evil. (Don't try it unless you can afford to fall in love with it; it's dangerously convenient.)

Sadly, the ABBYY FineReader that comes with it (which OCRs documents better than Adobe PDF Pro) only works on documents I scanned with the ScanSnap. The same goes for the ScanSnap Organizer and Viewer (which I use to rearrange PDF documents since my Adobe PDF Pro stopped working).

And since I convince a lot of people to send me documents already in PDF, from various scanners, cameras, etc., this is annoying. So here is the hack that I found on the web (see http://www.jsilence.org/blog/2011/01/25/edit-pdf-metadata/ ) that lets me trick the ScanSnap software into accepting MOST PDF documents.

The trick is to add/edit the "Creator" field in the PDF document's metadata to say "ScanSnap Manager #S1500M".

Step 1. Get and install a free program called PDFtk.

The installer should put the software in

C:\Program Files (x86)\PDF Labs\PDFtk Server\bin

Step 2. Create a directory in Windows. Mine is called FixPDF.

Step 3. In the above directory, create a text file, scansnap_meta.txt, containing:

InfoBegin
InfoKey: Creator
InfoValue: ScanSnap Manager #S1500M


This file contains the metadata that PDFtk will inject into each PDF file to be fixed.
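If you would rather generate this file than type it by hand, here is a minimal sketch in Python. Python is my assumption here (the hack itself only needs Notepad), and the function name render_info is made up for illustration. The InfoBegin / InfoKey / InfoValue layout is the one shown above.

```python
from pathlib import Path


def render_info(fields):
    """Render a dict of PDF Info fields in the InfoBegin/InfoKey/InfoValue layout."""
    lines = []
    for key, value in fields.items():
        lines += ["InfoBegin", f"InfoKey: {key}", f"InfoValue: {value}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Writes scansnap_meta.txt into the current directory (run it from FixPDF).
    text = render_info({"Creator": "ScanSnap Manager #S1500M"})
    Path("scansnap_meta.txt").write_text(text, encoding="ascii")
```

Only the Creator field is needed for this hack; the dict is there so the same function could emit more fields if you ever want them.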
 
Step 4. Then also create a script file, FixPDF.bat, containing:

rem Stamp every PDF in this directory and write the fixed copy one level up.
for %%a in (*.pdf) do (
"C:\Program Files (x86)\PDF Labs\PDFtk Server\bin\pdftk.exe" "%%a" update_info scansnap_meta.txt output "..\%%a"
)

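This script runs PDFtk on every *.pdf file in the directory and writes the fixed copy, under the same file name, to the parent directory (that is the "..\" in the output path). The originals in FixPDF are left untouched, because PDFtk does not allow the output file to be the same as the input file.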
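For anyone not on Windows batch, the same loop can be sketched in Python. This is an assumption-laden sketch, not part of the original hack: the function name build_command is made up, and the PDFtk path is simply the install location mentioned in Step 1. Like the batch file, it writes each fixed copy to the parent directory under the same file name.

```python
from pathlib import Path
import subprocess

# Install location from Step 1; adjust if PDFtk lives elsewhere on your machine.
PDFTK = r"C:\Program Files (x86)\PDF Labs\PDFtk Server\bin\pdftk.exe"


def build_command(pdf, meta="scansnap_meta.txt", out_dir=".."):
    """Return the pdftk argument list for one PDF, mirroring FixPDF.bat."""
    pdf = Path(pdf)
    output = Path(out_dir) / pdf.name  # same file name, one directory up
    return [PDFTK, str(pdf), "update_info", meta, "output", str(output)]


if __name__ == "__main__":
    # Process every PDF in the current directory (run it from FixPDF).
    for pdf in sorted(Path(".").glob("*.pdf")):
        subprocess.run(build_command(pdf), check=True)
```

Passing the arguments as a list means file names with spaces need no extra quoting.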

Step 5. Put all the PDF files to fix in the directory.

Step 6. Run the FixPDF.bat script.

Done.
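To check that a file really carries the new Creator value, PDFtk's dump_data operation prints a PDF's metadata as text. Here is a rough Python sketch that reads it back. Python and the helper names parse_info and creator_of are my own assumptions, and the PDFtk path is the install location from Step 1.

```python
import subprocess

# Install location from Step 1; adjust if PDFtk lives elsewhere on your machine.
PDFTK = r"C:\Program Files (x86)\PDF Labs\PDFtk Server\bin\pdftk.exe"


def parse_info(dump_text):
    """Collect InfoKey/InfoValue pairs from `pdftk dump_data` output."""
    info, key = {}, None
    for line in dump_text.splitlines():
        if line.startswith("InfoKey: "):
            key = line[len("InfoKey: "):]
        elif line.startswith("InfoValue: ") and key is not None:
            info[key] = line[len("InfoValue: "):]
            key = None  # each value belongs to the key just before it
    return info


def creator_of(pdf_path):
    """Run pdftk on one file and return its Creator field, or None if absent."""
    result = subprocess.run(
        [PDFTK, pdf_path, "dump_data"],
        capture_output=True, text=True, check=True,
    )
    return parse_info(result.stdout).get("Creator")
```

If creator_of on a fixed file does not come back as "ScanSnap Manager #S1500M", the update did not take, and that PDF is probably one of the few this trick cannot handle.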

Personal Computing - 2013 edition - Jakarta in the Cloud

Back to blogging after all this time...

Well, I'm here to update the world on how I configure my computers to face 2013 and onward.

Same setup, but now more cloud-integrated. Jakarta is now much better connected to the Internet, and it is finally possible to shift massive quantities of my personal data into the cloud to survive crashes, obsolescence, and the plain and simple "I forgot my computer at home" issue.

First, my personal computing revolves around a Dell Studio 14 (old) laptop, a BlackBerry Onyx, and a Samsung Galaxy Tab (plus various MP3 players, a TV and DVD recorder with a DivX player, my wife's iPad, and my daughters' Kindle, Android devices, netbook, laptop, etc.).

A 3Mbps cable modem (12Mbps this month only, thanks to a promo by Firstmedia) connects my house to the world, sadly with only 100Kbps of upload capacity. This connection is shared via 300m+ of UTP cable and no fewer than 4 wifi access points across my house and my in-laws'. I must have the most networking hardware in my entire apartment complex.

In my offices I have 5Mbps leased lines, but they are shared among heavy gaming and torrent users. Needless to say, I personally get better bandwidth at home.

Now, as 2013 rolls around, I want to upgrade my systems and those of my family to allow us to share more data in the cloud (like the 400MB of just about every document my family comes across: bills, bank statements, passports, house deeds, brochures for car rentals, etc.), watch movies (Samsung AllShare?), and back up the kids' ever-crashing computers.

Anyway, in other posts I will be discussing various aspects of my computing platform.

Anyone interested? Care to share yours?