Textifying Your PDFs

There are several reasons why you might want to textify your PDFs. For example, if you textify everything (including your emails), you can have all of your data available at a tiny fraction of the original size. The process of textifying PDFs is relatively quick and painless on a Mac.

Automator

Automator is supposed to be your personal assistant. By giving “Otto,” the Automator robot, instructions called “workflows,” you can automate a lot of tasks that would be quite unpleasant to do manually. Thanks to Otto, textifying thousands of PDFs is something you can finish before breakfast.

Apple has a pretty good explanation of the Automator app on its website. There are also a lot of resources available on the web if you want to learn more.

The Recipe

Here is a screenshot of the Automator recipe you will need to textify your PDFs. It has just two actions: “Get Specified Finder Items” and “Extract PDF Text.” You “Add” the PDFs to the first action and press the “Run” button. That’s it. Once you are finished, you can put the .txt files into Evernote (select them all and drag them into a notebook), nvALT, or VoodooPad (import them).

[Screenshot: Automator workflow with the “Get Specified Finder Items” and “Extract PDF Text” actions]

One Problem with Lots of PDFs

If you have hundreds or thousands of PDFs, you might need to do this in batches. In my case, I can usually only process about 450 at a time. Automator seems to use your available disk space while the workflow runs, and unless you have a huge amount of local storage, the computer will eventually crash. It is a minor annoyance rather than a big deal. Restarting the computer (after a batch is complete or after a crash) will clear the memory. If you know of a better solution, please drop me a line.
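
If you would rather not count files by hand, the batching can be planned with a few lines of Python. This is only a minimal sketch: the function names `batches` and `pdf_batches` are made up for illustration, the default of 450 is simply the number that works on my machine, and the script only groups the file paths. It does not run Automator for you.

```python
from pathlib import Path


def batches(items, size=450):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def pdf_batches(folder, size=450):
    """Return the PDFs directly inside `folder`, grouped into batches (hypothetical helper)."""
    pdfs = sorted(Path(folder).glob("*.pdf"))
    return list(batches(pdfs, size))
```

Each batch is a list of paths that you could move into its own folder and then “Add” to the workflow one folder at a time.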

One Problem with Importing into Evernote

Unfortunately, I have not figured out how to get Automator to create the text files in an Evernote-compatible format. Automator creates UTF-16 files, which can be dragged into a notebook, but each one ends up as an attachment to a note, not as a note with the content of the PDF in the note body.

One solution for this is to import the text files into an app (like VoodooPad) that will automatically export them as UTF-8 (Evernote-compatible) text files. It is a hassle, but it only takes a few minutes. Of course, if you are using VoodooPad already, then you are all set! If you know of a better solution, please drop me a line.
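
Another possible route is to re-encode the files with a short script. The following is a minimal sketch in Python, assuming the files Automator produced really are UTF-16 and that plain UTF-8 is all Evernote needs. The function names and folder arguments are hypothetical, and I have not verified how Evernote treats the result.

```python
from pathlib import Path


def convert_to_utf8(src, dest):
    """Re-encode one UTF-16 text file as UTF-8."""
    # Python's "utf-16" codec uses the byte-order mark, if present,
    # to decide between little-endian and big-endian input.
    text = Path(src).read_text(encoding="utf-16")
    Path(dest).write_text(text, encoding="utf-8")


def convert_folder(src_dir, dest_dir):
    """Convert every .txt file in src_dir, writing UTF-8 copies into dest_dir."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(Path(src_dir).glob("*.txt")):
        dest = dest_dir / src.name
        convert_to_utf8(src, dest)
        converted.append(dest)
    return converted
```

The originals are left untouched because the copies go into a separate folder, so nothing is lost if a file turns out not to be UTF-16 and the script stops with a decoding error.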