Tuesday 9 November 2010

Quickly search several PDFs at once

This script takes PDFs as arguments, so it can be used with nautilus-actions or Thunar's custom actions. It uses grep, and therefore grep regex, but I've added easy conditional searching at the expense of some limitations regarding normal grep conditionals. You're presented with this when you run it, which just about covers things:

The search ignores case. Conditionals cannot be joined. Let me know if this is a problem.
Wildcards:
A single . represents any single character, and an asterisk represents zero or more occurrences of the preceding character, e.g.:
'c...h' will find catch, clash, cloth, coach etc.
'c.*h' will find caliph, cash, catch, cheesecloth etc.
Conditionals:
OR : '(this|that|the)' will show lines containing 'this', 'that', or 'the'.
AND : '(this&&that&&the)' will search for lines that contain 'this', 'that' and 'the'.
NOT : '(this!that)' will search for lines that contain 'this' but don't contain 'that'.
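
To make the rules above concrete, here is a rough Python sketch of how a single line could be tested against one of these patterns. It is not the script itself (which uses grep): the function names line_matches and filter_lines are made up for illustration, and Python's re module stands in for grep regex, so edge cases may behave differently. As in the help text, it assumes only one kind of conditional per pattern.

```python
import re


def line_matches(pattern: str, line: str) -> bool:
    """Return True if `line` satisfies `pattern` (hypothetical helper).

    Supports the three conditionals from the help text, one kind per
    pattern, and always ignores case.
    """
    flags = re.IGNORECASE

    # Drop one pair of surrounding parentheses, as in '(this&&that)'.
    inner = pattern
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]

    if "&&" in inner:
        # AND: every term must appear somewhere in the line.
        return all(re.search(term, line, flags) for term in inner.split("&&"))

    if "!" in inner:
        # NOT: the first term must appear, the second must not.
        wanted, unwanted = inner.split("!", 1)
        return bool(re.search(wanted, line, flags)) and not re.search(
            unwanted, line, flags
        )

    # OR ('a|b') and the wildcards ('.', '*') are ordinary regex syntax.
    return bool(re.search(pattern, line, flags))


def filter_lines(pattern: str, lines: list[str]) -> list[str]:
    """Keep only the lines that satisfy `pattern`."""
    return [line for line in lines if line_matches(pattern, line)]
```

For example, filter_lines("(cat!dog)", ["Cat nap", "cat and dog", "bird"]) keeps only "Cat nap", because the second line contains 'dog' and the third doesn't contain 'cat'.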

The script can be downloaded here. You'll need pdftotext for this to work.
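
If you're curious about the general idea, it boils down to converting each PDF to text with pdftotext and searching the result line by line. Below is a minimal Python sketch of that pipeline, not the actual script: the function names are invented, the pattern is passed in as an argument rather than prompted for, and only a plain case-insensitive regex is handled (no && or ! conditionals). The one external call it relies on is "pdftotext FILE -", which writes the extracted text to standard output.

```python
import re
import subprocess


def pdf_to_text(path: str) -> str:
    """Extract the text of one PDF using the pdftotext command-line tool."""
    # A trailing "-" tells pdftotext to write to stdout instead of a file.
    result = subprocess.run(
        ["pdftotext", path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def grep_text(label: str, text: str, pattern: str) -> list[tuple[str, int, str]]:
    """Return (label, line number, line) for every line matching `pattern`."""
    regex = re.compile(pattern, re.IGNORECASE)  # the search ignores case
    return [
        (label, number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]


def search_pdfs(pattern: str, pdf_paths: list[str]) -> None:
    """Print every matching line from each PDF, prefixed with its file name."""
    for path in pdf_paths:
        for label, number, line in grep_text(path, pdf_to_text(path), pattern):
            print(f"{label}:{number}:{line.strip()}")
```

Calling search_pdfs("c.*h", ["one.pdf", "two.pdf"]) would then print the matching lines from both files (the file names here are placeholders). Line numbers refer to the extracted text, not to anything visible in the PDF.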
