[sword-devel] Re: [sword-support] An awesome Bible
Leon Brooks
sword-devel@crosswire.org
Sun, 4 May 2003 13:13:10 +0800
On Fri, 2 May 2003 10:24, Keith Ralston wrote:
> PDF has a plain text version imbedded. You can use the PDF API to
> extract text from the documents.
Not exactly true. PDF is just a fancified version of PostScript. You
might think that this makes it easy to extract the text, but in real
life many PDF generators do amazingly silly things like storing the
text as a long list of decimal numbers instead of as raw strings.
pdftotext may do the conversion you're looking for, but that depends
entirely on how silly the PDF creation software was.
In short: worth a try, but don't bet your life on it.
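For anyone who wants to give it that try, here is a rough Python
sketch. It assumes the pdftotext command-line tool is installed and on
the PATH, and that "-" as the output name sends the text to stdout.
The looks_like_text() check and its 0.8 threshold are my own
hypothetical heuristic for spotting garbage output, not anything
pdftotext provides.

```python
import shutil
import subprocess


def extract_text(pdf_path):
    """Run pdftotext on pdf_path; return the text, or None on failure."""
    # Assumption: pdftotext is installed and reachable via PATH.
    if shutil.which("pdftotext") is None:
        return None
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],  # "-" writes the text to stdout
        capture_output=True,
        encoding="utf-8",
        errors="replace",
    )
    if result.returncode != 0:
        return None
    return result.stdout


def looks_like_text(text, threshold=0.8):
    """Hypothetical sanity check: is the output mostly letters and spaces?

    The threshold is arbitrary; tune it against your own documents.
    """
    if not text:
        return False
    readable = sum(ch.isalpha() or ch.isspace() for ch in text)
    return readable / len(text) >= threshold


if __name__ == "__main__":
    import sys

    text = extract_text(sys.argv[1]) if len(sys.argv) > 1 else None
    if text is None:
        print("pdftotext is missing or failed")
    elif looks_like_text(text):
        print("output looks usable")
    else:
        print("output looks like garbage; the PDF generator was silly")
```

If the check fails, the extracted output is probably not worth
cleaning up by hand.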
Cheers; Leon