[sword-devel] a new source for modules?
Chris Little
chrislit at crosswire.org
Sat Aug 29 14:41:36 MST 2009
Peter von Kaehne wrote:
> Peter von Kaehne wrote:
>> Just started to look around Google Books and saw the huge collection of
>> public domain books scanned, OCRd and transformed into epub books.
>>
>
> E.g. here Wesley's complete works:
>
> http://books.google.co.uk/books?id=2tdhAAAAIAAJ&printsec=frontcover&dq=subject:%22+theology+%22&lr=&as_brr=1&ei=opmZSu6qDqCGygTxpIjODg&rview=1#v=onepage&q=&f=false
>
> Scanned, OCRed and as epub. A rudimentary genbook should be creatable
> within a couple of hours and once references etc are inserted it could
> be a valuable resource beyond the ability of an epub reader.
>
> Peter

No doubt this is a step in the right direction, but I have the same
misgivings as Matthew regarding OCR'd and unproofread content.

I popped the book you cite into Adobe Digital Editions to check the
quality, and found most of the OCR problems we would expect to see:
the layout is weird, non-Latin text appears as gibberish, and one page
of text I spotted was just presented inline as a scanned image.

So, it's a good step, but the quality is pretty bad.

--Chris