[sword-devel] a new source for modules?

Dan Blake danblake at tcdr.com
Sun Aug 30 07:15:57 MST 2009


Jonathan Morgan wrote:
> On Sun, Aug 30, 2009 at 7:41 AM, Chris Little<chrislit at crosswire.org> wrote:
>   
>> Peter von Kaehne wrote:
>>     
>>> Peter von Kaehne wrote:
>>>       
>>>> Just started to look around Google Books and saw the huge collection of
>>>> public domain books scanned, OCRd and transformed into epub books.
>>>>
>>>>         
>>> E.g. here Wesley's complete works:
>>>
>>>
>>> http://books.google.co.uk/books?id=2tdhAAAAIAAJ&printsec=frontcover&dq=subject:%22+theology+%22&lr=&as_brr=1&ei=opmZSu6qDqCGygTxpIjODg&rview=1#v=onepage&q=&f=false
>>>
>>> Scanned, OCRed and as epub. A rudimentary genbook should be creatable
>>> within a couple of hours and once references etc are inserted it could
>>> be a valuable resource beyond the ability of an epub reader.
>>>
>>> Peter
>>>       
>> No doubt this is a step in the right direction, but I have the same
>> misgivings as Matthew regarding OCRd and unproofed content.
>>
>> I popped the book you cite into Adobe Digital Editions to check the quality,
>> and found most of the OCR problems we would expect to see:
>> weird layout, non-Latin text appears as gibberish, and one (text) page I
>> spotted was just presented inline as a scanned image.
>>
>> So, it's a good step, but the quality is pretty bad.
>>     
>
> I agree too.  I am involved with a website that distributes a lot of
> scanned and OCR'd works, and when I read some of them I think "How
> could you seriously present that document to the world?"  For what
> it's worth, Logos says that it is faster and better to type the content
> in yourself than to OCR and then proofread and correct, and Logos
> produces a lot of content.  I suspect that is certainly true for
> reasonably complex scripts and layouts, and quite possibly true even for
> reasonably simple content if you have good typists.
>
> Jon
>
>   
Project Gutenberg might be a better source for public domain texts,
which it offers in many different formats.
http://www.gutenberg.org/

Daniel Blake



