[sword-devel] a new source for modules?

Matthew Talbert ransom1982 at gmail.com
Sat Aug 29 14:10:54 MST 2009


On Sat, Aug 29, 2009 at 5:07 PM, Peter von Kaehne<refdoc at gmx.net> wrote:
> Just started to look around Google Books and saw the huge collection of
> public domain books scanned, OCRd and transformed into epub books.
>
> epub is essentially a zip file with html, images and a table of contents
> inside. I am sure a fairly straightforward standard conversion could be
> created.
>
> I assume here that the legalities of this are ok as the books are public
> domain and not further edited. Not a lawyer though.
>
> Pete
>

In my experience with Google Books, the OCR is...less than perfect.
I'd expect many books would need a great deal of hand-editing to be
readable. For searching, it's fine because the text doesn't need to
be perfect for that. In many cases the Google reader overlays the OCR
text with an image of the actual printed page, and I think some of
the downloads (at least the PDFs) are done the same way.
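
For anyone who wants to see how rough the text is before committing to
a conversion, here is a minimal sketch in Python that treats an epub
as the plain zip file described above and pulls the text out of each
HTML file. The function name epub_text_by_file and the class
_TextCollector are made up for illustration. It assumes the content
files are UTF-8 and end in .html, .xhtml or .htm, and it walks the
archive in zip order, not the reading order given by the book's table
of contents.

```python
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Keep only non-empty text runs; markup itself is dropped.
        if data.strip():
            self.parts.append(data.strip())


def epub_text_by_file(epub):
    """Return {member name: plain text} for each (X)HTML file in the archive.

    `epub` is a path or a file-like object. The epub is opened as an
    ordinary zip; images and other members are skipped.
    """
    texts = {}
    with zipfile.ZipFile(epub) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".html", ".xhtml", ".htm")):
                parser = _TextCollector()
                # Assumption: content is UTF-8; bad bytes are replaced.
                parser.feed(archive.read(name).decode("utf-8", errors="replace"))
                parser.close()
                texts[name] = " ".join(parser.parts)
    return texts
```

Calling epub_text_by_file("some-book.epub") (a hypothetical file name)
and reading a few of the returned strings should show quickly how much
hand-editing a given book would need. Turning that text into a module
is a separate step that this sketch does not attempt.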

Matthew


