[sword-devel] Spelling (was Versification/Encoding issues)
David Haslam
d.haslam at ukonline.co.uk
Fri Jan 9 02:31:28 MST 2009
Using Tesseract to help the Irish New Testament project is suggested.
See
http://www.crosswire.org/wiki/Non-CrossWire_Text-Development_Projects#Individual_Works
We should try to establish personal contact with Pastor Craig Ledbetter.
http://www.biblebc.com/Projects/irish_new_testament_project.htm
I think CrossWire could provide some useful technical help.
-- David
Peter von Kaehne wrote:
>
> Mike Hart wrote:
>> That's interesting, because ancle is one of the words I corrected in
>> JSFB -- the OCR had ancle, but the PDF itself, my paper KJV copy, and
>> my JPS complete Tanach (individual volumes) had ankle... I can't say
>> what verse it was, at the time I was hunting for e's that had been
>> OCR'd into c's (search for 'regular expression'
>> [bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx] in kwrite)
>
> You should have a look at Troy's work with tesseract. Rather than search
> and replace a text badly ocred he seems to have figured out how to
> "educate" tesseract with one or two sample pages until it does the right
> thing. That might be way easier and with a better outcome in the long
> term for you too.
>
> Peter
>
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
>
>
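The consonant-c-consonant search Mike Hart describes in the quoted message can be sketched in Python roughly as follows. This is only an illustration, not his actual workflow (he ran the expression in kwrite): the function name find_suspect_words, the word-splitting rule, and the sample text are assumptions made for the example. It flags candidate words for a human to check; it does not decide the correct spelling.

```python
import re

# Pattern quoted in the message: a 'c' sitting between two consonants,
# which is where an 'e' misread as 'c' by OCR tends to show up.
SUSPECT = re.compile(r"[bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx]")

# Assumption: a "word" is simply a run of ASCII letters.
WORD = re.compile(r"[A-Za-z]+")


def find_suspect_words(text):
    """Return (line_number, word) pairs for words matching the pattern."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in WORD.findall(line):
            if SUSPECT.search(word.lower()):
                hits.append((line_no, word))
    return hits


# Hypothetical sample text, made up for illustration.
sample = "His feet and ancle bones received strength.\nThe church gave thcm bread."

for line_no, word in find_suspect_words(sample):
    print(line_no, word)
```

Legitimate words such as "uncle" or "scripture" also match the pattern, so every hit still needs checking against the printed source.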