[sword-devel] Repository of uncompiled modules? Errors in Vietnamese bible!
Chris Little
sword-devel@crosswire.org
Sat, 3 May 2003 09:42:50 -0700 (MST)
On Sat, 3 May 2003, Nguyen Ly wrote:
> I managed to find a mod2osis utility on the bibletechnologies.net
> (OSIS) website, though the link on the main page is broken. Try this
> page instead: http://www.bibletechnologieswg.org/osis/tools/
You should use mod2vpl to turn the module into a plaintext file. We don't
have tools to turn an OSIS document into a Bible module yet, but we can
use the output of mod2vpl as input to vpl2mod. (You can find mod2vpl &
vpl2mod in ftp://ftp.crosswire.org/pub/sword/utils/win32/ .)
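
For checking the exported text before feeding it back to vpl2mod, here is a
rough Python sketch of reading a verse-per-line file. It assumes each line
starts with a verse reference such as "John 1:19" followed by the verse text;
that layout, and the names Verse and parse_vpl_line, are assumptions for
illustration and not part of the SWORD utilities.

```python
import re
from typing import NamedTuple, Optional


class Verse(NamedTuple):
    book: str
    chapter: int
    verse: int
    text: str


# Assumed layout: "<Book name> <chapter>:<verse> <verse text>".
# The book name may itself contain spaces or digits (e.g. "1 John").
_LINE_RE = re.compile(
    r"^(?P<book>.+?) (?P<chapter>\d+):(?P<verse>\d+)\s+(?P<text>.*)$"
)


def parse_vpl_line(line: str) -> Optional[Verse]:
    """Split one verse-per-line record; return None if it does not match."""
    match = _LINE_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return Verse(
        match["book"],
        int(match["chapter"]),
        int(match["verse"]),
        match["text"],
    )
```

Lines that do not fit the assumed layout come back as None, so they can be
listed and inspected by hand rather than silently dropped.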
> Is there currently a repository of submitted uncompiled modules
> accessible from somewhere?
Not currently, though I would like there to be one eventually, at least
for our developers' private use. As you figured out, the best way to get
the text is by exporting from the module, but you can also find the
Vietnamese text at http://www.unboundbible.com/zips/index.cfm . Their
UTF-8 text is what ours is based on, so you might have an easier time
working with their non-UTF-8 text if they made encoding errors.
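
One way to look for encoding damage in either text is to scan for characters
that should never appear in a clean Unicode Bible text. The Python sketch
below flags the replacement character, stray control characters, and
private-use or unassigned code points. Which characters actually went wrong
in the VNI conversion is not known here, so this list is only a guess at
likely symptoms, and find_suspect_chars is a made-up helper name.

```python
import unicodedata

# Control characters that are normal in a plain-text file.
ALLOWED_CONTROLS = {"\n", "\r", "\t"}


def find_suspect_chars(text):
    """Return (index, "U+XXXX") pairs for characters that look like debris."""
    suspects = []
    for index, ch in enumerate(text):
        if ch in ALLOWED_CONTROLS:
            continue
        category = unicodedata.category(ch)
        # U+FFFD is the replacement character; Cc = control,
        # Co = private use, Cn = unassigned.
        if ch == "\ufffd" or category in {"Cc", "Co", "Cn"}:
            suspects.append((index, f"U+{ord(ch):04X}"))
    return suspects
```

Running this per verse and printing the verse reference next to any hits
would give a short list of places to compare against the source text.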
> I've been using the Vietnamese Bible and found some minor errors in the
> text. For example, some characters don't display properly (John 1:19);
> this is due to a conversion problem when the original text file was
> converted from VNI to Unicode. Another issue is the spelling
> of the word "Jesus" in Vietnamese. In some places it's spelt Je^sus and in
> others it's Gie^-xu. Technically, the latter is more correct since
> there's no "J" in the Vietnamese alphabet. This may make searches inaccurate.
Do you have a print version of the same Bible? If you do, and it
consistently uses Gie^-xu instead of Je^sus, then this change is
appropriate. But from the source files, it appears that Je^sus is far
more common than Gie^-xu since Je^sus has 1168 occurrences and Gie^-xu has
only 3. Google also shows a preference for Je^sus (2280 occurrences) over
Gie^-xu (910 occurrences). So the spelling change does not appear to
me to be appropriate.
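
A count like the one above can be reproduced with a few lines of Python.
This is only a sketch: it assumes the text is already decoded to Unicode and
that the two spellings are written with the precomposed letter e-circumflex
(U+00EA); count_variants is a name invented for this example. Both the text
and the search terms are normalized to NFC first, so that a decomposed
e + combining circumflex is counted the same as the precomposed letter.

```python
import unicodedata


def count_variants(text, variants):
    """Count case-sensitive occurrences of each spelling variant in text."""
    normalized = unicodedata.normalize("NFC", text)
    return {
        variant: normalized.count(unicodedata.normalize("NFC", variant))
        for variant in variants
    }
```

Calling count_variants(text, ["J\u00easus", "Gi\u00ea-xu"]) on the exported
text returns a dict with one count per spelling. The match is case-sensitive
and counts substrings, so inflected or capitalized forms would need their own
entries.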
--Chris