[sword-devel] French ligatures in Louis SÉGOND’s text

Troy A. Griffitts scribe at crosswire.org
Mon Jul 16 15:57:34 MST 2007


Just a quick note.  Our Lucene indexing code does call all our strip
filters.  The solution and example I provided in my last email were using
Lucene indexes.

Chris Little <chrislit at crosswire.org> wrote: 
>
>
>DM Smith wrote:
>> Doesn't ICU have locale sensitive decomposition (or transliteration)?  
>> If it does then why can't we use the language of the module to set  
>> the locale then decompose. This is what we are planning to do for  
>> JSword (it has been on the todo list for years).
>
>I don't see anything like this in ICU. I couldn't find anything in the 
>API docs and there's nothing in the locale files themselves.
>
>I think our best option may be to tag words on a per module basis with 
>alternative forms and then index the forms as alternates with Lucene, as 
>your last post suggested. For non-Lucene searches we can normalize the 
>text & search strings via the strip filters as Troy suggests.
>
>Someone else would have to provide the code side of things, but in terms 
>of markup, I think we just want to do something along the lines of:
>
><w xlit="basic:coeur">cœur</w>
>
>And the strip filter (for non-Lucene searches) will just replace that 
>with "coeur".
>
>--Chris
>
>
>
>
>
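
For illustration only, here is a minimal sketch of the strip-filter behaviour described in the quoted markup proposal. It is written in Python rather than as an actual SWORD filter, and it is not SWORD's or JSword's code. The regular expression, the function names strip_xlit and alternate_forms, and the assumption that the attribute always looks like xlit="basic:..." are all hypothetical.

```python
import re

# Hypothetical pattern for the proposed <w xlit="basic:...">...</w> markup.
# Group 1 is the "basic" alternate form, group 2 is the surface form.
XLIT_W = re.compile(r'<w\s+xlit="basic:([^"]*)"\s*>(.*?)</w>', re.DOTALL)


def strip_xlit(text):
    """Replace each tagged word with its basic alternate form."""
    return XLIT_W.sub(lambda m: m.group(1), text)


def alternate_forms(text):
    """Return (surface form, alternate form) pairs found in the text."""
    return [(m.group(2), m.group(1)) for m in XLIT_W.finditer(text)]


if __name__ == "__main__":
    verse = 'Mon <w xlit="basic:coeur">cœur</w> bat'
    print(strip_xlit(verse))
    print(alternate_forms(verse))
```

The first function corresponds to the non-Lucene case, where the text is normalized before searching. The second shows the pairs an indexer could use to store both spellings as alternates.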