[sword-devel] Localized parsing symbols [was: C++ volunteer]
Cyrille
lafricain79 at gmail.com
Tue May 28 09:24:39 MST 2019
On 28/05/2019 17:40, Troy A. Griffitts wrote:
>
> So, a little background on why it is difficult to work out a solution
> to this problem:
>
> The current verse parser, which works fairly well, always has 3 sets
> of possibilities in view:
>
> OSISRef
> Current Locale
> English
>
> The parser needs to handle any of these three, typically in the
> preference order listed above. The issue with changing out symbols
> while parsing is that some symbols (notoriously the comma) are used
> for different purposes across these 3 sets.
>
> One might think that localized output might be easier than parsing,
> e.g., once parsed, we could at least output the reference: Jn 3,16.
> The problem here is that what the engine outputs it also expects to be
> able to parse.
>
> While we would like to solve this problem, it isn't as simple as
> adding to the locale files:
>
> ChapterVerseSeparator=,
>
> RangeSeparator=-
>
> ListSeparator=.
>
> This would be enough to define the locale, but not solve the problem.
> We would need a fundamental change in how parsing is done, e.g.,
> explicitly telling the parser, "Hey, I'm sending you localized input,
> so don't guess. You can count on the symbols I'm sending you to be
> localized." Right now everyone has the convenience of just passing any
> of the 3 sets of parsing text listed above and the parser just figuring
> it out, with the caveat that chapter, range, and list separators are
> not localizable.
>
> Hope this gives some background,
>
Yes, thank you, but I still don't understand why this is already possible
with two separators (. and :) but not with only one. Maybe I can't
understand it because it is too hard (technically) for me ;)
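
A minimal sketch of the ambiguity described above, written as standalone
C++ and not against the SWORD API. The names Separators, Ref and
parseNumbers are hypothetical, the separator values are taken from the
locale keys quoted above, and the rule for bare numbers is a simplifying
assumption. It shows that the same input, "3,16", means chapters 3 and 16
with English symbols but chapter 3 verse 16 with localized symbols, so the
parser cannot guess unless it is told which set is in use:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical separator set mirroring the locale keys quoted above.
struct Separators {
    char chapterVerse;
    char range;  // kept for completeness; ranges are not handled in this sketch
    char list;
};

// verse == 0 means "the whole chapter".
struct Ref {
    int chapter;
    int verse;
};

const Separators kEnglish{':', '-', ','};
const Separators kLocalized{',', '-', '.'};

// Parses only the numeric part of a reference (book name already removed),
// e.g. "3:16,18". Assumes well-formed input: digits and separators only.
std::vector<Ref> parseNumbers(const std::string &text, const Separators &sep) {
    std::vector<Ref> refs;
    int currentChapter = 0;
    bool inVerses = false;  // true once a chapter/verse separator was seen
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t end = text.find(sep.list, start);
        if (end == std::string::npos) end = text.size();
        const std::string item = text.substr(start, end - start);
        const std::size_t cv = item.find(sep.chapterVerse);
        if (cv != std::string::npos) {
            // "chapter<sep>verse"
            currentChapter = std::stoi(item.substr(0, cv));
            refs.push_back({currentChapter, std::stoi(item.substr(cv + 1))});
            inVerses = true;
        } else if (inVerses) {
            // bare number after a verse: another verse of the same chapter
            refs.push_back({currentChapter, std::stoi(item)});
        } else {
            // bare number with no verse context: a whole chapter
            refs.push_back({std::stoi(item), 0});
        }
        start = end + 1;
    }
    return refs;
}
```

Calling parseNumbers("3,16", kEnglish) yields two whole-chapter entries,
while parseNumbers("3,16", kLocalized) yields the single verse 3:16. An
explicit Separators argument is one way of saying "this input is
localized, don't guess".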
>
> Troy
>
>
> On 5/28/19 6:10 AM, David Haslam wrote:
>> OK - but my observations were not entirely irrelevant.
>>
>> Some front-ends never need the user to enter a reference in an edit
>> box. Navigation is done entirely via menu selections or clicking
>> search results etc.
>> AFAICT this is true of PocketSword.
>>
>> Other front-ends are designed at the opposite extreme. All navigation
>> is done through an edit box. This is true (eg) of STEP Bible.
>>
>> Best regards,
>>
>> David.
>>
>> Sent from ProtonMail Mobile
>>
>>
>> On Tue, May 28, 2019 at 13:54, refdoc at gmx.net <refdoc at gmx.net
>> <mailto:refdoc at gmx.net>> wrote:
>>> Sorry, David, that is a complete misunderstanding. Modules need
>>> osisref. There is and will be no need to do anything to the modules.
>>> This is about the engine's parser reading references in a
>>> locale-appropriate way.
>>>
>>> Sent from my mobile. Please forgive shortness, typos and weird
>>> autocorrects.
>>>
>>>
>>> -------- Original Message --------
>>> Subject: Re: [sword-devel] C++ volunteer
>>> From: David Haslam
>>> To: SWORD Developers' Collaboration Forum
>>> CC:
>>>
>>>
>>> Parsing native references is not a simple task, as we know from
>>> the fact that adyeth's orefs.py was kicked into touch indefinitely.
>>>
>>> And that’s even when punctuation marks are defined in the
>>> specified configuration file.
>>>
>>> Unless, that is, we consider adding keys to module .conf files
>>> that define the module-specific native reference punctuation
>>> marks and separators.
>>>
>>> That could be a huge undertaking, considering the need to
>>> maintain backwards compatibility.
>>>
>>> And it’s not as if it really is entirely module-specific. A user
>>> can be switching between modules with different languages, yet
>>> would need the current reference to always work, no matter what.
>>>
>>> Best regards
>>>
>>> David
>>>
>>> Sent from ProtonMail Mobile
>>>
>>>
>>> On Tue, May 28, 2019 at 12:10, refdoc at gmx.net <refdoc at gmx.net
>>> <mailto:refdoc at gmx.net>> wrote:
>>>> The improvement request for allowing commas in references...
>>>> adding commas in the suggested form would make millions of
>>>> currently valid Anglo references invalid. The problem is a much
>>>> wider one: references should be localised in their punctuation
>>>> too. I am not sure how difficult this would be, but I guess we
>>>> could make a start by defining what punctuation is used for
>>>> which purpose, and then take it from there.
>>>>
>>>> Cyrille, maybe start a page on the wiki and start thinking there.
>>>>
>>>> Sent from my mobile. Please forgive shortness, typos and weird
>>>> autocorrects.
>>>>
>>>>
>>>> -------- Original Message --------
>>>> Subject: Re: [sword-devel] C++ volunteer
>>>> From: Cyrille
>>>> To: SWORD Developers' Collaboration Forum
>>>> CC:
>>>>
>>>>
>>>> Hello Richard,
>>>> Welcome!
>>>> May I make a very selfish proposal to Richard, who offers
>>>> his help? There are two issues that I really want to be
>>>> resolved. One of them particularly handicaps Catholic
>>>> users (but I discovered today that the issue hadn't been
>>>> reported!!! I have just reported it):
>>>> https://tracker.crosswire.org/browse/API-216
>>>> And the second:
>>>> https://tracker.crosswire.org/projects/API/issues/API-180
>>>>
>>>> If there are more important things, which I cannot judge since
>>>> I am not a developer, at least I will have tried my luck ;)
>>>>
>>>> On 28/05/2019 01:38, Troy A. Griffitts wrote:
>>>>> Richard, sorry, I meant to give you the link to our tracker:
>>>>>
>>>>> https://tracker.crosswire.org
>>>>>
>>>>>
>>>>> On 5/27/19 4:32 PM, Troy A. Griffitts wrote:
>>>>>> Welcome, Richard!
>>>>>>
>>>>>> I would start at 2 places:
>>>>>>
>>>>>> First, have a look at our tracker here. We are not very (very not)
>>>>>> disciplined at keeping it current. Skimming through there and
>>>>>> commenting on anything that looks interesting, or even cleaning a few
>>>>>> things up in there that you confirm are no longer a problem might be a
>>>>>> useful exercise to get you poking around at internals and would be a
>>>>>> blessing for us. Our modus operandi as of late is to create a new unit
>>>>>> test in sword/tests/testsuite/ which fails at the bug; once the bug is
>>>>>> fixed, the test should pass, and we leave the test around to be sure we
>>>>>> don't regress. We can always use more tests in our test suite.
>>>>>>
>>>>>> Next, we intend to modularize our search engine support and
>>>>>> search types. Right now, SWModule (which represents a Bible) implements
>>>>>> our SWSearchable interface, which is fine, but right now it has a bunch
>>>>>> of #ifdef logic and switch statements to take different code paths
>>>>>> depending on which search engine is compiled into SWORD and which search
>>>>>> type is specified. This was fine initially, but has grown to the point
>>>>>> that we now have spaghetti in there. It should probably simply have a set
>>>>>> of SWSearchable objects in a map<SEARCH_TYPE, SWSearchable> and proxy
>>>>>> the search request to the appropriate SWSearchable impl based on what
>>>>>> types are registered for the module. This would allow us to implement
>>>>>> new types and register them with modules which support special search
>>>>>> types, e.g., advanced Hebrew Morphology searching. That's the general
>>>>>> idea anyway.
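
A rough sketch of the proxy idea in the paragraph above, as standalone
C++. Everything here is a hypothetical stand-in: SearchType, Searcher,
PhraseSearcher and Module are illustration names only, and the real
SWSearchable and SWModule interfaces in SWORD look different. The point is
just the shape: the module owns a map from search type to implementation
and forwards each request, so there are no #ifdef or switch branches.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical search types; not the engine's actual constants.
enum class SearchType { Phrase, Regex };

// Hypothetical stand-in for a searchable implementation.
class Searcher {
public:
    virtual ~Searcher() = default;
    virtual std::vector<std::string> search(const std::string &expr) = 0;
};

// Toy implementation: returns every entry that contains the expression.
class PhraseSearcher : public Searcher {
public:
    explicit PhraseSearcher(std::vector<std::string> entries)
        : entries_(std::move(entries)) {}

    std::vector<std::string> search(const std::string &expr) override {
        std::vector<std::string> hits;
        for (const std::string &entry : entries_) {
            if (entry.find(expr) != std::string::npos) hits.push_back(entry);
        }
        return hits;
    }

private:
    std::vector<std::string> entries_;
};

// Hypothetical module that proxies searches to registered implementations.
class Module {
public:
    void registerSearcher(SearchType type, std::unique_ptr<Searcher> impl) {
        searchers_[type] = std::move(impl);
    }

    bool supports(SearchType type) const {
        return searchers_.count(type) != 0;
    }

    // Forwards to the registered implementation; empty result if none.
    std::vector<std::string> search(SearchType type, const std::string &expr) {
        auto it = searchers_.find(type);
        if (it == searchers_.end()) return {};
        return it->second->search(expr);
    }

private:
    std::map<SearchType, std::unique_ptr<Searcher>> searchers_;
};
```

With this shape, a module that supports a special search type (say,
advanced Hebrew morphology) would simply have one more entry registered,
and modules without it would report supports() == false.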
>>>>>>
>>>>>> You should probably become familiar with SWFilter and how we use these
>>>>>> throughout the engine. These prepare a buffer for particular
>>>>>> objectives. We have RenderFilters, EncodingFilters, StripFilters, ...
>>>>>> The last prepares an SWModule entry for searching by, typically,
>>>>>> stripping out all markup and leaving only a plaintext buffer which can
>>>>>> be searched. We have some special code in the SWModule::search
>>>>>> spaghetti which takes Greek and Hebrew modules and turns buffers into a
>>>>>> series of Strongs#@MorphCode Strong#@MorphCode ... which allows regex
>>>>>> searches to do some advanced morph searching... like: find this Strong's
>>>>>> number, with any morphology, followed by any verb within 2 words. You
>>>>>> have to be pretty familiar with the Strong#@MorphCode syntax to
>>>>>> formulate something like that, but the idea is that a frontend could
>>>>>> have a nice UI to help a user come up with some creative searches.
>>>>>> Anyway, these should probably all be modularized out by renaming the
>>>>>> StripFilter concept to SearchFilter, and then pushing all this special
>>>>>> code out to SearchFilter impls which do these special things...
>>>>>>
>>>>>> Finally, an objective of all this search modularization is also to break
>>>>>> out the code required to create search indexes for each of the search
>>>>>> engines we support. Ideally, we should be able to support the same
>>>>>> searches either as an indexed or brute force search. The same code
>>>>>> which iterates a module, prepares each entry, and pushes that entry to
>>>>>> the search engine, building the search index, should also work for a
>>>>>> brute force search-- iterating the module, preparing each entry for the
>>>>>> search engine.. and then performing a check on that buffer to see if it
>>>>>> matches the search expression.
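
A small sketch of the "one iteration, two consumers" idea from the
paragraph above, again as standalone C++ with hypothetical names (Entry,
stripMarkup, forEachPrepared, bruteForceSearch, buildIndex). The markup
stripping and the word index are toy assumptions standing in for a real
strip filter and a real search engine index:

```cpp
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical module entry: a key plus raw text that may contain markup.
struct Entry {
    std::string key;
    std::string text;
};

// Toy stand-in for a strip filter: drops everything between '<' and '>'.
std::string stripMarkup(const std::string &raw) {
    std::string out;
    bool inTag = false;
    for (char c : raw) {
        if (c == '<') inTag = true;
        else if (c == '>') inTag = false;
        else if (!inTag) out += c;
    }
    return out;
}

using Sink = std::function<void(const std::string &key, const std::string &text)>;

// The shared part: iterate the module, prepare each entry, hand it to a sink.
void forEachPrepared(const std::vector<Entry> &module, const Sink &sink) {
    for (const Entry &entry : module) {
        sink(entry.key, stripMarkup(entry.text));
    }
}

// Consumer 1: brute force search checks each prepared buffer directly.
std::vector<std::string> bruteForceSearch(const std::vector<Entry> &module,
                                          const std::string &needle) {
    std::vector<std::string> hits;
    forEachPrepared(module, [&](const std::string &key, const std::string &text) {
        if (text.find(needle) != std::string::npos) hits.push_back(key);
    });
    return hits;
}

// Consumer 2: index building pushes each prepared buffer into a toy
// inverted index (word -> keys of the entries containing it).
std::map<std::string, std::vector<std::string>> buildIndex(
        const std::vector<Entry> &module) {
    std::map<std::string, std::vector<std::string>> index;
    forEachPrepared(module, [&](const std::string &key, const std::string &text) {
        std::istringstream words(text);
        std::string word;
        while (words >> word) index[word].push_back(key);
    });
    return index;
}
```

Because both consumers go through forEachPrepared, a change to how
entries are prepared applies to indexed and brute force searching alike.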
>>>>>>
>>>>>> I hope this gives you a few things to think about. It has been good for
>>>>>> me to refresh thoughts on all of this. Have a look and let me know what
>>>>>> you think.
>>>>>>
>>>>>> Welcome! Looking forward to sharing in service together,
>>>>>>
>>>>>> Troy
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 5/27/19 1:09 PM, Richard Smith wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> My name's Richard Smith. I'm a C++ software engineer with 10 years'
>>>>>>> experience in various industries. I was wondering if there was any
>>>>>>> space for a volunteer. I've started taking a look at things (building
>>>>>>> repos on Win/unix), but if there are specific things that are
>>>>>>> required, within my ability, I'm happy to do that.
>>>>>>>
>>>>>>> Best Regards
>>>>>>> Richard Smith
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>>>>> Instructions to unsubscribe/change your settings at above page