IRISLIB database
DefaultStemmer Class Reference

Public Member Functions

_.Library.Status OnNew (_.Library.String pDefaultLanguage)
 
_.Library.Status Reload ()
 Reloads underlying stemmer implementations and rules.
 

Public Attributes

 DefaultLanguage
 Default language to use when not specified in calls to <method>Stem</method>.
 

Private Member Functions

_.Library.Status __LoadRules (_.Library.String pLanguage, _.Library.String pPlugin)
 Retrieves a set of rules to customize or overrule plugin output, based on default rules.
 
_.Library.String __StemWordHunspell (_.iKnow.Stemming.HunspellStemmer pStemmer, _.Library.String pToken, _.Library.String pLanguage, _.Library.Integer pLexType, _.Library.String pEntity)
 Starting point for advanced resolution of Hunspell stemming results.
 
_.Library.String __StemWordHunspellRules (_.iKnow.Stemming.HunspellStemmer pStemmer, _.Library.String pToken, _.Library.String pLanguage, _.Library.Integer pLexType, _.Library.String pEntity, _.Library.Boolean pHasMatch)
 For a given token, goes through all the results presented by Hunspell and then decides which option to return (if any).
 

Detailed Description

This class encapsulates the logic to instantiate, use and amend stemmers for different languages. Plugin selection per language works as follows: if valid Hunspell affix and dictionary files are found in the /dev/hunspell subdirectory of your installation location (either named [language code]_*.aff or placed in a subdirectory named after the language code), a <class>HunspellStemmer</class> object is instantiated to handle stemming requests for that language. If no such files are found, the corresponding <class>TextStemmer</class> is instantiated.
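The selection rule above can be modelled with a short Python sketch. It is not the class's actual ObjectScript implementation: the function names `has_hunspell_files` and `select_plugin` are hypothetical, and the check that each .aff file has a same-named .dic file next to it is an assumption about what "valid" means here.

```python
from pathlib import Path


def has_hunspell_files(hunspell_dir: Path, language: str) -> bool:
    """Return True if an .aff/.dic pair for `language` exists in either layout."""
    # Layout 1: files named "<language>_*.aff" directly in the hunspell directory.
    candidates = list(hunspell_dir.glob(f"{language}_*.aff"))
    # Layout 2: .aff files inside a subdirectory named after the language code.
    subdir = hunspell_dir / language
    if subdir.is_dir():
        candidates.extend(subdir.glob("*.aff"))
    # Assumption: a pair counts as usable when a same-named .dic file sits beside the .aff.
    return any(aff.with_suffix(".dic").is_file() for aff in candidates)


def select_plugin(hunspell_dir: Path, language: str) -> str:
    """Name the plugin that would serve `language` under the rule described above."""
    if has_hunspell_files(hunspell_dir, language):
        return "HunspellStemmer"
    return "TextStemmer"
```

For example, `select_plugin(Path("/path/to/install/dev/hunspell"), "en")` would name the Hunspell plugin only when a pair such as en_US.aff and en_US.dic is present; the path shown is a placeholder.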

If the <method>StemWord</method> method is invoked for a particular language, this class first looks up the supplied string in the list of exceptions. If no exception is found (among either the default exceptions supplied with iKnow or the custom exceptions in the <class>Rule</class> table for this namespace), the StemWord method of the instantiated stemmer plugin object is invoked. If the plugin supports returning multiple results, these are filtered and only the first result satisfying the corresponding rules (stored in the iKnow language model or the <class>Rule</class> table) is returned.
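The resolution order described above (exceptions first, then the plugin, then rule filtering) can be sketched in Python as follows. This is a simplified model under stated assumptions, not the class's code: `stem_word`, `exceptions`, `plugin` and `accepts` are hypothetical names, exceptions are modelled as a plain mapping, and rules are collapsed into a single predicate.

```python
from typing import Callable, Iterable, Mapping, Optional


def stem_word(
    token: str,
    exceptions: Mapping[str, str],
    plugin: Callable[[str], Iterable[str]],
    accepts: Callable[[str, str], bool],
) -> Optional[str]:
    """Resolve a stem: exception list first, then the first plugin result passing the rules."""
    # Step 1: an exception entry short-circuits the plugin entirely.
    if token in exceptions:
        return exceptions[token]
    # Step 2: ask the plugin, which may return several candidate stems.
    for candidate in plugin(token):
        # Step 3: return only the first candidate that satisfies the rules.
        if accepts(token, candidate):
            return candidate
    # No candidate passed the rules, so nothing is returned.
    return None
```

Returning `None` when every candidate is rejected mirrors the note under __LoadRules() that rule filtering may leave no result at all.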

Member Function Documentation

◆ OnNew()

_.Library.Status OnNew ( _.Library.String  pDefaultLanguage)


◆ __LoadRules()

_.Library.Status __LoadRules ( _.Library.String  pLanguage,
_.Library.String  pPlugin 
)
private

Retrieves a set of rules to customize or overrule plugin output, based on the default rules returned by <method>GetDefaultRules</method> and the content of the <class>iKnow.Stemming.Rule</class> table. Any result retrieved by a plugin has to pass these rules (where applicable) or it will not be returned. Note that this may mean no results are passed back at all.

◆ __StemWordHunspell()

_.Library.String __StemWordHunspell ( _.iKnow.Stemming.HunspellStemmer  pStemmer,
_.Library.String  pToken,
_.Library.String  pLanguage,
_.Library.Integer  pLexType,
_.Library.String  pEntity 
)
private

Starting point for advanced resolution of Hunspell stemming results.

Stems pToken using pStemmer, first trying the capitalization as supplied and then, if no stem was found, initcaps and all-caps. Relays to <method>StemWordHunspellRules</method> for the actual stemming.
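The capitalization fallback can be illustrated with a small Python sketch. It only models the order of attempts: `stem_with_case_fallback` and the `stem` callable are hypothetical, and "initcaps" is assumed to mean first letter uppercase with the remainder lowercase, which the source does not specify.

```python
from typing import Callable, Optional


def stem_with_case_fallback(
    token: str, stem: Callable[[str], Optional[str]]
) -> Optional[str]:
    """Try the token as supplied, then initcaps, then all-caps, until a stem is found."""
    # Assumption: initcaps = str.capitalize(), i.e. "paris" -> "Paris".
    variants = [token, token.capitalize(), token.upper()]
    tried = set()
    for variant in variants:
        # Skip variants identical to one already attempted.
        if variant in tried:
            continue
        tried.add(variant)
        result = stem(variant)
        if result:
            return result
    # None of the capitalizations produced a stem.
    return None
```

With a stand-in dictionary lookup such as `{"Paris": "Paris"}.get` as the `stem` callable, the lowercase input "paris" is resolved on the second attempt.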

◆ __StemWordHunspellRules()

_.Library.String __StemWordHunspellRules ( _.iKnow.Stemming.HunspellStemmer  pStemmer,
_.Library.String  pToken,
_.Library.String  pLanguage,
_.Library.Integer  pLexType,
_.Library.String  pEntity,
_.Library.Boolean  pHasMatch 
)
private

For a given token, goes through all the results presented by Hunspell and then decides which option to return (if any at all), based on the rules returned by <method>GetHunspellRules</method>, using context information such as pLexType.

Member Data Documentation

◆ DefaultLanguage

DefaultLanguage

Default language to use when not specified in calls to <method>Stem</method>.