IRISLIB database
Japanese Class Reference

See <CLASS>Text.Text</CLASS> More...

Inheritance diagram for Japanese:
Collaboration diagram for Japanese:

Static Public Member Functions

_.Library.Status ExcludeCommonTerms (nTerms)
 Classifies the most common nTerms words in the current language as noise words. More...
 
_.Library.String SeparateWords (_.Library.String rawText)
 Separates individual terms with whitespace, for languages such as Japanese.
 
- Static Public Member Functions inherited from Text
_.Library.Status AddDocToDictionary (_.Library.String document, _.Library.String category)
 Add words of the specified document to the ^SYSDict global. More...
 
_.Library.Status AddToDictionary (_.Library.String word, _.Library.Integer wordType, _.Library.String category, _.Library.Integer wCount)
 Add the specified word or phrase to the current dictionary. More...
 
_.Library.Status BuildValueArray (_.Library.Binary document, _.Library.Binary valueArray)
 The <METHOD>BuildValueArray</METHOD> method tokenizes a text string into a collection of. More...
 
_.Library.String ChooseSearchKey (_.Library.String document)
 If we must choose exactly one indexable search string from a pattern that. More...
 
_.Library.List Classify (_.Library.String document, _.Library.Integer topN, maxDocFreq)
 Classify document into one of the known categories using a semi-naive Bayesian classification algorithm. More...
 
_.Library.List CreateQList (_.Library.String document, _.Library.String coll)
 Internal method used by the <METHOD>Similarity</METHOD> and <METHOD>SimilarityIdx</METHOD> More...
 
_.Library.String DecompressOffsets (_.Library.String compressed)
 Converts the offsets from compressed to uncompressed form.
 
 DropDictionary ()
 Deletes all of the words, noisewords, etc. More...
 
_.Library.List MakeSearchTerms (_.Library.String searchPattern, _.Library.Integer ngramlen)
 Convert a string into a list of search terms, such that each search term contains no. More...
 
_.Library.Numeric Similarity (_.Library.String document, _.Library.List qList)
 See also <METHOD>SimilarityIdx</METHOD>
 
_.Library.Numeric SimilarityIdx (_.Library.String ID, _.Library.String textIndex, _.Library.List qList)
 
_.Library.String Standardize (_.Library.String document, _.Library.Boolean origtext)
 Returns the specified string in standardized form, that is: stemmed, filtered, translated,. More...
 
 setto (_.Library.String b, _.Library.String s, _.Library.Integer j, _.Library.Integer k)
 setto(s) sets (j+1),...k to the characters in the string s, readjusting k.
 
- Static Public Member Functions inherited from String
_.Library.String DisplayToLogical (_, _.Library.String val)
 Converts the input value val, which is a string, into the logical string format. More...
 
_.Library.Status IsValid (_, _.Library.RawString val)
 Tests if the logical value val, which is a string, is valid. More...
 
_.Library.String JSONToLogical (_, _.Library.String val)
 If JSONLISTPARAMETER is specified, JSONToLogical is generated, which imports using the list specified by JSONLISTPARAMETER.
 
_.Library.String LogicalToDisplay (_, _.Library.String val)
 Converts the value of val, which is in logical format, into a display string. More...
 
_.Library.String LogicalToJSON (_, _.Library.String val)
 If JSONLISTPARAMETER is specified, LogicalToJSON is generated, which exports using the list specified by JSONLISTPARAMETER.
 
_.Library.String LogicalToXSD (_, _.Library.String val)
 If XMLLISTPARAMETER is specified, LogicalToXSD is generated, which exports using the list specified by XMLLISTPARAMETER.
 
_.Library.String Normalize (_, _.Library.RawString val)
 Truncates the value val to MAXLEN characters.
 
_.Library.String XSDToLogical (_, _.Library.String val)
 If XMLLISTPARAMETER is specified, XSDToLogical is generated which imports using the list specified by XMLLISTPARAMETER.
 

Static Public Attributes

 CASEINSENSITIVE = None
 See <CLASS>Text.Text</CLASS> More...
 
- Static Public Attributes inherited from Text
 CASEINSENSITIVE = None
 The Text.Text data type class implements the methods used by InterSystems IRIS for full text indexing, text search, similarity scoring, automatic classification, dictionary management, word stemming, n-gram key creation, and noise word filtering. More...
 
 DICTIONARY = None
 The default dictionary for properties of this class. More...
 
 FILTERNOISEWORDS = None
 <PARAMETER>FILTERNOISEWORDS</PARAMETER> controls whether common-word filtering is enabled. More...
 
 IGNOREMARKUP = None
 <PARAMETER>IGNOREMARKUP</PARAMETER> is a Boolean (0/1) flag. More...
 
 MAXLEN = None
 By default, there is no default MAXLEN; that is, it must be specified wherever a Text.Text. More...
 
 MAXOCCURS = None
 Text search applications sometimes need to highlight the matching terms found. More...
 
 MAXWORDLEN = None
 <PARAMETER>MAXWORDLEN</PARAMETER> specifies the maximum word length that will be retained. More...
 
 MINWORDLEN = None
 MINWORDLEN specifies the minimum word length that will be retained. More...
 
 NGRAMLEN = None
 <PARAMETER>NGRAMLEN</PARAMETER> is the maximum number of words that will be regarded as a single More...
 
 NOISEWORDS100 = None
 NOISEWORDSnnn lists the most common words in the language, in order of their frequency of occurrence. More...
 
 NUMCHARS = None
 <PARAMETER>NUMCHARS</PARAMETER> specifies the characters other than digits that may appear More...
 
 NUMERIC = None
 <PARAMETER>NUMERIC</PARAMETER> specifies whether numeric terms will be retained (1) or ignored (0).
 
 OKAPIBM25B = None
 See <METHOD>SimilarityIdx</METHOD>
 
 OKAPIBM25K1 = None
 See <METHOD>SimilarityIdx</METHOD>
 
 OKAPIBM25K3 = None
 See <METHOD>SimilarityIdx</METHOD>
 
 SEPARATEWORDS = None
 Languages such as Japanese require the raw document text to be parsed and. More...
 
 SOURCELANGUAGE = None
 <PARAMETER>SOURCELANGUAGE</PARAMETER> specifies the default source language to translate More...
 
 STEMMING = None
 <PARAMETER>STEMMING</PARAMETER> replaces each word by its language-specific stem to improve the More...
 
 TARGETLANGUAGE = None
 <PARAMETER>TARGETLANGUAGE</PARAMETER> specifies the default target language to translate More...
 
 TARGETLANGUAGECLASS = None
 <PARAMETER>TARGETLANGUAGECLASS</PARAMETER> specifies the class to use when <PARAMETER>TARGETLANGUAGE</PARAMETER> More...
 
 THESAURUS = None
 <PARAMETER>THESAURUS</PARAMETER> specifies that a language-specific thesaurus is to be used in place of, More...
 
 WORDCHARS = None
 <PARAMETER>WORDCHARS</PARAMETER> specifies the characters other than alphabetic that may More...
 
- Static Public Attributes inherited from String
 COLLATION = None
 The default collation value used for this data type. More...
 
 CONTENT = None
 XML element content "MIXED" for mixed="true" and "STRING" or "ESCAPE" for mixed="false". More...
 
 DISPLAYLIST = None
 Used for enumerated (multiple-choice) attributes. More...
 
 ESCAPE = None
 Controls the translate table used to escape content when CONTENT="MIXED" is specified.
 
 JSONLISTPARAMETER = None
 Used to specify the name of the parameter which contains the enumeration list for JSON values. More...
 
 JSONTYPE = None
 JSONTYPE is the JSON type used for this data type.
 
 MAXLEN = None
 The maximum number of characters the string can contain. More...
 
 MINLEN = None
 The minimum number of characters the string can contain.
 
 PATTERN = None
 A pattern which the string should match. More...
 
 TRUNCATE = None
 Determines whether to truncate the string to MAXLEN characters.
 
 VALUELIST = None
 Used for enumerated (multiple-choice) attributes. More...
 
 XMLLISTPARAMETER = None
 Used to specify the name of the parameter which contains the enumeration list for XML values. More...
 
 XSDTYPE = None
 Declares the XSD type used when projecting XML Schemas.
 
- Static Public Attributes inherited from DataType
 INDEXNULLMARKER = None
 Override this parameter value to specify what value should be used as a null marker when a property of the type is used in a subscript of an index map. More...
 

Detailed Description

See <CLASS>Text.Text</CLASS>

The <CLASS>Text.Japanese</CLASS> class implements (or calls) the Japanese language-specific stemming algorithm and initializes the language-specific list of noise words.

Member Function Documentation

◆ ExcludeCommonTerms()

_.Library.Status ExcludeCommonTerms (   nTerms)
static

Classifies the most common nTerms words in the current language as noise words.

The parameters <PARAMETER>NOISEWORDS100</PARAMETER>, <PARAMETER>NOISEWORDS200</PARAMETER>, and <PARAMETER>NOISEWORDS300</PARAMETER> list the 300 most common words of the current language, in order of their frequency. Similarly, <PARAMETER>NOISEBIGRAMSn00</PARAMETER> lists the 300 most common bigrams of the current language that would not typically be considered useful for searching.
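
The idea of treating the first nTerms entries of frequency-ordered word lists as noise words can be illustrated with a minimal Python sketch. This is not the IRIS implementation: the function names exclude_common_terms and filter_noise are hypothetical, and the three short lists are illustrative placeholders, not the actual parameter values.

```python
# Hypothetical stand-ins for the frequency-ordered NOISEWORDSnnn parameters.
# The entries are illustrative placeholders only, not the real IRIS values.
NOISEWORDS100 = ["の", "に", "は"]
NOISEWORDS200 = ["を", "た", "が"]
NOISEWORDS300 = ["で", "て", "と"]


def exclude_common_terms(n_terms):
    """Return the n_terms most frequent words as a set of noise words."""
    if n_terms < 0:
        raise ValueError("n_terms must be non-negative")
    # Concatenating the lists preserves the overall frequency ranking.
    ranked = NOISEWORDS100 + NOISEWORDS200 + NOISEWORDS300
    return set(ranked[:n_terms])


def filter_noise(words, noise):
    """Drop every word that has been classified as a noise word."""
    return [w for w in words if w not in noise]
```

In this sketch, a larger nTerms value filters more aggressively, and a value beyond the length of the combined list simply classifies every listed word as noise.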

Reimplemented from Text.

Member Data Documentation

◆ CASEINSENSITIVE

CASEINSENSITIVE = None
static

See <CLASS>Text.Text</CLASS>