See <CLASS>Text.Text</CLASS> More...

Inheritance diagram for Japanese:

Collaboration diagram for Japanese:

Static Public Member Functions
_.Library.Status	ExcludeCommonTerms (nTerms)
	Classifies the most common nTerms words in the current language as noise words. More...

_.Library.String	SeparateWords (_.Library.String rawText)
	Separates individual terms with whitespace, for languages such as Japanese.

Static Public Member Functions inherited from Text
_.Library.Status	AddDocToDictionary (_.Library.String document, _.Library.String category)
	Add words of the specified document to the ^SYSDict global. More...

_.Library.Status	AddToDictionary (_.Library.String word, _.Library.Integer wordType, _.Library.String category, _.Library.Integer wCount)
	Add the specified word or phrase to the current dictionary. More...

_.Library.Status	BuildValueArray (_.Library.Binary document, _.Library.Binary valueArray)
	The <METHOD>BuildValueArray</METHOD> method tokenizes a text string into a collection of. More...

_.Library.String	ChooseSearchKey (_.Library.String document)
	If we must choose exactly one indexable search string from a pattern that. More...

_.Library.List	Classify (_.Library.String document, _.Library.Integer topN, maxDocFreq)
	Classify document into one of the known categories using a semi-naive Bayesian classification algorithm. More...

_.Library.List	CreateQList (_.Library.String document, _.Library.String coll)
	Internal method used by the <METHOD>Similarity</METHOD> and <METHOD>SimilarityIdx</METHOD> More...

_.Library.String	DecompressOffsets (_.Library.String compressed)
	Converts the offsets from compressed to uncompressed form.

	DropDictionary ()
	Deletes all of the words, noisewords, etc. More...

_.Library.List	MakeSearchTerms (_.Library.String searchPattern, _.Library.Integer ngramlen)
	Convert a string into a list of search terms, such that each search term contains no. More...

_.Library.Numeric	Similarity (_.Library.String document, _.Library.List qList)
	See also <METHOD>SimilarityIdx</METHOD>

_.Library.Numeric	SimilarityIdx (_.Library.String ID, _.Library.String textIndex, _.Library.List qList)

_.Library.String	Standardize (_.Library.String document, _.Library.Boolean origtext)
	Returns the specified string in standardized form, that is: stemmed, filtered, translated,. More...

	setto (_.Library.String b, _.Library.String s, _.Library.Integer j, _.Library.Integer k)
	setto(s) sets (j+1),...k to the characters in the string s, readjusting k.

Static Public Member Functions inherited from String
_.Library.String	DisplayToLogical (_, _.Library.String val)
	Converts the input value val, which is a string, into the logical string format. More...

_.Library.Status	IsValid (_, _.Library.RawString val)
	Tests if the logical value val, which is a string, is valid. More...

_.Library.String	JSONToLogical (_, _.Library.String val)
	If JSONLISTPARAMETER is specified, JSONToLogical is generated which imports using the list specified by JSONLISTPARAMETER.

_.Library.String	LogicalToDisplay (_, _.Library.String val)
	Converts the value of val, which is in logical format, into a display string. More...

_.Library.String	LogicalToJSON (_, _.Library.String val)
	If JSONLISTPARAMETER is specified, LogicalToJSON is generated which exports using the list specified by JSONLISTPARAMETER.

_.Library.String	LogicalToXSD (_, _.Library.String val)
	If XMLLISTPARAMETER is specified, LogicalToXSD is generated which exports using the list specified by XMLLISTPARAMETER.

_.Library.String	Normalize (_, _.Library.RawString val)
	Truncates value val to MAXLEN characters.

_.Library.String	XSDToLogical (_, _.Library.String val)
	If XMLLISTPARAMETER is specified, XSDToLogical is generated which imports using the list specified by XMLLISTPARAMETER.

Static Public Attributes
	CASEINSENSITIVE = None
	See <CLASS>Text.Text</CLASS> More...

Static Public Attributes inherited from Text
	CASEINSENSITIVE = None
	The Text.Text data type class implements the methods used by InterSystems IRIS for full text indexing, text search, similarity scoring, automatic classification, dictionary management, word stemming, n-gram key creation, and noise word filtering. More...

	DICTIONARY = None
	The default dictionary for properties of this class. More...

	FILTERNOISEWORDS = None
	<PARAMETER>FILTERNOISEWORDS</PARAMETER> controls whether common-word filtering is enabled. More...

	IGNOREMARKUP = None
	<PARAMETER>IGNOREMARKUP</PARAMETER> is a Boolean (0/1) flag. More...

	MAXLEN = None
	By default, there is no default MAXLEN; that is, it must be specified wherever a Text.Text. More...

	MAXOCCURS = None
	Text search applications sometimes need to highlight the matching terms found. More...

	MAXWORDLEN = None
	<PARAMETER>MAXWORDLEN</PARAMETER> specifies the maximum word length that will be retained. More...

	MINWORDLEN = None
	MINWORDLEN specifies the minimum word length that will be retained. More...

	NGRAMLEN = None
	<PARAMETER>NGRAMLEN</PARAMETER> is the maximum number of words that will be regarded as a single More...

	NOISEWORDS100 = None
	NOISEWORDSnnn lists the most common words in the language, in order of their frequency of occurrence. More...

	NUMCHARS = None
	<PARAMETER>NUMCHARS</PARAMETER> specifies the characters other than digits that may appear More...

	NUMERIC = None
	<PARAMETER>NUMERIC</PARAMETER> specifies whether numeric terms will be retained (1) or ignored (0).

	OKAPIBM25B = None
	See <METHOD>SimilarityIdx</METHOD>

	OKAPIBM25K1 = None
	See <METHOD>SimilarityIdx</METHOD>

	OKAPIBM25K3 = None
	See <METHOD>SimilarityIdx</METHOD>

	SEPARATEWORDS = None
	Languages such as Japanese require the raw document text to be parsed and. More...

	SOURCELANGUAGE = None
	<PARAMETER>SOURCELANGUAGE</PARAMETER> specifies the default source language to translate More...

	STEMMING = None
	<PARAMETER>STEMMING</PARAMETER> replaces each word by its language-specific stem to improve the More...

	TARGETLANGUAGE = None
	<PARAMETER>TARGETLANGUAGE</PARAMETER> specifies the default target language to translate More...

	TARGETLANGUAGECLASS = None
	<PARAMETER>TARGETLANGUAGECLASS</PARAMETER> specifies the class to use when <PARAMETER>TARGETLANGUAGE</PARAMETER> More...

	THESAURUS = None
	<PARAMETER>THESAURUS</PARAMETER> specifies that a language-specific thesaurus is to be used in place of, More...

	WORDCHARS = None
	<PARAMETER>WORDCHARS</PARAMETER> specifies the characters other than alphabetic that may More...

Static Public Attributes inherited from String
	COLLATION = None
	The default collation value used for this data type. More...

	CONTENT = None
	XML element content "MIXED" for mixed="true" and "STRING" or "ESCAPE" for mixed="false". More...

	DISPLAYLIST = None
	Used for enumerated (multiple-choice) attributes. More...

	ESCAPE = None
	Controls the translate table used to escape content when CONTENT="MIXED" is specified.

	JSONLISTPARAMETER = None
	Used to specify the name of the parameter which contains the enumeration list for JSON values. More...

	JSONTYPE = None
	JSONTYPE is JSON type used for this datatype.

	MAXLEN = None
	The maximum number of characters the string can contain. More...

	MINLEN = None
	The minimum number of characters the string can contain.

	PATTERN = None
	A pattern which the string should match. More...

	TRUNCATE = None
	Determines whether to truncate the string to MAXLEN characters.

	VALUELIST = None
	Used for enumerated (multiple-choice) attributes. More...

	XMLLISTPARAMETER = None
	Used to specify the name of the parameter which contains the enumeration list for XML values. More...

	XSDTYPE = None
	Declares the XSD type used when projecting XML Schemas.

Static Public Attributes inherited from DataType
	INDEXNULLMARKER = None
	Override this parameter value to specify what value should be used as a null marker when a property of the type is used in a subscript of an index map. More...

Detailed Description

See <CLASS>Text.Text</CLASS>

The <CLASS>Text.Japanese</CLASS> class implements (or calls) the Japanese language-specific stemming algorithm and initializes the language-specific list of noise words.
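To make the purpose of SeparateWords concrete, the following Python sketch inserts whitespace wherever the script (kanji, hiragana, katakana, alphanumeric) changes between adjacent characters. This is a deliberately naive heuristic chosen for illustration; it is not the algorithm used by <CLASS>Text.Japanese</CLASS>, and the names script_of and separate_words are hypothetical. A real segmenter would rely on a dictionary or morphological analysis and would not, for example, split a verb stem from its inflection.

```python
import unicodedata


def script_of(ch):
    """Return a coarse script label for one character (hypothetical helper)."""
    if ch.isspace():
        return "space"
    name = unicodedata.name(ch, "")
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"
    if ch.isalnum():
        return "alnum"
    return "other"


def separate_words(raw_text):
    """Join runs of same-script characters with single spaces.

    Assumption: a change of script marks a term boundary. This is only
    a rough approximation of Japanese word segmentation.
    """
    pieces = []
    current = ""
    prev = None
    for ch in raw_text:
        label = script_of(ch)
        if label == "space":
            # Existing whitespace always ends the current run.
            if current:
                pieces.append(current)
            current = ""
            prev = None
            continue
        if prev is not None and label != prev:
            # Script changed: close the current run and start a new one.
            pieces.append(current)
            current = ""
        current += ch
        prev = label
    if current:
        pieces.append(current)
    return " ".join(pieces)
```

Once terms are separated by whitespace, the rest of a text-indexing pipeline can treat the result like any space-delimited language.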

Member Function Documentation

◆ ExcludeCommonTerms()

_.Library.Status ExcludeCommonTerms ( nTerms )

static

Classifies the most common nTerms words in the current language as noise words.

The parameters <PARAMETER>NOISEWORDS100</PARAMETER>, <PARAMETER>NOISEWORDS200</PARAMETER>, and <PARAMETER>NOISEWORDS300</PARAMETER> together list the 300 most common words of the current language, in order of their frequency. Similarly, <PARAMETER>NOISEBIGRAMSn00</PARAMETER> lists the 300 most common bigrams of the current language that would not typically be considered useful for searching.
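The idea of taking the first nTerms entries from frequency-ordered word lists can be sketched in Python as follows. The dictionary NOISEWORDS, its tiny English sample contents, and the functions exclude_common_terms and filter_noise are hypothetical stand-ins; the sketch also assumes each list is a space-delimited string. It illustrates the selection logic only, not how the class stores or registers noise words.

```python
# Hypothetical stand-ins for NOISEWORDS100/200/300. Each entry holds the
# next tranche of words, ordered from most to least frequent.
# Assumption: words within a tranche are separated by spaces.
NOISEWORDS = {
    100: "the of and to",
    200: "in is that for",
    300: "it as was with",
}


def exclude_common_terms(n_terms):
    """Return the n_terms most frequent words as a set of noise words."""
    ordered = []
    for key in sorted(NOISEWORDS):
        # Walk the tranches in frequency order: 100, then 200, then 300.
        ordered.extend(NOISEWORDS[key].split())
    return set(ordered[:max(n_terms, 0)])


def filter_noise(tokens, noise):
    """Drop every token whose lowercase form is a noise word."""
    return [t for t in tokens if t.lower() not in noise]
```

For instance, `filter_noise(["The", "index", "of", "terms"], exclude_common_terms(4))` keeps only the tokens that are not among the four most frequent sample words.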

Reimplemented from Text.

Member Data Documentation

◆ CASEINSENSITIVE

CASEINSENSITIVE = None