IRISLIB database
IKnowBuilder Class Reference

Parent class for any iKnow-based <class>iKnow.Classification.Builder</class> implementations, providing common infrastructure abstracting a few iKnow API calls.

Public Member Functions

_.Library.Status OnCreateExportTable (_.Dictionary.ClassDefinition pClassDef, _.Library.Boolean pVerbose)
 Callback invoked by <method>ExportDataTable</method> when creating the export table definition.
 
_.Library.Status OnExportTable (_.Library.String pClassName, _.Library.Boolean pVerbose, _.Library.Boolean pTracking)
 Callback invoked by <method>ExportDataTable</method> to load the data into export table <class>pClassName</class>.
 
_.Library.Status OnGenerateClassifier (_.iKnow.Classification.Definition.Classifier pDefinition, _.Library.Boolean pVerbose, _.Library.Boolean pIncludeBuilderInfo)
 Appends the ClassificationMethod element for this type of classifier.
 
- Public Member Functions inherited from RegisteredObject
_.Library.Status OnAddToSaveSet (_.Library.Integer depth, _.Library.Integer insert, _.Library.Integer callcount)
 This callback method is invoked when the current object is added to the SaveSet, ...
 
_.Library.Status OnClose ()
 This callback method is invoked by the <METHOD>Close</METHOD> method to ...
 
_.Library.Status OnConstructClone (_.Library.RegisteredObject object, _.Library.Boolean deep, _.Library.String cloned)
 This callback method is invoked by the <METHOD>ConstructClone</METHOD> method to ...
 
_.Library.Status OnNew ()
 This callback method is invoked by the <METHOD>New</METHOD> method to ...
 
_.Library.Status OnValidateObject ()
 This callback method is invoked by the <METHOD>ValidateObject</METHOD> method to ...
 

Public Attributes

 DomainId
 The iKnow domain this categorization model is built from.
 
 MetadataField
 If set, this metadata field contains the actual category value for each source.
 
 TestSet
 
 TrainingSet
 The sample set of the domain to be used for training this model.
 
- Public Attributes inherited from Builder
 ClassificationMethod
 The general method used for classification: ...
 
 Description
 Optional description for the Classifier.
 
 DocumentVectorLocalWeights
 Local Term Weights for the document vector to register in the ClassificationMethod element.
 
 DocumentVectorNormalization
 Document vector normalization method to register in the Classification element.
 
 MinimumSpread
 The minimum number of records in the training set that should contain a term before it ...
 
 MinimumSpreadPercent
 The minimum fraction of records in the training set that should contain a term before it ...
 

Private Member Functions

_.Library.Status GetCategoryInfo (pCategories)
 Returns all categories added so far.
 
_.Library.Status LoadMetadataCategories (_.Library.String pFieldName)
 
_.Library.Status PopulateTerms (_.Library.Integer pCount, _.Library.String pType, _.Library.String pMetric, _.Library.Boolean pPerCategory)
 
_.Library.Status TestClassifier (_.Library.RawString pTestSet, pResult, _.Library.Double pAccuracy, _.Library.String pCategorySpec, _.Library.Boolean pVerbose)
 

Additional Inherited Members

- Static Public Attributes inherited from RegisteredObject
 CAPTION = None
 Optional name used by the Form Wizard for a class when generating forms.
 
 JAVATYPE = None
 The Java type to be used when exported.
 
 PROPERTYVALIDATION = None
 This parameter controls the default validation behavior for the object.
 

Detailed Description

Parent class for any iKnow-based <class>iKnow.Classification.Builder</class> implementations, providing common infrastructure abstracting a few iKnow API calls.

IKnowBuilder implementations assume category specs are <class>iKnow.Filters.Filter</class> instances in their string representation.

Member Function Documentation

◆ GetCategoryInfo()

_.Library.Status GetCategoryInfo (pCategories)
private

Returns all categories added so far:

   pCategories(n) = $lb([name], [record count])

Reimplemented from Builder.

◆ LoadMetadataCategories()

_.Library.Status LoadMetadataCategories (_.Library.String pFieldName)
private

Creates (appends) a category for each of the available values of a given metadata field pFieldName in the full domain (thus ignoring <property>TrainingSet</property>).

Note: as category names are case sensitive, it is highly recommended to use a case-sensitive metadata field.
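
The note above can be illustrated with a minimal Python sketch. It is not iKnow code: it only assumes that category names are compared by exact string match, and the names `categories` and `is_known_category` are hypothetical.

```python
# Hypothetical category names, as they might be derived from metadata values.
categories = {"Sports", "Politics"}


def is_known_category(value: str) -> bool:
    """Exact, case-sensitive membership test against the category names."""
    return value in categories


print(is_known_category("Sports"))  # True
print(is_known_category("SPORTS"))  # False: different casing is a different name
```

Under that assumption, a field that stores or returns values in a normalized case could yield names that no longer match the categories exactly.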

◆ PopulateTerms()

_.Library.Status PopulateTerms (_.Library.Integer pCount, _.Library.String pType, _.Library.String pMetric, _.Library.Boolean pPerCategory)
private

This PopulateTerms implementation accepts "BM25" and "TFIDF" as values for pMetric. See also the class reference for this method in <class>iKnow.Classification.Builder</class>.
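
For readers unfamiliar with the two metric names, the following Python sketch shows the textbook TF-IDF and Okapi BM25 term scores. It is an illustration only and makes no claim about how iKnow computes these metrics internally. The function names, the parameter names, and the defaults `k1 = 1.2` and `b = 0.75` are assumptions taken from common practice, not from this class.

```python
import math


def tfidf(tf: float, df: int, n_docs: int) -> float:
    """Textbook TF-IDF: term frequency times log inverse document frequency."""
    return tf * math.log(n_docs / df)


def bm25(tf: float, df: int, n_docs: int, doc_len: float = 1.0,
         avg_len: float = 1.0, k1: float = 1.2, b: float = 0.75) -> float:
    """Okapi BM25 term score (variant with a non-negative IDF)."""
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    length_norm = k1 * (1.0 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1.0) / (tf + length_norm)


def term_score(metric: str, tf: float, df: int, n_docs: int) -> float:
    """Dispatch on a metric name, rejecting anything but the two known ones."""
    if metric == "TFIDF":
        return tfidf(tf, df, n_docs)
    if metric == "BM25":
        return bm25(tf, df, n_docs)
    raise ValueError(f"unsupported metric: {metric!r}")


# A term occurring twice in a record and present in 10 of 100 records.
print(term_score("TFIDF", 2, 10, 100))
print(term_score("BM25", 2, 10, 100))
```

Both scores grow with the term's frequency in a record and shrink as the term appears in more records. BM25 additionally saturates with frequency and normalizes for record length.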

Reimplemented from Builder.

◆ TestClassifier()

_.Library.Status TestClassifier (_.Library.RawString pTestSet, pResult, _.Library.Double pAccuracy, _.Library.String pCategorySpec, _.Library.Boolean pVerbose)
private

Utility method to batch-test the classifier against a test set pTestSet, which can be supplied as an <class>iKnow.Filters.Filter</class> object or its serialized form. Per-record results are returned through pResult:

   pResult(n) = $lb([record ID], [actual category], [predicted category])

pAccuracy will contain the raw accuracy (# of records predicted correctly) of the current model. Use <class>iKnow.Classification.Utils</class> for more advanced model testing.
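
The relationship between pResult and pAccuracy can be sketched in Python as follows. This is a sketch, not the class's implementation. Each tuple stands in for one pResult(n) entry, and the name `raw_accuracy` and the sample data are hypothetical. Because the description could mean either a count or a fraction of correctly predicted records, the function returns both.

```python
from typing import List, Tuple

# Each row mirrors pResult(n) = $lb(record ID, actual category, predicted category).
Row = Tuple[int, str, str]


def raw_accuracy(results: List[Row]) -> Tuple[int, float]:
    """Return (number of correct predictions, fraction of correct predictions)."""
    correct = sum(1 for _record_id, actual, predicted in results
                  if actual == predicted)
    fraction = correct / len(results) if results else 0.0
    return correct, fraction


sample = [
    (1, "sports", "sports"),
    (2, "politics", "sports"),
    (3, "politics", "politics"),
    (4, "science", "science"),
]
correct, fraction = raw_accuracy(sample)
print(correct, fraction)  # 3 0.75
```

Raw accuracy treats all categories alike, so it can look good on a skewed test set even when a small category is always misclassified.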

If the current model's categories were added through <method>AddCategory</method> without an appropriate category filter specification, rather than through <method>LoadMetadataCategories</method> (which sets <property>MetadataField</property>), use pCategorySpec to supply the metadata field that holds the actual category values to test against.

Reimplemented from Builder.

Member Data Documentation

◆ DomainId

DomainId

The iKnow domain this categorization model is built from.

◆ MetadataField

MetadataField

If set, this metadata field contains the actual category value for each source.

◆ TestSet

TestSet

◆ TrainingSet

TrainingSet

The sample set of the domain to be used for training this model.