IRISLIB database
TextTransformation Class Reference

Public Member Functions

_.Library.Status BufferString (_.Library.String data)
 
_.Library.Status Convert ()
 
_.Library.String NextConvertedPart ()
 
- Public Member Functions inherited from Converter
_.Library.Status SetParams (_.Library.String params)
 
- Public Member Functions inherited from RegisteredObject
_.Library.Status OnAddToSaveSet (_.Library.Integer depth, _.Library.Integer insert, _.Library.Integer callcount)
 This callback method is invoked when the current object is added to the SaveSet,. More...
 
_.Library.Status OnClose ()
 This callback method is invoked by the <METHOD>Close</METHOD> method to. More...
 
_.Library.Status OnConstructClone (_.Library.RegisteredObject object, _.Library.Boolean deep, _.Library.String cloned)
 This callback method is invoked by the <METHOD>ConstructClone</METHOD> method to. More...
 
_.Library.Status OnNew ()
 This callback method is invoked by the <METHOD>New</METHOD> method to. More...
 
_.Library.Status OnValidateObject ()
 This callback method is invoked by the <METHOD>ValidateObject</METHOD> method to. More...
 

Static Public Member Functions

_.Library.List GetMetadataKeys (_.Library.String params)
 If the Converter extracts metadata, this method should return a list of keys of the metadata fields that are extracted from the contents.
 
- Static Public Member Functions inherited from Converter
_.Library.String Test (_.Library.String pInput, _.Library.List pParams, _.Library.Status pSC)
 Utility method to test a converter class. More...
 

Private Attributes

 __Buffer
 
 __OutputText
 

Additional Inherited Members

- Public Attributes inherited from Converter
 Params
 
- Static Public Attributes inherited from RegisteredObject
 CAPTION = None
 Optional name used by the Form Wizard for a class when generating forms. More...
 
 JAVATYPE = None
 The Java type to be used when exported.
 
 PROPERTYVALIDATION = None
 This parameter controls the default validation behavior for the object. More...
 

Detailed Description

This <class>iKnow.Source.Converter</class> implementation wraps a Text Transformation (TT) model and extracts sections and key-value pairs as defined in that model. Selected sections are concatenated and used as text input for indexing by the iKnow engine, while selected key-value pairs can be saved as metadata values.

Converter parameters:

  1. Model class name (String): name of the <class>iKnow.TextTransformation.Definition</class> class containing the TT model definition. This parameter is required.
  2. Section headers to index (String, default = ""): comma-separated list of section headers whose contents are to be indexed. Leaving this parameter blank (default) causes all sections to be indexed. Header names are case-insensitive.
  3. Include headers in sections (Boolean, default = 0): whether to index the section header along with the section contents. Setting this value to 1 ensures that section contents are always prepended with the header.
  4. Keys to extract for metadata (String, default = ""): comma-separated list of keys extracted by the model that should be saved as metadata values. Leaving this parameter blank (default) results in no key-value pairs being saved as metadata. Key names are case-insensitive.
  5. Metadata field names (String, default = ""): comma-separated list of metadata field names corresponding to the key names in the fourth parameter. If left blank, the key names themselves are assumed to be valid metadata field names.
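
The parameter rules above can be modelled in a few lines. The following is a minimal Python sketch of one plausible reading of those rules, not the class's actual ObjectScript implementation. The names `interpret_params`, `ConverterParams`, `split_list` and the model class name `Demo.TT.Model` are hypothetical, and rejecting a key list and field list of different lengths is an assumption the reference does not state.

```python
from dataclasses import dataclass, field


def split_list(value):
    """Split a comma-separated string, trimming blanks and dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


@dataclass
class ConverterParams:
    model_class: str
    sections: set = field(default_factory=set)        # lower-cased; empty = index all
    include_headers: bool = False
    key_to_field: dict = field(default_factory=dict)  # lower-cased key -> field name

    def indexes_section(self, header):
        # Header names are case-insensitive; an empty selection means "all sections".
        return not self.sections or header.lower() in self.sections


def interpret_params(model_class, sections="", include_headers=0, keys="", fields=""):
    if not model_class:
        raise ValueError("parameter 1 (model class name) is required")
    key_names = split_list(keys)
    # Parameter 5 left blank: the key names double as metadata field names.
    field_names = split_list(fields) or key_names
    if len(field_names) != len(key_names):
        # Assumption: a length mismatch is treated as an error in this sketch.
        raise ValueError("parameters 4 and 5 must list the same number of items")
    return ConverterParams(
        model_class=model_class,
        sections={s.lower() for s in split_list(sections)},
        include_headers=bool(int(include_headers)),
        key_to_field={k.lower(): f for k, f in zip(key_names, field_names)},
    )
```

For example, `interpret_params("Demo.TT.Model", "Summary, Findings", 1, "Author,Date", "DocAuthor,DocDate")` selects two sections regardless of header casing and maps the `Author` key to the `DocAuthor` field.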

Member Function Documentation

◆ BufferString()

_.Library.Status BufferString ( _.Library.String  data)

This method takes the raw input text and buffers it internally in the converter. The text is provided in chunks of 32k. Every custom converter will need to implement this method so that it can take in the raw data.

Reimplemented from Converter.
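
The chunked hand-off described for this method can be illustrated with a small Python model. This is a sketch of the calling pattern only, not the ObjectScript implementation. The class `BufferingConverter`, the helper `feed`, and the value `CHUNK_SIZE = 32000` are assumptions: the reference says only "32k" and does not give the exact limit.

```python
CHUNK_SIZE = 32000  # assumption: the reference only says "32k"


class BufferingConverter:
    """Hypothetical stand-in that accumulates raw input chunk by chunk."""

    def __init__(self):
        self._buffer = []

    def buffer_string(self, data):
        # Store each chunk as it arrives; no parsing happens at this stage.
        self._buffer.append(data)
        return True

    def buffered_text(self):
        return "".join(self._buffer)


def feed(converter, text, chunk_size=CHUNK_SIZE):
    """Hand the text to the converter in chunks; return the number of calls made."""
    calls = 0
    for start in range(0, len(text), chunk_size):
        converter.buffer_string(text[start:start + chunk_size])
        calls += 1
    return calls
```

With these assumed values, a 70,000-character input results in three calls, and the buffered text equals the original input once all chunks are joined.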

◆ Convert()

_.Library.Status Convert ( )

This method is called after all data has been buffered. In this method the converter will need to parse the raw data and extract/convert it into plain text data. If any metadata is present within the document, the converter can extract that metadata here and provide it to the system. Metadata can be reported by using the <method>SetCurrentMetadataValues</method> function.

Reimplemented from Converter.
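
For this converter, the conversion step amounts to selecting sections and concatenating them into the text to be indexed. The Python sketch below illustrates that idea under stated assumptions: the function `build_index_text` is hypothetical, sections are represented as `(header, body)` pairs, a header is prepended on its own line, and sections are joined by a blank line. The reference does not specify any of these formatting details.

```python
def build_index_text(sections, wanted=(), include_headers=False):
    """Concatenate the bodies of the selected sections.

    sections        -- iterable of (header, body) pairs, in document order
    wanted          -- headers to keep (case-insensitive); empty keeps all
    include_headers -- prepend each body with its header when True
    """
    wanted_lower = {name.lower() for name in wanted}
    parts = []
    for header, body in sections:
        if wanted_lower and header.lower() not in wanted_lower:
            continue  # section not selected for indexing
        parts.append(f"{header}\n{body}" if include_headers else body)
    # Assumption: a blank line separates sections in the output.
    return "\n\n".join(parts)
```

Passing an empty `wanted` list mirrors the documented default of indexing every section.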

◆ GetMetadataKeys()

_.Library.List GetMetadataKeys ( _.Library.String  params)
static

If the Converter extracts metadata, this method should return a list of keys of the metadata fields that are extracted from the contents. The values will be exposed in the <method>Convert</method> method in the same order as they are reported here.

Reimplemented from Converter.
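
The ordering contract means keys and values are matched by position. A minimal Python sketch of that pairing follows; the helper `pair_metadata` is hypothetical, and raising an error on a length mismatch is an assumption made for the sketch.

```python
def pair_metadata(keys, values):
    """Match reported keys with extracted values by position."""
    if len(keys) != len(values):
        # Assumption: positional pairing only makes sense for equal lengths.
        raise ValueError("expected one value per reported key")
    return dict(zip(keys, values))
```

Because the pairing is positional, reordering the key list without reordering the values would silently attach values to the wrong fields.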

◆ NextConvertedPart()

_.Library.String NextConvertedPart ( )

When conversion is done, this method will be called to fetch the converted data back from the converter. The method should return the converted text in chunks of at most 32k in size. When no more data is available, the method should return the empty string ("") to signal that all data has been transferred.

Reimplemented from Converter.
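
The fetch protocol described here is a simple drain loop: keep calling until the empty string comes back. The Python model below sketches both sides of that contract. `ChunkedOutput` and `drain` are hypothetical names, and the default chunk size of 32,000 characters is an assumption standing in for "32k".

```python
class ChunkedOutput:
    """Hypothetical stand-in that serves converted text in bounded chunks."""

    def __init__(self, text, chunk_size=32000):
        self._text = text
        self._chunk_size = chunk_size
        self._pos = 0

    def next_converted_part(self):
        # Returns "" once the read position reaches the end of the text.
        part = self._text[self._pos:self._pos + self._chunk_size]
        self._pos += len(part)
        return part


def drain(source):
    """Collect parts until the empty string signals the end of the data."""
    parts = []
    while True:
        part = source.next_converted_part()
        if part == "":
            break
        parts.append(part)
    return parts
```

One consequence of using the empty string as the end marker is that a part can never legitimately be empty, so a caller following this pattern stops at the first empty result.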

Member Data Documentation

◆ __Buffer

__Buffer
private

◆ __OutputText

__OutputText
private