This class provides an implementation of the Partitioning Around Medoids (PAM) algorithm, a.k.a. K-Medoids. More...
Public Member Functions | |
_.Library.Double | ClusterCost (_.Library.Integer k) |
This class provides an implementation of the Partitioning Around Medoids (PAM) algorithm, a.k.a. K-Medoids. More... | |
_.Library.Boolean | IsPrepared () |
Checks whether the model is ready for an analysis to be executed. More... | |
_.Library.Integer | ById (_.Library.RawString id) |
Returns the ordinal number of the point with the given ID id. More... | |
_.Library.Double | Distance (_.Library.Integer i, _.Library.Integer j, _.Library.Double p, _.Library.Boolean normalize) |
Returns the dissimilarity measure between two data points of the model. More... | |
_.Library.Double | Distance1 (_.Library.Integer i, z, _.Library.Double p, _.Library.Boolean normalize) |
Returns the dissimilarity measure between a data point of the model and a point with given coordinates. More... | |
_.Library.Double | Distance12 (z1, z2, _.Library.Double p, _.Library.Boolean normalize) |
Returns the dissimilarity measure between two points with given coordinates. More... | |
_.DeepSee.extensions.clusters.ASW | GetASWIndex () |
Returns an object that can calculate an index used in Cluster Validation. More... | |
_.DeepSee.extensions.clusters.CalinskiHarabasz | GetCalinskiHarabaszIndex (_.Library.Integer normalize) |
Returns an object that can calculate an index used in Cluster Validation. More... | |
GetCentroid (_.Library.Integer k, z) | |
Returns the coordinates for the centroid for a given cluster. More... | |
_.Library.Integer | GetCluster (_.Library.Integer point) |
Returns the cluster ordinal for a given point. More... | |
GetClusterSize (_.Library.Integer k) | |
Returns the number of data points assigned to a given cluster. More... | |
_.Library.Integer | GetCost (_.Library.Integer i, _.Library.Integer j) |
Returns the dissimilarity measure as used by this clustering algorithm. More... | |
_.Library.Integer | GetCount () |
Returns the number of all data points in the model. | |
_.Library.Integer | GetDimensions () |
Returns the dimensionality of the model. | |
_.Library.String | GetId (_.Library.Integer i) |
Returns the unique ID of the point with the ordinal number specified by i. More... | |
_.Library.Integer | GetNumberOfClusters () |
Returns the number of clusters in the model. | |
_.DeepSee.extensions.clusters.PearsonGamma | GetPearsonGammaIndex () |
Returns an object that can calculate an index used in Cluster Validation. More... | |
GlobalCentroid (z) | |
Returns the coordinates for the centroid for the whole dataset. More... | |
_.Library.Double | RelativeClusterCost (_.Library.Integer k, _.Library.Integer m) |
Returns the cost of a given cluster relative to a medoid point m. More... | |
Reset () | |
Kills all the data associated with this model. | |
_.Library.Status | SetData (_.Library.IResultSet rs, _.Library.Integer dim, _.Library.Double nullReplacement) |
Sets the data to be associated with this model. More... | |
iterateCluster (_.Library.Integer k, _.Library.Integer i, _.Library.String id, coordinates) | |
Iterates over all the data points assigned to a given cluster. More... | |
printAll () | |
Convenience method. More... | |
printCluster (_.Library.Integer k) | |
Convenience method. More... | |
_.Library.Status | OnAddToSaveSet (_.Library.Integer depth, _.Library.Integer insert, _.Library.Integer callcount) |
This callback method is invoked when the current object is added to the SaveSet. More... | |
_.Library.Status | OnClose () |
This callback method is invoked by the <METHOD>Close</METHOD> method to. More... | |
_.Library.Status | OnConstructClone (_.Library.RegisteredObject object, _.Library.Boolean deep, _.Library.String cloned) |
This callback method is invoked by the <METHOD>ConstructClone</METHOD> method to. More... | |
_.Library.Status | OnNew () |
This callback method is invoked by the <METHOD>New</METHOD> method to. More... | |
_.Library.Status | OnValidateObject () |
This callback method is invoked by the <METHOD>ValidateObject</METHOD> method to. More... | |
Public Attributes | |
K | |
The number of clusters to create. More... | |
DSName | |
More... | |
Dim | |
More... | |
Normalize | |
Whether to normalize distance across multiple dimensions. More... | |
P | |
The power to use in calculation of dissimilarity. More... | |
Verbose | |
More... | |
Additional Inherited Members | |
_.Library.Status | Delete (_.Library.String dataset) |
Deletes a model for a dataset with the name given by dataset argument. | |
_.Library.Boolean | Exists (_.Library.String dataset) |
Checks whether a model for a dataset with the name given by dataset argument already exists. | |
CAPTION = None | |
Optional name used by the Form Wizard for a class when generating forms. More... | |
JAVATYPE = None | |
The Java type to be used when exported. | |
PROPERTYVALIDATION = None | |
This parameter controls the default validation behavior for the object. More... | |
This class provides an implementation of the Partitioning Around Medoids (PAM) algorithm, a.k.a.
K-Medoids (not to be confused with K-Means).
The PAM algorithm was developed by Leonard Kaufman and Peter J. Rousseeuw. It is very similar to K-means: both are partitional algorithms, that is, both break the dataset into groups, and both work by trying to minimize an error measure. The difference is that PAM works with medoids, which are actual members of the dataset that represent the group they belong to, whereas K-means works with centroids, which are artificially created points that represent their clusters.
The PAM algorithm partitions a dataset of n objects into k clusters, where both the dataset and the number k are inputs of the algorithm. It works with a dissimilarity matrix, and its goal is to minimize the overall dissimilarity between the representative of each cluster and that cluster's members.
The pure PAM algorithm works reliably only when a dataset is naturally well partitioned. It first generates a random solution and then uses steepest descent to optimize it, so it is prone to getting stuck in a local minimum. Two modifications, implemented by the subclasses <CLASS>PAMSA</CLASS> (PAM with Simulated Annealing) and <CLASS>CLARA</CLASS> (Clustering for Large Applications), try to alleviate this deficiency.
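The random start followed by steepest-descent refinement can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of this class: the function names `minkowski`, `total_cost`, and `pam`, the choice of a Minkowski distance of order `p` as the dissimilarity, and the fixed `seed` are all hypothetical and chosen only for the example.

```python
import random


def minkowski(a, b, p=2.0):
    """Minkowski distance of order p (an assumed dissimilarity measure)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def total_cost(dist, medoids):
    """Sum over all points of the dissimilarity to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def pam(points, k, p=2.0, seed=0):
    """Illustrative k-medoids: random start, then steepest-descent swaps."""
    n = len(points)
    # Precompute the full dissimilarity matrix once.
    dist = [[minkowski(points[i], points[j], p) for j in range(n)]
            for i in range(n)]
    rng = random.Random(seed)
    medoids = rng.sample(range(n), k)  # random initial solution
    cost = total_cost(dist, medoids)
    while True:
        best_cost, best_swap = cost, None
        # Evaluate every (medoid, non-medoid) swap and keep the best one.
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                c = total_cost(dist, trial)
                if c < best_cost:
                    best_cost, best_swap = c, (pos, cand)
        if best_swap is None:
            break  # no swap lowers the cost: a (possibly local) minimum
        medoids[best_swap[0]] = best_swap[1]
        cost = best_cost
    # Assign each point to the cluster of its nearest medoid.
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]])
              for i in range(n)]
    return medoids, labels, cost


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels, cost = pam(points, k=2)
print(sorted(medoids), labels, cost)
```

Because each iteration accepts only the single best improving swap, the search stops at the first configuration that no swap can improve, which is why the result depends on the random start when the data is not cleanly separated.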
_.Library.Double ClusterCost | ( | _.Library.Integer | k | ) |
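This page does not define how the cost of a cluster is computed. As a hedged illustration only, a common convention in k-medoids is to take the sum of dissimilarities between the medoid and every member of its cluster. The sketch below assumes that convention; `cluster_cost` and `best_medoid` are hypothetical helper names, not methods of this class, and the small matrix is made-up example data.

```python
def cluster_cost(dist, members, medoid):
    """Sum of dissimilarities between a candidate medoid and the members.

    Assumption: cost is the plain sum of pairwise dissimilarities.
    """
    return sum(dist[i][medoid] for i in members)


def best_medoid(dist, members):
    """Member of the cluster that minimizes the cluster cost."""
    return min(members, key=lambda m: cluster_cost(dist, members, m))


# Symmetric dissimilarity matrix for three points (example data).
dist = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 3.0],
    [4.0, 3.0, 0.0],
]
members = [0, 1, 2]
print(cluster_cost(dist, members, 0))  # cost with point 0 as the medoid
print(best_medoid(dist, members))      # member with the lowest cost
```

Evaluating the cost against an arbitrary candidate point, rather than only the current medoid, is what lets a swap-based search compare alternatives before committing to one.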
_.Library.Boolean IsPrepared | ( | ) |
Checks whether the model is ready for an analysis to be executed.
Readiness depends on the specific algorithm, so this method is overridden by subclasses.
Reimplemented from AbstractModel.
Reimplemented in CLARA.
K |
The number of clusters to create.