findSvcModel {svcR}    R Documentation

Computation of clustering model by support vector machine

Description

svcR implements a clustering algorithm based on searching for a separator in a feature space between points described in a data space. The data format is an attribute/value table (matrix). A kernel maps the data into a feature space, where they form a single cluster bounded by a ball defined by its radius and its support vectors. The radius of this ball can then be used in the data space to reconstruct the boundary, which now takes the shape of several clusters.
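
As a rough illustration of the idea above, the following R sketch computes a Gaussian kernel matrix and the squared feature-space distance from each point to the centre of a ball. It is not the svcR implementation: the helper names gauss_kernel and dist2_to_centre are hypothetical, the kernel form exp(-q * ||x - y||^2) is assumed from the usual support vector clustering formulation, and the coefficients beta are a uniform placeholder rather than optimised values.

```r
# Hypothetical helpers for illustration only; they are not part of svcR.

# Gaussian kernel matrix between the rows of X and the rows of Y:
# K[i, j] = exp(-q * ||X[i, ] - Y[j, ]||^2)   (assumed kernel form)
gauss_kernel <- function(X, Y, q) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  d2 <- outer(rowSums(X^2), rowSums(Y^2), "+") - 2 * X %*% t(Y)
  exp(-q * pmax(d2, 0))  # pmax() guards against tiny negative rounding errors
}

# Squared feature-space distance from each row of P to the centre
# a = sum_j beta[j] * phi(X[j, ]), with sum(beta) == 1:
# R^2(p) = K(p, p) - 2 * sum_j beta[j] * K(X[j, ], p)
#          + sum_ij beta[i] * beta[j] * K(X[i, ], X[j, ])
dist2_to_centre <- function(P, X, beta, q) {
  Kpx <- gauss_kernel(P, X, q)
  Kxx <- gauss_kernel(X, X, q)
  const <- as.numeric(t(beta) %*% Kxx %*% beta)
  # K(p, p) = 1 for the Gaussian kernel
  1 - 2 * as.numeric(Kpx %*% beta) + const
}

X <- as.matrix(iris[, 1:4])
beta <- rep(1 / nrow(X), nrow(X))  # placeholder: uniform, NOT optimised
r2 <- dist2_to_centre(X, X, beta, q = 1)
range(r2)
```

In an actual support vector clustering model the coefficients would come from the optimisation step, and the points whose distance equals the ball radius would be the support vectors; the uniform beta here only shows how the distance is evaluated.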

Usage

## S4 method for signature 'list':
findSvcModel( x=iris, MetOpt="optimStoch", MetLab="gridLabeling", KernChoice="KernGaussian", Nu=0.8, q=20, K=1, G=10, Cx=1, Cy=2 )

## S4 method for signature 'list':
findSvcModel.loadMat( x=iris )

## S4 method for signature 'matrix':
findSvcModel.Eval( x=matrix() )

## S4 method for signature 'numeric':
findSvcModel.Test()

## S4 method for signature 'findSvcModel':
getNumPoints( object=new("findSvcModel") )

## S4 method for signature 'findSvcModel':
getClassPoints( object=new("findSvcModel") )

## S4 method for signature 'findSvcModel':
getMatriceK( object=new("findSvcModel") )

## S4 method for signature 'findSvcModel':
getlagrangeCoeff( object=new("findSvcModel") )

Arguments

x data set: the dataFrame parameter in standard use; dataFrame in chargeMatrix use; DatMat in Eval use, a matrix given as the only argument
MetOpt optimization method: "optimStoch" (stochastic optimization) or "optimQuad" (quadratic optimization)
MetLab labeling method: "gridLabeling" (grid labeling), "mstLabeling" (MST labeling) or "knnLabeling" (k-NN labeling)
KernChoice kernel: "KernLinear" (Euclidean), "KernGaussian" (RBF), "KernGaussianDist" (exponential) or "KernDist" (matrix data used as kernel values)
Nu nu value of the SVC model
q kernel parameter
K number of neighbours on the grid
G size of the grid
Cx 1st data coordinate to plot for 2D cluster extraction
Cy 2nd data coordinate to plot for 2D cluster extraction
object a findSvcModel object
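
To picture how Cx, Cy and G relate to the data, the sketch below selects two plotting coordinates from the iris measurements, either as explicit columns (as with Cx = 1, Cy = 2) or as the first two principal component scores (the Examples section comments that Cx = Cy = 0 uses principal component analysis factors), and lays a G x G grid over their range. This is an assumption about the general idea only: it relies on base R prcomp() and expand.grid(), and svcR may centre, scale or grid the data differently.

```r
# Illustration with base R only; not the svcR internals.
X <- as.matrix(iris[, 1:4])

# Explicit data columns, analogous to Cx = 1, Cy = 2
xy_cols <- X[, c(1, 2)]

# First two principal component scores
# (assumed analogue of Cx = Cy = 0; scaling choice is an assumption)
pca <- prcomp(X, center = TRUE, scale. = FALSE)
xy_pca <- pca$x[, 1:2]

# G x G grid spanning the range of the two chosen coordinates
# (assumed meaning of "size of the grid")
G <- 10
grid <- expand.grid(
  x = seq(min(xy_pca[, 1]), max(xy_pca[, 1]), length.out = G),
  y = seq(min(xy_pca[, 2]), max(xy_pca[, 2]), length.out = G)
)
nrow(grid)  # G * G grid points
```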

Details

The main function of the package is findSvcModel. It takes a data set as its first argument. This data set can be either (1) a data.frame() structure, or (2) text files stored on disk.

In case (1), the data.frame is in the standard list format (see the iris data).

In case (2), the format of ‘dataName_mat.txt’ (data matrix) is: 1 1 5.1 1 2 3.5 2 3 1.4, which means mat[1, 1] = 5.1, mat[1, 2] = 3.5, mat[2, 3] = 1.4. Each triplet gives a row index, a column index and a value.

The format of ‘dataName_att.txt’ is: X1 X2, which means X1 is the name of the first column of the data matrix and X2 is the name of the second column.

The format of ‘dataName_var.txt’ is: v1 v2, which means v1 is the name of the first row of the data matrix and v2 is the name of the second row.
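
The triplet layout described above can be turned back into a dense matrix with a few lines of base R. The sketch below is only an illustration: read_triplet_matrix is a hypothetical helper, not a function of svcR, and it assumes whitespace-separated values, with cells that are not listed left as NA.

```r
# Hypothetical helper: rebuild a dense matrix from "row column value" triplets.
read_triplet_matrix <- function(mat_file, att_file = NULL, var_file = NULL) {
  # scan() reads whitespace-separated numbers regardless of line breaks
  trip <- matrix(scan(mat_file, quiet = TRUE), ncol = 3, byrow = TRUE)
  m <- matrix(NA_real_, nrow = max(trip[, 1]), ncol = max(trip[, 2]))
  # a two-column index matrix addresses the (row, column) cells directly
  m[trip[, 1:2, drop = FALSE]] <- trip[, 3]
  # optional name files: one name per column / per row
  if (!is.null(att_file)) colnames(m) <- scan(att_file, what = "", quiet = TRUE)
  if (!is.null(var_file)) rownames(m) <- scan(var_file, what = "", quiet = TRUE)
  m
}

# Write the triplets from the text above to a temporary file and read them back
mat_file <- tempfile(fileext = "_mat.txt")
writeLines(c("1 1 5.1", "1 2 3.5", "2 3 1.4"), mat_file)
mat <- read_triplet_matrix(mat_file)
mat
```

The name files are optional in this sketch; when they are supplied, the number of names has to match the number of columns or rows of the rebuilt matrix.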

For the labeling parameter (MetLab), three choices are available: "gridLabeling", "mstLabeling" and "knnLabeling".

For the kernel parameter (KernChoice), four choices are available: "KernLinear", "KernGaussian", "KernGaussianDist" and "KernDist".

Value

An S4 object of class findSvcModel. The object is the SVC model, with the following slots:

lagrangeCoeff Lagrange coefficients: getlagrangeCoeff$A
Matrice variable names Matrice$var, attribute names Matrice$Att and data Matrice$Mat
MatriceK kernel matrix
Data data matrix
MinMaxXY min and max values for the first and second coordinates
MisClass misclassified points
dataFrame prefix name of the data, used for decoding files
ClassPoints class of grid points
Cx x column id of the data matrix
Cy y column id of the data matrix
Nu nu value of the SVC model
KNN k-NN value for labeling
SizeGrid grid size for labeling
AroundNullVA near-zero value for Lagrange coefficient estimation
NumPoints value of grid points


Slots can be accessed with getlagrangeCoeff(object), getMatriceK(object), getClassPoints(object) and getNumPoints(object).

Author(s)

Nicolas Turenne - INRA France nicolas.turenne@jouy.inra.fr

References

N. Turenne, Some Heuristics to speed-up Support Vector Clustering, technical report, 2006, INRA, France. http://migale.jouy.inra.fr/~turenne/svc.pdf

Examples


## example with iris data

library(svcR)

MetOpt     = "optimStoch";      #  stochastic optimization method
MetLab     = "gridLabeling";    #  grid labeling
KernChoice = "KernGaussian";    #  Gaussian (RBF) kernel
Nu         = 1.0;
q          = 2000;   # yields a lot of clusters
K          = 1;      # only 1 nearest neighbour for clustering
Cx = Cy    = 0;      # use principal component analysis factors
G          = 20;     # size of the grid for cluster labeling

# usage example with a data frame
data(iris);
fmc = findSvcModel( iris, MetOpt, MetLab, KernChoice, Nu, q, K, G, Cx, Cy);
plot(fmc);


[Package svcR version 1.6.3 Index]