weka.filters.unsupervised.attribute
Class PrincipalComponents

java.lang.Object
  extended by weka.filters.Filter
      extended by weka.filters.unsupervised.attribute.PrincipalComponents
All Implemented Interfaces:
java.io.Serializable, CapabilitiesHandler, OptionHandler, RevisionHandler, UnsupervisedFilter

public class PrincipalComponents
extends Filter
implements OptionHandler, UnsupervisedFilter

Performs a principal components analysis and transformation of the data.
Dimensionality reduction is accomplished by retaining enough eigenvectors to account for a given proportion of the variance in the original data (default: 0.95, i.e. 95%).
Based on code of the attribute selection scheme 'PrincipalComponents' by Mark Hall and Gabi Schmidberger.
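
The retention rule described above can be illustrated with a short, self-contained sketch. It is not the filter's source code: the class name VarianceCoverageSketch, the method componentsToRetain, and the exact stopping rule (stop as soon as the cumulative eigenvalue sum reaches the requested proportion of the total, subject to an optional cap) are assumptions made for illustration.

```java
public class VarianceCoverageSketch {

    /**
     * Returns how many leading components are needed so that their eigenvalues
     * sum to at least `coverage` of the total variance.
     * Eigenvalues must be sorted in descending order.
     * A non-positive maxComponents means "no cap" (mirrors the -1 convention of -M).
     */
    public static int componentsToRetain(double[] sortedEigenvalues,
                                         double coverage,
                                         int maxComponents) {
        double total = 0.0;
        for (double ev : sortedEigenvalues) {
            total += ev;
        }
        int limit = (maxComponents > 0)
                ? Math.min(maxComponents, sortedEigenvalues.length)
                : sortedEigenvalues.length;

        double cumulative = 0.0;
        int kept = 0;
        // Keep adding components until the target share of variance is reached
        // or the cap is hit.
        while (kept < limit && cumulative < coverage * total) {
            cumulative += sortedEigenvalues[kept];
            kept++;
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] eigenvalues = {6.0, 3.0, 0.8, 0.2};
        System.out.println(componentsToRetain(eigenvalues, 0.95, -1)); // prints 3
        System.out.println(componentsToRetain(eigenvalues, 0.95, 2));  // prints 2
    }
}
```

In this example the first two components cover 90% of the variance, so a third is needed to reach 95% unless the cap limits the result to two.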

Valid options are:

 -D
  Don't normalize input data.
 -R <num>
  Retain enough PC attributes to account
  for this proportion of variance in the original data.
  (default: 0.95)
 -A <num>
  Maximum number of attributes to include in 
  transformed attribute names.
  (-1 = include all, default: 5)
 -M <num>
  Maximum number of PC attributes to retain.
  (-1 = include all, default: -1)

Version:
$Revision: 5543 $
Author:
Mark Hall (mhall@cs.waikato.ac.nz) -- attribute selection code, Gabi Schmidberger (gabi@cs.waikato.ac.nz) -- attribute selection code, fracpete (fracpete at waikato dot ac dot nz) -- filter code
See Also:
Serialized Form

Constructor Summary
PrincipalComponents()
           
 
Method Summary
 boolean batchFinished()
          Signify that this batch of input to the filter is finished.
 Capabilities getCapabilities()
          Returns the capabilities of this filter.
 int getMaximumAttributeNames()
          Gets maximum number of attributes to include in transformed attribute names.
 int getMaximumAttributes()
          Gets maximum number of PC attributes to retain.
 boolean getNormalize()
          Gets whether or not input data is to be normalized.
 java.lang.String[] getOptions()
          Gets the current settings of the filter.
 java.lang.String getRevision()
          Returns the revision string.
 double getVarianceCovered()
          Gets the proportion of total variance to account for when retaining principal components.
 java.lang.String globalInfo()
          Returns a string describing this filter.
 boolean input(Instance instance)
          Input an instance for filtering.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
          Main method for running this filter.
 java.lang.String maximumAttributeNamesTipText()
          Returns the tip text for this property.
 java.lang.String maximumAttributesTipText()
          Returns the tip text for this property.
 java.lang.String normalizeTipText()
          Returns the tip text for this property.
 boolean setInputFormat(Instances instanceInfo)
          Sets the format of the input instances.
 void setMaximumAttributeNames(int value)
          Sets maximum number of attributes to include in transformed attribute names.
 void setMaximumAttributes(int value)
          Sets maximum number of PC attributes to retain.
 void setNormalize(boolean value)
          Sets whether input data will be normalized.
 void setOptions(java.lang.String[] options)
          Parses a list of options for this object.
 void setVarianceCovered(double value)
          Sets the proportion of total variance to account for when retaining principal components.
 java.lang.String varianceCoveredTipText()
          Returns the tip text for this property.
 
Methods inherited from class weka.filters.Filter
batchFilterFile, filterFile, getCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, numPendingOutput, output, outputPeek, toString, useFilter, wekaStaticWrapper
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

PrincipalComponents

public PrincipalComponents()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this filter.

Returns:
a description of the filter suitable for displaying in the Explorer/Experimenter GUI

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a list of options for this object.

Valid options are:

 -D
  Don't normalize input data.
 -R <num>
  Retain enough PC attributes to account
  for this proportion of variance in the original data.
  (default: 0.95)
 -A <num>
  Maximum number of attributes to include in 
  transformed attribute names.
  (-1 = include all, default: 5)
 -M <num>
  Maximum number of PC attributes to retain.
  (-1 = include all, default: -1)

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
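
To make the option syntax concrete, the following sketch parses the four flags listed above into plain fields, using the documented defaults. It is a hypothetical stand-in, not the filter's parser: the class PcaOptionsSketch, its fields, and the helper value are invented for illustration, and the real setOptions may handle edge cases differently.

```java
public class PcaOptionsSketch {
    // Defaults taken from the option list above.
    public boolean normalize = true;       // -D switches this off
    public double varianceCovered = 0.95;  // -R <num>
    public int maxAttributeNames = 5;      // -A <num>
    public int maxAttributes = -1;         // -M <num>

    /** Returns the value following a flag, or fails if the array ends early. */
    private static String value(String[] options, int i) {
        if (i >= options.length) {
            throw new IllegalArgumentException("Missing value for " + options[i - 1]);
        }
        return options[i];
    }

    /** Hypothetical parser: reads -D, -R, -A and -M from an options array. */
    public static PcaOptionsSketch parse(String[] options) {
        PcaOptionsSketch s = new PcaOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-D":
                    s.normalize = false;
                    break;
                case "-R":
                    s.varianceCovered = Double.parseDouble(value(options, ++i));
                    break;
                case "-A":
                    s.maxAttributeNames = Integer.parseInt(value(options, ++i));
                    break;
                case "-M":
                    s.maxAttributes = Integer.parseInt(value(options, ++i));
                    break;
                default:
                    // Mirrors the documented behaviour of rejecting unsupported options.
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        PcaOptionsSketch s = parse(new String[] {"-D", "-R", "0.9", "-M", "3"});
        System.out.println(s.normalize + " " + s.varianceCovered + " "
                + s.maxAttributeNames + " " + s.maxAttributes);
    }
}
```

With the filter itself, an array of the same shape (for example {"-R", "0.9", "-M", "3"}) is what setOptions expects as its argument.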

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the filter.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

normalizeTipText

public java.lang.String normalizeTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setNormalize

public void setNormalize(boolean value)
Sets whether input data will be normalized.

Parameters:
value - true if input data is to be normalized

getNormalize

public boolean getNormalize()
Gets whether or not input data is to be normalized.

Returns:
true if input data is to be normalized

varianceCoveredTipText

public java.lang.String varianceCoveredTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setVarianceCovered

public void setVarianceCovered(double value)
Sets the proportion of total variance to account for when retaining principal components.

Parameters:
value - the proportion of total variance to account for

getVarianceCovered

public double getVarianceCovered()
Gets the proportion of total variance to account for when retaining principal components.

Returns:
the proportion of variance to account for

maximumAttributeNamesTipText

public java.lang.String maximumAttributeNamesTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setMaximumAttributeNames

public void setMaximumAttributeNames(int value)
Sets maximum number of attributes to include in transformed attribute names.

Parameters:
value - the maximum number of attributes to include in transformed attribute names (-1 = include all)

getMaximumAttributeNames

public int getMaximumAttributeNames()
Gets maximum number of attributes to include in transformed attribute names.

Returns:
the maximum number of attributes to include in transformed attribute names
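
As an illustration of what limiting the attributes in a transformed attribute name can look like, the sketch below builds a label for one component from its coefficients. This page does not specify the naming format, so everything here is an assumption: the class TransformedNameSketch, the method label, the ordering by absolute coefficient, the three-decimal formatting, and the trailing "..." marker for omitted attributes.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Locale;

public class TransformedNameSketch {

    /**
     * Builds a label for one component, listing at most maxNames original
     * attributes, largest absolute coefficient first.
     * A non-positive maxNames (such as -1) lists all of them.
     */
    public static String label(double[] coefficients, String[] names, int maxNames) {
        Integer[] order = new Integer[coefficients.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort attribute indices by descending absolute coefficient.
        Arrays.sort(order,
                Comparator.comparingDouble((Integer i) -> -Math.abs(coefficients[i])));

        int shown = (maxNames > 0) ? Math.min(maxNames, order.length) : order.length;
        StringBuilder sb = new StringBuilder();
        for (int j = 0; j < shown; j++) {
            double c = coefficients[order[j]];
            if (j > 0 && c >= 0) {
                sb.append('+'); // negative values already carry their sign
            }
            sb.append(String.format(Locale.ROOT, "%.3f", c)).append(names[order[j]]);
        }
        if (shown < order.length) {
            sb.append("..."); // assumed marker for omitted attributes
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[] coefficients = {0.2, -0.7, 0.5};
        String[] names = {"x", "y", "z"};
        System.out.println(label(coefficients, names, 2));  // prints -0.700y+0.500z...
        System.out.println(label(coefficients, names, -1)); // prints -0.700y+0.500z+0.200x
    }
}
```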

maximumAttributesTipText

public java.lang.String maximumAttributesTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setMaximumAttributes

public void setMaximumAttributes(int value)
Sets maximum number of PC attributes to retain.

Parameters:
value - the maximum number of PC attributes to retain (-1 = include all)

getMaximumAttributes

public int getMaximumAttributes()
Gets maximum number of PC attributes to retain.

Returns:
the maximum number of PC attributes to retain

getCapabilities

public Capabilities getCapabilities()
Returns the capabilities of this filter.

Specified by:
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class Filter
Returns:
the capabilities of this filter
See Also:
Capabilities

setInputFormat

public boolean setInputFormat(Instances instanceInfo)
                       throws java.lang.Exception
Sets the format of the input instances.

Overrides:
setInputFormat in class Filter
Parameters:
instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
Returns:
true if the outputFormat may be collected immediately
Throws:
java.lang.Exception - if the input format can't be set successfully

input

public boolean input(Instance instance)
              throws java.lang.Exception
Inputs an instance for filtering. This filter requires that all training instances be read before it produces any output.

Overrides:
input in class Filter
Parameters:
instance - the input instance
Returns:
true if the filtered instance may now be collected with output().
Throws:
java.lang.IllegalStateException - if no input format has been set
java.lang.Exception - if conversion fails

batchFinished

public boolean batchFinished()
                      throws java.lang.Exception
Signify that this batch of input to the filter is finished.

Overrides:
batchFinished in class Filter
Returns:
true if there are instances pending output
Throws:
java.lang.NullPointerException - if no input structure has been defined
java.lang.Exception - if there was a problem finishing the batch.
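
The contract described for input and batchFinished (all training instances are read before any output is produced) can be illustrated with a toy batch filter. The class BatchFilterSketch below is hypothetical and does not use or reproduce the Filter base class: it works on plain double arrays and only centers each column on its mean, a much simpler batch-dependent transformation than principal components analysis.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy stand-in for a batch filter; centers each column on its mean. */
public class BatchFilterSketch {
    private final List<double[]> buffered = new ArrayList<>();
    private final Queue<double[]> pending = new ArrayDeque<>();

    /** Buffers a row; returns false because no output is available yet. */
    public boolean input(double[] row) {
        buffered.add(row.clone());
        return false;
    }

    /** Computes column means over the whole batch, then queues the centered rows. */
    public boolean batchFinished() {
        if (buffered.isEmpty()) {
            return false;
        }
        int cols = buffered.get(0).length;
        double[] mean = new double[cols];
        for (double[] row : buffered) {
            for (int c = 0; c < cols; c++) {
                mean[c] += row[c] / buffered.size();
            }
        }
        for (double[] row : buffered) {
            double[] out = new double[cols];
            for (int c = 0; c < cols; c++) {
                out[c] = row[c] - mean[c];
            }
            pending.add(out);
        }
        buffered.clear();
        return !pending.isEmpty(); // true if there are rows pending output
    }

    /** Returns the next transformed row, or null when none is pending. */
    public double[] output() {
        return pending.poll();
    }

    /** Number of transformed rows waiting to be collected. */
    public int numPendingOutput() {
        return pending.size();
    }

    public static void main(String[] args) {
        BatchFilterSketch f = new BatchFilterSketch();
        f.input(new double[] {1.0, 10.0});
        f.input(new double[] {3.0, 20.0});
        f.batchFinished();
        double[] first = f.output();
        System.out.println(first[0] + " " + first[1]); // prints -1.0 -5.0
    }
}
```

The transformation depends on statistics of the whole batch, which is why nothing can be collected until batchFinished has been called.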

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class Filter
Returns:
the revision

main

public static void main(java.lang.String[] args)
Main method for running this filter.

Parameters:
args - should contain arguments to the filter: use -h for help