public class Bagging extends RandomizableParallelIteratedSingleClassifierEnhancer implements WeightedInstancesHandler, AdditionalMeasureProducer, TechnicalInformationHandler, PartitionGenerator, Aggregateable<Bagging>
@article{Breiman1996,
author = {Leo Breiman},
journal = {Machine Learning},
number = {2},
pages = {123-140},
title = {Bagging predictors},
volume = {24},
year = {1996}
}
Valid options are:
-P Size of each bag, as a percentage of the training set size. (default 100)
-O Calculate the out of bag error.
-print Print the individual classifiers in the output
-store-out-of-bag-predictions Whether to store out of bag predictions in internal evaluation object.
-output-out-of-bag-complexity-statistics Whether to output complexity-based statistics when out-of-bag evaluation is performed.
-represent-copies-using-weights Represent copies of instances using weights rather than explicitly.
-S <num> Random number seed. (default 1)
-num-slots <num> Number of execution slots. (default 1 - i.e. no parallelism)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
-I Initial class value count (default 0)
-R Spread initial count over all class values (i.e. don't use 1 per value)
Options after -- are passed to the designated classifier.
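As a rough illustration of the `--` convention, the following plain-Java sketch splits an option array at the separator: everything before it would be read by Bagging, everything after it by the base classifier. The class OptionSplitSketch and its methods are hypothetical and do not use Weka's own option parsing. The option values are the defaults listed above.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: divides an option array at "--".
class OptionSplitSketch {

    // Options before "--" (these would be read by the meta classifier).
    static String[] metaOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? options.clone() : Arrays.copyOfRange(options, 0, sep);
    }

    // Options after "--" (these would be handed to the base classifier).
    static String[] baseOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0
            ? new String[0]
            : Arrays.copyOfRange(options, sep + 1, options.length);
    }

    public static void main(String[] args) {
        String[] options = {
            "-P", "100", "-I", "10", "-S", "1",
            "-W", "weka.classifiers.trees.REPTree",
            "--", "-M", "2", "-N", "3"
        };
        System.out.println("Bagging:  " + Arrays.toString(metaOptions(options)));
        System.out.println("REPTree:  " + Arrays.toString(baseOptions(options)));
    }
}
```

In this example, -P and -I before the separator have their Bagging meaning (bag size, iterations). Any -P or -I placed after the separator would carry the REPTree meaning instead.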
Inherited fields: BATCH_SIZE_DEFAULT, NUM_DECIMAL_PLACES_DEFAULT

| Constructor | Description |
|---|---|
| Bagging() | Constructor. |


| Modifier and Type | Method | Description |
|---|---|---|
| Bagging | aggregate(Bagging toAggregate) | Aggregate an object with this one |
| java.lang.String | bagSizePercentTipText() | Returns the tip text for this property |
| void | buildClassifier(Instances data) | Bagging method. |
| java.lang.String | calcOutOfBagTipText() | Returns the tip text for this property |
| double[] | distributionForInstance(Instance instance) | Calculates the class membership probabilities for the given test instance. |
| java.util.Enumeration<java.lang.String> | enumerateMeasures() | Returns an enumeration of the additional measure names. |
| void | finalizeAggregation() | Call to complete the aggregation process. |
| void | generatePartition(Instances data) | Builds the classifier to generate a partition. |
| int | getBagSizePercent() | Gets the size of each bag, as a percentage of the training set size. |
| boolean | getCalcOutOfBag() | Get whether the out of bag error is calculated. |
| double | getMeasure(java.lang.String additionalMeasureName) | Returns the value of the named measure. |
| double[] | getMembershipValues(Instance inst) | Computes an array that indicates leaf membership |
| java.lang.String[] | getOptions() | Gets the current settings of the Classifier. |
| Evaluation | getOutOfBagEvaluationObject() | Returns the out-of-bag evaluation object. |
| boolean | getOutputOutOfBagComplexityStatistics() | Gets whether complexity statistics are output when OOB estimation is performed. |
| boolean | getPrintClassifiers() | Get whether to print the individual ensemble classifiers in the output |
| boolean | getRepresentCopiesUsingWeights() | Get whether copies of instances are represented using weights rather than explicitly. |
| java.lang.String | getRevision() | Returns the revision string. |
| boolean | getStoreOutOfBagPredictions() | Get whether the out of bag predictions are stored. |
| TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
| java.lang.String | globalInfo() | Returns a string describing classifier |
| java.util.Enumeration<Option> | listOptions() | Returns an enumeration describing the available options. |
| static void | main(java.lang.String[] argv) | Main method for testing this class. |
| double | measureOutOfBagError() | Gets the out of bag error that was calculated as the classifier was built. |
| int | numElements() | Returns the number of elements in the partition. |
| java.lang.String | outputOutOfBagComplexityStatisticsTipText() | Returns the tip text for this property |
| java.lang.String | printClassifiersTipText() | Returns the tip text for this property |
| java.lang.String | representCopiesUsingWeightsTipText() | Returns the tip text for this property |
| void | setBagSizePercent(int newBagSizePercent) | Sets the size of each bag, as a percentage of the training set size. |
| void | setCalcOutOfBag(boolean calcOutOfBag) | Set whether the out of bag error is calculated. |
| void | setOptions(java.lang.String[] options) | Parses a given list of options. |
| void | setOutputOutOfBagComplexityStatistics(boolean b) | Sets whether complexity statistics are output when OOB estimation is performed. |
| void | setPrintClassifiers(boolean print) | Set whether to print the individual ensemble classifiers in the output |
| void | setRepresentCopiesUsingWeights(boolean representUsingWeights) | Set whether copies of instances are represented using weights rather than explicitly. |
| void | setStoreOutOfBagPredictions(boolean storeOutOfBag) | Set whether the out of bag predictions are stored. |
| java.lang.String | storeOutOfBagPredictionsTipText() | Returns the tip text for this property |
| java.lang.String | toString() | Returns description of the bagged classifier. |

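The summary lists distributionForInstance(Instance) as the method that returns class membership probabilities. For a bagged ensemble, one common way to obtain them is to sum the per-class distributions of the base models and normalize the result. The sketch below illustrates that idea in plain Java. It is an assumption about the general technique, not a copy of Weka's implementation, and the class DistributionAveragingSketch is hypothetical.

```java
// Hypothetical illustration, not Weka code: combines the class
// distributions produced by the members of an ensemble.
class DistributionAveragingSketch {

    // memberDistributions[m][c] is the probability of class c
    // according to base model m.
    static double[] combine(double[][] memberDistributions) {
        int numClasses = memberDistributions[0].length;
        double[] sums = new double[numClasses];
        for (double[] dist : memberDistributions) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] += dist[c];
            }
        }
        // Normalize so that the combined values sum to 1.
        double total = 0.0;
        for (double s : sums) {
            total += s;
        }
        if (total > 0.0) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] /= total;
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        double[][] members = { {0.8, 0.2}, {0.6, 0.4} };
        System.out.println(java.util.Arrays.toString(combine(members)));
    }
}
```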
Inherited methods, grouped by declaring type:
getSeed, seedTipText, setSeed
getNumExecutionSlots, numExecutionSlotsTipText, setNumExecutionSlots
getNumIterations, numIterationsTipText, setNumIterations
classifierTipText, getCapabilities, getClassifier, postExecution, preExecution, setClassifier
batchSizeTipText, classifyInstance, debugTipText, distributionsForInstances, doNotCheckCapabilitiesTipText, forName, getBatchSize, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, implementsMoreEfficientBatchPrediction, makeCopies, makeCopy, numDecimalPlacesTipText, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
getCapabilities

public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler
public java.util.Enumeration<Option> listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class RandomizableParallelIteratedSingleClassifierEnhancer
public void setOptions(java.lang.String[] options)
throws java.lang.Exception
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class RandomizableParallelIteratedSingleClassifierEnhancer
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class RandomizableParallelIteratedSingleClassifierEnhancer
public java.lang.String bagSizePercentTipText()
public int getBagSizePercent()
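To make the bag size percentage concrete: with a training set of N instances and a percentage p, each bag holds roughly N * p / 100 instances, drawn with replacement. The following plain-Java sketch works on instance indices only. The class BootstrapSketch is hypothetical, and the rounding (integer division) is an assumption rather than something this page specifies.

```java
import java.util.Random;

// Hypothetical illustration, not Weka code: draws one bootstrap bag
// of instance indices for a given bag size percentage.
class BootstrapSketch {

    // Number of instances in a bag (assumed rounding: integer division).
    static int bagSize(int numInstances, int bagSizePercent) {
        return numInstances * bagSizePercent / 100;
    }

    // Samples indices in [0, numInstances) with replacement.
    static int[] sampleIndices(int numInstances, int bagSizePercent, long seed) {
        Random random = new Random(seed);
        int[] bag = new int[bagSize(numInstances, bagSizePercent)];
        for (int i = 0; i < bag.length; i++) {
            bag[i] = random.nextInt(numInstances);
        }
        return bag;
    }

    public static void main(String[] args) {
        int[] bag = sampleIndices(10, 100, 1L);
        System.out.println(java.util.Arrays.toString(bag));
    }
}
```

Because sampling is with replacement, some indices usually appear several times and others not at all. The instances that are left out of a bag are the ones available for out-of-bag evaluation.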
public void setBagSizePercent(int newBagSizePercent)
Parameters: newBagSizePercent - the bag size, as a percentage.
public java.lang.String representCopiesUsingWeightsTipText()
public void setRepresentCopiesUsingWeights(boolean representUsingWeights)
Parameters: representUsingWeights - whether to represent copies using weights
public boolean getRepresentCopiesUsingWeights()
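Representing copies with weights means that an instance drawn k times into a bag appears once with weight k instead of k times with weight 1. The sketch below converts a list of sampled indices into such per-instance weights. It is a plain-Java illustration with a hypothetical class name, and it assumes unit weights in the original data.

```java
// Hypothetical illustration, not Weka code: turns a bag of sampled
// indices into one weight per training instance.
class CopyWeightsSketch {

    // weights[i] is the number of times instance i was drawn into the bag.
    // A weight of 0 means the instance is not in the bag at all.
    static double[] weightsFromBag(int[] bagIndices, int numInstances) {
        double[] weights = new double[numInstances];
        for (int index : bagIndices) {
            weights[index] += 1.0;
        }
        return weights;
    }

    public static void main(String[] args) {
        int[] bag = {0, 2, 2, 3, 2};
        System.out.println(java.util.Arrays.toString(weightsFromBag(bag, 4)));
    }
}
```

The weighted form keeps the bag no larger than the training set, which is the main reason one might prefer it. It only makes sense when the base classifier can handle instance weights.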
public java.lang.String storeOutOfBagPredictionsTipText()
public void setStoreOutOfBagPredictions(boolean storeOutOfBag)
Parameters: storeOutOfBag - whether the out of bag predictions are stored
public boolean getStoreOutOfBagPredictions()
public java.lang.String calcOutOfBagTipText()
public void setCalcOutOfBag(boolean calcOutOfBag)
Parameters: calcOutOfBag - whether to calculate the out of bag error
public boolean getCalcOutOfBag()
public java.lang.String outputOutOfBagComplexityStatisticsTipText()
public boolean getOutputOutOfBagComplexityStatistics()
public void setOutputOutOfBagComplexityStatistics(boolean b)
Parameters: b - whether statistics are calculated
public java.lang.String printClassifiersTipText()
public void setPrintClassifiers(boolean print)
Parameters: print - true if the individual classifiers are to be printed
public boolean getPrintClassifiers()
public double measureOutOfBagError()
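The out-of-bag error can be pictured as follows: each training instance is predicted only by the base models whose bags did not contain it, and the error is the fraction of instances whose combined out-of-bag prediction is wrong. The plain-Java sketch below uses majority voting over class indices. The class name, the voting rule, and the handling of instances that occur in every bag (they are skipped) are assumptions for illustration, not a description of Weka's exact computation.

```java
// Hypothetical illustration, not Weka code: out-of-bag error by
// majority vote over the models that did not train on an instance.
class OutOfBagSketch {

    // inBag[m][i]       true if instance i was in the bag of model m
    // predictions[m][i] class index predicted by model m for instance i
    // actual[i]         true class index of instance i
    static double outOfBagError(boolean[][] inBag, int[][] predictions,
                                int[] actual, int numClasses) {
        int evaluated = 0;
        int wrong = 0;
        for (int i = 0; i < actual.length; i++) {
            int[] votes = new int[numClasses];
            int voters = 0;
            for (int m = 0; m < inBag.length; m++) {
                if (!inBag[m][i]) {
                    votes[predictions[m][i]]++;
                    voters++;
                }
            }
            if (voters == 0) {
                continue; // instance was in every bag: no OOB prediction
            }
            int best = 0; // ties go to the lowest class index
            for (int c = 1; c < numClasses; c++) {
                if (votes[c] > votes[best]) {
                    best = c;
                }
            }
            evaluated++;
            if (best != actual[i]) {
                wrong++;
            }
        }
        return evaluated == 0 ? 0.0 : (double) wrong / evaluated;
    }
}
```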
public java.util.Enumeration<java.lang.String> enumerateMeasures()
Specified by: enumerateMeasures in interface AdditionalMeasureProducer
public double getMeasure(java.lang.String additionalMeasureName)
Specified by: getMeasure in interface AdditionalMeasureProducer
Parameters: additionalMeasureName - the name of the measure to query for its value
Throws: java.lang.IllegalArgumentException - if the named measure is not supported
public Evaluation getOutOfBagEvaluationObject()
public void buildClassifier(Instances data) throws java.lang.Exception
Specified by: buildClassifier in interface Classifier
Overrides: buildClassifier in class ParallelIteratedSingleClassifierEnhancer
Parameters: data - the training data to be used for generating the bagged classifier.
Throws: java.lang.Exception - if the classifier could not be built successfully
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
Specified by: distributionForInstance in interface Classifier
Overrides: distributionForInstance in class AbstractClassifier
Parameters: instance - the instance to be classified
Throws: java.lang.Exception - if distribution can't be computed successfully
public java.lang.String toString()
Overrides: toString in class java.lang.Object
public void generatePartition(Instances data) throws java.lang.Exception
Specified by: generatePartition in interface PartitionGenerator
Throws: java.lang.Exception
public double[] getMembershipValues(Instance inst) throws java.lang.Exception
Specified by: getMembershipValues in interface PartitionGenerator
Throws: java.lang.Exception
public int numElements() throws java.lang.Exception
Specified by: numElements in interface PartitionGenerator
Throws: java.lang.Exception
public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class AbstractClassifier
public static void main(java.lang.String[] argv)
Parameters: argv - the options
public Bagging aggregate(Bagging toAggregate) throws java.lang.Exception
Specified by: aggregate in interface Aggregateable<Bagging>
Parameters: toAggregate - the object to aggregate
Throws: java.lang.Exception - if the supplied object can't be aggregated for some reason
public void finalizeAggregation() throws java.lang.Exception
Specified by: finalizeAggregation in interface Aggregateable<Bagging>
Throws: java.lang.Exception - if the aggregation can't be finalized for some reason
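
One way to picture aggregate and finalizeAggregation is as a two-phase merge: each aggregate call pools the base models of another ensemble into a working list, and the finalize step turns that list into the ensemble's final set of members. The plain-Java sketch below follows this pattern with strings standing in for base models. The class EnsembleAggregationSketch and its behavior are assumptions for illustration and are not taken from Weka's source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not Weka code: two-phase merging of ensembles.
class EnsembleAggregationSketch {

    private String[] members;     // finalized ensemble members
    private List<String> pending; // working list while aggregating

    EnsembleAggregationSketch(String... members) {
        this.members = members.clone();
    }

    // Pools the members of another ensemble into the working list.
    EnsembleAggregationSketch aggregate(EnsembleAggregationSketch other) {
        if (pending == null) {
            pending = new ArrayList<>(Arrays.asList(members));
        }
        pending.addAll(Arrays.asList(other.members));
        return this;
    }

    // Completes the merge: the pooled members become the ensemble.
    void finalizeAggregation() {
        if (pending != null) {
            members = pending.toArray(new String[0]);
            pending = null;
        }
    }

    // Number of finalized members.
    int size() {
        return members.length;
    }

    public static void main(String[] args) {
        EnsembleAggregationSketch a = new EnsembleAggregationSketch("tree1", "tree2");
        EnsembleAggregationSketch b = new EnsembleAggregationSketch("tree3");
        a.aggregate(b);
        a.finalizeAggregation();
        System.out.println(a.size());
    }
}
```

In this sketch the merged members only become visible after finalizeAggregation() has been called, which mirrors the summary's note that the call completes the aggregation process.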