public class CSVLoader extends AbstractFileLoader implements BatchConverter, IncrementalConverter, OptionHandler
-H No header row present in the data.
-N <range> The range of attributes to force type to be NOMINAL. 'first' and 'last' are accepted as well. Examples: "first-last", "1,4,5-27,50-last" (default: -none-)
-L <nominal label spec> Optional specification of legal labels for nominal attributes. May be specified multiple times. Batch mode can determine this automatically (and so can incremental mode if the first in-memory buffer load of instances contains an example of each legal value). The spec contains two parts separated by a ":". The first part can be a range of attribute indexes or a comma-separated list of attribute names; the second part is a comma-separated list of labels. E.g. "1,2,4-6:red,green,blue" or "att1,att2:red,green,blue"
-S <range> The range of attributes to force type to be STRING. 'first' and 'last' are accepted as well. Examples: "first-last", "1,4,5-27,50-last" (default: -none-)
-D <range> The range of attributes to force type to be DATE. 'first' and 'last' are accepted as well. Examples: "first-last", "1,4,5-27,50-last" (default: -none-)
-format <date format> The date formatting string to use to parse date values. (default: "yyyy-MM-dd'T'HH:mm:ss")
-R <range> The range of attributes to force type to be NUMERIC. 'first' and 'last' are accepted as well. Examples: "first-last", "1,4,5-27,50-last" (default: -none-)
-M <str> The string representing a missing value. (default: ?)
-F <separator> The field separator to be used. '\t' can be used as well. (default: ',')
-E <enclosures> The enclosure character(s) to use for strings. Specify as a comma-separated list (e.g. ",'). (default: ",')
-B <num> The size of the in-memory buffer (in rows). (default: 100)
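The range strings accepted by -N, -S, -D and -R can be read as a comma-separated list of single indexes and index pairs. The following is a minimal sketch of how such a string might be expanded, assuming 1-based indexes on input and 0-based indexes on output. The class RangeSpecSketch and its methods are hypothetical and are not part of Weka, whose own range handling may differ in details such as error reporting.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative parser for range strings such as "1,4,5-27,50-last" (not Weka's own code). */
final class RangeSpecSketch {

    /** Resolves one token ("first", "last", or a 1-based number) to a 0-based index. */
    static int resolve(String token, int numAttributes) {
        String t = token.trim();
        if (t.equalsIgnoreCase("first")) {
            return 0;
        }
        if (t.equalsIgnoreCase("last")) {
            return numAttributes - 1;
        }
        int index = Integer.parseInt(t) - 1; // assumption: input indexes are 1-based
        if (index < 0 || index >= numAttributes) {
            throw new IllegalArgumentException("Index out of range: " + token);
        }
        return index;
    }

    /** Expands a range string into the selected 0-based attribute indexes, without duplicates. */
    static List<Integer> expand(String spec, int numAttributes) {
        List<Integer> selected = new ArrayList<>();
        for (String part : spec.split(",")) {
            int dash = part.indexOf('-');
            // A part without a dash is a single index, so both ends are the same token.
            int from = resolve(dash < 0 ? part : part.substring(0, dash), numAttributes);
            int to = resolve(dash < 0 ? part : part.substring(dash + 1), numAttributes);
            for (int i = from; i <= to; i++) {
                if (!selected.contains(i)) {
                    selected.add(i);
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // With ten attributes this prints [0, 3, 4, 5, 6, 8, 9]
        System.out.println(expand("1,4,5-7,9-last", 10));
    }
}
```

In this sketch "first-last" selects every attribute, which matches the first example given for the range options.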
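Nested classes/interfaces inherited from interface Loader: Loader.StructureNotReadyException

| Modifier and Type | Field and Description |
|---|---|
| static java.lang.String | FILE_EXTENSION: The file extension. |

Fields inherited from class AbstractFileLoader: FILE_EXTENSION_COMPRESSED

Fields inherited from interface Loader: BATCH, INCREMENTAL, NONE

| Constructor and Description |
|---|
| CSVLoader(): Default constructor. |
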

| Modifier and Type | Method and Description |
|---|---|
| java.lang.String | bufferSizeTipText(): Returns the tip text for this property. |
| java.lang.String | dateAttributesTipText(): Returns the tip text for this property. |
| java.lang.String | dateFormatTipText(): Returns the tip text for this property. |
| java.lang.String | enclosureCharactersTipText(): Returns the tip text for this property. |
| java.lang.String | fieldSeparatorTipText(): Returns the tip text for this property. |
| int | getBufferSize(): Gets the buffer size to use, i.e. the number of rows held in the in-memory buffer. |
| Instances | getDataSet(): Returns the full data set. |
| java.lang.String | getDateAttributes(): Returns the current attribute range to be forced to type date. |
| java.lang.String | getDateFormat(): Gets the format to use for parsing date values. |
| java.lang.String | getEnclosureCharacters(): Gets the character(s) to use/recognize as string enclosures. |
| java.lang.String | getFieldSeparator(): Returns the character used as column separator. |
| java.lang.String | getFileDescription(): Gets a one-line description of the type of file. |
| java.lang.String | getFileExtension(): Gets the file extension used for this type of file. |
| java.lang.String[] | getFileExtensions(): Gets all the file extensions used for this type of file. |
| java.lang.String | getMissingValue(): Returns the current placeholder for missing values. |
| Instance | getNextInstance(Instances structure): Reads the data set incrementally: returns the next instance in the data set, or null if there are no more instances to get. |
| boolean | getNoHeaderRowPresent(): Gets whether there is no header row in the data. |
| java.lang.String | getNominalAttributes(): Returns the current attribute range to be forced to type nominal. |
| java.lang.Object[] | getNominalLabelSpecs(): Gets label specifications for nominal attributes. |
| java.lang.String | getNumericAttributes(): Gets the attribute range to be forced to type numeric. |
| java.lang.String[] | getOptions(): Gets the current option settings for the OptionHandler. |
| java.lang.String | getRevision(): Returns the revision string. |
| java.lang.String | getStringAttributes(): Returns the current attribute range to be forced to type string. |
| Instances | getStructure(): Determines and returns (if possible) the structure (internally the header) of the data set as an empty set of instances. |
| java.lang.String | globalInfo(): Returns a string describing this loader. |
| java.util.Enumeration<Option> | listOptions(): Returns an enumeration of all the available options. |
| static void | main(java.lang.String[] args): Main method. |
| java.lang.String | missingValueTipText(): Returns the tip text for this property. |
| java.lang.String | noHeaderRowPresentTipText(): Returns the tip text for this property. |
| java.lang.String | nominalAttributesTipText(): Returns the tip text for this property. |
| java.lang.String | nominalLabelSpecsTipText(): Returns the tip text for this property. |
| java.lang.String | numericAttributesTipText(): Returns the tip text for this property. |
| void | reset(): Resets the loader ready to read a new data set. |
| void | setBufferSize(int buff): Sets the buffer size to use, i.e. the number of rows held in the in-memory buffer. |
| void | setDateAttributes(java.lang.String value): Sets the attribute range to be forced to type date. |
| void | setDateFormat(java.lang.String value): Sets the format to use for parsing date values. |
| void | setEnclosureCharacters(java.lang.String enclosure): Sets the character(s) to use/recognize as string enclosures. |
| void | setFieldSeparator(java.lang.String value): Sets the character used as column separator. |
| void | setMissingValue(java.lang.String value): Sets the placeholder for missing values. |
| void | setNoHeaderRowPresent(boolean b): Sets whether there is no header row in the data. |
| void | setNominalAttributes(java.lang.String value): Sets the attribute range to be forced to type nominal. |
| void | setNominalLabelSpecs(java.lang.Object[] specs): Sets label specifications for nominal attributes. |
| void | setNumericAttributes(java.lang.String value): Sets the attribute range to be forced to type numeric. |
| void | setOptions(java.lang.String[] options): Sets the OptionHandler's options using the given list. |
| void | setSource(java.io.File file): Resets the Loader object and sets the source of the data set to be the supplied File object. |
| void | setSource(java.io.InputStream input): Resets the Loader object and sets the source of the data set to be the supplied Stream object. |
| void | setStringAttributes(java.lang.String value): Sets the attribute range to be forced to type string. |
| java.lang.String | stringAttributesTipText(): Returns the tip text for this property. |

Methods inherited from class AbstractFileLoader: getUseRelativePath, retrieveFile, runFileLoader, setEnvironment, setFile, setUseRelativePath, useRelativePathTipText

Methods inherited from class AbstractLoader: setRetrieval

public static void main(java.lang.String[] args)
args - should contain the name of an input file.
public java.lang.String globalInfo()
public java.lang.String getFileExtension()
getFileExtension in interface FileSourcedConverter
public java.lang.String[] getFileExtensions()
getFileExtensions in interface FileSourcedConverter
public java.lang.String getFileDescription()
getFileDescription in interface FileSourcedConverter
public java.lang.String getRevision()
getRevision in interface RevisionHandler
public java.lang.String noHeaderRowPresentTipText()
public boolean getNoHeaderRowPresent()
public void setNoHeaderRowPresent(boolean b)
b - true if there is no header row in the data
public java.lang.String getMissingValue()
public void setMissingValue(java.lang.String value)
value - the placeholder
public java.lang.String missingValueTipText()
public java.lang.String getStringAttributes()
public void setStringAttributes(java.lang.String value)
value - the range
public java.lang.String stringAttributesTipText()
public java.lang.String getNominalAttributes()
public void setNominalAttributes(java.lang.String value)
value - the range
public java.lang.String nominalAttributesTipText()
public java.lang.String getNumericAttributes()
public void setNumericAttributes(java.lang.String value)
value - the range
public java.lang.String numericAttributesTipText()
public java.lang.String getDateFormat()
public void setDateFormat(java.lang.String value)
value - the format to use.
public java.lang.String dateFormatTipText()
public java.lang.String getDateAttributes()
public void setDateAttributes(java.lang.String value)
value - the range
public java.lang.String dateAttributesTipText()
public java.lang.String enclosureCharactersTipText()
public java.lang.String getEnclosureCharacters()
public void setEnclosureCharacters(java.lang.String enclosure)
enclosure - the characters to use as string enclosures
public java.lang.String getFieldSeparator()
public void setFieldSeparator(java.lang.String value)
value - the character to use
public java.lang.String fieldSeparatorTipText()
public int getBufferSize()
public void setBufferSize(int buff)
buff - the buffer size (number of rows)
public java.lang.String bufferSizeTipText()
public java.lang.Object[] getNominalLabelSpecs()
public void setNominalLabelSpecs(java.lang.Object[] specs)
specs - an array of label specifications
public java.lang.String nominalLabelSpecsTipText()
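A label specification, as described for the -L option, has two parts separated by a ":", for example "1,2,4-6:red,green,blue". The following is a minimal sketch of splitting such a string into its two parts. The class NominalLabelSpecSketch is hypothetical and is not the object type Weka stores in the label specs array; splitting at the first ":" is also an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative holder for one "-L" style spec such as "att1,att2:red,green,blue" (not Weka's own class). */
final class NominalLabelSpecSketch {

    /** Left-hand part: a range of attribute indexes or a comma-separated list of attribute names. */
    final String attributePart;

    /** Right-hand part: the legal labels, in the order given. */
    final List<String> labels;

    NominalLabelSpecSketch(String spec) {
        int colon = spec.indexOf(':'); // assumption: the first ':' separates the two parts
        if (colon <= 0 || colon == spec.length() - 1) {
            throw new IllegalArgumentException("Expected '<attributes>:<labels>' but got: " + spec);
        }
        attributePart = spec.substring(0, colon).trim();
        labels = Arrays.asList(spec.substring(colon + 1).trim().split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        NominalLabelSpecSketch spec = new NominalLabelSpecSketch("1,2,4-6:red,green,blue");
        System.out.println(spec.attributePart + " -> " + spec.labels);
    }
}
```

The attribute part is kept as a raw string here because it may hold either index ranges or attribute names, and telling the two apart requires knowledge of the data set header.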
public java.util.Enumeration<Option> listOptions()
listOptions in interface OptionHandler
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
public void setOptions(java.lang.String[] options) throws java.lang.Exception
setOptions in interface OptionHandler
options - the list of options as an array of strings
java.lang.Exception - if an option is not supported
public Instance getNextInstance(Instances structure) throws java.io.IOException
getNextInstance in interface Loader
getNextInstance in class AbstractLoader
structure - the dataset header information, will get updated in case of string or relational attributes
java.io.IOException - if there is an error during parsing or if getDataSet has been called on this source (either incremental or batch loading can be used, not both).
public Instances getDataSet() throws java.io.IOException
getDataSet in interface Loader
getDataSet in class AbstractLoader
java.io.IOException - if there is an error during parsing or if getNextInstance has been called on this source (either incremental or batch loading can be used, not both).
public_normal_behavior
  requires: model_sourceSupplied == true && (* successful parse *);
  modifiable: model_structureDetermined;
  ensures: \result != null && \result.numInstances() >= 0 && model_structureDetermined == true;
also public_exceptional_behavior
  requires: model_sourceSupplied == false || (* unsuccessful parse *);
  signals: (IOException);
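The rule that a source can be read either incrementally or in batch, but not both, can be pictured as a small state check. The following is a minimal sketch under that assumption; the class ExclusiveModeLoaderSketch is hypothetical, works on plain strings instead of Instance objects, and does not reproduce how CSVLoader tracks its retrieval mode internally.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical loader that allows either batch or incremental reads on a source, never both. */
final class ExclusiveModeLoaderSketch {

    private enum Mode { NONE, BATCH, INCREMENTAL }

    private final Deque<String> rows;
    private Mode mode = Mode.NONE;

    ExclusiveModeLoaderSketch(List<String> source) {
        rows = new ArrayDeque<>(source);
    }

    /** Batch read: returns all remaining rows, or fails if incremental reading has started. */
    List<String> getDataSet() throws IOException {
        if (mode == Mode.INCREMENTAL) {
            throw new IOException("Cannot mix getNextInstance with getDataSet.");
        }
        mode = Mode.BATCH;
        List<String> all = new ArrayList<>(rows);
        rows.clear();
        return all;
    }

    /** Incremental read: returns the next row or null when none are left, or fails after a batch read. */
    String getNextInstance() throws IOException {
        if (mode == Mode.BATCH) {
            throw new IOException("Cannot mix getDataSet with getNextInstance.");
        }
        mode = Mode.INCREMENTAL;
        return rows.poll(); // poll() yields null on an empty queue
    }

    public static void main(String[] args) throws IOException {
        ExclusiveModeLoaderSketch loader =
                new ExclusiveModeLoaderSketch(Arrays.asList("1,red", "2,blue"));
        System.out.println(loader.getNextInstance());
        try {
            loader.getDataSet();
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

A caller that needs both access styles would, in this sketch, have to create a fresh loader over the same source for the second pass.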
public void setSource(java.io.InputStream input) throws java.io.IOException
setSource in interface Loader
setSource in class AbstractLoader
input - the input stream
java.io.IOException - if an error occurs
public void setSource(java.io.File file) throws java.io.IOException
setSource in interface Loader
setSource in class AbstractFileLoader
file - the source file.
java.io.IOException - if an error occurs
public Instances getStructure() throws java.io.IOException
getStructure in interface Loader
getStructure in class AbstractLoader
java.io.IOException - if there is no source or parsing fails
public_normal_behavior
  requires: model_sourceSupplied == true && model_structureDetermined == false && (* successful parse *);
  modifiable: model_structureDetermined;
  ensures: \result != null && \result.numInstances() == 0 && model_structureDetermined == true;
also public_exceptional_behavior
  requires: model_sourceSupplied == false || (* unsuccessful parse *);
  signals: (IOException);
public void reset() throws java.io.IOException
reset in interface Loader
reset in class AbstractFileLoader
java.io.IOException - if something goes wrong