mfeat-pixel

Status: active. Format: ARFF. Publicly available (visibility: public). Uploaded 06-04-2014 by Jan van Rijn.
  • study_14 study_1 study_579 study_615 study_560 study_757 study_117 study_366 study_293 study_366 study_480 study_182
Author:
Source: Unknown
Please cite:

The multi-feature digit dataset
-------------------------------

Owned and donated by:
---------------------

Robert P.W. Duin
Department of Applied Physics
Delft University of Technology
P.O. Box 5046, 2600 GA Delft
The Netherlands

email: duin@ph.tn.tudelft.nl
http://www.ph.tn.tudelft.nl/~duin
tel +31 15 2786143

Usage
-----

A slightly different version of the database is used in:

M. van Breukelen, R.P.W. Duin, D.M.J. Tax, and J.E. den Hartog, Handwritten digit recognition by combined classifiers, Kybernetika, vol. 34, no. 4, 1998, 381-386.

M. van Breukelen and R.P.W. Duin, Neural Network Initialization by Combined Classifiers, in: A.K. Jain, S. Venkatesh, B.C. Lovell (eds.), ICPR'98, Proc. 14th Int. Conference on Pattern Recognition (Brisbane, Aug. 16-20),

The database in its present form is used in:

A.K. Jain, R.P.W. Duin, J. Mao, Statistical Pattern Recognition: A Review, in preparation

Description
-----------

This dataset consists of features of handwritten numerals ('0'-'9') extracted from a collection of Dutch utility maps. 200 patterns per class (for a total of 2,000 patterns) have been digitized in binary images. These digits are represented in terms of the following six feature sets (files):

1. mfeat-fou: 76 Fourier coefficients of the character shapes;
2. mfeat-fac: 216 profile correlations;
3. mfeat-kar: 64 Karhunen-Loève coefficients;
4. mfeat-pix: 240 pixel averages in 2 x 3 windows;
5. mfeat-zer: 47 Zernike moments;
6. mfeat-mor: 6 morphological features.

In each file the 2000 patterns are stored in ASCII on 2000 lines. The first 200 patterns are of class '0', followed by sets of 200 patterns for each of the classes '1'-'9'. Corresponding patterns in different feature sets (files) correspond to the same original character.

The source image dataset is lost. Using the pixel dataset (mfeat-pix), sampled versions of the original images may be obtained (15 x 16 pixels).
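As a rough illustration of that last point, one row of 240 pixel averages can be folded back into a small image. The sketch below is not part of the original distribution: the helper names `to_grid` and `render` and the sample row are hypothetical, and the row-major layout with 16 rows of 15 columns is an assumption, since the description only says "15 x 16".

```python
def to_grid(values, n_rows=16, n_cols=15):
    """Fold a flat list of pixel averages into n_rows lists of n_cols values.

    Assumes a row-major layout; the orientation is a guess.
    """
    if len(values) != n_rows * n_cols:
        raise ValueError(f"expected {n_rows * n_cols} values, got {len(values)}")
    return [values[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def render(grid):
    """Render the grid as text: '.' for an empty window, the value otherwise."""
    return "\n".join(
        "".join("." if v == 0 else str(v) for v in row) for row in grid
    )


# Made-up row for illustration only: a vertical bar in column 7.
sample_row = [6 if i % 15 == 7 else 0 for i in range(240)]
grid = to_grid(sample_row)
print(render(grid))
```

If the rendered digits look transposed on real data, swapping `n_rows` and `n_cols` is the first thing to try.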
Total number of instances:
--------------------------
2000 (200 instances per class)

Total number of attributes:
---------------------------
649 (distributed over 6 datasets, see above)
No missing attributes.

Total number of classes:
------------------------
10

Format:
-------
6 files, see above. Each file contains 2000 lines, one for each instance. Attributes are SPACE separated and can be loaded by Matlab as
> load filename
No missing attributes. Some are integer, others are real.

Information about the dataset
CLASSTYPE: nominal
CLASSINDEX: last
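Outside Matlab, the same space-separated layout is easy to read by hand. The following is a minimal sketch, assuming the raw files follow the format described above (one instance per line, no header, classes in blocks of 200). The function name `parse_mfeat` and the demo text are hypothetical, and the class label is derived from the line position because the raw files are described as carrying no label column.

```python
def parse_mfeat(text, per_class=200):
    """Parse whitespace-separated feature lines and attach position-based labels."""
    rows, labels = [], []
    data_lines = (ln for ln in text.splitlines() if ln.strip())
    for index, line in enumerate(data_lines):
        rows.append([float(tok) for tok in line.split()])
        # Lines 0-199 belong to class 0, lines 200-399 to class 1, and so on.
        labels.append(index // per_class)
    return rows, labels


# Tiny made-up stand-in for a real file: 4 lines, 2 lines per class.
demo = "0 1 2\n3 4 5\n6 5 4\n3 2 1\n"
rows, labels = parse_mfeat(demo, per_class=2)
print(labels)  # [0, 0, 1, 1]
```

For a real file, something like `parse_mfeat(open("mfeat-pix").read())` should work, provided the file is available locally under that name.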

241 features

class (target): nominal, 10 unique values, 0 missing
att1: nominal, 6 unique values, 0 missing
att2: nominal, 7 unique values, 0 missing
att3: nominal, 7 unique values, 0 missing
att4: nominal, 7 unique values, 0 missing
att5: nominal, 7 unique values, 0 missing
att6: nominal, 7 unique values, 0 missing
att7: nominal, 7 unique values, 0 missing
att8: nominal, 7 unique values, 0 missing
att9: nominal, 7 unique values, 0 missing
att10: nominal, 7 unique values, 0 missing
att11: nominal, 7 unique values, 0 missing
att12: nominal, 7 unique values, 0 missing
att13: nominal, 7 unique values, 0 missing
att14: nominal, 7 unique values, 0 missing
att15: nominal, 6 unique values, 0 missing
att16: nominal, 7 unique values, 0 missing
att17: nominal, 7 unique values, 0 missing
att18: nominal, 7 unique values, 0 missing
att19: nominal, 7 unique values, 0 missing
att20: nominal, 7 unique values, 0 missing
att21: nominal, 7 unique values, 0 missing
att22: nominal, 7 unique values, 0 missing
att23: nominal, 7 unique values, 0 missing
att24: nominal, 7 unique values, 0 missing
att25: nominal, 7 unique values, 0 missing
att26: nominal, 7 unique values, 0 missing
att27: nominal, 7 unique values, 0 missing
att28: nominal, 7 unique values, 0 missing
att29: nominal, 7 unique values, 0 missing
att30: nominal, 7 unique values, 0 missing
att31: nominal, 7 unique values, 0 missing
att32: nominal, 7 unique values, 0 missing
att33: nominal, 7 unique values, 0 missing
att34: nominal, 7 unique values, 0 missing
att35: nominal, 7 unique values, 0 missing
att36: nominal, 7 unique values, 0 missing
att37: nominal, 7 unique values, 0 missing
att38: nominal, 7 unique values, 0 missing
att39: nominal, 7 unique values, 0 missing
att40: nominal, 7 unique values, 0 missing
att41: nominal, 7 unique values, 0 missing
att42: nominal, 7 unique values, 0 missing
att43: nominal, 7 unique values, 0 missing
att44: nominal, 7 unique values, 0 missing
att45: nominal, 7 unique values, 0 missing
att46: nominal, 7 unique values, 0 missing
att47: nominal, 7 unique values, 0 missing
att48: nominal, 7 unique values, 0 missing
att49: nominal, 7 unique values, 0 missing
att50: nominal, 7 unique values, 0 missing
att51: nominal, 7 unique values, 0 missing
att52: nominal, 7 unique values, 0 missing
att53: nominal, 7 unique values, 0 missing
att54: nominal, 7 unique values, 0 missing
att55: nominal, 7 unique values, 0 missing
att56: nominal, 7 unique values, 0 missing
att57: nominal, 7 unique values, 0 missing
att58: nominal, 7 unique values, 0 missing
att59: nominal, 7 unique values, 0 missing
att60: nominal, 7 unique values, 0 missing
att61: nominal, 7 unique values, 0 missing
att62: nominal, 7 unique values, 0 missing
att63: nominal, 7 unique values, 0 missing
att64: nominal, 7 unique values, 0 missing
att65: nominal, 7 unique values, 0 missing
att66: nominal, 7 unique values, 0 missing
att67: nominal, 7 unique values, 0 missing
att68: nominal, 7 unique values, 0 missing
att69: nominal, 7 unique values, 0 missing
att70: nominal, 7 unique values, 0 missing
att71: nominal, 7 unique values, 0 missing
att72: nominal, 7 unique values, 0 missing
att73: nominal, 7 unique values, 0 missing
att74: nominal, 7 unique values, 0 missing
att75: nominal, 7 unique values, 0 missing
att76: nominal, 7 unique values, 0 missing
att77: nominal, 7 unique values, 0 missing
att78: nominal, 7 unique values, 0 missing
att79: nominal, 7 unique values, 0 missing
att80: nominal, 7 unique values, 0 missing
att81: nominal, 7 unique values, 0 missing
att82: nominal, 7 unique values, 0 missing
att83: nominal, 7 unique values, 0 missing
att84: nominal, 7 unique values, 0 missing
att85: nominal, 7 unique values, 0 missing
att86: nominal, 7 unique values, 0 missing
att87: nominal, 7 unique values, 0 missing
att88: nominal, 7 unique values, 0 missing
att89: nominal, 7 unique values, 0 missing
att90: nominal, 7 unique values, 0 missing
att91: nominal, 7 unique values, 0 missing
att92: nominal, 7 unique values, 0 missing
att93: nominal, 7 unique values, 0 missing
att94: nominal, 7 unique values, 0 missing
att95: nominal, 7 unique values, 0 missing
att96: nominal, 7 unique values, 0 missing
att97: nominal, 7 unique values, 0 missing
att98: nominal, 7 unique values, 0 missing
att99: nominal, 7 unique values, 0 missing
att100: nominal, 7 unique values, 0 missing
att101: nominal, 7 unique values, 0 missing
att102: nominal, 7 unique values, 0 missing
att103: nominal, 7 unique values, 0 missing
att104: nominal, 7 unique values, 0 missing
att105: nominal, 7 unique values, 0 missing
att106: nominal, 7 unique values, 0 missing
att107: nominal, 7 unique values, 0 missing
att108: nominal, 7 unique values, 0 missing
att109: nominal, 7 unique values, 0 missing
att110: nominal, 7 unique values, 0 missing
att111: nominal, 7 unique values, 0 missing
att112: nominal, 7 unique values, 0 missing
att113: nominal, 7 unique values, 0 missing
att114: nominal, 7 unique values, 0 missing
att115: nominal, 7 unique values, 0 missing
att116: nominal, 7 unique values, 0 missing
att117: nominal, 7 unique values, 0 missing
att118: nominal, 7 unique values, 0 missing
att119: nominal, 7 unique values, 0 missing
att120: nominal, 7 unique values, 0 missing
att121: nominal, 7 unique values, 0 missing
att122: nominal, 7 unique values, 0 missing
att123: nominal, 7 unique values, 0 missing
att124: nominal, 7 unique values, 0 missing
att125: nominal, 7 unique values, 0 missing
att126: nominal, 7 unique values, 0 missing
att127: nominal, 7 unique values, 0 missing
att128: nominal, 7 unique values, 0 missing
att129: nominal, 7 unique values, 0 missing
att130: nominal, 7 unique values, 0 missing
att131: nominal, 7 unique values, 0 missing
att132: nominal, 7 unique values, 0 missing
att133: nominal, 7 unique values, 0 missing
att134: nominal, 7 unique values, 0 missing
att135: nominal, 7 unique values, 0 missing
att136: nominal, 7 unique values, 0 missing
att137: nominal, 7 unique values, 0 missing
att138: nominal, 7 unique values, 0 missing
att139: nominal, 7 unique values, 0 missing
att140: nominal, 7 unique values, 0 missing
att141: nominal, 7 unique values, 0 missing
att142: nominal, 7 unique values, 0 missing
att143: nominal, 7 unique values, 0 missing
att144: nominal, 7 unique values, 0 missing
att145: nominal, 7 unique values, 0 missing
att146: nominal, 7 unique values, 0 missing
att147: nominal, 7 unique values, 0 missing
att148: nominal, 7 unique values, 0 missing
att149: nominal, 7 unique values, 0 missing
att150: nominal, 7 unique values, 0 missing
att151: nominal, 7 unique values, 0 missing
att152: nominal, 7 unique values, 0 missing
att153: nominal, 7 unique values, 0 missing
att154: nominal, 7 unique values, 0 missing
att155: nominal, 7 unique values, 0 missing
att156: nominal, 7 unique values, 0 missing
att157: nominal, 7 unique values, 0 missing
att158: nominal, 7 unique values, 0 missing
att159: nominal, 7 unique values, 0 missing
att160: nominal, 7 unique values, 0 missing
att161: nominal, 7 unique values, 0 missing
att162: nominal, 7 unique values, 0 missing
att163: nominal, 7 unique values, 0 missing
att164: nominal, 7 unique values, 0 missing
att165: nominal, 7 unique values, 0 missing
att166: nominal, 7 unique values, 0 missing
att167: nominal, 7 unique values, 0 missing
att168: nominal, 7 unique values, 0 missing
att169: nominal, 7 unique values, 0 missing
att170: nominal, 7 unique values, 0 missing
att171: nominal, 7 unique values, 0 missing
att172: nominal, 7 unique values, 0 missing
att173: nominal, 7 unique values, 0 missing
att174: nominal, 7 unique values, 0 missing
att175: nominal, 7 unique values, 0 missing
att176: nominal, 7 unique values, 0 missing
att177: nominal, 7 unique values, 0 missing
att178: nominal, 7 unique values, 0 missing
att179: nominal, 7 unique values, 0 missing
att180: nominal, 7 unique values, 0 missing
att181: nominal, 7 unique values, 0 missing
att182: nominal, 7 unique values, 0 missing
att183: nominal, 7 unique values, 0 missing
att184: nominal, 7 unique values, 0 missing
att185: nominal, 7 unique values, 0 missing
att186: nominal, 7 unique values, 0 missing
att187: nominal, 7 unique values, 0 missing
att188: nominal, 7 unique values, 0 missing
att189: nominal, 7 unique values, 0 missing
att190: nominal, 7 unique values, 0 missing
att191: nominal, 7 unique values, 0 missing
att192: nominal, 7 unique values, 0 missing
att193: nominal, 7 unique values, 0 missing
att194: nominal, 7 unique values, 0 missing
att195: nominal, 7 unique values, 0 missing
att196: nominal, 7 unique values, 0 missing
att197: nominal, 7 unique values, 0 missing
att198: nominal, 7 unique values, 0 missing
att199: nominal, 7 unique values, 0 missing
att200: nominal, 7 unique values, 0 missing
att201: nominal, 7 unique values, 0 missing
att202: nominal, 7 unique values, 0 missing
att203: nominal, 7 unique values, 0 missing
att204: nominal, 7 unique values, 0 missing
att205: nominal, 7 unique values, 0 missing
att206: nominal, 7 unique values, 0 missing
att207: nominal, 7 unique values, 0 missing
att208: nominal, 7 unique values, 0 missing
att209: nominal, 7 unique values, 0 missing
att210: nominal, 7 unique values, 0 missing
att211: nominal, 7 unique values, 0 missing
att212: nominal, 7 unique values, 0 missing
att213: nominal, 7 unique values, 0 missing
att214: nominal, 7 unique values, 0 missing
att215: nominal, 7 unique values, 0 missing
att216: nominal, 7 unique values, 0 missing
att217: nominal, 7 unique values, 0 missing
att218: nominal, 7 unique values, 0 missing
att219: nominal, 7 unique values, 0 missing
att220: nominal, 7 unique values, 0 missing
att221: nominal, 7 unique values, 0 missing
att222: nominal, 7 unique values, 0 missing
att223: nominal, 7 unique values, 0 missing
att224: nominal, 7 unique values, 0 missing
att225: nominal, 7 unique values, 0 missing
att226: nominal, 5 unique values, 0 missing
att227: nominal, 5 unique values, 0 missing
att228: nominal, 5 unique values, 0 missing
att229: nominal, 5 unique values, 0 missing
att230: nominal, 5 unique values, 0 missing
att231: nominal, 5 unique values, 0 missing
att232: nominal, 5 unique values, 0 missing
att233: nominal, 5 unique values, 0 missing
att234: nominal, 5 unique values, 0 missing
att235: nominal, 5 unique values, 0 missing
att236: nominal, 5 unique values, 0 missing
att237: nominal, 5 unique values, 0 missing
att238: nominal, 5 unique values, 0 missing
att239: nominal, 5 unique values, 0 missing
att240: nominal, 5 unique values, 0 missing

107 properties

Value  Description ("-" marks a property listed without a value)
2000   Number of instances (rows) of the dataset.
241    Number of attributes (columns) of the dataset.
10     Number of distinct values of the target attribute (if it is nominal).
0      Number of missing values in the dataset.
0      Number of instances with at least one value missing.
0      Number of numeric attributes.
241    Number of nominal attributes.
1      Average class difference between consecutive instances.
0.89   Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.24   Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.74   Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.89   Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.24   Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.74   Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.89   Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.24   Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.74   Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
3.32   Entropy of the target attribute values.
0.72   Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump
0.81   Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump
0.1    Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump
0.12   Number of attributes divided by the number of instances.
9.55   Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.
0.89   Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001
0.25   Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001
0.73   Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001
0.89   Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001
0.25   Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001
0.73   Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001
0.89   Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001
0.25   Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001
0.73   Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001
10     Percentage of instances belonging to the most frequent class.
200    Number of instances belonging to the most frequent class.
2.49   Maximum entropy among attributes.
-      Maximum kurtosis among attributes of the numeric type.
-      Maximum of means among attributes of the numeric type.
0.7    Maximum mutual information between the nominal attributes and the target attribute.
10     The maximum number of distinct values among attributes of the nominal type.
-      Maximum skewness among attributes of the numeric type.
-      Maximum standard deviation of attributes of the numeric type.
1.78   Average entropy of the attributes.
-      Mean kurtosis among attributes of the numeric type.
-      Mean of means among attributes of the numeric type.
0.35   Average mutual information between the nominal attributes and the target attribute.
4.13   An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
6.88   Average number of distinct values among the attributes of the nominal type.
-      Mean skewness among attributes of the numeric type.
-      Mean standard deviation of attributes of the numeric type.
0.53   Minimal entropy among attributes.
-      Minimum kurtosis among attributes of the numeric type.
-      Minimum of means among attributes of the numeric type.
0.04   Minimal mutual information between the nominal attributes and the target attribute.
5      The minimal number of distinct values among attributes of the nominal type.
-      Minimum skewness among attributes of the numeric type.
-      Minimum standard deviation of attributes of the numeric type.
10     Percentage of instances belonging to the least frequent class.
200    Number of instances belonging to the least frequent class.
0.99   Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes
0.07   Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes
0.92   Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes
0      Number of binary attributes.
0      Percentage of binary attributes.
0      Percentage of instances having missing values.
0      Percentage of missing values.
0      Percentage of numeric attributes.
100    Percentage of nominal attributes.
1.63   First quartile of entropy among attributes.
-      First quartile of kurtosis among attributes of the numeric type.
-      First quartile of means among attributes of the numeric type.
0.27   First quartile of mutual information between the nominal attributes and the target attribute.
-      First quartile of skewness among attributes of the numeric type.
-      First quartile of standard deviation of attributes of the numeric type.
1.79   Second quartile (Median) of entropy among attributes.
-      Second quartile (Median) of kurtosis among attributes of the numeric type.
-      Second quartile (Median) of means among attributes of the numeric type.
0.35   Second quartile (Median) of mutual information between the nominal attributes and the target attribute.
-      Second quartile (Median) of skewness among attributes of the numeric type.
-      Second quartile (Median) of standard deviation of attributes of the numeric type.
1.96   Third quartile of entropy among attributes.
-      Third quartile of kurtosis among attributes of the numeric type.
-      Third quartile of means among attributes of the numeric type.
0.42   Third quartile of mutual information between the nominal attributes and the target attribute.
-      Third quartile of skewness among attributes of the numeric type.
-      Third quartile of standard deviation of attributes of the numeric type.
0.89   Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1
0.29   Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1
0.67   Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1
0.89   Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2
0.29   Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2
0.67   Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2
0.89   Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3
0.29   Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3
0.67   Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3
0.8    Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
0.41   Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
0.55   Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
0.8    Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
0.41   Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
0.55   Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
0.8    Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
0.41   Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
0.55   Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
0.53   Standard deviation of the number of distinct values among attributes of the nominal type.
0.98   Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk
0.05   Error rate achieved by the landmarker weka.classifiers.lazy.IBk
0.95   Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk
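Several of the properties above are simple functions of the others, so they can be cross-checked by hand. The sketch below recomputes a few of them from the rounded values in the list. The variable names and the helper `kappa_from_error` are hypothetical, and the kappa shortcut relies on the target having 10 equally frequent classes, in which case chance agreement is 0.1 whatever the classifier predicts.

```python
import math

n_instances, n_attributes, n_classes = 2000, 241, 10

# Entropy (in bits) of a target with 10 equally frequent classes.
class_entropy = math.log2(n_classes)

# Number of attributes divided by the number of instances.
dimensionality = n_attributes / n_instances

# Rounded inputs copied from the property list.
mean_attribute_entropy = 1.78
mean_mutual_information = 0.35

# ClassEntropy divided by MeanMutualInformation.
equivalent_attributes = class_entropy / mean_mutual_information

# (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
noise_to_signal = (
    mean_attribute_entropy - mean_mutual_information
) / mean_mutual_information


def kappa_from_error(error_rate, n_classes=10):
    """Cohen's kappa for a target whose classes are all equally frequent."""
    chance = 1.0 / n_classes
    return ((1.0 - error_rate) - chance) / (1.0 - chance)


print(round(class_entropy, 2))   # 3.32
print(round(dimensionality, 2))  # 0.12
print(round(equivalent_attributes, 2), round(noise_to_signal, 2))
print(round(kappa_from_error(0.07), 2))  # 0.92, the NaiveBayes landmarker
```

The two ratios come out near 9.49 and 4.09, close to but not identical to the listed 9.55 and 4.13. The gap is presumably because the listed inputs are themselves rounded to two decimals.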

11 tasks

0 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: class
0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - target_feature: class
0 runs - estimation_procedure: 33% Holdout set - target_feature: class
0 runs - estimation_procedure: 20% Holdout (Ordered) - target_feature: class
0 runs - estimation_procedure: Test on Training Data - target_feature: class
0 runs - estimation_procedure: Leave one out - target_feature: class
0 runs - estimation_procedure: 5 times 2-fold Crossvalidation - target_feature: class
0 runs - estimation_procedure: 10% Holdout set - target_feature: class
0 runs - estimation_procedure: 10 times 10-fold Learning Curve - target_feature: class
0 runs - estimation_procedure: 10-fold Learning Curve - target_feature: class
0 runs - estimation_procedure: Interleaved Test then Train - target_feature: class
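To make the first estimation procedure concrete, here is a minimal sketch of a stratified 10-fold split over the 2000 instances. It is an illustration only, not the split files OpenML serves for these tasks: the function name `stratified_folds` is hypothetical, the labels are generated from row position (200 rows per digit, as in the description), and shuffling is left out for brevity.

```python
from collections import Counter


def stratified_folds(labels, n_folds=10):
    """Assign each instance to a fold so every fold has the same class mix.

    Instances of each class are dealt round-robin across the folds.
    """
    fold_of = [0] * len(labels)
    seen = Counter()
    for i, label in enumerate(labels):
        fold_of[i] = seen[label] % n_folds
        seen[label] += 1
    return fold_of


# Position-based labels: 200 consecutive instances per digit.
labels = [i // 200 for i in range(2000)]
fold_of = stratified_folds(labels)

# Fold 0 serves as the test set; the remaining nine folds form the training set.
test_idx = [i for i, f in enumerate(fold_of) if f == 0]
train_idx = [i for i, f in enumerate(fold_of) if f != 0]
print(len(train_idx), len(test_idx))  # 1800 200
```

Because the rows are grouped by class, a split that simply cut the data at a row boundary would leave whole digits out of one side, which is why the sketch stratifies by label.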