Data
semeion

active, ARFF, publicly available, visibility: public, uploaded 25-05-2015 by Rafael Gomes Mantovani
  • study_14 study_1 study_382 study_165 study_40


Author: Semeion Research Center of Sciences of Communication
Source: UCI
Please cite: Semeion Research Center of Sciences of Communication, via Sersale 117, 00128 Rome, Italy; Tattile, Via Gaetano Donizetti 1-3-5, 25030 Mairano (Brescia), Italy.

* Title: Semeion Handwritten Digit Data Set
* Abstract: 1593 handwritten digits from around 80 persons were scanned and stretched into a 16x16 rectangular box in a gray scale of 256 values.
* Source: The dataset was created by Tattile Srl, Brescia, Italy (http://www.tattile.it) and donated in 1994 to Semeion Research Center of Sciences of Communication, Rome, Italy (http://www.semeion.it), for machine learning research. For any questions, e-mail Massimo Buscema (m.buscema '@' semeion.it) or Stefano Terzi (s.terzi '@' semeion.it).
* Data Set Information: 1593 handwritten digits from around 80 persons were scanned and stretched into a 16x16 rectangular box in a gray scale of 256 values. Then each pixel of each image was scaled into a boolean (1/0) value using a fixed threshold. Each person wrote all the digits from 0 to 9 on paper, twice. The instruction was to write each digit the first time in the normal way (trying to write it accurately) and the second time in a fast way (with no attention to accuracy). The best validation protocol for this dataset seems to be a 5x2CV: 50% Tune (Train + Test) and a completely blind 50% Validation.
* Attribute Information: This dataset consists of 1593 records (rows) and 256 attributes (columns). Each record represents a handwritten digit, originally scanned with a resolution of 256 gray levels (2^8). Each pixel of each original scanned image was first stretched and then mapped to 0 or 1: every pixel whose gray value was 127 or below was set to 0, and every pixel whose original gray value was above 127 was set to 1. Finally, each binary image was scaled again into a 16x16 square box (the final 256 binary attributes).
* Relevant Papers: M. Buscema, MetaNet: The Theory of Independent Judges, in Substance Use & Misuse 33(2), 1998, pp. 439-461.
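
The fixed-threshold step described under Attribute Information can be illustrated with a small sketch. It assumes the rule exactly as stated there (gray values of 127 or below become 0, values above 127 become 1); the function name `binarize` and the sample patch are hypothetical and not taken from the dataset or its original tooling.

```python
def binarize(gray_rows, threshold=127):
    """Map 8-bit gray levels to 0/1.

    Values <= threshold become 0, values > threshold become 1,
    following the rule stated in the dataset description.
    """
    return [[1 if value > threshold else 0 for value in row] for row in gray_rows]


# Hypothetical 2x3 patch of gray levels (illustrative only).
patch = [
    [0, 127, 128],
    [200, 64, 255],
]
bits = binarize(patch)
print(bits)
```

Note that 127 itself maps to 0, which matches the "127 included" remark in the description.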

257 features

Class (target): nominal, 10 unique values, 0 missing
V1: numeric, 2 unique values, 0 missing
V2: numeric, 2 unique values, 0 missing
V3: numeric, 2 unique values, 0 missing
V4: numeric, 2 unique values, 0 missing
V5: numeric, 2 unique values, 0 missing
V6: numeric, 2 unique values, 0 missing
V7: numeric, 2 unique values, 0 missing
V8: numeric, 2 unique values, 0 missing
V9: numeric, 2 unique values, 0 missing
V10: numeric, 2 unique values, 0 missing
V11: numeric, 2 unique values, 0 missing
V12: numeric, 2 unique values, 0 missing
V13: numeric, 2 unique values, 0 missing
V14: numeric, 2 unique values, 0 missing
V15: numeric, 2 unique values, 0 missing
V16: numeric, 2 unique values, 0 missing
V17: numeric, 2 unique values, 0 missing
V18: numeric, 2 unique values, 0 missing
V19: numeric, 2 unique values, 0 missing
V20: numeric, 2 unique values, 0 missing
V21: numeric, 2 unique values, 0 missing
V22: numeric, 2 unique values, 0 missing
V23: numeric, 2 unique values, 0 missing
V24: numeric, 2 unique values, 0 missing
V25: numeric, 2 unique values, 0 missing
V26: numeric, 2 unique values, 0 missing
V27: numeric, 2 unique values, 0 missing
V28: numeric, 2 unique values, 0 missing
V29: numeric, 2 unique values, 0 missing
V30: numeric, 2 unique values, 0 missing
V31: numeric, 2 unique values, 0 missing
V32: numeric, 2 unique values, 0 missing
V33: numeric, 2 unique values, 0 missing
V34: numeric, 2 unique values, 0 missing
V35: numeric, 2 unique values, 0 missing
V36: numeric, 2 unique values, 0 missing
V37: numeric, 2 unique values, 0 missing
V38: numeric, 2 unique values, 0 missing
V39: numeric, 2 unique values, 0 missing
V40: numeric, 2 unique values, 0 missing
V41: numeric, 2 unique values, 0 missing
V42: numeric, 2 unique values, 0 missing
V43: numeric, 2 unique values, 0 missing
V44: numeric, 2 unique values, 0 missing
V45: numeric, 2 unique values, 0 missing
V46: numeric, 2 unique values, 0 missing
V47: numeric, 2 unique values, 0 missing
V48: numeric, 2 unique values, 0 missing
V49: numeric, 2 unique values, 0 missing
V50: numeric, 2 unique values, 0 missing
V51: numeric, 2 unique values, 0 missing
V52: numeric, 2 unique values, 0 missing
V53: numeric, 2 unique values, 0 missing
V54: numeric, 2 unique values, 0 missing
V55: numeric, 2 unique values, 0 missing
V56: numeric, 2 unique values, 0 missing
V57: numeric, 2 unique values, 0 missing
V58: numeric, 2 unique values, 0 missing
V59: numeric, 2 unique values, 0 missing
V60: numeric, 2 unique values, 0 missing
V61: numeric, 2 unique values, 0 missing
V62: numeric, 2 unique values, 0 missing
V63: numeric, 2 unique values, 0 missing
V64: numeric, 2 unique values, 0 missing
V65: numeric, 2 unique values, 0 missing
V66: numeric, 2 unique values, 0 missing
V67: numeric, 2 unique values, 0 missing
V68: numeric, 2 unique values, 0 missing
V69: numeric, 2 unique values, 0 missing
V70: numeric, 2 unique values, 0 missing
V71: numeric, 2 unique values, 0 missing
V72: numeric, 2 unique values, 0 missing
V73: numeric, 2 unique values, 0 missing
V74: numeric, 2 unique values, 0 missing
V75: numeric, 2 unique values, 0 missing
V76: numeric, 2 unique values, 0 missing
V77: numeric, 2 unique values, 0 missing
V78: numeric, 2 unique values, 0 missing
V79: numeric, 2 unique values, 0 missing
V80: numeric, 2 unique values, 0 missing
V81: numeric, 2 unique values, 0 missing
V82: numeric, 2 unique values, 0 missing
V83: numeric, 2 unique values, 0 missing
V84: numeric, 2 unique values, 0 missing
V85: numeric, 2 unique values, 0 missing
V86: numeric, 2 unique values, 0 missing
V87: numeric, 2 unique values, 0 missing
V88: numeric, 2 unique values, 0 missing
V89: numeric, 2 unique values, 0 missing
V90: numeric, 2 unique values, 0 missing
V91: numeric, 2 unique values, 0 missing
V92: numeric, 2 unique values, 0 missing
V93: numeric, 2 unique values, 0 missing
V94: numeric, 2 unique values, 0 missing
V95: numeric, 2 unique values, 0 missing
V96: numeric, 2 unique values, 0 missing
V97: numeric, 2 unique values, 0 missing
V98: numeric, 2 unique values, 0 missing
V99: numeric, 2 unique values, 0 missing
V100: numeric, 2 unique values, 0 missing
V101: numeric, 2 unique values, 0 missing
V102: numeric, 2 unique values, 0 missing
V103: numeric, 2 unique values, 0 missing
V104: numeric, 2 unique values, 0 missing
V105: numeric, 2 unique values, 0 missing
V106: numeric, 2 unique values, 0 missing
V107: numeric, 2 unique values, 0 missing
V108: numeric, 2 unique values, 0 missing
V109: numeric, 2 unique values, 0 missing
V110: numeric, 2 unique values, 0 missing
V111: numeric, 2 unique values, 0 missing
V112: numeric, 2 unique values, 0 missing
V113: numeric, 2 unique values, 0 missing
V114: numeric, 2 unique values, 0 missing
V115: numeric, 2 unique values, 0 missing
V116: numeric, 2 unique values, 0 missing
V117: numeric, 2 unique values, 0 missing
V118: numeric, 2 unique values, 0 missing
V119: numeric, 2 unique values, 0 missing
V120: numeric, 2 unique values, 0 missing
V121: numeric, 2 unique values, 0 missing
V122: numeric, 2 unique values, 0 missing
V123: numeric, 2 unique values, 0 missing
V124: numeric, 2 unique values, 0 missing
V125: numeric, 2 unique values, 0 missing
V126: numeric, 2 unique values, 0 missing
V127: numeric, 2 unique values, 0 missing
V128: numeric, 2 unique values, 0 missing
V129: numeric, 2 unique values, 0 missing
V130: numeric, 2 unique values, 0 missing
V131: numeric, 2 unique values, 0 missing
V132: numeric, 2 unique values, 0 missing
V133: numeric, 2 unique values, 0 missing
V134: numeric, 2 unique values, 0 missing
V135: numeric, 2 unique values, 0 missing
V136: numeric, 2 unique values, 0 missing
V137: numeric, 2 unique values, 0 missing
V138: numeric, 2 unique values, 0 missing
V139: numeric, 2 unique values, 0 missing
V140: numeric, 2 unique values, 0 missing
V141: numeric, 2 unique values, 0 missing
V142: numeric, 2 unique values, 0 missing
V143: numeric, 2 unique values, 0 missing
V144: numeric, 2 unique values, 0 missing
V145: numeric, 2 unique values, 0 missing
V146: numeric, 2 unique values, 0 missing
V147: numeric, 2 unique values, 0 missing
V148: numeric, 2 unique values, 0 missing
V149: numeric, 2 unique values, 0 missing
V150: numeric, 2 unique values, 0 missing
V151: numeric, 2 unique values, 0 missing
V152: numeric, 2 unique values, 0 missing
V153: numeric, 2 unique values, 0 missing
V154: numeric, 2 unique values, 0 missing
V155: numeric, 2 unique values, 0 missing
V156: numeric, 2 unique values, 0 missing
V157: numeric, 2 unique values, 0 missing
V158: numeric, 2 unique values, 0 missing
V159: numeric, 2 unique values, 0 missing
V160: numeric, 2 unique values, 0 missing
V161: numeric, 2 unique values, 0 missing
V162: numeric, 2 unique values, 0 missing
V163: numeric, 2 unique values, 0 missing
V164: numeric, 2 unique values, 0 missing
V165: numeric, 2 unique values, 0 missing
V166: numeric, 2 unique values, 0 missing
V167: numeric, 2 unique values, 0 missing
V168: numeric, 2 unique values, 0 missing
V169: numeric, 2 unique values, 0 missing
V170: numeric, 2 unique values, 0 missing
V171: numeric, 2 unique values, 0 missing
V172: numeric, 2 unique values, 0 missing
V173: numeric, 2 unique values, 0 missing
V174: numeric, 2 unique values, 0 missing
V175: numeric, 2 unique values, 0 missing
V176: numeric, 2 unique values, 0 missing
V177: numeric, 2 unique values, 0 missing
V178: numeric, 2 unique values, 0 missing
V179: numeric, 2 unique values, 0 missing
V180: numeric, 2 unique values, 0 missing
V181: numeric, 2 unique values, 0 missing
V182: numeric, 2 unique values, 0 missing
V183: numeric, 2 unique values, 0 missing
V184: numeric, 2 unique values, 0 missing
V185: numeric, 2 unique values, 0 missing
V186: numeric, 2 unique values, 0 missing
V187: numeric, 2 unique values, 0 missing
V188: numeric, 2 unique values, 0 missing
V189: numeric, 2 unique values, 0 missing
V190: numeric, 2 unique values, 0 missing
V191: numeric, 2 unique values, 0 missing
V192: numeric, 2 unique values, 0 missing
V193: numeric, 2 unique values, 0 missing
V194: numeric, 2 unique values, 0 missing
V195: numeric, 2 unique values, 0 missing
V196: numeric, 2 unique values, 0 missing
V197: numeric, 2 unique values, 0 missing
V198: numeric, 2 unique values, 0 missing
V199: numeric, 2 unique values, 0 missing
V200: numeric, 2 unique values, 0 missing
V201: numeric, 2 unique values, 0 missing
V202: numeric, 2 unique values, 0 missing
V203: numeric, 2 unique values, 0 missing
V204: numeric, 2 unique values, 0 missing
V205: numeric, 2 unique values, 0 missing
V206: numeric, 2 unique values, 0 missing
V207: numeric, 2 unique values, 0 missing
V208: numeric, 2 unique values, 0 missing
V209: numeric, 2 unique values, 0 missing
V210: numeric, 2 unique values, 0 missing
V211: numeric, 2 unique values, 0 missing
V212: numeric, 2 unique values, 0 missing
V213: numeric, 2 unique values, 0 missing
V214: numeric, 2 unique values, 0 missing
V215: numeric, 2 unique values, 0 missing
V216: numeric, 2 unique values, 0 missing
V217: numeric, 2 unique values, 0 missing
V218: numeric, 2 unique values, 0 missing
V219: numeric, 2 unique values, 0 missing
V220: numeric, 2 unique values, 0 missing
V221: numeric, 2 unique values, 0 missing
V222: numeric, 2 unique values, 0 missing
V223: numeric, 2 unique values, 0 missing
V224: numeric, 2 unique values, 0 missing
V225: numeric, 2 unique values, 0 missing
V226: numeric, 2 unique values, 0 missing
V227: numeric, 2 unique values, 0 missing
V228: numeric, 2 unique values, 0 missing
V229: numeric, 2 unique values, 0 missing
V230: numeric, 2 unique values, 0 missing
V231: numeric, 2 unique values, 0 missing
V232: numeric, 2 unique values, 0 missing
V233: numeric, 2 unique values, 0 missing
V234: numeric, 2 unique values, 0 missing
V235: numeric, 2 unique values, 0 missing
V236: numeric, 2 unique values, 0 missing
V237: numeric, 2 unique values, 0 missing
V238: numeric, 2 unique values, 0 missing
V239: numeric, 2 unique values, 0 missing
V240: numeric, 2 unique values, 0 missing
V241: numeric, 2 unique values, 0 missing
V242: numeric, 2 unique values, 0 missing
V243: numeric, 2 unique values, 0 missing
V244: numeric, 2 unique values, 0 missing
V245: numeric, 2 unique values, 0 missing
V246: numeric, 2 unique values, 0 missing
V247: numeric, 2 unique values, 0 missing
V248: numeric, 2 unique values, 0 missing
V249: numeric, 2 unique values, 0 missing
V250: numeric, 2 unique values, 0 missing
V251: numeric, 2 unique values, 0 missing
V252: numeric, 2 unique values, 0 missing
V253: numeric, 2 unique values, 0 missing
V254: numeric, 2 unique values, 0 missing
V255: numeric, 2 unique values, 0 missing
V256: numeric, 2 unique values, 0 missing
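
Since the 256 binary features V1 to V256 encode a 16x16 image, a record can be folded back into a grid for inspection. The sketch below assumes the features are stored in row-major order (V1 to V16 forming the first row), which this page does not state; the helper names `to_grid` and `render` and the synthetic record are hypothetical.

```python
def to_grid(pixels, side=16):
    """Fold a flat list of side*side binary values into a list of rows.

    Assumes row-major ordering, which is an assumption about the data layout.
    """
    if len(pixels) != side * side:
        raise ValueError(f"expected {side * side} values, got {len(pixels)}")
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


def render(grid):
    """Render a binary grid as text, '#' for 1 and '.' for 0."""
    return "\n".join("".join("#" if v else "." for v in row) for row in grid)


# Synthetic record (not from the dataset): a vertical bar in columns 7 and 8.
record = [1 if c in (7, 8) else 0 for r in range(16) for c in range(16)]
grid = to_grid(record)
print(render(grid))
```

If a rendered digit from the real data appears transposed, the ordering assumption is the first thing to revisit.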

107 properties

1593
Number of instances (rows) of the dataset.
257
Number of attributes (columns) of the dataset.
10
Number of distinct values of the target attribute (if it is nominal).
0
Number of missing values in the dataset.
0
Number of instances with at least one value missing.
256
Number of numeric attributes.
1
Number of nominal attributes.
0.93
Average class difference between consecutive instances.
0.85
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.3
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.66
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.85
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.3
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.66
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.85
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.3
Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
0.66
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
3.32
Entropy of the target attribute values.
0.68
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump
0.8
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump
0.11
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump
0.16
Number of attributes divided by the number of instances.
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.
0.86
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001
0.28
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001
0.69
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001
0.86
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001
0.28
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001
0.69
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001
0.86
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001
0.28
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001
0.69
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001
10.17
Percentage of instances belonging to the most frequent class.
162
Number of instances belonging to the most frequent class.
Maximum entropy among attributes.
19.62
Maximum kurtosis among attributes of the numeric type.
0.65
Maximum of means among attributes of the numeric type.
Maximum mutual information between the nominal attributes and the target attribute.
10
The maximum number of distinct values among attributes of the nominal type.
4.65
Maximum skewness among attributes of the numeric type.
0.5
Maximum standard deviation of attributes of the numeric type.
Average entropy of the attributes.
-0.96
Mean kurtosis among attributes of the numeric type.
0.33
Mean of means among attributes of the numeric type.
Average mutual information between the nominal attributes and the target attribute.
An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
10
Average number of distinct values among the attributes of the nominal type.
0.81
Mean skewness among attributes of the numeric type.
0.46
Mean standard deviation of attributes of the numeric type.
Minimal entropy among attributes.
-2
Minimum kurtosis among attributes of the numeric type.
0.04
Minimum of means among attributes of the numeric type.
Minimal mutual information between the nominal attributes and the target attribute.
10
The minimal number of distinct values among attributes of the nominal type.
-0.61
Minimum skewness among attributes of the numeric type.
0.2
Minimum standard deviation of attributes of the numeric type.
9.73
Percentage of instances belonging to the least frequent class.
155
Number of instances belonging to the least frequent class.
0.99
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes
0.15
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes
0.84
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes
0
Number of binary attributes.
0
Percentage of binary attributes.
0
Percentage of instances having missing values.
0
Percentage of missing values.
99.61
Percentage of numeric attributes.
0.39
Percentage of nominal attributes.
First quartile of entropy among attributes.
-1.79
First quartile of kurtosis among attributes of the numeric type.
0.26
First quartile of means among attributes of the numeric type.
First quartile of mutual information between the nominal attributes and the target attribute.
0.45
First quartile of skewness among attributes of the numeric type.
0.44
First quartile of standard deviation of attributes of the numeric type.
Second quartile (Median) of entropy among attributes.
-1.44
Second quartile (Median) of kurtosis among attributes of the numeric type.
0.32
Second quartile (Median) of means among attributes of the numeric type.
Second quartile (Median) of mutual information between the nominal attributes and the target attribute.
0.75
Second quartile (Median) of skewness among attributes of the numeric type.
0.47
Second quartile (Median) of standard deviation of attributes of the numeric type.
Third quartile of entropy among attributes.
-0.79
Third quartile of kurtosis among attributes of the numeric type.
0.39
Third quartile of means among attributes of the numeric type.
Third quartile of mutual information between the nominal attributes and the target attribute.
1.1
Third quartile of skewness among attributes of the numeric type.
0.49
Third quartile of standard deviation of attributes of the numeric type.
0.88
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1
0.34
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1
0.63
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1
0.88
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2
0.34
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2
0.63
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2
0.88
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3
0.34
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3
0.63
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3
0.8
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
0.35
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
0.61
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
0.8
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
0.35
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
0.61
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
0.8
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
0.35
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
0.61
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
0
Standard deviation of the number of distinct values among attributes of the nominal type.
0.94
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk
0.11
Error rate achieved by the landmarker weka.classifiers.lazy.IBk
0.87
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk
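
Several of the properties above are simple functions of the counts reported on this page, and the numeric-attribute statistics behave as expected for 0/1 variables. The sketch below recomputes a few of them as a sanity check. It uses only the counts listed above (1593 instances, 257 attributes, 256 numeric, 10 classes, 162 and 155 instances in the largest and smallest class). The `bernoulli_moments` helper is hypothetical and uses population formulas for a 0/1 variable with mean p; the estimators actually used to produce the values on this page are not stated here and may differ slightly.

```python
import math

n_instances = 1593
n_attributes = 257
n_numeric = 256
n_classes = 10
majority, minority = 162, 155

dimensionality = n_attributes / n_instances        # attributes per instance
pct_numeric = 100 * n_numeric / n_attributes       # share of numeric attributes
pct_nominal = 100 * (n_attributes - n_numeric) / n_attributes
pct_majority = 100 * majority / n_instances        # most frequent class
pct_minority = 100 * minority / n_instances        # least frequent class
entropy_bound = math.log2(n_classes)               # upper bound on class entropy


def bernoulli_moments(p):
    """Population std, skewness and excess kurtosis of a 0/1 variable with mean p."""
    var = p * (1 - p)
    std = math.sqrt(var)
    skew = (1 - 2 * p) / std
    excess_kurtosis = (1 - 6 * var) / var
    return std, skew, excess_kurtosis


print(round(dimensionality, 2), round(pct_numeric, 2), round(pct_nominal, 2))
print(round(pct_majority, 2), round(pct_minority, 2), round(entropy_bound, 2))
print(bernoulli_moments(0.5))
```

The reported class entropy of 3.32 sits at the log2(10) bound because the ten classes are almost perfectly balanced. For a pixel that is on half of the time (p = 0.5) the helper gives a standard deviation of 0.5 and an excess kurtosis of -2, which line up with the maximum standard deviation and minimum kurtosis listed above.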

11 tasks

0 runs - estimation_procedure: Test on Training Data - target_feature: Class
0 runs - estimation_procedure: 20% Holdout (Ordered) - target_feature: Class
0 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: Class
0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - target_feature: Class
0 runs - estimation_procedure: Leave one out - target_feature: Class
0 runs - estimation_procedure: 33% Holdout set - target_feature: Class
0 runs - estimation_procedure: 5 times 2-fold Crossvalidation - target_feature: Class
0 runs - estimation_procedure: 10% Holdout set - target_feature: Class
0 runs - estimation_procedure: 10-fold Learning Curve - target_feature: Class
0 runs - estimation_procedure: 10 times 10-fold Learning Curve - target_feature: Class
0 runs - estimation_procedure: Interleaved Test then Train - target_feature: Class
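
The "5 times 2-fold Crossvalidation" procedure listed above, combined with the blind 50% validation split suggested in the dataset description, could be set up roughly as follows. This is one possible reading of that protocol, not the authors' exact procedure: half of the rows are set aside as a blind validation set, and five repetitions of 2-fold cross-validation run on the remaining half. The function names, the seeds, and the use of plain (unstratified) shuffling are all assumptions.

```python
import random


def blind_split(n_rows, seed=0):
    """Shuffle row indices and split them into a tuning half and a blind validation half."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    half = n_rows // 2
    return indices[:half], indices[half:]


def five_by_two(indices, seed=0):
    """Yield (train, test) index pairs for 5 repetitions of 2-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(5):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        first, second = shuffled[:half], shuffled[half:]
        yield first, second   # train on the first half, test on the second
        yield second, first   # then swap the roles


tune, validation = blind_split(1593)
folds = list(five_by_two(tune))
print(len(tune), len(validation), len(folds))
```

With 1593 rows this gives 796 tuning rows and 797 blind validation rows. Because each writer contributed several digits, a split grouped by writer might be preferable, but writer identifiers are not among the features listed on this page.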