Data
phoneme

Status: active; Format: ARFF; Publicly available; Visibility: public; Uploaded 25-05-2015 by Rafael G. Mantovani
  • study_14 study_1 study_668 study_1212 study_1629 study_2436 study_3029 study_4051 study_6416 study_8366 study_360 study_504 study_1011 study_7260 study_8366 study_10965 study_11643 study_12215 study_12831 study_1081 study_1651 study_2079 study_2102 study_6622 study_8366 study_13073 study_8366 study_12215 study_4676 study_7452 study_8366 study_382 study_1055 study_2751 study_3218 study_4248 study_6824 study_6901 study_8366 study_12466 study_5035 study_7634 study_10167 study_10889 study_1876 study_2681 study_4714 study_10167 study_12660 study_13025 study_122 study_4067 study_6474 study_122 study_734 study_1998 study_2728 study_10840 study_11067 study_734 study_3811 study_12622 study_12831 study_12848 study_1696 study_2229 study_2729 study_2888 study_3179 study_5407 study_7330 study_10182 study_12849 study_13026
Author:
Source: KEEL
Please cite:

* Title: Phoneme dataset
* Abstract: The aim of this dataset is to distinguish between nasal (class 0) and oral sounds (class 1). The class distribution is 3,818 samples in class 0 and 1,586 samples in class 1. The phonemes are transcribed as follows: sh as in she, dcl as in dark, iy as the vowel in she, aa as the vowel in dark, and ao as the first vowel in water.
* Attributes information:

@relation phoneme
@attribute Aa real [-1.7, 4.107]
@attribute Ao real [-1.327, 4.378]
@attribute Dcl real [-1.823, 3.199]
@attribute Iy real [-1.581, 2.826]
@attribute Sh real [-1.284, 2.719]
@attribute Class {0, 1}
@inputs Aa, Ao, Dcl, Iy, Sh
@outputs Class
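The attribute declarations in the description can be read mechanically. The following is a minimal sketch, assuming the header text is exactly the KEEL-style block quoted in the description; `parse_keel_header` is a hypothetical helper written for this illustration, not part of any library, and it only handles the `real [min, max]` and `{a, b}` forms that appear here.

```python
import re

# Header lines copied from the dataset description (KEEL-style ARFF header).
HEADER = """\
@relation phoneme
@attribute Aa real [-1.7, 4.107]
@attribute Ao real [-1.327, 4.378]
@attribute Dcl real [-1.823, 3.199]
@attribute Iy real [-1.581, 2.826]
@attribute Sh real [-1.284, 2.719]
@attribute Class {0, 1}
@inputs Aa, Ao, Dcl, Iy, Sh
@outputs Class
"""

# Matches "@attribute <name> real [<min>, <max>]".
REAL_ATTR = re.compile(
    r"@attribute\s+(\w+)\s+real\s*\[\s*([-\d.]+)\s*,\s*([-\d.]+)\s*\]"
)
# Matches "@attribute <name> {v1, v2, ...}".
NOMINAL_ATTR = re.compile(r"@attribute\s+(\w+)\s*\{([^}]*)\}")


def parse_keel_header(text):
    """Hypothetical helper: return (ranges, nominal, inputs, outputs)."""
    ranges, nominal, inputs, outputs = {}, {}, [], []
    for line in text.splitlines():
        line = line.strip()
        if m := REAL_ATTR.match(line):
            ranges[m.group(1)] = (float(m.group(2)), float(m.group(3)))
        elif m := NOMINAL_ATTR.match(line):
            nominal[m.group(1)] = [v.strip() for v in m.group(2).split(",")]
        elif line.startswith("@inputs"):
            inputs = [n.strip() for n in line[len("@inputs"):].split(",")]
        elif line.startswith("@outputs"):
            outputs = [n.strip() for n in line[len("@outputs"):].split(",")]
    return ranges, nominal, inputs, outputs


ranges, nominal, inputs, outputs = parse_keel_header(HEADER)
for name in inputs:
    low, high = ranges[name]
    print(f"{name}: [{low}, {high}]")
print("target:", outputs[0], "with labels", nominal[outputs[0]])
```

The feature list on this page names the numeric columns V1 to V5. Whether they follow the same order as Aa, Ao, Dcl, Iy, Sh is not stated here, so any such mapping should be treated as an assumption.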

6 features

Class (target): nominal, 2 unique values, 0 missing
V1: numeric, 5336 unique values, 0 missing
V2: numeric, 5312 unique values, 0 missing
V3: numeric, 5308 unique values, 0 missing
V4: numeric, 5336 unique values, 0 missing
V5: numeric, 4499 unique values, 0 missing

107 properties

Number of instances (rows) of the dataset: 5404
Number of attributes (columns) of the dataset: 6
Number of distinct values of the target attribute (if it is nominal): 2
Number of missing values in the dataset: 0
Number of instances with at least one value missing: 0
Number of numeric attributes: 5
Number of nominal attributes: 1
Third quartile of entropy among attributes: (no value reported)
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: 0.16
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes); equals ClassEntropy divided by MeanMutualInformation: (no value reported)
The maximum number of distinct values among attributes of the nominal type: 2
Minimum skewness among attributes of the numeric type: 0.21
Percentage of instances having missing values: 0
Third quartile of kurtosis among attributes of the numeric type: 1.66
Average class difference between consecutive instances: 0.59
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: 0.62
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001: 0.87
Maximum skewness among attributes of the numeric type: 1.48
Minimum standard deviation of attributes of the numeric type: 1
Percentage of missing values: 0
Third quartile of means among attributes of the numeric type: 0
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.86
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: 0.81
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001: 0.17
Maximum standard deviation of attributes of the numeric type: 1
Percentage of instances belonging to the least frequent class: 29.35
Percentage of numeric attributes: 83.33
Third quartile of mutual information between the nominal attributes and the target attribute: (no value reported)
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.18
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: 0.16
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001: 0.59
Average entropy of the attributes: (no value reported)
Number of instances belonging to the least frequent class: 1586
Percentage of nominal attributes: 16.67
Third quartile of skewness among attributes of the numeric type: 1.36
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.56
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2: 0.62
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001: 0.87
Mean kurtosis among attributes of the numeric type: 0.34
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes: 0.82
First quartile of entropy among attributes: (no value reported)
Third quartile of standard deviation of attributes of the numeric type: 1
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.86
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: 0.81
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001: 0.17
Mean of means among attributes of the numeric type: -0
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes: 0.24
First quartile of kurtosis among attributes of the numeric type: -0.66
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1: 0.88
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.18
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: 0.16
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001: 0.59
Average mutual information between the nominal attributes and the target attribute: (no value reported)
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes: 0.46
First quartile of means among attributes of the numeric type: -0
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1: 0.17
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.56
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3: 0.62
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001: 0.87
An estimate of the amount of irrelevant information in the attributes regarding the class; equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation: (no value reported)
Number of binary attributes: 1
First quartile of mutual information between the nominal attributes and the target attribute: (no value reported)
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1: 0.59
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.86
Standard deviation of the number of distinct values among attributes of the nominal type: 0
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001: 0.17
Average number of distinct values among the attributes of the nominal type: 2
First quartile of skewness among attributes of the numeric type: 0.34
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2: 0.88
Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.18
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk: 0.84
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001: 0.59
Mean skewness among attributes of the numeric type: 0.78
First quartile of standard deviation of attributes of the numeric type: 1
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2: 0.17
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W: 0.56
Error rate achieved by the landmarker weka.classifiers.lazy.IBk: 0.13
Percentage of instances belonging to the most frequent class: 70.65
Mean standard deviation of attributes of the numeric type: 1
Second quartile (Median) of entropy among attributes: (no value reported)
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2: 0.59
Entropy of the target attribute values: 0.87
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk: 0.69
Number of instances belonging to the most frequent class: 3818
Minimal entropy among attributes: (no value reported)
Second quartile (Median) of kurtosis among attributes of the numeric type: -0.31
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3: 0.88
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump: 0.74
Maximum entropy among attributes: (no value reported)
Minimum kurtosis among attributes of the numeric type: -0.86
Second quartile (Median) of means among attributes of the numeric type: 0
Second quartile (Median) of mutual information between the nominal attributes and the target attribute: (no value reported)
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3: 0.17
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump: 0.25
Maximum kurtosis among attributes of the numeric type: 1.77
Minimum of means among attributes of the numeric type: -0
Second quartile (Median) of skewness among attributes of the numeric type: 0.48
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3: 0.59
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump: 0.45
Maximum of means among attributes of the numeric type: 0
Minimal mutual information between the nominal attributes and the target attribute: (no value reported)
Second quartile (Median) of standard deviation of attributes of the numeric type: 1
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1: 0.81
Number of attributes divided by the number of instances: 0
Maximum mutual information between the nominal attributes and the target attribute: (no value reported)
The minimal number of distinct values among attributes of the nominal type: 2
Percentage of binary attributes: 16.67
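Several of the properties above follow directly from the two class counts, and two others are defined by short formulas. The sketch below recomputes the class percentages and the target entropy from the reported counts (3818 and 1586), and writes the two ratio definitions as functions. It assumes a base-2 logarithm, which is the choice that reproduces the listed entropy of 0.87. The function names are illustrative, and the inputs passed to the two ratio functions are made-up numbers, since this page lists no entropy or mutual-information values for the attributes.

```python
import math

# Class counts as reported in the properties: 3818 (class 0) and 1586 (class 1).
counts = {"0": 3818, "1": 1586}


def class_percentages(counts):
    """Percentage of instances per class label."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def class_entropy(counts):
    """Shannon entropy of the class distribution, in bits (log base 2 assumed)."""
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log2(n / total) for n in counts.values() if n > 0
    )


def equivalent_number_of_attributes(class_ent, mean_mutual_info):
    """ClassEntropy divided by MeanMutualInformation, as described above."""
    return class_ent / mean_mutual_info


def noise_to_signal_ratio(mean_attr_entropy, mean_mutual_info):
    """(MeanAttributeEntropy - MeanMutualInformation) / MeanMutualInformation."""
    return (mean_attr_entropy - mean_mutual_info) / mean_mutual_info


pct = class_percentages(counts)
print(round(pct["0"], 2), round(pct["1"], 2))   # majority and minority share
print(round(class_entropy(counts), 2))          # entropy of the target, in bits

# Hypothetical inputs, only to show how the two ratios are formed.
print(equivalent_number_of_attributes(1.0, 0.25))
print(noise_to_signal_ratio(1.0, 0.25))
```

The rounded results agree with the listed 70.65, 29.35, and 0.87. The attributes-to-instances ratio, 6 / 5404, is about 0.0011, which is why the page shows it as 0 at two decimal places.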

11 tasks

10 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: Class
0 runs - estimation_procedure: 10% Holdout set - target_feature: Class
0 runs - estimation_procedure: Test on Training Data - target_feature: Class
0 runs - estimation_procedure: Leave one out - target_feature: Class
0 runs - estimation_procedure: 33% Holdout set - target_feature: Class
0 runs - estimation_procedure: 5 times 2-fold Crossvalidation - target_feature: Class
0 runs - estimation_procedure: 20% Holdout (Ordered) - target_feature: Class
0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - target_feature: Class
0 runs - estimation_procedure: 10-fold Learning Curve - target_feature: Class
0 runs - estimation_procedure: 10 times 10-fold Learning Curve - target_feature: Class
0 runs - estimation_procedure: Interleaved Test then Train - target_feature: Class
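For readers unfamiliar with the estimation procedures named in the task list, here is a minimal sketch of how a 10-fold cross-validation split over the 5404 instances could be generated. It is illustrative only: the tasks on this page come with their own fixed splits, which this code does not reproduce, and `k_fold_indices` is a hypothetical helper using a plain shuffled split without stratification.

```python
import random


def k_fold_indices(n_instances, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Each instance appears in exactly one test fold across the 10 iterations.
for fold_number, (train_idx, test_idx) in enumerate(k_fold_indices(5404), start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

A "10 times 10-fold" procedure would repeat this with ten different seeds, and a holdout procedure corresponds to using a single train/test pair of the appropriate size.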