{ "data_id": "14", "name": "mushroom", "exact_name": "mushroom", "version": 1, "version_label": "1", "description": "**Author**: \n**Source**: Unknown - \n**Please cite**: \n\n1. Title: Mushroom Database\n \n 2. Sources: \n (a) Mushroom records drawn from The Audubon Society Field Guide to North\n American Mushrooms (1981). G. H. Lincoff (Pres.), New York: Alfred\n A. Knopf\n (b) Donor: Jeff Schlimmer (Jeffrey.Schlimmer@a.gp.cs.cmu.edu)\n (c) Date: 27 April 1987\n \n 3. Past Usage:\n 1. Schlimmer,J.S. (1987). Concept Acquisition Through Representational\n Adjustment (Technical Report 87-19). Doctoral dissertation, Department\n of Information and Computer Science, University of California, Irvine.\n --- STAGGER: asymptoted to 95% classification accuracy after reviewing\n 1000 instances.\n 2. Iba,W., Wogulis,J., & Langley,P. (1988). Trading off Simplicity\n and Coverage in Incremental Concept Learning. In Proceedings of \n the 5th International Conference on Machine Learning, 73-79.\n Ann Arbor, Michigan: Morgan Kaufmann. \n -- approximately the same results with their HILLARY algorithm \n 3. In the following references a set of rules (given below) were\n learned for this data set which may serve as a point of\n comparison for other researchers.\n \n Duch W, Adamczak R, Grabczewski K (1996) Extraction of logical rules\n from training data using backpropagation networks, in: Proc. of\n The 1st Online Workshop on Soft Computing, 19-30.Aug.1996, pp. 25-30,\n available on-line at: http:\/\/www.bioele.nuee.nagoya-u.ac.jp\/wsc1\/\n \n Duch W, Adamczak R, Grabczewski K, Ishikawa M, Ueda H, Extraction of\n crisp logical rules using constrained backpropagation networks -\n comparison of two new approaches, in: Proc. of the European Symposium\n on Artificial Neural Networks (ESANN'97), Bruges, Belgium 16-18.4.1997,\n pp. 
xx-xx\n \n Wlodzislaw Duch, Department of Computer Methods, Nicholas Copernicus\n University, 87-100 Torun, Grudziadzka 5, Poland\n e-mail: duch@phys.uni.torun.pl\n WWW http:\/\/www.phys.uni.torun.pl\/kmk\/\n \n Date: Mon, 17 Feb 1997 13:47:40 +0100\n From: Wlodzislaw Duch \n Organization: Dept. of Computer Methods, UMK\n \n I have attached a file containing logical rules for mushrooms.\n It should be helpful for other people since only in the last year I\n have seen about 10 papers analyzing this dataset and obtaining quite\n complex rules. We will try to contribute other results later.\n \n With best regards, Wlodek Duch\n ________________________________________________________________\n \n Logical rules for the mushroom data sets.\n \n Logical rules given below seem to be the simplest possible for the\n mushroom dataset and therefore should be treated as benchmark results.\n \n Disjunctive rules for poisonous mushrooms, from most general\n to most specific:\n \n P_1) odor=NOT(almond.OR.anise.OR.none)\n 120 poisonous cases missed, 98.52% accuracy\n \n P_2) spore-print-color=green\n 48 cases missed, 99.41% accuracy\n \n P_3) odor=none.AND.stalk-surface-below-ring=scaly.AND.\n (stalk-color-above-ring=NOT.brown) \n 8 cases missed, 99.90% accuracy\n \n P_4) habitat=leaves.AND.cap-color=white\n 100% accuracy \n \n Rule P_4) may also be\n \n P_4') population=clustered.AND.cap_color=white\n \n These rules involve 6 attributes (out of 22). Rules for edible\n mushrooms are obtained as negation of the rules given above, for\n example the rule:\n \n odor=(almond.OR.anise.OR.none).AND.spore-print-color=NOT.green\n \n gives 48 errors, or 99.41% accuracy on the whole dataset.\n \n Several slightly more complex variations on these rules exist,\n involving other attributes, such as gill_size, gill_spacing,\n stalk_surface_above_ring, but the rules given above are the simplest\n we have found.\n \n \n 4. 
Relevant Information:\n This data set includes descriptions of hypothetical samples\n corresponding to 23 species of gilled mushrooms in the Agaricus and\n Lepiota Family (pp. 500-525). Each species is identified as\n definitely edible, definitely poisonous, or of unknown edibility and\n not recommended. This latter class was combined with the poisonous\n one. The Guide clearly states that there is no simple rule for\n determining the edibility of a mushroom; no rule like ``leaflets\n three, let it be'' for Poisonous Oak and Ivy.\n \n 5. Number of Instances: 8124\n \n 6. Number of Attributes: 22 (all nominally valued)\n \n 7. Attribute Information: (classes: edible=e, poisonous=p)\n 1. cap-shape: bell=b,conical=c,convex=x,flat=f,\n knobbed=k,sunken=s\n 2. cap-surface: fibrous=f,grooves=g,scaly=y,smooth=s\n 3. cap-color: brown=n,buff=b,cinnamon=c,gray=g,green=r,\n pink=p,purple=u,red=e,white=w,yellow=y\n 4. bruises?: bruises=t,no=f\n 5. odor: almond=a,anise=l,creosote=c,fishy=y,foul=f,\n musty=m,none=n,pungent=p,spicy=s\n 6. gill-attachment: attached=a,descending=d,free=f,notched=n\n 7. gill-spacing: close=c,crowded=w,distant=d\n 8. gill-size: broad=b,narrow=n\n 9. gill-color: black=k,brown=n,buff=b,chocolate=h,gray=g,\n green=r,orange=o,pink=p,purple=u,red=e,\n white=w,yellow=y\n 10. stalk-shape: enlarging=e,tapering=t\n 11. stalk-root: bulbous=b,club=c,cup=u,equal=e,\n rhizomorphs=z,rooted=r,missing=?\n 12. stalk-surface-above-ring: fibrous=f,scaly=y,silky=k,smooth=s\n 13. stalk-surface-below-ring: fibrous=f,scaly=y,silky=k,smooth=s\n 14. stalk-color-above-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o,\n pink=p,red=e,white=w,yellow=y\n 15. stalk-color-below-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o,\n pink=p,red=e,white=w,yellow=y\n 16. veil-type: partial=p,universal=u\n 17. veil-color: brown=n,orange=o,white=w,yellow=y\n 18. ring-number: none=n,one=o,two=t\n 19. 
ring-type: cobwebby=c,evanescent=e,flaring=f,large=l,\n none=n,pendant=p,sheathing=s,zone=z\n 20. spore-print-color: black=k,brown=n,buff=b,chocolate=h,green=r,\n orange=o,purple=u,white=w,yellow=y\n 21. population: abundant=a,clustered=c,numerous=n,\n scattered=s,several=v,solitary=y\n 22. habitat: grasses=g,leaves=l,meadows=m,paths=p,\n urban=u,waste=w,woods=d\n \n 8. Missing Attribute Values: 2480 of them (denoted by \"?\"), all for\n attribute #11.\n \n 9. Class Distribution: \n -- edible: 4208 (51.8%)\n -- poisonous: 3916 (48.2%)\n -- total: 8124 instances", "format": "ARFF", "uploader": "Jan van Rijn", "uploader_id": 1, "visibility": "public", "creator": null, "contributor": null, "date": "2014-04-06 23:21:11", "update_comment": null, "last_update": "2014-04-06 23:21:11", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/24\/dataset_24_mushroom.arff", "default_target_attribute": "class", "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "mushroom", "1. Title: Mushroom Database 2. Sources: (a) Mushroom records drawn from The Audubon Society Field Guide to North American Mushrooms (1981). G. H. Lincoff (Pres.), New York: Alfred A. Knopf (b) Donor: Jeff Schlimmer (Jeffrey.Schlimmer@a.gp.cs.cmu.edu) (c) Date: 27 April 1987 3. Past Usage: 1. Schlimmer,J.S. (1987). Concept Acquisition Through Representational Adjustment (Technical Report 87-19). 
Doctoral disseration, Department of Information and Computer Science, University of California, Irvine " ], "weight": 5 }, "qualities": { "NumberOfInstances": 8124, "NumberOfFeatures": 23, "NumberOfClasses": 2, "NumberOfMissingValues": 2480, "NumberOfInstancesWithMissingValues": 2480, "NumberOfNumericFeatures": 0, "NumberOfSymbolicFeatures": 23, "AutoCorrelation": 0.726332635725717, "CfsSubsetEval_DecisionStumpAUC": 0.9910519616800724, "CfsSubsetEval_DecisionStumpErrRate": 0.013047759724273756, "CfsSubsetEval_DecisionStumpKappa": 0.9738461616958994, "CfsSubsetEval_NaiveBayesAUC": 0.9910519616800724, "CfsSubsetEval_NaiveBayesErrRate": 0.013047759724273756, "CfsSubsetEval_NaiveBayesKappa": 0.9738461616958994, "CfsSubsetEval_kNN1NAUC": 0.9910519616800724, "CfsSubsetEval_kNN1NErrRate": 0.013047759724273756, "CfsSubsetEval_kNN1NKappa": 0.9738461616958994, "ClassEntropy": 0.9990678968724604, "DecisionStumpAUC": 0.8894935275772204, "DecisionStumpErrRate": 0.11324470704086657, "DecisionStumpKappa": 0.77457574608175, "Dimensionality": 0.002831117676021664, "EquivalentNumberOfAtts": 5.0393135801657, "J48.00001.AUC": 1, "J48.00001.ErrRate": 0, "J48.00001.Kappa": 1, "J48.0001.AUC": 1, "J48.0001.ErrRate": 0, "J48.0001.Kappa": 1, "J48.001.AUC": 1, "J48.001.ErrRate": 0, "J48.001.Kappa": 1, "MajorityClassPercentage": 51.7971442639094, "MajorityClassSize": 4208, "MaxAttributeEntropy": 3.030432883772633, "MaxKurtosisOfNumericAtts": null, "MaxMeansOfNumericAtts": null, "MaxMutualInformation": 0.906074977384, "MaxNominalAttDistinctValues": 12, "MaxSkewnessOfNumericAtts": null, "MaxStdDevOfNumericAtts": null, "MeanAttributeEntropy": 1.4092554739602103, "MeanKurtosisOfNumericAtts": null, "MeanMeansOfNumericAtts": null, "MeanMutualInformation": 0.19825475850613955, "MeanNoiseToSignalRatio": 6.108305922031972, "MeanNominalAttDistinctValues": 5.130434782608695, "MeanSkewnessOfNumericAtts": null, "MeanStdDevOfNumericAtts": null, "MinAttributeEntropy": -0, "MinKurtosisOfNumericAtts": null, 
"MinMeansOfNumericAtts": null, "MinMutualInformation": 0, "MinNominalAttDistinctValues": 1, "MinSkewnessOfNumericAtts": null, "MinStdDevOfNumericAtts": null, "MinorityClassPercentage": 48.20285573609059, "MinorityClassSize": 3916, "NaiveBayesAUC": 0.9976229672941662, "NaiveBayesErrRate": 0.04899064500246184, "NaiveBayesKappa": 0.9015972799616292, "NumberOfBinaryFeatures": 5, "PercentageOfBinaryFeatures": 21.73913043478261, "PercentageOfInstancesWithMissingValues": 30.526834071885773, "PercentageOfMissingValues": 1.3272536552993814, "PercentageOfNumericFeatures": 0, "PercentageOfSymbolicFeatures": 100, "Quartile1AttributeEntropy": 0.8286618104993447, "Quartile1KurtosisOfNumericAtts": null, "Quartile1MeansOfNumericAtts": null, "Quartile1MutualInformation": 0.034184520425602494, "Quartile1SkewnessOfNumericAtts": null, "Quartile1StdDevOfNumericAtts": null, "Quartile2AttributeEntropy": 1.467128011861462, "Quartile2KurtosisOfNumericAtts": null, "Quartile2MeansOfNumericAtts": null, "Quartile2MutualInformation": 0.174606545183155, "Quartile2SkewnessOfNumericAtts": null, "Quartile2StdDevOfNumericAtts": null, "Quartile3AttributeEntropy": 2.0533554351937426, "Quartile3KurtosisOfNumericAtts": null, "Quartile3MeansOfNumericAtts": null, "Quartile3MutualInformation": 0.27510225484918505, "Quartile3SkewnessOfNumericAtts": null, "Quartile3StdDevOfNumericAtts": null, "REPTreeDepth1AUC": 0.9999987256143267, "REPTreeDepth1ErrRate": 0.00036927621861152144, "REPTreeDepth1Kappa": 0.9992605118549308, "REPTreeDepth2AUC": 0.9999987256143267, "REPTreeDepth2ErrRate": 0.00036927621861152144, "REPTreeDepth2Kappa": 0.9992605118549308, "REPTreeDepth3AUC": 0.9999987256143267, "REPTreeDepth3ErrRate": 0.00036927621861152144, "REPTreeDepth3Kappa": 0.9992605118549308, "RandomTreeDepth1AUC": 0.9995247148288974, "RandomTreeDepth1ErrRate": 0.0004923682914820286, "RandomTreeDepth1Kappa": 0.9990140245420991, "RandomTreeDepth2AUC": 0.9995247148288974, "RandomTreeDepth2ErrRate": 0.0004923682914820286, 
"RandomTreeDepth2Kappa": 0.9990140245420991, "RandomTreeDepth3AUC": 0.9995247148288974, "RandomTreeDepth3ErrRate": 0.0004923682914820286, "RandomTreeDepth3Kappa": 0.9990140245420991, "StdvNominalAttDistinctValues": 3.1809710899501766, "kNN1NAUC": 1, "kNN1NErrRate": 0, "kNN1NKappa": 1 }, "tags": [ { "tag": "study_14", "uploader": "1" }, { "tag": "study_1", "uploader": "0" }, { "tag": "study_293", "uploader": "0" }, { "tag": "study_793", "uploader": "0" }, { "tag": "study_118", "uploader": "0" }, { "tag": "study_429", "uploader": "0" }, { "tag": "study_745", "uploader": "0" }, { "tag": "study_121", "uploader": "0" }, { "tag": "study_521", "uploader": "0" }, { "tag": "study_560", "uploader": "0" }, { "tag": "study_121", "uploader": "0" }, { "tag": "study_757", "uploader": "0" }, { "tag": "study_121", "uploader": "0" }, { "tag": "study_106", "uploader": "0" }, { "tag": "study_793", "uploader": "0" }, { "tag": "study_79", "uploader": "0" }, { "tag": "study_105", "uploader": "0" }, { "tag": "study_118", "uploader": "0" } ], "features": [ { "name": "class", "index": "22", "type": "nominal", "distinct": "2", "missing": "0", "target": "1", "distr": [ [ "e", "p" ], [ [ "4208", "0" ], [ "0", "3916" ] ] ] }, { "name": "stalk-surface-above-ring", "index": "11", "type": "nominal", "distinct": "4", "missing": "0", "distr": [ [ "f", "k", "s", "y" ], [ [ "408", "144" ], [ "144", "2228" ], [ "3640", "1536" ], [ "16", "8" ] ] ] }, { "name": "habitat", "index": "21", "type": "nominal", "distinct": "7", "missing": "0", "distr": [ [ "d", "g", "l", "m", "p", "u", "w" ], [ [ "1880", "1268" ], [ "1408", "740" ], [ "240", "592" ], [ "256", "36" ], [ "136", "1008" ], [ "96", "272" ], [ "192", "0" ] ] ] }, { "name": "population", "index": "20", "type": "nominal", "distinct": "6", "missing": "0", "distr": [ [ "a", "c", "n", "s", "v", "y" ], [ [ "384", "0" ], [ "288", "52" ], [ "400", "0" ], [ "880", "368" ], [ "1192", "2848" ], [ "1064", "648" ] ] ] }, { "name": "spore-print-color", "index": 
"19", "type": "nominal", "distinct": "9", "missing": "0", "distr": [ [ "b", "h", "k", "n", "o", "r", "u", "w", "y" ], [ [ "48", "0" ], [ "48", "1584" ], [ "1648", "224" ], [ "1744", "224" ], [ "48", "0" ], [ "0", "72" ], [ "48", "0" ], [ "576", "1812" ], [ "48", "0" ] ] ] }, { "name": "ring-type", "index": "18", "type": "nominal", "distinct": "5", "missing": "0", "distr": [ [ "c", "e", "f", "l", "n", "p", "s", "z" ], [ [ "0", "0" ], [ "1008", "1768" ], [ "48", "0" ], [ "0", "1296" ], [ "0", "36" ], [ "3152", "816" ], [ "0", "0" ], [ "0", "0" ] ] ] }, { "name": "ring-number", "index": "17", "type": "nominal", "distinct": "3", "missing": "0", "distr": [ [ "n", "o", "t" ], [ [ "0", "36" ], [ "3680", "3808" ], [ "528", "72" ] ] ] }, { "name": "veil-color", "index": "16", "type": "nominal", "distinct": "4", "missing": "0", "distr": [ [ "n", "o", "w", "y" ], [ [ "96", "0" ], [ "96", "0" ], [ "4016", "3908" ], [ "0", "8" ] ] ] }, { "name": "veil-type", "index": "15", "type": "nominal", "distinct": "1", "missing": "0", "distr": [ [ "p", "u" ], [ [ "4208", "3916" ], [ "0", "0" ] ] ] }, { "name": "stalk-color-below-ring", "index": "14", "type": "nominal", "distinct": "9", "missing": "0", "distr": [ [ "b", "c", "e", "g", "n", "o", "p", "w", "y" ], [ [ "0", "432" ], [ "0", "36" ], [ "96", "0" ], [ "576", "0" ], [ "64", "448" ], [ "192", "0" ], [ "576", "1296" ], [ "2704", "1680" ], [ "0", "24" ] ] ] }, { "name": "stalk-color-above-ring", "index": "13", "type": "nominal", "distinct": "9", "missing": "0", "distr": [ [ "b", "c", "e", "g", "n", "o", "p", "w", "y" ], [ [ "0", "432" ], [ "0", "36" ], [ "96", "0" ], [ "576", "0" ], [ "16", "432" ], [ "192", "0" ], [ "576", "1296" ], [ "2752", "1712" ], [ "0", "8" ] ] ] }, { "name": "stalk-surface-below-ring", "index": "12", "type": "nominal", "distinct": "4", "missing": "0", "distr": [ [ "f", "k", "s", "y" ], [ [ "456", "144" ], [ "144", "2160" ], [ "3400", "1536" ], [ "208", "76" ] ] ] }, { "name": "cap-shape", "index": "0", "type": 
"nominal", "distinct": "6", "missing": "0", "distr": [ [ "b", "c", "f", "k", "s", "x" ], [ [ "404", "48" ], [ "0", "4" ], [ "1596", "1556" ], [ "228", "600" ], [ "32", "0" ], [ "1948", "1708" ] ] ] }, { "name": "stalk-root", "index": "10", "type": "nominal", "distinct": "4", "missing": "2480", "distr": [ [ "b", "c", "e", "r", "u", "z" ], [ [ "1920", "1856" ], [ "512", "44" ], [ "864", "256" ], [ "192", "0" ], [ "0", "0" ], [ "0", "0" ] ] ] }, { "name": "stalk-shape", "index": "9", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "e", "t" ], [ [ "1616", "1900" ], [ "2592", "2016" ] ] ] }, { "name": "gill-color", "index": "8", "type": "nominal", "distinct": "12", "missing": "0", "distr": [ [ "b", "e", "g", "h", "k", "n", "o", "p", "r", "u", "w", "y" ], [ [ "0", "1728" ], [ "96", "0" ], [ "248", "504" ], [ "204", "528" ], [ "344", "64" ], [ "936", "112" ], [ "64", "0" ], [ "852", "640" ], [ "0", "24" ], [ "444", "48" ], [ "956", "246" ], [ "64", "22" ] ] ] }, { "name": "gill-size", "index": "7", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "b", "n" ], [ [ "3920", "1692" ], [ "288", "2224" ] ] ] }, { "name": "gill-spacing", "index": "6", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "c", "d", "w" ], [ [ "3008", "3804" ], [ "0", "0" ], [ "1200", "112" ] ] ] }, { "name": "gill-attachment", "index": "5", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "a", "d", "f", "n" ], [ [ "192", "18" ], [ "0", "0" ], [ "4016", "3898" ], [ "0", "0" ] ] ] }, { "name": "odor", "index": "4", "type": "nominal", "distinct": "9", "missing": "0", "distr": [ [ "a", "c", "f", "l", "m", "n", "p", "s", "y" ], [ [ "400", "0" ], [ "0", "192" ], [ "0", "2160" ], [ "400", "0" ], [ "0", "36" ], [ "3408", "120" ], [ "0", "256" ], [ "0", "576" ], [ "0", "576" ] ] ] }, { "name": "bruises%3F", "index": "3", "type": "nominal", "distinct": "2", "missing": "0", "distr": [ [ "f", "t" ], [ [ "1456", "3292" ], [ "2752", "624" ] ] ] }, { 
"name": "cap-color", "index": "2", "type": "nominal", "distinct": "10", "missing": "0", "distr": [ [ "b", "c", "e", "g", "n", "p", "r", "u", "w", "y" ], [ [ "48", "120" ], [ "32", "12" ], [ "624", "876" ], [ "1032", "808" ], [ "1264", "1020" ], [ "56", "88" ], [ "16", "0" ], [ "16", "0" ], [ "720", "320" ], [ "400", "672" ] ] ] }, { "name": "cap-surface", "index": "1", "type": "nominal", "distinct": "4", "missing": "0", "distr": [ [ "f", "g", "s", "y" ], [ [ "1560", "760" ], [ "0", "4" ], [ "1144", "1412" ], [ "1504", "1740" ] ] ] } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 2, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 2 }