{ "data_id": "86", "name": "wdbc", "exact_name": "wdbc", "version": 1, "version_label": null, "description": "**Author**: William H. Wolberg, W. Nick Street, Olvi L. Mangasarian \n**Source**: UCI \n**Please cite**: \n\n* Title: \n\nBreast Cancer Wisconsin (Diagnostic) Data Set (WDBC)\n\n* Abstract: \n\nDiagnostic Wisconsin Breast Cancer Database\n\n* Source:\n\nCreators: \n\n1. Dr. William H. Wolberg, General Surgery Dept. \nUniversity of Wisconsin, Clinical Sciences Center \nMadison, WI 53792 \nwolberg '@' eagle.surgery.wisc.edu \n\n2. W. Nick Street, Computer Sciences Dept. \nUniversity of Wisconsin, 1210 West Dayton St., Madison, WI 53706 \nstreet '@' cs.wisc.edu 608-262-6619 \n\n3. Olvi L. Mangasarian, Computer Sciences Dept. \nUniversity of Wisconsin, 1210 West Dayton St., Madison, WI 53706 \nolvi '@' cs.wisc.edu \n\nDonor: Nick Street\n\n* Data Set Information:\n\nFeatures are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at [Web Link] \n\nSeparating plane described above was obtained using Multisurface Method-Tree (MSM-T) [K. P. Bennett, \"Decision Tree Construction Via Linear Programming.\" Proceedings of the 4th Midwest Artificial Intelligence and Cognitive Science Society, pp. 97-101, 1992], a classification method which uses linear programming to construct a decision tree. Relevant features were selected using an exhaustive search in the space of 1-4 features and 1-3 separating planes. \n\nThe actual linear program used to obtain the separating plane in the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: \"Robust Linear Programming Discrimination of Two Linearly Inseparable Sets\", Optimization Methods and Software 1, 1992, 23-34]. 
\n\nThis database is also available through the UW CS ftp server: \nftp ftp.cs.wisc.edu \ncd math-prog\/cpo-dataset\/machine-learn\/WDBC\/\n\n\n* Attribute Information:\n\n1) ID number \n2) Diagnosis (M = malignant, B = benign) \n3-32) \n\nTen real-valued features are computed for each cell nucleus: \n\na) radius (mean of distances from center to points on the perimeter) \nb) texture (standard deviation of gray-scale values) \nc) perimeter \nd) area \ne) smoothness (local variation in radius lengths) \nf) compactness (perimeter^2 \/ area - 1.0) \ng) concavity (severity of concave portions of the contour) \nh) concave points (number of concave portions of the contour) \ni) symmetry \nj) fractal dimension (\"coastline approximation\" - 1)\n\n\n* Relevant Papers:\n\nW.N. Street, W.H. Wolberg and O.L. Mangasarian. Nuclear feature extraction for breast tumor diagnosis. IS&T\/SPIE 1993 International Symposium on Electronic Imaging: Science and Technology, volume 1905, pages 861-870, San Jose, CA, 1993. \n\nO.L. Mangasarian, W.N. Street and W.H. Wolberg. Breast cancer diagnosis and prognosis via linear programming. Operations Research, 43(4), pages 570-577, July-August 1995. \n\nW.H. Wolberg, W.N. Street, and O.L. Mangasarian. Machine learning techniques to diagnose breast cancer from fine-needle aspirates. Cancer Letters 77 (1994) 163-171. \n\nW.H. Wolberg, W.N. Street, and O.L. Mangasarian. Image analysis and machine learning applied to breast cancer diagnosis and prognosis. Analytical and Quantitative Cytology and Histology, Vol. 17 No. 2, pages 77-87, April 1995. \n\nW.H. Wolberg, W.N. Street, D.M. Heisey, and O.L. Mangasarian. Computerized breast cancer diagnosis and prognosis from fine needle aspirates. Archives of Surgery 1995;130:511-516. \n\nW.H. Wolberg, W.N. Street, D.M. Heisey, and O.L. Mangasarian. Computer-derived nuclear features distinguish malignant from benign breast cytology. Human Pathology, 26:792--796, 1995. 
\n\n", "format": "ARFF", "uploader": "Rafael Gomes Mantovani", "uploader_id": 64, "visibility": "public", "creator": null, "contributor": null, "date": "2015-05-26 16:24:07", "update_comment": null, "last_update": "2015-11-09 20:15:56", "licence": "Public", "status": "active", "error_message": null, "url": "https:\/\/www.openml.org\/data\/download\/1592318\/phpAmSP4g", "default_target_attribute": "Class", "row_id_attribute": null, "ignore_attribute": null, "runs": 0, "suggest": { "input": [ "wdbc", "* Title: Breast Cancer Wisconsin (Diagnostic) Data Set (WDBC) * Abstract: Diagnostic Wisconsin Breast Cancer Database * Source: Creators: 1. Dr. William H. Wolberg, General Surgery Dept. University of Wisconsin, Clinical Sciences Center Madison, WI 53792 wolberg '@' eagle.surgery.wisc.edu 2. W. Nick Street, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 street '@' cs.wisc.edu 608-262-6619 3. Olvi L. Mangasarian, Computer Sciences Dept. University of Wisc " ], "weight": 5 }, "qualities": { "NumberOfInstances": 569, "NumberOfFeatures": 31, "NumberOfClasses": 2, "NumberOfMissingValues": 0, "NumberOfInstancesWithMissingValues": 0, "NumberOfNumericFeatures": 30, "NumberOfSymbolicFeatures": 1, "AutoCorrelation": 0.625, "CfsSubsetEval_DecisionStumpAUC": 0.9550763701707098, "CfsSubsetEval_DecisionStumpErrRate": 0.05975395430579965, "CfsSubsetEval_DecisionStumpKappa": 0.8714534412417441, "CfsSubsetEval_NaiveBayesAUC": 0.9550763701707098, "CfsSubsetEval_NaiveBayesErrRate": 0.05975395430579965, "CfsSubsetEval_NaiveBayesKappa": 0.8714534412417441, "CfsSubsetEval_kNN1NAUC": 0.9550763701707098, "CfsSubsetEval_kNN1NErrRate": 0.05975395430579965, "CfsSubsetEval_kNN1NKappa": 0.8714534412417441, "ClassEntropy": 0.9526351224018599, "DecisionStumpAUC": 0.8721592410549125, "DecisionStumpErrRate": 0.08963093145869948, "DecisionStumpKappa": 0.8012438100586974, "Dimensionality": 0.054481546572934976, "EquivalentNumberOfAtts": null, "J48.00001.AUC": 
0.9370870989905398, "J48.00001.ErrRate": 0.056239015817223195, "J48.00001.Kappa": 0.8792476854922141, "J48.0001.AUC": 0.9370870989905398, "J48.0001.ErrRate": 0.056239015817223195, "J48.0001.Kappa": 0.8792476854922141, "J48.001.AUC": 0.9370870989905398, "J48.001.ErrRate": 0.056239015817223195, "J48.001.Kappa": 0.8792476854922141, "MajorityClassPercentage": 62.741652021089635, "MajorityClassSize": 357, "MaxAttributeEntropy": null, "MaxKurtosisOfNumericAtts": 49.20907650724138, "MaxMeansOfNumericAtts": 880.5831282952547, "MaxMutualInformation": null, "MaxNominalAttDistinctValues": 2, "MaxSkewnessOfNumericAtts": 5.447186284898407, "MaxStdDevOfNumericAtts": 569.3569926699494, "MeanAttributeEntropy": null, "MeanKurtosisOfNumericAtts": 7.8147348251102615, "MeanMeansOfNumericAtts": 61.89071233954305, "MeanMutualInformation": null, "MeanNoiseToSignalRatio": null, "MeanNominalAttDistinctValues": 2, "MeanSkewnessOfNumericAtts": 1.740406628520751, "MeanStdDevOfNumericAtts": 34.904718603211656, "MinAttributeEntropy": null, "MinKurtosisOfNumericAtts": -0.5355351225188612, "MinMeansOfNumericAtts": 0.0037949033391915642, "MinMutualInformation": null, "MinNominalAttDistinctValues": 2, "MinSkewnessOfNumericAtts": 0.4154259962824675, "MinStdDevOfNumericAtts": 0.002646071523977847, "MinorityClassPercentage": 37.258347978910365, "MinorityClassSize": 212, "NaiveBayesAUC": 0.9797278811381966, "NaiveBayesErrRate": 0.07205623901581722, "NaiveBayesKappa": 0.8451371786276162, "NumberOfBinaryFeatures": 1, "PercentageOfBinaryFeatures": 3.225806451612903, "PercentageOfInstancesWithMissingValues": 0, "PercentageOfMissingValues": 0, "PercentageOfNumericFeatures": 96.7741935483871, "PercentageOfSymbolicFeatures": 3.225806451612903, "Quartile1AttributeEntropy": null, "Quartile1KurtosisOfNumericAtts": 0.9651825547526209, "Quartile1MeansOfNumericAtts": 0.05932799384885764, "Quartile1MutualInformation": null, "Quartile1SkewnessOfNumericAtts": 0.9785827119630317, "Quartile1StdDevOfNumericAtts": 
0.018022995343089827, "Quartile2AttributeEntropy": null, "Quartile2KurtosisOfNumericAtts": 3.022590146044739, "Quartile2MeansOfNumericAtts": 0.21771345342706502, "Quartile2MutualInformation": null, "Quartile2SkewnessOfNumericAtts": 1.4175537695584106, "Quartile2StdDevOfNumericAtts": 0.072726074660982, "Quartile3AttributeEntropy": null, "Quartile3KurtosisOfNumericAtts": 5.985908976234704, "Quartile3MeansOfNumericAtts": 17.024304481546572, "Quartile3MutualInformation": null, "Quartile3SkewnessOfNumericAtts": 1.9754487571153525, "Quartile3StdDevOfNumericAtts": 4.434087221242544, "REPTreeDepth1AUC": 0.9651709740499974, "REPTreeDepth1ErrRate": 0.05272407732864675, "REPTreeDepth1Kappa": 0.8870120070427198, "REPTreeDepth2AUC": 0.9651709740499974, "REPTreeDepth2ErrRate": 0.05272407732864675, "REPTreeDepth2Kappa": 0.8870120070427198, "REPTreeDepth3AUC": 0.9651709740499974, "REPTreeDepth3ErrRate": 0.05272407732864675, "REPTreeDepth3Kappa": 0.8870120070427198, "RandomTreeDepth1AUC": 0.9346229057660799, "RandomTreeDepth1ErrRate": 0.05799648506151142, "RandomTreeDepth1Kappa": 0.8751138986252353, "RandomTreeDepth2AUC": 0.9346229057660799, "RandomTreeDepth2ErrRate": 0.05799648506151142, "RandomTreeDepth2Kappa": 0.8751138986252353, "RandomTreeDepth3AUC": 0.9346229057660799, "RandomTreeDepth3ErrRate": 0.05799648506151142, "RandomTreeDepth3Kappa": 0.8751138986252353, "StdvNominalAttDistinctValues": 0, "kNN1NAUC": 0.9512248295544633, "kNN1NErrRate": 0.043936731107205626, "kNN1NKappa": 0.9052064799450897 }, "tags": [ { "tag": "study_14", "uploader": "1" }, { "tag": "study_1", "uploader": "0" }, { "tag": "study_102", "uploader": "0" }, { "tag": "study_163", "uploader": "0" }, { "tag": "study_521", "uploader": "0" }, { "tag": "study_591", "uploader": "0" }, { "tag": "study_122", "uploader": "0" }, { "tag": "study_757", "uploader": "0" }, { "tag": "study_121", "uploader": "0" }, { "tag": "study_122", "uploader": "0" }, { "tag": "study_538", "uploader": "0" }, { "tag": "study_769", 
"uploader": "0" }, { "tag": "study_106", "uploader": "0" } ], "features": [ { "name": "Class", "index": "30", "type": "nominal", "distinct": "2", "missing": "0", "target": "1", "distr": [ [ "1", "2" ], [ [ "357", "0" ], [ "0", "212" ] ] ] }, { "name": "V16", "index": "15", "type": "numeric", "distinct": "541", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V30", "index": "29", "type": "numeric", "distinct": "535", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V29", "index": "28", "type": "numeric", "distinct": "500", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "V28", "index": "27", "type": "numeric", "distinct": "492", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V27", "index": "26", "type": "numeric", "distinct": "539", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "V26", "index": "25", "type": "numeric", "distinct": "529", "missing": "0", "min": "0", "max": "1", "mean": "0", "stdev": "0" }, { "name": "V25", "index": "24", "type": "numeric", "distinct": "411", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V24", "index": "23", "type": "numeric", "distinct": "544", "missing": "0", "min": "185", "max": "4254", "mean": "881", "stdev": "569" }, { "name": "V23", "index": "22", "type": "numeric", "distinct": "514", "missing": "0", "min": "50", "max": "251", "mean": "107", "stdev": "34" }, { "name": "V22", "index": "21", "type": "numeric", "distinct": "511", "missing": "0", "min": "12", "max": "50", "mean": "26", "stdev": "6" }, { "name": "V21", "index": "20", "type": "numeric", "distinct": "457", "missing": "0", "min": "8", "max": "36", "mean": "16", "stdev": "5" }, { "name": "V20", "index": "19", "type": "numeric", "distinct": "545", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V19", "index": "18", "type": "numeric", "distinct": "498", 
"missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V18", "index": "17", "type": "numeric", "distinct": "507", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V17", "index": "16", "type": "numeric", "distinct": "533", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V1", "index": "0", "type": "numeric", "distinct": "456", "missing": "0", "min": "7", "max": "28", "mean": "14", "stdev": "4" }, { "name": "V15", "index": "14", "type": "numeric", "distinct": "547", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V14", "index": "13", "type": "numeric", "distinct": "528", "missing": "0", "min": "7", "max": "542", "mean": "40", "stdev": "45" }, { "name": "V13", "index": "12", "type": "numeric", "distinct": "533", "missing": "0", "min": "1", "max": "22", "mean": "3", "stdev": "2" }, { "name": "V12", "index": "11", "type": "numeric", "distinct": "519", "missing": "0", "min": "0", "max": "5", "mean": "1", "stdev": "1" }, { "name": "V11", "index": "10", "type": "numeric", "distinct": "540", "missing": "0", "min": "0", "max": "3", "mean": "0", "stdev": "0" }, { "name": "V10", "index": "9", "type": "numeric", "distinct": "499", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V9", "index": "8", "type": "numeric", "distinct": "432", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V8", "index": "7", "type": "numeric", "distinct": "542", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V7", "index": "6", "type": "numeric", "distinct": "537", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V6", "index": "5", "type": "numeric", "distinct": "537", "missing": "0", "min": "0", "max": "0", "mean": "0", "stdev": "0" }, { "name": "V5", "index": "4", "type": "numeric", "distinct": "474", "missing": "0", "min": "0", "max": "0", "mean": "0", 
"stdev": "0" }, { "name": "V4", "index": "3", "type": "numeric", "distinct": "539", "missing": "0", "min": "144", "max": "2501", "mean": "655", "stdev": "352" }, { "name": "V3", "index": "2", "type": "numeric", "distinct": "522", "missing": "0", "min": "44", "max": "189", "mean": "92", "stdev": "24" }, { "name": "V2", "index": "1", "type": "numeric", "distinct": "479", "missing": "0", "min": "10", "max": "39", "mean": "19", "stdev": "4" } ], "nr_of_issues": 0, "nr_of_downvotes": 0, "nr_of_likes": 0, "nr_of_downloads": 0, "total_downloads": 0, "reach": 0, "reuse": 11, "impact_of_reuse": 0, "reach_of_reuse": 0, "impact": 11 }