OpenML

JavaScript is required to properly view the contents of this page!

Explore
- Data
- Task
- Flow
- Run
- Study
- Task type
- Measure
- People
Help
Blog
Contact
Please cite us

lowbwt

active ARFF Publicly available Visibility: public Uploaded 23-04-2014 by Jan van Rijn
0 likes downloaded by 0 people , 0 total downloads 0 issues 0 downvotes

study_15 study_17 study_29 study_32 study_48 study_69 study_90 study_92 study_94 study_96 study_98 study_127 study_15 study_17 study_29 study_32 study_40 study_48 study_69 study_90 study_92 study_94 study_96 study_98 study_127 study_15 study_17 study_29 study_32 study_48 study_69 study_90 study_92 study_94 study_96 study_98 study_127 study_15 study_17 study_29 study_32 study_38 study_48 study_69 study_90 study_92 study_94 study_96 study_98 study_127 study_15 study_17 study_29 study_32 study_48 study_69 study_90 study_92 study_94 study_96 study_98 study_127 study_143 study_15 study_17 study_29 study_32 study_48 study_69 study_90 study_92 study_94 study_96 study_98 study_127 study_2 study_167 study_463 Add tag

Issue	#Downvotes for this reason	By

Loading wiki

Help us complete this description Edit

Author: Source: Unknown - Please cite: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Identification code deleted. As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems. Singapore: Springer-Verlag. !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! NAME: LOW BIRTH WEIGHT DATA KEYWORDS: Logistic Regression SIZE: 189 observations, 11 variables NOTE: These data come from Appendix 1 of Hosmer and Lemeshow (1989). These data are copyrighted and must be acknowledged and used accordingly. DESCRIPTIVE ABSTRACT: The goal of this study was to identify risk factors associated with giving birth to a low birth weight baby (weighing less than 2500 grams). Data were collected on 189 women, 59 of which had low birth weight babies and 130 of which had normal birth weight babies. Four variables which were thought to be of importance were age, weight of the subject at her last menstrual period, race, and the number of physician visits during the first trimester of pregnancy. SOURCE: Data were collected at Baystate Medical Center, Springfield, Massachusetts, during 1986. NOTE: This data set consists of the complete data. A paired data set created from this low birth weight data may be found in plowbwt.dat and a 3 to 1 matched data set created from the low birth weight data may be found in mlowbwt.dat. Table: Code Sheet for the Variables in the Low Birth Weight Data Set. Columns Variable Abbreviation ----------------------------------------------------------------------------- 2-4 Identification Code ID 10 Low Birth Weight (0 = Birth Weight ge 2500g, LOW l = Birth Weight < 2500g) 17-18 Age of the Mother in Years AGE 23-25 Weight in Pounds at the Last Menstrual Period LWT 32 Race (1 = White, 2 = Black, 3 = Other) RACE 40 Smoking Status During Pregnancy (1 = Yes, 0 = No) SMOKE 48 History of Premature Labor (0 = None, 1 = One, etc.) PTL 55 History of Hypertension (1 = Yes, 0 = No) HT 61 Presence of Uterine Irritability (1 = Yes, 0 = No) UI 67 Number of Physician Visits During the First Trimester FTV (0 = None, 1 = One, 2 = Two, etc.) 73-76 Birth Weight in Grams BWT ----------------------------------------------------------------------------- PEDAGOGICAL NOTES: These data have been used as an example of fitting a multiple logistic regression model. STORY BEHIND THE DATA: Low birth weight is an outcome that has been of concern to physicians for years. This is due to the fact that infant mortality rates and birth defect rates are very high for low birth weight babies. A woman's behavior during pregnancy (including diet, smoking habits, and receiving prenatal care) can greatly alter the chances of carrying the baby to term and, consequently, of delivering a baby of normal birth weight. The variables identified in the code sheet given in the table have been shown to be associated with low birth weight in the obstetrical literature. The goal of the current study was to ascertain if these variables were important in the population being served by the medical center where the data were collected. References: 1. Hosmer and Lemeshow, Applied Logistic Regression, Wiley, (1989).

10 features

class (target)	numeric	133 unique values 0 missing
LOW	nominal	2 unique values 0 missing
AGE	numeric	24 unique values 0 missing
LWT	numeric	75 unique values 0 missing
RACE	nominal	3 unique values 0 missing
SMOKE	nominal	2 unique values 0 missing
PTL	nominal	4 unique values 0 missing
HT	nominal	2 unique values 0 missing
UI	nominal	2 unique values 0 missing
FTV	nominal	6 unique values 0 missing

Show all 10 features

107 properties

NumberOfInstances

189

Number of instances (rows) of the dataset.

NumberOfFeatures

Number of attributes (columns) of the dataset.

NumberOfClasses

Number of distinct values of the target attribute (if it is nominal).

NumberOfMissingValues

Number of missing values in the dataset.

NumberOfInstancesWithMissingValues

Number of instances with at least one value missing.

NumberOfNumericFeatures

Number of numeric attributes.

NumberOfSymbolicFeatures

Number of nominal attributes.

AutoCorrelation

-44.39

Average class difference between consecutive instances.

CfsSubsetEval_DecisionStumpAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_DecisionStumpErrRate

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_DecisionStumpKappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesErrRate

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesKappa

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NErrRate

Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NKappa

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

ClassEntropy

Entropy of the target attribute values.

DecisionStumpAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump

DecisionStumpErrRate

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump

DecisionStumpKappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump

Dimensionality

0.05

Number of attributes divided by the number of instances.

EquivalentNumberOfAtts

Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.

J48.00001.AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.00001.ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.00001.Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.0001.AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.0001.ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.0001.Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.001.AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001

J48.001.ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001

J48.001.Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001

MajorityClassPercentage

Percentage of instances belonging to the most frequent class.

MajorityClassSize

Number of instances belonging to the most frequent class.

MaxAttributeEntropy

Maximum entropy among attributes.

MaxKurtosisOfNumericAtts

2.4

Maximum kurtosis among attributes of the numeric type.

MaxMeansOfNumericAtts

2944.66

Maximum of means among attributes of the numeric type.

MaxMutualInformation

Maximum mutual information between the nominal attributes and the target attribute.

MaxNominalAttDistinctValues

The maximum number of distinct values among attributes of the nominal type.

MaxSkewnessOfNumericAtts

1.4

Maximum skewness among attributes of the numeric type.

MaxStdDevOfNumericAtts

729.02

Maximum standard deviation of attributes of the numeric type.

MeanAttributeEntropy

Average entropy of the attributes.

MeanKurtosisOfNumericAtts

0.98

Mean kurtosis among attributes of the numeric type.

MeanMeansOfNumericAtts

1032.57

Mean of means among attributes of the numeric type.

MeanMutualInformation

Average mutual information between the nominal attributes and the target attribute.

MeanNoiseToSignalRatio

An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.

MeanNominalAttDistinctValues

Average number of distinct values among the attributes of the nominal type.

MeanSkewnessOfNumericAtts

0.64

Mean skewness among attributes of the numeric type.

MeanStdDevOfNumericAtts

254.97

Mean standard deviation of attributes of the numeric type.

MinAttributeEntropy

Minimal entropy among attributes.

MinKurtosisOfNumericAtts

-0.08

Minimum kurtosis among attributes of the numeric type.

MinMeansOfNumericAtts

23.24

Minimum of means among attributes of the numeric type.

MinMutualInformation

Minimal mutual information between the nominal attributes and the target attribute.

MinNominalAttDistinctValues

The minimal number of distinct values among attributes of the nominal type.

MinSkewnessOfNumericAtts

-0.21

Minimum skewness among attributes of the numeric type.

MinStdDevOfNumericAtts

5.3

Minimum standard deviation of attributes of the numeric type.

MinorityClassPercentage

Percentage of instances belonging to the least frequent class.

MinorityClassSize

Number of instances belonging to the least frequent class.

NaiveBayesAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NaiveBayesErrRate

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NaiveBayesKappa

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NumberOfBinaryFeatures

Number of binary attributes.

PercentageOfBinaryFeatures

Percentage of binary attributes.

PercentageOfInstancesWithMissingValues

Percentage of instances having missing values.

PercentageOfMissingValues

Percentage of missing values.

PercentageOfNumericFeatures

Percentage of numeric attributes.

PercentageOfSymbolicFeatures

Percentage of nominal attributes.

Quartile1AttributeEntropy

First quartile of entropy among attributes.

Quartile1KurtosisOfNumericAtts

-0.08

First quartile of kurtosis among attributes of the numeric type.

Quartile1MeansOfNumericAtts

23.24

First quartile of means among attributes of the numeric type.

Quartile1MutualInformation

First quartile of mutual information between the nominal attributes and the target attribute.

Quartile1SkewnessOfNumericAtts

-0.21

First quartile of skewness among attributes of the numeric type.

Quartile1StdDevOfNumericAtts

5.3

First quartile of standard deviation of attributes of the numeric type.

Quartile2AttributeEntropy

Second quartile (Median) of entropy among attributes.

Quartile2KurtosisOfNumericAtts

0.62

Second quartile (Median) of kurtosis among attributes of the numeric type.

Quartile2MeansOfNumericAtts

129.81

Second quartile (Median) of means among attributes of the numeric type.

Quartile2MutualInformation

Second quartile (Median) of mutual information between the nominal attributes and the target attribute.

Quartile2SkewnessOfNumericAtts

0.72

Second quartile (Median) of skewness among attributes of the numeric type.

Quartile2StdDevOfNumericAtts

30.58

Second quartile (Median) of standard deviation of attributes of the numeric type.

Quartile3AttributeEntropy

Third quartile of entropy among attributes.

Quartile3KurtosisOfNumericAtts

2.4

Third quartile of kurtosis among attributes of the numeric type.

Quartile3MeansOfNumericAtts

2944.66

Third quartile of means among attributes of the numeric type.

Quartile3MutualInformation

Third quartile of mutual information between the nominal attributes and the target attribute.

Quartile3SkewnessOfNumericAtts

1.4

Third quartile of skewness among attributes of the numeric type.

Quartile3StdDevOfNumericAtts

729.02

Third quartile of standard deviation of attributes of the numeric type.

REPTreeDepth1AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth1ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth1Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth2AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth2ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth2Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth3AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3

REPTreeDepth3ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3

REPTreeDepth3Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3

RandomTreeDepth1AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth1ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth1Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth2AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth2ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth2Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth3AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

RandomTreeDepth3ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

RandomTreeDepth3Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

StdvNominalAttDistinctValues

1.53

Standard deviation of the number of distinct values among attributes of the nominal type.

kNN1NAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk

kNN1NErrRate

Error rate achieved by the landmarker weka.classifiers.lazy.IBk

kNN1NKappa

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk

Show all 107 properties

7 tasks

Supervised Regression on lowbwt

0 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: class

Supervised Regression on lowbwt

0 runs - estimation_procedure: 5 times 2-fold Crossvalidation - target_feature: class

Supervised Regression on lowbwt

0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - target_feature: class

Supervised Regression on lowbwt

0 runs - estimation_procedure: Leave one out - target_feature: class

Supervised Regression on lowbwt

0 runs - estimation_procedure: 33% Holdout set - target_feature: class

Supervised Regression on lowbwt

0 runs - estimation_procedure: 10% Holdout set - target_feature: class

Supervised Regression on lowbwt

0 runs - estimation_procedure: Test on Training Data - target_feature: class

Define a new task

Sign in

lowbwt

10 features

107 properties

7 tasks