OpenML

MiceProtein

Status: active. Format: ARFF. Publicly available. Visibility: public. Uploaded 17-02-2016 by Hilda Fabiola Bernard.

Author:
- Clara Higuera, Department of Software Engineering and Artificial Intelligence, Faculty of Informatics, and the Department of Biochemistry and Molecular Biology, Faculty of Chemistry, University Complutense, Madrid, Spain. Email: clarahiguera '@' ucm.es
- Katheleen J. Gardiner, creator and owner of the protein expression data, currently with the Linda Crnic Institute for Down Syndrome, Department of Pediatrics, Department of Biochemistry and Molecular Genetics, Human Medical Genetics and Genomics, and Neuroscience Programs, University of Colorado, School of Medicine, Aurora, Colorado, USA. Email: katheleen.gardiner '@' ucdenver.edu
- Krzysztof J. Cios, currently with the Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA, and IITiS Polish Academy of Sciences, Poland. Email: kcios '@' vcu.edu

Source: UCI

Please cite: Higuera C, Gardiner KJ, Cios KJ (2015) Self-Organizing Feature Maps Identify Proteins Critical to Learning in a Mouse Model of Down Syndrome. PLoS ONE 10(6): e0129126. journal.pone.0129126

Abstract: Expression levels of 77 proteins measured in the cerebral cortex of 8 classes of control and Down syndrome mice exposed to context fear conditioning, a task used to assess associative learning.

Data Set Information: The data set consists of the expression levels of 77 proteins/protein modifications that produced detectable signals in the nuclear fraction of cortex. There are 38 control mice and 34 trisomic mice (Down syndrome), for a total of 72 mice. Each protein was measured 15 times per sample/mouse. Therefore, for control mice there are 38x15, or 570 measurements, and for trisomic mice there are 34x15, or 510 measurements. The dataset contains a total of 1080 measurements per protein. Each measurement can be considered as an independent sample/mouse.

The eight classes of mice are described by three features: genotype, behavior and treatment. According to genotype, mice are either control or trisomic. According to behavior, some mice have been stimulated to learn (context-shock) and others have not (shock-context). To assess the effect of the drug memantine in recovering the ability to learn in trisomic mice, some mice have been injected with the drug and others have not.

Classes:
- c-CS-s: control mice, stimulated to learn, injected with saline (9 mice)
- c-CS-m: control mice, stimulated to learn, injected with memantine (10 mice)
- c-SC-s: control mice, not stimulated to learn, injected with saline (9 mice)
- c-SC-m: control mice, not stimulated to learn, injected with memantine (10 mice)
- t-CS-s: trisomy mice, stimulated to learn, injected with saline (7 mice)
- t-CS-m: trisomy mice, stimulated to learn, injected with memantine (9 mice)
- t-SC-s: trisomy mice, not stimulated to learn, injected with saline (9 mice)
- t-SC-m: trisomy mice, not stimulated to learn, injected with memantine (9 mice)

The aim is to identify subsets of proteins that are discriminant between the classes.

Attribute Information:
- 1: Mouse ID
- 2..78: Values of expression levels of 77 proteins; the names of proteins are followed by "_n", indicating that they were measured in the nuclear fraction. For example: DYRK1A_n
- 79: Genotype: control (c) or trisomy (t)
- 80: Treatment type: memantine (m) or saline (s)
- 81: Behavior: context-shock (CS) or shock-context (SC)
- 82: Class: c-CS-s, c-CS-m, c-SC-s, c-SC-m, t-CS-s, t-CS-m, t-SC-s, t-SC-m

Relevant Papers:
- The posted data were analyzed by: Higuera C, Gardiner KJ, Cios KJ (2015) Self-Organizing Feature Maps Identify Proteins Critical to Learning in a Mouse Model of Down Syndrome. PLoS ONE 10(6): e0129126. journal.pone.0129126
- The data are a subset of the data analyzed by: Ahmed MM, Dhanasekaran AR, Block A, Tong S, Costa ACS, Stasko M, et al. (2015) Protein Dynamics Associated with Failed and Rescued Learning in the Ts65Dn Mouse Model of Down Syndrome. PLoS ONE 10(3): e0119491.
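The instance counts in the description follow directly from the number of mice per class and the 15 measurements taken per mouse. The sketch below only reproduces that arithmetic; the variable and function names are illustrative and are not part of the dataset.

```python
# Mice per class, as listed in the class descriptions.
MEASUREMENTS_PER_MOUSE = 15
mice_per_class = {
    "c-CS-s": 9, "c-CS-m": 10, "c-SC-s": 9, "c-SC-m": 10,
    "t-CS-s": 7, "t-CS-m": 9, "t-SC-s": 9, "t-SC-m": 9,
}


def instances_per_class(mice, per_mouse=MEASUREMENTS_PER_MOUSE):
    """Each mouse contributes `per_mouse` rows to the dataset."""
    return {label: n * per_mouse for label, n in mice.items()}


# Class labels start with "c-" (control) or "t-" (trisomy).
control_mice = sum(n for label, n in mice_per_class.items() if label.startswith("c-"))
trisomic_mice = sum(n for label, n in mice_per_class.items() if label.startswith("t-"))

rows = instances_per_class(mice_per_class)
total_rows = sum(rows.values())

print(control_mice, trisomic_mice)   # mice per genotype
print(control_mice * MEASUREMENTS_PER_MOUSE, trisomic_mice * MEASUREMENTS_PER_MOUSE)
print(total_rows)                    # rows in the dataset
```

The per-class row counts range from 105 (t-CS-s, 7 mice) to 150 (the two classes with 10 mice).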

82 features

Feature	Type	Distinct and missing values
class (target)	nominal	8 unique values 0 missing
GluR4_N	numeric	1079 unique values 0 missing
TIAM1_N	numeric	1075 unique values 3 missing
GluR3_N	numeric	1080 unique values 0 missing
GFAP_N	numeric	1079 unique values 0 missing
Tau_N	numeric	1080 unique values 0 missing
nNOS_N	numeric	1079 unique values 0 missing
ERBB4_N	numeric	1079 unique values 0 missing
ARC_N	numeric	1080 unique values 0 missing
BAX_N	numeric	1080 unique values 0 missing
RRP1_N	numeric	1080 unique values 0 missing
AcetylH3K9_N	numeric	1080 unique values 0 missing
ADARB1_N	numeric	1080 unique values 0 missing
S6_N	numeric	1080 unique values 0 missing
CDK5_N	numeric	1080 unique values 0 missing
pPKCG_N	numeric	1080 unique values 0 missing
pGSK3B_N	numeric	1080 unique values 0 missing
P70S6_N	numeric	1080 unique values 0 missing
NUMB_N	numeric	1080 unique values 0 missing
pP70S6_N	numeric	1076 unique values 3 missing
RAPTOR_N	numeric	1077 unique values 3 missing
pS6_N	numeric	1080 unique values 0 missing
Behavior	nominal	2 unique values 0 missing
Treatment	nominal	2 unique values 0 missing
Genotype	nominal	2 unique values 0 missing
CaNA_N	numeric	1080 unique values 0 missing
H3MeK4_N	numeric	810 unique values 270 missing
EGR1_N	numeric	870 unique values 210 missing
H3AcK18_N	numeric	900 unique values 180 missing
SYP_N	numeric	1079 unique values 0 missing
pCFOS_N	numeric	1005 unique values 75 missing
IL1B_N	numeric	1080 unique values 0 missing
BCL2_N	numeric	795 unique values 285 missing
BAD_N	numeric	866 unique values 213 missing
SHH_N	numeric	1080 unique values 0 missing
pGSK3B_Tyr216_N	numeric	1080 unique values 0 missing
Ubiquitin_N	numeric	1080 unique values 0 missing
SNCA_N	numeric	1079 unique values 0 missing
PSD95_N	numeric	1080 unique values 0 missing
pCASP9_N	numeric	1080 unique values 0 missing
P3525_N	numeric	1080 unique values 0 missing
pELK_N	numeric	1077 unique values 3 missing
AKT_N	numeric	1077 unique values 3 missing
pRSK_N	numeric	1077 unique values 3 missing
pPKCAB_N	numeric	1077 unique values 3 missing
pNR2B_N	numeric	1077 unique values 3 missing
pNR2A_N	numeric	1077 unique values 3 missing
pNR1_N	numeric	1077 unique values 3 missing
pMEK_N	numeric	1077 unique values 3 missing
PKCA_N	numeric	1077 unique values 3 missing
pJNK_N	numeric	1076 unique values 3 missing
pERK_N	numeric	1077 unique values 3 missing
BRAF_N	numeric	1077 unique values 3 missing
pCREB_N	numeric	1077 unique values 3 missing
pCAMKII_N	numeric	1077 unique values 3 missing
pBRAF_N	numeric	1075 unique values 3 missing
pAKT_N	numeric	1076 unique values 3 missing
NR2A_N	numeric	1077 unique values 3 missing
NR1_N	numeric	1077 unique values 3 missing
BDNF_N	numeric	1077 unique values 3 missing
ITSN1_N	numeric	1076 unique values 3 missing
DYRK1A_N	numeric	1077 unique values 3 missing
APP_N	numeric	1077 unique values 3 missing
pNUMB_N	numeric	1077 unique values 3 missing
NR2B_N	numeric	1077 unique values 3 missing
AMPKA_N	numeric	1075 unique values 3 missing
DSCR1_N	numeric	1077 unique values 3 missing
pMTOR_N	numeric	1077 unique values 3 missing
P38_N	numeric	1075 unique values 3 missing
MTOR_N	numeric	1077 unique values 3 missing
SOD1_N	numeric	1077 unique values 3 missing
Bcatenin_N	numeric	1062 unique values 18 missing
MouseID	nominal	1080 unique values 0 missing
RSK_N	numeric	1074 unique values 3 missing
TRKA_N	numeric	1075 unique values 3 missing
MEK_N	numeric	1072 unique values 7 missing
JNK_N	numeric	1077 unique values 3 missing
GSK3B_N	numeric	1077 unique values 3 missing
ERK_N	numeric	1077 unique values 3 missing
ELK_N	numeric	1062 unique values 18 missing
CREB_N	numeric	1073 unique values 3 missing
CAMKII_N	numeric	1077 unique values 3 missing
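The feature rows above share one tab-separated layout (name, type, then a "unique values / missing" summary), so they can be tallied mechanically. The following is a minimal sketch, assuming rows are available as plain strings in exactly this layout; `parse_feature_row` and `sample_rows` are hypothetical names, and only four of the 82 rows are included here.

```python
import re

# name <TAB> type <TAB> "<n> unique values <m> missing"
ROW = re.compile(
    r"^(?P<name>.+?)\t(?P<type>\w+)\t"
    r"(?P<unique>\d+) unique values (?P<missing>\d+) missing$"
)
TARGET_SUFFIX = " (target)"


def parse_feature_row(line):
    """Split one feature row into its fields; raise on an unexpected layout."""
    match = ROW.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"unrecognized feature row: {line!r}")
    name = match.group("name")
    is_target = name.endswith(TARGET_SUFFIX)
    if is_target:
        name = name[: -len(TARGET_SUFFIX)]
    return {
        "name": name,
        "type": match.group("type"),
        "unique": int(match.group("unique")),
        "missing": int(match.group("missing")),
        "target": is_target,
    }


sample_rows = [
    "class (target)\tnominal\t8 unique values 0 missing",
    "TIAM1_N\tnumeric\t1075 unique values 3 missing",
    "BCL2_N\tnumeric\t795 unique values 285 missing",
    "MouseID\tnominal\t1080 unique values 0 missing",
]
parsed = [parse_feature_row(row) for row in sample_rows]
missing_in_sample = sum(feature["missing"] for feature in parsed)
print(missing_in_sample)
```

Summing the missing counts over all 82 rows listed above gives 1396, which agrees with the NumberOfMissingValues property.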

107 properties

NumberOfInstances

1080

Number of instances (rows) of the dataset.

NumberOfFeatures

82

Number of attributes (columns) of the dataset.

NumberOfClasses

8

Number of distinct values of the target attribute (if it is nominal).

NumberOfMissingValues

1396

Number of missing values in the dataset.

NumberOfInstancesWithMissingValues

528

Number of instances with at least one value missing.

NumberOfNumericFeatures

77

Number of numeric attributes.

NumberOfSymbolicFeatures

5

Number of nominal attributes.

AutoCorrelation

0.99

Average class difference between consecutive instances.

CfsSubsetEval_DecisionStumpAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_DecisionStumpErrRate

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_DecisionStumpKappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesErrRate

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_NaiveBayesKappa

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NAUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NErrRate

Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

CfsSubsetEval_kNN1NKappa

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

ClassEntropy

2.99

Entropy of the target attribute values.
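As a check on this value, the entropy can be recomputed from the class sizes. The sketch assumes each class has 15 rows per mouse (per the dataset description) and uses base-2 logarithms; `entropy_bits` is an illustrative helper, not an OpenML function.

```python
import math


def entropy_bits(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# Rows per class: mice per class (9, 10, 9, 10, 7, 9, 9, 9) times 15 measurements.
class_sizes = [135, 150, 135, 150, 105, 135, 135, 135]
class_entropy = entropy_bits(class_sizes)

print(round(class_entropy, 2))
print(math.log2(len(class_sizes)))  # upper bound for 8 perfectly balanced classes
```

The result sits just under the 3-bit maximum for eight classes, reflecting the mild class imbalance.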

DecisionStumpAUC

0.79

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump

DecisionStumpErrRate

0.72

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump

DecisionStumpKappa

0.16

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump

Dimensionality

0.08

Number of attributes divided by the number of instances.

EquivalentNumberOfAtts

Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.

J48.00001.AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.00001.ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.00001.Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001

J48.0001.AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.0001.ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.0001.Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001

J48.001.AUC

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001

J48.001.ErrRate

Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001

J48.001.Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001

MajorityClassPercentage

13.89

Percentage of instances belonging to the most frequent class.

MajorityClassSize

150

Number of instances belonging to the most frequent class.

MaxAttributeEntropy

10.08

Maximum entropy among attributes.

MaxKurtosisOfNumericAtts

62.55

Maximum kurtosis among attributes of the numeric type.

MaxMeansOfNumericAtts

3.84

Maximum of means among attributes of the numeric type.

MaxMutualInformation

2.99

Maximum mutual information between the nominal attributes and the target attribute.
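The maximum mutual information equals the class entropy, and the maximum attribute entropy of 10.08 equals log2(1080). Both are what one would expect from an identifier-like nominal attribute with a distinct value on every row. The sketch below illustrates this with synthetic identifiers; it assumes, without confirmation from the page, that MouseID is the attribute attaining these maxima.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy (bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information_bits(xs, ys):
    """I(X;Y) in bits, estimated from paired observations."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c * n) / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


# Synthetic stand-in: one unique identifier per row, class sizes as in the dataset.
class_sizes = {"c-CS-s": 135, "c-CS-m": 150, "c-SC-s": 135, "c-SC-m": 150,
               "t-CS-s": 105, "t-CS-m": 135, "t-SC-s": 135, "t-SC-m": 135}
labels = [label for label, size in class_sizes.items() for _ in range(size)]
ids = [f"row_{i}" for i in range(len(labels))]  # hypothetical identifiers

id_entropy = entropy_bits(ids)
id_class_mi = mutual_information_bits(ids, labels)
print(round(id_entropy, 2), round(id_class_mi, 2))
```

Because a unique identifier determines the class exactly, its mutual information with the target cannot be exceeded by any other attribute, which is one reason such columns are usually excluded from modeling.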

MaxNominalAttDistinctValues

1080

The maximum number of distinct values among attributes of the nominal type.

MaxSkewnessOfNumericAtts

4.74

Maximum skewness among attributes of the numeric type.

MaxStdDevOfNumericAtts

1.3

Maximum standard deviation of attributes of the numeric type.

MeanAttributeEntropy

3.27

Average entropy of the attributes.

MeanKurtosisOfNumericAtts

4.54

Mean kurtosis among attributes of the numeric type.

MeanMeansOfNumericAtts

0.67

Mean of means among attributes of the numeric type.

MeanMutualInformation

1.5

Average mutual information between the nominal attributes and the target attribute.

MeanNoiseToSignalRatio

1.18

An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
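The formula can be checked against the rounded values shown on this page. The sketch below also evaluates the EquivalentNumberOfAtts formula (ClassEntropy divided by MeanMutualInformation); that second result is not listed on the page and, being computed from two-decimal inputs, is only approximate.

```python
# Rounded property values as displayed on this page.
mean_attribute_entropy = 3.27
mean_mutual_information = 1.5
class_entropy = 2.99

# (MeanAttributeEntropy - MeanMutualInformation) / MeanMutualInformation
noise_to_signal = (mean_attribute_entropy - mean_mutual_information) / mean_mutual_information

# ClassEntropy / MeanMutualInformation (approximate: inputs are rounded)
equivalent_number_of_atts = class_entropy / mean_mutual_information

print(round(noise_to_signal, 2))
print(round(equivalent_number_of_atts, 2))
```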

MeanNominalAttDistinctValues

218.8

Average number of distinct values among the attributes of the nominal type.

MeanSkewnessOfNumericAtts

0.79

Mean skewness among attributes of the numeric type.

MeanStdDevOfNumericAtts

0.16

Mean standard deviation of attributes of the numeric type.

MinAttributeEntropy

Minimal entropy among attributes.

MinKurtosisOfNumericAtts

-0.74

Minimum kurtosis among attributes of the numeric type.

MinMeansOfNumericAtts

0.12

Minimum of means among attributes of the numeric type.

MinMutualInformation

Minimal mutual information between the nominal attributes and the target attribute.

MinNominalAttDistinctValues

The minimal number of distinct values among attributes of the nominal type.

MinSkewnessOfNumericAtts

-0.88

Minimum skewness among attributes of the numeric type.

MinStdDevOfNumericAtts

0.01

Minimum standard deviation of attributes of the numeric type.

MinorityClassPercentage

9.72

Percentage of instances belonging to the least frequent class.

MinorityClassSize

105

Number of instances belonging to the least frequent class.
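The majority and minority class figures can be reproduced from the class sizes. This sketch assumes 15 rows per mouse, as stated in the dataset description; the names are illustrative.

```python
# Rows per class: mice per class times 15 measurements per mouse.
class_sizes = {
    "c-CS-s": 135, "c-CS-m": 150, "c-SC-s": 135, "c-SC-m": 150,
    "t-CS-s": 105, "t-CS-m": 135, "t-SC-s": 135, "t-SC-m": 135,
}
n_instances = sum(class_sizes.values())

majority_size = max(class_sizes.values())
minority_size = min(class_sizes.values())
majority_pct = 100 * majority_size / n_instances
minority_pct = 100 * minority_size / n_instances

# Two classes tie for the majority; one class is the minority.
majority_labels = sorted(k for k, v in class_sizes.items() if v == majority_size)
minority_labels = sorted(k for k, v in class_sizes.items() if v == minority_size)

print(majority_labels, majority_size, round(majority_pct, 2))
print(minority_labels, minority_size, round(minority_pct, 2))
```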

NaiveBayesAUC

0.99

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NaiveBayesErrRate

0.14

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NaiveBayesKappa

0.83

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes

NumberOfBinaryFeatures

3

Number of binary attributes.

PercentageOfBinaryFeatures

3.66

Percentage of binary attributes.

PercentageOfInstancesWithMissingValues

48.89

Percentage of instances having missing values.

PercentageOfMissingValues

1.58

Percentage of missing values.
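Both missing-value percentages follow from counts given elsewhere on this page. The sketch assumes the cell-level percentage is taken over all instances times all features (1080 x 82 cells), which reproduces the displayed value.

```python
n_instances = 1080
n_features = 82
n_missing_values = 1396            # NumberOfMissingValues
n_instances_with_missing = 528     # NumberOfInstancesWithMissingValues

# Share of individual cells that are missing.
pct_missing_values = 100 * n_missing_values / (n_instances * n_features)

# Share of rows with at least one missing cell.
pct_instances_with_missing = 100 * n_instances_with_missing / n_instances

print(round(pct_missing_values, 2))
print(round(pct_instances_with_missing, 2))
```

The gap between the two numbers shows that missing cells are spread thinly: fewer than 2% of cells are missing, yet nearly half of the rows are affected.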

PercentageOfNumericFeatures

93.9

Percentage of numeric attributes.

PercentageOfSymbolicFeatures

6.1

Percentage of nominal attributes.
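The feature-type percentages are consistent with the feature list: 77 numeric attributes and 5 nominal ones (class, Behavior, Treatment, Genotype, MouseID), three of which have exactly two values. The counts below are tallied from that list rather than taken from the properties, so treat them as a cross-check.

```python
n_instances = 1080
n_features = 82
n_numeric = 77   # protein expression columns
n_nominal = 5    # class, Behavior, Treatment, Genotype, MouseID
n_binary = 3     # Behavior, Treatment, Genotype (2 distinct values each)

pct_numeric = 100 * n_numeric / n_features
pct_nominal = 100 * n_nominal / n_features
pct_binary = 100 * n_binary / n_features
dimensionality = n_features / n_instances  # attributes per instance

print(round(pct_numeric, 2), round(pct_nominal, 2), round(pct_binary, 2))
print(round(dimensionality, 2))
```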

Quartile1AttributeEntropy

First quartile of entropy among attributes.

Quartile1KurtosisOfNumericAtts

0.21

First quartile of kurtosis among attributes of the numeric type.

Quartile1MeansOfNumericAtts

0.19

First quartile of means among attributes of the numeric type.

Quartile1MutualInformation

First quartile of mutual information between the nominal attributes and the target attribute.

Quartile1SkewnessOfNumericAtts

0.03

First quartile of skewness among attributes of the numeric type.

Quartile1StdDevOfNumericAtts

0.03

First quartile of standard deviation of attributes of the numeric type.

Quartile2AttributeEntropy

Second quartile (Median) of entropy among attributes.

Quartile2KurtosisOfNumericAtts

0.89

Second quartile (Median) of kurtosis among attributes of the numeric type.

Quartile2MeansOfNumericAtts

0.38

Second quartile (Median) of means among attributes of the numeric type.

Quartile2MutualInformation

Second quartile (Median) of mutual information between the nominal attributes and the target attribute.

Quartile2SkewnessOfNumericAtts

0.46

Second quartile (Median) of skewness among attributes of the numeric type.

Quartile2StdDevOfNumericAtts

0.07

Second quartile (Median) of standard deviation of attributes of the numeric type.

Quartile3AttributeEntropy

7.81

Third quartile of entropy among attributes.

Quartile3KurtosisOfNumericAtts

2.29

Third quartile of kurtosis among attributes of the numeric type.

Quartile3MeansOfNumericAtts

0.79

Third quartile of means among attributes of the numeric type.

Quartile3MutualInformation

2.49

Third quartile of mutual information between the nominal attributes and the target attribute.

Quartile3SkewnessOfNumericAtts

0.97

Third quartile of skewness among attributes of the numeric type.

Quartile3StdDevOfNumericAtts

0.23

Third quartile of standard deviation of attributes of the numeric type.

REPTreeDepth1AUC

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth1ErrRate

0.86

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth1Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1

REPTreeDepth2AUC

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth2ErrRate

0.86

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth2Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2

REPTreeDepth3AUC

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3

REPTreeDepth3ErrRate

0.86

Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3

REPTreeDepth3Kappa

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3

RandomTreeDepth1AUC

0.85

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth1ErrRate

0.44

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth1Kappa

0.49

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

RandomTreeDepth2AUC

0.85

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth2ErrRate

0.44

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth2Kappa

0.49

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

RandomTreeDepth3AUC

0.85

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

RandomTreeDepth3ErrRate

0.44

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

RandomTreeDepth3Kappa

0.49

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

StdvNominalAttDistinctValues

481.43

Standard deviation of the number of distinct values among attributes of the nominal type.
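The mean and standard deviation of distinct values can be recomputed from the five nominal attributes in the feature list. The displayed 481.43 matches the sample standard deviation (n - 1 denominator); that this is the estimator OpenML uses is inferred from the numbers, not documented on this page.

```python
import statistics

# Distinct values per nominal attribute, from the feature list.
distinct_values = {
    "class": 8,
    "Behavior": 2,
    "Treatment": 2,
    "Genotype": 2,
    "MouseID": 1080,
}
counts = list(distinct_values.values())

mean_distinct = statistics.mean(counts)
sample_std = statistics.stdev(counts)       # n - 1 denominator
population_std = statistics.pstdev(counts)  # n denominator, shown for contrast

print(round(mean_distinct, 2), round(sample_std, 2), round(population_std, 2))
```

Both statistics are dominated by MouseID; without it the remaining nominal attributes have at most 8 distinct values.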

kNN1NAUC

0.99

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk

kNN1NErrRate

0.01

Error rate achieved by the landmarker weka.classifiers.lazy.IBk

kNN1NKappa

0.99

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk

11 tasks

Supervised Classification on MiceProtein

0 runs - estimation_procedure: 20% Holdout (Ordered) - target_feature: class

Supervised Classification on MiceProtein

0 runs - estimation_procedure: 10-fold Crossvalidation - target_feature: class

Supervised Classification on MiceProtein

0 runs - estimation_procedure: 5 times 2-fold Crossvalidation - target_feature: class

Supervised Classification on MiceProtein

0 runs - estimation_procedure: 10% Holdout set - target_feature: class

Supervised Classification on MiceProtein

0 runs - estimation_procedure: 33% Holdout set - target_feature: class

Supervised Classification on MiceProtein

0 runs - estimation_procedure: 10 times 10-fold Crossvalidation - target_feature: class

Supervised Classification on MiceProtein

0 runs - estimation_procedure: Test on Training Data - target_feature: class

Supervised Classification on MiceProtein

0 runs - estimation_procedure: Leave one out - target_feature: class

Learning Curve on MiceProtein

0 runs - estimation_procedure: 10-fold Learning Curve - target_feature: class

Learning Curve on MiceProtein

0 runs - estimation_procedure: 10 times 10-fold Learning Curve - target_feature: class

Supervised Data Stream Classification on MiceProtein

0 runs - estimation_procedure: Interleaved Test then Train - target_feature: class
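To give a sense of scale for the estimation procedures listed above, the sketch below computes train/test sizes for the 1080 instances. It is plain arithmetic under simple assumptions (near-equal folds, holdout sizes rounded to the nearest row); OpenML's own split files may round or stratify differently, and the helper names are hypothetical.

```python
def kfold_sizes(n, k):
    """Test-fold sizes when n rows are divided into k near-equal folds."""
    base, extra = divmod(n, k)
    return [base + 1] * extra + [base] * (k - extra)


def holdout_sizes(n, test_fraction):
    """(train, test) sizes for a single holdout split."""
    test = round(n * test_fraction)
    return n - test, test


N = 1080
ten_fold = kfold_sizes(N, 10)      # 10-fold crossvalidation
two_fold = kfold_sizes(N, 2)       # one repeat of 5 times 2-fold
leave_one_out = kfold_sizes(N, N)  # one row per fold

print(ten_fold[0], two_fold[0], len(leave_one_out))
for fraction in (0.10, 0.20, 0.33):
    print(fraction, holdout_sizes(N, fraction))
```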
