Data
house_prices_nominal

house_prices_nominal

active ARFF Publicly available Visibility: public Uploaded 26-06-2020 by Test Test
0 likes downloaded by 0 people , 0 total downloads 0 issues 0 downvotes
Issue #Downvotes for this reason By


Loading wiki
Help us complete this description Edit
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence. With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home. MSSubClass: Identifies the type of dwelling involved in the sale. 20 1-STORY 1946 & NEWER ALL STYLES 30 1-STORY 1945 & OLDER 40 1-STORY W/FINISHED ATTIC ALL AGES 45 1-1/2 STORY - UNFINISHED ALL AGES 50 1-1/2 STORY FINISHED ALL AGES 60 2-STORY 1946 & NEWER 70 2-STORY 1945 & OLDER 75 2-1/2 STORY ALL AGES 80 SPLIT OR MULTI-LEVEL 85 SPLIT FOYER 90 DUPLEX - ALL STYLES AND AGES 120 1-STORY PUD (Planned Unit Development) - 1946 & NEWER 150 1-1/2 STORY PUD - ALL AGES 160 2-STORY PUD - 1946 & NEWER 180 PUD - MULTILEVEL - INCL SPLIT LEV/FOYER 190 2 FAMILY CONVERSION - ALL STYLES AND AGES MSZoning: Identifies the general zoning classification of the sale. A Agriculture C Commercial FV Floating Village Residential I Industrial RH Residential High Density RL Residential Low Density RP Residential Low Density Park RM Residential Medium Density LotFrontage: Linear feet of street connected to property LotArea: Lot size in square feet Street: Type of road access to property Grvl Gravel Pave Paved Alley: Type of alley access to property Grvl Gravel Pave Paved NA No alley access LotShape: General shape of property Reg Regular IR1 Slightly irregular IR2 Moderately Irregular IR3 Irregular LandContour: Flatness of the property Lvl Near Flat/Level Bnk Banked - Quick and significant rise from street grade to building HLS Hillside - Significant slope from side to side Low Depression Utilities: Type of utilities available AllPub All public Utilities (E,G,W,& S) NoSewr Electricity, Gas, and Water (Septic Tank) NoSeWa Electricity and Gas Only ELO Electricity only LotConfig: Lot configuration Inside Inside lot Corner Corner lot CulDSac Cul-de-sac FR2 Frontage on 2 sides of property FR3 Frontage on 3 sides of property LandSlope: Slope of property Gtl Gentle slope Mod Moderate Slope Sev Severe Slope Neighborhood: Physical locations within Ames city limits Blmngtn Bloomington Heights Blueste Bluestem BrDale Briardale BrkSide Brookside ClearCr Clear Creek CollgCr College Creek Crawfor Crawford Edwards Edwards Gilbert Gilbert IDOTRR Iowa DOT and Rail Road MeadowV Meadow Village Mitchel Mitchell Names North Ames NoRidge Northridge NPkVill Northpark Villa NridgHt Northridge Heights NWAmes Northwest Ames OldTown Old Town SWISU South & West of Iowa State University Sawyer Sawyer SawyerW Sawyer West Somerst Somerset StoneBr Stone Brook Timber Timberland Veenker Veenker Condition1: Proximity to various conditions Artery Adjacent to arterial street Feedr Adjacent to feeder street Norm Normal RRNn Within 200' of North-South Railroad RRAn Adjacent to North-South Railroad PosN Near positive off-site feature--park, greenbelt, etc. PosA Adjacent to postive off-site feature RRNe Within 200' of East-West Railroad RRAe Adjacent to East-West Railroad Condition2: Proximity to various conditions (if more than one is present) Artery Adjacent to arterial street Feedr Adjacent to feeder street Norm Normal RRNn Within 200' of North-South Railroad RRAn Adjacent to North-South Railroad PosN Near positive off-site feature--park, greenbelt, etc. PosA Adjacent to postive off-site feature RRNe Within 200' of East-West Railroad RRAe Adjacent to East-West Railroad BldgType: Type of dwelling 1Fam Single-family Detached 2FmCon Two-family Conversion; originally built as one-family dwelling Duplx Duplex TwnhsE Townhouse End Unit TwnhsI Townhouse Inside Unit HouseStyle: Style of dwelling 1Story One story 1.5Fin One and one-half story: 2nd level finished 1.5Unf One and one-half story: 2nd level unfinished 2Story Two story 2.5Fin Two and one-half story: 2nd level finished 2.5Unf Two and one-half story: 2nd level unfinished SFoyer Split Foyer SLvl Split Level OverallQual: Rates the overall material and finish of the house 10 Very Excellent 9 Excellent 8 Very Good 7 Good 6 Above Average 5 Average 4 Below Average 3 Fair 2 Poor 1 Very Poor OverallCond: Rates the overall condition of the house 10 Very Excellent 9 Excellent 8 Very Good 7 Good 6 Above Average 5 Average 4 Below Average 3 Fair 2 Poor 1 Very Poor YearBuilt: Original construction date YearRemodAdd: Remodel date (same as construction date if no remodeling or additions) RoofStyle: Type of roof Flat Flat Gable Gable Gambrel Gabrel (Barn) Hip Hip Mansard Mansard Shed Shed RoofMatl: Roof material ClyTile Clay or Tile CompShg Standard (Composite) Shingle Membran Membrane Metal Metal Roll Roll Tar&Grv Gravel & Tar WdShake Wood Shakes WdShngl Wood Shingles Exterior1st: Exterior covering on house AsbShng Asbestos Shingles AsphShn Asphalt Shingles BrkComm Brick Common BrkFace Brick Face CBlock Cinder Block CemntBd Cement Board HdBoard Hard Board ImStucc Imitation Stucco MetalSd Metal Siding Other Other Plywood Plywood PreCast PreCast Stone Stone Stucco Stucco VinylSd Vinyl Siding Wd Sdng Wood Siding WdShing Wood Shingles Exterior2nd: Exterior covering on house (if more than one material) AsbShng Asbestos Shingles AsphShn Asphalt Shingles BrkComm Brick Common BrkFace Brick Face CBlock Cinder Block CemntBd Cement Board HdBoard Hard Board ImStucc Imitation Stucco MetalSd Metal Siding Other Other Plywood Plywood PreCast PreCast Stone Stone Stucco Stucco VinylSd Vinyl Siding Wd Sdng Wood Siding WdShing Wood Shingles MasVnrType: Masonry veneer type BrkCmn Brick Common BrkFace Brick Face CBlock Cinder Block None None Stone Stone MasVnrArea: Masonry veneer area in square feet ExterQual: Evaluates the quality of the material on the exterior Ex Excellent Gd Good TA Average/Typical Fa Fair Po Poor ExterCond: Evaluates the present condition of the material on the exterior Ex Excellent Gd Good TA Average/Typical Fa Fair Po Poor Foundation: Type of foundation BrkTil Brick & Tile CBlock Cinder Block PConc Poured Contrete Slab Slab Stone Stone Wood Wood BsmtQual: Evaluates the height of the basement Ex Excellent (100+ inches) Gd Good (90-99 inches) TA Typical (80-89 inches) Fa Fair (70-79 inches) Po Poor (<70 inches NA No Basement BsmtCond: Evaluates the general condition of the basement Ex Excellent Gd Good TA Typical - slight dampness allowed Fa Fair - dampness or some cracking or settling Po Poor - Severe cracking, settling, or wetness NA No Basement BsmtExposure: Refers to walkout or garden level walls Gd Good Exposure Av Average Exposure (split levels or foyers typically score average or above) Mn Mimimum Exposure No No Exposure NA No Basement BsmtFinType1: Rating of basement finished area GLQ Good Living Quarters ALQ Average Living Quarters BLQ Below Average Living Quarters Rec Average Rec Room LwQ Low Quality Unf Unfinshed NA No Basement BsmtFinSF1: Type 1 finished square feet BsmtFinType2: Rating of basement finished area (if multiple types)

80 features

SalePrice (target)numeric663 unique values
0 missing
Id (row identifier)numeric1460 unique values
0 missing
MSSubClassnumeric15 unique values
0 missing
MSZoningnominal5 unique values
0 missing
LotFrontagenumeric110 unique values
259 missing
LotAreanumeric1073 unique values
0 missing
Streetnominal2 unique values
0 missing
Alleynominal2 unique values
1369 missing
LotShapenominal4 unique values
0 missing
LandContournominal4 unique values
0 missing
Utilitiesnominal2 unique values
0 missing
LotConfignominal5 unique values
0 missing
LandSlopenominal3 unique values
0 missing
Neighborhoodnominal25 unique values
0 missing
Condition1nominal9 unique values
0 missing
Condition2nominal8 unique values
0 missing
BldgTypenominal5 unique values
0 missing
HouseStylenominal8 unique values
0 missing
OverallQualnumeric10 unique values
0 missing
OverallCondnumeric9 unique values
0 missing
YearBuiltnumeric112 unique values
0 missing
YearRemodAddnumeric61 unique values
0 missing
RoofStylenominal6 unique values
0 missing
RoofMatlnominal8 unique values
0 missing
Exterior1stnominal15 unique values
0 missing
Exterior2ndnominal16 unique values
0 missing
MasVnrTypenominal4 unique values
8 missing
MasVnrAreanumeric327 unique values
8 missing
ExterQualnominal4 unique values
0 missing
ExterCondnominal5 unique values
0 missing
Foundationnominal6 unique values
0 missing
BsmtQualnominal4 unique values
37 missing
BsmtCondnominal4 unique values
37 missing
BsmtExposurenominal4 unique values
38 missing
BsmtFinType1nominal6 unique values
37 missing
BsmtFinSF1numeric637 unique values
0 missing
BsmtFinType2nominal6 unique values
38 missing
BsmtFinSF2numeric144 unique values
0 missing
BsmtUnfSFnumeric780 unique values
0 missing
TotalBsmtSFnumeric721 unique values
0 missing
Heatingnominal6 unique values
0 missing
HeatingQCnominal5 unique values
0 missing
CentralAirnominal2 unique values
0 missing
Electricalnominal5 unique values
1 missing
1stFlrSFnumeric753 unique values
0 missing
2ndFlrSFnumeric417 unique values
0 missing
LowQualFinSFnumeric24 unique values
0 missing
GrLivAreanumeric861 unique values
0 missing
BsmtFullBathnumeric4 unique values
0 missing
BsmtHalfBathnumeric3 unique values
0 missing
FullBathnumeric4 unique values
0 missing
HalfBathnumeric3 unique values
0 missing
BedroomAbvGrnumeric8 unique values
0 missing
KitchenAbvGrnumeric4 unique values
0 missing
KitchenQualnominal4 unique values
0 missing
TotRmsAbvGrdnumeric12 unique values
0 missing
Functionalnominal7 unique values
0 missing
Fireplacesnumeric4 unique values
0 missing
FireplaceQunominal5 unique values
690 missing
GarageTypenominal6 unique values
81 missing
GarageYrBltnumeric97 unique values
81 missing
GarageFinishnominal3 unique values
81 missing
GarageCarsnumeric5 unique values
0 missing
GarageAreanumeric441 unique values
0 missing
GarageQualnominal5 unique values
81 missing
GarageCondnominal5 unique values
81 missing
PavedDrivenominal3 unique values
0 missing
WoodDeckSFnumeric274 unique values
0 missing
OpenPorchSFnumeric202 unique values
0 missing
EnclosedPorchnumeric120 unique values
0 missing
3SsnPorchnumeric20 unique values
0 missing
ScreenPorchnumeric76 unique values
0 missing
PoolAreanumeric8 unique values
0 missing
PoolQCnominal3 unique values
1453 missing
Fencenominal4 unique values
1179 missing
MiscFeaturenominal4 unique values
1406 missing
MiscValnumeric21 unique values
0 missing
MoSoldnumeric12 unique values
0 missing
YrSoldnumeric5 unique values
0 missing
SaleTypenominal9 unique values
0 missing
SaleConditionnominal6 unique values
0 missing

107 properties

1460
Number of instances (rows) of the dataset.
80
Number of attributes (columns) of the dataset.
0
Number of distinct values of the target attribute (if it is nominal).
6965
Number of missing values in the dataset.
1460
Number of instances with at least one value missing.
37
Number of numeric attributes.
43
Number of nominal attributes.
First quartile of mutual information between the nominal attributes and the target attribute.
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 1
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001
An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.
4
Number of binary attributes.
0.21
First quartile of skewness among attributes of the numeric type.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
4.14
Standard deviation of the number of distinct values among attributes of the nominal type.
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .001
5.86
Average number of distinct values among the attributes of the nominal type.
1.22
First quartile of standard deviation of attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2
Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .001
3.05
Mean skewness among attributes of the numeric type.
Second quartile (Median) of entropy among attributes.
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 2
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Error rate achieved by the landmarker weka.classifiers.lazy.IBk
Percentage of instances belonging to the most frequent class.
2533.58
Mean standard deviation of attributes of the numeric type.
2.99
Second quartile (Median) of kurtosis among attributes of the numeric type.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2
Entropy of the target attribute values.
Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk
Number of instances belonging to the most frequent class.
Minimal entropy among attributes.
46.55
Second quartile (Median) of means among attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump
Maximum entropy among attributes.
-1.27
Minimum kurtosis among attributes of the numeric type.
Second quartile (Median) of mutual information between the nominal attributes and the target attribute.
Error rate achieved by the landmarker weka.classifiers.trees.REPTree -L 3
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump
701
Maximum kurtosis among attributes of the numeric type.
0.06
Minimum of means among attributes of the numeric type.
1.38
Second quartile (Median) of skewness among attributes of the numeric type.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump
180921.2
Maximum of means among attributes of the numeric type.
Minimal mutual information between the nominal attributes and the target attribute.
40.18
Second quartile (Median) of standard deviation of attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
0.05
Number of attributes divided by the number of instances.
Maximum mutual information between the nominal attributes and the target attribute.
2
The minimal number of distinct values among attributes of the nominal type.
5
Percentage of binary attributes.
Third quartile of entropy among attributes.
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.
25
The maximum number of distinct values among attributes of the nominal type.
-0.65
Minimum skewness among attributes of the numeric type.
100
Percentage of instances having missing values.
16.92
Third quartile of kurtosis among attributes of the numeric type.
-80645.68
Average class difference between consecutive instances.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001
24.48
Maximum skewness among attributes of the numeric type.
0.22
Minimum standard deviation of attributes of the numeric type.
5.96
Percentage of missing values.
812.33
Third quartile of means among attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .00001
79442.5
Maximum standard deviation of attributes of the numeric type.
Percentage of instances belonging to the least frequent class.
46.25
Percentage of numeric attributes.
Third quartile of mutual information between the nominal attributes and the target attribute.
Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001
Average entropy of the attributes.
Number of instances belonging to the least frequent class.
53.75
Percentage of nominal attributes.
First quartile of entropy among attributes.
3.6
Third quartile of skewness among attributes of the numeric type.
Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001
40.6
Mean kurtosis among attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes
-0.31
First quartile of kurtosis among attributes of the numeric type.
300.2
Third quartile of standard deviation of attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
Error rate achieved by the landmarker weka.classifiers.trees.J48 -C .0001
5553.8
Mean of means among attributes of the numeric type.
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes
3.14
First quartile of means among attributes of the numeric type.
Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1
Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W
Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3
Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001
Average mutual information between the nominal attributes and the target attribute.
Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes

0 tasks

Define a new task