Data

profb

active
ARFF
Publicly available Visibility: public Uploaded 28-09-2014 by Joaquin Vanschoren

0 likes downloaded by 0 people , 0 total downloads 0 issues 0 downvotes



Author:
Source: Unknown - Date unknown
Please cite:
PRO FOOTBALL SCORES (raw data appears after the description below)
How well do the oddsmakers of Las Vegas predict the outcome of
professional football games? Is there really a home field advantage - if
so how large is it? Are teams that play the Monday Night game at a
disadvantage when they play again the following Sunday? Do teams benefit
from having a "bye" week off in the current schedule? These questions and
a host of others can be investigated using this data set.
Hal Stern from the Statistics Department at Harvard University has
made available his compilation of scores for all National Football League
games from the 1989, 1990, and 1991 seasons. Dr. Stern used these data as
part of his presentation "Who's Number One?" in the special "Best of
Boston" session at the 1992 Joint Statistics Meetings.
Several variables in the data are keyed to the oddsmakers' "point
spread" for each game. The point spread is a value assigned before each
game to serve as a handicap for whichever is perceived to be the better
team. Thus, to win against the point spread, the "favorite" team must beat
the "underdog" team by more points than the spread. The underdog "wins"
against the spread if it wins the game outright or manages to lose by fewer
points than the spread. In theory, the point spread should represent the
"expert" prediction as to the game's outcome. In practice, it more usually
denotes a point at which an equal amount of money will be wagered both for
and against the favored team.
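
The rule for winning "against the spread" described above can be sketched in a few lines of Python. This is only an illustration: the function name `spread_result` is hypothetical, and the "push" outcome for a margin exactly equal to the spread is an assumption, since the description does not say how ties against the spread are treated.

```python
def spread_result(favorite_points, underdog_points, pointspread):
    """Return which side wins against the spread for one game.

    The favorite must win by MORE than the spread. The underdog "wins"
    if it wins outright or loses by FEWER points than the spread.
    A margin exactly equal to the spread is reported as a "push"
    (an assumption; the description does not cover this case).
    """
    margin = favorite_points - underdog_points  # negative if the underdog won
    if margin > pointspread:
        return "favorite"
    if margin < pointspread:
        return "underdog"
    return "push"


# Made-up scores for illustration only; not taken from the data set.
print(spread_result(24, 17, 3.5))  # favorite wins by 7, spread is 3.5
print(spread_result(20, 17, 3.5))  # favorite wins by 3, spread is 3.5
print(spread_result(10, 13, 3.0))  # underdog wins the game outright
```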
Raw data below contains 672 cases (all 224 regular season games in
each season) and information on the following 9 variables:
Home/Away = Favored team is at home (1) or away (0)
Favorite Points = Points scored by the favored team
Underdog Points = Points scored by the underdog team
Pointspread = Oddsmaker's points to handicap the favored team
Favorite Name = Code for favored team's name
Underdog name = Code for underdog's name
Year = 89, 90, or 91
Week = 1, 2, ... 17
Special = Mon.night (M), Sat. (S), Thur. (H), Sun. night (N)
ot - denotes an overtime game
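
As a rough sketch of how the variables above might be used to look at home field advantage, the Python below models one game as a record and converts the favorite's margin into the home team's margin using the Home/Away flag. The class name `Game`, the field names, the helper functions, and the team codes are all hypothetical, and the two sample records are made up; they are not rows from the data set.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Game:
    favorite_at_home: int          # Home/Away: 1 = favorite at home, 0 = away
    favorite_points: int
    underdog_points: int
    pointspread: float
    favorite_name: str
    underdog_name: str
    year: int                      # 89, 90, or 91
    week: int                      # 1 through 17
    special: Optional[str] = None  # "M", "S", "H", "N"; None when no code is given
    overtime: bool = False         # True when the game is marked "ot"


def home_margin(game: Game) -> int:
    """Points scored by the home team minus points scored by the away team."""
    favorite_margin = game.favorite_points - game.underdog_points
    return favorite_margin if game.favorite_at_home == 1 else -favorite_margin


def mean_home_margin(games) -> float:
    """Average home-team margin; a positive value suggests a home field advantage."""
    return mean(home_margin(g) for g in games)


# Made-up records for illustration only; not rows from the data set.
sample_games = [
    Game(1, 24, 17, 3.0, "AAA", "BBB", 89, 1),
    Game(0, 10, 20, 2.5, "CCC", "DDD", 90, 5, special="M"),
]
print(mean_home_margin(sample_games))
```

The same `home_margin` helper could be applied to subsets of games (for example, those with `special == "M"` in the previous week) to explore the Monday night and bye-week questions raised in the description.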
Data were submitted by: Robin Lock (rlock@stlawu.bitnet)
Mathematics Department, St. Lawrence University
Data were compiled by: Hal Stern, Dept. of Statistics, Harvard University
Information about the dataset
CLASSTYPE: nominal
CLASSINDEX: 1

Feature | Type | Unique values | Missing values |
---|---|---|---|
Home/Away (target) | nominal | 2 | 0 |
Favorite_Points | numeric | 46 | 0 |
Underdog_Points | numeric | 38 | 0 |
Pointspread | numeric | 32 | 0 |
Favorite_Name | nominal | 28 | 0 |
Underdog_name | nominal | 28 | 0 |
Year | numeric | 3 | 0 |
Week | numeric | 17 | 0 |
Weekday | nominal | 4 | 560 |
Overtime | nominal | 1 | 640 |

0

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 1

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

14.1

Standard deviation of the number of distinct values among attributes of the nominal type.

12.6

Average number of distinct values among the attributes of the nominal type.

-0.01

First quartile of skewness among attributes of the numeric type.

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 2

0.33

Error rate achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

2.07

First quartile of standard deviation of attributes of the numeric type.

0

Kappa coefficient achieved by the landmarker weka.classifiers.lazy.IBk -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

0

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 2

-0.12

Second quartile (Median) of kurtosis among attributes of the numeric type.

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 3

0.63

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump

16.86

Second quartile (Median) of means among attributes of the numeric type.

0.33

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump

0.02

Second quartile (Median) of mutual information between the nominal attributes and the target attribute.

0

Kappa coefficient achieved by the landmarker weka.classifiers.trees.REPTree -L 3

0

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump

0

Minimal mutual information between the nominal attributes and the target attribute.

0.42

Second quartile (Median) of skewness among attributes of the numeric type.

0.55

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

0.03

Maximum mutual information between the nominal attributes and the target attribute.

1

The minimal number of distinct values among attributes of the nominal type.

4.88

Second quartile (Median) of standard deviation of attributes of the numeric type.

0.37

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

48.79

Number of attributes needed to optimally describe the class (under the assumption of independence among attributes). Equals ClassEntropy divided by MeanMutualInformation.

28

The maximum number of distinct values among attributes of the nominal type.

0.52

Third quartile of kurtosis among attributes of the numeric type.

0.12

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 1

0.55

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .00001

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

0.55

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

0.03

Third quartile of mutual information between the nominal attributes and the target attribute.

0.33

Error rate achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

0.37

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

0.11

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .00001

0.65

Third quartile of skewness among attributes of the numeric type.

0

Kappa coefficient achieved by the landmarker weka.classifiers.trees.DecisionStump -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

0.12

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 2

0.55

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .0001

0.64

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes

9.62

Third quartile of standard deviation of attributes of the numeric type.

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

0.55

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

-1.37

First quartile of kurtosis among attributes of the numeric type.

0.5

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.REPTree -L 1

0.33

Error rate achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

0.37

Error rate achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

0.11

Kappa coefficient achieved by the landmarker weka.classifiers.trees.J48 -C .0001

0.02

Average mutual information between the nominal attributes and the target attribute.

0

Kappa coefficient achieved by the landmarker weka.classifiers.bayes.NaiveBayes -E "weka.attributeSelection.CfsSubsetEval -P 1 -E 1" -S "weka.attributeSelection.BestFirst -D 1 -N 5" -W

0.12

Kappa coefficient achieved by the landmarker weka.classifiers.trees.RandomTree -depth 3

0.55

Area Under the ROC Curve achieved by the landmarker weka.classifiers.trees.J48 -C .001

130.19

An estimate of the amount of irrelevant information in the attributes regarding the class. Equals (MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation.

0

First quartile of mutual information between the nominal attributes and the target attribute.
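
Two of the qualities listed above are defined by formulas: the number of attributes needed to describe the class (ClassEntropy divided by MeanMutualInformation) and the estimate of irrelevant information ((MeanAttributeEntropy - MeanMutualInformation) divided by MeanMutualInformation). The Python below is a minimal sketch of those two ratios for nominal attributes. The function names are hypothetical, base-2 logarithms are an assumption (the ratios do not depend on the base as long as it is used consistently), and how the original computation treats missing values is not stated on this page, so the sketch assumes complete columns.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of nominal values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equal-length nominal sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def equivalent_number_of_attributes(attributes, target):
    """ClassEntropy / MeanMutualInformation."""
    mean_mi = sum(mutual_information(a, target) for a in attributes) / len(attributes)
    return entropy(target) / mean_mi  # undefined if mean_mi is zero


def noise_to_signal_ratio(attributes, target):
    """(MeanAttributeEntropy - MeanMutualInformation) / MeanMutualInformation."""
    mean_mi = sum(mutual_information(a, target) for a in attributes) / len(attributes)
    mean_h = sum(entropy(a) for a in attributes) / len(attributes)
    return (mean_h - mean_mi) / mean_mi  # undefined if mean_mi is zero


# Toy columns for illustration only; not taken from the data set.
target = [0, 0, 1, 1]
informative = [0, 0, 1, 1]            # determines the target completely
uninformative = ["x", "y", "x", "y"]  # independent of the target
print(equivalent_number_of_attributes([informative, uninformative], target))
print(noise_to_signal_ratio([informative, uninformative], target))
```

Both ratios grow large when the mean mutual information is small relative to the entropies, which is the situation the values reported above describe.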