cv | Determines the cross-validation splitting strategy
Possible inputs for cv are:
- None, to use the default 3-fold cross validation,
- integer, to specify the number of folds in a `(Stratified)KFold`,
- An object to be used as a cross-validation generator
- An iterable yielding train, test splits
For integer/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
other cases, :class:`KFold` is used
Refer :ref:`User Guide ` for the various
cross-validation strategies that can be used here | default: 3 |
error_score | Value to assign to the score if an error occurs in estimator fitting
If set to 'raise', the error is raised. If a numeric value is given,
FitFailedWarning is raised. This parameter does not affect the refit
step, which will always raise the error | default: "raise" |
estimator | This is assumed to implement the scikit-learn estimator interface
Either estimator needs to provide a ``score`` function,
or ``scoring`` must be passed | default: {"oml-python:serialized_object": "component_reference", "value": {"key": "estimator", "step_name": null}} |
fit_params | Parameters to pass to the fit method | |
iid | If True, the data is assumed to be identically distributed across
the folds, and the loss minimized is the total loss per sample,
and not the mean loss across the folds | default: true |
n_jobs | Number of jobs to run in parallel | default: 1 |
param_grid | Dictionary with parameters names (string) as keys and lists of
parameter settings to try as values, or a list of such
dictionaries, in which case the grids spanned by each dictionary
in the list are explored. This enables searching over any sequence
of parameter settings | default: {"base_estimator__C": [0.01, 0.1, 10], "base_estimator__gamma": [0.01, 0.1, 10]} |
pre_dispatch | Controls the number of jobs that get dispatched during parallel
execution. Reducing this number can be useful to avoid an
explosion of memory consumption when more jobs get dispatched
than CPUs can process. This parameter can be:
- None, in which case all the jobs are immediately
created and spawned. Use this for lightweight and
fast-running jobs, to avoid delays due to on-demand
spawning of the jobs
- An int, giving the exact number of total jobs that are
spawned
- A string, giving an expression as a function of n_jobs,
as in '2*n_jobs' | default: "2*n_jobs" |
refit | Refit the best estimator with the entire dataset
If "False", it is impossible to make predictions using
this GridSearchCV instance after fitting | default: true |
return_train_score | If ``'False'``, the ``cv_results_`` attribute will not include training
scores
Examples
--------
>>> from sklearn import svm, datasets
>>> from sklearn.model_selection import GridSearchCV
>>> iris = datasets.load_iris()
>>> parameters = {'kernel':('linear', 'rbf'), 'C':[1, 10]}
>>> svr = svm.SVC()
>>> clf = GridSearchCV(svr, parameters)
>>> clf.fit(iris.data, iris.target)
... # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
GridSearchCV(cv=None, error_score=...,
estimator=SVC(C=1.0, cache_size=..., class_weight=..., coef0=...,
decision_function_shape=None, degree=..., gamma=...,
kernel='rbf', max_iter=-1, probability=False,
random_state=None, shrinking=True, tol=...,
verbose=False),
fit_params={}, iid=..., n_jobs=1,
param_grid=..., pre_dispatch=..., refit=..., return_train_score=...,
scoring=..., verbose=...)
>>> sorted(clf.cv_results_.keys())
... ... | default: true |
scoring | A string (see model evaluation documentation) or
a scorer callable object / function with signature
``scorer(estimator, X, y)``
If ``None``, the ``score`` method of the estimator is used | default: null |
verbose | Controls the verbosity: the higher, the more messages | default: 0 |