361
1159
TEST5659a58a3fsklearn.pipeline.Pipeline(scaler=sklearn.preprocessing._data.StandardScaler,dummy=sklearn.dummy.DummyClassifier)
sklearn.Pipeline(StandardScaler,DummyClassifier)
sklearn.pipeline.Pipeline
1
openml==0.14.1,sklearn==1.3.2
Pipeline of transforms with a final estimator.
Sequentially apply a list of transforms and a final estimator.
Intermediate steps of the pipeline must be 'transforms', that is, they
must implement `fit` and `transform` methods.
The final estimator only needs to implement `fit`.
The transformers in the pipeline can be cached using the ``memory`` argument.
The purpose of the pipeline is to assemble several steps that can be
cross-validated together while setting different parameters. For this, it
enables setting parameters of the various steps using their names and the
parameter name separated by a `'__'`. A step's
estimator may be replaced entirely by setting the parameter with its name
to another estimator, or a transformer removed by setting it to
`'passthrough'` or `None`.
For an example use case of `Pipeline` combined with
:class:`~sklearn.model_selection.GridSearchCV`, refer to
:ref:`sphx_glr_auto_examples_compose_plot_compare_reduction.py`. The
example :ref:`sphx_glr_auto_exampl...
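The `'__'` addressing and the `'passthrough'` replacement described above can be sketched as follows. This is a minimal illustration, not part of the recorded flow: the step names `scaler` and `dummy` and the settings `with_mean=False` and `strategy="prior"` come from this record, while the toy data and the switch to `"most_frequent"` are assumptions made for the example.

```python
from sklearn.dummy import DummyClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data, invented for illustration only.
X = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0], [3.0, 40.0]]
y = [0, 0, 0, 1]

# Same step names as this flow: a scaler followed by a dummy classifier.
pipe = Pipeline(
    steps=[
        ("scaler", StandardScaler(with_mean=False)),
        ("dummy", DummyClassifier(strategy="prior")),
    ]
)

# Address a nested parameter as <step name>__<parameter name>.
pipe.set_params(dummy__strategy="most_frequent")
pipe.fit(X, y)
predictions = pipe.predict(X)

# Remove the transformer by setting the step itself to 'passthrough'.
pipe.set_params(scaler="passthrough")
pipe.fit(X, y)
scaler_step = pipe.named_steps["scaler"]
```

The same `<step>__<parameter>` keys are what a parameter grid for `GridSearchCV` would use.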
2024-01-10T15:41:57
English
sklearn==1.3.2
numpy>=1.17.3
scipy>=1.5.0
joblib>=1.1.1
threadpoolctl>=2.0.0
memory
str or object with the joblib
null
Used to cache the fitted transformers of the pipeline. The last step
will never be cached, even if it is a transformer. By default, no
caching is performed. If a string is given, it is the path to the
caching directory. Enabling caching triggers a clone of the transformers
before fitting. Therefore, the transformer instance given to the
pipeline cannot be inspected directly. Use the attribute ``named_steps``
or ``steps`` to inspect estimators within the pipeline. Caching the
transformers is advantageous when fitting is time consuming.
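A small sketch of the cloning behaviour described above, assuming a temporary directory as the cache location (this flow itself records `memory` as null, so no caching is used there). The toy data is invented for illustration.

```python
import tempfile

from sklearn.dummy import DummyClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data, invented for illustration only.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

scaler = StandardScaler()
with tempfile.TemporaryDirectory() as cache_dir:
    pipe = Pipeline(
        steps=[("scaler", scaler), ("dummy", DummyClassifier())],
        memory=cache_dir,  # a string is interpreted as the caching directory
    )
    pipe.fit(X, y)

    # With caching enabled the pipeline fits a clone, so the instance
    # passed in stays unfitted.
    original_is_fitted = hasattr(scaler, "mean_")

    # The fitted transformer is reachable through named_steps.
    fitted_mean = pipe.named_steps["scaler"].mean_[0]
```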
steps
list of tuple
[{"oml-python:serialized_object": "component_reference", "value": {"key": "scaler", "step_name": "scaler"}}, {"oml-python:serialized_object": "component_reference", "value": {"key": "dummy", "step_name": "dummy"}}]
List of (name, transform) tuples (implementing `fit`/`transform`) that
are chained in sequential order. The last transform must be an
estimator.
verbose
bool
false
If True, the time elapsed while fitting each step will be printed as it
is completed.
scaler
362
1159
TEST5659a58a3fsklearn.preprocessing._data.StandardScaler
sklearn.StandardScaler
sklearn.preprocessing._data.StandardScaler
1
openml==0.14.1,sklearn==1.3.2
Standardize features by removing the mean and scaling to unit variance.
The standard score of a sample `x` is calculated as:
z = (x - u) / s
where `u` is the mean of the training samples or zero if `with_mean=False`,
and `s` is the standard deviation of the training samples or one if
`with_std=False`.
Centering and scaling happen independently on each feature by computing
the relevant statistics on the samples in the training set. Mean and
standard deviation are then stored to be used on later data using
:meth:`transform`.
Standardization of a dataset is a common requirement for many
machine learning estimators: they might behave badly if the
individual features do not more or less look like standard normally
distributed data (e.g. Gaussian with 0 mean and unit variance).
For instance, many elements used in the objective function of
a learning algorithm (such as the RBF kernel of Support Vector
Machines or the L1 and L2 regularizers of linear models) assume that
all features are centered around 0 ...
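The standard score formula above can be checked by hand. The sketch below compares `StandardScaler` with a manual NumPy computation, both with the default centering and with `with_mean=False` as recorded for this flow. It assumes the population standard deviation (`ddof=0`, NumPy's default); the toy column is invented for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# One toy feature column, invented for illustration only.
X = np.array([[1.0], [2.0], [3.0], [4.0]])

# Default: u is the column mean and s the column standard deviation.
z_default = StandardScaler().fit_transform(X)
manual_default = (X - X.mean(axis=0)) / X.std(axis=0)

# This flow's setting: with_mean=False, so u is treated as zero.
z_no_center = StandardScaler(with_mean=False).fit_transform(X)
manual_no_center = X / X.std(axis=0)
```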
2024-01-10T15:41:57
English
sklearn==1.3.2
numpy>=1.17.3
scipy>=1.5.0
joblib>=1.1.1
threadpoolctl>=2.0.0
copy
bool
true
If False, try to avoid a copy and do inplace scaling instead.
This is not guaranteed to always work inplace; e.g. if the data is
not a NumPy array or scipy.sparse CSR matrix, a copy may still be
returned.
with_mean
bool
false
If True, center the data before scaling.
This does not work (and will raise an exception) when attempted on
sparse matrices, because centering them entails building a dense
matrix which in common use cases is likely to be too large to fit in
memory.
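The sparse-matrix restriction can be illustrated as below. This is a sketch under the assumption that the exception is a `ValueError`, which is what the scikit-learn versions known to the editor raise; the matrix is toy data.

```python
from scipy import sparse
from sklearn.preprocessing import StandardScaler

# Small sparse toy matrix, invented for illustration only.
X_sparse = sparse.csr_matrix([[0.0, 1.0], [2.0, 0.0], [0.0, 3.0]])

# Centering would require a dense result, so fitting is refused.
try:
    StandardScaler(with_mean=True).fit(X_sparse)
    centering_failed = False
except ValueError:  # assumed exception type, see the note above
    centering_failed = True

# Scaling without centering works and keeps the data sparse.
scaled = StandardScaler(with_mean=False).fit_transform(X_sparse)
still_sparse = sparse.issparse(scaled)
```

Setting `with_mean=False`, as this flow does, is one way to make the scaler usable on sparse input.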
with_std
bool
true
If True, scale the data to unit variance (or equivalently,
unit standard deviation).
openml-python
python
scikit-learn
sklearn
sklearn_1.3.2
dummy
363
1159
TEST5659a58a3fsklearn.dummy.DummyClassifier
sklearn.DummyClassifier
sklearn.dummy.DummyClassifier
1
openml==0.14.1,sklearn==1.3.2
DummyClassifier makes predictions that ignore the input features.
This classifier serves as a simple baseline to compare against other more
complex classifiers.
The specific behavior of the baseline is selected with the `strategy`
parameter.
All strategies make predictions that ignore the input feature values passed
as the `X` argument to `fit` and `predict`. The predictions, however,
typically depend on values observed in the `y` parameter passed to `fit`.
Note that the "stratified" and "uniform" strategies lead to
non-deterministic predictions that can be rendered deterministic by setting
the `random_state` parameter if needed. The other strategies are naturally
deterministic and, once fit, always return the same constant prediction
for any value of `X`.
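A minimal sketch of the "prior" strategy recorded for this flow, showing that predictions depend only on the labels seen in `fit` and not on `X`. The data is invented for illustration.

```python
from sklearn.dummy import DummyClassifier

# Toy data, invented for illustration only: three "a" labels, one "b".
X_train = [[0], [1], [2], [3]]
y_train = ["a", "a", "a", "b"]

clf = DummyClassifier(strategy="prior").fit(X_train, y_train)

# The inputs are ignored: any X yields the most frequent training label.
pred = clf.predict([[100], [-5]])

# predict_proba returns the empirical class distribution of y_train.
proba = clf.predict_proba([[100]])
```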
2024-01-10T15:41:57
English
sklearn==1.3.2
numpy>=1.17.3
scipy>=1.5.0
joblib>=1.1.1
threadpoolctl>=2.0.0
constant
int or str or array
null
The explicit constant as predicted by the "constant" strategy. This
parameter is useful only for the "constant" strategy.
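For comparison, a short sketch of the "constant" strategy. This flow records `constant` as null and uses "prior" instead, so the values below are assumptions chosen for the example.

```python
from sklearn.dummy import DummyClassifier

# Toy data, invented for illustration only.
X_train = [[0], [1], [2], [3]]
y_train = [0, 0, 0, 1]

# Always predict the label 1, which occurs in y_train.
clf = DummyClassifier(strategy="constant", constant=1).fit(X_train, y_train)
pred = clf.predict([[10], [20], [30]])
```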
random_state
int
null
Controls the randomness to generate the predictions when
``strategy='stratified'`` or ``strategy='uniform'``.
Pass an int for reproducible output across multiple function calls.
See :term:`Glossary <random_state>`.
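The effect of `random_state` can be sketched with the "stratified" strategy. This flow uses "prior", where the parameter has no effect, so the strategy, the seed and the toy data below are assumptions made for illustration.

```python
from sklearn.dummy import DummyClassifier

# Toy data, invented for illustration only.
X_train = [[0], [1], [2], [3], [4], [5]]
y_train = [0, 0, 0, 1, 1, 1]
X_new = [[0]] * 20

# Two classifiers with the same integer seed draw the same labels.
clf_a = DummyClassifier(strategy="stratified", random_state=0).fit(X_train, y_train)
clf_b = DummyClassifier(strategy="stratified", random_state=0).fit(X_train, y_train)
pred_a = clf_a.predict(X_new)
pred_b = clf_b.predict(X_new)
same_draws = bool((pred_a == pred_b).all())
```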
strategy
"prior"
openml-python
python
scikit-learn
sklearn
sklearn_1.3.2
openml-python
python
scikit-learn
sklearn
sklearn_1.3.2