28386
1159
TESTe577c467d9sklearn.impute._base.SimpleImputer
sklearn.SimpleImputer
sklearn.impute._base.SimpleImputer
1
openml==0.14.1,sklearn==1.3.2
Univariate imputer for completing missing values with simple strategies.
Replace missing values using a descriptive statistic (e.g. mean, median, or
most frequent) along each column, or using a constant value.
2024-01-15T12:59:11
English
sklearn==1.3.2
numpy>=1.17.3
scipy>=1.5.0
joblib>=1.1.1
threadpoolctl>=2.0.0
add_indicator
bool
false
If True, a :class:`MissingIndicator` transform will stack onto output
of the imputer's transform. This allows a predictive estimator
to account for missingness despite imputation. If a feature has no
missing values at fit/train time, the feature won't appear on
the missing indicator even if there are missing values at
transform/test time.
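The behaviour described above can be sketched with a small example. The arrays and variable names are illustrative assumptions, not part of the flow record; the sketch only assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Column 0 is complete at fit time; column 1 has one missing value.
X_train = np.array([[1.0, np.nan], [3.0, 4.0], [5.0, 6.0]])

imputer = SimpleImputer(strategy="mean", add_indicator=True)
out_train = imputer.fit_transform(X_train)
# out_train holds the two imputed columns plus one indicator column,
# created only for column 1 (the only column with missing values at fit).

# At transform time column 0 is missing, but it gets no indicator column
# because it had no missing values during fit.
X_test = np.array([[np.nan, 2.0]])
out_test = imputer.transform(X_test)
```

Here the missing entry in `X_test` is still imputed with the column mean learned at fit time; only the indicator column for it is absent.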
copy
bool
true
If True, a copy of X will be created. If False, imputation will
be done in-place whenever possible. Note that, in the following cases,
a new copy will always be made, even if `copy=False`:
- If `X` is not an array of floating values;
- If `X` is encoded as a CSR matrix;
- If `add_indicator=True`.
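A minimal sketch of the two settings, using made-up arrays. Because in-place imputation only happens "whenever possible", the safe pattern in both cases is to use the returned array rather than rely on the input being modified.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# copy=True (the default): the input array is left untouched.
X_keep = np.array([[1.0, np.nan], [3.0, 4.0]])
out_copy = SimpleImputer(copy=True).fit_transform(X_keep)
# X_keep still contains its NaN; out_copy holds the imputed values.

# copy=False: for a dense floating-point array such as this one, the
# imputer is allowed to write into X_inplace directly. Always use the
# returned array, since a copy may still be made in the listed cases.
X_inplace = np.array([[1.0, np.nan], [3.0, 4.0]])
out_inplace = SimpleImputer(copy=False).fit_transform(X_inplace)
```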
fill_value
str or numerical value
null
When strategy == "constant", `fill_value` is used to replace all
occurrences of `missing_values`. For string or object data types,
`fill_value` must be a string.
If `None`, `fill_value` will be 0 when imputing numerical
data and "missing_value" for strings or object data types.
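The defaults described above can be illustrated as follows. The input arrays are invented for the example, and the explicit `fill_value=-1.0` is just one possible choice.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Numerical data: with fill_value=None the constant defaults to 0.
X_num = np.array([[1.0, np.nan], [np.nan, 4.0]])
out_num = SimpleImputer(strategy="constant").fit_transform(X_num)

# Object data: with fill_value=None the constant is "missing_value".
X_str = np.array([["a", np.nan], [np.nan, "b"]], dtype=object)
out_str = SimpleImputer(strategy="constant").fit_transform(X_str)

# An explicit fill_value overrides the default.
out_custom = SimpleImputer(strategy="constant", fill_value=-1.0).fit_transform(X_num)
```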
keep_empty_features
bool
false
If True, features that consist exclusively of missing values when
`fit` is called are returned in results when `transform` is called.
The imputed value is always `0` except when `strategy="constant"`,
in which case `fill_value` will be used instead.
.. versionadded:: 1.2
missing_values
int, float, str, np.nan, None or pandas.NA
NaN
The placeholder for the missing values. All occurrences of
`missing_values` will be imputed. For pandas' dataframes with
nullable integer dtypes with missing values, `missing_values`
can be set to either `np.nan` or `pd.NA`.
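As an illustration of a placeholder other than NaN, the sketch below treats `-1` as the missing-value marker. The sentinel and the data are assumptions chosen for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Here -1 marks a missing entry instead of NaN.
X = np.array([[1, -1], [3, 4], [-1, 6]])

imputer = SimpleImputer(missing_values=-1, strategy="mean")
out = imputer.fit_transform(X)
# Every -1 is replaced by the mean of the remaining values in its column.
```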
strategy
str
"mean"
The imputation strategy.
- If "mean", then replace missing values using the mean along
each column. Can only be used with numeric data.
- If "median", then replace missing values using the median along
each column. Can only be used with numeric data.
- If "most_frequent", then replace missing values using the most frequent
value along each column. Can be used with strings or numeric data.
If there is more than one such value, only the smallest is returned.
- If "constant", then replace missing values with `fill_value`. Can be
used with strings or numeric data.
.. versionadded:: 0.20
strategy="constant" for fixed value imputation
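The four strategies can be compared on one small array by inspecting the fitted `statistics_` attribute, which holds the per-column fill values. The data below is made up so that column 0 has a three-way tie for the most frequent value.

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([
    [1.0, 2.0],
    [np.nan, 2.0],
    [7.0, 5.0],
    [4.0, np.nan],
])

# Fit one imputer per strategy and collect the learned fill values.
stats = {}
for strategy in ("mean", "median", "most_frequent", "constant"):
    imputer = SimpleImputer(strategy=strategy).fit(X)
    stats[strategy] = imputer.statistics_.tolist()

# Column 0 (1, 7, 4): every value occurs once, so "most_frequent"
# falls back to the smallest value. Column 1 (2, 2, 5): mode is 2.
```

With `strategy="constant"` and no explicit `fill_value`, the fill values for this numerical input are 0.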
openml-python
python
scikit-learn
sklearn
sklearn_1.3.2