28386 1159 TESTe577c467d9sklearn.impute._base.SimpleImputer sklearn.SimpleImputer sklearn.impute._base.SimpleImputer 1 openml==0.14.1,sklearn==1.3.2

Univariate imputer for completing missing values with simple strategies.

Replace missing values using a descriptive statistic (e.g. mean, median, or most frequent) along each column, or using a constant value.

2024-01-15T12:59:11 English

sklearn==1.3.2 numpy>=1.17.3 scipy>=1.5.0 joblib>=1.1.1 threadpoolctl>=2.0.0

add_indicator bool false
    If True, a :class:`MissingIndicator` transform will stack onto output of the imputer's transform. This allows a predictive estimator to account for missingness despite imputation. If a feature has no missing values at fit/train time, the feature won't appear on the missing indicator even if there are missing values at transform/test time.

copy bool true
    If True, a copy of X will be created. If False, imputation will be done in-place whenever possible. Note that, in the following cases, a new copy will always be made, even if `copy=False`:

    - If `X` is not an array of floating values;
    - If `X` is encoded as a CSR matrix;
    - If `add_indicator=True`.

fill_value str or numerical value null
    When strategy == "constant", `fill_value` is used to replace all occurrences of missing_values. For string or object data types, `fill_value` must be a string. If `None`, `fill_value` will be 0 when imputing numerical data and "missing_value" for strings or object data types.

keep_empty_features bool false
    If True, features that consist exclusively of missing values when `fit` is called are returned in results when `transform` is called. The imputed value is always `0` except when `strategy="constant"`, in which case `fill_value` will be used instead.

    .. versionadded:: 1.2

missing_values int NaN
    The placeholder for the missing values. All occurrences of `missing_values` will be imputed. For pandas' dataframes with nullable integer dtypes with missing values, `missing_values` can be set to either `np.nan` or `pd.NA`.

strategy str "mean"
    The imputation strategy.

    - If "mean", then replace missing values using the mean along each column. Can only be used with numeric data.
    - If "median", then replace missing values using the median along each column. Can only be used with numeric data.
    - If "most_frequent", then replace missing values using the most frequent value along each column. Can be used with strings or numeric data. If there is more than one such value, only the smallest is returned.
    - If "constant", then replace missing values with fill_value. Can be used with strings or numeric data.

    .. versionadded:: 0.20
       strategy="constant" for fixed value imputation.

openml-python python scikit-learn sklearn sklearn_1.3.2
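The parameters described above can be illustrated with a minimal sketch. It assumes `numpy` and `scikit-learn` are installed at versions compatible with the dependencies listed in this record; the toy array `X` and the variable names are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical toy data; np.nan marks the missing entries.
X = np.array([
    [1.0, 2.0],
    [np.nan, 2.0],
    [7.0, 4.0],
    [4.0, np.nan],
    [np.nan, 4.0],
])

# strategy="mean": each column's missing entries get that column's mean.
mean_imp = SimpleImputer(strategy="mean")
X_mean = mean_imp.fit_transform(X)

# strategy="most_frequent": when several values tie, the smallest is used.
freq_imp = SimpleImputer(strategy="most_frequent")
X_freq = freq_imp.fit_transform(X)

# strategy="constant": every missing entry is replaced by fill_value.
const_imp = SimpleImputer(strategy="constant", fill_value=-1.0)
X_const = const_imp.fit_transform(X)

# add_indicator=True: indicator columns are stacked onto the imputed output,
# one per feature that had missing values when fit was called.
ind_imp = SimpleImputer(strategy="median", add_indicator=True)
X_ind = ind_imp.fit_transform(X)

print("per-column means:", mean_imp.statistics_)
print("per-column modes:", freq_imp.statistics_)
print("constant-filled:\n", X_const)
print("shape with indicator columns:", X_ind.shape)
```

The fitted per-column values are exposed through the `statistics_` attribute. Because both columns of `X` contain missing values at fit time, the `add_indicator=True` output carries two extra columns.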