sklearn.impute.SimpleImputer 파라미터 정리

<Python>/[Sklearn]

sklearn.impute.SimpleImputer 파라미터 정리

9566 2021. 12. 27. 22:38

728x90

SimpleImputer

class sklearn.impute.SimpleImputer(*, missing_values=nan, strategy='mean', fill_value=None,
verbose=0, copy=True, add_indicator=False)

from sklearn.impute import SimpleImputer

728x90

SimpleImputer 파라미터

missing_values = {int, float, str, np.nan or None}, default=np.nan

strategy = str, default=’mean’

# mean, median, constant : 숫자형
# most_frequent : 숫자형, 범주형

fill_value = {str, numerical value}, default=None

# strategy == constant(상수)인 경우, 사용

verbose = int, default=0

copy = bool, default=True

# If True, a copy of X will be created. If False, imputation will be done in-place whenever possible.

# true면 복사본을 만들어서 imputation할거고, false면 그냥 imputation한다.

add_indicator = bool, default=False

# If True, a MissingIndicator transform will stack onto output of the imputer’s transform. This allows a predictive estimator to account for missingness despite imputation. If a feature has no missing values at fit/train time, the feature won’t appear on the missing indicator even if there are missing values at transform/test time.

# true면 missingindicator를 추가한다. = 결측치 열이 생긴다.

# 이러면 예측추정기 + 결측치를 없애는 imputation이라도 결측치의 수를 설명할 수 있다.

# fit/train에 결측치가 없는 변수면 transform/test에 결측치가 있더라도 missingindicator에 보여주지 않길 바란다.

728x90

저작자표시