Module pulearn.bagging
Bagging meta-estimator for PU learning.
Any scikit-learn estimator should work as the base estimator.
This implementation is fully compatible with scikit-learn; in fact, it is based on the code of the sklearn.ensemble.BaggingClassifier class, with very minor changes.
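To make the bagging-PU idea concrete before the API reference below, here is a minimal numpy-only sketch of the core procedure: each bag combines all labeled positives with a random draw of unlabeled points treated as tentative negatives, and out-of-bag unlabeled points accumulate scores across bags. A trivial centroid scorer stands in for the base estimator; all names here are illustrative, not the library's internals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy PU data: y uses 1 for labeled positives, 0 for unlabeled (a hidden mix).
X = rng.normal(size=(100, 2))
X[:30] += 2.0                 # true positives live in a shifted blob
y = np.zeros(100, dtype=int)
y[:20] = 1                    # only some positives carry a label

pos = np.flatnonzero(y == 1)
unl = np.flatnonzero(y == 0)

# Core of the bagging-PU scheme: each bag = all positives plus a random
# draw of unlabeled points; out-of-bag unlabeled points get scored.
scores = np.zeros(len(X))
counts = np.zeros(len(X))
for _ in range(25):
    bag_u = rng.choice(unl, size=len(pos), replace=True)
    mu_pos = X[pos].mean(axis=0)
    mu_neg = X[bag_u].mean(axis=0)
    oob = np.setdiff1d(unl, bag_u)        # unlabeled points left out of the bag
    scores[oob] += X[oob] @ (mu_pos - mu_neg)   # higher = more positive-like
    counts[oob] += 1

# Average OOB score per unlabeled point that was left out at least once.
oob_score_mean = scores[counts > 0] / counts[counts > 0]
```

BaggingPuClassifier implements this scheme with real scikit-learn base estimators, feature subsampling, and parallelism.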
Classes
class BaggingPuClassifier (estimator=None,
n_estimators=10,
max_samples=1.0,
max_features=1.0,
bootstrap=True,
bootstrap_features=False,
oob_score=True,
warm_start=False,
n_jobs=1,
random_state=None,
verbose=0,
balanced_subsample=False)
```python
class BaggingPuClassifier(BaseBaggingPU, ClassifierMixin):
    """A Bagging PU classifier.

    Adapted from sklearn.ensemble.BaggingClassifier, based on
    A bagging SVM to learn from positive and unlabeled examples (2013)
    by Mordelet and Vert
    http://dx.doi.org/10.1016/j.patrec.2013.06.010
    http://members.cbio.mines-paristech.fr/~jvert/svn/bibli/local/Mordelet2013bagging.pdf

    Parameters
    ----------
    estimator : object or None, optional (default=None)
        The base estimator to fit on random subsets of the dataset.
        If None, then the base estimator is a decision tree.
    n_estimators : int, optional (default=10)
        The number of base estimators in the ensemble.
    max_samples : int or float, optional (default=1.0)
        The number of unlabeled samples to draw to train each base
        estimator. Ignored when ``balanced_subsample=True``.
    max_features : int or float, optional (default=1.0)
        The number of features to draw from X to train each base estimator.
        - If int, then draw `max_features` features.
        - If float, then draw `max_features * X.shape[1]` features.
    bootstrap : boolean, optional (default=True)
        Whether samples are drawn with replacement.
    bootstrap_features : boolean, optional (default=False)
        Whether features are drawn with replacement.
    oob_score : bool, optional (default=True)
        Whether to use out-of-bag samples to estimate
        the generalization error.
    warm_start : bool, optional (default=False)
        When set to True, reuse the solution of the previous call to fit
        and add more estimators to the ensemble, otherwise, just fit
        a whole new ensemble.
    n_jobs : int, optional (default=1)
        The number of jobs to run in parallel for both `fit` and `predict`.
        If -1, then the number of jobs is set to the number of cores.
    random_state : int, RandomState instance or None, optional (default=None)
        If int, random_state is the seed used by the random number
        generator; If RandomState instance, random_state is the random
        number generator; If None, the random number generator is the
        RandomState instance used by `np.random`.
    verbose : int, optional (default=0)
        Controls the verbosity of the building process.
    balanced_subsample : bool, optional (default=False)
        When True, each bag always includes all positive samples and draws
        up to ``n_positives`` unlabeled samples (without replacement). This
        yields a roughly 1:1 positive-to-unlabeled ratio when
        ``n_unlabeled >= n_positives``; otherwise, all unlabeled samples
        are used and the bag contains more positives than unlabeled. When
        True, the ``max_samples`` parameter is ignored.

    Attributes
    ----------
    estimator_ : estimator
        The base estimator from which the ensemble is grown.
    estimators_ : list of estimators
        The collection of fitted base estimators.
    estimators_samples_ : list of arrays
        The subset of drawn samples (i.e., the in-bag samples) for each
        base estimator. Each subset is defined by a boolean mask.
    estimators_features_ : list of arrays
        The subset of drawn features for each base estimator.
    classes_ : array of shape = [n_classes]
        The class labels.
    n_classes_ : int or list
        The number of classes.
    oob_score_ : float
        Score of the training dataset obtained using an
        out-of-bag estimate.
    oob_decision_function_ : array of shape = [n_samples, n_classes]
        Decision function computed with out-of-bag estimate on the
        training set. Positive data points, and perhaps some of the
        unlabeled, are left out during the bootstrap. In these cases,
        `oob_decision_function_` contains NaN.
    ensemble_diagnostics_ : dict
        Summary statistics computed after ``fit``. Always present. Keys:
        - ``n_positives`` (int): number of positive training samples.
        - ``n_unlabeled`` (int): number of unlabeled training samples.
        - ``effective_max_samples`` (int): unlabeled samples drawn per bag.
        - ``bag_size`` (int): total samples per bag
          (``effective_max_samples`` + ``n_positives``).
        - ``positive_ratio_in_bags`` (float): fraction of positives in
          each bag.
        When ``oob_score=True`` the following keys are also present:
        - ``oob_score`` (float): out-of-bag accuracy.
        - ``oob_prediction_variance`` (float): variance of the OOB
          positive-class probability estimates across all OOB samples;
          useful as a proxy for ensemble prediction stability.
    """

    def __init__(
        self,
        estimator=None,
        n_estimators=10,
        max_samples=1.0,
        max_features=1.0,
        bootstrap=True,
        bootstrap_features=False,
        oob_score=True,
        warm_start=False,
        n_jobs=1,
        random_state=None,
        verbose=0,
        balanced_subsample=False,
    ):
        """Initialize the Bagging meta-estimator."""
        super(BaggingPuClassifier, self).__init__(
            estimator,
            n_estimators=n_estimators,
            max_samples=max_samples,
            max_features=max_features,
            bootstrap=bootstrap,
            bootstrap_features=bootstrap_features,
            oob_score=oob_score,
            warm_start=warm_start,
            n_jobs=n_jobs,
            random_state=random_state,
            verbose=verbose,
            balanced_subsample=balanced_subsample,
        )

    def _validate_estimator(self):
        """Check the estimator and set the estimator_ attribute."""
        super(BaggingPuClassifier, self)._validate_estimator(
            default=DecisionTreeClassifier()
        )

    def _set_oob_score(self, X, y):
        n_samples = y.shape[0]
        n_classes_ = self.n_classes_
        # classes_ = self.classes_

        predictions = np.zeros((n_samples, n_classes_))

        for estimator, samples, features in zip(
            self.estimators_,
            self.estimators_samples_,
            self.estimators_features_,
        ):
            # Create mask for OOB samples
            mask = ~samples

            if hasattr(estimator, "predict_proba"):
                predictions[mask, :] += estimator.predict_proba(
                    (X[mask, :])[:, features]
                )
            else:
                p = estimator.predict((X[mask, :])[:, features])
                j = 0
                for i in range(n_samples):
                    if mask[i]:
                        predictions[i, p[j]] += 1
                        j += 1

        # Modified: no warnings about non-OOB points (i.e. positives)
        with np.errstate(invalid="ignore"):
            denominator = predictions.sum(axis=1)[:, np.newaxis]
            oob_decision_function = predictions / denominator
            oob_score = accuracy_score(y, np.argmax(predictions, axis=1))

        self.oob_decision_function_ = oob_decision_function
        self.oob_score_ = oob_score

    def _validate_y(self, y):
        y = column_or_1d(y, warn=True)
        y = normalize_pu_y(
            y,
            require_positive=True,
            require_unlabeled=True,
            strict=True,
        )
        self.classes_ = np.array([0, 1], dtype=int)
        self.n_classes_ = 2
        return y

    def predict(self, X):
        """Predict class for X.

        The predicted class of an input sample is computed as the class
        with the highest mean predicted probability. If base estimators
        do not implement a ``predict_proba`` method, then it resorts to
        voting.

        Parameters
        ----------
        X : {array-like, sparse matrix} of shape = [n_samples, n_features]
            The training input samples. Sparse matrices are accepted only
            if they are supported by the base estimator.

        Returns
        -------
        y : array of shape = [n_samples]
            The predicted classes.
        """
        predicted_probability = self.predict_proba(X)
        return self.classes_.take(
            (np.argmax(predicted_probability, axis=1)), axis=0
        )

    def predict_proba(self, X):
        """Predict class probabilities for X.

        The predicted class probabilities of an input sample are computed
        as the mean predicted class probabilities of the base estimators
        in the ensemble. If base estimators do not implement a
        ``predict_proba`` method, then it resorts to voting and the
        predicted class probabilities of an input sample represent the
        proportion of estimators predicting each class.

        Parameters
        ----------
        X : {array-like, sparse matrix} of shape = [n_samples, n_features]
            The training input samples. Sparse matrices are accepted only
            if they are supported by the base estimator.

        Returns
        -------
        p : array of shape = [n_samples, n_classes]
            The class probabilities of the input samples. The order of the
            classes corresponds to that in the attribute `classes_`.
        """
        check_is_fitted(self, "classes_")
        # Check data
        X = check_array(X, accept_sparse=["csr", "csc"])

        if self.n_features_ != X.shape[1]:
            raise ValueError(
                "Number of features of the model must "
                "match the input. Model n_features is {0} and "
                "input n_features is {1}."
                "".format(self.n_features_, X.shape[1])
            )

        # Parallel loop
        n_jobs, n_estimators, starts = _partition_estimators(
            self.n_estimators, self.n_jobs
        )

        all_proba = Parallel(n_jobs=n_jobs, verbose=self.verbose)(
            delayed(_parallel_predict_proba)(
                self.estimators_[starts[i] : starts[i + 1]],
                self.estimators_features_[starts[i] : starts[i + 1]],
                X,
                self.n_classes_,
            )
            for i in range(n_jobs)
        )

        # Reduce
        proba = sum(all_proba) / self.n_estimators
        return proba

    def predict_log_proba(self, X):
        """Predict class log-probabilities for X.

        The predicted class log-probabilities of an input sample are
        computed as the log of the mean predicted class probabilities of
        the base estimators in the ensemble.

        Parameters
        ----------
        X : {array-like, sparse matrix} of shape = [n_samples, n_features]
            The training input samples. Sparse matrices are accepted only
            if they are supported by the base estimator.

        Returns
        -------
        p : array of shape = [n_samples, n_classes]
            The class log-probabilities of the input samples. The order of
            the classes corresponds to that in the attribute `classes_`.
        """
        check_is_fitted(self, "classes_")
        if hasattr(self.estimator_, "predict_log_proba"):
            # Check data
            X = check_array(X, accept_sparse=["csr", "csc"])

            if self.n_features_ != X.shape[1]:
                raise ValueError(
                    "Number of features of the model must "
                    "match the input. Model n_features is {0} "
                    "and input n_features is {1} "
                    "".format(self.n_features_, X.shape[1])
                )

            # Parallel loop
            n_jobs, n_estimators, starts = _partition_estimators(
                self.n_estimators, self.n_jobs
            )

            all_log_proba = Parallel(n_jobs=n_jobs, verbose=self.verbose)(
                delayed(_parallel_predict_log_proba)(
                    self.estimators_[starts[i] : starts[i + 1]],
                    self.estimators_features_[starts[i] : starts[i + 1]],
                    X,
                    self.n_classes_,
                )
                for i in range(n_jobs)
            )

            # Reduce
            log_proba = all_log_proba[0]
            for j in range(1, len(all_log_proba)):  # pragma: no cover
                log_proba = np.logaddexp(log_proba, all_log_proba[j])
            log_proba -= np.log(self.n_estimators)
            return log_proba

        # else, the base estimator has no predict_log_proba, so...
        return np.log(self.predict_proba(X))

    @available_if(lambda self: hasattr(self.estimator, "decision_function"))
    def decision_function(self, X):
        """Average of the decision functions of the base classifiers.

        Parameters
        ----------
        X : {array-like, sparse matrix} of shape = [n_samples, n_features]
            The training input samples. Sparse matrices are accepted only
            if they are supported by the base estimator.

        Returns
        -------
        score : array, shape = [n_samples, k]
            The decision function of the input samples. The columns
            correspond to the classes in sorted order, as they appear in
            the attribute ``classes_``. Regression and binary
            classification are special cases with ``k == 1``, otherwise
            ``k == n_classes``.
        """
        check_is_fitted(self, "classes_")

        # Check data
        X = check_array(X, accept_sparse=["csr", "csc"])

        if self.n_features_ != X.shape[1]:
            raise ValueError(
                "Number of features of the model must "
                "match the input. Model n_features is {0} and "
                "input n_features is {1} "
                "".format(self.n_features_, X.shape[1])
            )

        # Parallel loop
        n_jobs, n_estimators, starts = _partition_estimators(
            self.n_estimators, self.n_jobs
        )

        all_decisions = Parallel(n_jobs=n_jobs, verbose=self.verbose)(
            delayed(_parallel_decision_function)(
                self.estimators_[starts[i] : starts[i + 1]],
                self.estimators_features_[starts[i] : starts[i + 1]],
                X,
            )
            for i in range(n_jobs)
        )

        # Reduce
        decisions = sum(all_decisions) / self.n_estimators
        return decisions
```

A Bagging PU classifier.
Adapted from sklearn.ensemble.BaggingClassifier, based on "A bagging SVM to learn from positive and unlabeled examples" (2013) by Mordelet and Vert: http://dx.doi.org/10.1016/j.patrec.2013.06.010 and http://members.cbio.mines-paristech.fr/~jvert/svn/bibli/local/Mordelet2013bagging.pdf
Parameters

estimator : object or None, optional (default=None)
    The base estimator to fit on random subsets of the dataset. If None, then the base estimator is a decision tree.
n_estimators : int, optional (default=10)
    The number of base estimators in the ensemble.
max_samples : int or float, optional (default=1.0)
    The number of unlabeled samples to draw to train each base estimator. Ignored when balanced_subsample=True.
max_features : int or float, optional (default=1.0)
    The number of features to draw from X to train each base estimator.
    - If int, then draw max_features features.
    - If float, then draw max_features * X.shape[1] features.
bootstrap : boolean, optional (default=True)
    Whether samples are drawn with replacement.
bootstrap_features : boolean, optional (default=False)
    Whether features are drawn with replacement.
oob_score : bool, optional (default=True)
    Whether to use out-of-bag samples to estimate the generalization error.
warm_start : bool, optional (default=False)
    When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble; otherwise, just fit a whole new ensemble.
n_jobs : int, optional (default=1)
    The number of jobs to run in parallel for both fit and predict. If -1, then the number of jobs is set to the number of cores.
random_state : int, RandomState instance or None, optional (default=None)
    If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
verbose : int, optional (default=0)
    Controls the verbosity of the building process.
balanced_subsample : bool, optional (default=False)
    When True, each bag always includes all positive samples and draws up to n_positives unlabeled samples (without replacement). This yields a roughly 1:1 positive-to-unlabeled ratio when n_unlabeled >= n_positives; otherwise, all unlabeled samples are used and the bag contains more positives than unlabeled. When True, the max_samples parameter is ignored.
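The bag-composition rule behind balanced_subsample can be sketched as a few lines of arithmetic. The helper name below is illustrative, not part of the library:

```python
def bag_composition(n_positives, n_unlabeled):
    """Bag composition under balanced_subsample=True (illustrative)."""
    # Each bag keeps all positives and draws at most n_positives
    # unlabeled samples without replacement.
    effective_max_samples = min(n_positives, n_unlabeled)
    bag_size = n_positives + effective_max_samples
    positive_ratio = n_positives / bag_size
    return effective_max_samples, bag_size, positive_ratio

print(bag_composition(50, 1000))  # (50, 100, 0.5) -- balanced 1:1 bag
print(bag_composition(80, 30))    # (30, 110, 80/110) -- more positives than unlabeled
```

When unlabeled samples are plentiful, the ratio is exactly 0.5; when they are scarce, the bag degrades gracefully rather than failing.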
Attributes

estimator_ : estimator
    The base estimator from which the ensemble is grown.
estimators_ : list of estimators
    The collection of fitted base estimators.
estimators_samples_ : list of arrays
    The subset of drawn samples (i.e., the in-bag samples) for each base estimator. Each subset is defined by a boolean mask.
estimators_features_ : list of arrays
    The subset of drawn features for each base estimator.
classes_ : array of shape = [n_classes]
    The class labels.
n_classes_ : int or list
    The number of classes.
oob_score_ : float
    Score of the training dataset obtained using an out-of-bag estimate.
oob_decision_function_ : array of shape = [n_samples, n_classes]
    Decision function computed with out-of-bag estimate on the training set. Positive data points, and perhaps some of the unlabeled, are left out during the bootstrap. In these cases, oob_decision_function_ contains NaN.
ensemble_diagnostics_ : dict
    Summary statistics computed after fit. Always present. Keys:
    - n_positives (int): number of positive training samples.
    - n_unlabeled (int): number of unlabeled training samples.
    - effective_max_samples (int): unlabeled samples drawn per bag.
    - bag_size (int): total samples per bag (effective_max_samples + n_positives).
    - positive_ratio_in_bags (float): fraction of positives in each bag.
    When oob_score=True the following keys are also present:
    - oob_score (float): out-of-bag accuracy.
    - oob_prediction_variance (float): variance of the OOB positive-class probability estimates across all OOB samples; useful as a proxy for ensemble prediction stability.
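The oob_prediction_variance diagnostic can be reproduced by hand from an oob_decision_function_-style array: take the positive-class column, drop the NaN rows (samples that were never out-of-bag), and compute the variance. The array below is made up for illustration:

```python
import numpy as np

# Illustrative OOB decision-function array; rows that were never
# out-of-bag (e.g. positives included in every bag) are NaN.
oob_df = np.array([
    [0.8, 0.2],
    [np.nan, np.nan],   # never out-of-bag
    [0.3, 0.7],
    [0.5, 0.5],
])

p_pos = oob_df[:, 1]                 # positive-class probabilities
valid = ~np.isnan(p_pos)             # keep only genuine OOB rows
oob_prediction_variance = float(np.var(p_pos[valid]))
```

A small variance suggests the bags agree with each other; a large one flags unstable ensemble predictions.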
Initialize the Bagging meta-estimator.
Ancestors
- pulearn.bagging.BaseBaggingPU
- sklearn.ensemble._base.BaseEnsemble
- sklearn.base.MetaEstimatorMixin
- sklearn.base.BaseEstimator
- sklearn.utils._repr_html.base.ReprHTMLMixin
- sklearn.utils._repr_html.base._HTMLDocumentationLinkMixin
- sklearn.utils._metadata_requests._MetadataRequester
- sklearn.base.ClassifierMixin
Methods
def decision_function(self, X)
Average of the decision functions of the base classifiers.
Parameters

X : {array-like, sparse matrix} of shape = [n_samples, n_features]
    The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.

Returns

score : array, shape = [n_samples, k]
    The decision function of the input samples. The columns correspond to the classes in sorted order, as they appear in the attribute classes_. Regression and binary classification are special cases with k == 1, otherwise k == n_classes.
def predict(self, X)
Predict class for X.
The predicted class of an input sample is computed as the class with the highest mean predicted probability. If base estimators do not implement a predict_proba method, then it resorts to voting.

Parameters

X : {array-like, sparse matrix} of shape = [n_samples, n_features]
    The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.

Returns

y : array of shape = [n_samples]
    The predicted classes.
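The label-selection step can be demonstrated in isolation: predict() takes the argmax over the averaged class probabilities and maps it through classes_ (here [0, 1]). The probability values below are made up for illustration:

```python
import numpy as np

classes_ = np.array([0, 1])
mean_proba = np.array([
    [0.9, 0.1],
    [0.4, 0.6],
    [0.5, 0.5],   # a tie resolves to the first class (index 0)
])

# Same reduction predict() performs on the ensemble-averaged probabilities.
y_pred = classes_.take(np.argmax(mean_proba, axis=1), axis=0)
# y_pred -> array([0, 1, 0])
```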
def predict_log_proba(self, X)
Predict class log-probabilities for X.
The predicted class log-probabilities of an input sample are computed as the log of the mean predicted class probabilities of the base estimators in the ensemble.

Parameters

X : {array-like, sparse matrix} of shape = [n_samples, n_features]
    The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.

Returns

p : array of shape = [n_samples, n_classes]
    The class log-probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
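The method's reduction step stays in log space for numerical stability: combining per-estimator log-probabilities with np.logaddexp and then subtracting log(n_estimators) is algebraically the log of the mean probability. A two-estimator check:

```python
import numpy as np

# Log-probabilities from two hypothetical base estimators.
p1 = np.array([[0.9, 0.1]])
p2 = np.array([[0.5, 0.5]])

# log-space mean: log((p1 + p2) / 2) without leaving log space.
log_mean = np.logaddexp(np.log(p1), np.log(p2)) - np.log(2)
direct = np.log((p1 + p2) / 2)

assert np.allclose(log_mean, direct)
```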
def predict_proba(self, X)
Predict class probabilities for X.
The predicted class probabilities of an input sample are computed as the mean predicted class probabilities of the base estimators in the ensemble. If base estimators do not implement a predict_proba method, then it resorts to voting, and the predicted class probabilities of an input sample represent the proportion of estimators predicting each class.

Parameters

X : {array-like, sparse matrix} of shape = [n_samples, n_features]
    The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.

Returns

p : array of shape = [n_samples, n_classes]
    The class probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
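The voting fallback mentioned above reduces to vote counting: each base estimator's hard prediction is a one-hot vote, and the returned "probability" is the fraction of estimators voting for each class. A sketch with four hypothetical estimators:

```python
import numpy as np

n_classes = 2
votes = np.array([1, 1, 0, 1])   # one hard label per base estimator

# Fraction of estimators voting for each class, for a single sample.
proba = np.bincount(votes, minlength=n_classes) / len(votes)
# proba -> array([0.25, 0.75])
```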
def set_fit_request(self: BaggingPuClassifier,
                    *,
                    sample_weight: bool | str | None = '$UNCHANGED$') -> BaggingPuClassifier
```python
def func(*args, **kw):
    """Updates the `_metadata_request` attribute of the consumer
    (`instance`) for the parameters provided as `**kw`.

    This docstring is overwritten below. See REQUESTER_DOC for expected
    functionality.
    """
    if not _routing_enabled():
        raise RuntimeError(
            "This method is only available when metadata routing is enabled."
            " You can enable it using"
            " sklearn.set_config(enable_metadata_routing=True)."
        )

    if self.validate_keys and (set(kw) - set(self.keys)):
        raise TypeError(
            f"Unexpected args: {set(kw) - set(self.keys)} in {self.name}. "
            f"Accepted arguments are: {set(self.keys)}"
        )

    # This makes it possible to use the decorated method as an unbound
    # method, for instance when monkeypatching.
    # https://github.com/scikit-learn/scikit-learn/issues/28632
    if instance is None:
        _instance = args[0]
        args = args[1:]
    else:
        _instance = instance

    # Replicating python's behavior when positional args are given other
    # than `self`, and `self` is only allowed if this method is unbound.
    if args:
        raise TypeError(
            f"set_{self.name}_request() takes 0 positional argument but"
            f" {len(args)} were given"
        )

    requests = _instance._get_metadata_request()
    method_metadata_request = getattr(requests, self.name)

    for prop, alias in kw.items():
        if alias is not UNCHANGED:
            method_metadata_request.add_request(param=prop, alias=alias)
    _instance._metadata_request = requests

    return _instance
```

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on metadata routing for how the routing mechanism works.

The options for each parameter are:

- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
    Metadata routing for sample_weight parameter in fit.

Returns

self : object
    The updated object.
def set_score_request(self: BaggingPuClassifier,
                      *,
                      sample_weight: bool | str | None = '$UNCHANGED$') -> BaggingPuClassifier
Configure whether metadata should be requested to be passed to the
score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on metadata routing for how the routing mechanism works.

The options for each parameter are:

- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
    Metadata routing for sample_weight parameter in score.

Returns

self : object
    The updated object.