Positive-unlabeled learning with Python

Positive-unlabeled learning (aka PU-learning) is a binary classification scenario in which the training set consists of a set of positively labeled examples and an additional unlabeled set that contains both positive and negative examples in unknown proportions, so no training example is explicitly labeled as negative. Positive-unlabeled learning methods aim to exploit the structure of this scenario during learning, so that the learned notion of the positive class generalizes better than it would under either of two naive alternatives: treating all unlabeled examples as negative, or discarding them and training a one-class classifier on the positive examples alone.
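To make the scenario concrete, here is a minimal sketch in plain Python (no pulearn required) that simulates a PU training set from fully labeled toy data. The helper `make_pu_labels`, its `label_fraction` parameter, and the convention of 1 for "labeled positive" and 0 for "unlabeled" are assumptions made for this illustration only; they are not part of any library API.

```python
import random


def make_pu_labels(y_true, label_fraction, seed=0):
    """Hide labels to simulate a PU training set (hypothetical helper).

    Returns a list s where s[i] == 1 means "labeled positive" and
    s[i] == 0 means "unlabeled" (true class unknown to the learner).
    """
    rng = random.Random(seed)
    positive_idx = [i for i, y in enumerate(y_true) if y == 1]
    n_labeled = round(label_fraction * len(positive_idx))
    # Only true positives can ever receive a label; negatives never do.
    labeled = set(rng.sample(positive_idx, n_labeled))
    return [1 if i in labeled else 0 for i in range(len(y_true))]


# Toy ground truth: 40 positives and 60 negatives (hidden from the learner).
y_true = [1] * 40 + [0] * 60
s = make_pu_labels(y_true, label_fraction=0.25)

n_labeled = sum(s)
unlabeled_truth = [y for y, flag in zip(y_true, s) if flag == 0]
hidden_positives = sum(unlabeled_truth)

print("labeled positives:", n_labeled)
print("unlabeled examples:", len(unlabeled_truth))
print("positives hidden among the unlabeled:", hidden_positives)
# Share of the naive "negative" set that is actually positive.
print("label noise in naive negatives:", hidden_positives / len(unlabeled_truth))
```

In this toy setup a third of the examples that the "treat unlabeled as negative" baseline would call negative are in fact positive, while the one-class baseline would ignore all 90 unlabeled examples. PU-learning methods try to do better than both.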

Pulearn is a Python package that provides fully documented and tested scikit-learn wrappers for existing Python implementations of several positive-unlabeled learning methods. The familiar API and ease of installation should let you get going right away and easily compare PU-learning methods against each other and against naive baselines.

  • Compatible with Python 3+

    Supports Python 3.5 and up, so it fits into modern Python projects.

  • Fully documented

    Every classifier object is meticulously documented, down to the last parameter.



then read the