New in version 0.1.7.

class quapy.classification.calibration.BCTSCalibration(classifier, val_split=5, n_jobs=None, verbose=False)

Bases: RecalibratedProbabilisticClassifierBase

Applies the Bias-Corrected Temperature Scaling (BCTS) calibration method from abstention.calibration, as defined in the paper by Alexandari et al. (2020).

  • classifier – a scikit-learn probabilistic classifier

  • val_split – an integer k to obtain the posterior probabilities via k-fold cross-validation (kFCV), or a float p in (0,1) to obtain them on a stratified validation split containing a proportion p of the training instances (the rest is used for training). In any case, the classifier is retrained on the whole training set afterwards. Default value is 5.

  • n_jobs – indicate the number of parallel workers (only when val_split is an integer)

  • verbose – whether or not to display information in the standard output
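As a rough illustration of the kind of map BCTS applies once its parameters have been fitted, the sketch below divides the log-posteriors by a single temperature, adds a per-class bias, and re-normalizes with softmax. It does not call abstention.calibration; the function name bcts_transform and the parameter values are hypothetical, and in practice the temperature and biases are learned from the held-out posteriors.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of real-valued scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def bcts_transform(posteriors, temperature, biases):
    """Apply a bias-corrected temperature scaling map to one posterior vector.

    posteriors  -- uncalibrated class probabilities (all strictly positive)
    temperature -- a single scalar shared by all classes (hypothetical value)
    biases      -- one additive bias per class (hypothetical values)
    """
    logits = [math.log(p) for p in posteriors]
    return softmax([z / temperature + b for z, b in zip(logits, biases)])

uncalibrated = [0.7, 0.2, 0.1]
# A temperature of 1 and zero biases leave the posteriors unchanged
identity = bcts_transform(uncalibrated, 1.0, [0.0, 0.0, 0.0])
# A temperature above 1 softens the distribution; the bias favours the last class
calibrated = bcts_transform(uncalibrated, 2.0, [0.0, 0.0, 0.5])
```

Setting all biases to zero reduces this map to plain temperature scaling.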

class quapy.classification.calibration.NBVSCalibration(classifier, val_split=5, n_jobs=None, verbose=False)

Bases: RecalibratedProbabilisticClassifierBase

Applies the No-Bias Vector Scaling (NBVS) calibration method from abstention.calibration, as defined in the paper by Alexandari et al. (2020).

  • classifier – a scikit-learn probabilistic classifier

  • val_split – an integer k to obtain the posterior probabilities via k-fold cross-validation (kFCV), or a float p in (0,1) to obtain them on a stratified validation split containing a proportion p of the training instances (the rest is used for training). In any case, the classifier is retrained on the whole training set afterwards. Default value is 5.

  • n_jobs – indicate the number of parallel workers (only when val_split is an integer)

  • verbose – whether or not to display information in the standard output

class quapy.classification.calibration.RecalibratedProbabilisticClassifier

Bases: object

Abstract class for (re)calibration methods from abstention.calibration, as defined in Alexandari, A., Kundaje, A., & Shrikumar, A. (2020, November). Maximum likelihood with bias-corrected calibration is hard-to-beat at label shift adaptation. In International Conference on Machine Learning (pp. 222-232). PMLR.

class quapy.classification.calibration.RecalibratedProbabilisticClassifierBase(classifier, calibrator, val_split=5, n_jobs=None, verbose=False)

Bases: BaseEstimator, RecalibratedProbabilisticClassifier

Applies a (re)calibration method from abstention.calibration, as defined in the paper by Alexandari et al. (2020).

  • classifier – a scikit-learn probabilistic classifier

  • calibrator – the calibration object (an instance of abstention.calibration.CalibratorFactory)

  • val_split – an integer k to obtain the posterior probabilities via k-fold cross-validation (kFCV), or a float p in (0,1) to obtain them on a stratified validation split containing a proportion p of the training instances (the rest is used for training). In any case, the classifier is retrained on the whole training set afterwards. Default value is 5.

  • n_jobs – indicate the number of parallel workers (only when val_split is an integer); default=None

  • verbose – whether or not to display information in the standard output
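The two accepted forms of val_split suggest a simple dispatch between the fit_cv and fit_tr_val methods documented below. The following is a minimal sketch of that idea, assuming an integer selects cross-validation and a float in (0, 1) selects a single split; the helper select_fit_strategy is hypothetical, and the library's own validation may differ.

```python
def select_fit_strategy(val_split):
    """Return the name of the fitting routine implied by val_split.

    An integer k selects k-fold cross-validation; a float p in (0, 1)
    selects a single stratified train/validation split.
    """
    # bool is a subclass of int in Python, so it is rejected explicitly
    if isinstance(val_split, bool):
        raise ValueError("val_split must be an int or a float in (0, 1)")
    if isinstance(val_split, int):
        if val_split < 2:
            raise ValueError("k-fold cross-validation needs at least 2 folds")
        return "fit_cv"
    if isinstance(val_split, float) and 0.0 < val_split < 1.0:
        return "fit_tr_val"
    raise ValueError("val_split must be an int or a float in (0, 1)")
```

For example, select_fit_strategy(5) names the cross-validation routine, while select_fit_strategy(0.4) names the train/validation routine.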

property classes_

Returns the classes on which the classifier has been trained


array-like of shape (n_classes,)

fit(X, y)

Fits the calibration for the probabilistic classifier.

  • X – array-like of shape (n_samples, n_features) with the data instances

  • y – array-like of shape (n_samples,) with the class labels



fit_cv(X, y)

Fits the calibration in a cross-validation manner, i.e., it generates posterior probabilities for all training instances via cross-validation, and then retrains the classifier on all training instances. The posterior probabilities thus generated are used for calibrating the outputs of the classifier.

  • X – array-like of shape (n_samples, n_features) with the data instances

  • y – array-like of shape (n_samples,) with the class labels
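The cross-validation step can be pictured as follows: every training instance receives a posterior from a model that never saw it during training. This is a self-contained sketch rather than the library's code. The helper names are hypothetical, the folds are contiguous and unshuffled for simplicity, and a stand-in "classifier" that only predicts class frequencies replaces the real one.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def out_of_fold_posteriors(y, k, classes):
    """Return one held-out posterior vector per training instance.

    The stand-in classifier ignores the features and predicts the class
    frequencies of its training folds; a real classifier would be fitted
    on the features of those folds instead.
    """
    posteriors = [None] * len(y)
    for held_out in k_fold_indices(len(y), k):
        held = set(held_out)
        train_labels = [label for i, label in enumerate(y) if i not in held]
        counts = Counter(train_labels)
        freq = [counts[c] / len(train_labels) for c in classes]
        for i in held_out:
            posteriors[i] = freq  # produced by a model that never saw instance i
    return posteriors

labels = [0, 1, 0, 1, 1, 0, 1, 0, 1, 1]
oof = out_of_fold_posteriors(labels, k=5, classes=[0, 1])
```

The resulting held-out posteriors, paired with the true labels, are what a calibrator would be fitted on.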



fit_tr_val(X, y)

Fits the calibration in a train/val-split manner, i.e., it partitions the training instances into a training and a validation set, and then uses the training samples to learn a classifier, which is then used to generate posterior probabilities for the held-out validation data. These posteriors are used to calibrate the classifier. The classifier is not retrained on the whole dataset.

  • X – array-like of shape (n_samples, n_features) with the data instances

  • y – array-like of shape (n_samples,) with the class labels
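A stratified split keeps the class proportions roughly equal in both parts. The sketch below shows one way to build such a split over instance indices using only the standard library; stratified_split, its seed argument, and the rounding rule are illustrative assumptions, not the routine the library uses.

```python
import random
from collections import defaultdict

def stratified_split(y, val_prop, seed=0):
    """Split instance indices into train/validation sets, class by class.

    val_prop is the proportion of each class sent to the validation set.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(y):
        by_class[label].append(index)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_prop)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)

labels = [0] * 60 + [1] * 40
train_idx, val_idx = stratified_split(labels, val_prop=0.4)
```

The classifier would then be trained on the instances in train_idx, and its posteriors on val_idx would feed the calibrator.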




Predicts class labels for the data instances in X


X – array-like of shape (n_samples, n_features) with the data instances


array-like of shape (n_samples,) with the class label predictions


Generates posterior probabilities for the data instances in X


X – array-like of shape (n_samples, n_features) with the data instances


array-like of shape (n_samples, n_classes) with posterior probabilities

class quapy.classification.calibration.TSCalibration(classifier, val_split=5, n_jobs=None, verbose=False)

Bases: RecalibratedProbabilisticClassifierBase

Applies the Temperature Scaling (TS) calibration method from abstention.calibration, as defined in the paper by Alexandari et al. (2020).

  • classifier – a scikit-learn probabilistic classifier

  • val_split – an integer k to obtain the posterior probabilities via k-fold cross-validation (kFCV), or a float p in (0,1) to obtain them on a stratified validation split containing a proportion p of the training instances (the rest is used for training). In any case, the classifier is retrained on the whole training set afterwards. Default value is 5.

  • n_jobs – indicate the number of parallel workers (only when val_split is an integer)

  • verbose – whether or not to display information in the standard output

class quapy.classification.calibration.VSCalibration(classifier, val_split=5, n_jobs=None, verbose=False)

Bases: RecalibratedProbabilisticClassifierBase

Applies the Vector Scaling (VS) calibration method from abstention.calibration, as defined in the paper by Alexandari et al. (2020).

  • classifier – a scikit-learn probabilistic classifier

  • val_split – an integer k to obtain the posterior probabilities via k-fold cross-validation (kFCV), or a float p in (0,1) to obtain them on a stratified validation split containing a proportion p of the training instances (the rest is used for training). In any case, the classifier is retrained on the whole training set afterwards. Default value is 5.

  • n_jobs – indicate the number of parallel workers (only when val_split is an integer)

  • verbose – whether or not to display information in the standard output
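Vector scaling differs from temperature scaling in that each class gets its own multiplicative weight. The sketch below applies such a map to a single posterior vector, with an optional per-class bias. It is an illustration of the parametric form only: vector_scaling and the weight and bias values are hypothetical, and the fitted parameters would normally come from abstention.calibration.

```python
import math

def vector_scaling(posteriors, weights, biases=None):
    """Rescale each log-posterior by its own weight, then re-normalize.

    With biases=None this corresponds to the no-bias variant (NBVS);
    passing one bias per class gives the full vector scaling map (VS).
    """
    if biases is None:
        biases = [0.0] * len(posteriors)
    scores = [w * math.log(p) + b
              for p, w, b in zip(posteriors, weights, biases)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

uncalibrated = [0.7, 0.2, 0.1]
nbvs = vector_scaling(uncalibrated, weights=[0.5, 1.0, 1.0])
vs = vector_scaling(uncalibrated, weights=[0.5, 1.0, 1.0], biases=[0.0, 0.0, 0.3])
```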


class quapy.classification.methods.LowRankLogisticRegression(n_components=100, **kwargs)

Bases: BaseEstimator

An example of a classification method (i.e., an object that implements fit, predict, and predict_proba) that also generates embedded inputs (i.e., that implements transform), such as those required by quapy.method.neural.QuaNet. This is a mock method that makes it easy to instantiate quapy.method.neural.QuaNet on array-like real-valued instances. The transformation applies sklearn.decomposition.TruncatedSVD, while classification is performed by sklearn.linear_model.LogisticRegression in the low-rank space.

  • n_components – the number of principal components to retain

  • kwargs – parameters for the Logistic Regression classifier

fit(X, y)

Fit the model according to the given training data. The fit consists of fitting TruncatedSVD and then LogisticRegression on the low-rank representation.

  • X – array-like of shape (n_samples, n_features) with the instances

  • y – array-like of shape (n_samples,) with the class labels




Get hyper-parameters for this estimator.


a dictionary with parameter names mapped to their values


Predicts labels for the instances X embedded into the low-rank space.


X – array-like of shape (n_samples, n_features) instances to classify


a numpy array of length n containing the label predictions, where n is the number of instances in X


Predicts posterior probabilities for the instances X embedded into the low-rank space.


X – array-like of shape (n_samples, n_features) instances to classify


array-like of shape (n_samples, n_classes) with the posterior probabilities


Set the parameters of this estimator.


parameters – a **kwargs dictionary with the estimator parameters for Logistic Regression and, optionally, n_components for TruncatedSVD


Returns the low-rank approximation of X with n_components dimensions, or X unaltered if n_components >= X.shape[1].


X – array-like of shape (n_samples, n_features) instances to embed


array-like of shape (n_samples, n_components) with the embedded instances


class quapy.classification.neural.CNNnet(vocabulary_size, n_classes, embedding_size=100, hidden_size=256, repr_size=100, kernel_heights=[3, 5, 7], stride=1, padding=0, drop_p=0.5)

Bases: TextClassifierNet

An implementation of quapy.classification.neural.TextClassifierNet based on Convolutional Neural Networks.

  • vocabulary_size – the size of the vocabulary

  • n_classes – number of target classes

  • embedding_size – the dimensionality of the word embeddings space (default 100)

  • hidden_size – the dimensionality of the hidden space (default 256)

  • repr_size – the dimensionality of the document embeddings space (default 100)

  • kernel_heights – list of kernel lengths (default [3,5,7]), i.e., the number of consecutive tokens that each kernel covers

  • stride – convolutional stride (default 1)

  • padding – convolutional padding (default 0)

  • drop_p – drop probability for dropout (default 0.5)
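To see how kernel_heights, stride, and padding interact, the sketch below computes how many positions a kernel visits along the token axis, using the standard output-length formula for a convolution without dilation. It assumes each kernel slides over tokens only, as in the usual text-CNN layout; conv_output_length is a hypothetical helper and says nothing about how CNNnet pools these positions.

```python
def conv_output_length(n_tokens, kernel_height, stride=1, padding=0):
    """Number of positions at which a kernel of the given height is applied.

    Standard formula for a convolution without dilation:
    floor((n_tokens + 2 * padding - kernel_height) / stride) + 1
    """
    span = n_tokens + 2 * padding - kernel_height
    if span < 0:
        raise ValueError("the kernel is taller than the padded document")
    return span // stride + 1

# Positions per kernel for a 300-token document and the default heights
lengths = {k: conv_output_length(300, k) for k in (3, 5, 7)}
```

A document shorter than the tallest kernel has no valid position unless padding is increased, which is why the helper raises in that case.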


Embeds documents (i.e., performs the forward pass up to the next-to-last layer).


input – a batch of instances, typically generated by a torch DataLoader instance (see quapy.classification.neural.TorchDataset)


a torch tensor of shape (n_samples, n_dimensions), where n_samples is the number of documents, and n_dimensions is the dimensionality of the embedding


Get hyper-parameters for this estimator


a dictionary with parameter names mapped to their values

training: bool
property vocabulary_size

Return the size of the vocabulary



class quapy.classification.neural.LSTMnet(vocabulary_size, n_classes, embedding_size=100, hidden_size=256, repr_size=100, lstm_class_nlayers=1, drop_p=0.5)

Bases: TextClassifierNet

An implementation of quapy.classification.neural.TextClassifierNet based on Long Short-Term Memory networks.

  • vocabulary_size – the size of the vocabulary

  • n_classes – number of target classes

  • embedding_size – the dimensionality of the word embeddings space (default 100)

  • hidden_size – the dimensionality of the hidden space (default 256)

  • repr_size – the dimensionality of the document embeddings space (default 100)

  • lstm_class_nlayers – number of LSTM layers (default 1)

  • drop_p – drop probability for dropout (default 0.5)


Embeds documents (i.e., performs the forward pass up to the next-to-last layer).


x – a batch of instances, typically generated by a torch DataLoader instance (see quapy.classification.neural.TorchDataset)


a torch tensor of shape (n_samples, n_dimensions), where n_samples is the number of documents, and n_dimensions is the dimensionality of the embedding


Get hyper-parameters for this estimator


a dictionary with parameter names mapped to their values

training: bool
property vocabulary_size

Return the size of the vocabulary



class quapy.classification.neural.NeuralClassifierTrainer(net: TextClassifierNet, lr=0.001, weight_decay=0, patience=10, epochs=200, batch_size=64, batch_size_test=512, padding_length=300, device='cpu', checkpointpath='../checkpoint/classifier_net.dat')

Bases: object

Trains a neural network for text classification.

  • net – an instance of TextClassifierNet implementing the forward pass

  • lr – learning rate (default 1e-3)

  • weight_decay – weight decay (default 0)

  • patience – number of epochs without any improvement on the validation set to wait before applying early stopping (default 10)

  • epochs – maximum number of training epochs (default 200)

  • batch_size – batch size for training (default 64)

  • batch_size_test – batch size for test (default 512)

  • padding_length – maximum number of tokens to consider in a document (default 300)

  • device – specify ‘cpu’ (default) or ‘cuda’ to enable GPU

  • checkpointpath – where to store the parameters of the best model found so far according to the evaluation in the held-out validation split (default ‘../checkpoint/classifier_net.dat’)
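The interplay between patience, epochs, and the checkpoint can be illustrated with a small simulation over precomputed validation scores. This is a sketch of the general early-stopping pattern, not the trainer's code: train_with_early_stopping is hypothetical, higher scores are assumed to be better, and the exact metric and counting rule used by the trainer are not specified here.

```python
def train_with_early_stopping(validation_scores, patience, max_epochs):
    """Simulate early stopping over a precomputed list of validation scores.

    Returns (best_epoch, stopped_epoch), both 1-based. Higher scores are
    treated as better; each improvement resets the patience counter.
    """
    best_score, best_epoch, waited = float("-inf"), 0, 0
    epoch = 0
    for epoch, score in enumerate(validation_scores[:max_epochs], start=1):
        if score > best_score:
            best_score, best_epoch, waited = score, epoch, 0
            # A real trainer would save the model parameters at this point
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch, epoch

scores = [0.60, 0.65, 0.70, 0.69, 0.68, 0.70, 0.66, 0.71]
best_epoch, stopped_epoch = train_with_early_stopping(
    scores, patience=3, max_epochs=200)
```

With patience=3 the run stops at epoch 6 and keeps the model from epoch 3, so the later improvement at epoch 8 is never reached; a larger patience would have found it.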

property device

Gets the device on which the network is allocated



fit(instances, labels, val_split=0.3)

Fits the model according to the given training data.

  • instances – list of lists of indexed tokens

  • labels – array-like of shape (n_samples,) with the class labels

  • val_split – proportion of training documents to be taken as the validation set (default 0.3)



Get hyper-parameters for this estimator


a dictionary with parameter names mapped to their values


Predicts labels for the instances


instances – list of lists of indexed tokens


a numpy array of length n containing the label predictions, where n is the number of instances


Predicts posterior probabilities for the instances


X – array-like of shape (n_samples, n_features) instances to classify


array-like of shape (n_samples, n_classes) with the posterior probabilities

reset_net_params(vocab_size, n_classes)

Reinitialize the network parameters

  • vocab_size – the size of the vocabulary

  • n_classes – the number of target classes


Set the parameters of this trainer and the learner it is training. In the current version, parameter names for the trainer and learner should be disjoint.


params – a **kwargs dictionary with the parameters


Returns the embeddings of the instances


instances – list of lists of indexed tokens


array-like of shape (n_samples, embed_size) with the embedded instances, where embed_size is defined by the classification network

class quapy.classification.neural.TextClassifierNet

Bases: Module

Abstract Text classifier (torch.nn.Module)


Gets the number of dimensions of the embedding space



abstract document_embedding(x)

Embeds documents (i.e., performs the forward pass up to the next-to-last layer).


x – a batch of instances, typically generated by a torch DataLoader instance (see quapy.classification.neural.TorchDataset)


a torch tensor of shape (n_samples, n_dimensions), where n_samples is the number of documents, and n_dimensions is the dimensionality of the embedding


Performs the forward pass.


x – a batch of instances, typically generated by a torch DataLoader instance (see quapy.classification.neural.TorchDataset)


a tensor of shape (n_instances, n_classes) with the decision scores for each of the instances and classes

abstract get_params()

Get hyper-parameters for this estimator


a dictionary with parameter names mapped to their values


Predicts posterior probabilities for the instances in x


x – a torch tensor of indexed tokens with shape (n_instances, pad_length), where n_instances is the number of instances in the batch and pad_length is the padded length of the documents in the batch


array-like of shape (n_samples, n_classes) with the posterior probabilities

training: bool
property vocabulary_size

Return the size of the vocabulary




Performs Xavier initialization of the network parameters
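For reference, the uniform variant of Xavier (Glorot) initialization draws each weight from U(-a, a) with a = gain * sqrt(6 / (fan_in + fan_out)). The sketch below reproduces that rule with the standard library only. It assumes the uniform variant, which the description above does not state, and xavier_uniform_matrix is a hypothetical helper rather than a method of this class.

```python
import math
import random

def xavier_uniform_matrix(fan_in, fan_out, gain=1.0, seed=0):
    """Sample a fan_out x fan_in weight matrix from U(-a, a).

    The bound a = gain * sqrt(6 / (fan_in + fan_out)) is the Xavier
    (Glorot) uniform rule.
    """
    bound = gain * math.sqrt(6.0 / (fan_in + fan_out))
    rng = random.Random(seed)
    weights = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
               for _ in range(fan_out)]
    return weights, bound

# Example sizes borrowed from the default hidden_size and repr_size above
weights, bound = xavier_uniform_matrix(fan_in=256, fan_out=100)
```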

class quapy.classification.neural.TorchDataset(instances, labels=None)

Bases: Dataset

Transforms labelled instances into a torch.utils.data.DataLoader object

  • instances – list of lists of indexed tokens

  • labels – array-like of shape (n_samples,) with the class labels

asDataloader(batch_size, shuffle, pad_length, device)

Converts the labelled collection into a Torch DataLoader with dynamic padding for the batch

  • batch_size – batch size

  • shuffle – whether or not to shuffle instances

  • pad_length – the maximum length for the list of tokens (dynamic padding is applied, meaning that if the longest document in the batch is shorter than pad_length, then the batch is padded up to its length, and not to pad_length)

  • device – whether to allocate tensors in cpu or in cuda


a torch.utils.data.DataLoader object
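Dynamic padding, as described for pad_length, can be sketched on plain Python lists: the batch is padded to the length of its longest document, capped at pad_length. The helper pad_batch, the padding index 0, and the choice of padding on the left are all illustrative assumptions; the actual collate logic of asDataloader is not shown on this page.

```python
def pad_batch(batch, pad_length, pad_index=0):
    """Pad (or truncate) every document in a batch to a common length.

    The common length is that of the longest document in the batch, capped
    at pad_length. pad_index and left-side padding are illustrative choices.
    """
    target = min(pad_length, max(len(doc) for doc in batch))
    padded = []
    for doc in batch:
        doc = doc[:target]                          # truncate long documents
        filler = [pad_index] * (target - len(doc))
        padded.append(filler + doc)                 # pad short ones on the left
    return padded

batch = [[5, 8, 2], [7, 1], [4, 4, 9, 3, 6]]
short = pad_batch(batch, pad_length=300)   # padded to length 5, not 300
capped = pad_batch(batch, pad_length=4)    # longest document truncated to 4
```

Padding to the batch maximum rather than to pad_length keeps tensors small when most documents are short.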


class quapy.classification.svmperf.SVMperf(svmperf_base, C=0.01, verbose=False, loss='01')

Bases: BaseEstimator, ClassifierMixin

A wrapper for the SVM-perf package by Thorsten Joachims. When using losses for quantification, the source code has to be patched. See the installation documentation for further details.


  • svmperf_base – path to directory containing the binary files svm_perf_learn and svm_perf_classify

  • C – trade-off between training error and margin (default 0.01)

  • verbose – set to True to print the standard output of svm-perf

  • loss – the loss to optimize for. Available losses are “01”, “f1”, “kld”, “nkld”, “q”, “qacc”, “qf1”, “qgm”, “mae”, “mrae”.

decision_function(X, y=None)

Evaluate the decision function for the samples in X.

  • X – array-like of shape (n_samples, n_features) containing the instances to classify

  • y – unused


array-like of shape (n_samples,) containing the decision scores of the instances

fit(X, y)

Trains the SVM for the multivariate performance loss

  • X – training instances

  • y – a binary vector of labels




Predicts labels for the instances X


X – array-like of shape (n_samples, n_features) instances to classify


a numpy array of length n containing the label predictions, where n is the number of instances in X


Set the hyper-parameters for svm-perf. Currently, only the C parameter is supported


parameters – a **kwargs dictionary {‘C’: <float>}

valid_losses = {'01': 0, 'f1': 1, 'kld': 12, 'mae': 26, 'mrae': 27, 'nkld': 13, 'q': 22, 'qacc': 23, 'qf1': 24, 'qgm': 25}
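The valid_losses mapping pairs each accepted loss name with an integer code. A minimal sketch of how a name could be validated against it is shown below; loss_code is a hypothetical helper, the dictionary is copied from the attribute above, and how the wrapper hands the code to the svm-perf binaries is not covered.

```python
VALID_LOSSES = {'01': 0, 'f1': 1, 'kld': 12, 'nkld': 13, 'q': 22,
                'qacc': 23, 'qf1': 24, 'qgm': 25, 'mae': 26, 'mrae': 27}

def loss_code(loss):
    """Return the integer code for a loss name, or raise if it is unknown."""
    if loss not in VALID_LOSSES:
        raise ValueError(
            f"unsupported loss {loss!r}; valid losses are {sorted(VALID_LOSSES)}")
    return VALID_LOSSES[loss]
```

Failing early on an unknown name gives a clearer error than letting the external binary reject it.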

