quapy package

Submodules

quapy.error module

quapy.error.absolute_error(prevs, prevs_hat)
Computes the absolute error between the two prevalence vectors.

Absolute error between two prevalence vectors \(p\) and \(\hat{p}\) is computed as \(AE(p,\hat{p})=\frac{1}{|\mathcal{Y}|}\sum_{y\in \mathcal{Y}}|\hat{p}(y)-p(y)|\), where \(\mathcal{Y}\) are the classes of interest.

Parameters
  • prevs – array-like of shape (n_classes,) with the true prevalence values

  • prevs_hat – array-like of shape (n_classes,) with the predicted prevalence values

Returns

absolute error
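
The formula above can be sketched in a few lines of plain Python. This is only an illustration of the definition, not QuaPy's implementation; the name absolute_error_sketch and the example values are made up here.

```python
def absolute_error_sketch(prevs, prevs_hat):
    """Mean of the per-class absolute differences between two prevalence vectors."""
    if len(prevs) != len(prevs_hat):
        raise ValueError("both vectors must have one entry per class")
    return sum(abs(ph - p) for p, ph in zip(prevs, prevs_hat)) / len(prevs)


# True vs. estimated prevalence for three classes (illustrative values).
ae_value = absolute_error_sketch([0.5, 0.3, 0.2], [0.4, 0.4, 0.2])
print(round(ae_value, 4))
```

Two classes are off by 0.1 each and the third matches, so the result is 0.2/3.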

quapy.error.acc_error(y_true, y_pred)

Computes the error in terms of 1-accuracy. The accuracy is computed as \(\frac{tp+tn}{tp+fp+fn+tn}\), with tp, fp, fn, and tn standing for true positives, false positives, false negatives, and true negatives, respectively.

Parameters
  • y_true – array-like of true labels

  • y_pred – array-like of predicted labels

Returns

1-accuracy

quapy.error.acce(y_true, y_pred)

Computes the error in terms of 1-accuracy. The accuracy is computed as \(\frac{tp+tn}{tp+fp+fn+tn}\), with tp, fp, fn, and tn standing for true positives, false positives, false negatives, and true negatives, respectively.

Parameters
  • y_true – array-like of true labels

  • y_pred – array-like of predicted labels

Returns

1-accuracy

quapy.error.ae(prevs, prevs_hat)
Computes the absolute error between the two prevalence vectors.

Absolute error between two prevalence vectors \(p\) and \(\hat{p}\) is computed as \(AE(p,\hat{p})=\frac{1}{|\mathcal{Y}|}\sum_{y\in \mathcal{Y}}|\hat{p}(y)-p(y)|\), where \(\mathcal{Y}\) are the classes of interest.

Parameters
  • prevs – array-like of shape (n_classes,) with the true prevalence values

  • prevs_hat – array-like of shape (n_classes,) with the predicted prevalence values

Returns

absolute error

quapy.error.f1_error(y_true, y_pred)

F1 error: simply computes the error in terms of macro \(F_1\), i.e., \(1-F_1^M\), where \(F_1\) is the harmonic mean of precision and recall, defined as \(\frac{2tp}{2tp+fp+fn}\), with tp, fp, and fn standing for true positives, false positives, and false negatives, respectively. Macro averaging means the \(F_1\) is computed for each category independently, and then averaged.

Parameters
  • y_true – array-like of true labels

  • y_pred – array-like of predicted labels

Returns

\(1-F_1^M\)
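
As a rough illustration of macro averaging, the sketch below computes \(\frac{2tp}{2tp+fp+fn}\) for each class and averages the results. The function name f1_error_sketch is hypothetical, and taking the class set as the union of the labels found in y_true and y_pred is an assumption made for this example.

```python
def f1_error_sketch(y_true, y_pred):
    """1 - macro-F1, with F1 = 2tp / (2tp + fp + fn) computed per class."""
    # Assumption: the classes are those observed in either label vector.
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # The denominator is never zero: every class appears in y_true or y_pred.
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
    return 1.0 - sum(f1_scores) / len(f1_scores)


f1e_value = f1_error_sketch([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(round(f1e_value, 4))
```

Here class 0 obtains \(F_1=0.5\) and class 1 obtains \(F_1=2/3\), so the macro average is 7/12 and the error is 5/12.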

quapy.error.f1e(y_true, y_pred)

F1 error: simply computes the error in terms of macro \(F_1\), i.e., \(1-F_1^M\), where \(F_1\) is the harmonic mean of precision and recall, defined as \(\frac{2tp}{2tp+fp+fn}\), with tp, fp, and fn standing for true positives, false positives, and false negatives, respectively. Macro averaging means the \(F_1\) is computed for each category independently, and then averaged.

Parameters
  • y_true – array-like of true labels

  • y_pred – array-like of predicted labels

Returns

\(1-F_1^M\)

quapy.error.from_name(err_name)

Gets an error function from its name. E.g., from_name("mae") returns the function quapy.error.mae()

Parameters

err_name – string, the error name

Returns

a callable implementing the requested error

quapy.error.kld(p, p_hat, eps=None)
Computes the Kullback-Leibler divergence between the two prevalence distributions.

Kullback-Leibler divergence between two prevalence distributions \(p\) and \(\hat{p}\) is computed as \(KLD(p,\hat{p})=D_{KL}(p||\hat{p})=\sum_{y\in \mathcal{Y}} p(y)\log\frac{p(y)}{\hat{p}(y)}\), where \(\mathcal{Y}\) are the classes of interest. The distributions are smoothed using the eps factor (see quapy.error.smooth()).

Parameters
  • p – array-like of shape (n_classes,) with the true prevalence values

  • p_hat – array-like of shape (n_classes,) with the predicted prevalence values

  • eps – smoothing factor. KLD is not defined in cases in which the distributions contain zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).

Returns

Kullback-Leibler divergence between the two distributions
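
A minimal sketch of the smoothed divergence, following the two formulas quoted in this module (the smoothing one and the KLD one). The helper names are hypothetical, eps is passed explicitly (the eps=None lookup of the sample size is not reproduced), and the sample size of 100 is an arbitrary choice for the example.

```python
import math


def smooth_sketch(prevs, eps):
    """Additive smoothing: (eps + p(y)) / (eps * n_classes + sum of p)."""
    denominator = eps * len(prevs) + sum(prevs)
    return [(eps + p) / denominator for p in prevs]


def kld_sketch(p, p_hat, eps):
    """KLD between the smoothed versions of p and p_hat."""
    sp = smooth_sketch(p, eps)
    sp_hat = smooth_sketch(p_hat, eps)
    return sum(a * math.log(a / b) for a, b in zip(sp, sp_hat))


sample_size = 100                 # hypothetical T
eps = 1.0 / (2.0 * sample_size)   # the typical 1/(2T) setting
# The zero entries would make the unsmoothed KLD undefined.
kld_value = kld_sketch([0.5, 0.5, 0.0], [0.4, 0.6, 0.0], eps)
kld_same = kld_sketch([0.5, 0.5, 0.0], [0.5, 0.5, 0.0], eps)
print(kld_value, kld_same)
```

Without smoothing, the third class would contribute a 0/0 term; with it, the divergence is finite, positive for different distributions, and zero for identical ones.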

quapy.error.mae(prevs, prevs_hat)

Computes the mean absolute error (see quapy.error.ae()) across the sample pairs.

Parameters
  • prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values

  • prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values

Returns

mean absolute error
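
The relationship between the per-sample error and its mean across sample pairs can be sketched as follows. Both function names are hypothetical and the prevalence values are invented for the example.

```python
def ae_sketch(prevs, prevs_hat):
    """Absolute error for one pair of prevalence vectors."""
    return sum(abs(ph - p) for p, ph in zip(prevs, prevs_hat)) / len(prevs)


def mae_sketch(prevs, prevs_hat):
    """Average of the per-sample absolute errors; inputs have shape (n_samples, n_classes)."""
    errors = [ae_sketch(p, ph) for p, ph in zip(prevs, prevs_hat)]
    return sum(errors) / len(errors)


true_prevs = [[0.5, 0.5], [0.2, 0.8]]
estim_prevs = [[0.4, 0.6], [0.2, 0.8]]
mae_value = mae_sketch(true_prevs, estim_prevs)
print(round(mae_value, 4))
```

The first pair has an absolute error of 0.1 and the second of 0, so the mean is 0.05.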

quapy.error.mean_absolute_error(prevs, prevs_hat)

Computes the mean absolute error (see quapy.error.ae()) across the sample pairs.

Parameters
  • prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values

  • prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values

Returns

mean absolute error

quapy.error.mean_relative_absolute_error(p, p_hat, eps=None)

Computes the mean relative absolute error (see quapy.error.rae()) across the sample pairs. The distributions are smoothed using the eps factor (see quapy.error.smooth()).

Parameters
  • p – array-like of shape (n_samples, n_classes,) with the true prevalence values

  • p_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values

  • eps – smoothing factor. mrae is not defined in cases in which the true distribution contains zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).

Returns

mean relative absolute error

quapy.error.mkld(prevs, prevs_hat, eps=None)

Computes the mean Kullback-Leibler divergence (see quapy.error.kld()) across the sample pairs. The distributions are smoothed using the eps factor (see quapy.error.smooth()).

Parameters
  • prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values

  • prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values

  • eps – smoothing factor. KLD is not defined in cases in which the distributions contain zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).

Returns

mean Kullback-Leibler divergence

quapy.error.mnkld(prevs, prevs_hat, eps=None)

Computes the mean Normalized Kullback-Leibler divergence (see quapy.error.nkld()) across the sample pairs. The distributions are smoothed using the eps factor (see quapy.error.smooth()).

Parameters
  • prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values

  • prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values

  • eps – smoothing factor. NKLD is not defined in cases in which the distributions contain zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).

Returns

mean Normalized Kullback-Leibler divergence

quapy.error.mrae(p, p_hat, eps=None)

Computes the mean relative absolute error (see quapy.error.rae()) across the sample pairs. The distributions are smoothed using the eps factor (see quapy.error.smooth()).

Parameters
  • p – array-like of shape (n_samples, n_classes,) with the true prevalence values

  • p_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values

  • eps – smoothing factor. mrae is not defined in cases in which the true distribution contains zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).

Returns

mean relative absolute error

quapy.error.mse(prevs, prevs_hat)

Computes the mean squared error (see quapy.error.se()) across the sample pairs.

Parameters
  • prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values

  • prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values

Returns

mean squared error

quapy.error.nkld(p, p_hat, eps=None)
Computes the Normalized Kullback-Leibler divergence between the two prevalence distributions.

Normalized Kullback-Leibler divergence between two prevalence distributions \(p\) and \(\hat{p}\) is computed as \(NKLD(p,\hat{p}) = 2\frac{e^{KLD(p,\hat{p})}}{e^{KLD(p,\hat{p})}+1}-1\), where \(KLD\) is the Kullback-Leibler divergence (see quapy.error.kld()). The distributions are smoothed using the eps factor (see quapy.error.smooth()).

Parameters
  • p – array-like of shape (n_classes,) with the true prevalence values

  • p_hat – array-like of shape (n_classes,) with the predicted prevalence values

  • eps – smoothing factor. NKLD is not defined in cases in which the distributions contain zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).

Returns

Normalized Kullback-Leibler divergence between the two distributions
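
The normalization step alone can be sketched as a function of an already computed KLD value. The name nkld_from_kld is hypothetical; the sketch only illustrates how the formula maps the range \([0,\infty)\) of KLD into \([0,1)\).

```python
import math


def nkld_from_kld(kld_value):
    """Applies 2 * e^KLD / (e^KLD + 1) - 1 to a non-negative KLD value."""
    e = math.exp(kld_value)
    return 2.0 * e / (e + 1.0) - 1.0


for kld_value in (0.0, 0.1, 1.0, 10.0):
    print(kld_value, nkld_from_kld(kld_value))
```

Algebraically the expression equals \(\tanh(KLD/2)\), which gives a quick way to check the values: it is 0 when the distributions coincide and approaches 1 as the divergence grows.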

quapy.error.rae(p, p_hat, eps=None)
Computes the relative absolute error between the two prevalence vectors.

Relative absolute error between two prevalence vectors \(p\) and \(\hat{p}\) is computed as \(RAE(p,\hat{p})=\frac{1}{|\mathcal{Y}|}\sum_{y\in \mathcal{Y}}\frac{|\hat{p}(y)-p(y)|}{p(y)}\), where \(\mathcal{Y}\) are the classes of interest. The distributions are smoothed using the eps factor (see quapy.error.smooth()).

Parameters
  • p – array-like of shape (n_classes,) with the true prevalence values

  • p_hat – array-like of shape (n_classes,) with the predicted prevalence values

  • eps – smoothing factor. rae is not defined in cases in which the true distribution contains zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).

Returns

relative absolute error

quapy.error.relative_absolute_error(p, p_hat, eps=None)
Computes the relative absolute error between the two prevalence vectors.

Relative absolute error between two prevalence vectors \(p\) and \(\hat{p}\) is computed as \(RAE(p,\hat{p})=\frac{1}{|\mathcal{Y}|}\sum_{y\in \mathcal{Y}}\frac{|\hat{p}(y)-p(y)|}{p(y)}\), where \(\mathcal{Y}\) are the classes of interest. The distributions are smoothed using the eps factor (see quapy.error.smooth()).

Parameters
  • p – array-like of shape (n_classes,) with the true prevalence values

  • p_hat – array-like of shape (n_classes,) with the predicted prevalence values

  • eps – smoothing factor. rae is not defined in cases in which the true distribution contains zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).

Returns

relative absolute error

quapy.error.se(p, p_hat)
Computes the squared error between the two prevalence vectors.

Squared error between two prevalence vectors \(p\) and \(\hat{p}\) is computed as \(SE(p,\hat{p})=\frac{1}{|\mathcal{Y}|}\sum_{y\in \mathcal{Y}}(\hat{p}(y)-p(y))^2\), where \(\mathcal{Y}\) are the classes of interest.

Parameters
  • p – array-like of shape (n_classes,) with the true prevalence values

  • p_hat – array-like of shape (n_classes,) with the predicted prevalence values

Returns

squared error

quapy.error.smooth(prevs, eps)

Smooths a prevalence distribution with \(\epsilon\) (eps) as: \(\underline{p}(y)=\frac{\epsilon+p(y)}{\epsilon|\mathcal{Y}|+\displaystyle\sum_{y\in \mathcal{Y}}p(y)}\)

Parameters
  • prevs – array-like of shape (n_classes,) with the true prevalence values

  • eps – smoothing factor

Returns

array-like of shape (n_classes,) with the smoothed distribution
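
A short sketch of the smoothing formula, written in plain Python for illustration (the name smooth_sketch and the input values are assumptions of this example, not part of the library).

```python
def smooth_sketch(prevs, eps):
    """Returns (eps + p(y)) / (eps * n_classes + sum of p) for every class y."""
    denominator = eps * len(prevs) + sum(prevs)
    return [(eps + p) / denominator for p in prevs]


# A degenerate distribution with two empty classes.
smoothed = smooth_sketch([1.0, 0.0, 0.0], eps=0.005)
print(smoothed)
```

The result still sums to 1 and no longer contains zeros, which is what the measures relying on ratios or logarithms (rae, kld, nkld) need.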

quapy.evaluation module

quapy.evaluation.artificial_prevalence_prediction(model: quapy.method.base.BaseQuantifier, test: quapy.data.base.LabelledCollection, sample_size, n_prevpoints=101, repeats=1, eval_budget: Optional[int] = None, n_jobs=1, random_seed=42, verbose=False)

Performs the predictions for all samples generated according to the Artificial Prevalence Protocol (APP). The APP consists of exploring a grid of prevalence values containing n_prevpoints points (e.g., [0, 0.05, 0.1, 0.15, …, 1], if n_prevpoints=21), and generating all valid combinations of prevalence values for all classes (e.g., for 3 classes, samples with [0, 0, 1], [0, 0.05, 0.95], …, [1, 0, 0] prevalence values of size sample_size will be considered). The number of samples for each valid combination of prevalence values is indicated by repeats.

Parameters
  • model – the model in charge of generating the class prevalence estimations

  • test – the test set on which to perform APP

  • sample_size – integer, the size of the samples

  • n_prevpoints – integer, the number of different prevalences to sample (or set to None if eval_budget is specified; default 101, i.e., steps of 1%)

  • repeats – integer, the number of repetitions for each prevalence (default 1)

  • eval_budget – integer, if specified, sets a ceiling on the number of evaluations to perform. For example, if there are 3 classes, repeats=1, and eval_budget=20, then n_prevpoints will be set to 5, since this generates 15 different prevalence vectors ([0, 0, 1], [0, 0.25, 0.75], [0, 0.5, 0.5] … [1, 0, 0]), whereas setting n_prevpoints=6 would produce 21 evaluations, exceeding the budget.

  • n_jobs – integer, number of jobs to be run in parallel (default 1)

  • random_seed – integer, seed that makes the samplings reproducible. The seed is local to the method and does not affect any other random process (default 42)

  • verbose – if True, shows a progress bar

Returns

a tuple containing two np.ndarrays of shape (m,n,) with m the number of samples (the number of valid prevalence combinations multiplied by repeats) and n the number of classes. The first one contains the true prevalence values for the samples generated while the second one contains the prevalence estimations

quapy.evaluation.artificial_prevalence_protocol(model: quapy.method.base.BaseQuantifier, test: quapy.data.base.LabelledCollection, sample_size, n_prevpoints=101, repeats=1, eval_budget: Optional[int] = None, n_jobs=1, random_seed=42, error_metric: Union[str, Callable] = 'mae', verbose=False)

Generates samples according to the Artificial Prevalence Protocol (APP). The APP consists of exploring a grid of prevalence values containing n_prevpoints points (e.g., [0, 0.05, 0.1, 0.15, …, 1], if n_prevpoints=21), and generating all valid combinations of prevalence values for all classes (e.g., for 3 classes, samples with [0, 0, 1], [0, 0.05, 0.95], …, [1, 0, 0] prevalence values of size sample_size will be considered). The number of samples for each valid combination of prevalence values is indicated by repeats.

Parameters
  • model – the model in charge of generating the class prevalence estimations

  • test – the test set on which to perform APP

  • sample_size – integer, the size of the samples

  • n_prevpoints – integer, the number of different prevalences to sample (or set to None if eval_budget is specified; default 101, i.e., steps of 1%)

  • repeats – integer, the number of repetitions for each prevalence (default 1)

  • eval_budget – integer, if specified, sets a ceiling on the number of evaluations to perform. For example, if there are 3 classes, repeats=1, and eval_budget=20, then n_prevpoints will be set to 5, since this generates 15 different prevalence vectors ([0, 0, 1], [0, 0.25, 0.75], [0, 0.5, 0.5] … [1, 0, 0]), whereas setting n_prevpoints=6 would produce 21 evaluations, exceeding the budget.

  • n_jobs – integer, number of jobs to be run in parallel (default 1)

  • random_seed – integer, seed that makes the samplings reproducible. The seed is local to the method and does not affect any other random process (default 42)

  • error_metric – a string indicating the name of the error (as defined in quapy.error) or a callable error function

  • verbose – set to True (default False) for displaying some information on standard output

Returns

yields one sample at a time

quapy.evaluation.artificial_prevalence_report(model: quapy.method.base.BaseQuantifier, test: quapy.data.base.LabelledCollection, sample_size, n_prevpoints=101, repeats=1, eval_budget: Optional[int] = None, n_jobs=1, random_seed=42, error_metrics: Iterable[Union[str, Callable]] = 'mae', verbose=False)

Generates an evaluation report for all samples generated according to the Artificial Prevalence Protocol (APP). The APP consists of exploring a grid of prevalence values containing n_prevpoints points (e.g., [0, 0.05, 0.1, 0.15, …, 1], if n_prevpoints=21), and generating all valid combinations of prevalence values for all classes (e.g., for 3 classes, samples with [0, 0, 1], [0, 0.05, 0.95], …, [1, 0, 0] prevalence values of size sample_size will be considered). The number of samples for each valid combination of prevalence values is indicated by repeats. The report takes the form of a pandas DataFrame in which the rows correspond to different samples, and the columns report the true prevalence values, the estimated prevalence values, and the score obtained by each of the evaluation measures indicated.

Parameters
  • model – the model in charge of generating the class prevalence estimations

  • test – the test set on which to perform APP

  • sample_size – integer, the size of the samples

  • n_prevpoints – integer, the number of different prevalences to sample (or set to None if eval_budget is specified; default 101, i.e., steps of 1%)

  • repeats – integer, the number of repetitions for each prevalence (default 1)

  • eval_budget – integer, if specified, sets a ceiling on the number of evaluations to perform. For example, if there are 3 classes, repeats=1, and eval_budget=20, then n_prevpoints will be set to 5, since this generates 15 different prevalence vectors ([0, 0, 1], [0, 0.25, 0.75], [0, 0.5, 0.5] … [1, 0, 0]), whereas setting n_prevpoints=6 would produce 21 evaluations, exceeding the budget.

  • n_jobs – integer, number of jobs to be run in parallel (default 1)

  • random_seed – integer, seed that makes the samplings reproducible. The seed is local to the method and does not affect any other random process (default 42)

  • error_metrics – a string indicating the name of the error (as defined in quapy.error) or a callable error function; optionally, a list of strings or callables can be indicated, if the results are to be evaluated with more than one error metric. Default is “mae”

  • verbose – if True, shows a progress bar

Returns

a pandas DataFrame with rows corresponding to different samples, and with columns reporting the true prevalence values, the estimated prevalence values, and the score obtained by each of the evaluation measures indicated.

quapy.evaluation.evaluate(model: quapy.method.base.BaseQuantifier, test_samples: Iterable[quapy.data.base.LabelledCollection], error_metric: Union[str, Callable], n_jobs: int = -1)

Evaluates a model on a sequence of test samples in terms of a given error metric.

Parameters
  • model – the model in charge of generating the class prevalence estimations

  • test_samples – an iterable yielding one sample at a time

  • error_metric – a string indicating the name of the error (as defined in quapy.error) or a callable error function

  • n_jobs – integer, number of jobs to be run in parallel (default -1)

Returns

the score obtained using error_metric

quapy.evaluation.gen_prevalence_prediction(model: quapy.method.base.BaseQuantifier, gen_fn: Callable, eval_budget=None)

Generates prevalence predictions for a custom protocol defined as a generator function that yields samples at each iteration. The sequence of samples is processed exhaustively if eval_budget=None or up to the eval_budget iterations if specified.

Parameters
  • model – the model in charge of generating the class prevalence estimations

  • gen_fn – a generator function yielding one sample at each iteration

  • eval_budget – a maximum number of evaluations to run. Set to None (default) for exploring the entire sequence

Returns

a tuple containing two np.ndarrays of shape (m,n,) with m the number of samples generated and n the number of classes. The first one contains the true prevalence values for the samples generated while the second one contains the prevalence estimations

quapy.evaluation.gen_prevalence_report(model: quapy.method.base.BaseQuantifier, gen_fn: Callable, eval_budget=None, error_metrics: Iterable[Union[str, Callable]] = 'mae')

Generates an evaluation report for a custom protocol defined as a generator function that yields samples at each iteration. The sequence of samples is processed exhaustively if eval_budget=None or up to the eval_budget iterations if specified. The report takes the form of a pandas DataFrame in which the rows correspond to different samples, and the columns report the true prevalence values, the estimated prevalence values, and the score obtained by each of the evaluation measures indicated.

Parameters
  • model – the model in charge of generating the class prevalence estimations

  • gen_fn – a generator function yielding one sample at each iteration

  • eval_budget – a maximum number of evaluations to run. Set to None (default) for exploring the entire sequence

  • error_metrics – a string indicating the name of the error (as defined in quapy.error) or a callable error function; optionally, a list of strings or callables can be indicated, if the results are to be evaluated with more than one error metric. Default is "mae"

Returns

a pandas DataFrame with rows corresponding to different samples, and with columns reporting the true prevalence values, the estimated prevalence values, and the score obtained by each of the evaluation measures indicated.

quapy.evaluation.natural_prevalence_prediction(model: quapy.method.base.BaseQuantifier, test: quapy.data.base.LabelledCollection, sample_size, repeats, n_jobs=1, random_seed=42, verbose=False)

Performs the predictions for all samples generated according to the Natural Prevalence Protocol (NPP). The NPP consists of drawing samples uniformly at random, therefore approximately preserving the natural prevalence of the collection.

Parameters
  • model – the model in charge of generating the class prevalence estimations

  • test – the test set on which to perform NPP

  • sample_size – integer, the size of the samples

  • repeats – integer, the number of samples to generate

  • n_jobs – integer, number of jobs to be run in parallel (default 1)

  • random_seed – seed that makes the samplings reproducible. The seed is local to the method and does not affect any other random process (default 42)

  • verbose – if True, shows a progress bar

Returns

a tuple containing two np.ndarrays of shape (m,n,) with m the number of samples (repeats) and n the number of classes. The first one contains the true prevalence values for the samples generated while the second one contains the prevalence estimations

quapy.evaluation.natural_prevalence_protocol(model: quapy.method.base.BaseQuantifier, test: quapy.data.base.LabelledCollection, sample_size, repeats=1, n_jobs=1, random_seed=42, error_metric: Union[str, Callable] = 'mae', verbose=False)

Generates samples according to the Natural Prevalence Protocol (NPP). The NPP consists of drawing samples uniformly at random, therefore approximately preserving the natural prevalence of the collection.

Parameters
  • model – the model in charge of generating the class prevalence estimations

  • test – the test set on which to perform NPP

  • sample_size – integer, the size of the samples

  • repeats – integer, the number of samples to generate

  • n_jobs – integer, number of jobs to be run in parallel (default 1)

  • random_seed – seed that makes the samplings reproducible. The seed is local to the method and does not affect any other random process (default 42)

  • error_metric – a string indicating the name of the error (as defined in quapy.error) or a callable error function

  • verbose – if True, shows a progress bar

Returns

yields one sample at a time

quapy.evaluation.natural_prevalence_report(model: quapy.method.base.BaseQuantifier, test: quapy.data.base.LabelledCollection, sample_size, repeats=1, n_jobs=1, random_seed=42, error_metrics: Iterable[Union[str, Callable]] = 'mae', verbose=False)

Generates an evaluation report for all samples generated according to the Natural Prevalence Protocol (NPP). The NPP consists of drawing samples uniformly at random, therefore approximately preserving the natural prevalence of the collection. The report takes the form of a pandas DataFrame in which the rows correspond to different samples, and the columns report the true prevalence values, the estimated prevalence values, and the score obtained by each of the evaluation measures indicated.

Parameters
  • model – the model in charge of generating the class prevalence estimations

  • test – the test set on which to perform NPP

  • sample_size – integer, the size of the samples

  • repeats – integer, the number of samples to generate

  • n_jobs – integer, number of jobs to be run in parallel (default 1)

  • random_seed – seed that makes the samplings reproducible. The seed is local to the method and does not affect any other random process (default 42)

  • error_metrics – a string indicating the name of the error (as defined in quapy.error) or a callable error function; optionally, a list of strings or callables can be indicated, if the results are to be evaluated with more than one error metric. Default is “mae”

  • verbose – if True, shows a progress bar

Returns

a pandas DataFrame with rows corresponding to different samples, and with columns reporting the true prevalence values, the estimated prevalence values, and the score obtained by each of the evaluation measures indicated.

quapy.functional module

quapy.functional.HellingerDistance(P, Q)

Computes the Hellinger Distance (HD) between (discretized) distributions P and Q. The HD for two discrete distributions of k bins is defined as:

\[HD(P,Q) = \frac{ 1 }{ \sqrt{ 2 } } \sqrt{ \sum_{i=1}^k ( \sqrt{p_i} - \sqrt{q_i} )^2 }\]
Parameters
  • P – real-valued array-like of shape (k,) representing a discrete distribution

  • Q – real-valued array-like of shape (k,) representing a discrete distribution

Returns

float
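
The definition above translates directly into a few lines of plain Python. This is a sketch of the formula only; the function name hellinger_distance_sketch and the example histograms are made up.

```python
import math


def hellinger_distance_sketch(P, Q):
    """HD(P, Q) = sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2) / sqrt(2) for k-bin distributions."""
    if len(P) != len(Q):
        raise ValueError("P and Q must have the same number of bins")
    squared_gaps = sum((math.sqrt(p) - math.sqrt(q)) ** 2 for p, q in zip(P, Q))
    return math.sqrt(squared_gaps) / math.sqrt(2)


hd_identical = hellinger_distance_sketch([0.2, 0.8], [0.2, 0.8])
hd_disjoint = hellinger_distance_sketch([1.0, 0.0], [0.0, 1.0])
hd_close = hellinger_distance_sketch([0.5, 0.5], [0.6, 0.4])
print(hd_identical, hd_disjoint, hd_close)
```

The \(1/\sqrt{2}\) factor bounds the distance in [0, 1]: identical distributions give 0 and distributions with disjoint support give 1.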

quapy.functional.adjusted_quantification(prevalence_estim, tpr, fpr, clip=True)

Implements the adjustment of ACC and PACC for the binary case. The adjustment for a prevalence estimate of the positive class p comes down to computing:

\[ACC(p) = \frac{ p - fpr }{ tpr - fpr }\]
Parameters
  • prevalence_estim – float, the estimated value for the positive class

  • tpr – float, the true positive rate of the classifier

  • fpr – float, the false positive rate of the classifier

  • clip – set to True (default) to clip values that might exceed the range [0,1]

Returns

float, the adjusted count
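
A minimal sketch of the adjustment for the binary case. The function name is hypothetical, and the handling of tpr == fpr (returning the unadjusted estimate) is an assumption of this example; the library may treat that degenerate case differently.

```python
def adjusted_quantification_sketch(prevalence_estim, tpr, fpr, clip=True):
    """Applies (p - fpr) / (tpr - fpr), optionally clipping the result to [0, 1]."""
    denominator = tpr - fpr
    if denominator == 0:
        # Assumption: with an uninformative classifier, keep the raw estimate.
        adjusted = prevalence_estim
    else:
        adjusted = (prevalence_estim - fpr) / denominator
    if clip:
        adjusted = min(1.0, max(0.0, adjusted))
    return adjusted


adjusted_mid = adjusted_quantification_sketch(0.44, tpr=0.8, fpr=0.2)
adjusted_low = adjusted_quantification_sketch(0.10, tpr=0.8, fpr=0.2)
adjusted_raw = adjusted_quantification_sketch(0.10, tpr=0.8, fpr=0.2, clip=False)
print(adjusted_mid, adjusted_low, adjusted_raw)
```

With tpr=0.8 and fpr=0.2, a raw estimate of 0.44 is corrected to about 0.4. A raw estimate below fpr yields a negative value, which clip=True maps to 0.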

quapy.functional.artificial_prevalence_sampling(dimensions, n_prevalences=21, repeat=1, return_constrained_dim=False)

Generates vectors of prevalence values artificially drawn from an exhaustive grid of prevalence values. The number of prevalence values explored for each dimension depends on n_prevalences, so that, if, for example, n_prevalences=11 then the prevalence values of the grid are taken from [0, 0.1, 0.2, …, 0.9, 1]. Only valid prevalence distributions are returned, i.e., vectors of prevalence values that sum up to 1. For each valid vector of prevalence values, repeat copies are returned. The vector of prevalence values can be implicit (by setting return_constrained_dim=False), meaning that the last dimension (which is constrained to 1 - sum of the rest) is not returned (note that, quite obviously, in this case the vector does not sum up to 1).

Parameters
  • dimensions – the number of classes

  • n_prevalences – the number of equidistant prevalence points to extract from the [0,1] interval for the grid (default is 21)

  • repeat – number of copies for each valid prevalence vector (default is 1)

  • return_constrained_dim – set to True to return all dimensions, or to False (default) for omitting the constrained dimension

Returns

a np.ndarray of shape (n, dimensions) if return_constrained_dim=True or of shape (n, dimensions-1) if return_constrained_dim=False, where n is the number of valid combinations found in the grid multiplied by repeat
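
To make the notion of "valid combinations in the grid" concrete, the brute-force sketch below enumerates them. It is only an illustration: the function name is hypothetical, the repeat and return_constrained_dim options are not reproduced, and the exhaustive product is exponential in the number of classes.

```python
from itertools import product


def prevalence_grid_sketch(n_classes, n_prevalences):
    """Lists all prevalence vectors on the grid whose entries sum to 1."""
    n_blocks = n_prevalences - 1  # each block carries 1/n_blocks of probability mass
    grid = []
    # Counting integer blocks avoids floating-point issues in the sum-to-one test.
    for counts in product(range(n_blocks + 1), repeat=n_classes):
        if sum(counts) == n_blocks:
            grid.append([c / n_blocks for c in counts])
    return grid


grid = prevalence_grid_sketch(n_classes=3, n_prevalences=5)
print(len(grid))
print(grid[0], grid[-1])
```

For 3 classes and 5 grid points per class (0, 0.25, 0.5, 0.75, 1), the enumeration runs from [0, 0, 1] to [1, 0, 0] and contains 15 vectors.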

quapy.functional.get_nprevpoints_approximation(combinations_budget: int, n_classes: int, n_repeats: int = 1)

Searches for the largest number of (equidistant) prevalence points to define for each of the n_classes classes so that the number of valid prevalence values generated as combinations of prevalence points (points in an n_classes-dimensional simplex) does not exceed combinations_budget.

Parameters
  • combinations_budget – integer, maximum number of combinations allowed

  • n_classes – integer, number of classes

  • n_repeats – integer, number of repetitions for each prevalence combination

Returns

the largest number of prevalence points that generate no more than combinations_budget valid prevalences
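
One way to picture the search is a simple linear scan that keeps increasing the number of points while the count of combinations stays within the budget. The sketch below assumes the stars-and-bars count \(\binom{N+C-1}{C-1} \times r\) with N = n_prevpoints-1; both function names are hypothetical, math.comb requires Python 3.8 or later, and the argument checks are choices made for this example.

```python
from math import comb


def num_combinations_sketch(n_prevpoints, n_classes, n_repeats=1):
    """Stars-and-bars count of valid prevalence vectors, times the repetitions."""
    n_blocks = n_prevpoints - 1
    return comb(n_blocks + n_classes - 1, n_classes - 1) * n_repeats


def nprevpoints_approximation_sketch(combinations_budget, n_classes, n_repeats=1):
    """Largest n_prevpoints whose number of combinations does not exceed the budget."""
    if n_classes < 2 or n_repeats < 1 or combinations_budget < n_repeats:
        raise ValueError("need n_classes >= 2 and a budget of at least n_repeats")
    n_prevpoints = 1
    while num_combinations_sketch(n_prevpoints + 1, n_classes, n_repeats) <= combinations_budget:
        n_prevpoints += 1
    return n_prevpoints


best = nprevpoints_approximation_sketch(combinations_budget=20, n_classes=3)
print(best, num_combinations_sketch(best, 3), num_combinations_sketch(best + 1, 3))
```

For 3 classes and a budget of 20, the scan stops at 5 points (15 combinations), because 6 points would already produce 21.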

quapy.functional.normalize_prevalence(prevalences)

Normalizes a vector or matrix of prevalence values. The normalization consists of applying an L1 normalization when the prevalence values are not all zeros, and of setting every prevalence value to 1/n_classes when all values are zero.

Parameters

prevalences – array-like of shape (n_classes,) or of shape (n_samples, n_classes,) with prevalence values

Returns

a normalized vector or matrix of prevalence values
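
For a single vector, the rule just described can be sketched as follows. The function name is hypothetical, the values are assumed to be non-negative, and the matrix case (which would apply the same rule row by row) is left out.

```python
def normalize_prevalence_sketch(prevalences):
    """L1-normalizes a non-negative vector; an all-zero vector becomes uniform."""
    total = sum(prevalences)
    if total == 0:
        return [1.0 / len(prevalences)] * len(prevalences)
    return [p / total for p in prevalences]


normalized = normalize_prevalence_sketch([2, 1, 1])
uniform = normalize_prevalence_sketch([0, 0, 0, 0])
print(normalized, uniform)
```

The first call rescales the counts to [0.5, 0.25, 0.25]; the second falls back to 1/n_classes for each of the four classes.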

quapy.functional.num_prevalence_combinations(n_prevpoints: int, n_classes: int, n_repeats: int = 1)

Computes the number of valid prevalence combinations in the n_classes-dimensional simplex if n_prevpoints equally distant prevalence values are generated and n_repeats repetitions are requested. The computation comes down to calculating:

\[\binom{N+C-1}{C-1} \times r\]

where N is n_prevpoints-1, i.e., the number of probability mass blocks to allocate, C is the number of classes, and r is n_repeats. This solution comes from the Stars and Bars problem.

Parameters
  • n_classes – integer, number of classes

  • n_prevpoints – integer, number of prevalence points.

  • n_repeats – integer, number of repetitions for each prevalence combination

Returns

The number of possible combinations. For example, if n_classes=2, n_prevpoints=5, n_repeats=1, then the number of possible combinations is 5, i.e.: [0,1], [0.25,0.75], [0.50,0.50], [0.75,0.25], and [1.0,0.0]
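
The stars-and-bars formula can be checked quickly with the standard library. The sketch below is an illustration of the formula rather than the library's code; the function name is hypothetical and math.comb requires Python 3.8 or later.

```python
from math import comb


def num_prevalence_combinations_sketch(n_prevpoints, n_classes, n_repeats=1):
    """Evaluates C(N + C - 1, C - 1) * r with N = n_prevpoints - 1."""
    N = n_prevpoints - 1  # probability mass blocks to allocate
    C = n_classes
    return comb(N + C - 1, C - 1) * n_repeats


binary_count = num_prevalence_combinations_sketch(n_prevpoints=5, n_classes=2)
ternary_count = num_prevalence_combinations_sketch(n_prevpoints=5, n_classes=3)
repeated_count = num_prevalence_combinations_sketch(n_prevpoints=21, n_classes=3, n_repeats=2)
print(binary_count, ternary_count, repeated_count)
```

The binary case reproduces the 5 combinations listed above. With 3 classes the same 5 points give 15 combinations, and 21 points with 2 repetitions give 231 × 2 = 462.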

quapy.functional.prevalence_from_labels(labels, classes)

Computes the prevalence values from a vector of labels.

Parameters
  • labels – array-like of shape (n_instances) with the label for each instance

  • classes – the class labels. This is needed in order to correctly compute the prevalence vector even when some classes have no examples.

Returns

an ndarray of shape (len(classes)) with the class prevalence values
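
The role of the classes argument is easiest to see with a class that has no examples. The sketch below uses plain Python lists instead of arrays; the function name and the labels are invented for the example.

```python
from collections import Counter


def prevalence_from_labels_sketch(labels, classes):
    """Fraction of instances per class, reported in the order given by classes."""
    if len(labels) == 0:
        raise ValueError("at least one label is needed")
    counts = Counter(labels)
    # Counter returns 0 for classes that never occur, so they get prevalence 0.
    return [counts[c] / len(labels) for c in classes]


label_prevs = prevalence_from_labels_sketch(["a", "b", "a", "a"], classes=["a", "b", "c"])
print(label_prevs)
```

Class "c" never occurs in the labels but still gets an entry, with prevalence 0, so the vector keeps one position per class.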

quapy.functional.prevalence_from_probabilities(posteriors, binarize: bool = False)

Returns a vector of prevalence values from a matrix of posterior probabilities.

Parameters
  • posteriors – array-like of shape (n_instances, n_classes,) with posterior probabilities for each class

  • binarize – set to True (default is False) for computing the prevalence values on crisp decisions (i.e., converting the vectors of posterior probabilities into class indices, by taking the argmax).

Returns

array of shape (n_classes,) containing the prevalence values
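
The difference between the two modes can be sketched as follows, under the assumption that the non-binarized estimate is the column-wise mean of the posteriors (the text above does not spell this out). The function name and the posterior values are made up.

```python
def prevalence_from_probabilities_sketch(posteriors, binarize=False):
    """Prevalence from posteriors: mean posterior per class, or argmax counts if binarize."""
    n_instances = len(posteriors)
    n_classes = len(posteriors[0])
    if binarize:
        counts = [0] * n_classes
        for row in posteriors:
            # Crisp decision: index of the most probable class.
            counts[max(range(n_classes), key=row.__getitem__)] += 1
        return [c / n_instances for c in counts]
    # Assumption: soft estimate taken as the average posterior of each class.
    return [sum(row[j] for row in posteriors) / n_instances for j in range(n_classes)]


posteriors = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
soft_prevs = prevalence_from_probabilities_sketch(posteriors)
crisp_prevs = prevalence_from_probabilities_sketch(posteriors, binarize=True)
print(soft_prevs, crisp_prevs)
```

On this toy input the soft estimate is about [0.6, 0.4], while the crisp decisions (two instances assigned to class 0, one to class 1) give [2/3, 1/3].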

quapy.functional.prevalence_linspace(n_prevalences=21, repeats=1, smooth_limits_epsilon=0.01)

Produces an array of uniformly separated values of prevalence. By default, produces an array of 21 prevalence values, with step 0.05 and with the limits smoothed, i.e.: [0.01, 0.05, 0.10, 0.15, …, 0.90, 0.95, 0.99]

Parameters
  • n_prevalences – the number of prevalence values to sample from the [0,1] interval (default 21)

  • repeats – number of times each prevalence is to be repeated (defaults to 1)

  • smooth_limits_epsilon – the quantity to add and subtract to the limits 0 and 1

Returns

an array of uniformly separated prevalence values
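The behaviour described above can be approximated with a few NumPy calls. This is a sketch under the assumption that smoothing only shifts the two end points; `prevalence_linspace_sketch` is a hypothetical name.

```python
import numpy as np


def prevalence_linspace_sketch(n_prevalences=21, repeats=1, smooth_limits_epsilon=0.01):
    # evenly spaced values covering the closed interval [0, 1]
    p = np.linspace(0.0, 1.0, num=n_prevalences, endpoint=True)
    # move the limits away from the degenerate prevalences 0 and 1
    p[0] += smooth_limits_epsilon
    p[-1] -= smooth_limits_epsilon
    if repeats > 1:
        # repeat each prevalence value consecutively
        p = np.repeat(p, repeats)
    return p


print(prevalence_linspace_sketch())
```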

quapy.functional.strprev(prevalences, prec=3)

Returns a string representation for a prevalence vector. E.g.,

>>> strprev([1/3, 2/3], prec=2)
'[0.33, 0.67]'

Parameters
  • prevalences – a vector of prevalence values

  • prec – number of decimal digits to show for each prevalence value

Returns

string
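An equivalent formatting routine can be sketched with an f-string. The name `strprev_sketch` is hypothetical, and the exact output format is assumed from the example in this entry.

```python
def strprev_sketch(prevalences, prec=3):
    # format every value with `prec` decimal digits and join them inside brackets
    return '[' + ', '.join(f'{p:.{prec}f}' for p in prevalences) + ']'


print(strprev_sketch([1/3, 2/3], prec=2))  # [0.33, 0.67]
```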

quapy.functional.uniform_prevalence_sampling(n_classes, size=1)

Implements the Kraemer algorithm for sampling uniformly at random from the unit simplex. This implementation is adapted from this post (https://cs.stackexchange.com/questions/3227/uniform-sampling-from-a-simplex).

Parameters
  • n_classes – integer, number of classes (dimensionality of the simplex)

  • size – number of samples to return

Returns

np.ndarray of shape (size, n_classes,) if size>1, or of shape (n_classes,) otherwise
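The sorted-cut-points construction behind this kind of sampler can be sketched as follows, assuming NumPy. The name `uniform_simplex_sampling_sketch` and the `rng` argument are hypothetical additions for reproducibility; they are not part of the documented signature.

```python
import numpy as np


def uniform_simplex_sampling_sketch(n_classes, size=1, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # draw n_classes-1 cut points in [0, 1] per sample and sort them
    cuts = np.sort(rng.random((size, n_classes - 1)), axis=-1)
    # pad with 0 and 1; the gaps between consecutive cut points are the prevalences
    padded = np.hstack([np.zeros((size, 1)), cuts, np.ones((size, 1))])
    samples = np.diff(padded, axis=-1)
    # a single sample is returned as a flat vector, as described above
    return samples.flatten() if size == 1 else samples


print(uniform_simplex_sampling_sketch(3, size=2, rng=np.random.default_rng(0)))
```

Each row is non-negative and sums to 1 by construction, since the gaps partition the unit interval.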

quapy.functional.uniform_simplex_sampling(n_classes, size=1)

Implements the Kraemer algorithm for sampling uniformly at random from the unit simplex. This implementation is adapted from this post (https://cs.stackexchange.com/questions/3227/uniform-sampling-from-a-simplex).

Parameters
  • n_classes – integer, number of classes (dimensionality of the simplex)

  • size – number of samples to return

Returns

np.ndarray of shape (size, n_classes,) if size>1, or of shape (n_classes,) otherwise

quapy.model_selection module

class quapy.model_selection.GridSearchQ(model: quapy.method.base.BaseQuantifier, param_grid: dict, sample_size: Optional[int], protocol='app', n_prevpoints: Optional[int] = None, n_repetitions: int = 1, eval_budget: Optional[int] = None, error: Union[Callable, str] = <function mae>, refit=True, val_split=0.4, n_jobs=1, random_seed=42, timeout=-1, verbose=False)

Bases: quapy.method.base.BaseQuantifier

Grid Search optimization targeting a quantification-oriented metric.

Optimizes the hyperparameters of a quantification method, based on an evaluation method and on an evaluation protocol for quantification.

Parameters
  • model (BaseQuantifier) – the quantifier to optimize

  • param_grid – a dictionary with keys the parameter names and values the list of values to explore

  • sample_size – the size of the samples to extract from the validation set (ignored if protocol=’gen’)

  • protocol – either ‘app’ for the artificial prevalence protocol, ‘npp’ for the natural prevalence protocol, or ‘gen’ for using a custom sampling generator function

  • n_prevpoints – if specified, indicates the number of equally distant points to extract from the interval [0,1] in order to define the prevalences of the samples; e.g., if n_prevpoints=5, then the prevalences for each class will be explored in [0.00, 0.25, 0.50, 0.75, 1.00]. If not specified, then eval_budget is requested. Ignored if protocol!=’app’.

  • n_repetitions – the number of repetitions for each combination of prevalences. This parameter is ignored for the protocol=’app’ if eval_budget is set and is lower than the number of combinations that would be generated using the value assigned to n_prevpoints (for the current number of classes and n_repetitions). Ignored for protocol=’npp’ and protocol=’gen’ (use eval_budget for setting a maximum number of samples in those cases).

  • eval_budget – if specified, sets a ceiling on the number of evaluations to perform for each hyper-parameter combination. For example, if protocol=’app’, there are 3 classes, n_repetitions=1 and eval_budget=20, then n_prevpoints will be set to 5, since this will generate 15 different prevalences, i.e., [0, 0, 1], [0, 0.25, 0.75], [0, 0.5, 0.5] … [1, 0, 0], and since setting it to 6 would generate more than 20. When protocol=’gen’, indicates the maximum number of samples to generate, but fewer samples will be generated if the generator yields fewer samples.

  • error – an error function (callable) or a string indicating the name of an error function (valid ones are those in qp.error.QUANTIFICATION_ERROR)

  • refit – whether or not to refit the model on the whole labelled collection (training+validation) with the best chosen hyperparameter combination. Ignored if protocol=’gen’

  • val_split – either a LabelledCollection on which to test the performance of the different settings, or a float in [0,1] indicating the proportion of labelled data to extract from the training set, or a callable returning a generator function each time it is invoked (only for protocol=’gen’).

  • n_jobs – number of parallel jobs

  • random_seed – set the seed of the random generator to replicate experiments. Ignored if protocol=’gen’.

  • timeout – establishes a timer (in seconds) for each of the hyperparameter configurations being tested. Whenever a run takes longer than this timer, that configuration will be ignored. If all configurations end up being ignored, a TimeoutError exception is raised. If -1 (default) then no time bound is set.

  • verbose – set to True to get information through the stdout
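The relation between eval_budget and n_prevpoints described in the parameter list can be sketched as a search for the largest grid that still fits in the budget. The helper `n_prevpoints_from_budget` is hypothetical and assumes the stars-and-bars count of prevalence combinations; it is an illustration, not QuaPy's own routine.

```python
from math import comb


def n_prevpoints_from_budget(n_classes, n_repeats, eval_budget):
    if n_classes < 2:
        raise ValueError('at least two classes are required')

    def n_combinations(n_prevpoints):
        # stars and bars with N = n_prevpoints - 1 blocks, times the repetitions
        return comb(n_prevpoints - 1 + n_classes - 1, n_classes - 1) * n_repeats

    if n_combinations(1) > eval_budget:
        raise ValueError('eval_budget is too small for even one prevalence point')
    n_prevpoints = 1
    # grow the grid while the next size still fits in the budget
    while n_combinations(n_prevpoints + 1) <= eval_budget:
        n_prevpoints += 1
    return n_prevpoints


print(n_prevpoints_from_budget(n_classes=3, n_repeats=1, eval_budget=20))  # 5
```

With 3 classes, 5 points yield 15 combinations while 6 points would yield 21, which exceeds the budget of 20.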

best_model()

Returns the best model found after calling the fit() method, i.e., the one trained on the combination of hyper-parameters that minimized the error function.

Returns

a trained quantifier

property classes_

Classes on which the quantifier has been trained.

Returns

an ndarray of shape (n_classes) with the class identifiers

fit(training: quapy.data.base.LabelledCollection, val_split: Optional[Union[quapy.data.base.LabelledCollection, float, Callable]] = None)
Learning routine. Fits methods with all combinations of hyperparameters and selects the one minimizing the error metric.

Parameters
  • training – the training set on which to optimize the hyperparameters

  • val_split – either a LabelledCollection on which to test the performance of the different settings, or a float in [0,1] indicating the proportion of labelled data to extract from the training set

Returns

self
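The "all combinations, keep the best" logic can be sketched in plain Python. Both helpers and the `evaluate` callable are hypothetical: `evaluate` stands in for fitting a quantifier with one configuration and measuring its error under the chosen protocol.

```python
from itertools import product


def expand_grid(param_grid):
    # yield one dict per element of the Cartesian product of the value lists
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def select_best(param_grid, evaluate):
    # `evaluate` is a hypothetical callable mapping one configuration to an error score
    scored = [(evaluate(params), params) for params in expand_grid(param_grid)]
    best_error, best_params = min(scored, key=lambda pair: pair[0])
    return best_params, best_error


# toy error surface standing in for "fit the quantifier, then measure its error"
def toy_error(params):
    return abs(params['C'] - 1.0) + (0.1 if params['class_weight'] else 0.0)


grid = {'C': [0.1, 1.0, 10.0], 'class_weight': [None, 'balanced']}
print(select_best(grid, toy_error))
```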

get_params(deep=True)

Returns the dictionary of hyper-parameters to explore (param_grid)

Parameters

deep – Unused

Returns

the dictionary param_grid

quantify(instances)

Estimate class prevalence values using the best model found after calling the fit() method.

Parameters

instances – sample containing the instances

Returns

an ndarray of shape (n_classes) with class prevalence estimates according to the best model found by the model selection process.

set_params(**parameters)

Sets the hyper-parameters to explore.

Parameters

parameters – a dictionary with keys the parameter names and values the list of values to explore

quapy.plot module

quapy.plot.binary_bias_bins(method_names, true_prevs, estim_prevs, pos_class=1, title=None, nbins=5, colormap=<matplotlib.colors.ListedColormap object>, vertical_xticks=False, legend=True, savepath=None)

Box-plots displaying the local bias (i.e., signed error computed as the estimated value minus the true value) for different bins of (true) prevalence of the positive class, for each quantification method.

Parameters
  • method_names – array-like with the method names for each experiment

  • true_prevs – array-like with the true prevalence values (each being a ndarray with n_classes components) for each experiment

  • estim_prevs – array-like with the estimated prevalence values (each being a ndarray with n_classes components) for each experiment

  • pos_class – index of the positive class

  • title – the title to be displayed in the plot

  • nbins – number of bins

  • colormap – the matplotlib colormap to use (default cm.tab10)

  • vertical_xticks – whether or not to display the x-axis tick labels vertically (default is False)

  • legend – whether or not to display the legend (default is True)

  • savepath – path where to save the plot. If not indicated (as default), the plot is shown.

quapy.plot.binary_bias_global(method_names, true_prevs, estim_prevs, pos_class=1, title=None, savepath=None)

Box-plots displaying the global bias (i.e., signed error computed as the estimated value minus the true value) for each quantification method with respect to a given positive class.

Parameters
  • method_names – array-like with the method names for each experiment

  • true_prevs – array-like with the true prevalence values (each being a ndarray with n_classes components) for each experiment

  • estim_prevs – array-like with the estimated prevalence values (each being a ndarray with n_classes components) for each experiment

  • pos_class – index of the positive class

  • title – the title to be displayed in the plot

  • savepath – path where to save the plot. If not indicated (as default), the plot is shown.
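The quantity shown in these box-plots, the signed error of the positive class, can be computed as in the sketch below (assuming NumPy; `positive_class_bias` is a hypothetical helper, not part of quapy.plot).

```python
import numpy as np


def positive_class_bias(true_prevs, estim_prevs, pos_class=1):
    true_prevs = np.asarray(true_prevs, dtype=float)
    estim_prevs = np.asarray(estim_prevs, dtype=float)
    # signed error: estimated minus true prevalence of the positive class
    return estim_prevs[:, pos_class] - true_prevs[:, pos_class]


true_prevs = [[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]]
estim_prevs = [[0.7, 0.3], [0.5, 0.5], [0.3, 0.7]]
# positive values mean overestimation, negative values mean underestimation
print(positive_class_bias(true_prevs, estim_prevs))
```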

quapy.plot.binary_diagonal(method_names, true_prevs, estim_prevs, pos_class=1, title=None, show_std=True, legend=True, train_prev=None, savepath=None, method_order=None)

The diagonal plot displays the predicted prevalence values (along the y-axis) as a function of the true prevalence values (along the x-axis). The optimal quantifier is described by the diagonal (0,0)-(1,1) of the plot (hence the name). It is convenient for binary quantification problems, though it can be used for multiclass problems by indicating which class is to be taken as the positive class. (For multiclass quantification problems, other plots like the error_by_drift() might be preferable though).

Parameters
  • method_names – array-like with the method names for each experiment

  • true_prevs – array-like with the true prevalence values (each being a ndarray with n_classes components) for each experiment

  • estim_prevs – array-like with the estimated prevalence values (each being a ndarray with n_classes components) for each experiment

  • pos_class – index of the positive class

  • title – the title to be displayed in the plot

  • show_std – whether or not to show standard deviations (represented by color bands). This might be inconvenient for cases in which many methods are compared, or when the standard deviations are high (default True)

  • legend – whether or not to display the legend (default True)

  • train_prev – if indicated (default is None), the training prevalence (for the positive class) is highlighted in the plot. This is convenient when all the experiments have been conducted on the same dataset.

  • savepath – path where to save the plot. If not indicated (as default), the plot is shown.

  • method_order – if indicated (default is None), imposes the order in which the methods are processed (i.e., listed in the legend and associated with matplotlib colors).

quapy.plot.brokenbar_supremacy_by_drift(method_names, true_prevs, estim_prevs, tr_prevs, n_bins=20, binning='isomerous', x_error='ae', y_error='ae', ttest_alpha=0.005, tail_density_threshold=0.005, method_order=None, savepath=None)

Displays (only) the top-performing methods for different regions of the train-test shift in the form of a broken bar chart, in which each method has bars only for those regions in which either of the following conditions holds: (i) it is the best method (on average) for the bin, or (ii) it is not statistically significantly different (on average) from the best method, according to a two-sided t-test on independent samples at confidence ttest_alpha. The binning can be “isometric” (same size) or “isomerous” (same number of experiments – default). A second plot is displayed on top, showing the distribution of experiments for each bin (when binning=”isometric”) or the percentile points of the distribution (when binning=”isomerous”).

Parameters
  • method_names – array-like with the method names for each experiment

  • true_prevs – array-like with the true prevalence values (each being a ndarray with n_classes components) for each experiment

  • estim_prevs – array-like with the estimated prevalence values (each being a ndarray with n_classes components) for each experiment

  • tr_prevs – training prevalence of each experiment

  • n_bins – number of bins into which the train-test shift axis is divided (default is 20)

  • binning – type of binning, either “isomerous” (default) or “isometric”

  • x_error – a string representing the name of an error function (as defined in quapy.error) to be used for measuring the amount of train-test shift (default is “ae”)

  • y_error – a string representing the name of an error function (as defined in quapy.error) to be used for measuring the amount of error in the prevalence estimations (default is “ae”)

  • ttest_alpha – the significance threshold above which a p-value (two-sided t-test on independent samples) is taken to indicate that the two means are not statistically significantly different. Default is 0.005, meaning that a p-value > 0.005 indicates the two methods involved are to be considered similar

  • tail_density_threshold – sets a threshold on the density of experiments (over the total number of experiments) below which a bin in the tail (i.e., the right-most ones) will be discarded. This avoids showing bins that contain only train-test shift outliers.

  • method_order – if indicated (default is None), imposes the order in which the methods are processed (i.e., listed in the legend and associated with matplotlib colors).

  • savepath – path where to save the plot. If not indicated (as default), the plot is shown.
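The two binning strategies can be illustrated with NumPy. This is a sketch on synthetic shift values; the helper `bin_edges` is hypothetical and the plotting function may compute its bins differently.

```python
import numpy as np


def bin_edges(shift_values, n_bins=20, binning='isomerous'):
    shift_values = np.asarray(shift_values, dtype=float)
    if binning == 'isometric':
        # equal-width bins spanning the observed range
        return np.linspace(shift_values.min(), shift_values.max(), n_bins + 1)
    if binning == 'isomerous':
        # equal-population bins: the edges are percentiles of the observed values
        return np.percentile(shift_values, np.linspace(0, 100, n_bins + 1))
    raise ValueError(f'unknown binning: {binning}')


rng = np.random.default_rng(0)
shifts = rng.beta(2, 8, size=1000)  # synthetic, skewed shift values
for kind in ('isometric', 'isomerous'):
    counts, _ = np.histogram(shifts, bins=bin_edges(shifts, n_bins=5, binning=kind))
    print(kind, counts)
```

With skewed data, isometric bins hold very different numbers of experiments, whereas isomerous bins hold roughly the same number but differ in width.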

quapy.plot.error_by_drift(method_names, true_prevs, estim_prevs, tr_prevs, n_bins=20, error_name='ae', show_std=False, show_density=True, logscale=False, title='Quantification error as a function of distribution shift', vlines=None, method_order=None, savepath=None)

Plots the error (along the y-axis, as measured in terms of error_name) as a function of the train-test shift (along the x-axis, as measured in terms of quapy.error.ae()). This plot is especially useful for multiclass problems, in which “diagonal plots” may be cumbersome, and for understanding how methods fare in different regions of the prior probability shift spectrum (e.g., in the low-shift regime vs. in the high-shift regime).

Parameters
  • method_names – array-like with the method names for each experiment

  • true_prevs – array-like with the true prevalence values (each being a ndarray with n_classes components) for each experiment

  • estim_prevs – array-like with the estimated prevalence values (each being a ndarray with n_classes components) for each experiment

  • tr_prevs – training prevalence of each experiment

  • n_bins – number of bins into which the train-test shift axis is divided (default is 20)

  • error_name – a string representing the name of an error function (as defined in quapy.error, default is “ae”)

  • show_std – whether or not to show standard deviations as color bands (default is False)

  • show_density – whether or not to display the distribution of experiments for each bin (default is True)

  • logscale – whether or not to log-scale the y-error measure (default is False)

  • title – title of the plot (default is “Quantification error as a function of distribution shift”)

  • vlines – array-like list of values (default is None). If indicated, highlights some regions of the space using vertical dotted lines.

  • method_order – if indicated (default is None), imposes the order in which the methods are processed (i.e., listed in the legend and associated with matplotlib colors).

  • savepath – path where to save the plot. If not indicated (as default), the plot is shown.

quapy.util module

class quapy.util.EarlyStop(patience, lower_is_better=True)

Bases: object

A class implementing the early-stopping condition typically used for training neural networks.

Parameters

patience – the number of (consecutive) times that a monitored evaluation metric (typically obtained in a held-out validation split) can be found to be worse than the best one obtained so far, before flagging the stopping condition. An instance of this class is callable, and is to be used as follows:

>>> earlystop = EarlyStop(patience=2, lower_is_better=True)
>>> earlystop(0.9, epoch=0)
>>> earlystop(0.7, epoch=1)
>>> earlystop.IMPROVED  # is True
>>> earlystop(1.0, epoch=2)
>>> earlystop.STOP  # is False (patience=1)
>>> earlystop(1.0, epoch=3)
>>> earlystop.STOP  # is True (patience=0)
>>> earlystop.best_epoch  # is 1
>>> earlystop.best_score  # is 0.7
Parameters

lower_is_better – if True (default) the metric is to be minimized.

Variables
  • best_score – keeps track of the best value seen so far

  • best_epoch – keeps track of the epoch in which the best score was set

  • STOP – flag (boolean) indicating the stopping condition

  • IMPROVED – flag (boolean) indicating whether there was an improvement in the last call
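The bookkeeping described above can be sketched in a few lines. `EarlyStopSketch` is a hypothetical re-implementation written only from this description (including the assumption that the patience counter is reset on every improvement); it is not the class shipped with QuaPy.

```python
class EarlyStopSketch:
    def __init__(self, patience, lower_is_better=True):
        self.PATIENCE_LIMIT = patience
        self.patience = patience
        # comparison used to decide whether a new score beats the best one
        self.better = (lambda a, b: a < b) if lower_is_better else (lambda a, b: a > b)
        self.best_score = None
        self.best_epoch = None
        self.STOP = False
        self.IMPROVED = False

    def __call__(self, score, epoch):
        self.IMPROVED = self.best_score is None or self.better(score, self.best_score)
        if self.IMPROVED:
            self.best_score, self.best_epoch = score, epoch
            self.patience = self.PATIENCE_LIMIT  # reset the counter (assumption)
        else:
            self.patience -= 1
            if self.patience <= 0:
                self.STOP = True


earlystop = EarlyStopSketch(patience=2)
for epoch, score in enumerate([0.9, 0.7, 1.0, 1.0]):
    earlystop(score, epoch=epoch)
    if earlystop.STOP:
        break
print(earlystop.best_epoch, earlystop.best_score)  # 1 0.7
```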

quapy.util.create_if_not_exist(path)

An alias to os.makedirs(path, exist_ok=True) that also returns the path. This is useful in cases like, e.g.:

>>> path = create_if_not_exist(os.path.join(dir, subdir, anotherdir))
Parameters

path – path to create

Returns

the path itself

quapy.util.create_parent_dir(path)

Creates the parent directory (if any) of a given path if it does not exist. E.g., for ./path/to/file.txt, the path ./path/to is created.

Parameters

path – the path
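A minimal sketch of this behaviour with the standard library follows; `create_parent_dir_sketch` is a hypothetical name and the handling of bare filenames is an assumption.

```python
import os
import tempfile


def create_parent_dir_sketch(path):
    parent = os.path.dirname(path)
    if parent:  # a bare filename has no parent directory to create
        os.makedirs(parent, exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, 'path', 'to', 'file.txt')
    create_parent_dir_sketch(target)
    print(os.path.isdir(os.path.dirname(target)))  # True
    print(os.path.exists(target))                  # False: the file itself is not created
```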

quapy.util.download_file(url, archive_filename)

Downloads a file from a url

Parameters
  • url – the url

  • archive_filename – destination filename

quapy.util.download_file_if_not_exists(url, archive_filename)

Downloads a file (using download_file()) if the file does not already exist.

Parameters
  • url – the url

  • archive_filename – destination filename

quapy.util.get_quapy_home()

Gets the home directory of QuaPy, i.e., the directory where QuaPy saves permanent data, such as downloaded datasets.

Returns

a string representing the path

quapy.util.map_parallel(func, args, n_jobs)

Applies func to n_jobs slices of args. E.g., if args is an array of 99 items and n_jobs=2, then func is applied in two parallel processes to args[0:50] and to args[50:99]

Parameters
  • func – function to be parallelized

  • args – array-like of arguments to be passed to the function in different parallel calls

  • n_jobs – the number of workers
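The slicing in the example above can be reproduced with a ceiling-based split. Everything in this sketch is an assumption: the helper names are hypothetical, threads are used instead of processes to keep it self-contained, a non-empty `args` is assumed, and one result per slice is returned (how the library combines the per-slice results is not stated here).

```python
from concurrent.futures import ThreadPoolExecutor
from math import ceil


def contiguous_slices(n_items, n_jobs):
    # ceiling division, so that 99 items and 2 jobs give chunks of 50 and 49
    chunk = ceil(n_items / n_jobs)
    return [slice(start, min(start + chunk, n_items)) for start in range(0, n_items, chunk)]


def map_parallel_sketch(func, args, n_jobs):
    slices = contiguous_slices(len(args), n_jobs)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # each worker receives a whole slice of the arguments
        return list(pool.map(lambda s: func(args[s]), slices))


print(contiguous_slices(99, 2))                      # [slice(0, 50, None), slice(50, 99, None)]
print(map_parallel_sketch(sum, list(range(99)), 2))  # [1225, 3626]
```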

quapy.util.parallel(func, args, n_jobs)

A wrapper of multiprocessing:

>>> Parallel(n_jobs=n_jobs)(
...     delayed(func)(args_i) for args_i in args
... )

that silently takes the quapy.environ variable as input.

quapy.util.pickled_resource(pickle_path: str, generation_func: callable, *args)

Allows for fast reuse of resources that are generated only once by calling generation_func(*args). On subsequent invocations, the pickled resource is loaded instead. Example:

>>> def some_array(n):  # a mock resource created with one parameter (`n`)
...     return np.random.rand(n)
>>> pickled_resource('./my_array.pkl', some_array, 10)  # the resource does not exist: it is created by calling some_array(10)
>>> pickled_resource('./my_array.pkl', some_array, 10)  # the resource exists; it is loaded from './my_array.pkl'

Parameters
  • pickle_path – the path where to save (first time) and load (next times) the resource

  • generation_func – the function that generates the resource, in case it does not exist in pickle_path

  • args – any arg that generation_func uses for generating the resources

Returns

the resource
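The generate-once, load-afterwards pattern can be sketched with the standard library. `pickled_resource_sketch` is a hypothetical name, and details such as creating the parent directory are assumptions rather than documented behaviour.

```python
import os
import pickle
import tempfile


def pickled_resource_sketch(pickle_path, generation_func, *args):
    if os.path.exists(pickle_path):
        # the resource was generated before: load it from disk
        with open(pickle_path, 'rb') as fin:
            return pickle.load(fin)
    # first call: generate the resource and store it for later reuse
    resource = generation_func(*args)
    parent = os.path.dirname(pickle_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(pickle_path, 'wb') as fout:
        pickle.dump(resource, fout, pickle.HIGHEST_PROTOCOL)
    return resource


calls = []  # records how many times the generator actually runs


def some_list(n):
    calls.append(n)
    return list(range(n))


with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, 'my_list.pkl')
    first = pickled_resource_sketch(path, some_list, 10)
    second = pickled_resource_sketch(path, some_list, 10)
    print(first == second, len(calls))  # True 1
```

Note that the arguments are not part of the cache key in this sketch: a second call with different arguments but the same path returns the stored resource.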

quapy.util.save_text_file(path, text)

Saves a text file to disk, given its full path, and creates the parent directory if missing.

Parameters
  • path – path where to save the file.

  • text – text to save.

quapy.util.temp_seed(seed)

Can be used in a “with” context to set a temporary seed without modifying numpy’s outer random state. E.g.:

>>> with temp_seed(random_seed):
...     pass  # do any computation depending on np.random functionality

Parameters

seed – the seed to set within the “with” context
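A context manager with this behaviour can be sketched by saving and restoring NumPy's global random state. `temp_seed_sketch` is a hypothetical name; the sketch only covers the legacy global generator (np.random.seed), not Generator objects created with np.random.default_rng.

```python
import contextlib

import numpy as np


@contextlib.contextmanager
def temp_seed_sketch(seed):
    state = np.random.get_state()  # remember the outer random state
    np.random.seed(seed)
    try:
        yield
    finally:
        np.random.set_state(state)  # restore it, even if the block raised


with temp_seed_sketch(42):
    a = np.random.rand(3)
with temp_seed_sketch(42):
    b = np.random.rand(3)
print(np.allclose(a, b))  # True: same seed, same draws
```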

Module contents

quapy.isbinary(x)