preparing to merge

2023-02-14 17:00:50 +01:00 · 2023-02-14 17:00:50 +01:00 · 49fc486c53
parent 25a829996e
commit 49fc486c53
27 changed files with 927 additions and 699 deletions
--- a/README.md
+++ b/README.md
@@ -13,6 +13,7 @@ for facilitating the analysis and interpretation of the experimental results.

 ### Last updates:

+* Version 0.1.7 is released! major changes can be consulted [here](quapy/FCHANGE_LOG.txt).
 * A detailed documentation is now available [here](https://hlt-isti.github.io/QuaPy/)
 * The developer API documentation is available [here](https://hlt-isti.github.io/QuaPy/build/html/modules.html)

@@ -59,13 +60,14 @@ See the [Wiki](https://github.com/HLT-ISTI/QuaPy/wiki) for detailed examples.
 ## Features

 * Implementation of many popular quantification methods (Classify-&-Count and its variants, Expectation Maximization,
-quantification methods based on structured output learning, HDy, QuaNet, and quantification ensembles).
-* Versatile functionality for performing evaluation based on artificial sampling protocols.
+quantification methods based on structured output learning, HDy, QuaNet, quantification ensembles, among others).
+* Versatile functionality for performing evaluation based on sampling generation protocols (e.g., APP, NPP, etc.).
 * Implementation of most commonly used evaluation metrics (e.g., AE, RAE, SE, KLD, NKLD, etc.).
 * Datasets frequently used in quantification (textual and numeric), including:
    * 32 UCI Machine Learning datasets.
    * 11 Twitter quantification-by-sentiment datasets.
    * 3 product reviews quantification-by-sentiment datasets. 
+    * 4 tasks from LeQua competition (_new in v0.1.7!_)
 * Native support for binary and single-label multiclass quantification scenarios.
 * Model selection functionality that minimizes quantification-oriented loss functions.
 * Visualization tools for analysing the experimental results.
@@ -80,29 +82,6 @@ quantification methods based on structured output learning, HDy, QuaNet, and qua
 * pandas, xlrd
 * matplotlib

-## SVM-perf with quantification-oriented losses
-In order to run experiments involving SVM(Q), SVM(KLD), SVM(NKLD),
-SVM(AE), or SVM(RAE), you have to first download the 
-[svmperf](http://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html) 
-package, apply the patch 
-[svm-perf-quantification-ext.patch](./svm-perf-quantification-ext.patch), and compile the sources.
-The script [prepare_svmperf.sh](prepare_svmperf.sh) does all the job. Simply run:
-
-```
-./prepare_svmperf.sh
-```
-
-The resulting directory [svm_perf_quantification](./svm_perf_quantification) contains the
-patched version of _svmperf_ with quantification-oriented losses. 
-
-The [svm-perf-quantification-ext.patch](./svm-perf-quantification-ext.patch) is an extension of the patch made available by
-[Esuli et al. 2015](https://dl.acm.org/doi/abs/10.1145/2700406?casa_token=8D2fHsGCVn0AAAAA:ZfThYOvrzWxMGfZYlQW_y8Cagg-o_l6X_PcF09mdETQ4Tu7jK98mxFbGSXp9ZSO14JkUIYuDGFG0) 
-that allows SVMperf to optimize for
-the _Q_ measure as proposed by [Barranquero et al. 2015](https://www.sciencedirect.com/science/article/abs/pii/S003132031400291X) 
-and for the _KLD_ and _NKLD_ measures as proposed by [Esuli et al. 2015](https://dl.acm.org/doi/abs/10.1145/2700406?casa_token=8D2fHsGCVn0AAAAA:ZfThYOvrzWxMGfZYlQW_y8Cagg-o_l6X_PcF09mdETQ4Tu7jK98mxFbGSXp9ZSO14JkUIYuDGFG0).
-This patch extends the above one by also allowing SVMperf to optimize for 
-_AE_ and _RAE_.
-  
  
 ## Documentation

@@ -113,6 +92,8 @@ are provided:

 * [Datasets](https://github.com/HLT-ISTI/QuaPy/wiki/Datasets)
 * [Evaluation](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation)
+* [Protocols](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols)
 * [Methods](https://github.com/HLT-ISTI/QuaPy/wiki/Methods)
+* [SVMperf](https://github.com/HLT-ISTI/QuaPy/wiki/ExplicitLossMinimization)
 * [Model Selection](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection)
 * [Plotting](https://github.com/HLT-ISTI/QuaPy/wiki/Plotting)
--- a/docs/build/html/Datasets.html
+++ b/docs/build/html/Datasets.html
@@ -86,7 +86,7 @@ Take a look at the following code:</p>
 <span class="n">sample</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">sampling</span><span class="p">(</span><span class="n">sample_size</span><span class="p">,</span> <span class="o">*</span><span class="n">prev</span><span class="p">)</span>

 <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;instances:&#39;</span><span class="p">,</span> <span class="n">sample</span><span class="o">.</span><span class="n">instances</span><span class="p">)</span>
-<span class="nb">print</span><span class="p">(</span><span class="s1">&#39;labels:&#39;</span><span class="p">,</span> <span class="n">sample</span><span class="o">.</span><span class="n">labels</span><span class="p">)</span>
+<span class="nb">print</span><span class="p">(</span><span class="s1">&#39;labels:&#39;</span><span class="p">,</span> <span class="n">sample</span><span class="o">.</span><span class="n">classes</span><span class="p">)</span>
 <span class="nb">print</span><span class="p">(</span><span class="s1">&#39;prevalence:&#39;</span><span class="p">,</span> <span class="n">F</span><span class="o">.</span><span class="n">strprev</span><span class="p">(</span><span class="n">sample</span><span class="o">.</span><span class="n">prevalence</span><span class="p">(),</span> <span class="n">prec</span><span class="o">=</span><span class="mi">2</span><span class="p">))</span>
 </pre></div>
 </div>
--- a/docs/build/html/Evaluation.html
+++ b/docs/build/html/Evaluation.html
@@ -20,7 +20,7 @@
    <script src="_static/bizstyle.js"></script>
    <link rel="index" title="Index" href="genindex.html" />
    <link rel="search" title="Search" href="search.html" />
-    <link rel="next" title="Quantification Methods" href="Methods.html" />
+    <link rel="next" title="Protocols" href="Protocols.html" />
    <link rel="prev" title="Datasets" href="Datasets.html" />
    <meta name="viewport" content="width=device-width,initial-scale=1.0" />
    <!--[if lt IE 9]>
@@ -37,7 +37,7 @@
          <a href="py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
-          <a href="Methods.html" title="Quantification Methods"
+          <a href="Protocols.html" title="Protocols"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="Datasets.html" title="Datasets"
@@ -99,13 +99,13 @@ third argument, e.g.:</p>
 Traditionally, this value is set to 1/(2T) in past literature,
 with T the sampling size. One could either pass this value
 to the function each time, or to set a QuaPy’s environment
-variable <em>SAMPLE_SIZE</em> once, and ommit this argument
+variable <em>SAMPLE_SIZE</em> once, and omit this argument
 thereafter (recommended);
 e.g.:</p>
 <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SAMPLE_SIZE&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">100</span>  <span class="c1"># once for all</span>
 <span class="n">true_prev</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">asarray</span><span class="p">([</span><span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.3</span><span class="p">,</span> <span class="mf">0.2</span><span class="p">])</span>  <span class="c1"># let&#39;s assume 3 classes</span>
 <span class="n">estim_prev</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">asarray</span><span class="p">([</span><span class="mf">0.1</span><span class="p">,</span> <span class="mf">0.3</span><span class="p">,</span> <span class="mf">0.6</span><span class="p">])</span>
-<span class="n">error</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">ae_</span><span class="o">.</span><span class="n">mrae</span><span class="p">(</span><span class="n">true_prev</span><span class="p">,</span> <span class="n">estim_prev</span><span class="p">)</span>
+<span class="n">error</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">error</span><span class="o">.</span><span class="n">mrae</span><span class="p">(</span><span class="n">true_prev</span><span class="p">,</span> <span class="n">estim_prev</span><span class="p">)</span>
 <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;mrae(</span><span class="si">{</span><span class="n">true_prev</span><span class="si">}</span><span class="s1">, </span><span class="si">{</span><span class="n">estim_prev</span><span class="si">}</span><span class="s1">) = </span><span class="si">{</span><span class="n">error</span><span class="si">:</span><span class="s1">.3f</span><span class="si">}</span><span class="s1">&#39;</span><span class="p">)</span>
 </pre></div>
 </div>
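Review note on the hunk above: the documentation states that the smoothing value is traditionally set to 1/(2T), with T the sample size, and that setting SAMPLE_SIZE once lets callers omit it. Below is a minimal pure-Python sketch of what that choice amounts to for a relative absolute error computed on a single pair of prevalence vectors. The helper names `smooth` and `relative_absolute_error` are hypothetical, and the additive-smoothing formula is an assumption based on common practice in the quantification literature, not a copy of the code behind `qp.error`.

```python
def smooth(prevalence, eps):
    """Additive smoothing (assumed form): add eps to every value, then renormalize."""
    n_classes = len(prevalence)
    return [(p + eps) / (1.0 + eps * n_classes) for p in prevalence]


def relative_absolute_error(true_prev, estim_prev, sample_size):
    """Smoothed relative absolute error for one pair of prevalence vectors."""
    eps = 1.0 / (2.0 * sample_size)  # the traditional 1/(2T) value
    true_s = smooth(true_prev, eps)
    estim_s = smooth(estim_prev, eps)
    # per-class absolute error, relative to the (smoothed) true prevalence
    errors = [abs(e - t) / t for t, e in zip(true_s, estim_s)]
    return sum(errors) / len(errors)


# same values as in the documentation snippet (3 classes, sample size 100)
true_prev = [0.5, 0.3, 0.2]
estim_prev = [0.1, 0.3, 0.6]
error = relative_absolute_error(true_prev, estim_prev, sample_size=100)
print(f'rae = {error:.3f}')
```

With T=100 the smoothing term is 0.005, which keeps the denominator away from zero when a true prevalence is exactly 0.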
@@ -115,148 +115,93 @@ e.g.:</p>
 </div>
 <p>Finally, it is possible to instantiate QuaPy’s quantification
 error functions from strings using, e.g.:</p>
-<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">error_function</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">ae_</span><span class="o">.</span><span class="n">from_name</span><span class="p">(</span><span class="s1">&#39;mse&#39;</span><span class="p">)</span>
+<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">error_function</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">error</span><span class="o">.</span><span class="n">from_name</span><span class="p">(</span><span class="s1">&#39;mse&#39;</span><span class="p">)</span>
 <span class="n">error</span> <span class="o">=</span> <span class="n">error_function</span><span class="p">(</span><span class="n">true_prev</span><span class="p">,</span> <span class="n">estim_prev</span><span class="p">)</span>
 </pre></div>
 </div>
 </section>
 <section id="evaluation-protocols">
 <h2>Evaluation Protocols<a class="headerlink" href="#evaluation-protocols" title="Permalink to this heading">¶</a></h2>
-<p>QuaPy implements the so-called “artificial sampling protocol”,
-according to which a test set is used to generate samplings at
-desired prevalences of fixed size and covering the full spectrum
-of prevalences. This protocol is called “artificial” in contrast
-to the “natural prevalence sampling” protocol that,
-despite introducing some variability during sampling, approximately
-preserves the training class prevalence.</p>
-<p>In the artificial sampling procol, the user specifies the number
-of (equally distant) points to be generated from the interval [0,1].</p>
-<p>For example, if n_prevpoints=11 then, for each class, the prevalences
-[0., 0.1, 0.2, …, 1.] will be used. This means that, for two classes,
-the number of different prevalences will be 11 (since, once the prevalence
-of one class is determined, the other one is constrained). For 3 classes,
-the number of valid combinations can be obtained as 11 + 10 + … + 1 = 66.
-In general, the number of valid combinations that will be produced for a given
-value of n_prevpoints can be consulted by invoking
-quapy.functional.num_prevalence_combinations, e.g.:</p>
-<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">quapy.functional</span> <span class="k">as</span> <span class="nn">F</span>
-<span class="n">n_prevpoints</span> <span class="o">=</span> <span class="mi">21</span>
-<span class="n">n_classes</span> <span class="o">=</span> <span class="mi">4</span>
-<span class="n">n</span> <span class="o">=</span> <span class="n">F</span><span class="o">.</span><span class="n">num_prevalence_combinations</span><span class="p">(</span><span class="n">n_prevpoints</span><span class="p">,</span> <span class="n">n_classes</span><span class="p">,</span> <span class="n">n_repeats</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
+<p>An <em>evaluation protocol</em> is an evaluation procedure that uses
+one specific <em>sample generation protocol</em> to generate many
+samples, typically characterized by widely varying amounts of
+<em>shift</em> with respect to the original distribution, that are then
+used to evaluate the performance of a (trained) quantifier.
+These protocols are explained in more detail in a dedicated <a class="reference internal" href="Protocols.html"><span class="doc std std-doc">entry
+in the wiki</span></a>. For the time being, let us assume we already have
+chosen and instantiated one specific such protocol, that we here
+simply call <em>prot</em>. Let us also assume our model is called
+<em>quantifier</em> and that our evaluation measure of choice is
+<em>mae</em>. The evaluation comes down to:</p>
+<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">mae</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">evaluation</span><span class="o">.</span><span class="n">evaluate</span><span class="p">(</span><span class="n">quantifier</span><span class="p">,</span> <span class="n">protocol</span><span class="o">=</span><span class="n">prot</span><span class="p">,</span> <span class="n">error_metric</span><span class="o">=</span><span class="s1">&#39;mae&#39;</span><span class="p">)</span>
+<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;MAE = </span><span class="si">{</span><span class="n">mae</span><span class="si">:</span><span class="s1">.4f</span><span class="si">}</span><span class="s1">&#39;</span><span class="p">)</span>
 </pre></div>
 </div>
-<p>in this example, n=1771. Note the last argument, n_repeats, that
-informs of the number of examples that will be generated for any
-valid combination (typical values are, e.g., 1 for a single sample,
-or 10 or higher for computing standard deviations of performing statistical
-significance tests).</p>
-<p>One can instead work the other way around, i.e., one could set a
-maximum budged of evaluations and get the number of prevalence points that
-will generate a number of evaluations close, but not higher, than
-the fixed budget. This can be achieved with the function
-quapy.functional.get_nprevpoints_approximation, e.g.:</p>
-<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">budget</span> <span class="o">=</span> <span class="mi">5000</span>
-<span class="n">n_prevpoints</span> <span class="o">=</span> <span class="n">F</span><span class="o">.</span><span class="n">get_nprevpoints_approximation</span><span class="p">(</span><span class="n">budget</span><span class="p">,</span> <span class="n">n_classes</span><span class="p">,</span> <span class="n">n_repeats</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
-<span class="n">n</span> <span class="o">=</span> <span class="n">F</span><span class="o">.</span><span class="n">num_prevalence_combinations</span><span class="p">(</span><span class="n">n_prevpoints</span><span class="p">,</span> <span class="n">n_classes</span><span class="p">,</span> <span class="n">n_repeats</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
-<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;by setting n_prevpoints=</span><span class="si">{</span><span class="n">n_prevpoints</span><span class="si">}</span><span class="s1"> the number of evaluations for </span><span class="si">{</span><span class="n">n_classes</span><span class="si">}</span><span class="s1"> classes will be </span><span class="si">{</span><span class="n">n</span><span class="si">}</span><span class="s1">&#39;</span><span class="p">)</span>
+<p>It is often desirable to evaluate our system using more than one
+single evaluation measure. In this case, it is convenient to generate
+a <em>report</em>. A report in QuaPy is a dataframe accounting for all the
+true prevalence values with their corresponding prevalence values
+as estimated by the quantifier, along with the error each has given
+rise to.</p>
+<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">report</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">evaluation</span><span class="o">.</span><span class="n">evaluation_report</span><span class="p">(</span><span class="n">quantifier</span><span class="p">,</span> <span class="n">protocol</span><span class="o">=</span><span class="n">prot</span><span class="p">,</span> <span class="n">error_metrics</span><span class="o">=</span><span class="p">[</span><span class="s1">&#39;mae&#39;</span><span class="p">,</span> <span class="s1">&#39;mrae&#39;</span><span class="p">,</span> <span class="s1">&#39;mkld&#39;</span><span class="p">])</span>
 </pre></div>
 </div>
-<p>that will print:</p>
-<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">by</span> <span class="n">setting</span> <span class="n">n_prevpoints</span><span class="o">=</span><span class="mi">30</span> <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">evaluations</span> <span class="k">for</span> <span class="mi">4</span> <span class="n">classes</span> <span class="n">will</span> <span class="n">be</span> <span class="mi">4960</span>
+<p>From a pandas’ dataframe, it is straightforward to visualize all the results,
+and compute the averaged values, e.g.:</p>
+<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">pd</span><span class="o">.</span><span class="n">set_option</span><span class="p">(</span><span class="s1">&#39;display.expand_frame_repr&#39;</span><span class="p">,</span> <span class="kc">False</span><span class="p">)</span>
+<span class="n">report</span><span class="p">[</span><span class="s1">&#39;estim-prev&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">report</span><span class="p">[</span><span class="s1">&#39;estim-prev&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">F</span><span class="o">.</span><span class="n">strprev</span><span class="p">)</span>
+<span class="nb">print</span><span class="p">(</span><span class="n">report</span><span class="p">)</span>
+
+<span class="nb">print</span><span class="p">(</span><span class="s1">&#39;Averaged values:&#39;</span><span class="p">)</span>
+<span class="nb">print</span><span class="p">(</span><span class="n">report</span><span class="o">.</span><span class="n">mean</span><span class="p">())</span>
 </pre></div>
 </div>
-<p>The cost of evaluation will depend on the values of <em>n_prevpoints</em>, <em>n_classes</em>,
-and <em>n_repeats</em>. Since it might sometimes be cumbersome to control the overall
-cost of an experiment having to do with the number of combinations that
-will be generated for a particular setting of these arguments (particularly
-when <em>n_classes&gt;2</em>), evaluation functions
-typically allow the user to rather specify an <em>evaluation budget</em>, i.e., a maximum
-number of samplings to generate. By specifying this argument, one could avoid
-specifying <em>n_prevpoints</em>, and the value for it that would lead to a closer
-number of evaluation budget, without surpassing it, will be automatically set.</p>
-<p>The following script shows a full example in which a PACC model relying
-on a Logistic Regressor classifier is
-tested on the <em>kindle</em> dataset by means of the artificial prevalence
-sampling protocol on samples of size 500, in terms of various
-evaluation metrics.</p>
-<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">quapy</span> <span class="k">as</span> <span class="nn">qp</span>
-<span class="kn">import</span> <span class="nn">quapy.functional</span> <span class="k">as</span> <span class="nn">F</span>
-<span class="kn">from</span> <span class="nn">sklearn.linear_model</span> <span class="kn">import</span> <span class="n">LogisticRegression</span>
-
-<span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SAMPLE_SIZE&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">500</span>
-
-<span class="n">dataset</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">fetch_reviews</span><span class="p">(</span><span class="s1">&#39;kindle&#39;</span><span class="p">)</span>
-<span class="n">qp</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">preprocessing</span><span class="o">.</span><span class="n">text2tfidf</span><span class="p">(</span><span class="n">dataset</span><span class="p">,</span> <span class="n">min_df</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span> <span class="n">inplace</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
-
-<span class="n">training</span> <span class="o">=</span> <span class="n">dataset</span><span class="o">.</span><span class="n">training</span>
-<span class="n">test</span> <span class="o">=</span> <span class="n">dataset</span><span class="o">.</span><span class="n">test</span>
-
-<span class="n">lr</span> <span class="o">=</span> <span class="n">LogisticRegression</span><span class="p">()</span>
-<span class="n">pacc</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">method</span><span class="o">.</span><span class="n">aggregative</span><span class="o">.</span><span class="n">PACC</span><span class="p">(</span><span class="n">lr</span><span class="p">)</span>
-
-<span class="n">pacc</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">training</span><span class="p">)</span>
-
-<span class="n">df</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">evaluation</span><span class="o">.</span><span class="n">artificial_sampling_report</span><span class="p">(</span>
-    <span class="n">pacc</span><span class="p">,</span>  <span class="c1"># the quantification method</span>
-    <span class="n">test</span><span class="p">,</span>  <span class="c1"># the test set on which the method will be evaluated</span>
-    <span class="n">sample_size</span><span class="o">=</span><span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SAMPLE_SIZE&#39;</span><span class="p">],</span>  <span class="c1">#indicates the size of samples to be drawn</span>
-    <span class="n">n_prevpoints</span><span class="o">=</span><span class="mi">11</span><span class="p">,</span>  <span class="c1"># how many prevalence points will be extracted from the interval [0, 1] for each category</span>
-    <span class="n">n_repetitions</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>  <span class="c1"># number of times each prevalence will be used to generate a test sample</span>
-    <span class="n">n_jobs</span><span class="o">=-</span><span class="mi">1</span><span class="p">,</span>  <span class="c1"># indicates the number of parallel workers (-1 indicates, as in sklearn, all CPUs)</span>
-    <span class="n">random_seed</span><span class="o">=</span><span class="mi">42</span><span class="p">,</span>  <span class="c1"># setting a random seed allows to replicate the test samples across runs</span>
-    <span class="n">error_metrics</span><span class="o">=</span><span class="p">[</span><span class="s1">&#39;mae&#39;</span><span class="p">,</span> <span class="s1">&#39;mrae&#39;</span><span class="p">,</span> <span class="s1">&#39;mkld&#39;</span><span class="p">],</span>  <span class="c1"># specify the evaluation metrics</span>
-    <span class="n">verbose</span><span class="o">=</span><span class="kc">True</span>  <span class="c1"># set to True to show some standard-line outputs</span>
-<span class="p">)</span>
-</pre></div>
-</div>
-<p>The resulting report is a pandas’ dataframe that can be directly printed.
-Here, we set some display options from pandas just to make the output clearer;
-note also that the estimated prevalences are shown as strings using the
-function strprev function that simply converts a prevalence into a
-string representing it, with a fixed decimal precision (default 3):</p>
-<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
-<span class="n">pd</span><span class="o">.</span><span class="n">set_option</span><span class="p">(</span><span class="s1">&#39;display.expand_frame_repr&#39;</span><span class="p">,</span> <span class="kc">False</span><span class="p">)</span>
-<span class="n">pd</span><span class="o">.</span><span class="n">set_option</span><span class="p">(</span><span class="s2">&quot;precision&quot;</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>
-<span class="n">df</span><span class="p">[</span><span class="s1">&#39;estim-prev&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">df</span><span class="p">[</span><span class="s1">&#39;estim-prev&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">F</span><span class="o">.</span><span class="n">strprev</span><span class="p">)</span>
-<span class="nb">print</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
-</pre></div>
-</div>
-<p>The output should look like:</p>
+<p>This will produce an output like:</p>
 <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>           <span class="n">true</span><span class="o">-</span><span class="n">prev</span>      <span class="n">estim</span><span class="o">-</span><span class="n">prev</span>       <span class="n">mae</span>      <span class="n">mrae</span>      <span class="n">mkld</span>
-<span class="mi">0</span>   <span class="p">[</span><span class="mf">0.0</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.000</span><span class="p">,</span> <span class="mf">1.000</span><span class="p">]</span>  <span class="mf">0.000</span>   <span class="mf">0.000</span>  <span class="mf">0.000e+00</span>
-<span class="mi">1</span>   <span class="p">[</span><span class="mf">0.1</span><span class="p">,</span> <span class="mf">0.9</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.091</span><span class="p">,</span> <span class="mf">0.909</span><span class="p">]</span>  <span class="mf">0.009</span>   <span class="mf">0.048</span>  <span class="mf">4.426e-04</span>
-<span class="mi">2</span>   <span class="p">[</span><span class="mf">0.2</span><span class="p">,</span> <span class="mf">0.8</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.163</span><span class="p">,</span> <span class="mf">0.837</span><span class="p">]</span>  <span class="mf">0.037</span>   <span class="mf">0.114</span>  <span class="mf">4.633e-03</span>
-<span class="mi">3</span>   <span class="p">[</span><span class="mf">0.3</span><span class="p">,</span> <span class="mf">0.7</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.283</span><span class="p">,</span> <span class="mf">0.717</span><span class="p">]</span>  <span class="mf">0.017</span>   <span class="mf">0.041</span>  <span class="mf">7.383e-04</span>
-<span class="mi">4</span>   <span class="p">[</span><span class="mf">0.4</span><span class="p">,</span> <span class="mf">0.6</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.366</span><span class="p">,</span> <span class="mf">0.634</span><span class="p">]</span>  <span class="mf">0.034</span>   <span class="mf">0.070</span>  <span class="mf">2.412e-03</span>
-<span class="mi">5</span>   <span class="p">[</span><span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.459</span><span class="p">,</span> <span class="mf">0.541</span><span class="p">]</span>  <span class="mf">0.041</span>   <span class="mf">0.082</span>  <span class="mf">3.387e-03</span>
-<span class="mi">6</span>   <span class="p">[</span><span class="mf">0.6</span><span class="p">,</span> <span class="mf">0.4</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.565</span><span class="p">,</span> <span class="mf">0.435</span><span class="p">]</span>  <span class="mf">0.035</span>   <span class="mf">0.073</span>  <span class="mf">2.535e-03</span>
-<span class="mi">7</span>   <span class="p">[</span><span class="mf">0.7</span><span class="p">,</span> <span class="mf">0.3</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.654</span><span class="p">,</span> <span class="mf">0.346</span><span class="p">]</span>  <span class="mf">0.046</span>   <span class="mf">0.108</span>  <span class="mf">4.701e-03</span>
-<span class="mi">8</span>   <span class="p">[</span><span class="mf">0.8</span><span class="p">,</span> <span class="mf">0.2</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.725</span><span class="p">,</span> <span class="mf">0.275</span><span class="p">]</span>  <span class="mf">0.075</span>   <span class="mf">0.235</span>  <span class="mf">1.515e-02</span>
-<span class="mi">9</span>   <span class="p">[</span><span class="mf">0.9</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.858</span><span class="p">,</span> <span class="mf">0.142</span><span class="p">]</span>  <span class="mf">0.042</span>   <span class="mf">0.229</span>  <span class="mf">7.740e-03</span>
-<span class="mi">10</span>  <span class="p">[</span><span class="mf">1.0</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.945</span><span class="p">,</span> <span class="mf">0.055</span><span class="p">]</span>  <span class="mf">0.055</span>  <span class="mf">27.357</span>  <span class="mf">5.219e-02</span>
-</pre></div>
-</div>
-<p>One can get the averaged scores using standard pandas’
-functions, i.e.:</p>
-<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="nb">print</span><span class="p">(</span><span class="n">df</span><span class="o">.</span><span class="n">mean</span><span class="p">())</span>
-</pre></div>
-</div>
-<p>will produce the following output:</p>
-<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">true</span><span class="o">-</span><span class="n">prev</span>    <span class="mf">0.500</span>
-<span class="n">mae</span>          <span class="mf">0.035</span>
-<span class="n">mrae</span>         <span class="mf">2.578</span>
-<span class="n">mkld</span>         <span class="mf">0.009</span>
+<span class="mi">0</span>     <span class="p">[</span><span class="mf">0.308</span><span class="p">,</span> <span class="mf">0.692</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.314</span><span class="p">,</span> <span class="mf">0.686</span><span class="p">]</span>  <span class="mf">0.005649</span>  <span class="mf">0.013182</span>  <span class="mf">0.000074</span>
+<span class="mi">1</span>     <span class="p">[</span><span class="mf">0.896</span><span class="p">,</span> <span class="mf">0.104</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.909</span><span class="p">,</span> <span class="mf">0.091</span><span class="p">]</span>  <span class="mf">0.013145</span>  <span class="mf">0.069323</span>  <span class="mf">0.000985</span>
+<span class="mi">2</span>     <span class="p">[</span><span class="mf">0.848</span><span class="p">,</span> <span class="mf">0.152</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.809</span><span class="p">,</span> <span class="mf">0.191</span><span class="p">]</span>  <span class="mf">0.039063</span>  <span class="mf">0.149806</span>  <span class="mf">0.005175</span>
+<span class="mi">3</span>     <span class="p">[</span><span class="mf">0.016</span><span class="p">,</span> <span class="mf">0.984</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.033</span><span class="p">,</span> <span class="mf">0.967</span><span class="p">]</span>  <span class="mf">0.017236</span>  <span class="mf">0.487529</span>  <span class="mf">0.005298</span>
+<span class="mi">4</span>     <span class="p">[</span><span class="mf">0.728</span><span class="p">,</span> <span class="mf">0.272</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.751</span><span class="p">,</span> <span class="mf">0.249</span><span class="p">]</span>  <span class="mf">0.022769</span>  <span class="mf">0.057146</span>  <span class="mf">0.001350</span>
+<span class="o">...</span>              <span class="o">...</span>             <span class="o">...</span>       <span class="o">...</span>       <span class="o">...</span>       <span class="o">...</span>
+<span class="mi">4995</span>    <span class="p">[</span><span class="mf">0.72</span><span class="p">,</span> <span class="mf">0.28</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.698</span><span class="p">,</span> <span class="mf">0.302</span><span class="p">]</span>  <span class="mf">0.021752</span>  <span class="mf">0.053631</span>  <span class="mf">0.001133</span>
+<span class="mi">4996</span>  <span class="p">[</span><span class="mf">0.868</span><span class="p">,</span> <span class="mf">0.132</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.888</span><span class="p">,</span> <span class="mf">0.112</span><span class="p">]</span>  <span class="mf">0.020490</span>  <span class="mf">0.088230</span>  <span class="mf">0.001985</span>
+<span class="mi">4997</span>  <span class="p">[</span><span class="mf">0.292</span><span class="p">,</span> <span class="mf">0.708</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.298</span><span class="p">,</span> <span class="mf">0.702</span><span class="p">]</span>  <span class="mf">0.006149</span>  <span class="mf">0.014788</span>  <span class="mf">0.000090</span>
+<span class="mi">4998</span>    <span class="p">[</span><span class="mf">0.24</span><span class="p">,</span> <span class="mf">0.76</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.220</span><span class="p">,</span> <span class="mf">0.780</span><span class="p">]</span>  <span class="mf">0.019950</span>  <span class="mf">0.054309</span>  <span class="mf">0.001127</span>
+<span class="mi">4999</span>  <span class="p">[</span><span class="mf">0.948</span><span class="p">,</span> <span class="mf">0.052</span><span class="p">]</span>  <span class="p">[</span><span class="mf">0.965</span><span class="p">,</span> <span class="mf">0.035</span><span class="p">]</span>  <span class="mf">0.016941</span>  <span class="mf">0.165776</span>  <span class="mf">0.003538</span>
+
+<span class="p">[</span><span class="mi">5000</span> <span class="n">rows</span> <span class="n">x</span> <span class="mi">5</span> <span class="n">columns</span><span class="p">]</span>
+<span class="n">Averaged</span> <span class="n">values</span><span class="p">:</span>
+<span class="n">mae</span>     <span class="mf">0.023588</span>
+<span class="n">mrae</span>    <span class="mf">0.108779</span>
+<span class="n">mkld</span>    <span class="mf">0.003631</span>
 <span class="n">dtype</span><span class="p">:</span> <span class="n">float64</span>
 </pre></div>
 </div>
-<p>Other evaluation functions include:</p>
-<ul class="simple">
-<li><p><em>artificial_sampling_eval</em>: that computes the evaluation for a
-given evaluation metric, returning the average instead of a dataframe.</p></li>
-<li><p><em>artificial_sampling_prediction</em>: that returns two np.arrays containing the
-true prevalences and the estimated prevalences.</p></li>
-</ul>
-<p>See the documentation for further details.</p>
+<p>Alternatively, we can simply generate all the predictions by:</p>
+<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">true_prevs</span><span class="p">,</span> <span class="n">estim_prevs</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">evaluation</span><span class="o">.</span><span class="n">prediction</span><span class="p">(</span><span class="n">quantifier</span><span class="p">,</span> <span class="n">protocol</span><span class="o">=</span><span class="n">prot</span><span class="p">)</span>
+</pre></div>
+</div>
+<p>All the evaluation functions implement specific optimizations for speeding up
+the evaluation of aggregative quantifiers (i.e., of instances of <em>AggregativeQuantifier</em>).
+The optimization comes down to generating classification predictions (either crisp or soft)
+only once for the entire test set, and then applying the sampling procedure to the
+predictions, instead of generating samples of instances and then computing the
+classification predictions every time. This is only possible when the protocol
+is an instance of <em>OnLabelledCollectionProtocol</em>. The optimization is only
+carried out when the number of classification predictions thus generated would be
+smaller than the number of predictions required for the entire protocol; e.g.,
+if the original dataset contains 1M instances, but the protocol is such that it would
+at most generate 20 samples of 100 instances, then it would be preferable to postpone the
+classification for each sample. This behaviour is indicated by setting
+<em>aggr_speedup='auto'</em>. Conversely, when indicating <em>aggr_speedup='force'</em>, QuaPy will
+precompute all the predictions irrespective of the number of instances and number of samples.
+Finally, this can be deactivated by setting <em>aggr_speedup=False</em>. Note that this optimization
+is applied not only to the final evaluation, but also to the internal evaluations carried
+out during <em>model selection</em>. Since these are typically many, the heuristic can reduce the
+execution time considerably.</p>
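+<p>The decision rule described above can be pictured with a small sketch. The helper
+<em>should_precompute</em> and its arguments are hypothetical (they are not part of QuaPy's API);
+the sketch only assumes that the cost is measured as the number of classifier predictions:</p>

```python
def should_precompute(n_instances, n_samples, sample_size, aggr_speedup="auto"):
    """Hypothetical helper (not QuaPy's API): decide whether to classify the
    whole test set once, instead of classifying every generated sample."""
    if aggr_speedup is False:
        # optimization deactivated: always classify sample by sample
        return False
    if aggr_speedup == "force":
        # always precompute, whatever the sizes involved
        return True
    if aggr_speedup == "auto":
        # precompute only if classifying the full set is cheaper than
        # classifying all the samples the protocol would generate
        return n_instances < n_samples * sample_size
    raise ValueError(f"unknown value for aggr_speedup: {aggr_speedup!r}")


# 1M instances vs. 20 samples of 100 instances: postpone the classification
print(should_precompute(1_000_000, n_samples=20, sample_size=100))
# 10k instances vs. 5000 samples of 500 instances: precompute once
print(should_precompute(10_000, n_samples=5000, sample_size=500))
```

+<p>With the example from the text (1M instances, 20 samples of 100 instances), the sketch
+chooses to classify each sample separately.</p>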
 </section>
 </section>

@ -285,8 +230,8 @@ true prevalences and the estimated prevalences.</p></li>
  </div>
  <div>
    <h4>Next topic</h4>
-    <p class="topless"><a href="Methods.html"
-                          title="next chapter">Quantification Methods</a></p>
+    <p class="topless"><a href="Protocols.html"
+                          title="next chapter">Protocols</a></p>
  </div>
  <div role="note" aria-label="source link">
    <h3>This Page</h3>
@ -319,7 +264,7 @@ true prevalences and the estimated prevalences.</p></li>
          <a href="py-modindex.html" title="Python Module Index"
             >modules</a> |</li>
        <li class="right" >
-          <a href="Methods.html" title="Quantification Methods"
+          <a href="Protocols.html" title="Protocols"
             >next</a> |</li>
        <li class="right" >
          <a href="Datasets.html" title="Datasets"
--- a/docs/build/html/Methods.html
+++ b/docs/build/html/Methods.html
@ -21,7 +21,7 @@
    <link rel="index" title="Index" href="genindex.html" />
    <link rel="search" title="Search" href="search.html" />
    <link rel="next" title="Model Selection" href="Model-Selection.html" />
-    <link rel="prev" title="Evaluation" href="Evaluation.html" />
+    <link rel="prev" title="Protocols" href="Protocols.html" />
    <meta name="viewport" content="width=device-width,initial-scale=1.0" />
    <!--[if lt IE 9]>
    <script src="_static/css3-mediaqueries.js"></script>
@ -40,7 +40,7 @@
          <a href="Model-Selection.html" title="Model Selection"
             accesskey="N">next</a> |</li>
        <li class="right" >
-          <a href="Evaluation.html" title="Evaluation"
+          <a href="Protocols.html" title="Protocols"
             accesskey="P">previous</a> |</li>
        <li class="nav-item nav-item-0"><a href="index.html">QuaPy 0.1.7 documentation</a> &#187;</li>
        <li class="nav-item nav-item-this"><a href="">Quantification Methods</a></li> 
@ -68,12 +68,6 @@ and implement some abstract methods:</p>

    <span class="nd">@abstractmethod</span>
    <span class="k">def</span> <span class="nf">quantify</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">instances</span><span class="p">):</span> <span class="o">...</span>
-
-    <span class="nd">@abstractmethod</span>
-    <span class="k">def</span> <span class="nf">set_params</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">**</span><span class="n">parameters</span><span class="p">):</span> <span class="o">...</span>
-
-    <span class="nd">@abstractmethod</span>
-    <span class="k">def</span> <span class="nf">get_params</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">deep</span><span class="o">=</span><span class="kc">True</span><span class="p">):</span> <span class="o">...</span>
 </pre></div>
 </div>
 <p>The meaning of those functions should be familiar to those
@ -85,10 +79,10 @@ scikit-learn’ structure has not been adopted <em>as is</em> in QuaPy responds
 the fact that scikit-learn’s <em>predict</em> function is expected to return
 one output for each input element –e.g., a predicted label for each
 instance in a sample– while in quantification the output for a sample
-is one single array of class prevalences), while functions <em>set_params</em>
-and <em>get_params</em> allow a
-<a class="reference external" href="https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection">model selector</a>
-to automate the process of hyperparameter search.</p>
+is one single array of class prevalences).
+Quantifiers also extend scikit-learn’s <code class="docutils literal notranslate"><span class="pre">BaseEstimator</span></code>, which
+provides the <em>set_params</em> and <em>get_params</em> methods used by the
+<a class="reference external" href="https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection">model selector</a>.</p>
 <section id="aggregative-methods">
 <h2>Aggregative Methods<a class="headerlink" href="#aggregative-methods" title="Permalink to this heading">¶</a></h2>
 <p>All quantification methods are implemented as part of the
@ -106,12 +100,12 @@ The methods that any <em>aggregative</em> quantifier must implement are:</p>
 individual predictions of a classifier. Indeed, a default implementation
 of <em>BaseQuantifier.quantify</em> is already provided, which looks like:</p>
 <div class="highlight-python notranslate"><div class="highlight"><pre><span></span>    <span class="k">def</span> <span class="nf">quantify</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">instances</span><span class="p">):</span>
-    <span class="n">classif_predictions</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">preclassify</span><span class="p">(</span><span class="n">instances</span><span class="p">)</span>
+    <span class="n">classif_predictions</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">classify</span><span class="p">(</span><span class="n">instances</span><span class="p">)</span>
    <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">aggregate</span><span class="p">(</span><span class="n">classif_predictions</span><span class="p">)</span>
 </pre></div>
 </div>
 <p>Aggregative quantifiers are expected to maintain a classifier (which is
-accessed through the <em>&#64;property</em> <em>learner</em>). This classifier is
+accessed through the <em>&#64;property</em> <em>classifier</em>). This classifier is
 given as input to the quantifier, and can be already fit
 on external data (in which case, the <em>fit_learner</em> argument should
 be set to False), or be fit by the quantifier’s fit (default).</p>
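+<p>The two-step structure (classify, then aggregate) can be illustrated with a toy
+classify-and-count quantifier. This is a minimal sketch and not QuaPy's implementation:
+the classes <em>ThresholdClassifier</em> and <em>ToyClassifyAndCount</em> are hypothetical
+names introduced only for this example.</p>

```python
class ThresholdClassifier:
    """Hypothetical, already-fit binary classifier over 1-d inputs."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def predict(self, instances):
        # crisp decisions: label 1 if the value exceeds the threshold, else 0
        return [1 if x > self.threshold else 0 for x in instances]


class ToyClassifyAndCount:
    """Illustrative aggregative quantifier: quantify = aggregate(classify(X))."""

    def __init__(self, classifier, n_classes=2):
        self.classifier = classifier
        self.n_classes = n_classes

    def classify(self, instances):
        return self.classifier.predict(instances)

    def aggregate(self, classif_predictions):
        # count how many instances were assigned to each class, then normalize
        counts = [0] * self.n_classes
        for label in classif_predictions:
            counts[label] += 1
        return [c / len(classif_predictions) for c in counts]

    def quantify(self, instances):
        return self.aggregate(self.classify(instances))


model = ToyClassifyAndCount(ThresholdClassifier())
estim_prevalence = model.quantify([0.1, 0.7, 0.9, 0.8])
print(estim_prevalence)
```

+<p>Because <em>aggregate</em> only needs the classifier outputs, the predictions can be computed
+once and reused across many samples.</p>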
@ -121,12 +115,8 @@ aggregative methods, that should inherit from the abstract class
 The particularity of <em>probabilistic</em> aggregative methods (w.r.t.
 non-probabilistic ones), is that the default quantifier is defined
 in terms of the posterior probabilities returned by a probabilistic
-classifier, and not by the crisp decisions of a hard classifier; i.e.:</p>
-<div class="highlight-python notranslate"><div class="highlight"><pre><span></span>    <span class="k">def</span> <span class="nf">quantify</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">instances</span><span class="p">):</span>
-        <span class="n">classif_posteriors</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">posterior_probabilities</span><span class="p">(</span><span class="n">instances</span><span class="p">)</span>
-        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">aggregate</span><span class="p">(</span><span class="n">classif_posteriors</span><span class="p">)</span>
-</pre></div>
-</div>
+classifier, and not by the crisp decisions of a hard classifier.
+In any case, the interface <em>classify(instances)</em> remains unchanged.</p>
 <p>One advantage of <em>aggregative</em> methods (either probabilistic or not)
 is that the evaluation according to any sampling procedure (e.g.,
 the <a class="reference external" href="https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation">artificial sampling protocol</a>)
@ -153,9 +143,7 @@ with a SVM as the classifier:</p>
 <span class="kn">import</span> <span class="nn">quapy.functional</span> <span class="k">as</span> <span class="nn">F</span>
 <span class="kn">from</span> <span class="nn">sklearn.svm</span> <span class="kn">import</span> <span class="n">LinearSVC</span>

-<span class="n">dataset</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">fetch_twitter</span><span class="p">(</span><span class="s1">&#39;hcr&#39;</span><span class="p">,</span> <span class="n">pickle</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
-<span class="n">training</span> <span class="o">=</span> <span class="n">dataset</span><span class="o">.</span><span class="n">training</span>
-<span class="n">test</span> <span class="o">=</span> <span class="n">dataset</span><span class="o">.</span><span class="n">test</span>
+<span class="n">training</span><span class="p">,</span> <span class="n">test</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">fetch_twitter</span><span class="p">(</span><span class="s1">&#39;hcr&#39;</span><span class="p">,</span> <span class="n">pickle</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span><span class="o">.</span><span class="n">train_test</span>

 <span class="c1"># instantiate a classifier learner, in this case a SVM</span>
 <span class="n">svm</span> <span class="o">=</span> <span class="n">LinearSVC</span><span class="p">()</span>
@ -199,7 +187,7 @@ e.g.:</p>
 <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">model</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">method</span><span class="o">.</span><span class="n">aggregative</span><span class="o">.</span><span class="n">PCC</span><span class="p">(</span><span class="n">svm</span><span class="p">)</span>
 <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">training</span><span class="p">)</span>
 <span class="n">estim_prevalence</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">quantify</span><span class="p">(</span><span class="n">test</span><span class="o">.</span><span class="n">instances</span><span class="p">)</span>
-<span class="nb">print</span><span class="p">(</span><span class="s1">&#39;classifier:&#39;</span><span class="p">,</span> <span class="n">model</span><span class="o">.</span><span class="n">learner</span><span class="p">)</span>
+<span class="nb">print</span><span class="p">(</span><span class="s1">&#39;classifier:&#39;</span><span class="p">,</span> <span class="n">model</span><span class="o">.</span><span class="n">classifier</span><span class="p">)</span>
 </pre></div>
 </div>
 <p>In this case, QuaPy will print:</p>
@ -244,13 +232,21 @@ experiments we have carried out.</p>
 <span class="n">estim_prevalence</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">quantify</span><span class="p">(</span><span class="n">dataset</span><span class="o">.</span><span class="n">test</span><span class="o">.</span><span class="n">instances</span><span class="p">)</span>
 </pre></div>
 </div>
+<p><em>New in v0.1.7</em>: EMQ now accepts two new parameters in its constructor. The first is
+<em>exact_train_prev</em>, which allows using the true training prevalence as the initial
+prevalence estimate (default behaviour), or instead an approximation of it as
+suggested by <a class="reference external" href="http://proceedings.mlr.press/v119/alexandari20a.html">Alexandari et al. (2020)</a>
+(by setting <em>exact_train_prev=False</em>).
+The second is <em>recalib</em>, which allows specifying a calibration method among those
+proposed by <a class="reference external" href="http://proceedings.mlr.press/v119/alexandari20a.html">Alexandari et al. (2020)</a>,
+including Bias-Corrected Temperature Scaling, Vector Scaling, etc.
+See the API documentation for further details.</p>
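+<p>To see why the initial prevalence matters, the following sketch shows the textbook
+expectation-maximization update for re-estimating class priors from posterior probabilities.
+It is an illustration under stated assumptions, not QuaPy's code: the function
+<em>em_prevalence</em> and its arguments are hypothetical, and the iteration simply starts
+from the training prevalence it is given.</p>

```python
def em_prevalence(posteriors, train_prev, max_iter=1000, tol=1e-6):
    """Sketch of EM prior re-estimation (hypothetical helper, not QuaPy's API).

    posteriors: list of rows, each row holding P(class | x) for one test instance
    train_prev: class prevalence the classifier was trained with; also used
                as the starting point of the iteration
    """
    n_classes = len(train_prev)
    prev = list(train_prev)
    for _ in range(max_iter):
        # E-step: rescale the original posteriors by prev/train_prev, renormalize
        rescaled = []
        for row in posteriors:
            scaled = [p * prev[c] / train_prev[c] for c, p in enumerate(row)]
            total = sum(scaled)
            rescaled.append([s / total for s in scaled])
        # M-step: the new prevalence is the mean of the rescaled posteriors
        new_prev = [sum(r[c] for r in rescaled) / len(rescaled)
                    for c in range(n_classes)]
        converged = max(abs(a - b) for a, b in zip(new_prev, prev)) < tol
        prev = new_prev
        if converged:
            break
    return prev


posteriors = [[0.2, 0.8], [0.3, 0.7], [0.1, 0.9], [0.6, 0.4]]
estim = em_prevalence(posteriors, train_prev=[0.5, 0.5])
print(estim)
```

+<p>Passing an approximated training prevalence instead of the exact one only changes the
+<em>train_prev</em> argument in this sketch.</p>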
 </section>
 <section id="hellinger-distance-y-hdy">
 <h3>Hellinger Distance y (HDy)<a class="headerlink" href="#hellinger-distance-y-hdy" title="Permalink to this heading">¶</a></h3>
-<p>The method HDy is described in:</p>
-<p><em>Implementation of the method based on the Hellinger Distance y (HDy) proposed by
-González-Castro, V., Alaiz-Rodrı́guez, R., and Alegre, E. (2013). Class distribution
-estimation based on the Hellinger distance. Information Sciences, 218:146–164.</em></p>
+<p>Implementation of the method based on the Hellinger Distance y (HDy) proposed by
+<a class="reference external" href="https://www.sciencedirect.com/science/article/pii/S0020025512004069">González-Castro, V., Alaiz-Rodrı́guez, R., and Alegre, E. (2013). Class distribution
+estimation based on the Hellinger distance. Information Sciences, 218:146–164.</a></p>
 <p>It is implemented in <em>qp.method.aggregative.HDy</em> (also accessible
 through the alias <em>qp.method.aggregative.HellingerDistanceY</em>).
 This method works with a probabilistic classifier (hard classifiers
@ -277,30 +273,48 @@ provided in QuaPy accepts only binary datasets.</p>
 <span class="n">estim_prevalence</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">quantify</span><span class="p">(</span><span class="n">dataset</span><span class="o">.</span><span class="n">test</span><span class="o">.</span><span class="n">instances</span><span class="p">)</span>
 </pre></div>
 </div>
+<p><em>New in v0.1.7:</em> QuaPy now provides an implementation of the generalized
+“Distribution Matching” approaches for multiclass, inspired by the framework
+of <a class="reference external" href="https://arxiv.org/abs/1606.00868">Firat (2016)</a>. One can instantiate
+a variant of HDy for multiclass quantification as follows:</p>
+<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="n">multiclassHDy</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">method</span><span class="o">.</span><span class="n">aggregative</span><span class="o">.</span><span class="n">DistributionMatching</span><span class="p">(</span><span class="n">classifier</span><span class="o">=</span><span class="n">LogisticRegression</span><span class="p">(),</span> <span class="n">divergence</span><span class="o">=</span><span class="s1">&#39;HD&#39;</span><span class="p">,</span> <span class="n">cdf</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
+</pre></div>
+</div>
+<p><em>New in v0.1.7:</em> QuaPy now provides an implementation of the “DyS”
+framework proposed by <a class="reference external" href="https://ojs.aaai.org/index.php/AAAI/article/view/4376">Maletzke et al. (2020)</a>
+and the “SMM” method proposed by <a class="reference external" href="https://ieeexplore.ieee.org/document/9260028">Hassan et al. (2019)</a>
+(thanks to <em>Pablo González</em> for the contributions!)</p>
+</section>
+<section id="threshold-optimization-methods">
+<h3>Threshold Optimization methods<a class="headerlink" href="#threshold-optimization-methods" title="Permalink to this heading">¶</a></h3>
+<p><em>New in v0.1.7:</em> QuaPy now implements Forman’s threshold optimization methods;
+see, e.g., <a class="reference external" href="https://dl.acm.org/doi/abs/10.1145/1150402.1150423">(Forman 2006)</a>
+and <a class="reference external" href="https://link.springer.com/article/10.1007/s10618-008-0097-y">(Forman 2008)</a>.
+These include: T50, MAX, X, Median Sweep (MS), and its variant MS2.</p>
 </section>
 <section id="explicit-loss-minimization">
 <h3>Explicit Loss Minimization<a class="headerlink" href="#explicit-loss-minimization" title="Permalink to this heading">¶</a></h3>
 <p>Explicit Loss Minimization (ELM) represents a family of methods
 based on structured output learning, i.e., quantifiers relying on
 classifiers that have been optimized targeting a
-quantification-oriented evaluation measure.</p>
-<p>In QuaPy, the following methods, all relying on Joachim’s
-<a class="reference external" href="https://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html">SVMperf</a>
-implementation, are available in <em>qp.method.aggregative</em>:</p>
+quantification-oriented evaluation measure.
+The original methods are implemented in QuaPy as classify &amp; count (CC)
+quantifiers that use Joachims’ <a class="reference external" href="https://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html">SVMperf</a>
+as the underlying classifier, properly set to optimize for the desired loss.</p>
+<p>In QuaPy, this can be achieved by calling the following functions:</p>
 <ul class="simple">
-<li><p>SVMQ (SVM-Q) is a quantification method optimizing the metric <em>Q</em> defined
-in <em>Barranquero, J., Díez, J., and del Coz, J. J. (2015). Quantification-oriented learning based
-on reliable classifiers. Pattern Recognition, 48(2):591–604.</em></p></li>
-<li><p>SVMKLD (SVM for Kullback-Leibler Divergence) proposed in <em>Esuli, A. and Sebastiani, F. (2015).
+<li><p><em>newSVMQ</em>: returns the quantification method called SVM(Q) that optimizes for the metric <em>Q</em> defined
+in <a class="reference external" href="https://www.sciencedirect.com/science/article/pii/S003132031400291X"><em>Barranquero, J., Díez, J., and del Coz, J. J. (2015). Quantification-oriented learning based
+on reliable classifiers. Pattern Recognition, 48(2):591–604.</em></a></p></li>
+<li><p><em>newSVMKLD</em> and <em>newSVMNKLD</em>: return the quantification methods called SVM(KLD) and SVM(nKLD), standing for
+Kullback-Leibler Divergence and Normalized Kullback-Leibler Divergence, as proposed in <a class="reference external" href="https://dl.acm.org/doi/abs/10.1145/2700406"><em>Esuli, A. and Sebastiani, F. (2015).
 Optimizing text quantifiers for multivariate loss functions.
-ACM Transactions on Knowledge Discovery and Data, 9(4):Article 27.</em></p></li>
-<li><p>SVMNKLD (SVM for Normalized Kullback-Leibler Divergence) proposed in <em>Esuli, A. and Sebastiani, F. (2015).
-Optimizing text quantifiers for multivariate loss functions.
-ACM Transactions on Knowledge Discovery and Data, 9(4):Article 27.</em></p></li>
-<li><p>SVMAE (SVM for Mean Absolute Error)</p></li>
-<li><p>SVMRAE (SVM for Mean Relative Absolute Error)</p></li>
+ACM Transactions on Knowledge Discovery from Data, 9(4):Article 27.</em></a></p></li>
+<li><p><em>newSVMAE</em> and <em>newSVMRAE</em>: return the quantification methods called SVM(AE) and SVM(RAE), which optimize for the (Mean) Absolute Error and for the
+(Mean) Relative Absolute Error, as first used by
+<a class="reference external" href="https://arxiv.org/abs/2011.02552"><em>Moreo, A. and Sebastiani, F. (2021). Tweet sentiment quantification: An experimental re-evaluation. PLOS ONE 17 (9), 1-23.</em></a></p></li>
 </ul>
-<p>the last two methods (SVMAE and SVMRAE) have been implemented in
+<p>The last two methods (SVM(AE) and SVM(RAE)) have been implemented in
 QuaPy in order to make available ELM variants for what nowadays
 are considered the most well-behaved evaluation metrics in quantification.</p>
 <p>In order to make these models work, you would need to run the script
@ -330,11 +344,15 @@ currently supports only binary classification.
 ELM variants (any binary quantifier in general) can be extended
 to operate in single-label scenarios trivially by adopting a
 “one-vs-all” strategy (as, e.g., in
-<em>Gao, W. and Sebastiani, F. (2016). From classification to quantification in tweet sentiment
-analysis. Social Network Analysis and Mining, 6(19):1–22</em>).
-In QuaPy this is possible by using the <em>OneVsAll</em> class:</p>
+<a class="reference external" href="https://link.springer.com/article/10.1007/s13278-016-0327-z"><em>Gao, W. and Sebastiani, F. (2016). From classification to quantification in tweet sentiment
+analysis. Social Network Analysis and Mining, 6(19):1–22</em></a>).
+In QuaPy this is possible by using the <em>OneVsAll</em> class.</p>
+<p>There are two ways of instantiating this class: <em>OneVsAllGeneric</em>, which works for
+any quantifier, and <em>OneVsAllAggregative</em>, which is optimized for aggregative quantifiers.
+In general, you can simply use the <em>getOneVsAll</em> function and QuaPy will choose
+the more convenient of the two.</p>
 <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">quapy</span> <span class="k">as</span> <span class="nn">qp</span>
-<span class="kn">from</span> <span class="nn">quapy.method.aggregative</span> <span class="kn">import</span> <span class="n">SVMQ</span><span class="p">,</span> <span class="n">OneVsAll</span>
+<span class="kn">from</span> <span class="nn">quapy.method.aggregative</span> <span class="kn">import</span> <span class="n">SVMQ</span>

 <span class="c1"># load a single-label dataset (this one contains 3 classes)</span>
 <span class="n">dataset</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">fetch_twitter</span><span class="p">(</span><span class="s1">&#39;hcr&#39;</span><span class="p">,</span> <span class="n">pickle</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
@ -342,11 +360,13 @@ In QuaPy this is possible by using the <em>OneVsAll</em> class:</p>
 <span class="c1"># let qp know where svmperf is</span>
 <span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SVMPERF_HOME&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s1">&#39;../svm_perf_quantification&#39;</span>

-<span class="n">model</span> <span class="o">=</span> <span class="n">OneVsAll</span><span class="p">(</span><span class="n">SVMQ</span><span class="p">(),</span> <span class="n">n_jobs</span><span class="o">=-</span><span class="mi">1</span><span class="p">)</span>  <span class="c1"># run them on parallel</span>
+<span class="n">model</span> <span class="o">=</span> <span class="n">getOneVsAll</span><span class="p">(</span><span class="n">SVMQ</span><span class="p">(),</span> <span class="n">n_jobs</span><span class="o">=-</span><span class="mi">1</span><span class="p">)</span>  <span class="c1"># run them in parallel</span>
 <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">dataset</span><span class="o">.</span><span class="n">training</span><span class="p">)</span>
 <span class="n">estim_prevalence</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">quantify</span><span class="p">(</span><span class="n">dataset</span><span class="o">.</span><span class="n">test</span><span class="o">.</span><span class="n">instances</span><span class="p">)</span>
 </pre></div>
 </div>
+<p>Check the examples <em><span class="xref myst">explicit_loss_minimization.py</span></em>
+and <span class="xref myst">one_vs_all.py</span> for more details.</p>
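+<p>The idea behind the “one-vs-all” strategy can be sketched as follows: one binary quantifier
+per class estimates the prevalence of that class against the rest, and the per-class
+estimates are then rescaled to sum to one. The function <em>one_vs_all_prevalence</em> is
+hypothetical, and the L1 normalization step is an assumption of this sketch rather than a
+statement about QuaPy's internals:</p>

```python
def one_vs_all_prevalence(positive_prevalences):
    """Combine per-class binary estimates into one prevalence vector.

    positive_prevalences[i] is the prevalence of class i estimated by a binary
    quantifier trained on "class i vs. the rest" (hypothetical setup).
    """
    total = sum(positive_prevalences)
    n_classes = len(positive_prevalences)
    if total == 0:
        # degenerate case: no class was detected, fall back to uniform
        return [1.0 / n_classes] * n_classes
    # independent binary estimates need not sum to 1, so rescale them
    return [p / total for p in positive_prevalences]


# three binary quantifiers, one per class of a 3-class problem
estim_prevalence = one_vs_all_prevalence([0.2, 0.5, 0.5])
print(estim_prevalence)
```

+<p>Since each binary problem is independent of the others, the binary quantifiers can be
+trained and applied in parallel.</p>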
 </section>
 </section>
 <section id="meta-models">
@ -360,12 +380,12 @@ groups).
 <h3>Ensembles<a class="headerlink" href="#ensembles" title="Permalink to this heading">¶</a></h3>
 <p>QuaPy implements (some of) the variants proposed in:</p>
 <ul class="simple">
-<li><p><em>Pérez-Gállego, P., Quevedo, J. R., &amp; del Coz, J. J. (2017).
+<li><p><a class="reference external" href="https://www.sciencedirect.com/science/article/pii/S1566253516300628"><em>Pérez-Gállego, P., Quevedo, J. R., &amp; del Coz, J. J. (2017).
 Using ensembles for problems with characterizable changes in data distribution: A case study on quantification.
-Information Fusion, 34, 87-100.</em></p></li>
-<li><p><em>Pérez-Gállego, P., Castano, A., Quevedo, J. R., &amp; del Coz, J. J. (2019).
+Information Fusion, 34, 87-100.</em></a></p></li>
+<li><p><a class="reference external" href="https://www.sciencedirect.com/science/article/pii/S1566253517303652"><em>Pérez-Gállego, P., Castano, A., Quevedo, J. R., &amp; del Coz, J. J. (2019).
 Dynamic ensemble selection for quantification tasks.
-Information Fusion, 45, 1-15.</em></p></li>
+Information Fusion, 45, 1-15.</em></a></p></li>
 </ul>
 <p>The following code shows how to instantiate an Ensemble of 30 <em>Adjusted Classify &amp; Count</em> (ACC)
 quantifiers operating with a <em>Logistic Regressor</em> (LR) as the base classifier, and using the
@ -398,10 +418,10 @@ wiki if you want to optimize the hyperparameters of ensemble for classification
 <section id="the-quanet-neural-network">
 <h3>The QuaNet neural network<a class="headerlink" href="#the-quanet-neural-network" title="Permalink to this heading">¶</a></h3>
 <p>QuaPy offers an implementation of QuaNet, a deep learning model presented in:</p>
-<p><em>Esuli, A., Moreo, A., &amp; Sebastiani, F. (2018, October).
+<p><a class="reference external" href="https://dl.acm.org/doi/abs/10.1145/3269206.3269287"><em>Esuli, A., Moreo, A., &amp; Sebastiani, F. (2018, October).
 A recurrent neural network for sentiment quantification.
 In Proceedings of the 27th ACM International Conference on
-Information and Knowledge Management (pp. 1775-1778).</em></p>
+Information and Knowledge Management (pp. 1775-1778).</em></a></p>
 <p>This model requires <em>torch</em> to be installed.
 QuaNet also requires a classifier that can provide embedded representations
 of the inputs.
@ -423,7 +443,7 @@ In the following example, we show an instantiation of QuaNet that instead uses C
 <span class="n">learner</span> <span class="o">=</span> <span class="n">NeuralClassifierTrainer</span><span class="p">(</span><span class="n">cnn</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="s1">&#39;cuda&#39;</span><span class="p">)</span>

 <span class="c1"># train QuaNet</span>
-<span class="n">model</span> <span class="o">=</span> <span class="n">QuaNet</span><span class="p">(</span><span class="n">learner</span><span class="p">,</span> <span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SAMPLE_SIZE&#39;</span><span class="p">],</span> <span class="n">device</span><span class="o">=</span><span class="s1">&#39;cuda&#39;</span><span class="p">)</span>
+<span class="n">model</span> <span class="o">=</span> <span class="n">QuaNet</span><span class="p">(</span><span class="n">learner</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="s1">&#39;cuda&#39;</span><span class="p">)</span>
 <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">dataset</span><span class="o">.</span><span class="n">training</span><span class="p">)</span>
 <span class="n">estim_prevalence</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">quantify</span><span class="p">(</span><span class="n">dataset</span><span class="o">.</span><span class="n">test</span><span class="o">.</span><span class="n">instances</span><span class="p">)</span>
 </pre></div>
@ -447,6 +467,7 @@ In the following example, we show an instantiation of QuaNet that instead uses C
 <li><a class="reference internal" href="#the-classify-count-variants">The Classify &amp; Count variants</a></li>
 <li><a class="reference internal" href="#expectation-maximization-emq">Expectation Maximization (EMQ)</a></li>
 <li><a class="reference internal" href="#hellinger-distance-y-hdy">Hellinger Distance y (HDy)</a></li>
+<li><a class="reference internal" href="#threshold-optimization-methods">Threshold Optimization methods</a></li>
 <li><a class="reference internal" href="#explicit-loss-minimization">Explicit Loss Minimization</a></li>
 </ul>
 </li>
@ -462,8 +483,8 @@ In the following example, we show an instantiation of QuaNet that instead uses C
  </div>
  <div>
    <h4>Previous topic</h4>
-    <p class="topless"><a href="Evaluation.html"
-                          title="previous chapter">Evaluation</a></p>
+    <p class="topless"><a href="Protocols.html"
+                          title="previous chapter">Protocols</a></p>
  </div>
  <div>
    <h4>Next topic</h4>
@ -504,7 +525,7 @@ In the following example, we show an instantiation of QuaNet that instead uses C
          <a href="Model-Selection.html" title="Model Selection"
             >next</a> |</li>
        <li class="right" >
-          <a href="Evaluation.html" title="Evaluation"
+          <a href="Protocols.html" title="Protocols"
             >previous</a> |</li>
        <li class="nav-item nav-item-0"><a href="index.html">QuaPy 0.1.7 documentation</a> &#187;</li>
        <li class="nav-item nav-item-this"><a href="">Quantification Methods</a></li> 
--- a/docs/build/html/Model-Selection.html
+++ b/docs/build/html/Model-Selection.html
@ -74,81 +74,91 @@ specifically designed for the task of quantification.</p>
 classification, and thus the model selection strategies
 customarily adopted in classification have simply been
 applied to quantification (see the next section).
-It has been argued in <em>Moreo, Alejandro, and Fabrizio Sebastiani.
-“Re-Assessing the” Classify and Count” Quantification Method.”
-arXiv preprint arXiv:2011.02552 (2020).</em>
+It has been argued in <a class="reference external" href="https://link.springer.com/chapter/10.1007/978-3-030-72240-1_6">Moreo, Alejandro, and Fabrizio Sebastiani.
+Re-Assessing the “Classify and Count” Quantification Method.
+ECIR 2021: Advances in Information Retrieval pp 75–91.</a>
 that specific model selection strategies should
 be adopted for quantification. That is, model selection
 strategies for quantification should target
 quantification-oriented losses and be tested in a variety
 of scenarios exhibiting different degrees of prior
 probability shift.</p>
-<p>The class
-<em>qp.model_selection.GridSearchQ</em>
-implements a grid-search exploration over the space of
-hyper-parameter combinations that evaluates each<br />
-combination of hyper-parameters
-by means of a given quantification-oriented
+<p>The class <em>qp.model_selection.GridSearchQ</em> implements a grid-search exploration over the space of
+hyper-parameter combinations that <a class="reference external" href="https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation">evaluates</a>
+each combination of hyper-parameters by means of a given quantification-oriented
 error metric (e.g., any of the error functions implemented
-in <em>qp.error</em>) and according to the
-<a class="reference external" href="https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation"><em>artificial sampling protocol</em></a>.</p>
-<p>The following is an example of model selection for quantification:</p>
+in <em>qp.error</em>) and according to a
+<a class="reference external" href="https://github.com/HLT-ISTI/QuaPy/wiki/Protocols">sampling generation protocol</a>.</p>
+<p>The following is an example (also included in the examples folder) of model selection for quantification:</p>
 <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">quapy</span> <span class="k">as</span> <span class="nn">qp</span>
-<span class="kn">from</span> <span class="nn">quapy.method.aggregative</span> <span class="kn">import</span> <span class="n">PCC</span>
+<span class="kn">from</span> <span class="nn">quapy.protocol</span> <span class="kn">import</span> <span class="n">APP</span>
+<span class="kn">from</span> <span class="nn">quapy.method.aggregative</span> <span class="kn">import</span> <span class="n">DistributionMatching</span>
 <span class="kn">from</span> <span class="nn">sklearn.linear_model</span> <span class="kn">import</span> <span class="n">LogisticRegression</span>
 <span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span>

-<span class="c1"># set a seed to replicate runs</span>
-<span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
-<span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SAMPLE_SIZE&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">500</span>
+<span class="sd">&quot;&quot;&quot;</span>
+<span class="sd">In this example, we show how to perform model selection on a DistributionMatching quantifier.</span>
+<span class="sd">&quot;&quot;&quot;</span>

-<span class="n">dataset</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">fetch_reviews</span><span class="p">(</span><span class="s1">&#39;hp&#39;</span><span class="p">,</span> <span class="n">tfidf</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">min_df</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
+<span class="n">model</span> <span class="o">=</span> <span class="n">DistributionMatching</span><span class="p">(</span><span class="n">LogisticRegression</span><span class="p">())</span>
+
+<span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SAMPLE_SIZE&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">100</span>
+<span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;N_JOBS&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span>  <span class="c1"># explore hyper-parameters in parallel</span>
+
+<span class="n">training</span><span class="p">,</span> <span class="n">test</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">fetch_reviews</span><span class="p">(</span><span class="s1">&#39;imdb&#39;</span><span class="p">,</span> <span class="n">tfidf</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">min_df</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span><span class="o">.</span><span class="n">train_test</span>

 <span class="c1"># The model will be returned by the fit method of GridSearchQ.</span>
-<span class="c1"># Model selection will be performed with a fixed budget of 1000 evaluations</span>
-<span class="c1"># for each hyper-parameter combination. The error to optimize is the MAE for</span>
-<span class="c1"># quantification, as evaluated on artificially drawn samples at prevalences </span>
-<span class="c1"># covering the entire spectrum on a held-out portion (40%) of the training set.</span>
+<span class="c1"># Every combination of hyper-parameters will be evaluated by testing the</span>
+<span class="c1"># quantifier thus configured on a series of samples generated by means</span>
+<span class="c1"># of a sample generation protocol. For this example, we will use the</span>
+<span class="c1"># artificial-prevalence protocol (APP), which generates samples with prevalence</span>
+<span class="c1"># values covering the entire range of a grid (e.g., [0, 0.1, 0.2, ..., 1]).</span>
+<span class="c1"># We devote 30% of the training set to this exploration.</span>
+<span class="n">training</span><span class="p">,</span> <span class="n">validation</span> <span class="o">=</span> <span class="n">training</span><span class="o">.</span><span class="n">split_stratified</span><span class="p">(</span><span class="n">train_prop</span><span class="o">=</span><span class="mf">0.7</span><span class="p">)</span>
+<span class="n">protocol</span> <span class="o">=</span> <span class="n">APP</span><span class="p">(</span><span class="n">validation</span><span class="p">)</span>
+
+<span class="c1"># We will explore a classification-dependent hyper-parameter (e.g., the &#39;C&#39;</span>
+<span class="c1"># hyper-parameter of LogisticRegression) and a quantification-dependent hyper-parameter</span>
+<span class="c1"># (e.g., the number of bins in a DistributionMatching quantifier).</span>
+<span class="c1"># Classifier-dependent hyper-parameters have to be marked with a prefix &quot;classifier__&quot;</span>
+<span class="c1"># in order to let the quantifier know this hyper-parameter belongs to its underlying</span>
+<span class="c1"># classifier.</span>
+<span class="n">param_grid</span> <span class="o">=</span> <span class="p">{</span>
+    <span class="s1">&#39;classifier__C&#39;</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">logspace</span><span class="p">(</span><span class="o">-</span><span class="mi">3</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">7</span><span class="p">),</span>
+    <span class="s1">&#39;nbins&#39;</span><span class="p">:</span> <span class="p">[</span><span class="mi">8</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">64</span><span class="p">],</span>
+<span class="p">}</span>
+
 <span class="n">model</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">model_selection</span><span class="o">.</span><span class="n">GridSearchQ</span><span class="p">(</span>
-    <span class="n">model</span><span class="o">=</span><span class="n">PCC</span><span class="p">(</span><span class="n">LogisticRegression</span><span class="p">()),</span>
-    <span class="n">param_grid</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;C&#39;</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">logspace</span><span class="p">(</span><span class="o">-</span><span class="mi">4</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">10</span><span class="p">),</span> <span class="s1">&#39;class_weight&#39;</span><span class="p">:</span> <span class="p">[</span><span class="s1">&#39;balanced&#39;</span><span class="p">,</span> <span class="kc">None</span><span class="p">]},</span>
-    <span class="n">sample_size</span><span class="o">=</span><span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SAMPLE_SIZE&#39;</span><span class="p">],</span>
-    <span class="n">eval_budget</span><span class="o">=</span><span class="mi">1000</span><span class="p">,</span>
-    <span class="n">error</span><span class="o">=</span><span class="s1">&#39;mae&#39;</span><span class="p">,</span>
-    <span class="n">refit</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>  <span class="c1"># retrain on the whole labelled set</span>
-    <span class="n">val_split</span><span class="o">=</span><span class="mf">0.4</span><span class="p">,</span>
+    <span class="n">model</span><span class="o">=</span><span class="n">model</span><span class="p">,</span>
+    <span class="n">param_grid</span><span class="o">=</span><span class="n">param_grid</span><span class="p">,</span>
+    <span class="n">protocol</span><span class="o">=</span><span class="n">protocol</span><span class="p">,</span>
+    <span class="n">error</span><span class="o">=</span><span class="s1">&#39;mae&#39;</span><span class="p">,</span>  <span class="c1"># the error to optimize is the MAE (a quantification-oriented loss)</span>
+    <span class="n">refit</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>   <span class="c1"># retrain on the whole labelled set once done</span>
    <span class="n">verbose</span><span class="o">=</span><span class="kc">True</span>  <span class="c1"># show information as the process goes on</span>
-<span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">dataset</span><span class="o">.</span><span class="n">training</span><span class="p">)</span>
+<span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">training</span><span class="p">)</span>

 <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;model selection ended: best hyper-parameters=</span><span class="si">{</span><span class="n">model</span><span class="o">.</span><span class="n">best_params_</span><span class="si">}</span><span class="s1">&#39;</span><span class="p">)</span>
 <span class="n">model</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">best_model_</span>

 <span class="c1"># evaluation in terms of MAE</span>
-<span class="n">results</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">evaluation</span><span class="o">.</span><span class="n">artificial_sampling_eval</span><span class="p">(</span>
-    <span class="n">model</span><span class="p">,</span>
-    <span class="n">dataset</span><span class="o">.</span><span class="n">test</span><span class="p">,</span>
-    <span class="n">sample_size</span><span class="o">=</span><span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SAMPLE_SIZE&#39;</span><span class="p">],</span>
-    <span class="n">n_prevpoints</span><span class="o">=</span><span class="mi">101</span><span class="p">,</span>
-    <span class="n">n_repetitions</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span>
-    <span class="n">error_metric</span><span class="o">=</span><span class="s1">&#39;mae&#39;</span>
-<span class="p">)</span>
+<span class="c1"># we use the same evaluation protocol (APP) on the test set</span>
+<span class="n">mae_score</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">evaluation</span><span class="o">.</span><span class="n">evaluate</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">protocol</span><span class="o">=</span><span class="n">APP</span><span class="p">(</span><span class="n">test</span><span class="p">),</span> <span class="n">error_metric</span><span class="o">=</span><span class="s1">&#39;mae&#39;</span><span class="p">)</span>

-<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;MAE=</span><span class="si">{</span><span class="n">results</span><span class="si">:</span><span class="s1">.5f</span><span class="si">}</span><span class="s1">&#39;</span><span class="p">)</span>
+<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;MAE=</span><span class="si">{</span><span class="n">mae_score</span><span class="si">:</span><span class="s1">.5f</span><span class="si">}</span><span class="s1">&#39;</span><span class="p">)</span>
 </pre></div>
 </div>
 <p>In this example, the system outputs:</p>
-<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>[GridSearchQ]: starting optimization with n_jobs=1
-[GridSearchQ]: checking hyperparams={&#39;C&#39;: 0.0001, &#39;class_weight&#39;: &#39;balanced&#39;} got mae score 0.24987
-[GridSearchQ]: checking hyperparams={&#39;C&#39;: 0.0001, &#39;class_weight&#39;: None} got mae score 0.48135
-[GridSearchQ]: checking hyperparams={&#39;C&#39;: 0.001, &#39;class_weight&#39;: &#39;balanced&#39;} got mae score 0.24866
-[...]
-[GridSearchQ]: checking hyperparams={&#39;C&#39;: 100000.0, &#39;class_weight&#39;: None} got mae score 0.43676
-[GridSearchQ]: optimization finished: best params {&#39;C&#39;: 0.1, &#39;class_weight&#39;: &#39;balanced&#39;} (score=0.19982)
-[GridSearchQ]: refitting on the whole development set
-model selection ended: best hyper-parameters={&#39;C&#39;: 0.1, &#39;class_weight&#39;: &#39;balanced&#39;}
-1010 evaluations will be performed for each combination of hyper-parameters
-[artificial sampling protocol] generating predictions: 100%|██████████| 1010/1010 [00:00&lt;00:00, 5005.54it/s]
-MAE=0.20342
+<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="p">[</span><span class="n">GridSearchQ</span><span class="p">]:</span> <span class="n">starting</span> <span class="n">model</span> <span class="n">selection</span> <span class="k">with</span> <span class="bp">self</span><span class="o">.</span><span class="n">n_jobs</span> <span class="o">=-</span><span class="mi">1</span>
+<span class="p">[</span><span class="n">GridSearchQ</span><span class="p">]:</span> <span class="n">hyperparams</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;classifier__C&#39;</span><span class="p">:</span> <span class="mf">0.01</span><span class="p">,</span> <span class="s1">&#39;nbins&#39;</span><span class="p">:</span> <span class="mi">64</span><span class="p">}</span>	 <span class="n">got</span> <span class="n">mae</span> <span class="n">score</span> <span class="mf">0.04021</span> <span class="p">[</span><span class="n">took</span> <span class="mf">1.1356</span><span class="n">s</span><span class="p">]</span>
+<span class="p">[</span><span class="n">GridSearchQ</span><span class="p">]:</span> <span class="n">hyperparams</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;classifier__C&#39;</span><span class="p">:</span> <span class="mf">0.01</span><span class="p">,</span> <span class="s1">&#39;nbins&#39;</span><span class="p">:</span> <span class="mi">32</span><span class="p">}</span>	 <span class="n">got</span> <span class="n">mae</span> <span class="n">score</span> <span class="mf">0.04286</span> <span class="p">[</span><span class="n">took</span> <span class="mf">1.2139</span><span class="n">s</span><span class="p">]</span>
+<span class="p">[</span><span class="n">GridSearchQ</span><span class="p">]:</span> <span class="n">hyperparams</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;classifier__C&#39;</span><span class="p">:</span> <span class="mf">0.01</span><span class="p">,</span> <span class="s1">&#39;nbins&#39;</span><span class="p">:</span> <span class="mi">16</span><span class="p">}</span>	 <span class="n">got</span> <span class="n">mae</span> <span class="n">score</span> <span class="mf">0.04888</span> <span class="p">[</span><span class="n">took</span> <span class="mf">1.2491</span><span class="n">s</span><span class="p">]</span>
+<span class="p">[</span><span class="n">GridSearchQ</span><span class="p">]:</span> <span class="n">hyperparams</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;classifier__C&#39;</span><span class="p">:</span> <span class="mf">0.001</span><span class="p">,</span> <span class="s1">&#39;nbins&#39;</span><span class="p">:</span> <span class="mi">8</span><span class="p">}</span>	 <span class="n">got</span> <span class="n">mae</span> <span class="n">score</span> <span class="mf">0.05163</span> <span class="p">[</span><span class="n">took</span> <span class="mf">1.5372</span><span class="n">s</span><span class="p">]</span>
+<span class="p">[</span><span class="o">...</span><span class="p">]</span>
+<span class="p">[</span><span class="n">GridSearchQ</span><span class="p">]:</span> <span class="n">hyperparams</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;classifier__C&#39;</span><span class="p">:</span> <span class="mf">1000.0</span><span class="p">,</span> <span class="s1">&#39;nbins&#39;</span><span class="p">:</span> <span class="mi">32</span><span class="p">}</span>	 <span class="n">got</span> <span class="n">mae</span> <span class="n">score</span> <span class="mf">0.02445</span> <span class="p">[</span><span class="n">took</span> <span class="mf">2.9056</span><span class="n">s</span><span class="p">]</span>
+<span class="p">[</span><span class="n">GridSearchQ</span><span class="p">]:</span> <span class="n">optimization</span> <span class="n">finished</span><span class="p">:</span> <span class="n">best</span> <span class="n">params</span> <span class="p">{</span><span class="s1">&#39;classifier__C&#39;</span><span class="p">:</span> <span class="mf">100.0</span><span class="p">,</span> <span class="s1">&#39;nbins&#39;</span><span class="p">:</span> <span class="mi">32</span><span class="p">}</span> <span class="p">(</span><span class="n">score</span><span class="o">=</span><span class="mf">0.02234</span><span class="p">)</span> <span class="p">[</span><span class="n">took</span> <span class="mf">7.3114</span><span class="n">s</span><span class="p">]</span>
+<span class="p">[</span><span class="n">GridSearchQ</span><span class="p">]:</span> <span class="n">refitting</span> <span class="n">on</span> <span class="n">the</span> <span class="n">whole</span> <span class="n">development</span> <span class="nb">set</span>
+<span class="n">model</span> <span class="n">selection</span> <span class="n">ended</span><span class="p">:</span> <span class="n">best</span> <span class="n">hyper</span><span class="o">-</span><span class="n">parameters</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;classifier__C&#39;</span><span class="p">:</span> <span class="mf">100.0</span><span class="p">,</span> <span class="s1">&#39;nbins&#39;</span><span class="p">:</span> <span class="mi">32</span><span class="p">}</span>
+<span class="n">MAE</span><span class="o">=</span><span class="mf">0.03102</span>
 </pre></div>
 </div>
 <p>The parameter <em>val_split</em> can alternatively be used to indicate
@ -172,30 +182,13 @@ The following code illustrates how to do that:</p>
    <span class="n">LogisticRegression</span><span class="p">(),</span>
    <span class="n">param_grid</span><span class="o">=</span><span class="p">{</span><span class="s1">&#39;C&#39;</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">logspace</span><span class="p">(</span><span class="o">-</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">10</span><span class="p">),</span> <span class="s1">&#39;class_weight&#39;</span><span class="p">:</span> <span class="p">[</span><span class="s1">&#39;balanced&#39;</span><span class="p">,</span> <span class="kc">None</span><span class="p">]},</span>
    <span class="n">cv</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
-<span class="n">model</span> <span class="o">=</span> <span class="n">PCC</span><span class="p">(</span><span class="n">learner</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">dataset</span><span class="o">.</span><span class="n">training</span><span class="p">)</span>
-<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">&#39;model selection ended: best hyper-parameters=</span><span class="si">{</span><span class="n">model</span><span class="o">.</span><span class="n">learner</span><span class="o">.</span><span class="n">best_params_</span><span class="si">}</span><span class="s1">&#39;</span><span class="p">)</span>
+<span class="n">model</span> <span class="o">=</span> <span class="n">DistributionMatching</span><span class="p">(</span><span class="n">learner</span><span class="p">)</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">dataset</span><span class="o">.</span><span class="n">training</span><span class="p">)</span>
 </pre></div>
 </div>
-<p>In this example, the system outputs:</p>
-<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>model selection ended: best hyper-parameters={&#39;C&#39;: 10000.0, &#39;class_weight&#39;: None}
-1010 evaluations will be performed for each combination of hyper-parameters
-[artificial sampling protocol] generating predictions: 100%|██████████| 1010/1010 [00:00&lt;00:00, 5379.55it/s]
-MAE=0.41734
-</pre></div>
-</div>
-<p>Note that the MAE is worse than the one we obtained when optimizing
-for quantification and, indeed, the hyper-parameters found optimal
-largely differ between the two selection modalities. The
-hyper-parameters C=10000 and class_weight=None have been found
-to work well for the specific training prevalence of the HP dataset,
-but these hyper-parameters turned out to be suboptimal when the
-class prevalences of the test set differs (as is indeed tested
-in scenarios of quantification).</p>
-<p>This is, however, not always the case, and one could, in practice,
-find examples
-in which optimizing for classification ends up resulting in a better
-quantifier than when optimizing for quantification.
-Nonetheless, this is theoretically unlikely to happen.</p>
+<p>However, this is conceptually flawed, since the model should be
+optimized for the task at hand (quantification), and not for a surrogate task (classification);
+that is, model selection should reward low quantification error rather
+than low classification error.</p>
 </section>
 </section>

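Reviewer note on the Model-Selection change: the new closing paragraph argues that optimizing for classification is a surrogate for optimizing for quantification. A minimal, library-free sketch of why the two objectives can disagree is given below. It does not use QuaPy; the helper names (`accuracy`, `prevalence`, `absolute_error`) and the toy labels are assumptions made up for illustration, and the quantifier is plain classify-and-count (counting predicted labels).

```python
# Toy illustration (not QuaPy code): a classifier with higher accuracy can
# yield a worse classify-and-count prevalence estimate than a less accurate one.

def accuracy(y_true, y_pred):
    """Fraction of correctly classified items."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def prevalence(labels, classes=(0, 1)):
    """Relative frequency of each class in a list of labels."""
    return [sum(1 for y in labels if y == c) / len(labels) for c in classes]

def absolute_error(true_prev, estim_prev):
    """Mean absolute difference between two prevalence vectors."""
    return sum(abs(t - e) for t, e in zip(true_prev, estim_prev)) / len(true_prev)

# Hypothetical test sample: 5 positives followed by 5 negatives.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Classifier A: 2 errors, both false positives (errors all in one direction).
pred_a = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
# Classifier B: 4 errors, 2 false negatives and 2 false positives (errors cancel out).
pred_b = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]

true_prev = prevalence(y_true)
acc_a, acc_b = accuracy(y_true, pred_a), accuracy(y_true, pred_b)
ae_a = absolute_error(true_prev, prevalence(pred_a))
ae_b = absolute_error(true_prev, prevalence(pred_b))

print(f'A: accuracy={acc_a:.2f}  quantification AE={ae_a:.2f}')
print(f'B: accuracy={acc_b:.2f}  quantification AE={ae_b:.2f}')
```

In this toy case classifier A is the better classifier, while classifier B gives the better prevalence estimate, which is the situation a quantification-oriented loss in model selection is meant to detect.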
--- a/docs/build/html/Plotting.html
+++ b/docs/build/html/Plotting.html
@ -94,7 +94,7 @@ quantification methods across different scenarios showcasing
 the accuracy of the quantifier in predicting class prevalences
 for a wide range of prior distributions. This can easily be
 achieved by means of the
-<a class="reference external" href="https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation">artificial sampling protocol</a>
+<a class="reference external" href="https://github.com/HLT-ISTI/QuaPy/wiki/Protocols">artificial sampling protocol</a>
 that is implemented in QuaPy.</p>
 <p>The following code shows how to perform one simple experiment
 in which the 4 <em>CC-variants</em>, all equipped with a linear SVM, are
@ -103,6 +103,7 @@ tested across the entire spectrum of class priors (taking 21 splits
 of the interval [0,1], i.e., using prevalence steps of 0.05, and
 generating 100 random samples at each prevalence).</p>
 <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">quapy</span> <span class="k">as</span> <span class="nn">qp</span>
+<span class="kn">from</span> <span class="nn">quapy.protocol</span> <span class="kn">import</span> <span class="n">APP</span>
 <span class="kn">from</span> <span class="nn">quapy.method.aggregative</span> <span class="kn">import</span> <span class="n">CC</span><span class="p">,</span> <span class="n">ACC</span><span class="p">,</span> <span class="n">PCC</span><span class="p">,</span> <span class="n">PACC</span>
 <span class="kn">from</span> <span class="nn">sklearn.svm</span> <span class="kn">import</span> <span class="n">LinearSVC</span>

@ -111,28 +112,26 @@ generating 100 random samples at each prevalence).</p>
 <span class="k">def</span> <span class="nf">gen_data</span><span class="p">():</span>

    <span class="k">def</span> <span class="nf">base_classifier</span><span class="p">():</span>
-        <span class="k">return</span> <span class="n">LinearSVC</span><span class="p">()</span>
+        <span class="k">return</span> <span class="n">LinearSVC</span><span class="p">(</span><span class="n">class_weight</span><span class="o">=</span><span class="s1">&#39;balanced&#39;</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">models</span><span class="p">():</span>
-        <span class="k">yield</span> <span class="n">CC</span><span class="p">(</span><span class="n">base_classifier</span><span class="p">())</span>
-        <span class="k">yield</span> <span class="n">ACC</span><span class="p">(</span><span class="n">base_classifier</span><span class="p">())</span>
-        <span class="k">yield</span> <span class="n">PCC</span><span class="p">(</span><span class="n">base_classifier</span><span class="p">())</span>
-        <span class="k">yield</span> <span class="n">PACC</span><span class="p">(</span><span class="n">base_classifier</span><span class="p">())</span>
+        <span class="k">yield</span> <span class="s1">&#39;CC&#39;</span><span class="p">,</span> <span class="n">CC</span><span class="p">(</span><span class="n">base_classifier</span><span class="p">())</span>
+        <span class="k">yield</span> <span class="s1">&#39;ACC&#39;</span><span class="p">,</span> <span class="n">ACC</span><span class="p">(</span><span class="n">base_classifier</span><span class="p">())</span>
+        <span class="k">yield</span> <span class="s1">&#39;PCC&#39;</span><span class="p">,</span> <span class="n">PCC</span><span class="p">(</span><span class="n">base_classifier</span><span class="p">())</span>
+        <span class="k">yield</span> <span class="s1">&#39;PACC&#39;</span><span class="p">,</span> <span class="n">PACC</span><span class="p">(</span><span class="n">base_classifier</span><span class="p">())</span>

-    <span class="n">data</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">fetch_reviews</span><span class="p">(</span><span class="s1">&#39;kindle&#39;</span><span class="p">,</span> <span class="n">tfidf</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">min_df</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
+    <span class="n">train</span><span class="p">,</span> <span class="n">test</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">fetch_reviews</span><span class="p">(</span><span class="s1">&#39;kindle&#39;</span><span class="p">,</span> <span class="n">tfidf</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">min_df</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span><span class="o">.</span><span class="n">train_test</span>

    <span class="n">method_names</span><span class="p">,</span> <span class="n">true_prevs</span><span class="p">,</span> <span class="n">estim_prevs</span><span class="p">,</span> <span class="n">tr_prevs</span> <span class="o">=</span> <span class="p">[],</span> <span class="p">[],</span> <span class="p">[],</span> <span class="p">[]</span>

-    <span class="k">for</span> <span class="n">model</span> <span class="ow">in</span> <span class="n">models</span><span class="p">():</span>
-        <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">training</span><span class="p">)</span>
-        <span class="n">true_prev</span><span class="p">,</span> <span class="n">estim_prev</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">evaluation</span><span class="o">.</span><span class="n">artificial_sampling_prediction</span><span class="p">(</span>
-            <span class="n">model</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">test</span><span class="p">,</span> <span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SAMPLE_SIZE&#39;</span><span class="p">],</span> <span class="n">n_repetitions</span><span class="o">=</span><span class="mi">100</span><span class="p">,</span> <span class="n">n_prevpoints</span><span class="o">=</span><span class="mi">21</span>
-        <span class="p">)</span>
+    <span class="k">for</span> <span class="n">method_name</span><span class="p">,</span> <span class="n">model</span> <span class="ow">in</span> <span class="n">models</span><span class="p">():</span>
+        <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">train</span><span class="p">)</span>
+        <span class="n">true_prev</span><span class="p">,</span> <span class="n">estim_prev</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">evaluation</span><span class="o">.</span><span class="n">prediction</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">APP</span><span class="p">(</span><span class="n">test</span><span class="p">,</span> <span class="n">repeats</span><span class="o">=</span><span class="mi">100</span><span class="p">,</span> <span class="n">random_state</span><span class="o">=</span><span class="mi">0</span><span class="p">))</span>

-        <span class="n">method_names</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="vm">__class__</span><span class="o">.</span><span class="vm">__name__</span><span class="p">)</span>
+        <span class="n">method_names</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">method_name</span><span class="p">)</span>
        <span class="n">true_prevs</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">true_prev</span><span class="p">)</span>
        <span class="n">estim_prevs</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">estim_prev</span><span class="p">)</span>
-        <span class="n">tr_prevs</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">data</span><span class="o">.</span><span class="n">training</span><span class="o">.</span><span class="n">prevalence</span><span class="p">())</span>
+        <span class="n">tr_prevs</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">train</span><span class="o">.</span><span class="n">prevalence</span><span class="p">())</span>

    <span class="k">return</span> <span class="n">method_names</span><span class="p">,</span> <span class="n">true_prevs</span><span class="p">,</span> <span class="n">estim_prevs</span><span class="p">,</span> <span class="n">tr_prevs</span>

@ -199,21 +198,19 @@ IMDb dataset, and generate the bias plot again.
 This example can be run by rewriting the <em>gen_data()</em> function
 like this:</p>
 <div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">gen_data</span><span class="p">():</span>
-    <span class="n">data</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">fetch_reviews</span><span class="p">(</span><span class="s1">&#39;imdb&#39;</span><span class="p">,</span> <span class="n">tfidf</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">min_df</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
+
+    <span class="n">train</span><span class="p">,</span> <span class="n">test</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">datasets</span><span class="o">.</span><span class="n">fetch_reviews</span><span class="p">(</span><span class="s1">&#39;imdb&#39;</span><span class="p">,</span> <span class="n">tfidf</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">min_df</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span><span class="o">.</span><span class="n">train_test</span>
    <span class="n">model</span> <span class="o">=</span> <span class="n">CC</span><span class="p">(</span><span class="n">LinearSVC</span><span class="p">())</span>

    <span class="n">method_data</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">training_prevalence</span> <span class="ow">in</span> <span class="n">np</span><span class="o">.</span><span class="n">linspace</span><span class="p">(</span><span class="mf">0.1</span><span class="p">,</span> <span class="mf">0.9</span><span class="p">,</span> <span class="mi">9</span><span class="p">):</span>
        <span class="n">training_size</span> <span class="o">=</span> <span class="mi">5000</span>
-        <span class="c1"># since the problem is binary, it suffices to specify the negative prevalence (the positive is constrained)</span>
-        <span class="n">training</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">training</span><span class="o">.</span><span class="n">sampling</span><span class="p">(</span><span class="n">training_size</span><span class="p">,</span> <span class="mi">1</span> <span class="o">-</span> <span class="n">training_prevalence</span><span class="p">)</span>
-        <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">training</span><span class="p">)</span>
-        <span class="n">true_prev</span><span class="p">,</span> <span class="n">estim_prev</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">evaluation</span><span class="o">.</span><span class="n">artificial_sampling_prediction</span><span class="p">(</span>
-            <span class="n">model</span><span class="p">,</span> <span class="n">data</span><span class="o">.</span><span class="n">sample</span><span class="p">,</span> <span class="n">qp</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s1">&#39;SAMPLE_SIZE&#39;</span><span class="p">],</span> <span class="n">n_repetitions</span><span class="o">=</span><span class="mi">100</span><span class="p">,</span> <span class="n">n_prevpoints</span><span class="o">=</span><span class="mi">21</span>
-        <span class="p">)</span>
-        <span class="c1"># method names can contain Latex syntax</span>
-        <span class="n">method_name</span> <span class="o">=</span> <span class="s1">&#39;CC$_{&#39;</span> <span class="o">+</span> <span class="sa">f</span><span class="s1">&#39;</span><span class="si">{</span><span class="nb">int</span><span class="p">(</span><span class="mi">100</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">training_prevalence</span><span class="p">)</span><span class="si">}</span><span class="s1">&#39;</span> <span class="o">+</span> <span class="s1">&#39;\%}$&#39;</span>
-        <span class="n">method_data</span><span class="o">.</span><span class="n">append</span><span class="p">((</span><span class="n">method_name</span><span class="p">,</span> <span class="n">true_prev</span><span class="p">,</span> <span class="n">estim_prev</span><span class="p">,</span> <span class="n">training</span><span class="o">.</span><span class="n">prevalence</span><span class="p">()))</span>
+        <span class="c1"># the problem is binary, so it suffices to specify the negative prevalence (the positive one is then constrained)</span>
+        <span class="n">train_sample</span> <span class="o">=</span> <span class="n">train</span><span class="o">.</span><span class="n">sampling</span><span class="p">(</span><span class="n">training_size</span><span class="p">,</span> <span class="mi">1</span><span class="o">-</span><span class="n">training_prevalence</span><span class="p">)</span>
+        <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">train_sample</span><span class="p">)</span>
+        <span class="n">true_prev</span><span class="p">,</span> <span class="n">estim_prev</span> <span class="o">=</span> <span class="n">qp</span><span class="o">.</span><span class="n">evaluation</span><span class="o">.</span><span class="n">prediction</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">APP</span><span class="p">(</span><span class="n">test</span><span class="p">,</span> <span class="n">repeats</span><span class="o">=</span><span class="mi">100</span><span class="p">,</span> <span class="n">random_state</span><span class="o">=</span><span class="mi">0</span><span class="p">))</span>
+        <span class="n">method_name</span> <span class="o">=</span> <span class="s1">&#39;CC$_{&#39;</span><span class="o">+</span><span class="sa">f</span><span class="s1">&#39;</span><span class="si">{</span><span class="nb">int</span><span class="p">(</span><span class="mi">100</span><span class="o">*</span><span class="n">training_prevalence</span><span class="p">)</span><span class="si">}</span><span class="s1">&#39;</span> <span class="o">+</span> <span class="s1">&#39;\%}$&#39;</span>
+        <span class="n">method_data</span><span class="o">.</span><span class="n">append</span><span class="p">((</span><span class="n">method_name</span><span class="p">,</span> <span class="n">true_prev</span><span class="p">,</span> <span class="n">estim_prev</span><span class="p">,</span> <span class="n">train_sample</span><span class="o">.</span><span class="n">prevalence</span><span class="p">()))</span>

    <span class="k">return</span> <span class="nb">zip</span><span class="p">(</span><span class="o">*</span><span class="n">method_data</span><span class="p">)</span>
 </pre></div>
--- a/docs/build/html/_sources/Datasets.md.txt
+++ b/docs/build/html/_sources/Datasets.md.txt
@ -31,13 +31,14 @@ Output the class prevalences (showing 2 digit precision):
 ```

 One can easily produce new samples at desired class prevalences:
+
 ```python
 sample_size = 10
 prev = [0.4, 0.1, 0.5]
 sample = data.sampling(sample_size, *prev)

 print('instances:', sample.instances)
-print('labels:', sample.labels)
+print('labels:', sample.classes)
 print('prevalence:', F.strprev(sample.prevalence(), prec=2))
 ```
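
To make the idea of drawing a sample at prescribed class prevalences concrete, here is a minimal, library-independent sketch. The helper `sample_at_prevalence` is hypothetical and is not QuaPy's implementation of `sampling`; it simply assumes that each class contributes about `size * prevalence` items, with the last class absorbing any rounding.

```python
import random
from collections import Counter


def sample_at_prevalence(labels, size, prevalences, seed=0):
    """Return indices of a sample of `size` items whose class proportions
    follow `prevalences` (hypothetical helper, not part of QuaPy)."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    assert len(classes) == len(prevalences), "one prevalence per class"
    assert abs(sum(prevalences) - 1.0) < 1e-9, "prevalences must sum to 1"

    # number of items per class; the last class absorbs the rounding error
    counts = [round(size * p) for p in prevalences[:-1]]
    counts.append(size - sum(counts))

    indices = []
    for cls, n in zip(classes, counts):
        pool = [i for i, y in enumerate(labels) if y == cls]
        # draw with replacement so that small pools never run out
        indices.extend(rng.choices(pool, k=n))
    rng.shuffle(indices)
    return indices


labels = [0] * 50 + [1] * 30 + [2] * 20
idx = sample_at_prevalence(labels, size=10, prevalences=[0.4, 0.1, 0.5])
drawn = Counter(labels[i] for i in idx)
print({cls: drawn[cls] / len(idx) for cls in sorted(drawn)})
```

The sample composition is fixed by the requested prevalences, while the seed only affects which items of each class are picked.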

--- a/docs/build/html/_sources/Evaluation.md.txt
+++ b/docs/build/html/_sources/Evaluation.md.txt
@ -50,7 +50,7 @@ indicating the value for the smoothing parameter epsilon.
 Traditionally, this value is set to 1/(2T) in past literature,
 with T the sampling size. One could either pass this value
 to the function each time, or set QuaPy's environment 
-variable _SAMPLE_SIZE_ once, and ommit this argument 
+variable _SAMPLE_SIZE_ once, and omit this argument 
 thereafter (recommended);
 e.g.:

@ -58,7 +58,7 @@ e.g.:
 qp.environ['SAMPLE_SIZE'] = 100  # once for all
 true_prev = np.asarray([0.5, 0.3, 0.2])  # let's assume 3 classes
 estim_prev = np.asarray([0.1, 0.3, 0.6])
-error = qp.ae_.mrae(true_prev, estim_prev)
+error = qp.error.mrae(true_prev, estim_prev)
 print(f'mrae({true_prev}, {estim_prev}) = {error:.3f}')
 ```
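
As a rough illustration of what the smoothing parameter does, the following sketch computes a smoothed relative absolute error by hand for the same prevalence vectors. It assumes the additive smoothing commonly described in the quantification literature, with epsilon set to 1/(2T); the function names are hypothetical and QuaPy's own implementation may differ in detail.

```python
def smooth(prevalences, eps):
    """Additive smoothing, so that no class is left with zero prevalence."""
    n_classes = len(prevalences)
    return [(p + eps) / (eps * n_classes + 1.0) for p in prevalences]


def relative_absolute_error(true_prev, estim_prev, eps):
    """Mean over classes of |estimate - truth| / truth, after smoothing."""
    true_s = smooth(true_prev, eps)
    estim_s = smooth(estim_prev, eps)
    return sum(abs(e - t) / t for t, e in zip(true_s, estim_s)) / len(true_s)


sample_size = 100                 # T in the text
eps = 1.0 / (2 * sample_size)     # the traditional choice 1/(2T)
true_prev = [0.5, 0.3, 0.2]
estim_prev = [0.1, 0.3, 0.6]
rae = relative_absolute_error(true_prev, estim_prev, eps)
print(f'rae = {rae:.3f}')
```

Without smoothing, any class with a true prevalence of zero would cause a division by zero, which is why the relative error needs epsilon at all.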

@ -71,162 +71,99 @@ Finally, it is possible to instantiate QuaPy's quantification
 error functions from strings using, e.g.:

 ```python
-error_function = qp.ae_.from_name('mse')
+error_function = qp.error.from_name('mse')
 error = error_function(true_prev, estim_prev)
 ```

 ## Evaluation Protocols

-QuaPy implements the so-called "artificial sampling protocol", 
-according to which a test set is used to generate samplings at
-desired prevalences of fixed size and covering the full spectrum
-of prevalences. This protocol is called "artificial" in contrast 
-to the "natural prevalence sampling" protocol that,
-despite introducing some variability during sampling, approximately 
-preserves the training class prevalence.
-
-In the artificial sampling procol, the user specifies the number
-of (equally distant) points to be generated from the interval [0,1].
-
-For example, if n_prevpoints=11 then, for each class, the prevalences
-[0., 0.1, 0.2, ..., 1.] will be used. This means that, for two classes,
-the number of different prevalences will be 11 (since, once the prevalence
-of one class is determined, the other one is constrained). For 3 classes,
-the number of valid combinations can be obtained as 11 + 10 + ... + 1 = 66.
-In general, the number of valid combinations that will be produced for a given
-value of n_prevpoints can be consulted by invoking 
-quapy.functional.num_prevalence_combinations, e.g.:
+An _evaluation protocol_ is an evaluation procedure that uses
+one specific _sample generation protocol_ to generate many
+samples, typically characterized by widely varying amounts of
+_shift_ with respect to the original distribution, that are then
+used to evaluate the performance of a (trained) quantifier.
+These protocols are explained in more detail in a dedicated [entry
+in the wiki](Protocols.md). For the time being, let us assume we have already
+chosen and instantiated one such protocol, which we here
+simply call _prot_. Let us also assume our model is called
+_quantifier_ and that our evaluation measure of choice is
+_mae_. The evaluation comes down to:

 ```python
-import quapy.functional as F
-n_prevpoints = 21
-n_classes = 4
-n = F.num_prevalence_combinations(n_prevpoints, n_classes, n_repeats=1)
+mae = qp.evaluation.evaluate(quantifier, protocol=prot, error_metric='mae')
+print(f'MAE = {mae:.4f}')
 ```

-in this example, n=1771. Note the last argument, n_repeats, that
-informs of the number of examples that will be generated for any 
-valid combination (typical values are, e.g., 1 for a single sample,
-or 10 or higher for computing standard deviations of performing statistical
-significance tests).
-
-One can instead work the other way around, i.e., one could set a 
-maximum budged of evaluations and get the number of prevalence points that
-will generate a number of evaluations close, but not higher, than
-the fixed budget. This can be achieved with the function
-quapy.functional.get_nprevpoints_approximation, e.g.:
+It is often desirable to evaluate our system using more than one
+evaluation measure. In this case, it is convenient to generate
+a _report_. A report in QuaPy is a dataframe accounting for all the
+true prevalence values with their corresponding prevalence values
+as estimated by the quantifier, along with the error each has given
+rise to.

 ```python
-budget = 5000
-n_prevpoints = F.get_nprevpoints_approximation(budget, n_classes, n_repeats=1)
-n = F.num_prevalence_combinations(n_prevpoints, n_classes, n_repeats=1)
-print(f'by setting n_prevpoints={n_prevpoints} the number of evaluations for {n_classes} classes will be {n}')
-```
-that will print:
-```
-by setting n_prevpoints=30 the number of evaluations for 4 classes will be 4960
+report = qp.evaluation.evaluation_report(quantifier, protocol=prot, error_metrics=['mae', 'mrae', 'mkld'])
 ```

-The cost of evaluation will depend on the values of _n_prevpoints_, _n_classes_, 
-and _n_repeats_. Since it might sometimes be cumbersome to control the overall
-cost of an experiment having to do with the number of combinations that
-will be generated for a particular setting of these arguments (particularly
-when _n_classes>2_), evaluation functions
-typically allow the user to rather specify an _evaluation budget_, i.e., a maximum
-number of samplings to generate. By specifying this argument, one could avoid
-specifying _n_prevpoints_, and the value for it that would lead to a closer 
-number of evaluation budget, without surpassing it, will be automatically set.  
-
-The following script shows a full example in which a PACC model relying 
-on a Logistic Regressor classifier is
-tested on the _kindle_ dataset by means of the artificial prevalence
-sampling protocol on samples of size 500, in terms of various
-evaluation metrics.
-
-````python
-import quapy as qp
-import quapy.functional as F
-from sklearn.linear_model import LogisticRegression
-
-qp.environ['SAMPLE_SIZE'] = 500
-
-dataset = qp.datasets.fetch_reviews('kindle')
-qp.data.preprocessing.text2tfidf(dataset, min_df=5, inplace=True)
-
-training = dataset.training
-test = dataset.test
-
-lr = LogisticRegression()
-pacc = qp.method.aggregative.PACC(lr)
-
-pacc.fit(training)
-
-df = qp.evaluation.artificial_sampling_report(
-    pacc,  # the quantification method
-    test,  # the test set on which the method will be evaluated
-    sample_size=qp.environ['SAMPLE_SIZE'],  #indicates the size of samples to be drawn
-    n_prevpoints=11,  # how many prevalence points will be extracted from the interval [0, 1] for each category
-    n_repetitions=1,  # number of times each prevalence will be used to generate a test sample
-    n_jobs=-1,  # indicates the number of parallel workers (-1 indicates, as in sklearn, all CPUs)
-    random_seed=42,  # setting a random seed allows to replicate the test samples across runs
-    error_metrics=['mae', 'mrae', 'mkld'],  # specify the evaluation metrics
-    verbose=True  # set to True to show some standard-line outputs
-)
-````
-
-The resulting report is a pandas' dataframe that can be directly printed.
-Here, we set some display options from pandas just to make the output clearer; 
-note also that the estimated prevalences are shown as strings using the
-function strprev function that simply converts a prevalence into a 
-string representing it, with a fixed decimal precision (default 3):
+From a pandas dataframe, it is straightforward to visualize all the results
+and compute the averaged values, e.g.:

 ```python
-import pandas as pd
 pd.set_option('display.expand_frame_repr', False)
-pd.set_option("precision", 3)
-df['estim-prev'] = df['estim-prev'].map(F.strprev)
-print(df)
+report['estim-prev'] = report['estim-prev'].map(F.strprev)
+print(report)
+
+print('Averaged values:')
+print(report.mean(numeric_only=True))
 ```

-The output should look like:
+This will produce an output like:

 ```
           true-prev      estim-prev       mae      mrae      mkld
-0   [0.0, 1.0]  [0.000, 1.000]  0.000   0.000  0.000e+00
-1   [0.1, 0.9]  [0.091, 0.909]  0.009   0.048  4.426e-04
-2   [0.2, 0.8]  [0.163, 0.837]  0.037   0.114  4.633e-03
-3   [0.3, 0.7]  [0.283, 0.717]  0.017   0.041  7.383e-04
-4   [0.4, 0.6]  [0.366, 0.634]  0.034   0.070  2.412e-03
-5   [0.5, 0.5]  [0.459, 0.541]  0.041   0.082  3.387e-03
-6   [0.6, 0.4]  [0.565, 0.435]  0.035   0.073  2.535e-03
-7   [0.7, 0.3]  [0.654, 0.346]  0.046   0.108  4.701e-03
-8   [0.8, 0.2]  [0.725, 0.275]  0.075   0.235  1.515e-02
-9   [0.9, 0.1]  [0.858, 0.142]  0.042   0.229  7.740e-03
-10  [1.0, 0.0]  [0.945, 0.055]  0.055  27.357  5.219e-02
+0     [0.308, 0.692]  [0.314, 0.686]  0.005649  0.013182  0.000074
+1     [0.896, 0.104]  [0.909, 0.091]  0.013145  0.069323  0.000985
+2     [0.848, 0.152]  [0.809, 0.191]  0.039063  0.149806  0.005175
+3     [0.016, 0.984]  [0.033, 0.967]  0.017236  0.487529  0.005298
+4     [0.728, 0.272]  [0.751, 0.249]  0.022769  0.057146  0.001350
+...              ...             ...       ...       ...       ...
+4995    [0.72, 0.28]  [0.698, 0.302]  0.021752  0.053631  0.001133
+4996  [0.868, 0.132]  [0.888, 0.112]  0.020490  0.088230  0.001985
+4997  [0.292, 0.708]  [0.298, 0.702]  0.006149  0.014788  0.000090
+4998    [0.24, 0.76]  [0.220, 0.780]  0.019950  0.054309  0.001127
+4999  [0.948, 0.052]  [0.965, 0.035]  0.016941  0.165776  0.003538
+
+[5000 rows x 5 columns]
+Averaged values:
+mae     0.023588
+mrae    0.108779
+mkld    0.003631
+dtype: float64
 ```

-One can get the averaged scores using standard pandas' 
-functions, i.e.:
+Alternatively, we can simply generate all the predictions by:

 ```python
-print(df.mean())
+true_prevs, estim_prevs = qp.evaluation.prediction(quantifier, protocol=prot)
 ```

-will produce the following output:
-
-```
-true-prev    0.500
-mae          0.035
-mrae         2.578
-mkld         0.009
-dtype: float64
-```
-
-Other evaluation functions include:
-
-* _artificial_sampling_eval_: that computes the evaluation for a 
-given evaluation metric, returning the average instead of a dataframe.
-* _artificial_sampling_prediction_: that returns two np.arrays containing the
-true prevalences and the estimated prevalences. 
-
-See the documentation for further details.
+All the evaluation functions implement specific optimizations for speeding up
+the evaluation of aggregative quantifiers (i.e., of instances of _AggregativeQuantifier_).
+The optimization comes down to generating classification predictions (either crisp or soft)
+only once for the entire test set, and then applying the sampling procedure to the
+predictions, instead of generating samples of instances and then computing the
+classification predictions every time. This is only possible when the protocol
+is an instance of _OnLabelledCollectionProtocol_. The optimization is only
+carried out when the number of classification predictions thus generated would be
+smaller than the number of predictions required for the entire protocol; e.g.,
+if the original dataset contains 1M instances, but the protocol is such that it would
+at most generate 20 samples of 100 instances, then it would be preferable to postpone the
+classification for each sample. This behaviour is indicated by setting
+_aggr_speedup="auto"_. Conversely, when indicating _aggr_speedup="force"_ QuaPy will
+precompute all the predictions irrespective of the number of instances and number of samples.
+Finally, this can be deactivated by setting _aggr_speedup=False_. Note that this optimization
+is not only applied for the final evaluation, but also for the internal evaluations carried
+out during _model selection_. Since these are typically many, the heuristic can help reduce the
+execution time considerably.
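
The decision rule described above can be sketched in a few lines. This is only an illustration of the stated heuristic under the assumption that cost is measured in number of classifier predictions; `should_precompute` is a hypothetical name and not a QuaPy function.

```python
def should_precompute(n_test_instances, n_samples, sample_size):
    """Hypothetical rule of thumb: classify the whole test set once only if
    that requires fewer predictions than classifying every sample separately."""
    predictions_once = n_test_instances
    predictions_per_protocol = n_samples * sample_size
    return predictions_once < predictions_per_protocol


# the example from the text: 1M instances, at most 20 samples of 100 instances
print(should_precompute(1_000_000, n_samples=20, sample_size=100))
# a different setting: 5,000 test instances, 5,000 samples of 100 instances
print(should_precompute(5_000, n_samples=5_000, sample_size=100))
```

In the first case only 2,000 predictions are needed by the protocol, so classifying one million instances up front would be wasteful; in the second case precomputing saves most of the work.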
--- a/docs/build/html/_sources/Methods.md.txt
+++ b/docs/build/html/_sources/Methods.md.txt
@ -16,12 +16,6 @@ and implement some abstract methods:

    @abstractmethod
    def quantify(self, instances): ...
-
-    @abstractmethod
-    def set_params(self, **parameters): ...
-
-    @abstractmethod
-    def get_params(self, deep=True): ...
 ```
 The meaning of those functions should be familiar to those
 used to work with scikit-learn since the class structure of QuaPy
@ -32,10 +26,10 @@ scikit-learn' structure has not been adopted _as is_ in QuaPy responds to
 the fact that scikit-learn's _predict_ function is expected to return
 one output for each input element --e.g., a predicted label for each
 instance in a sample-- while in quantification the output for a sample
-is one single array of class prevalences), while functions _set_params_
-and _get_params_ allow a 
-[model selector](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection) 
-to automate the process of hyperparameter search.
+is one single array of class prevalences).
+Quantifiers also extend scikit-learn's `BaseEstimator`, in order
+to simplify the use of _set_params_ and _get_params_ during
+[model selection](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection).

 ## Aggregative Methods

@ -58,11 +52,11 @@ of _BaseQuantifier.quantify_ is already provided, which looks like:

 ```python
    def quantify(self, instances):
-    classif_predictions = self.preclassify(instances)
+    classif_predictions = self.classify(instances)
    return self.aggregate(classif_predictions)
 ```
 Aggregative quantifiers are expected to maintain a classifier (which is
-accessed through the _@property_ _learner_). This classifier is
+accessed through the _@property_ _classifier_). This classifier is
 given as input to the quantifier, and can be already fit
 on external data (in which case, the _fit_learner_ argument should
 be set to False), or be fit by the quantifier's fit (default).
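
The split between classifying and aggregating can be illustrated with a toy class. This is a hypothetical stand-in written for this note, not QuaPy code: the "classifier" is any callable that maps one instance to a label, and the aggregation is plain classify & count.

```python
class ToyClassifyAndCount:
    """Hypothetical stand-in for an aggregative quantifier (not QuaPy code)."""

    def __init__(self, classifier, classes):
        self.classifier = classifier   # any callable mapping one instance to a label
        self.classes = classes

    def classify(self, instances):
        # one crisp prediction per instance
        return [self.classifier(x) for x in instances]

    def aggregate(self, classif_predictions):
        # classify & count: the prevalence of a class is its share of predictions
        n = len(classif_predictions)
        return [classif_predictions.count(c) / n for c in self.classes]

    def quantify(self, instances):
        return self.aggregate(self.classify(instances))


# a trivial threshold "classifier" on one-dimensional instances
toy = ToyClassifyAndCount(classifier=lambda x: int(x > 0.5), classes=[0, 1])
instances = [0.1, 0.9, 0.7, 0.2, 0.8]
predictions = toy.classify(instances)    # computed once ...
print(toy.aggregate(predictions))        # ... aggregated for the full set
print(toy.aggregate(predictions[:2]))    # ... and reused for a sub-sample
```

Because `aggregate` only needs predictions, the same predictions can be reused for many samples without calling the classifier again.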
@ -73,13 +67,8 @@ _AggregativeProbabilisticQuantifier(AggregativeQuantifier)_.
 The particularity of _probabilistic_ aggregative methods (w.r.t. 
 non-probabilistic ones), is that the default quantifier is defined
 in terms of the posterior probabilities returned by a probabilistic
-classifier, and not by the crisp decisions of a hard classifier; i.e.:
-
-```python
-    def quantify(self, instances):
-        classif_posteriors = self.posterior_probabilities(instances)
-        return self.aggregate(classif_posteriors)
-```
+classifier, and not by the crisp decisions of a hard classifier.
+In any case, the interface _classify(instances)_ remains unchanged. 

 One advantage of _aggregative_ methods (either probabilistic or not)
 is that the evaluation according to any sampling procedure (e.g., 
@ -110,9 +99,7 @@ import quapy as qp
 import quapy.functional as F
 from sklearn.svm import LinearSVC

-dataset = qp.datasets.fetch_twitter('hcr', pickle=True)
-training = dataset.training
-test = dataset.test
+training, test = qp.datasets.fetch_twitter('hcr', pickle=True).train_test

 # instantiate a classifier learner, in this case a SVM
 svm = LinearSVC()
@ -156,11 +143,12 @@ model.fit(training, val_split=5)
 ```

 The following code illustrates the case in which PCC is used:
+
 ```python
 model = qp.method.aggregative.PCC(svm)
 model.fit(training)
 estim_prevalence = model.quantify(test.instances)
-print('classifier:', model.learner)
+print('classifier:', model.classifier)
 ```
 In this case, QuaPy will print:
 ```
@ -211,14 +199,22 @@ model.fit(dataset.training)
 estim_prevalence = model.quantify(dataset.test.instances)
 ```

+_New in v0.1.7_: EMQ now accepts two new parameters in its constructor. The first is
+_exact_train_prev_, which allows using the true training prevalence as the initial
+prevalence estimate (default behaviour), or instead an approximation of it as
+suggested by [Alexandari et al. (2020)](http://proceedings.mlr.press/v119/alexandari20a.html)
+(by setting _exact_train_prev=False_).
+The other parameter is _recalib_, which allows specifying a calibration method among those
+proposed by [Alexandari et al. (2020)](http://proceedings.mlr.press/v119/alexandari20a.html),
+including Bias-Corrected Temperature Scaling, Vector Scaling, etc.
+See the API documentation for further details.
+

 ### Hellinger Distance y (HDy)

-The method HDy is described in:
-
-_Implementation of the method based on the Hellinger Distance y (HDy) proposed by
-González-Castro, V., Alaiz-Rodrı́guez, R., and Alegre, E. (2013). Class distribution
-estimation based on the Hellinger distance. Information Sciences, 218:146–164._
+Implementation of the method based on the Hellinger Distance y (HDy) proposed by
+[González-Castro, V., Alaiz-Rodrı́guez, R., and Alegre, E. (2013). Class distribution
+estimation based on the Hellinger distance. Information Sciences, 218:146–164.](https://www.sciencedirect.com/science/article/pii/S0020025512004069)

 It is implemented in _qp.method.aggregative.HDy_ (also accessible
 through the alias _qp.method.aggregative.HellingerDistanceY_).
@ -249,30 +245,51 @@ model.fit(dataset.training)
 estim_prevalence = model.quantify(dataset.test.instances)
 ```

+_New in v0.1.7:_ QuaPy now provides an implementation of the generalized
+"Distribution Matching" approaches for multiclass, inspired by the framework
+of [Firat (2016)](https://arxiv.org/abs/1606.00868). One can instantiate
+a variant of HDy for multiclass quantification as follows:
+
+```python
+multiclassHDy = qp.method.aggregative.DistributionMatching(classifier=LogisticRegression(), divergence='HD', cdf=False)
+``` 
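
For intuition about what "distribution matching" with the Hellinger distance means, here is a minimal binary sketch. It is not QuaPy's `DistributionMatching` code: the histograms are made-up numbers, the function names are hypothetical, and a plain grid search stands in for whatever optimizer the library uses.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (with the
    1/sqrt(2) normalization, so that the value lies in [0, 1])."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2.0)


def match_mixture(pos_hist, neg_hist, test_hist, n_steps=100):
    """Grid-search the positive prevalence whose mixture of the two
    class-conditional histograms is closest to the test histogram."""
    best_alpha, best_dist = 0.0, float('inf')
    for step in range(n_steps + 1):
        alpha = step / n_steps
        mixture = [alpha * p + (1 - alpha) * n for p, n in zip(pos_hist, neg_hist)]
        dist = hellinger(mixture, test_hist)
        if dist < best_dist:
            best_alpha, best_dist = alpha, dist
    return best_alpha


# histograms of classifier scores over 4 bins (made-up numbers)
pos_hist = [0.05, 0.15, 0.30, 0.50]
neg_hist = [0.50, 0.30, 0.15, 0.05]
# a test histogram built as a 30% positive / 70% negative mixture
test_hist = [0.3 * p + 0.7 * n for p, n in zip(pos_hist, neg_hist)]
estimated_positive_prevalence = match_mixture(pos_hist, neg_hist, test_hist)
print(estimated_positive_prevalence)
```

The multiclass generalization searches over a full prevalence vector instead of a single mixing weight, but the matching principle is the same.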
+
+_New in v0.1.7:_ QuaPy now provides an implementation of the "DyS"
+framework proposed by [Maletzke et al (2020)](https://ojs.aaai.org/index.php/AAAI/article/view/4376)
+and the "SMM" method proposed by [Hassan et al (2019)](https://ieeexplore.ieee.org/document/9260028)
+(thanks to _Pablo González_ for the contributions!)
+
+### Threshold Optimization methods
+
+_New in v0.1.7:_ QuaPy now implements Forman's threshold optimization methods;
+see, e.g., [(Forman 2006)](https://dl.acm.org/doi/abs/10.1145/1150402.1150423) 
+and [(Forman 2008)](https://link.springer.com/article/10.1007/s10618-008-0097-y).
+These include: T50, MAX, X, Median Sweep (MS), and its variant MS2.
+
 ### Explicit Loss Minimization

 Explicit Loss Minimization (ELM) represents a family of methods
 based on structured output learning, i.e., quantifiers relying on 
 classifiers that have been optimized targeting a 
 quantification-oriented evaluation measure.
+The original methods are implemented in QuaPy as classify & count (CC) 
+quantifiers that use Joachims' [SVMperf](https://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html) 
+as the underlying classifier, properly set to optimize for the desired loss.
 
-In QuaPy, the following methods, all relying on Joachim's 
-[SVMperf](https://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html)
-implementation, are available in _qp.method.aggregative_:
+In QuaPy, this can be achieved by calling the functions:

-* SVMQ (SVM-Q) is a quantification method optimizing the metric _Q_ defined 
-in _Barranquero, J., Díez, J., and del Coz, J. J. (2015). Quantification-oriented learning based
-on reliable classifiers. Pattern Recognition, 48(2):591–604._
-* SVMKLD (SVM for Kullback-Leibler Divergence) proposed in _Esuli, A. and Sebastiani, F. (2015). 
+* _newSVMQ_: returns the quantification method called SVM(Q) that optimizes for the metric _Q_ defined 
+in [_Barranquero, J., Díez, J., and del Coz, J. J. (2015). Quantification-oriented learning based
+on reliable classifiers. Pattern Recognition, 48(2):591–604._](https://www.sciencedirect.com/science/article/pii/S003132031400291X) 
+* _newSVMKLD_ and _newSVMNKLD_: return the quantification methods called SVM(KLD) and SVM(nKLD), which optimize the 
+    Kullback-Leibler Divergence and the Normalized Kullback-Leibler Divergence, respectively, as proposed in [_Esuli, A. and Sebastiani, F. (2015). 
    Optimizing text quantifiers for multivariate loss functions. 
-    ACM Transactions on Knowledge Discovery and Data, 9(4):Article 27._
-* SVMNKLD (SVM for Normalized Kullback-Leibler Divergence) proposed in _Esuli, A. and Sebastiani, F. (2015). 
-    Optimizing text quantifiers for multivariate loss functions. 
-    ACM Transactions on Knowledge Discovery and Data, 9(4):Article 27._
-* SVMAE (SVM for Mean Absolute Error) 
-* SVMRAE (SVM for Mean Relative Absolute Error)
+    ACM Transactions on Knowledge Discovery and Data, 9(4):Article 27._](https://dl.acm.org/doi/abs/10.1145/2700406)
+* _newSVMAE_ and _newSVMRAE_: return the quantification methods called SVM(AE) and SVM(RAE), which optimize the (Mean) Absolute Error and the
+  (Mean) Relative Absolute Error, respectively, as first used by 
+    [_Moreo, A. and Sebastiani, F. (2021). Tweet sentiment quantification: An experimental re-evaluation. PLOS ONE 17 (9), 1-23._](https://arxiv.org/abs/2011.02552)

-the last two methods (SVMAE and SVMRAE) have been implemented in 
+The last two methods (SVM(AE) and SVM(RAE)) have been implemented in 
 QuaPy in order to make available ELM variants for what nowadays
 are considered the most well-behaved evaluation metrics in quantification.

@ -306,13 +323,18 @@ currently supports only binary classification.
 ELM variants (any binary quantifier in general) can be extended
 to operate in single-label scenarios trivially by adopting a 
 "one-vs-all" strategy (as, e.g., in 
-_Gao, W. and Sebastiani, F. (2016). From classification to quantification in tweet sentiment
-analysis. Social Network Analysis and Mining, 6(19):1–22_).
-In QuaPy this is possible by using the _OneVsAll_ class:
+[_Gao, W. and Sebastiani, F. (2016). From classification to quantification in tweet sentiment
+analysis. Social Network Analysis and Mining, 6(19):1–22_](https://link.springer.com/article/10.1007/s13278-016-0327-z)).
+In QuaPy this is possible by using the _OneVsAll_ class.
+
+There are two variants of this class: _OneVsAllGeneric_, which works for
+any quantifier, and _OneVsAllAggregative_, which is optimized for aggregative quantifiers.
+In general, you can simply use the _newOneVsAll_ function and QuaPy will choose
+the more convenient of the two.

 ```python
 import quapy as qp
-from quapy.method.aggregative import SVMQ, OneVsAll
+from quapy.method.aggregative import newSVMQ
+from quapy.method.base import newOneVsAll

 # load a single-label dataset (this one contains 3 classes)
 dataset = qp.datasets.fetch_twitter('hcr', pickle=True)
@ -320,11 +342,14 @@ dataset = qp.datasets.fetch_twitter('hcr', pickle=True)
 # let qp know where svmperf is
 qp.environ['SVMPERF_HOME'] = '../svm_perf_quantification'

-model = OneVsAll(SVMQ(), n_jobs=-1)  # run them on parallel
+model = newOneVsAll(newSVMQ(), n_jobs=-1)  # run them in parallel
 model.fit(dataset.training)
 estim_prevalence = model.quantify(dataset.test.instances)
 ```

+Check the examples _[explicit_loss_minimization.py](..%2Fexamples%2Fexplicit_loss_minimization.py)_
+and _[one_vs_all.py](..%2Fexamples%2Fone_vs_all.py)_ for more details.
+
 ## Meta Models

 By _meta_ models we mean quantification methods that are defined on top of other
@ -337,12 +362,12 @@ _Meta_ models are implemented in the _qp.method.meta_ module.

 QuaPy implements (some of) the variants proposed in:

-* _Pérez-Gállego, P., Quevedo, J. R., & del Coz, J. J. (2017).
+* [_Pérez-Gállego, P., Quevedo, J. R., & del Coz, J. J. (2017).
 Using ensembles for problems with characterizable changes in data distribution: A case study on quantification.
-Information Fusion, 34, 87-100._
-* _Pérez-Gállego, P., Castano, A., Quevedo, J. R., & del Coz, J. J. (2019). 
+Information Fusion, 34, 87-100._](https://www.sciencedirect.com/science/article/pii/S1566253516300628)
+* [_Pérez-Gállego, P., Castano, A., Quevedo, J. R., & del Coz, J. J. (2019). 
    Dynamic ensemble selection for quantification tasks. 
-    Information Fusion, 45, 1-15._
+    Information Fusion, 45, 1-15._](https://www.sciencedirect.com/science/article/pii/S1566253517303652)

 The following code shows how to instantiate an Ensemble of 30 _Adjusted Classify & Count_ (ACC) 
 quantifiers operating with a _Logistic Regressor_ (LR) as the base classifier, and using the
@ -378,10 +403,10 @@ wiki if you want to optimize the hyperparameters of ensemble for classification

 QuaPy offers an implementation of QuaNet, a deep learning model presented in:

-_Esuli, A., Moreo, A., & Sebastiani, F. (2018, October). 
+[_Esuli, A., Moreo, A., & Sebastiani, F. (2018, October). 
 A recurrent neural network for sentiment quantification. 
 In Proceedings of the 27th ACM International Conference on 
-Information and Knowledge Management (pp. 1775-1778)._
+Information and Knowledge Management (pp. 1775-1778)._](https://dl.acm.org/doi/abs/10.1145/3269206.3269287)

 This model requires _torch_ to be installed. 
 QuaNet also requires a classifier that can provide embedded representations
@ -406,7 +431,8 @@ cnn = CNNnet(dataset.vocabulary_size, dataset.n_classes)
 learner = NeuralClassifierTrainer(cnn, device='cuda')

 # train QuaNet
-model = QuaNet(learner, qp.environ['SAMPLE_SIZE'], device='cuda')
+model = QuaNet(learner, device='cuda')
 model.fit(dataset.training)
 estim_prevalence = model.quantify(dataset.test.instances)
 ```
+
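As a rough illustration of the one-vs-all strategy described in the Methods.md changes above, the sketch below combines per-class binary prevalence estimates into a single multiclass prevalence vector. It is plain Python and does not use QuaPy: the function name `one_vs_all_prevalence`, the toy estimates, and the choice of clipping followed by L1 normalization are assumptions made for this example, not a description of QuaPy's internals.

```python
def one_vs_all_prevalence(binary_estimates):
    """Combine per-class 'positive' prevalence estimates into one distribution.

    binary_estimates maps each class to the prevalence that a binary
    quantifier (class vs. rest) estimated for it. The values need not sum to 1.
    """
    # clip every estimate into [0, 1] before normalizing
    clipped = {c: min(max(p, 0.0), 1.0) for c, p in binary_estimates.items()}
    total = sum(clipped.values())
    if total == 0:
        # degenerate case: fall back to the uniform distribution
        return {c: 1.0 / len(clipped) for c in clipped}
    # L1 normalization so that the prevalence values sum to 1
    return {c: p / total for c, p in clipped.items()}


# hypothetical outputs of three independent binary quantifiers
estimates = {'negative': 0.5, 'neutral': 0.3, 'positive': 0.4}
prevalence = one_vs_all_prevalence(estimates)
print(prevalence)
```

Because each binary quantifier is fitted independently, the raw estimates generally do not sum to 1, which is why some normalization step is needed before the result can be read as a class distribution.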
--- a/docs/build/html/_sources/Model-Selection.md.txt
+++ b/docs/build/html/_sources/Model-Selection.md.txt
@ -22,9 +22,9 @@ Quantification has long been regarded as an add-on of
 classification, and thus the model selection strategies
 customarily adopted in classification have simply been
 applied to quantification (see the next section).
-It has been argued in _Moreo, Alejandro, and Fabrizio Sebastiani. 
-"Re-Assessing the" Classify and Count" Quantification Method." 
-arXiv preprint arXiv:2011.02552 (2020)._
+It has been argued in [Moreo, Alejandro, and Fabrizio Sebastiani. 
+Re-Assessing the "Classify and Count" Quantification Method. 
+ECIR 2021: Advances in Information Retrieval pp 75–91.](https://link.springer.com/chapter/10.1007/978-3-030-72240-1_6)
 that specific model selection strategies should
 be adopted for quantification. That is, model selection
 strategies for quantification should target 
@ -32,76 +32,86 @@ quantification-oriented losses and be tested in a variety
 of scenarios exhibiting different degrees of prior 
 probability shift.

-The class
-_qp.model_selection.GridSearchQ_
-implements a grid-search exploration over the space of 
-hyper-parameter combinations that evaluates each  
-combination of hyper-parameters 
-by means of a given quantification-oriented
+The class _qp.model_selection.GridSearchQ_ implements a grid-search exploration over the space of 
+hyper-parameter combinations that [evaluates](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation) 
+each combination of hyper-parameters by means of a given quantification-oriented
 error metric (e.g., any of the error functions implemented
-in _qp.error_) and according to the 
-[_artificial sampling protocol_](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation).
+in _qp.error_) and according to a 
+[sampling generation protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols).

-The following is an example of model selection for quantification:
+The following is an example (also included in the examples folder) of model selection for quantification:

 ```python
 import quapy as qp
-from quapy.method.aggregative import PCC
+from quapy.protocol import APP
+from quapy.method.aggregative import DistributionMatching
 from sklearn.linear_model import LogisticRegression
 import numpy as np

-# set a seed to replicate runs
-np.random.seed(0)
-qp.environ['SAMPLE_SIZE'] = 500
+"""
+In this example, we show how to perform model selection on a DistributionMatching quantifier.
+"""

-dataset = qp.datasets.fetch_reviews('hp', tfidf=True, min_df=5)
+model = DistributionMatching(LogisticRegression())
+
+qp.environ['SAMPLE_SIZE'] = 100
+qp.environ['N_JOBS'] = -1  # explore hyper-parameters in parallel
+
+training, test = qp.datasets.fetch_reviews('imdb', tfidf=True, min_df=5).train_test

 # The model will be returned by the fit method of GridSearchQ.
-# Model selection will be performed with a fixed budget of 1000 evaluations
-# for each hyper-parameter combination. The error to optimize is the MAE for
-# quantification, as evaluated on artificially drawn samples at prevalences 
-# covering the entire spectrum on a held-out portion (40%) of the training set.
+# Every combination of hyper-parameters will be evaluated by testing the
+# quantifier thus configured against a series of samples generated by means
+# of a sample generation protocol. For this example, we will use the
+# artificial-prevalence protocol (APP), which generates samples with prevalence
+# values covering the entire range of values from a grid (e.g., [0, 0.1, 0.2, ..., 1]).
+# We devote 30% of the training set to this exploration.
+training, validation = training.split_stratified(train_prop=0.7)
+protocol = APP(validation)
+
+# We will explore a classifier-dependent hyper-parameter (e.g., the 'C'
+# hyper-parameter of LogisticRegression) and a quantification-dependent hyper-parameter
+# (e.g., the number of bins in a DistributionMatching quantifier).
+# Classifier-dependent hyper-parameters have to be marked with a prefix "classifier__"
+# in order to let the quantifier know this hyper-parameter belongs to its underlying
+# classifier.
+param_grid = {
+    'classifier__C': np.logspace(-3,3,7),
+    'nbins': [8, 16, 32, 64],
+}
+
 model = qp.model_selection.GridSearchQ(
-    model=PCC(LogisticRegression()),
-    param_grid={'C': np.logspace(-4,5,10), 'class_weight': ['balanced', None]},
-    sample_size=qp.environ['SAMPLE_SIZE'],
-    eval_budget=1000,
-    error='mae',
-    refit=True,  # retrain on the whole labelled set
-    val_split=0.4,
+    model=model,
+    param_grid=param_grid,
+    protocol=protocol,
+    error='mae',  # the error to optimize is the MAE (a quantification-oriented loss)
+    refit=True,   # retrain on the whole labelled set once done
    verbose=True  # show information as the process goes on
-).fit(dataset.training)
+).fit(training)

 print(f'model selection ended: best hyper-parameters={model.best_params_}')
 model = model.best_model_

 # evaluation in terms of MAE
-results = qp.evaluation.artificial_sampling_eval(
-    model,
-    dataset.test,
-    sample_size=qp.environ['SAMPLE_SIZE'],
-    n_prevpoints=101,
-    n_repetitions=10,
-    error_metric='mae'
-)
+# we use the same evaluation protocol (APP) on the test set
+mae_score = qp.evaluation.evaluate(model, protocol=APP(test), error_metric='mae')

-print(f'MAE={results:.5f}')
+print(f'MAE={mae_score:.5f}')
 ```

 In this example, the system outputs:
 ```
-[GridSearchQ]: starting optimization with n_jobs=1
-[GridSearchQ]: checking hyperparams={'C': 0.0001, 'class_weight': 'balanced'} got mae score 0.24987
-[GridSearchQ]: checking hyperparams={'C': 0.0001, 'class_weight': None} got mae score 0.48135
-[GridSearchQ]: checking hyperparams={'C': 0.001, 'class_weight': 'balanced'} got mae score 0.24866
+[GridSearchQ]: starting model selection with self.n_jobs =-1
+[GridSearchQ]: hyperparams={'classifier__C': 0.01, 'nbins': 64}	 got mae score 0.04021 [took 1.1356s]
+[GridSearchQ]: hyperparams={'classifier__C': 0.01, 'nbins': 32}	 got mae score 0.04286 [took 1.2139s]
+[GridSearchQ]: hyperparams={'classifier__C': 0.01, 'nbins': 16}	 got mae score 0.04888 [took 1.2491s]
+[GridSearchQ]: hyperparams={'classifier__C': 0.001, 'nbins': 8}	 got mae score 0.05163 [took 1.5372s]
 [...]
-[GridSearchQ]: checking hyperparams={'C': 100000.0, 'class_weight': None} got mae score 0.43676
-[GridSearchQ]: optimization finished: best params {'C': 0.1, 'class_weight': 'balanced'} (score=0.19982)
+[GridSearchQ]: hyperparams={'classifier__C': 1000.0, 'nbins': 32}	 got mae score 0.02445 [took 2.9056s]
+[GridSearchQ]: optimization finished: best params {'classifier__C': 100.0, 'nbins': 32} (score=0.02234) [took 7.3114s]
 [GridSearchQ]: refitting on the whole development set
-model selection ended: best hyper-parameters={'C': 0.1, 'class_weight': 'balanced'}
-1010 evaluations will be performed for each combination of hyper-parameters
-[artificial sampling protocol] generating predictions: 100%|██████████| 1010/1010 [00:00<00:00, 5005.54it/s]
-MAE=0.20342
+model selection ended: best hyper-parameters={'classifier__C': 100.0, 'nbins': 32}
+MAE=0.03102
 ```

 The parameter _val_split_ can alternatively be used to indicate
@ -128,32 +138,13 @@ learner = GridSearchCV(
    LogisticRegression(),
    param_grid={'C': np.logspace(-4, 5, 10), 'class_weight': ['balanced', None]},
    cv=5)
-model = PCC(learner).fit(dataset.training)
-print(f'model selection ended: best hyper-parameters={model.learner.best_params_}')
+model = DistributionMatching(learner).fit(dataset.training)
 ```

-In this example, the system outputs:
-```
-model selection ended: best hyper-parameters={'C': 10000.0, 'class_weight': None}
-1010 evaluations will be performed for each combination of hyper-parameters
-[artificial sampling protocol] generating predictions: 100%|██████████| 1010/1010 [00:00<00:00, 5379.55it/s]
-MAE=0.41734
-```
-
-Note that the MAE is worse than the one we obtained when optimizing 
-for quantification and, indeed, the hyper-parameters found optimal
-largely differ between the two selection modalities. The 
-hyper-parameters C=10000 and class_weight=None have been found 
-to work well for the specific training prevalence of the HP dataset,
-but these hyper-parameters turned out to be suboptimal when the
-class prevalences of the test set differs (as is indeed tested
-in scenarios of quantification).
-
-This is, however, not always the case, and one could, in practice, 
-find examples
-in which optimizing for classification ends up resulting in a better
-quantifier than when optimizing for quantification. 
-Nonetheless, this is theoretically unlikely to happen.
+However, this is conceptually flawed, since the model should be
+optimized for the task at hand (quantification), and not for a surrogate task (classification),
+i.e., the model should be requested to deliver low quantification errors, rather
+than low classification errors.
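To make the idea behind the Model-Selection.md changes above more concrete, here is a minimal, QuaPy-free sketch of quantification-oriented model selection for a binary problem. It draws samples at prevalence values taken from a grid (in the spirit of APP) and picks the hyper-parameter with the lowest mean absolute error. All names (`app_samples`, `classify_and_count`, `mae_under_app`), the synthetic scores, and the use of a decision threshold as the only hyper-parameter are assumptions for illustration; none of this is the actual GridSearchQ implementation.

```python
import random


def app_samples(labels, sample_size=100, n_prevalences=11, repeats=5, seed=0):
    """Yield (indices, positive_prevalence) pairs over a grid of prevalence values."""
    rng = random.Random(seed)  # fixed seed: every candidate sees the same samples
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    for k in range(n_prevalences):
        n_pos = round(k / (n_prevalences - 1) * sample_size)
        for _ in range(repeats):
            # sample with replacement so that any prevalence can be reached
            idx = rng.choices(pos, k=n_pos) + rng.choices(neg, k=sample_size - n_pos)
            yield idx, n_pos / sample_size


def classify_and_count(scores, threshold):
    """Estimate the positive prevalence as the fraction of scores above threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def mae_under_app(scores, labels, threshold, **app_kwargs):
    """Mean absolute error of the prevalence estimate across all APP samples.

    In the binary case the absolute error on the positive class equals the
    absolute error averaged over both classes.
    """
    errors = []
    for idx, true_prev in app_samples(labels, **app_kwargs):
        estim_prev = classify_and_count([scores[i] for i in idx], threshold)
        errors.append(abs(true_prev - estim_prev))
    return sum(errors) / len(errors)


# synthetic validation set: positives tend to receive higher scores
rng = random.Random(42)
labels = [1] * 500 + [0] * 500
scores = [rng.gauss(0.65, 0.2) if y == 1 else rng.gauss(0.35, 0.2) for y in labels]

# grid search: keep the threshold with the lowest quantification error
grid = [0.3, 0.4, 0.5, 0.6, 0.7]
results = {t: mae_under_app(scores, labels, t) for t in grid}
best_threshold = min(results, key=results.get)
print(f'best threshold={best_threshold} (MAE={results[best_threshold]:.5f})')
```

The point of the sketch is that each candidate is scored on samples whose prevalence differs from that of the data it was drawn from, so the selected configuration is the one that quantifies well under prior probability shift rather than the one that classifies best at the original prevalence.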



--- a/docs/build/html/_sources/Plotting.md.txt
+++ b/docs/build/html/_sources/Plotting.md.txt
@ -43,7 +43,7 @@ quantification methods across different scenarios showcasing
 the accuracy of the quantifier in predicting class prevalences
 for a wide range of prior distributions. This can easily be
 achieved by means of the 
-[artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation)
+[artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols)
 that is implemented in QuaPy.

 The following code shows how to perform one simple experiment
@ -55,6 +55,7 @@ generating 100 random samples at each prevalence).

 ```python
 import quapy as qp
+from quapy.protocol import APP
 from quapy.method.aggregative import CC, ACC, PCC, PACC
 from sklearn.svm import LinearSVC

@ -63,28 +64,26 @@ qp.environ['SAMPLE_SIZE'] = 500
 def gen_data():

    def base_classifier():
-        return LinearSVC()
+        return LinearSVC(class_weight='balanced')

    def models():
-        yield CC(base_classifier())
-        yield ACC(base_classifier())
-        yield PCC(base_classifier())
-        yield PACC(base_classifier())
+        yield 'CC', CC(base_classifier())
+        yield 'ACC', ACC(base_classifier())
+        yield 'PCC', PCC(base_classifier())
+        yield 'PACC', PACC(base_classifier())

-    data = qp.datasets.fetch_reviews('kindle', tfidf=True, min_df=5)
+    train, test = qp.datasets.fetch_reviews('kindle', tfidf=True, min_df=5).train_test

    method_names, true_prevs, estim_prevs, tr_prevs = [], [], [], []

-    for model in models():
-        model.fit(data.training)
-        true_prev, estim_prev = qp.evaluation.artificial_sampling_prediction(
-            model, data.test, qp.environ['SAMPLE_SIZE'], n_repetitions=100, n_prevpoints=21
-        )
+    for method_name, model in models():
+        model.fit(train)
+        true_prev, estim_prev = qp.evaluation.prediction(model, APP(test, repeats=100, random_state=0))

-        method_names.append(model.__class__.__name__)
+        method_names.append(method_name)
        true_prevs.append(true_prev)
        estim_prevs.append(estim_prev)
-        tr_prevs.append(data.training.prevalence())
+        tr_prevs.append(train.prevalence())

    return method_names, true_prevs, estim_prevs, tr_prevs

@ -163,21 +162,19 @@ like this:

 ```python
 def gen_data():
-    data = qp.datasets.fetch_reviews('imdb', tfidf=True, min_df=5)
+
+    train, test = qp.datasets.fetch_reviews('imdb', tfidf=True, min_df=5).train_test
    model = CC(LinearSVC())

    method_data = []
    for training_prevalence in np.linspace(0.1, 0.9, 9):
        training_size = 5000
-        # since the problem is binary, it suffices to specify the negative prevalence (the positive is constrained)
-        training = data.training.sampling(training_size, 1 - training_prevalence)
-        model.fit(training)
-        true_prev, estim_prev = qp.evaluation.artificial_sampling_prediction(
-            model, data.sample, qp.environ['SAMPLE_SIZE'], n_repetitions=100, n_prevpoints=21
-        )
-        # method names can contain Latex syntax
+        # since the problem is binary, it suffices to specify the negative prevalence (the positive one is constrained)
+        train_sample = train.sampling(training_size, 1-training_prevalence)
+        model.fit(train_sample)
+        true_prev, estim_prev = qp.evaluation.prediction(model, APP(test, repeats=100, random_state=0))
        method_name = 'CC$_{'+f'{int(100*training_prevalence)}' + '\%}$'
-        method_data.append((method_name, true_prev, estim_prev, training.prevalence()))
+        method_data.append((method_name, true_prev, estim_prev, train_sample.prevalence()))

    return zip(*method_data)
 ```
--- a/docs/build/html/_sources/index.rst.txt
+++ b/docs/build/html/_sources/index.rst.txt
@ -64,6 +64,7 @@ Features
    * 32 UCI Machine Learning datasets.
    * 11 Twitter Sentiment datasets.
    * 3 Reviews Sentiment datasets.
+    * 4 tasks from the LeQua competition (*new in v0.1.7!*)
 * Native supports for binary and single-label scenarios of quantification.
 * Model selection functionality targeting quantification-oriented losses.
 * Visualization tools for analysing results.
@ -75,6 +76,7 @@ Features
   Installation
   Datasets
   Evaluation
+   Protocols
   Methods
   Model-Selection
   Plotting
--- a/docs/build/html/genindex.html
+++ b/docs/build/html/genindex.html
@ -56,6 +56,7 @@
 | <a href="#G"><strong>G</strong></a>
 | <a href="#H"><strong>H</strong></a>
 | <a href="#I"><strong>I</strong></a>
+ | <a href="#J"><strong>J</strong></a>
 | <a href="#K"><strong>K</strong></a>
 | <a href="#L"><strong>L</strong></a>
 | <a href="#M"><strong>M</strong></a>
@ -131,6 +132,8 @@
      <li><a href="quapy.method.html#quapy.method.aggregative.AggregativeQuantifier">AggregativeQuantifier (class in quapy.method.aggregative)</a>
 </li>
      <li><a href="quapy.html#quapy.protocol.APP">APP (class in quapy.protocol)</a>
+</li>
+      <li><a href="quapy.html#quapy.protocol.ArtificialPrevalenceProtocol">ArtificialPrevalenceProtocol (in module quapy.protocol)</a>
 </li>
      <li><a href="quapy.classification.html#quapy.classification.neural.TorchDataset.asDataloader">asDataloader() (quapy.classification.neural.TorchDataset method)</a>
 </li>
@ -284,10 +287,10 @@
 </li>
      <li><a href="quapy.method.html#quapy.method.aggregative.EMQ">EMQ (class in quapy.method.aggregative)</a>
 </li>
-  </ul></td>
-  <td style="width: 33%; vertical-align: top;"><ul>
      <li><a href="quapy.method.html#quapy.method.meta.Ensemble">Ensemble (class in quapy.method.meta)</a>
 </li>
+  </ul></td>
+  <td style="width: 33%; vertical-align: top;"><ul>
      <li><a href="quapy.method.html#quapy.method.meta.ensembleFactory">ensembleFactory() (in module quapy.method.meta)</a>
 </li>
      <li><a href="quapy.method.html#quapy.method.meta.EPACC">EPACC() (in module quapy.method.meta)</a>
@ -297,6 +300,8 @@
      <li><a href="quapy.html#quapy.plot.error_by_drift">error_by_drift() (in module quapy.plot)</a>
 </li>
      <li><a href="quapy.html#quapy.evaluation.evaluate">evaluate() (in module quapy.evaluation)</a>
+</li>
+      <li><a href="quapy.html#quapy.evaluation.evaluate_on_samples">evaluate_on_samples() (in module quapy.evaluation)</a>
 </li>
      <li><a href="quapy.html#quapy.evaluation.evaluation_report">evaluation_report() (in module quapy.evaluation)</a>
 </li>
@ -459,6 +464,16 @@
  </ul></td>
  <td style="width: 33%; vertical-align: top;"><ul>
      <li><a href="quapy.data.html#quapy.data.preprocessing.IndexTransformer">IndexTransformer (class in quapy.data.preprocessing)</a>
+</li>
+      <li><a href="quapy.html#quapy.protocol.IterateProtocol">IterateProtocol (class in quapy.protocol)</a>
+</li>
+  </ul></td>
+</tr></table>
+
+<h2 id="J">J</h2>
+<table style="width: 100%" class="indextable genindextable"><tr>
+  <td style="width: 33%; vertical-align: top;"><ul>
+      <li><a href="quapy.data.html#quapy.data.base.LabelledCollection.join">join() (quapy.data.base.LabelledCollection class method)</a>
 </li>
  </ul></td>
 </tr></table>
@ -521,8 +536,6 @@
      <li><a href="quapy.method.html#quapy.method.aggregative.MedianSweep">MedianSweep (in module quapy.method.aggregative)</a>
 </li>
      <li><a href="quapy.method.html#quapy.method.aggregative.MedianSweep2">MedianSweep2 (in module quapy.method.aggregative)</a>
-</li>
-      <li><a href="quapy.data.html#quapy.data.base.LabelledCollection.mix">mix() (quapy.data.base.LabelledCollection class method)</a>
 </li>
      <li><a href="quapy.html#quapy.error.mkld">mkld() (in module quapy.error)</a>
 </li>
@ -603,6 +616,8 @@
        <li><a href="quapy.data.html#quapy.data.base.LabelledCollection.n_classes">(quapy.data.base.LabelledCollection property)</a>
 </li>
      </ul></li>
+      <li><a href="quapy.html#quapy.protocol.NaturalPrevalenceProtocol">NaturalPrevalenceProtocol (in module quapy.protocol)</a>
+</li>
      <li><a href="quapy.classification.html#quapy.classification.calibration.NBVSCalibration">NBVSCalibration (class in quapy.classification.calibration)</a>
 </li>
      <li><a href="quapy.classification.html#quapy.classification.neural.NeuralClassifierTrainer">NeuralClassifierTrainer (class in quapy.classification.neural)</a>
@ -610,11 +625,11 @@
      <li><a href="quapy.method.html#quapy.method.aggregative.newELM">newELM() (in module quapy.method.aggregative)</a>
 </li>
      <li><a href="quapy.method.html#quapy.method.base.newOneVsAll">newOneVsAll() (in module quapy.method.base)</a>
-</li>
-      <li><a href="quapy.method.html#quapy.method.aggregative.newSVMAE">newSVMAE() (in module quapy.method.aggregative)</a>
 </li>
  </ul></td>
  <td style="width: 33%; vertical-align: top;"><ul>
+      <li><a href="quapy.method.html#quapy.method.aggregative.newSVMAE">newSVMAE() (in module quapy.method.aggregative)</a>
+</li>
      <li><a href="quapy.method.html#quapy.method.aggregative.newSVMKLD">newSVMKLD() (in module quapy.method.aggregative)</a>
 </li>
      <li><a href="quapy.method.html#quapy.method.aggregative.newSVMQ">newSVMQ() (in module quapy.method.aggregative)</a>
@ -914,6 +929,8 @@
      <li><a href="quapy.classification.html#quapy.classification.calibration.RecalibratedProbabilisticClassifier">RecalibratedProbabilisticClassifier (class in quapy.classification.calibration)</a>
 </li>
      <li><a href="quapy.classification.html#quapy.classification.calibration.RecalibratedProbabilisticClassifierBase">RecalibratedProbabilisticClassifierBase (class in quapy.classification.calibration)</a>
+</li>
+      <li><a href="quapy.data.html#quapy.data.base.Dataset.reduce">reduce() (quapy.data.base.Dataset method)</a>
 </li>
  </ul></td>
  <td style="width: 33%; vertical-align: top;"><ul>
@ -942,7 +959,7 @@
 </li>
        <li><a href="quapy.html#quapy.protocol.NPP.sample">(quapy.protocol.NPP method)</a>
 </li>
-        <li><a href="quapy.html#quapy.protocol.USimplexPP.sample">(quapy.protocol.USimplexPP method)</a>
+        <li><a href="quapy.html#quapy.protocol.UPP.sample">(quapy.protocol.UPP method)</a>
 </li>
      </ul></li>
      <li><a href="quapy.html#quapy.protocol.AbstractStochasticSeededProtocol.samples_parameters">samples_parameters() (quapy.protocol.AbstractStochasticSeededProtocol method)</a>
@ -954,7 +971,7 @@
 </li>
        <li><a href="quapy.html#quapy.protocol.NPP.samples_parameters">(quapy.protocol.NPP method)</a>
 </li>
-        <li><a href="quapy.html#quapy.protocol.USimplexPP.samples_parameters">(quapy.protocol.USimplexPP method)</a>
+        <li><a href="quapy.html#quapy.protocol.UPP.samples_parameters">(quapy.protocol.UPP method)</a>
 </li>
      </ul></li>
      <li><a href="quapy.data.html#quapy.data.base.LabelledCollection.sampling">sampling() (quapy.data.base.LabelledCollection method)</a>
@ -1033,10 +1050,12 @@
        <li><a href="quapy.html#quapy.protocol.APP.total">(quapy.protocol.APP method)</a>
 </li>
        <li><a href="quapy.html#quapy.protocol.DomainMixer.total">(quapy.protocol.DomainMixer method)</a>
+</li>
+        <li><a href="quapy.html#quapy.protocol.IterateProtocol.total">(quapy.protocol.IterateProtocol method)</a>
 </li>
        <li><a href="quapy.html#quapy.protocol.NPP.total">(quapy.protocol.NPP method)</a>
 </li>
-        <li><a href="quapy.html#quapy.protocol.USimplexPP.total">(quapy.protocol.USimplexPP method)</a>
+        <li><a href="quapy.html#quapy.protocol.UPP.total">(quapy.protocol.UPP method)</a>
 </li>
      </ul></li>
  </ul></td>
@ -1073,13 +1092,15 @@
 </li>
      <li><a href="quapy.data.html#quapy.data.base.LabelledCollection.uniform_sampling">uniform_sampling() (quapy.data.base.LabelledCollection method)</a>
 </li>
-  </ul></td>
-  <td style="width: 33%; vertical-align: top;"><ul>
      <li><a href="quapy.data.html#quapy.data.base.LabelledCollection.uniform_sampling_index">uniform_sampling_index() (quapy.data.base.LabelledCollection method)</a>
 </li>
+  </ul></td>
+  <td style="width: 33%; vertical-align: top;"><ul>
      <li><a href="quapy.html#quapy.functional.uniform_simplex_sampling">uniform_simplex_sampling() (in module quapy.functional)</a>
 </li>
-      <li><a href="quapy.html#quapy.protocol.USimplexPP">USimplexPP (class in quapy.protocol)</a>
+      <li><a href="quapy.html#quapy.protocol.UniformPrevalenceProtocol">UniformPrevalenceProtocol (in module quapy.protocol)</a>
+</li>
+      <li><a href="quapy.html#quapy.protocol.UPP">UPP (class in quapy.protocol)</a>
 </li>
  </ul></td>
 </tr></table>
--- a/docs/build/html/index.html
+++ b/docs/build/html/index.html
@ -102,6 +102,7 @@ See the <a class="reference internal" href="Evaluation.html"><span class="doc">E
 <li><p>32 UCI Machine Learning datasets.</p></li>
 <li><p>11 Twitter Sentiment datasets.</p></li>
 <li><p>3 Reviews Sentiment datasets.</p></li>
+<li><p>4 tasks from the LeQua competition (<em>new in v0.1.7!</em>)</p></li>
 </ul>
 </dd>
 </dl>
@ -130,6 +131,13 @@ See the <a class="reference internal" href="Evaluation.html"><span class="doc">E
 <li class="toctree-l2"><a class="reference internal" href="Evaluation.html#evaluation-protocols">Evaluation Protocols</a></li>
 </ul>
 </li>
+<li class="toctree-l1"><a class="reference internal" href="Protocols.html">Protocols</a><ul>
+<li class="toctree-l2"><a class="reference internal" href="Protocols.html#artificial-prevalence-protocol">Artificial-Prevalence Protocol</a></li>
+<li class="toctree-l2"><a class="reference internal" href="Protocols.html#sampling-from-the-unit-simplex-the-uniform-prevalence-protocol-upp">Sampling from the unit-simplex, the Uniform-Prevalence Protocol (UPP)</a></li>
+<li class="toctree-l2"><a class="reference internal" href="Protocols.html#natural-prevalence-protocol">Natural-Prevalence Protocol</a></li>
+<li class="toctree-l2"><a class="reference internal" href="Protocols.html#other-protocols">Other protocols</a></li>
+</ul>
+</li>
 <li class="toctree-l1"><a class="reference internal" href="Methods.html">Quantification Methods</a><ul>
 <li class="toctree-l2"><a class="reference internal" href="Methods.html#aggregative-methods">Aggregative Methods</a></li>
 <li class="toctree-l2"><a class="reference internal" href="Methods.html#meta-models">Meta Models</a></li>
--- a/docs/build/html/objects.inv
+++ b/docs/build/html/objects.inv
--- a/docs/build/html/quapy.data.html
+++ b/docs/build/html/quapy.data.html
@ -170,6 +170,23 @@ See <a class="reference internal" href="#quapy.data.base.LabelledCollection.load
 </dl>
 </dd></dl>

+<dl class="py method">
+<dt class="sig sig-object py" id="quapy.data.base.Dataset.reduce">
+<span class="sig-name descname"><span class="pre">reduce</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">n_train</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_test</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.data.base.Dataset.reduce" title="Permalink to this definition">¶</a></dt>
+<dd><p>Reduce the number of instances in place for quick experiments. Preserves the prevalence of each set.</p>
+<dl class="field-list simple">
+<dt class="field-odd">Parameters<span class="colon">:</span></dt>
+<dd class="field-odd"><ul class="simple">
+<li><p><strong>n_train</strong> – number of training documents to keep (default 100)</p></li>
+<li><p><strong>n_test</strong> – number of test documents to keep (default 100)</p></li>
+</ul>
+</dd>
+<dt class="field-even">Returns<span class="colon">:</span></dt>
+<dd class="field-even"><p>self</p>
+</dd>
+</dl>
+</dd></dl>
+
 <dl class="py method">
 <dt class="sig sig-object py" id="quapy.data.base.Dataset.stats">
 <span class="sig-name descname"><span class="pre">stats</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">show</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.data.base.Dataset.stats" title="Permalink to this definition">¶</a></dt>
@ -297,6 +314,20 @@ as listed by <cite>self.classes_</cite></p>
 </dl>
 </dd></dl>

+<dl class="py method">
+<dt class="sig sig-object py" id="quapy.data.base.LabelledCollection.join">
+<em class="property"><span class="pre">classmethod</span><span class="w"> </span></em><span class="sig-name descname"><span class="pre">join</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="o"><span class="pre">*</span></span><span class="n"><span class="pre">args</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Iterable</span><span class="p"><span class="pre">[</span></span><a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><span class="pre">LabelledCollection</span></a><span class="p"><span class="pre">]</span></span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.data.base.LabelledCollection.join" title="Permalink to this definition">¶</a></dt>
+<dd><p>Returns a new <a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">LabelledCollection</span></code></a> as the union of the collections given in input.</p>
+<dl class="field-list simple">
+<dt class="field-odd">Parameters<span class="colon">:</span></dt>
+<dd class="field-odd"><p><strong>args</strong> – instances of <a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">LabelledCollection</span></code></a></p>
+</dd>
+<dt class="field-even">Returns<span class="colon">:</span></dt>
+<dd class="field-even"><p>a <a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">LabelledCollection</span></code></a> representing the union of all the given collections</p>
+</dd>
+</dl>
+</dd></dl>
+
 <dl class="py method">
 <dt class="sig sig-object py" id="quapy.data.base.LabelledCollection.kFCV">
 <span class="sig-name descname"><span class="pre">kFCV</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">nfolds</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">5</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">nrepeats</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">1</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.data.base.LabelledCollection.kFCV" title="Permalink to this definition">¶</a></dt>
@ -338,23 +369,6 @@ these arguments are used to call <cite>loader_func(path, **loader_kwargs)</cite>
 </dl>
 </dd></dl>

-<dl class="py method">
-<dt class="sig sig-object py" id="quapy.data.base.LabelledCollection.mix">
-<em class="property"><span class="pre">classmethod</span><span class="w"> </span></em><span class="sig-name descname"><span class="pre">mix</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">a</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><span class="pre">LabelledCollection</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">b</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><span class="pre">LabelledCollection</span></a></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.data.base.LabelledCollection.mix" title="Permalink to this definition">¶</a></dt>
-<dd><p>Returns a new <a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">LabelledCollection</span></code></a> as the union of this collection with another collection.</p>
-<dl class="field-list simple">
-<dt class="field-odd">Parameters<span class="colon">:</span></dt>
-<dd class="field-odd"><ul class="simple">
-<li><p><strong>a</strong> – instance of <a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">LabelledCollection</span></code></a></p></li>
-<li><p><strong>b</strong> – instance of <a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">LabelledCollection</span></code></a></p></li>
-</ul>
-</dd>
-<dt class="field-even">Returns<span class="colon">:</span></dt>
-<dd class="field-even"><p>a <a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">LabelledCollection</span></code></a> representing the union of both collections</p>
-</dd>
-</dl>
-</dd></dl>
-
 <dl class="py property">
 <dt class="sig sig-object py" id="quapy.data.base.LabelledCollection.n_classes">
 <em class="property"><span class="pre">property</span><span class="w"> </span></em><span class="sig-name descname"><span class="pre">n_classes</span></span><a class="headerlink" href="#quapy.data.base.LabelledCollection.n_classes" title="Permalink to this definition">¶</a></dt>
--- a/docs/build/html/quapy.html
+++ b/docs/build/html/quapy.html
@ -481,18 +481,117 @@ will be taken from the environment variable <cite>SAMPLE_SIZE</cite> (which has
 <span id="quapy-evaluation"></span><h2>quapy.evaluation<a class="headerlink" href="#module-quapy.evaluation" title="Permalink to this heading">¶</a></h2>
 <dl class="py function">
 <dt class="sig sig-object py" id="quapy.evaluation.evaluate">
-<span class="sig-prename descclassname"><span class="pre">quapy.evaluation.</span></span><span class="sig-name descname"><span class="pre">evaluate</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">model</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><span class="pre">BaseQuantifier</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">protocol</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><span class="pre">AbstractProtocol</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">error_metric</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">str</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">Callable</span><span class="p"><span class="pre">]</span></span></span></em>, <em class="sig-param"><span class="n"><span class="pre">aggr_speedup</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'auto'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">verbose</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.evaluation.evaluate" title="Permalink to this definition">¶</a></dt>
-<dd></dd></dl>
+<span class="sig-prename descclassname"><span class="pre">quapy.evaluation.</span></span><span class="sig-name descname"><span class="pre">evaluate</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">model</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><span class="pre">BaseQuantifier</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">protocol</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><span class="pre">AbstractProtocol</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">error_metric</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">str</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">Callable</span><span class="p"><span class="pre">]</span></span></span></em>, <em class="sig-param"><span class="n"><span class="pre">aggr_speedup</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">str</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">bool</span><span class="p"><span class="pre">]</span></span></span><span class="w"> </span><span class="o"><span class="pre">=</span></span><span class="w"> </span><span class="default_value"><span class="pre">'auto'</span></span></em>, <em class="sig-param"><span class="n"><span 
class="pre">verbose</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.evaluation.evaluate" title="Permalink to this definition">¶</a></dt>
+<dd><p>Evaluates a quantification model according to a specific sample generation protocol and in terms of one
+evaluation metric (error).</p>
+<dl class="field-list simple">
+<dt class="field-odd">Parameters<span class="colon">:</span></dt>
+<dd class="field-odd"><ul class="simple">
+<li><p><strong>model</strong> – a quantifier, instance of <a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.method.base.BaseQuantifier</span></code></a></p></li>
+<li><p><strong>protocol</strong> – <a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.protocol.AbstractProtocol</span></code></a>; if this object is also an instance of
+<a class="reference internal" href="#quapy.protocol.OnLabelledCollectionProtocol" title="quapy.protocol.OnLabelledCollectionProtocol"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.protocol.OnLabelledCollectionProtocol</span></code></a>, then the aggregation speed-up can be run. This is the
+protocol in charge of generating the samples in which the model is evaluated.</p></li>
+<li><p><strong>error_metric</strong> – a string representing the name of an error function in <cite>qp.error</cite>
+(e.g., ‘mae’), or a callable function implementing the error function itself.</p></li>
+<li><p><strong>aggr_speedup</strong> – whether or not to apply the speed-up. Set to “force” for applying it even if the number of
+instances in the original collection on which the protocol acts is larger than the number of instances
+in the samples to be generated. Set to True or “auto” (default) for letting QuaPy decide whether it is
+convenient or not. Set to False to deactivate.</p></li>
+<li><p><strong>verbose</strong> – boolean, whether or not to show information in stdout</p></li>
+</ul>
+</dd>
+<dt class="field-even">Returns<span class="colon">:</span></dt>
+<dd class="field-even"><p>if the error metric is not averaged (e.g., ‘ae’, ‘rae’), returns an array of shape <cite>(n_samples,)</cite> with
+the error scores for each sample; if the error metric is averaged (e.g., ‘mae’, ‘mrae’) then returns
+a single float</p>
+</dd>
+</dl>
+</dd></dl>
+
+<dl class="py function">
+<dt class="sig sig-object py" id="quapy.evaluation.evaluate_on_samples">
+<span class="sig-prename descclassname"><span class="pre">quapy.evaluation.</span></span><span class="sig-name descname"><span class="pre">evaluate_on_samples</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">model</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><span class="pre">BaseQuantifier</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">samples</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Iterable</span><span class="p"><span class="pre">[</span></span><a class="reference internal" href="quapy.data.html#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><span class="pre">LabelledCollection</span></a><span class="p"><span class="pre">]</span></span></span></em>, <em class="sig-param"><span class="n"><span class="pre">error_metric</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">str</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">Callable</span><span class="p"><span class="pre">]</span></span></span></em>, <em class="sig-param"><span class="n"><span class="pre">verbose</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.evaluation.evaluate_on_samples" title="Permalink to this definition">¶</a></dt>
+<dd><p>Evaluates a quantification model on a given set of samples and in terms of one evaluation metric (error).</p>
+<dl class="field-list simple">
+<dt class="field-odd">Parameters<span class="colon">:</span></dt>
+<dd class="field-odd"><ul class="simple">
+<li><p><strong>model</strong> – a quantifier, instance of <a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.method.base.BaseQuantifier</span></code></a></p></li>
+<li><p><strong>samples</strong> – a list of samples on which the quantifier is to be evaluated</p></li>
+<li><p><strong>error_metric</strong> – a string representing the name of an error function in <cite>qp.error</cite>
+(e.g., ‘mae’), or a callable function implementing the error function itself.</p></li>
+<li><p><strong>verbose</strong> – boolean, whether or not to show information in stdout</p></li>
+</ul>
+</dd>
+<dt class="field-even">Returns<span class="colon">:</span></dt>
+<dd class="field-even"><p>if the error metric is not averaged (e.g., ‘ae’, ‘rae’), returns an array of shape <cite>(n_samples,)</cite> with
+the error scores for each sample; if the error metric is averaged (e.g., ‘mae’, ‘mrae’) then returns
+a single float</p>
+</dd>
+</dl>
+</dd></dl>

 <dl class="py function">
 <dt class="sig sig-object py" id="quapy.evaluation.evaluation_report">
-<span class="sig-prename descclassname"><span class="pre">quapy.evaluation.</span></span><span class="sig-name descname"><span class="pre">evaluation_report</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">model</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><span class="pre">BaseQuantifier</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">protocol</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><span class="pre">AbstractProtocol</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">error_metrics</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Iterable</span><span class="p"><span class="pre">[</span></span><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">str</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">Callable</span><span class="p"><span class="pre">]</span></span><span class="p"><span class="pre">]</span></span></span><span class="w"> </span><span class="o"><span class="pre">=</span></span><span class="w"> </span><span class="default_value"><span class="pre">'mae'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">aggr_speedup</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'auto'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">verbose</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span 
class="pre">False</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.evaluation.evaluation_report" title="Permalink to this definition">¶</a></dt>
-<dd></dd></dl>
+<span class="sig-prename descclassname"><span class="pre">quapy.evaluation.</span></span><span class="sig-name descname"><span class="pre">evaluation_report</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">model</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><span class="pre">BaseQuantifier</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">protocol</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><span class="pre">AbstractProtocol</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">error_metrics</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Iterable</span><span class="p"><span class="pre">[</span></span><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">str</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">Callable</span><span class="p"><span class="pre">]</span></span><span class="p"><span class="pre">]</span></span></span><span class="w"> </span><span class="o"><span class="pre">=</span></span><span class="w"> </span><span class="default_value"><span class="pre">'mae'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">aggr_speedup</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">str</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span 
class="pre">bool</span><span class="p"><span class="pre">]</span></span></span><span class="w"> </span><span class="o"><span class="pre">=</span></span><span class="w"> </span><span class="default_value"><span class="pre">'auto'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">verbose</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.evaluation.evaluation_report" title="Permalink to this definition">¶</a></dt>
+<dd><p>Generates a report (a pandas DataFrame) containing information on the evaluation of the model according
+to a specific protocol and in terms of one or more evaluation metrics (errors).</p>
+<dl class="field-list simple">
+<dt class="field-odd">Parameters<span class="colon">:</span></dt>
+<dd class="field-odd"><ul class="simple">
+<li><p><strong>model</strong> – a quantifier, instance of <a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.method.base.BaseQuantifier</span></code></a></p></li>
+<li><p><strong>protocol</strong> – <a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.protocol.AbstractProtocol</span></code></a>; if this object is also an instance of
+<a class="reference internal" href="#quapy.protocol.OnLabelledCollectionProtocol" title="quapy.protocol.OnLabelledCollectionProtocol"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.protocol.OnLabelledCollectionProtocol</span></code></a>, then the aggregation speed-up can be run. This is the protocol
+in charge of generating the samples in which the model is evaluated.</p></li>
+<li><p><strong>error_metrics</strong> – a string, or list of strings, representing the name(s) of an error function in <cite>qp.error</cite>
+(e.g., ‘mae’, the default value), or a callable function, or a list of callable functions, implementing
+the error function itself.</p></li>
+<li><p><strong>aggr_speedup</strong> – whether or not to apply the speed-up. Set to “force” for applying it even if the number of
+instances in the original collection on which the protocol acts is larger than the number of instances
+in the samples to be generated. Set to True or “auto” (default) for letting QuaPy decide whether it is
+convenient or not. Set to False to deactivate.</p></li>
+<li><p><strong>verbose</strong> – boolean, whether or not to show information in stdout</p></li>
+</ul>
+</dd>
+<dt class="field-even">Returns<span class="colon">:</span></dt>
+<dd class="field-even"><p>a pandas DataFrame containing the columns ‘true-prev’ (the true prevalence of each sample),
+‘estim-prev’ (the prevalence estimated by the model for each sample), and as many columns as error metrics
+have been indicated, each displaying the score in terms of that metric for every sample.</p>
+</dd>
+</dl>
+</dd></dl>

 <dl class="py function">
 <dt class="sig sig-object py" id="quapy.evaluation.prediction">
-<span class="sig-prename descclassname"><span class="pre">quapy.evaluation.</span></span><span class="sig-name descname"><span class="pre">prediction</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">model</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><span class="pre">BaseQuantifier</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">protocol</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><span class="pre">AbstractProtocol</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">aggr_speedup</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'auto'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">verbose</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.evaluation.prediction" title="Permalink to this definition">¶</a></dt>
-<dd></dd></dl>
+<span class="sig-prename descclassname"><span class="pre">quapy.evaluation.</span></span><span class="sig-name descname"><span class="pre">prediction</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">model</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><span class="pre">BaseQuantifier</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">protocol</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><span class="pre">AbstractProtocol</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">aggr_speedup</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">str</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">bool</span><span class="p"><span class="pre">]</span></span></span><span class="w"> </span><span class="o"><span class="pre">=</span></span><span class="w"> </span><span class="default_value"><span class="pre">'auto'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">verbose</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.evaluation.prediction" title="Permalink to this definition">¶</a></dt>
+<dd><p>Uses a quantification model to generate predictions for the samples generated via a specific protocol.
+This function is central to all evaluation processes, and is endowed with an optimization to speed up the
+prediction of protocols that generate samples from a large collection. The optimization applies to aggregative
+quantifiers only, and to OnLabelledCollectionProtocol protocols, and comes down to generating the classification
+predictions once and for all, and then generating samples over the classification predictions (instead of over
+the raw instances), so that the classifier prediction is never called again. This behaviour is obtained by
+setting <cite>aggr_speedup</cite> to ‘auto’ or True, and is only carried out if the overall process is convenient in terms
+of computations (e.g., if the number of classification predictions needed for the original collection exceeds the
+number of classification predictions needed for all samples, then the optimization is not undertaken).</p>
+<dl class="field-list simple">
+<dt class="field-odd">Parameters<span class="colon">:</span></dt>
+<dd class="field-odd"><ul class="simple">
+<li><p><strong>model</strong> – a quantifier, instance of <a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.method.base.BaseQuantifier</span></code></a></p></li>
+<li><p><strong>protocol</strong> – <a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.protocol.AbstractProtocol</span></code></a>; if this object is also an instance of
+<a class="reference internal" href="#quapy.protocol.OnLabelledCollectionProtocol" title="quapy.protocol.OnLabelledCollectionProtocol"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.protocol.OnLabelledCollectionProtocol</span></code></a>, then the aggregation speed-up can be run. This is the protocol
+in charge of generating the samples for which the model has to issue class prevalence predictions.</p></li>
+<li><p><strong>aggr_speedup</strong> – whether or not to apply the speed-up. Set to “force” for applying it even if the number of
+instances in the original collection on which the protocol acts is larger than the number of instances
+in the samples to be generated. Set to True or “auto” (default) for letting QuaPy decide whether it is
+convenient or not. Set to False to deactivate.</p></li>
+<li><p><strong>verbose</strong> – boolean, whether or not to show information in stdout</p></li>
+</ul>
+</dd>
+<dt class="field-even">Returns<span class="colon">:</span></dt>
+<dd class="field-even"><p>a tuple <cite>(true_prevs, estim_prevs)</cite> in which each element in the tuple is an array of shape
+<cite>(n_samples, n_classes)</cite> containing the true, or predicted, prevalence values for each sample</p>
+</dd>
+</dl>
+</dd></dl>

 </section>
 <section id="quapy-protocol">
@ -624,7 +723,21 @@ the sequence will be consistent every time the protocol is called.</p>
 <dl class="py method">
 <dt class="sig sig-object py" id="quapy.protocol.AbstractStochasticSeededProtocol.collator">
 <span class="sig-name descname"><span class="pre">collator</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">sample</span></span></em>, <em class="sig-param"><span class="o"><span class="pre">*</span></span><span class="n"><span class="pre">args</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.AbstractStochasticSeededProtocol.collator" title="Permalink to this definition">¶</a></dt>
-<dd></dd></dl>
+<dd><p>The collator prepares the sample to accommodate the desired output format before returning the output.
+This collator simply returns the sample as it is. Classes inheriting from this abstract class can
+implement their custom collators.</p>
+<dl class="field-list simple">
+<dt class="field-odd">Parameters<span class="colon">:</span></dt>
+<dd class="field-odd"><ul class="simple">
+<li><p><strong>sample</strong> – the sample to be returned</p></li>
+<li><p><strong>args</strong> – additional arguments</p></li>
+</ul>
+</dd>
+<dt class="field-even">Returns<span class="colon">:</span></dt>
+<dd class="field-even"><p>the sample adhering to a desired output format (in this case, the sample is returned as it is)</p>
+</dd>
+</dl>
+</dd></dl>

 <dl class="py property">
 <dt class="sig sig-object py" id="quapy.protocol.AbstractStochasticSeededProtocol.random_state">
@ -658,6 +771,12 @@ the sequence will be consistent every time the protocol is called.</p>

 </dd></dl>

+<dl class="py attribute">
+<dt class="sig sig-object py" id="quapy.protocol.ArtificialPrevalenceProtocol">
+<span class="sig-prename descclassname"><span class="pre">quapy.protocol.</span></span><span class="sig-name descname"><span class="pre">ArtificialPrevalenceProtocol</span></span><a class="headerlink" href="#quapy.protocol.ArtificialPrevalenceProtocol" title="Permalink to this definition">¶</a></dt>
+<dd><p>alias of <a class="reference internal" href="#quapy.protocol.APP" title="quapy.protocol.APP"><code class="xref py py-class docutils literal notranslate"><span class="pre">APP</span></code></a></p>
+</dd></dl>
+
 <dl class="py class">
 <dt class="sig sig-object py" id="quapy.protocol.DomainMixer">
 <em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.protocol.</span></span><span class="sig-name descname"><span class="pre">DomainMixer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">domainA</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.data.html#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><span class="pre">LabelledCollection</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">domainB</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.data.html#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><span class="pre">LabelledCollection</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">sample_size</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">repeats</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">1</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">prevalence</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">mixture_points</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">11</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">0</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">return_type</span></span><span class="o"><span 
class="pre">=</span></span><span class="default_value"><span class="pre">'sample_prev'</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.DomainMixer" title="Permalink to this definition">¶</a></dt>
@ -720,6 +839,29 @@ will be the same every time the protocol is called)</p></li>

 </dd></dl>

+<dl class="py class">
+<dt class="sig sig-object py" id="quapy.protocol.IterateProtocol">
+<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.protocol.</span></span><span class="sig-name descname"><span class="pre">IterateProtocol</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="pre">samples:</span> <span class="pre">[&lt;class</span> <span class="pre">'quapy.data.base.LabelledCollection'&gt;]</span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.IterateProtocol" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractProtocol</span></code></a></p>
+<p>A simple protocol that iterates over a list of previously generated samples.</p>
+<dl class="field-list simple">
+<dt class="field-odd">Parameters<span class="colon">:</span></dt>
+<dd class="field-odd"><p><strong>samples</strong> – a list of <a class="reference internal" href="quapy.data.html#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.data.base.LabelledCollection</span></code></a></p>
+</dd>
+</dl>
+<dl class="py method">
+<dt class="sig sig-object py" id="quapy.protocol.IterateProtocol.total">
+<span class="sig-name descname"><span class="pre">total</span></span><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.IterateProtocol.total" title="Permalink to this definition">¶</a></dt>
+<dd><p>Returns the number of samples in this protocol</p>
+<dl class="field-list simple">
+<dt class="field-odd">Returns<span class="colon">:</span></dt>
+<dd class="field-odd"><p>int</p>
+</dd>
+</dl>
+</dd></dl>
+
+</dd></dl>
+
 <dl class="py class">
 <dt class="sig sig-object py" id="quapy.protocol.NPP">
 <em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.protocol.</span></span><span class="sig-name descname"><span class="pre">NPP</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">data</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.data.html#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><span class="pre">LabelledCollection</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">sample_size</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">repeats</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">0</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">return_type</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'sample_prev'</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.NPP" title="Permalink to this definition">¶</a></dt>
@ -778,6 +920,12 @@ to “labelled_collection” to get instead instances of LabelledCollection</p><

 </dd></dl>

+<dl class="py attribute">
+<dt class="sig sig-object py" id="quapy.protocol.NaturalPrevalenceProtocol">
+<span class="sig-prename descclassname"><span class="pre">quapy.protocol.</span></span><span class="sig-name descname"><span class="pre">NaturalPrevalenceProtocol</span></span><a class="headerlink" href="#quapy.protocol.NaturalPrevalenceProtocol" title="Permalink to this definition">¶</a></dt>
+<dd><p>alias of <a class="reference internal" href="#quapy.protocol.NPP" title="quapy.protocol.NPP"><code class="xref py py-class docutils literal notranslate"><span class="pre">NPP</span></code></a></p>
+</dd></dl>
+
 <dl class="py class">
 <dt class="sig sig-object py" id="quapy.protocol.OnLabelledCollectionProtocol">
 <em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.protocol.</span></span><span class="sig-name descname"><span class="pre">OnLabelledCollectionProtocol</span></span><a class="headerlink" href="#quapy.protocol.OnLabelledCollectionProtocol" title="Permalink to this definition">¶</a></dt>
@ -785,7 +933,7 @@ to “labelled_collection” to get instead instances of LabelledCollection</p><
 <p>Protocols that generate samples from a <code class="xref py py-class docutils literal notranslate"><span class="pre">qp.data.LabelledCollection</span></code> object.</p>
 <dl class="py attribute">
 <dt class="sig sig-object py" id="quapy.protocol.OnLabelledCollectionProtocol.RETURN_TYPES">
-<span class="sig-name descname"><span class="pre">RETURN_TYPES</span></span><em class="property"><span class="w"> </span><span class="p"><span class="pre">=</span></span><span class="w"> </span><span class="pre">['sample_prev',</span> <span class="pre">'labelled_collection']</span></em><a class="headerlink" href="#quapy.protocol.OnLabelledCollectionProtocol.RETURN_TYPES" title="Permalink to this definition">¶</a></dt>
+<span class="sig-name descname"><span class="pre">RETURN_TYPES</span></span><em class="property"><span class="w"> </span><span class="p"><span class="pre">=</span></span><span class="w"> </span><span class="pre">['sample_prev',</span> <span class="pre">'labelled_collection',</span> <span class="pre">'index']</span></em><a class="headerlink" href="#quapy.protocol.OnLabelledCollectionProtocol.RETURN_TYPES" title="Permalink to this definition">¶</a></dt>
 <dd></dd></dl>

 <dl class="py method">
@ -841,8 +989,8 @@ with shape <cite>(n_instances,)</cite> when the classifier is a hard one, or wit
 </dd></dl>

 <dl class="py class">
-<dt class="sig sig-object py" id="quapy.protocol.USimplexPP">
-<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.protocol.</span></span><span class="sig-name descname"><span class="pre">USimplexPP</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">data</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.data.html#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><span class="pre">LabelledCollection</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">sample_size</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">repeats</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">0</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">return_type</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'sample_prev'</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.USimplexPP" title="Permalink to this definition">¶</a></dt>
+<dt class="sig sig-object py" id="quapy.protocol.UPP">
+<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.protocol.</span></span><span class="sig-name descname"><span class="pre">UPP</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">data</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="quapy.data.html#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><span class="pre">LabelledCollection</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">sample_size</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">repeats</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">0</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">return_type</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'sample_prev'</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.UPP" title="Permalink to this definition">¶</a></dt>
 <dd><p>Bases: <a class="reference internal" href="#quapy.protocol.AbstractStochasticSeededProtocol" title="quapy.protocol.AbstractStochasticSeededProtocol"><code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractStochasticSeededProtocol</span></code></a>, <a class="reference internal" href="#quapy.protocol.OnLabelledCollectionProtocol" title="quapy.protocol.OnLabelledCollectionProtocol"><code class="xref py py-class docutils literal notranslate"><span class="pre">OnLabelledCollectionProtocol</span></code></a></p>
 <p>A variant of <a class="reference internal" href="#quapy.protocol.APP" title="quapy.protocol.APP"><code class="xref py py-class docutils literal notranslate"><span class="pre">APP</span></code></a> that, instead of using a grid of equidistant prevalence values,
 relies on the Kraemer algorithm for sampling the unit (k-1)-simplex uniformly at random, with
@ -865,8 +1013,8 @@ to “labelled_collection” to get instead instances of LabelledCollection</p><
 </dd>
 </dl>
 <dl class="py method">
-<dt class="sig sig-object py" id="quapy.protocol.USimplexPP.sample">
-<span class="sig-name descname"><span class="pre">sample</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">index</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.USimplexPP.sample" title="Permalink to this definition">¶</a></dt>
+<dt class="sig sig-object py" id="quapy.protocol.UPP.sample">
+<span class="sig-name descname"><span class="pre">sample</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">index</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.UPP.sample" title="Permalink to this definition">¶</a></dt>
 <dd><p>Realizes the sample given the index of the instances.</p>
 <dl class="field-list simple">
 <dt class="field-odd">Parameters<span class="colon">:</span></dt>
@ -879,19 +1027,19 @@ to “labelled_collection” to get instead instances of LabelledCollection</p><
 </dd></dl>

 <dl class="py method">
-<dt class="sig sig-object py" id="quapy.protocol.USimplexPP.samples_parameters">
-<span class="sig-name descname"><span class="pre">samples_parameters</span></span><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.USimplexPP.samples_parameters" title="Permalink to this definition">¶</a></dt>
-<dd><p>Return all the necessary parameters to replicate the samples as according to the USimplexPP protocol.</p>
+<dt class="sig sig-object py" id="quapy.protocol.UPP.samples_parameters">
+<span class="sig-name descname"><span class="pre">samples_parameters</span></span><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.UPP.samples_parameters" title="Permalink to this definition">¶</a></dt>
+<dd><p>Returns all the parameters necessary to replicate the samples according to the UPP protocol.</p>
 <dl class="field-list simple">
 <dt class="field-odd">Returns<span class="colon">:</span></dt>
-<dd class="field-odd"><p>a list of indexes that realize the USimplexPP sampling</p>
+<dd class="field-odd"><p>a list of indexes that realize the UPP sampling</p>
 </dd>
 </dl>
 </dd></dl>

 <dl class="py method">
-<dt class="sig sig-object py" id="quapy.protocol.USimplexPP.total">
-<span class="sig-name descname"><span class="pre">total</span></span><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.USimplexPP.total" title="Permalink to this definition">¶</a></dt>
+<dt class="sig sig-object py" id="quapy.protocol.UPP.total">
+<span class="sig-name descname"><span class="pre">total</span></span><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#quapy.protocol.UPP.total" title="Permalink to this definition">¶</a></dt>
 <dd><p>Returns the number of samples that will be generated (equal to “repeats”)</p>
 <dl class="field-list simple">
 <dt class="field-odd">Returns<span class="colon">:</span></dt>
@ -902,6 +1050,12 @@ to “labelled_collection” to get instead instances of LabelledCollection</p><

 </dd></dl>

+<dl class="py attribute">
+<dt class="sig sig-object py" id="quapy.protocol.UniformPrevalenceProtocol">
+<span class="sig-prename descclassname"><span class="pre">quapy.protocol.</span></span><span class="sig-name descname"><span class="pre">UniformPrevalenceProtocol</span></span><a class="headerlink" href="#quapy.protocol.UniformPrevalenceProtocol" title="Permalink to this definition">¶</a></dt>
+<dd><p>alias of <a class="reference internal" href="#quapy.protocol.UPP" title="quapy.protocol.UPP"><code class="xref py py-class docutils literal notranslate"><span class="pre">UPP</span></code></a></p>
+</dd></dl>
+
 </section>
 <section id="module-quapy.functional">
 <span id="quapy-functional"></span><h2>quapy.functional<a class="headerlink" href="#module-quapy.functional" title="Permalink to this heading">¶</a></h2>
@ -1175,9 +1329,9 @@ protocol for quantification.</p>
 <dd class="field-odd"><ul class="simple">
 <li><p><strong>model</strong> (<a class="reference internal" href="quapy.method.html#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><em>BaseQuantifier</em></a>) – the quantifier to optimize</p></li>
 <li><p><strong>param_grid</strong> – a dictionary with keys the parameter names and values the list of values to explore</p></li>
-<li><p><strong>protocol</strong> – </p></li>
+<li><p><strong>protocol</strong> – a sample generation protocol, an instance of <a class="reference internal" href="#quapy.protocol.AbstractProtocol" title="quapy.protocol.AbstractProtocol"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.protocol.AbstractProtocol</span></code></a></p></li>
 <li><p><strong>error</strong> – an error function (callable) or a string indicating the name of an error function (valid ones
-are those in qp.error.QUANTIFICATION_ERROR</p></li>
+are those in <code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.error.QUANTIFICATION_ERROR</span></code>)</p></li>
 <li><p><strong>refit</strong> – whether or not to refit the model on the whole labelled collection (training+validation) with
 the best chosen hyperparameter combination. Ignored if protocol=’gen’</p></li>
 <li><p><strong>timeout</strong> – establishes a timer (in seconds) for each of the hyperparameter configurations being tested.
--- a/docs/build/html/quapy.method.html
+++ b/docs/build/html/quapy.method.html
@ -1718,7 +1718,7 @@ registered hooks while the latter silently ignores them.</p>

 <dl class="py class">
 <dt class="sig sig-object py" id="quapy.method.neural.QuaNetTrainer">
-<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.neural.</span></span><span class="sig-name descname"><span class="pre">QuaNetTrainer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">classifier</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">sample_size</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_epochs</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">tr_iter_per_poch</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">500</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">va_iter_per_poch</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">lr</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">0.001</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">lstm_hidden_size</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">64</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">lstm_nlayers</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">1</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">ff_layers</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">[1024,</span> <span class="pre">512]</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">bidirectional</span></span><span class="o"><span 
class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">qdrop_p</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">0.5</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">patience</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">10</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">checkpointdir</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'../checkpoint'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">checkpointname</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">device</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'cuda'</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.method.neural.QuaNetTrainer" title="Permalink to this definition">¶</a></dt>
+<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.neural.</span></span><span class="sig-name descname"><span class="pre">QuaNetTrainer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">classifier</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">sample_size</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_epochs</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">tr_iter_per_poch</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">500</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">va_iter_per_poch</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">lr</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">0.001</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">lstm_hidden_size</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">64</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">lstm_nlayers</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">1</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">ff_layers</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">[1024,</span> <span 
class="pre">512]</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">bidirectional</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">qdrop_p</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">0.5</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">patience</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">10</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">checkpointdir</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'../checkpoint'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">checkpointname</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">device</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'cuda'</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#quapy.method.neural.QuaNetTrainer" title="Permalink to this definition">¶</a></dt>
 <dd><p>Bases: <a class="reference internal" href="#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><code class="xref py py-class docutils literal notranslate"><span class="pre">BaseQuantifier</span></code></a></p>
 <p>Implementation of <a class="reference external" href="https://dl.acm.org/doi/abs/10.1145/3269206.3269287">QuaNet</a>, a neural network for
 quantification. This implementation uses <a class="reference external" href="https://pytorch.org/">PyTorch</a> and can take advantage of GPU
@ -1751,7 +1751,8 @@ for speeding-up the training phase.</p>
 <li><p><strong>classifier</strong> – an object implementing <cite>fit</cite> (i.e., that can be trained on labelled data),
 <cite>predict_proba</cite> (i.e., that can generate posterior probabilities of unlabelled examples) and
 <cite>transform</cite> (i.e., that can generate embedded representations of the unlabelled instances).</p></li>
-<li><p><strong>sample_size</strong> – integer, the sample size</p></li>
+<li><p><strong>sample_size</strong> – integer, the sample size; default is None, meaning that the sample size should be
+taken from qp.environ[“SAMPLE_SIZE”]</p></li>
 <li><p><strong>n_epochs</strong> – integer, maximum number of training epochs</p></li>
 <li><p><strong>tr_iter_per_poch</strong> – integer, number of training iterations before considering an epoch complete</p></li>
 <li><p><strong>va_iter_per_poch</strong> – integer, number of validation iterations to perform after each epoch</p></li>
--- a/docs/build/html/searchindex.js
+++ b/docs/build/html/searchindex.js
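
The UPP documentation above says the protocol draws prevalence vectors uniformly at random from the unit (k-1)-simplex using the Kraemer algorithm. The idea can be sketched in plain Python as follows. This is only an illustrative sketch, not QuaPy's implementation: the function name `uniform_simplex_sample` and the use of the standard `random` module are assumptions made here for the example.

```python
import random


def uniform_simplex_sample(n_classes, rng):
    """Draw one prevalence vector uniformly at random from the unit (n_classes-1)-simplex.

    Hypothetical helper written for this sketch; it is not part of QuaPy.
    """
    if n_classes < 2:
        raise ValueError("at least two classes are required")
    # draw n_classes-1 cut points in [0, 1) and sort them
    cuts = sorted(rng.random() for _ in range(n_classes - 1))
    # the gaps between consecutive cut points (with 0 and 1 as borders)
    # are the prevalence values; they are non-negative and sum to 1
    borders = [0.0] + cuts + [1.0]
    return [hi - lo for lo, hi in zip(borders[:-1], borders[1:])]


# a fixed seed makes the series of prevalence vectors replicable across runs,
# which mirrors the role of random_state in the seeded protocols
rng = random.Random(0)
prevalences = [uniform_simplex_sample(3, rng) for _ in range(5)]
```

Each vector in `prevalences` could then be used as the target class distribution of one sample, so that `repeats` vectors yield `repeats` samples.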
--- a/examples/model_selection.py
+++ b/examples/model_selection.py
@ -0,0 +1,57 @@
+import quapy as qp
+from quapy.protocol import APP
+from quapy.method.aggregative import DistributionMatching
+from sklearn.linear_model import LogisticRegression
+import numpy as np
+
+"""
+In this example, we show how to perform model selection on a DistributionMatching quantifier.
+"""
+
+model = DistributionMatching(LogisticRegression())
+
+qp.environ['SAMPLE_SIZE'] = 100
+qp.environ['N_JOBS'] = -1
+
+training, test = qp.datasets.fetch_reviews('imdb', tfidf=True, min_df=5).train_test
+
+# The model will be returned by the fit method of GridSearchQ.
+# Every combination of hyper-parameters will be evaluated by testing the
+# quantifier thus configured against a series of samples generated by means
+# of a sample generation protocol. For this example, we will use the
+# artificial-prevalence protocol (APP), which generates samples with prevalence
+# values taken from a grid covering the entire range (e.g., [0, 0.1, 0.2, ..., 1]).
+# We devote 30% of the training set to this exploration.
+training, validation = training.split_stratified(train_prop=0.7)
+protocol = APP(validation)
+
+# We will explore a classifier-dependent hyper-parameter (e.g., the 'C'
+# hyper-parameter of LogisticRegression) and a quantification-dependent hyper-parameter
+# (e.g., the number of bins in a DistributionMatching quantifier).
+# Classifier-dependent hyper-parameters have to be marked with the prefix "classifier__"
+# to let the quantifier know that the hyper-parameter belongs to its underlying
+# classifier.
+param_grid = {
+    'classifier__C': np.logspace(-3,3,7),
+    'nbins': [8, 16, 32, 64],
+}
+
+model = qp.model_selection.GridSearchQ(
+    model=model,
+    param_grid=param_grid,
+    protocol=protocol,
+    error='mae',  # the error to optimize is the MAE (a quantification-oriented loss)
+    refit=True,   # retrain on the whole labelled set once done
+    verbose=True  # show information as the process goes on
+).fit(training)
+
+print(f'model selection ended: best hyper-parameters={model.best_params_}')
+model = model.best_model_
+
+# evaluation in terms of MAE
+# we use the same evaluation protocol (APP) on the test set
+mae_score = qp.evaluation.evaluate(model, protocol=APP(test), error_metric='mae')
+
+print(f'MAE={mae_score:.5f}')
+
+
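
The example above explores a grid that mixes a classifier-level parameter (prefixed with "classifier__") with a quantifier-level one. How such a grid unfolds into individual configurations can be sketched as below. The helpers `expand_grid` and `split_params` are hypothetical names introduced for this illustration; the way GridSearchQ handles the grid internally may differ.

```python
import itertools

# same shape as the grid in the example: 7 values for C (10^-3 ... 10^3), 4 for nbins
param_grid = {
    'classifier__C': [10.0 ** e for e in range(-3, 4)],
    'nbins': [8, 16, 32, 64],
}


def expand_grid(grid):
    """Return one dict per combination of hyper-parameter values (hypothetical helper)."""
    names = list(grid)
    value_lists = (grid[name] for name in names)
    return [dict(zip(names, values)) for values in itertools.product(*value_lists)]


def split_params(config, prefix='classifier__'):
    """Separate classifier-level parameters (prefix removed) from quantifier-level ones."""
    classifier_params = {k[len(prefix):]: v for k, v in config.items() if k.startswith(prefix)}
    quantifier_params = {k: v for k, v in config.items() if not k.startswith(prefix)}
    return classifier_params, quantifier_params


configs = expand_grid(param_grid)  # 7 x 4 = 28 configurations
classifier_params, quantifier_params = split_params(configs[0])
```

With 28 configurations, each evaluated on every sample the protocol generates, the cost of the search grows quickly, which is why running the exploration in parallel (N_JOBS) matters.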
--- a/quapy/CHANGE_LOG.txt
+++ b/quapy/CHANGE_LOG.txt
@ -1,22 +1,16 @@
 Change Log 0.1.7
---------------------
+----------------

 - Protocols are now abstracted as instances of AbstractProtocol. There is a new class extending AbstractProtocol called
    AbstractStochasticSeededProtocol, which implements a seeding policy that allows the series of samplings to be replicated.
    There are some examples of protocols, APP, NPP, UPP, DomainMixer (experimental).
-    The idea is to start the sampling by simply calling the __call__ method.
+    The idea is to start the sample generation by simply calling the __call__ method.
    This change has a great impact on the framework, since many functions in qp.evaluation, qp.model_selection,
    and sampling functions in LabelledCollection relied on the old functions. E.g., the functionality of
    qp.evaluation.artificial_prevalence_report or qp.evaluation.natural_prevalence_report is now obtained by means of
    qp.evaluation.report which takes a protocol as an argument. I have not maintained compatibility with the old
    interfaces because I did not really like them. Check the wiki guide and the examples for more details.

-    check guides
-
-    check examples
-
- ACC, PACC, Forman's threshold variants have been parallelized.
-
 - Exploration of hyperparameters in model selection can now be run in parallel (there was an n_jobs argument in
    QuaPy 0.1.6 but only the evaluation part for one specific hyperparameter was run in parallel).

@ -26,17 +20,19 @@ Change Log 0.1.7
    procedure. The user can now specify "force", "auto", True or False, in order to actively decide whether to apply it
    or not.

- n_jobs is now taken from the environment if set to None
-
 - examples directory created!

- cross_val_predict (for quantification) added to model_selection: would be nice to allow the user specifies a
-    test protocol maybe, or None for bypassing it?
-
 - DyS, Topsoe distance and binary search (thanks to Pablo González)

 - Multi-thread reproducibility via seeding (thanks to Pablo González)

+- n_jobs is now taken from the environment if set to None
+
+- ACC, PACC, Forman's threshold variants have been parallelized.
+
+- cross_val_predict (for quantification) added to model_selection: maybe the user should be allowed to specify a
+    test protocol, or None to bypass it?
+
 - Bugfix: adding two labelled collections (with +) now checks for consistency in the classes

 - newer versions of numpy raise a warning when accessing types (e.g., np.float). I have replaced all such instances
--- a/quapy/data/base.py
+++ b/quapy/data/base.py
@ -1,4 +1,6 @@
+import itertools
 from functools import cached_property
+from typing import Iterable

 import numpy as np
 from scipy.sparse import issparse
@ -129,11 +131,23 @@ class LabelledCollection:
        # <= size * prevs[i]) examples are drawn from class i, there could be a remainder number of instances to take
        # to satisfy the size constraint. The remainder is distributed among the classes with probability = prevs.
        # (This aims at avoiding the remainder to be placed in a class for which the prevalence requested is 0.)
-        n_requests = {class_: int(size * prevs[i]) for i, class_ in enumerate(self.classes_)}
+        n_requests = {class_: round(size * prevs[i]) for i, class_ in enumerate(self.classes_)}
        remainder = size - sum(n_requests.values())
        with temp_seed(random_state):
+            # due to rounding, the remainder can be 0, >0, or <0
+            if remainder > 0:
+                # when the remainder is >0 we randomly add 1 to the requests of some classes;
+                # more prevalent classes are more likely to be chosen, in order to minimize the impact on the final prevalence
                for rand_class in np.random.choice(self.classes_, size=remainder, p=prevs):
                    n_requests[rand_class] += 1
+            elif remainder < 0:
+                # when the remainder is <0 we randomly remove 1 from the requests, unless the request is 0 for a chosen
+                # class; we repeat until remainder==0
+                while remainder!=0:
+                    rand_class = np.random.choice(self.classes_, p=prevs)
+                    if n_requests[rand_class] > 0:
+                        n_requests[rand_class] -= 1
+                        remainder += 1

            indexes_sample = []
            for class_, n_requested in n_requests.items():
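
The hunk above replaces int() with round() when computing the per-class requests, so the remainder can now be negative as well as positive. The bookkeeping can be sketched on its own in plain Python as below. This is a simplified illustration under stated assumptions, not the code of LabelledCollection: the function name `requests_per_class` is hypothetical, and `random.Random.choices` stands in for `np.random.choice`.

```python
import random


def requests_per_class(size, prevs, rng):
    """Number of instances to draw per class so that the total is exactly `size`.

    Hypothetical stand-alone version of the logic in the hunk above.
    """
    n_requests = [round(size * p) for p in prevs]
    remainder = size - sum(n_requests)
    classes = list(range(len(prevs)))
    # rounding left us short: add one instance at a time, favouring prevalent classes
    while remainder > 0:
        chosen = rng.choices(classes, weights=prevs)[0]
        n_requests[chosen] += 1
        remainder -= 1
    # rounding overshot: remove one instance at a time, never going below zero
    while remainder < 0:
        chosen = rng.choices(classes, weights=prevs)[0]
        if n_requests[chosen] > 0:
            n_requests[chosen] -= 1
            remainder += 1
    return n_requests


# round() gives 2 + 2 + 5 = 9 here (Python rounds 2.5 to 2), so one instance is added back
counts = requests_per_class(10, [0.25, 0.25, 0.5], random.Random(0))
```

Because classes are picked with probability proportional to `prevs`, a class whose requested prevalence is 0 is never chosen, which is the property the original comment aims for.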
@ -266,31 +280,47 @@ class LabelledCollection:
        if not all(np.sort(self.classes_)==np.sort(other.classes_)):
            raise NotImplementedError(f'unsupported operation for collections on different classes; '
                                      f'expected {self.classes_}, found {other.classes_}')
-        return LabelledCollection.mix(self, other)
+        return LabelledCollection.join(self, other)

    @classmethod
-    def mix(cls, a:'LabelledCollection', b:'LabelledCollection'):
+    def join(cls, *args: 'LabelledCollection'):
        """
-        Returns a new :class:`LabelledCollection` as the union of this collection with another collection.
+        Returns a new :class:`LabelledCollection` as the union of the collections given in input.

-        :param a: instance of :class:`LabelledCollection`
-        :param b: instance of :class:`LabelledCollection`
+        :param args: instances of :class:`LabelledCollection`
        :return: a :class:`LabelledCollection` representing the union of all the given collections
        """
-        if a is None: return b
-        if b is None: return a
-        elif issparse(a.instances) and issparse(b.instances):
-            join_instances = vstack([a.instances, b.instances])
-        elif isinstance(a.instances, list) and isinstance(b.instances, list):
-            join_instances = a.instances + b.instances
-        elif isinstance(a.instances, np.ndarray) and isinstance(b.instances, np.ndarray):
-            join_instances = np.concatenate([a.instances, b.instances])
+
+        args = [lc for lc in args if lc is not None]
+        assert len(args) > 0, 'empty list is not allowed for join'
+
+        assert all([isinstance(lc, LabelledCollection) for lc in args]), \
+            'only instances of LabelledCollection allowed'
+
+        first_instances = args[0].instances
+        first_type = type(first_instances)
+        assert all([type(lc.instances)==first_type for lc in args[1:]]), \
+            'not all the collections are of instances of the same type'
+
+        if issparse(first_instances) or isinstance(first_instances, np.ndarray):
+            first_ndim = first_instances.ndim
+            assert all([lc.instances.ndim == first_ndim for lc in args[1:]]), \
+                'not all the ndarrays are of the same dimension'
+            if first_ndim > 1:
+                first_shape = first_instances.shape[1:]
+                assert all([lc.instances.shape[1:] == first_shape for lc in args[1:]]), \
+                    'not all the ndarrays are of the same shape'
+            if issparse(first_instances):
+                instances = vstack([lc.instances for lc in args])
+            else:
+                instances = np.concatenate([lc.instances for lc in args])
+        elif isinstance(first_instances, list):
+            instances = list(itertools.chain.from_iterable(lc.instances for lc in args))
        else:
            raise NotImplementedError('unsupported operation for collection types')
-        labels = np.concatenate([a.labels, b.labels])
-        classes = np.unique(np.concatenate([a.classes_, b.classes_])).sort()
-        return LabelledCollection(join_instances, labels, classes=classes)
-
+        labels = np.concatenate([lc.labels for lc in args])
+        classes = np.unique(labels)  # already sorted; ndarray.sort() sorts in place and returns None
+        return LabelledCollection(instances, labels, classes=classes)

    @property
    def Xy(self):
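
Note on the sampling hunk above: the rounding-then-adjust logic can be exercised in isolation. The following is a minimal sketch, not QuaPy code; `allocate_requests` is a hypothetical helper, it uses class positions instead of class names, and it takes an explicit NumPy `Generator` instead of `temp_seed`.

```python
import numpy as np

def allocate_requests(size, prevs, rng):
    """Hypothetical helper: split `size` draws across classes according to `prevs`."""
    prevs = np.asarray(prevs, dtype=float)
    # round (rather than truncate) the per-class requests
    requests = np.round(size * prevs).astype(int)
    # due to rounding, the remainder can be 0, >0, or <0
    remainder = int(size - requests.sum())
    if remainder > 0:
        # add the missing draws, favouring the more prevalent classes
        for c in rng.choice(len(prevs), size=remainder, p=prevs):
            requests[c] += 1
    while remainder < 0:
        # remove surplus draws one at a time, never going below zero
        c = rng.choice(len(prevs), p=prevs)
        if requests[c] > 0:
            requests[c] -= 1
            remainder += 1
    return requests

rng = np.random.default_rng(0)
print(allocate_requests(3, [0.5, 0.5, 0.0], rng))
```

Because classes are drawn with probability `prevs`, a class whose requested prevalence is 0 is never selected in either branch, so its request stays at 0.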
--- a/quapy/evaluation.py
+++ b/quapy/evaluation.py
@ -16,7 +16,7 @@ def prediction(
    Uses a quantification model to generate predictions for the samples generated via a specific protocol.
    This function is central to all evaluation processes, and is endowed with an optimization to speed-up the
    prediction of protocols that generate samples from a large collection. The optimization applies to aggregative
-    quantifiers only, and to OnLabelledCollection protocols, and comes down to generating the classification
+    quantifiers only, and to OnLabelledCollectionProtocol protocols, and comes down to generating the classification
    predictions once and for all, and then generating samples over the classification predictions (instead of over
    the raw instances), so that the classifier prediction is never called again. This behaviour is obtained by
    setting `aggr_speedup` to 'auto' or True, and is only carried out if the overall process is convenient in terms
@ -25,7 +25,7 @@ def prediction(

    :param model: a quantifier, instance of :class:`quapy.method.base.BaseQuantifier`
    :param protocol: :class:`quapy.protocol.AbstractProtocol`; if this object is also instance of
-        :class:`quapy.protocol.OnLabelledCollection`, then the aggregation speed-up can be run. This is the protocol
+        :class:`quapy.protocol.OnLabelledCollectionProtocol`, then the aggregation speed-up can be run. This is the protocol
        in charge of generating the samples for which the model has to issue class prevalence predictions.
    :param aggr_speedup: whether or not to apply the speed-up. Set to "force" for applying it even if the number of
        instances in the original collection on which the protocol acts is larger than the number of instances
@ -90,7 +90,7 @@ def evaluation_report(model: BaseQuantifier,

    :param model: a quantifier, instance of :class:`quapy.method.base.BaseQuantifier`
    :param protocol: :class:`quapy.protocol.AbstractProtocol`; if this object is also instance of
-        :class:`quapy.protocol.OnLabelledCollection`, then the aggregation speed-up can be run. This is the protocol
+        :class:`quapy.protocol.OnLabelledCollectionProtocol`, then the aggregation speed-up can be run. This is the protocol
        in charge of generating the samples in which the model is evaluated.
    :param error_metrics: a string, or list of strings, representing the name(s) of an error function in `qp.error`
        (e.g., 'mae', the default value), or a callable function, or a list of callable functions, implementing
@ -141,8 +141,8 @@ def evaluate(

    :param model: a quantifier, instance of :class:`quapy.method.base.BaseQuantifier`
    :param protocol: :class:`quapy.protocol.AbstractProtocol`; if this object is also instance of
-        :class:`quapy.protocol.OnLabelledCollection`, then the aggregation speed-up can be run. This is the protocol
-        in charge of generating the samples in which the model is evaluated.
+        :class:`quapy.protocol.OnLabelledCollectionProtocol`, then the aggregation speed-up can be run. This is the
+        protocol in charge of generating the samples in which the model is evaluated.
    :param error_metric: a string representing the name(s) of an error function in `qp.error`
        (e.g., 'mae'), or a callable function implementing the error function itself.
    :param aggr_speedup: whether or not to apply the speed-up. Set to "force" for applying it even if the number of
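
Note on the speed-up described in the `prediction` docstring: the idea of classifying once and then sampling over the classification predictions can be sketched with plain NumPy. This is an illustration under stated assumptions, not the library's implementation; `classify` and `classify_and_count` are hypothetical stand-ins for a fitted classifier and an aggregation function.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))  # the "large collection" the protocol samples from

def classify(instances):
    """Stand-in classifier (hypothetical): predicts 1 when the first feature is positive."""
    return (instances[:, 0] > 0).astype(int)

def classify_and_count(predictions, n_classes=2):
    """Aggregate hard predictions into class prevalence estimates."""
    return np.bincount(predictions, minlength=n_classes) / len(predictions)

# 50 samples of 100 instances each, expressed as index arrays
sample_indexes = [rng.choice(len(X), size=100, replace=False) for _ in range(50)]

# without the speed-up: the classifier runs on every sample (5000 instances in total)
naive = [classify_and_count(classify(X[idx])) for idx in sample_indexes]

# with the speed-up: the classifier runs once (1000 instances), samples index the predictions
cached_predictions = classify(X)
fast = [classify_and_count(cached_predictions[idx]) for idx in sample_indexes]
```

Both routes yield the same estimates. The cached route only pays off when the samples together contain more instances than the original collection, which is presumably what the 'auto' setting checks.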
--- a/quapy/model_selection.py
+++ b/quapy/model_selection.py
@ -23,9 +23,9 @@ class GridSearchQ(BaseQuantifier):
    :param model: the quantifier to optimize
    :type model: BaseQuantifier
    :param param_grid: a dictionary with keys the parameter names and values the list of values to explore
-    :param protocol:
+    :param protocol: a sample generation protocol, an instance of :class:`quapy.protocol.AbstractProtocol`
    :param error: an error function (callable) or a string indicating the name of an error function (valid ones
-        are those in qp.error.QUANTIFICATION_ERROR
+        are those in :class:`quapy.error.QUANTIFICATION_ERROR`)
    :param refit: whether or not to refit the model on the whole labelled collection (training+validation) with
        the best chosen hyperparameter combination. Ignored if protocol='gen'
    :param timeout: establishes a timer (in seconds) for each of the hyperparameters configurations being tested.
--- a/quapy/plot.py
+++ b/quapy/plot.py
@ -51,8 +51,9 @@ def binary_diagonal(method_names, true_prevs, estim_prevs, pos_class=1, title=No
        table = {method_name:[true_prev, estim_prev] for method_name, true_prev, estim_prev in order}
        order  = [(method_name, *table[method_name]) for method_name in method_order]

-    cm = plt.get_cmap('tab20')
    NUM_COLORS = len(method_names)
+    if NUM_COLORS>10:
+        cm = plt.get_cmap('tab20')
        ax.set_prop_cycle(color=[cm(1. * i / NUM_COLORS) for i in range(NUM_COLORS)])
    for method, true_prev, estim_prev in order:
        true_prev = true_prev[:,pos_class]
@ -76,13 +77,12 @@ def binary_diagonal(method_names, true_prevs, estim_prevs, pos_class=1, title=No
    ax.set_xlim(0, 1)

    if legend:
+        ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))
        # box = ax.get_position()
        # ax.set_position([box.x0, box.y0, box.width * 0.8, box.height])
-        # ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))
-        # ax.set_position([box.x0, box.y0, box.width * 0.8, box.height])
-        ax.legend(loc='lower center',
-                  bbox_to_anchor=(1, -0.5),
-                  ncol=(len(method_names)+1)//2)
+        # ax.legend(loc='lower center',
+        #           bbox_to_anchor=(1, -0.5),
+        #           ncol=(len(method_names)+1)//2)

    _save_or_show(savepath)

--- a/quapy/protocol.py
+++ b/quapy/protocol.py
@ -127,6 +127,15 @@ class AbstractStochasticSeededProtocol(AbstractProtocol):
                yield self.collator(self.sample(params))

    def collator(self, sample, *args):
+        """
+        The collator prepares the sample to accommodate the desired output format before returning the output.
+        This collator simply returns the sample as it is. Classes inheriting from this abstract class can
+        implement their custom collators.
+
+        :param sample: the sample to be returned
+        :param args: additional arguments
+        :return: the sample adhering to a desired output format (in this case, the sample is returned as it is)
+        """
        return sample
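
Note on the collator docstring above: a subclass can override `collator` to change the output format. The following is a minimal sketch with hypothetical classes (`SketchProtocol`, `PrevalenceProtocol`) that only mimic the generate-sample-collate loop; they are not QuaPy's classes.

```python
import numpy as np

class SketchProtocol:
    """Hypothetical stand-in for a seeded stochastic protocol."""

    def __init__(self, labels, sample_size, repeats, seed=0):
        self.labels = np.asarray(labels)
        self.n_classes = int(self.labels.max()) + 1
        self.sample_size = sample_size
        self.repeats = repeats
        self.seed = seed

    def samples_parameters(self):
        # one index array per sample, reproducible thanks to the fixed seed
        rng = np.random.default_rng(self.seed)
        return [rng.choice(len(self.labels), size=self.sample_size, replace=False)
                for _ in range(self.repeats)]

    def sample(self, index):
        return self.labels[index]

    def collator(self, sample, *args):
        # default behaviour: return the sample as it is
        return sample

    def __call__(self):
        for params in self.samples_parameters():
            yield self.collator(self.sample(params))

class PrevalenceProtocol(SketchProtocol):
    def collator(self, sample, *args):
        # custom output format: the sample together with its class prevalence
        prev = np.bincount(sample, minlength=self.n_classes) / len(sample)
        return sample, prev

labels = [0] * 30 + [1] * 70
for sample, prev in PrevalenceProtocol(labels, sample_size=10, repeats=3)():
    print(len(sample), prev)
```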


--- a/quapy/tests/test_labelcollection.py
+++ b/quapy/tests/test_labelcollection.py
@ -1,5 +1,7 @@
 import unittest
 import numpy as np
+from scipy.sparse import csr_matrix
+
 import quapy as qp


@ -16,6 +18,51 @@ class LabelCollectionTestCase(unittest.TestCase):
        self.assertEqual(np.allclose(check_prev, data.prevalence()), True)
        self.assertEqual(len(tr+te), len(data))

+    def test_join(self):
+        x = np.arange(50)
+        y = np.random.randint(2, 5, 50)
+        data1 = qp.data.LabelledCollection(x, y)
+
+        x = np.arange(200)
+        y = np.random.randint(0, 3, 200)
+        data2 = qp.data.LabelledCollection(x, y)
+
+        x = np.arange(100)
+        y = np.random.randint(0, 6, 100)
+        data3 = qp.data.LabelledCollection(x, y)
+
+        combined = qp.data.LabelledCollection.join(data1, data2, data3)
+        self.assertEqual(len(combined), len(data1)+len(data2)+len(data3))
+        self.assertEqual(all(combined.classes_ == np.arange(6)), True)
+
+        x = np.random.rand(10, 3)
+        y = np.random.randint(0, 1, 10)
+        data4 = qp.data.LabelledCollection(x, y)
+        with self.assertRaises(Exception):
+            combined = qp.data.LabelledCollection.join(data1, data2, data3, data4)
+
+        x = np.random.rand(20, 3)
+        y = np.random.randint(0, 1, 20)
+        data5 = qp.data.LabelledCollection(x, y)
+        combined = qp.data.LabelledCollection.join(data4, data5)
+        self.assertEqual(len(combined), len(data4)+len(data5))
+
+        x = np.random.rand(10, 4)
+        y = np.random.randint(0, 1, 10)
+        data6 = qp.data.LabelledCollection(x, y)
+        with self.assertRaises(Exception):
+            combined = qp.data.LabelledCollection.join(data4, data5, data6)
+
+        data4.instances = csr_matrix(data4.instances)
+        with self.assertRaises(Exception):
+            combined = qp.data.LabelledCollection.join(data4, data5)
+        data5.instances = csr_matrix(data5.instances)
+        combined = qp.data.LabelledCollection.join(data4, data5)
+        self.assertEqual(len(combined), len(data4) + len(data5))
+
+        # data2.instances = csr_matrix()
+
+

 if __name__ == '__main__':
    unittest.main()
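
Note on `LabelledCollection.join`: `test_join` covers ndarray and sparse inputs but not list-typed instances. Two library behaviours are worth checking in isolation when reading the list branch and the `classes` computation. The snippet below is a standalone sketch that uses only `itertools` and NumPy; the variable names are illustrative.

```python
import itertools
import numpy as np

parts = [[1, 2], [3], [4, 5]]

# chain() over a single iterable yields its elements unchanged, i.e. the sublists themselves
nested = list(itertools.chain(parts))
# chain.from_iterable() flattens one level, which is what a concatenation of lists needs
flat = list(itertools.chain.from_iterable(parts))

labels = np.array([3, 0, 2, 0, 5])
# np.unique already returns the unique values in sorted order
classes = np.unique(labels)
# ndarray.sort() sorts in place and returns None, so its result should not be assigned
in_place_result = classes.copy().sort()

print(nested, flat, classes, in_place_result)
```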