# Quantification Methods

Quantification methods can be categorized as belonging to the
*aggregative* and *non-aggregative* groups.
Most methods included in QuaPy at the moment are of type *aggregative*
(though we plan to add many more methods in the near future), i.e.,
methods in which quantification is performed as an aggregation function
over the individual outputs of a classifier.

Any quantifier in QuaPy should extend the class *BaseQuantifier*
and implement the following abstract methods:

```python
    @abstractmethod
    def fit(self, data: LabelledCollection): ...

    @abstractmethod
    def quantify(self, instances): ...
```

The meaning of these functions should be familiar to anyone
used to working with scikit-learn, since the class structure of QuaPy
is directly inspired by scikit-learn's *Estimators*. The functions
*fit* and *quantify* are used to train the model and to produce
class prevalence estimations, respectively. (The reason why
scikit-learn's structure has not been adopted *as is* in QuaPy is
that scikit-learn's *predict* function is expected to return
one output for each input element, e.g., a predicted label for each
instance in a sample, while in quantification the output for a sample
is one single array of class prevalences.)
Quantifiers also extend scikit-learn's `BaseEstimator`, in order
to simplify the use of *set_params* and *get_params*, as needed for
[model selection](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection).

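For illustration purposes only, the following is a minimal sketch of a custom quantifier honouring this interface. The class is ours (it is not part of QuaPy) and simply returns the training prevalence for any sample; the import paths for *BaseQuantifier* and *LabelledCollection* are those we assume for v0.1.7:

```python
import quapy as qp
from quapy.data import LabelledCollection
from quapy.method.base import BaseQuantifier  # assumed module path

class TrainPrevalenceQuantifier(BaseQuantifier):
    """Toy quantifier: always predicts the prevalence observed during training."""

    def fit(self, data: LabelledCollection):
        # memorize the class prevalence of the training set
        self.train_prev_ = data.prevalence()
        return self

    def quantify(self, instances):
        # ignore the test instances altogether
        return self.train_prev_
```
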
## Aggregative Methods

All quantification methods are implemented as part of the
*qp.method* package. In particular, *aggregative* methods are defined in
*qp.method.aggregative*, and extend *AggregativeQuantifier(BaseQuantifier)*.
The methods that any *aggregative* quantifier must implement are:

```python
    @abstractmethod
    def fit(self, data: LabelledCollection, fit_learner=True): ...

    @abstractmethod
    def aggregate(self, classif_predictions: np.ndarray): ...
```

since, as mentioned before, aggregative methods base their predictions on the
individual predictions of a classifier. Indeed, a default implementation
of *quantify* is already provided by *AggregativeQuantifier*, which looks like:

```python
    def quantify(self, instances):
        classif_predictions = self.classify(instances)
        return self.aggregate(classif_predictions)
```

Aggregative quantifiers are expected to maintain a classifier (which is
accessed through the *@property* *classifier*). This classifier is
given as input to the quantifier, and can either have been fitted already
on external data (in which case the *fit_learner* argument should
be set to False), or be fitted by the quantifier's *fit* method (default).

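For example, the following is a minimal sketch of how an already-fitted classifier can be wrapped without re-training it (here we fit the classifier ourselves just to make the example self-contained; *Xy* is the LabelledCollection property exposing covariates and labels):

```python
import quapy as qp
from sklearn.linear_model import LogisticRegression

training, test = qp.datasets.fetch_reviews('hp', tfidf=True, min_df=5).train_test

# a classifier fitted beforehand (e.g., on external data)
classifier = LogisticRegression().fit(*training.Xy)

# wrap it in a quantifier without re-fitting the classifier
model = qp.method.aggregative.CC(classifier)
model.fit(training, fit_learner=False)
estim_prevalence = model.quantify(test.instances)
```
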
Another class of *aggregative* methods are the *probabilistic*
aggregative methods, which should inherit from the abstract class
*AggregativeProbabilisticQuantifier(AggregativeQuantifier)*.
The particularity of *probabilistic* aggregative methods (w.r.t.
non-probabilistic ones) is that the default quantification is defined
in terms of the posterior probabilities returned by a probabilistic
classifier, and not of the crisp decisions of a hard classifier.
In any case, the interface *classify(instances)* remains unchanged.

One advantage of *aggregative* methods (probabilistic or not)
is that the evaluation according to any sampling procedure (e.g.,
the [artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation))
can be carried out very efficiently, since the entire set can be pre-classified
once, and the quantification estimations for different samples can directly
reuse these predictions, without having to classify each element every time.
QuaPy leverages this property to speed up any procedure having to do with
quantification over samples, as is customarily done in model selection or
in evaluation.

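The following sketch illustrates the idea behind this speed-up. QuaPy's evaluation and model-selection routines do this internally; the explicit loop below is ours, for illustration only:

```python
import numpy as np
import quapy as qp
from sklearn.linear_model import LogisticRegression

training, test = qp.datasets.fetch_reviews('hp', tfidf=True, min_df=5).train_test

model = qp.method.aggregative.ACC(LogisticRegression())
model.fit(training)

# classify the entire test pool only once...
predictions = model.classify(test.instances)

# ...then quantify many samples by simply aggregating the cached predictions
for _ in range(10):
    idx = np.random.choice(len(test), size=100, replace=False)
    estim_prev = model.aggregate(predictions[idx])
```
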
### The Classify & Count variants

QuaPy implements the four CC variants, i.e.:

* *CC* (Classify & Count), the simplest aggregative quantifier; one that
simply relies on the label predictions of a classifier to deliver class estimates.
* *ACC* (Adjusted Classify & Count), the adjusted variant of CC.
* *PCC* (Probabilistic Classify & Count), the probabilistic variant of CC that
relies on the soft estimations (or posterior probabilities) returned by a (probabilistic) classifier.
* *PACC* (Probabilistic Adjusted Classify & Count), the adjusted variant of PCC.

The following code serves as a complete example using CC equipped
with an SVM as the classifier:

```python
import quapy as qp
from sklearn.svm import LinearSVC

training, test = qp.datasets.fetch_twitter('hcr', pickle=True).train_test

# instantiate a classifier learner, in this case an SVM
svm = LinearSVC()

# instantiate a Classify & Count with the SVM
# (an alias is available in qp.method.aggregative.ClassifyAndCount)
model = qp.method.aggregative.CC(svm)
model.fit(training)
estim_prevalence = model.quantify(test.instances)
```

The same code could be used to instantiate an ACC, by simply replacing
the instantiation of the model with:

```python
model = qp.method.aggregative.ACC(svm)
```

Note that the adjusted variants (ACC and PACC) need to estimate
some parameters for performing the adjustment (e.g., the
*true positive rate* and the *false positive rate* in the case of
binary classification), which are estimated on a validation split
of the labelled set. For this purpose, the `__init__` method of
ACC defines an additional parameter, *val_split*, which by
default is set to 0.4, meaning that 40% of the labelled data
will be used for estimating the parameters that adjust the
predictions. This parameter can also be set to an integer,
indicating that the parameters should be estimated by means of
*k*-fold cross-validation, with the integer indicating the
number *k* of folds. Finally, *val_split* can be set to a
specific held-out validation set (i.e., an instance of *LabelledCollection*).

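A sketch of the three ways of specifying *val_split* in the constructor follows, reusing *svm* and *training* from the example above (*split_stratified* is the LabelledCollection method for stratified splitting):

```python
# hold out 30% of the training data for estimating the adjustment parameters
model = qp.method.aggregative.ACC(svm, val_split=0.3)

# estimate the parameters via 5-fold cross-validation
model = qp.method.aggregative.ACC(svm, val_split=5)

# use an explicit held-out set (an instance of LabelledCollection)
train, val = training.split_stratified(train_prop=0.6)
model = qp.method.aggregative.ACC(svm, val_split=val)
model.fit(train)
```
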
The specification of *val_split* can also be
postponed to the invocation of the fit method (if *val_split* was also
set in the constructor, the one specified at fit time prevails),
e.g.:

```python
model = qp.method.aggregative.ACC(svm)
# perform 5-fold cross-validation for estimating ACC's parameters
# (overrides the default val_split=0.4 in the constructor)
model.fit(training, val_split=5)
```

The following code illustrates the case in which PCC is used:

```python
model = qp.method.aggregative.PCC(svm)
model.fit(training)
estim_prevalence = model.quantify(test.instances)
print('classifier:', model.classifier)
```

In this case, QuaPy will print:

```
The learner LinearSVC does not seem to be probabilistic. The learner will be calibrated.
classifier: CalibratedClassifierCV(base_estimator=LinearSVC(), cv=5)
```

The first output indicates that the learner (*LinearSVC* in this case)
is not a probabilistic classifier (i.e., it does not implement the
*predict_proba* method) and so the classifier will be converted to
a probabilistic one through [calibration](https://scikit-learn.org/stable/modules/calibration.html).
As a result, the classifier that is printed in the second line points
to a *CalibratedClassifierCV* instance. Note that calibration can only
be applied to hard classifiers when *fit_learner=True*; an exception
will be raised otherwise.

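The calibration step can be avoided altogether by supplying a classifier that is natively probabilistic; e.g. (a minimal sketch reusing the training/test split from above):

```python
from sklearn.linear_model import LogisticRegression

# LogisticRegression implements predict_proba, so no calibration is triggered
model = qp.method.aggregative.PCC(LogisticRegression())
model.fit(training)
estim_prevalence = model.quantify(test.instances)
```
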
Lastly, everything we said about ACC and PCC
applies to PACC as well.

### Expectation Maximization (EMQ)

The Expectation Maximization Quantifier (EMQ), also known as
SLD, is available at *qp.method.aggregative.EMQ* or via the
alias *qp.method.aggregative.ExpectationMaximizationQuantifier*.
The method is described in:

*Saerens, M., Latinne, P., and Decaestecker, C. (2002). Adjusting the outputs of a classifier
to new a priori probabilities: A simple procedure. Neural Computation, 14(1):21–41.*

EMQ works with a probabilistic classifier (if the classifier
given as input is a hard one, calibration will be attempted).
Although this method was originally proposed for improving the
posterior probabilities of a probabilistic classifier, and not
for improving the estimation of prior probabilities, EMQ almost
always ranks among the most effective quantifiers in the
experiments we have carried out.

An example of use can be found below:

```python
import quapy as qp
from sklearn.linear_model import LogisticRegression

dataset = qp.datasets.fetch_twitter('hcr', pickle=True)

model = qp.method.aggregative.EMQ(LogisticRegression())
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```

*New in v0.1.7*: EMQ now accepts two new parameters in the construction method.
The first is *exact_train_prev*, which makes it possible to use the true training prevalence
as the departing prevalence estimation (default behaviour), or instead an approximation of it as
suggested by [Alexandari et al. (2020)](http://proceedings.mlr.press/v119/alexandari20a.html)
(by setting *exact_train_prev=False*).
The second is *recalib*, which allows indicating a calibration method, among those
proposed by [Alexandari et al. (2020)](http://proceedings.mlr.press/v119/alexandari20a.html),
including Bias-Corrected Temperature Scaling, Vector Scaling, etc.
See the API documentation for further details.

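A hedged sketch of how these parameters might be combined, reusing *dataset* from the example above (the string *'bcts'* as the identifier for Bias-Corrected Temperature Scaling is our assumption; consult the API documentation for the accepted values):

```python
model = qp.method.aggregative.EMQ(
    LogisticRegression(),
    exact_train_prev=False,  # use Alexandari et al.'s estimate of the training prevalence
    recalib='bcts'           # assumed identifier for Bias-Corrected Temperature Scaling
)
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```
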
### Hellinger Distance y (HDy)

Implementation of the method based on the Hellinger Distance y (HDy) proposed by
[González-Castro, V., Alaiz-Rodríguez, R., and Alegre, E. (2013). Class distribution
estimation based on the Hellinger distance. Information Sciences, 218:146–164.](https://www.sciencedirect.com/science/article/pii/S0020025512004069)

It is implemented in *qp.method.aggregative.HDy* (also accessible
through the alias *qp.method.aggregative.HellingerDistanceY*).
This method works with a probabilistic classifier (hard classifiers
can be used as well, and will be calibrated) and requires a validation
set for estimating the parameters of the mixture model. Just like
ACC and PACC, this quantifier receives a *val_split* argument
in the constructor (or in the fit method, in which case the previous
value is overridden) that can either be a float indicating the proportion
of training data to be taken as the validation set (in a random
stratified split), or a validation set (i.e., an instance of
*LabelledCollection*) itself.

HDy was proposed as a binary quantification method, and the implementation
provided in QuaPy accepts only binary datasets.

The following code shows an example of use:

```python
import quapy as qp
from sklearn.linear_model import LogisticRegression

# load a binary dataset
dataset = qp.datasets.fetch_reviews('hp', pickle=True)
qp.data.preprocessing.text2tfidf(dataset, min_df=5, inplace=True)

model = qp.method.aggregative.HDy(LogisticRegression())
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```

*New in v0.1.7:* QuaPy now provides an implementation of the generalized
"Distribution Matching" approaches for multiclass, inspired by the framework
of [Firat (2016)](https://arxiv.org/abs/1606.00868). One can instantiate
a variant of HDy for multiclass quantification as follows:

```python
multiclassHDy = qp.method.aggregative.DistributionMatching(classifier=LogisticRegression(), divergence='HD', cdf=False)
```

*New in v0.1.7:* QuaPy now provides an implementation of the "DyS"
framework proposed by [Maletzke et al. (2019)](https://ojs.aaai.org/index.php/AAAI/article/view/4376)
and the "SMM" method proposed by [Hassan et al. (2020)](https://ieeexplore.ieee.org/document/9260028)
(thanks to *Pablo González* for the contributions!).

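Assuming these methods are exposed as *qp.method.aggregative.DyS* and *qp.method.aggregative.SMM* (both binary quantifiers, like HDy), a minimal sketch of their instantiation would be:

```python
from sklearn.linear_model import LogisticRegression

# DyS generalizes HDy to other divergences over histograms of posteriors;
# 'topsoe' as the divergence identifier is our assumption
dys = qp.method.aggregative.DyS(LogisticRegression(), divergence='topsoe')

# SMM matches the first moment (the mean) of the posterior distributions
smm = qp.method.aggregative.SMM(LogisticRegression())
```
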
### Threshold Optimization methods

*New in v0.1.7:* QuaPy now implements Forman's threshold optimization methods;
see, e.g., [(Forman 2006)](https://dl.acm.org/doi/abs/10.1145/1150402.1150423)
and [(Forman 2008)](https://link.springer.com/article/10.1007/s10618-008-0097-y).
These include: T50, MAX, X, Median Sweep (MS), and its variant MS2.

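These are binary quantifiers that, like ACC, adjust the CC estimate using rates estimated on a validation split. Assuming they are exposed in *qp.method.aggregative* under the names used in the papers, a sketch of use would be:

```python
import quapy as qp
from sklearn.linear_model import LogisticRegression

# load a binary dataset
training, test = qp.datasets.fetch_reviews('hp', tfidf=True, min_df=5).train_test

# Median Sweep; T50, MAX, X, and MS2 are instantiated analogously
model = qp.method.aggregative.MS(LogisticRegression())
model.fit(training)
estim_prevalence = model.quantify(test.instances)
```
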
### Explicit Loss Minimization

Explicit Loss Minimization (ELM) represents a family of methods
based on structured output learning, i.e., quantifiers relying on
classifiers that have been optimized targeting a
quantification-oriented evaluation measure.
The original methods are implemented in QuaPy as Classify & Count (CC)
quantifiers that use Joachims' [SVMperf](https://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html)
as the underlying classifier, properly set to optimize for the desired loss.

In QuaPy, this can be achieved by calling the functions:

* *newSVMQ*: returns the quantification method called SVM(Q), which optimizes for the metric *Q* defined
in [*Barranquero, J., Díez, J., and del Coz, J. J. (2015). Quantification-oriented learning based
on reliable classifiers. Pattern Recognition, 48(2):591–604.*](https://www.sciencedirect.com/science/article/pii/S003132031400291X)
* *newSVMKLD* and *newSVMNKLD*: return the quantification methods called SVM(KLD) and SVM(nKLD), standing for
Kullback-Leibler Divergence and Normalized Kullback-Leibler Divergence, as proposed in [*Esuli, A. and Sebastiani, F. (2015).
Optimizing text quantifiers for multivariate loss functions.
ACM Transactions on Knowledge Discovery from Data, 9(4):Article 27.*](https://dl.acm.org/doi/abs/10.1145/2700406)
* *newSVMAE* and *newSVMRAE*: return the quantification methods called SVM(AE) and SVM(RAE), which optimize for the (Mean) Absolute Error and the
(Mean) Relative Absolute Error, respectively, as first used by
[*Moreo, A. and Sebastiani, F. (2021). Tweet sentiment quantification: An experimental re-evaluation. PLOS ONE 17 (9), 1-23.*](https://arxiv.org/abs/2011.02552)

The last two methods (SVM(AE) and SVM(RAE)) have been implemented in
QuaPy in order to make available ELM variants for what nowadays
are considered the most well-behaved evaluation metrics in quantification.

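A hedged sketch of how one of these factory functions might be invoked, assuming *SVMPERF_HOME* points to a compiled copy of the patched SVMperf and that the factories take no required arguments (see the example scripts referenced at the end of this section for authoritative usage):

```python
import quapy as qp
from quapy.method.aggregative import newSVMQ

# point QuaPy to the compiled svm_perf_quantification sources
qp.environ['SVMPERF_HOME'] = './svm_perf_quantification'

# ELM methods rely on SVMperf and are thus binary-only
training, test = qp.datasets.fetch_reviews('hp', tfidf=True, min_df=5).train_test

model = newSVMQ()  # SVM(Q): a CC quantifier over an SVMperf optimized for Q
model.fit(training)
estim_prevalence = model.quantify(test.instances)
```
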
In order to make these models work, you would need to run the script
*prepare_svmperf.sh* (distributed along with QuaPy), which
downloads the *SVMperf* source code, applies a patch that
implements the quantification-oriented losses, and compiles the
sources.

If you want to add any custom loss, you would need to modify
the source code of *SVMperf* in order to implement it, and
assign a valid loss code to it. Then you must re-compile
the whole thing and instantiate the quantifier in QuaPy
as follows:

```python
# you can either set the path to your custom svm_perf_quantification implementation
# in the environment variable, or as an argument to the constructor of ELM
qp.environ['SVMPERF_HOME'] = './path/to/svm_perf_quantification'

# assign an alias to your custom loss and the id you have assigned to it
svmperf = qp.classification.svmperf.SVMperf
svmperf.valid_losses['mycustomloss'] = 28

# instantiate the ELM method indicating the loss
model = qp.method.aggregative.ELM(loss='mycustomloss')
```

All ELM methods are binary quantifiers, since they rely on *SVMperf*, which
currently supports only binary classification.
ELM variants (and any binary quantifier in general) can be extended
to operate in single-label scenarios trivially by adopting a
"one-vs-all" strategy (as, e.g., in
[*Gao, W. and Sebastiani, F. (2016). From classification to quantification in tweet sentiment
analysis. Social Network Analysis and Mining, 6(19):1–22*](https://link.springer.com/article/10.1007/s13278-016-0327-z)).
In QuaPy this is possible by using the *OneVsAll* class.

There are two ways of instantiating this class: *OneVsAllGeneric*, which works for
any quantifier, and *OneVsAllAggregative*, which is optimized for aggregative quantifiers.
In general, you can simply use the *getOneVsAll* function and QuaPy will choose
the more convenient of the two.

```python
import quapy as qp
from quapy.method.aggregative import SVMQ
from quapy.method.base import getOneVsAll  # assumed module path for getOneVsAll

# load a single-label dataset (this one contains 3 classes)
dataset = qp.datasets.fetch_twitter('hcr', pickle=True)

# let qp know where svmperf is
qp.environ['SVMPERF_HOME'] = '../svm_perf_quantification'

model = getOneVsAll(SVMQ(), n_jobs=-1)  # run the binary quantifiers in parallel
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```

Check the examples *explicit_loss_minimization.py*
and *one_vs_all.py* for more details.

## Meta Models

By *meta* models we mean quantification methods that are defined on top of other
quantification methods, and that thus belong squarely to neither the aggregative nor
the non-aggregative group (indeed, *meta* models could use quantifiers from either of those
groups).
*Meta* models are implemented in the *qp.method.meta* module.

### Ensembles

QuaPy implements (some of) the variants proposed in:

* [*Pérez-Gállego, P., Quevedo, J. R., & del Coz, J. J. (2017).
Using ensembles for problems with characterizable changes in data distribution: A case study on quantification.
Information Fusion, 34, 87-100.*](https://www.sciencedirect.com/science/article/pii/S1566253516300628)
* [*Pérez-Gállego, P., Castaño, A., Quevedo, J. R., & del Coz, J. J. (2019).
Dynamic ensemble selection for quantification tasks.
Information Fusion, 45, 1-15.*](https://www.sciencedirect.com/science/article/pii/S1566253517303652)

The following code shows how to instantiate an Ensemble of 30 *Adjusted Classify & Count* (ACC)
quantifiers operating with a *Logistic Regressor* (LR) as the base classifier, and using
*average* as the aggregation policy (see the original article for further details).
The last parameter indicates that all processors should be used for parallelization.

```python
import quapy as qp
from quapy.method.aggregative import ACC
from quapy.method.meta import Ensemble
from sklearn.linear_model import LogisticRegression

dataset = qp.datasets.fetch_UCIDataset('haberman')

model = Ensemble(quantifier=ACC(LogisticRegression()), size=30, policy='ave', n_jobs=-1)
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```

Other aggregation policies implemented in QuaPy include:

* 'ptr' for applying a dynamic selection based on the training prevalence of the ensemble's members
* 'ds' for applying a dynamic selection based on the Hellinger Distance
* *any valid quantification measure* (e.g., 'mse') for performing a static selection based on
the performance estimated for each member of the ensemble in terms of that evaluation metric.

When using any of the above options, it is important to set the *red_size* parameter, which
specifies the number of ensemble members to retain.

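For instance, a sketch of dynamic and static member selection, reusing the imports and dataset from the example above (*red_size* as a constructor argument matches the parameter named above; note that the 'ds' policy applies to binary problems, which holds for this dataset):

```python
# dynamic selection via Hellinger Distance: train 30 members, keep the best 15 per sample
model = Ensemble(quantifier=ACC(LogisticRegression()), size=30, red_size=15, policy='ds', n_jobs=-1)

# static selection: keep the 15 members with the lowest estimated MSE
model = Ensemble(quantifier=ACC(LogisticRegression()), size=30, red_size=15, policy='mse', n_jobs=-1)
```
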
Please check the [model selection](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection)
wiki if you want to optimize the hyperparameters of the ensemble for classification or quantification.

### The QuaNet neural network

QuaPy offers an implementation of QuaNet, a deep learning model presented in:

[*Esuli, A., Moreo, A., & Sebastiani, F. (2018, October).
A recurrent neural network for sentiment quantification.
In Proceedings of the 27th ACM International Conference on
Information and Knowledge Management (pp. 1775-1778).*](https://dl.acm.org/doi/abs/10.1145/3269206.3269287)

This model requires *torch* to be installed.
QuaNet also requires a classifier that can provide embedded representations
of the inputs.
In the original paper, QuaNet was tested using an LSTM as the base classifier.
In the following example, we show an instantiation of QuaNet that instead uses a CNN
as a probabilistic classifier, taking its last-layer representation as the document embedding:

```python
import quapy as qp
from quapy.method.meta import QuaNet
from quapy.classification.neural import NeuralClassifierTrainer, CNNnet

# use samples of 100 elements
qp.environ['SAMPLE_SIZE'] = 100

# load the kindle dataset as text, and convert words to numerical indexes
dataset = qp.datasets.fetch_reviews('kindle', pickle=True)
qp.data.preprocessing.index(dataset, min_df=5, inplace=True)

# the text classifier is a CNN trained by NeuralClassifierTrainer
cnn = CNNnet(dataset.vocabulary_size, dataset.n_classes)
learner = NeuralClassifierTrainer(cnn, device='cuda')

# train QuaNet
model = QuaNet(learner, device='cuda')
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```