Adding wiki documents to the Sphinx documentation to allow for collaboration

This commit is contained in:
Alejandro Moreo Fernandez 2024-04-24 17:03:57 +02:00
parent f1462897ef
commit e92264c280
28 changed files with 2685 additions and 28 deletions

View File

@@ -14,7 +14,20 @@ help:
.PHONY: help Makefile
# Convert Markdown files to reStructuredText before building HTML
markdown_to_rst:
	@echo "Converting Markdown files to reStructuredText"
	@mkdir -p $(SOURCEDIR)/wiki/wiki_examples/selected_plots
	@cp $(SOURCEDIR)/wiki_editable/wiki_examples/selected_plots/* $(SOURCEDIR)/wiki/wiki_examples/selected_plots/
	@find $(SOURCEDIR)/wiki_editable -name '*.md' -exec sh -c 'pandoc -f markdown -t rst "$$1" -o "$(SOURCEDIR)/wiki/$$(basename "$$1" .md).rst"' _ {} \;
	@echo "Conversion complete."
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
html: markdown_to_rst
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
# # Catch-all target: route all unknown targets to Sphinx using the new
# # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
# %: Makefile
# @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
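For reference, the per-file name mapping that the `markdown_to_rst` recipe performs can be sketched as a plain shell snippet. The `docs/source` path and the `Model-Selection` page name below are illustrative assumptions; the real recipe drives `pandoc` over every `.md` file via `find`:

```shell
# Sketch of the filename mapping done by the markdown_to_rst recipe,
# assuming SOURCEDIR is the Sphinx source dir (here: docs/source).
# The real recipe runs pandoc; this only computes the target path.
SOURCEDIR=docs/source
md="$SOURCEDIR/wiki_editable/Model-Selection.md"
rst="$SOURCEDIR/wiki/$(basename "$md" .md).rst"
echo "$rst"
# pandoc -f markdown -t rst "$md" -o "$rst"   # the actual conversion step
```

Each `wiki_editable/<Page>.md` thus lands at `wiki/<Page>.rst`, which is where the new `wiki/` toctree entries expect it.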

View File

@@ -21,6 +21,23 @@ GitHub
QuaPy is hosted on GitHub at `https://github.com/HLT-ISTI/QuaPy <https://github.com/HLT-ISTI/QuaPy>`_
Wiki Documents
--------------
In this section you can find useful information concerning different aspects of QuaPy, with examples:
.. toctree::
:maxdepth: 1
wiki/Datasets
wiki/Evaluation
wiki/ExplicitLossMinimization
wiki/Methods
wiki/Model-Selection
wiki/Plotting
wiki/Protocols
.. toctree::
:maxdepth: 2
:caption: Contents:

View File

@@ -52,6 +52,14 @@ quapy.method.non\_aggregative module
:undoc-members:
:show-inheritance:
quapy.method.composable module
------------------------------
.. automodule:: quapy.method.composable
:members:
:undoc-members:
:show-inheritance:
Module contents
---------------

View File

@@ -43,6 +43,15 @@
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<ul>
<li class="toctree-l1"><a class="reference internal" href="wiki/Datasets.html">Datasets</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Evaluation.html">Evaluation</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/ExplicitLossMinimization.html">Explicit Loss Minimization</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Methods.html">Quantification Methods</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Model-Selection.html">Model Selection</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Plotting.html">Plotting</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Protocols.html">Protocols</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="modules.html">quapy</a></li>
</ul>
@@ -250,6 +259,8 @@
<li><a href="quapy.method.html#quapy.method.aggregative.BinaryAggregativeQuantifier">BinaryAggregativeQuantifier (class in quapy.method.aggregative)</a>
</li>
<li><a href="quapy.method.html#quapy.method.base.BinaryQuantifier">BinaryQuantifier (class in quapy.method.base)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.BlobelLoss">BlobelLoss (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.html#quapy.plot.brokenbar_supremacy_by_drift">brokenbar_supremacy_by_drift() (in module quapy.plot)</a>
</li>
@@ -288,18 +299,24 @@
</li>
</ul></li>
<li><a href="quapy.method.html#quapy.method.aggregative.ClassifyAndCount">ClassifyAndCount (in module quapy.method.aggregative)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.ClassTransformer">ClassTransformer (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.method.html#quapy.method._neural.QuaNetTrainer.clean_checkpoint">clean_checkpoint() (quapy.method._neural.QuaNetTrainer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method._neural.QuaNetTrainer.clean_checkpoint_dir">clean_checkpoint_dir() (quapy.method._neural.QuaNetTrainer method)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.html#quapy.functional.clip">clip() (in module quapy.functional)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.classification.html#quapy.classification.neural.CNNnet">CNNnet (class in quapy.classification.neural)</a>
</li>
<li><a href="quapy.html#quapy.protocol.AbstractStochasticSeededProtocol.collator">collator() (quapy.protocol.AbstractStochasticSeededProtocol method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.CombinedLoss">CombinedLoss (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.ComposableQuantifier">ComposableQuantifier() (in module quapy.method.composable)</a>
</li>
<li><a href="quapy.method.html#quapy.method._threshold_optim.MAX.condition">condition() (quapy.method._threshold_optim.MAX method)</a>
@@ -326,6 +343,8 @@
<li><a href="quapy.html#quapy.util.create_parent_dir">create_parent_dir() (in module quapy.util)</a>
</li>
<li><a href="quapy.html#quapy.model_selection.cross_val_predict">cross_val_predict() (in module quapy.model_selection)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.CVClassifier">CVClassifier (class in quapy.method.composable)</a>
</li>
</ul></td>
</tr></table>
@@ -351,12 +370,14 @@
<li><a href="quapy.method.html#quapy.method._threshold_optim.ThresholdOptimization.discard">(quapy.method._threshold_optim.ThresholdOptimization method)</a>
</li>
</ul></li>
<li><a href="quapy.method.html#quapy.method.non_aggregative.DistributionMatchingX">DistributionMatchingX (in module quapy.method.non_aggregative)</a>
<li><a href="quapy.method.html#quapy.method.composable.DistanceTransformer">DistanceTransformer (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.method.html#quapy.method.aggregative.DistributionMatchingY">DistributionMatchingY (in module quapy.method.aggregative)</a>
<li><a href="quapy.method.html#quapy.method.non_aggregative.DistributionMatchingX">DistributionMatchingX (in module quapy.method.non_aggregative)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.method.html#quapy.method.aggregative.DistributionMatchingY">DistributionMatchingY (in module quapy.method.aggregative)</a>
</li>
<li><a href="quapy.method.html#quapy.method.non_aggregative.DMx">DMx (class in quapy.method.non_aggregative)</a>
</li>
<li><a href="quapy.method.html#quapy.method.aggregative.DMy">DMy (class in quapy.method.aggregative)</a>
@@ -399,10 +420,14 @@
</li>
<li><a href="quapy.method.html#quapy.method.aggregative.EMQ.EMQ_BCTS">EMQ_BCTS() (quapy.method.aggregative.EMQ class method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.meta.Ensemble">Ensemble (class in quapy.method.meta)</a>
<li><a href="quapy.method.html#quapy.method.composable.EnergyKernelTransformer">EnergyKernelTransformer (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.EnergyLoss">EnergyLoss (class in quapy.method.composable)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.method.html#quapy.method.meta.Ensemble">Ensemble (class in quapy.method.meta)</a>
</li>
<li><a href="quapy.method.html#quapy.method.meta.ensembleFactory">ensembleFactory() (in module quapy.method.meta)</a>
</li>
<li><a href="quapy.method.html#quapy.method.meta.EPACC">EPACC() (in module quapy.method.meta)</a>
@@ -473,6 +498,8 @@
<li><a href="quapy.method.html#quapy.method.base.BaseQuantifier.fit">(quapy.method.base.BaseQuantifier method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.base.OneVsAllGeneric.fit">(quapy.method.base.OneVsAllGeneric method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.CVClassifier.fit">(quapy.method.composable.CVClassifier method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.meta.Ensemble.fit">(quapy.method.meta.Ensemble method)</a>
</li>
@@ -496,7 +523,23 @@
<li><a href="quapy.classification.html#quapy.classification.calibration.RecalibratedProbabilisticClassifierBase.fit_tr_val">fit_tr_val() (quapy.classification.calibration.RecalibratedProbabilisticClassifierBase method)</a>
</li>
<li><a href="quapy.data.html#quapy.data.preprocessing.IndexTransformer.fit_transform">fit_transform() (quapy.data.preprocessing.IndexTransformer method)</a>
<ul>
<li><a href="quapy.method.html#quapy.method.composable.ClassTransformer.fit_transform">(quapy.method.composable.ClassTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.DistanceTransformer.fit_transform">(quapy.method.composable.DistanceTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.EnergyKernelTransformer.fit_transform">(quapy.method.composable.EnergyKernelTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.GaussianKernelTransformer.fit_transform">(quapy.method.composable.GaussianKernelTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.GaussianRFFKernelTransformer.fit_transform">(quapy.method.composable.GaussianRFFKernelTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.HistogramTransformer.fit_transform">(quapy.method.composable.HistogramTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.KernelTransformer.fit_transform">(quapy.method.composable.KernelTransformer method)</a>
</li>
</ul></li>
<li><a href="quapy.classification.html#quapy.classification.neural.TextClassifierNet.forward">forward() (quapy.classification.neural.TextClassifierNet method)</a>
<ul>
@@ -517,6 +560,10 @@
<h2 id="G">G</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.method.html#quapy.method.composable.GaussianKernelTransformer">GaussianKernelTransformer (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.GaussianRFFKernelTransformer">GaussianRFFKernelTransformer (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.html#quapy.protocol.OnLabelledCollectionProtocol.get_collator">get_collator() (quapy.protocol.OnLabelledCollectionProtocol class method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.aggregative.BayesianCC.get_conditional_probability_samples">get_conditional_probability_samples() (quapy.method.aggregative.BayesianCC method)</a>
@@ -585,11 +632,15 @@
</li>
<li><a href="quapy.method.html#quapy.method.aggregative.HDy">HDy (class in quapy.method.aggregative)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.html#quapy.functional.HellingerDistance">HellingerDistance() (in module quapy.functional)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.method.html#quapy.method.aggregative.HellingerDistanceY">HellingerDistanceY (in module quapy.method.aggregative)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.HellingerSurrogateLoss">HellingerSurrogateLoss (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.HistogramTransformer">HistogramTransformer (class in quapy.method.composable)</a>
</li>
</ul></td>
</tr></table>
@@ -626,10 +677,14 @@
<li><a href="quapy.method.html#quapy.method._kdey.KDEyCS">KDEyCS (class in quapy.method._kdey)</a>
</li>
<li><a href="quapy.method.html#quapy.method._kdey.KDEyHD">KDEyHD (class in quapy.method._kdey)</a>
</li>
<li><a href="quapy.method.html#quapy.method._kdey.KDEyML">KDEyML (class in quapy.method._kdey)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.method.html#quapy.method._kdey.KDEyML">KDEyML (class in quapy.method._kdey)</a>
<li><a href="quapy.method.html#quapy.method.composable.LaplacianKernelTransformer.kernel">kernel (quapy.method.composable.LaplacianKernelTransformer property)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.KernelTransformer">KernelTransformer (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.data.html#quapy.data.base.Dataset.kFCV">kFCV() (quapy.data.base.Dataset class method)</a>
@@ -649,10 +704,14 @@
</li>
<li><a href="quapy.data.html#quapy.data.base.LabelledCollection">LabelledCollection (class in quapy.data.base)</a>
</li>
<li><a href="quapy.html#quapy.functional.linear_search">linear_search() (in module quapy.functional)</a>
<li><a href="quapy.method.html#quapy.method.composable.LaplacianKernelTransformer">LaplacianKernelTransformer (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.LeastSquaresLoss">LeastSquaresLoss (class in quapy.method.composable)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.html#quapy.functional.linear_search">linear_search() (in module quapy.functional)</a>
</li>
<li><a href="quapy.data.html#quapy.data.base.Dataset.load">load() (quapy.data.base.Dataset class method)</a>
<ul>
@@ -746,6 +805,8 @@
<li><a href="quapy.method.html#module-quapy.method.aggregative">quapy.method.aggregative</a>
</li>
<li><a href="quapy.method.html#module-quapy.method.base">quapy.method.base</a>
</li>
<li><a href="quapy.method.html#module-quapy.method.composable">quapy.method.composable</a>
</li>
<li><a href="quapy.method.html#module-quapy.method.meta">quapy.method.meta</a>
</li>
@@ -874,6 +935,8 @@
<li><a href="quapy.classification.html#quapy.classification.neural.NeuralClassifierTrainer.predict">(quapy.classification.neural.NeuralClassifierTrainer method)</a>
</li>
<li><a href="quapy.classification.html#quapy.classification.svmperf.SVMperf.predict">(quapy.classification.svmperf.SVMperf method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.CVClassifier.predict">(quapy.method.composable.CVClassifier method)</a>
</li>
</ul></li>
<li><a href="quapy.classification.html#quapy.classification.calibration.RecalibratedProbabilisticClassifierBase.predict_proba">predict_proba() (quapy.classification.calibration.RecalibratedProbabilisticClassifierBase method)</a>
@@ -886,6 +949,8 @@
<li><a href="quapy.classification.html#quapy.classification.neural.TextClassifierNet.predict_proba">(quapy.classification.neural.TextClassifierNet method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.aggregative.EMQ.predict_proba">(quapy.method.aggregative.EMQ method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.CVClassifier.predict_proba">(quapy.method.composable.CVClassifier method)</a>
</li>
</ul></li>
</ul></td>
@@ -1086,6 +1151,13 @@
<ul>
<li><a href="quapy.method.html#module-quapy.method.base">module</a>
</li>
</ul></li>
<li>
quapy.method.composable
<ul>
<li><a href="quapy.method.html#module-quapy.method.composable">module</a>
</li>
</ul></li>
<li>
@@ -1279,6 +1351,10 @@
<li><a href="quapy.classification.html#quapy.classification.neural.TextClassifierNet">TextClassifierNet (class in quapy.classification.neural)</a>
</li>
<li><a href="quapy.method.html#quapy.method._threshold_optim.ThresholdOptimization">ThresholdOptimization (class in quapy.method._threshold_optim)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.TikhonovRegularization">TikhonovRegularization (class in quapy.method.composable)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.TikhonovRegularized">TikhonovRegularized() (in module quapy.method.composable)</a>
</li>
<li><a href="quapy.html#quapy.model_selection.Status.TIMEOUT">TIMEOUT (quapy.model_selection.Status attribute)</a>
</li>
@@ -1322,6 +1398,20 @@
<li><a href="quapy.classification.html#quapy.classification.neural.NeuralClassifierTrainer.transform">(quapy.classification.neural.NeuralClassifierTrainer method)</a>
</li>
<li><a href="quapy.data.html#quapy.data.preprocessing.IndexTransformer.transform">(quapy.data.preprocessing.IndexTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.ClassTransformer.transform">(quapy.method.composable.ClassTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.DistanceTransformer.transform">(quapy.method.composable.DistanceTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.EnergyKernelTransformer.transform">(quapy.method.composable.EnergyKernelTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.GaussianKernelTransformer.transform">(quapy.method.composable.GaussianKernelTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.GaussianRFFKernelTransformer.transform">(quapy.method.composable.GaussianRFFKernelTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.HistogramTransformer.transform">(quapy.method.composable.HistogramTransformer method)</a>
</li>
<li><a href="quapy.method.html#quapy.method.composable.KernelTransformer.transform">(quapy.method.composable.KernelTransformer method)</a>
</li>
</ul></li>
<li><a href="quapy.classification.html#quapy.classification.calibration.TSCalibration">TSCalibration (class in quapy.classification.calibration)</a>
@@ -1332,14 +1422,16 @@
<h2 id="U">U</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.html#quapy.functional.uniform_prevalence">uniform_prevalence() (in module quapy.functional)</a>
</li>
<li><a href="quapy.html#quapy.functional.uniform_prevalence_sampling">uniform_prevalence_sampling() (in module quapy.functional)</a>
</li>
<li><a href="quapy.data.html#quapy.data.base.LabelledCollection.uniform_sampling">uniform_sampling() (quapy.data.base.LabelledCollection method)</a>
</li>
<li><a href="quapy.data.html#quapy.data.base.LabelledCollection.uniform_sampling_index">uniform_sampling_index() (quapy.data.base.LabelledCollection method)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="quapy.data.html#quapy.data.base.LabelledCollection.uniform_sampling_index">uniform_sampling_index() (quapy.data.base.LabelledCollection method)</a>
</li>
<li><a href="quapy.html#quapy.functional.uniform_simplex_sampling">uniform_simplex_sampling() (in module quapy.functional)</a>
</li>
<li><a href="quapy.html#quapy.protocol.UniformPrevalenceProtocol">UniformPrevalenceProtocol (in module quapy.protocol)</a>

View File

@@ -22,7 +22,7 @@
<script src="_static/js/theme.js"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="quapy" href="modules.html" />
<link rel="next" title="Datasets" href="wiki/Datasets.html" />
</head>
<body class="wy-body-for-nav">
@@ -45,6 +45,15 @@
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<ul>
<li class="toctree-l1"><a class="reference internal" href="wiki/Datasets.html">Datasets</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Evaluation.html">Evaluation</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/ExplicitLossMinimization.html">Explicit Loss Minimization</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Methods.html">Quantification Methods</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Model-Selection.html">Model Selection</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Plotting.html">Plotting</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Protocols.html">Protocols</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="modules.html">quapy</a></li>
</ul>
@@ -83,6 +92,21 @@
<section id="github">
<h2>GitHub<a class="headerlink" href="#github" title="Permalink to this heading"></a></h2>
<p>QuaPy is hosted on GitHub at <a class="reference external" href="https://github.com/HLT-ISTI/QuaPy">https://github.com/HLT-ISTI/QuaPy</a></p>
</section>
<section id="wiki-documents">
<h2>Wiki Documents<a class="headerlink" href="#wiki-documents" title="Permalink to this heading"></a></h2>
<p>In this section you can find useful information concerning different aspects of QuaPy, with examples:</p>
<div class="toctree-wrapper compound">
<ul>
<li class="toctree-l1"><a class="reference internal" href="wiki/Datasets.html">Datasets</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Evaluation.html">Evaluation</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/ExplicitLossMinimization.html">Explicit Loss Minimization</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Methods.html">Quantification Methods</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Model-Selection.html">Model Selection</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Plotting.html">Plotting</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Protocols.html">Protocols</a></li>
</ul>
</div>
<div class="toctree-wrapper compound">
</div>
</section>
@@ -517,6 +541,62 @@
</li>
</ul>
</li>
<li class="toctree-l5"><a class="reference internal" href="quapy.method.html#quapy-method-composable-module">quapy.method.composable module</a><ul>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.BlobelLoss"><code class="docutils literal notranslate"><span class="pre">BlobelLoss</span></code></a></li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.CVClassifier"><code class="docutils literal notranslate"><span class="pre">CVClassifier</span></code></a><ul>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.CVClassifier.fit"><code class="docutils literal notranslate"><span class="pre">CVClassifier.fit()</span></code></a></li>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.CVClassifier.predict"><code class="docutils literal notranslate"><span class="pre">CVClassifier.predict()</span></code></a></li>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.CVClassifier.predict_proba"><code class="docutils literal notranslate"><span class="pre">CVClassifier.predict_proba()</span></code></a></li>
</ul>
</li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.ClassTransformer"><code class="docutils literal notranslate"><span class="pre">ClassTransformer</span></code></a><ul>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.ClassTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">ClassTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.ClassTransformer.transform"><code class="docutils literal notranslate"><span class="pre">ClassTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.CombinedLoss"><code class="docutils literal notranslate"><span class="pre">CombinedLoss</span></code></a></li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.ComposableQuantifier"><code class="docutils literal notranslate"><span class="pre">ComposableQuantifier()</span></code></a></li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.DistanceTransformer"><code class="docutils literal notranslate"><span class="pre">DistanceTransformer</span></code></a><ul>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.DistanceTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">DistanceTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.DistanceTransformer.transform"><code class="docutils literal notranslate"><span class="pre">DistanceTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.EnergyKernelTransformer"><code class="docutils literal notranslate"><span class="pre">EnergyKernelTransformer</span></code></a><ul>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.EnergyKernelTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">EnergyKernelTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.EnergyKernelTransformer.transform"><code class="docutils literal notranslate"><span class="pre">EnergyKernelTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.EnergyLoss"><code class="docutils literal notranslate"><span class="pre">EnergyLoss</span></code></a></li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianKernelTransformer"><code class="docutils literal notranslate"><span class="pre">GaussianKernelTransformer</span></code></a><ul>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianKernelTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">GaussianKernelTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianKernelTransformer.transform"><code class="docutils literal notranslate"><span class="pre">GaussianKernelTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianRFFKernelTransformer"><code class="docutils literal notranslate"><span class="pre">GaussianRFFKernelTransformer</span></code></a><ul>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianRFFKernelTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">GaussianRFFKernelTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianRFFKernelTransformer.transform"><code class="docutils literal notranslate"><span class="pre">GaussianRFFKernelTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.HellingerSurrogateLoss"><code class="docutils literal notranslate"><span class="pre">HellingerSurrogateLoss</span></code></a></li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.HistogramTransformer"><code class="docutils literal notranslate"><span class="pre">HistogramTransformer</span></code></a><ul>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.HistogramTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">HistogramTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.HistogramTransformer.transform"><code class="docutils literal notranslate"><span class="pre">HistogramTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.KernelTransformer"><code class="docutils literal notranslate"><span class="pre">KernelTransformer</span></code></a><ul>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.KernelTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">KernelTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.KernelTransformer.transform"><code class="docutils literal notranslate"><span class="pre">KernelTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.LaplacianKernelTransformer"><code class="docutils literal notranslate"><span class="pre">LaplacianKernelTransformer</span></code></a><ul>
<li class="toctree-l7"><a class="reference internal" href="quapy.method.html#quapy.method.composable.LaplacianKernelTransformer.kernel"><code class="docutils literal notranslate"><span class="pre">LaplacianKernelTransformer.kernel</span></code></a></li>
</ul>
</li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.LeastSquaresLoss"><code class="docutils literal notranslate"><span class="pre">LeastSquaresLoss</span></code></a></li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.TikhonovRegularization"><code class="docutils literal notranslate"><span class="pre">TikhonovRegularization</span></code></a></li>
<li class="toctree-l6"><a class="reference internal" href="quapy.method.html#quapy.method.composable.TikhonovRegularized"><code class="docutils literal notranslate"><span class="pre">TikhonovRegularized()</span></code></a></li>
</ul>
</li>
<li class="toctree-l5"><a class="reference internal" href="quapy.method.html#module-quapy.method">Module contents</a></li>
</ul>
</li>
@@ -586,6 +666,7 @@
<li class="toctree-l4"><a class="reference internal" href="quapy.html#quapy.functional.solve_adjustment_binary"><code class="docutils literal notranslate"><span class="pre">solve_adjustment_binary()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.html#quapy.functional.strprev"><code class="docutils literal notranslate"><span class="pre">strprev()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.html#quapy.functional.ternary_search"><code class="docutils literal notranslate"><span class="pre">ternary_search()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.html#quapy.functional.uniform_prevalence"><code class="docutils literal notranslate"><span class="pre">uniform_prevalence()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.html#quapy.functional.uniform_prevalence_sampling"><code class="docutils literal notranslate"><span class="pre">uniform_prevalence_sampling()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.html#quapy.functional.uniform_simplex_sampling"><code class="docutils literal notranslate"><span class="pre">uniform_simplex_sampling()</span></code></a></li>
</ul>
@@ -715,7 +796,7 @@
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="modules.html" class="btn btn-neutral float-right" title="quapy" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
<a href="wiki/Datasets.html" class="btn btn-neutral float-right" title="Datasets" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>
<hr/>

View File

@@ -106,6 +106,7 @@
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#module-quapy.method.base">quapy.method.base module</a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#module-quapy.method.meta">quapy.method.meta module</a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#module-quapy.method.non_aggregative">quapy.method.non_aggregative module</a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy-method-composable-module">quapy.method.composable module</a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#module-quapy.method">Module contents</a></li>
</ul>
</li>
@@ -175,6 +176,7 @@
<li class="toctree-l3"><a class="reference internal" href="quapy.html#quapy.functional.solve_adjustment_binary"><code class="docutils literal notranslate"><span class="pre">solve_adjustment_binary()</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.html#quapy.functional.strprev"><code class="docutils literal notranslate"><span class="pre">strprev()</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.html#quapy.functional.ternary_search"><code class="docutils literal notranslate"><span class="pre">ternary_search()</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.html#quapy.functional.uniform_prevalence"><code class="docutils literal notranslate"><span class="pre">uniform_prevalence()</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.html#quapy.functional.uniform_prevalence_sampling"><code class="docutils literal notranslate"><span class="pre">uniform_prevalence_sampling()</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.html#quapy.functional.uniform_simplex_sampling"><code class="docutils literal notranslate"><span class="pre">uniform_simplex_sampling()</span></code></a></li>
</ul>

Binary file not shown.

View File

@@ -46,6 +46,15 @@
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<ul>
<li class="toctree-l1"><a class="reference internal" href="wiki/Datasets.html">Datasets</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Evaluation.html">Evaluation</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/ExplicitLossMinimization.html">Explicit Loss Minimization</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Methods.html">Quantification Methods</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Model-Selection.html">Model Selection</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Plotting.html">Plotting</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Protocols.html">Protocols</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="modules.html">quapy</a></li>
</ul>
@@ -184,6 +193,11 @@
<td>&#160;&#160;&#160;
<a href="quapy.method.html#module-quapy.method.base"><code class="xref">quapy.method.base</code></a></td><td>
<em></em></td></tr>
<tr class="cg-1">
<td></td>
<td>&#160;&#160;&#160;
<a href="quapy.method.html#module-quapy.method.composable"><code class="xref">quapy.method.composable</code></a></td><td>
<em></em></td></tr>
<tr class="cg-1">
<td></td>
<td>&#160;&#160;&#160;

View File

@@ -213,7 +213,7 @@ See <a class="reference internal" href="#quapy.data.base.LabelledCollection.load
<dl class="py method">
<dt class="sig sig-object py" id="quapy.data.base.Dataset.reduce">
<span class="sig-name descname"><span class="pre">reduce</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">n_train</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_test</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/data/base.html#Dataset.reduce"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.data.base.Dataset.reduce" title="Permalink to this definition"></a></dt>
<span class="sig-name descname"><span class="pre">reduce</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">n_train</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_test</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">100</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/data/base.html#Dataset.reduce"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.data.base.Dataset.reduce" title="Permalink to this definition"></a></dt>
<dd><p>Reduce the number of instances in place for quick experiments. Preserves the prevalence of each set.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
@@ -448,8 +448,7 @@ as listed by <cite>self.classes_</cite></p>
<dt class="sig sig-object py" id="quapy.data.base.LabelledCollection.sampling">
<span class="sig-name descname"><span class="pre">sampling</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">size</span></span></em>, <em class="sig-param"><span class="o"><span class="pre">*</span></span><span class="n"><span class="pre">prevs</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">shuffle</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/data/base.html#LabelledCollection.sampling"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.data.base.LabelledCollection.sampling" title="Permalink to this definition"></a></dt>
<dd><p>Return a random sample (an instance of <a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">LabelledCollection</span></code></a>) of desired size and desired prevalence
values. For each class, the sampling is drawn without replacement if the requested prevalence is larger than
the actual prevalence of the class, or with replacement otherwise.</p>
values. For each class, the sampling is drawn with replacement.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
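The with-replacement behaviour documented in this hunk can be sketched in plain NumPy. This is a minimal illustration, not QuaPy's actual implementation; the helper name and the rounding of per-class counts are assumptions:

```python
import numpy as np

def sampling_index(labels, size, prevs, random_state=None):
    # Hypothetical sketch of the documented behaviour: build an index that
    # realizes the requested prevalence values, drawing WITH replacement
    # within each class (so prevalences above the true class frequency work).
    rng = np.random.default_rng(random_state)
    classes = np.unique(labels)
    counts = [int(round(size * p)) for p in prevs[:-1]]
    counts.append(size - sum(counts))  # last class takes the remainder
    index = np.concatenate([
        rng.choice(np.flatnonzero(labels == c), size=n, replace=True)
        for c, n in zip(classes, counts)
    ])
    rng.shuffle(index)
    return index
```

With only 5 positive instances available, requesting a prevalence of 0.75 in a sample of 20 is only satisfiable because the per-class draw is with replacement.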
@@ -488,8 +487,7 @@ index.</p>
<span class="sig-name descname"><span class="pre">sampling_index</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">size</span></span></em>, <em class="sig-param"><span class="o"><span class="pre">*</span></span><span class="n"><span class="pre">prevs</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">shuffle</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/data/base.html#LabelledCollection.sampling_index"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.data.base.LabelledCollection.sampling_index" title="Permalink to this definition"></a></dt>
<dd><p>Returns an index to be used to extract a random sample of desired size and desired prevalence values. If the
prevalence values are not specified, then returns the index of a uniform sampling.
For each class, the sampling is drawn with replacement if the requested prevalence is larger than
the actual prevalence of the class, or without replacement otherwise.</p>
For each class, the sampling is drawn with replacement.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
@@ -575,8 +573,7 @@ values for each class)</p>
<dt class="sig sig-object py" id="quapy.data.base.LabelledCollection.uniform_sampling">
<span class="sig-name descname"><span class="pre">uniform_sampling</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">size</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/data/base.html#LabelledCollection.uniform_sampling"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.data.base.LabelledCollection.uniform_sampling" title="Permalink to this definition"></a></dt>
<dd><p>Returns a uniform sample (an instance of <a class="reference internal" href="#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">LabelledCollection</span></code></a>) of desired size. The sampling is drawn
with replacement if the requested size is greater than the number of instances, or without replacement
otherwise.</p>
with replacement.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
@@ -594,8 +591,7 @@ otherwise.</p>
<dt class="sig sig-object py" id="quapy.data.base.LabelledCollection.uniform_sampling_index">
<span class="sig-name descname"><span class="pre">uniform_sampling_index</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">size</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/data/base.html#LabelledCollection.uniform_sampling_index"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.data.base.LabelledCollection.uniform_sampling_index" title="Permalink to this definition"></a></dt>
<dd><p>Returns an index to be used to extract a uniform sample of desired size. The sampling is drawn
with replacement if the requested size is greater than the number of instances, or without replacement
otherwise.</p>
with replacement.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">

View File

@@ -119,6 +119,7 @@
<li class="toctree-l4"><a class="reference internal" href="#quapy.functional.solve_adjustment_binary"><code class="docutils literal notranslate"><span class="pre">solve_adjustment_binary()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="#quapy.functional.strprev"><code class="docutils literal notranslate"><span class="pre">strprev()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="#quapy.functional.ternary_search"><code class="docutils literal notranslate"><span class="pre">ternary_search()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="#quapy.functional.uniform_prevalence"><code class="docutils literal notranslate"><span class="pre">uniform_prevalence()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="#quapy.functional.uniform_prevalence_sampling"><code class="docutils literal notranslate"><span class="pre">uniform_prevalence_sampling()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="#quapy.functional.uniform_simplex_sampling"><code class="docutils literal notranslate"><span class="pre">uniform_simplex_sampling()</span></code></a></li>
</ul>
@@ -632,6 +633,62 @@
</li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="quapy.method.html#quapy-method-composable-module">quapy.method.composable module</a><ul>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.BlobelLoss"><code class="docutils literal notranslate"><span class="pre">BlobelLoss</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.CVClassifier"><code class="docutils literal notranslate"><span class="pre">CVClassifier</span></code></a><ul>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.CVClassifier.fit"><code class="docutils literal notranslate"><span class="pre">CVClassifier.fit()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.CVClassifier.predict"><code class="docutils literal notranslate"><span class="pre">CVClassifier.predict()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.CVClassifier.predict_proba"><code class="docutils literal notranslate"><span class="pre">CVClassifier.predict_proba()</span></code></a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.ClassTransformer"><code class="docutils literal notranslate"><span class="pre">ClassTransformer</span></code></a><ul>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.ClassTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">ClassTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.ClassTransformer.transform"><code class="docutils literal notranslate"><span class="pre">ClassTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.CombinedLoss"><code class="docutils literal notranslate"><span class="pre">CombinedLoss</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.ComposableQuantifier"><code class="docutils literal notranslate"><span class="pre">ComposableQuantifier()</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.DistanceTransformer"><code class="docutils literal notranslate"><span class="pre">DistanceTransformer</span></code></a><ul>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.DistanceTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">DistanceTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.DistanceTransformer.transform"><code class="docutils literal notranslate"><span class="pre">DistanceTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.EnergyKernelTransformer"><code class="docutils literal notranslate"><span class="pre">EnergyKernelTransformer</span></code></a><ul>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.EnergyKernelTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">EnergyKernelTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.EnergyKernelTransformer.transform"><code class="docutils literal notranslate"><span class="pre">EnergyKernelTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.EnergyLoss"><code class="docutils literal notranslate"><span class="pre">EnergyLoss</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianKernelTransformer"><code class="docutils literal notranslate"><span class="pre">GaussianKernelTransformer</span></code></a><ul>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianKernelTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">GaussianKernelTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianKernelTransformer.transform"><code class="docutils literal notranslate"><span class="pre">GaussianKernelTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianRFFKernelTransformer"><code class="docutils literal notranslate"><span class="pre">GaussianRFFKernelTransformer</span></code></a><ul>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianRFFKernelTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">GaussianRFFKernelTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.GaussianRFFKernelTransformer.transform"><code class="docutils literal notranslate"><span class="pre">GaussianRFFKernelTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.HellingerSurrogateLoss"><code class="docutils literal notranslate"><span class="pre">HellingerSurrogateLoss</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.HistogramTransformer"><code class="docutils literal notranslate"><span class="pre">HistogramTransformer</span></code></a><ul>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.HistogramTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">HistogramTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.HistogramTransformer.transform"><code class="docutils literal notranslate"><span class="pre">HistogramTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.KernelTransformer"><code class="docutils literal notranslate"><span class="pre">KernelTransformer</span></code></a><ul>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.KernelTransformer.fit_transform"><code class="docutils literal notranslate"><span class="pre">KernelTransformer.fit_transform()</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.KernelTransformer.transform"><code class="docutils literal notranslate"><span class="pre">KernelTransformer.transform()</span></code></a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.LaplacianKernelTransformer"><code class="docutils literal notranslate"><span class="pre">LaplacianKernelTransformer</span></code></a><ul>
<li class="toctree-l4"><a class="reference internal" href="quapy.method.html#quapy.method.composable.LaplacianKernelTransformer.kernel"><code class="docutils literal notranslate"><span class="pre">LaplacianKernelTransformer.kernel</span></code></a></li>
</ul>
</li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.LeastSquaresLoss"><code class="docutils literal notranslate"><span class="pre">LeastSquaresLoss</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.TikhonovRegularization"><code class="docutils literal notranslate"><span class="pre">TikhonovRegularization</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="quapy.method.html#quapy.method.composable.TikhonovRegularized"><code class="docutils literal notranslate"><span class="pre">TikhonovRegularized()</span></code></a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="quapy.method.html#module-quapy.method">Module contents</a></li>
</ul>
</li>
@@ -1534,7 +1591,16 @@ then it returns one such bool for each prevalence vector</p>
<dl class="py function">
<dt class="sig sig-object py" id="quapy.functional.condsoftmax">
<span class="sig-prename descclassname"><span class="pre">quapy.functional.</span></span><span class="sig-name descname"><span class="pre">condsoftmax</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">prevalences</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">_SupportsArray</span><span class="p"><span class="pre">[</span></span><span class="pre">dtype</span><span class="p"><span class="pre">]</span></span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">_NestedSequence</span><span class="p"><span class="pre">[</span></span><span class="pre">_SupportsArray</span><span class="p"><span class="pre">[</span></span><span class="pre">dtype</span><span class="p"><span class="pre">]</span></span><span class="p"><span class="pre">]</span></span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">bool</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">int</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">float</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">complex</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">str</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">bytes</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">_NestedSequence</span><span class="p"><span class="pre">[</span></span><span class="pre">Union</span><span class="p"><span class="pre">[</span></span><span class="pre">bool</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">int</span><span class="p"><span 
class="pre">,</span></span><span class="w"> </span><span class="pre">float</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">complex</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">str</span><span class="p"><span class="pre">,</span></span><span class="w"> </span><span class="pre">bytes</span><span class="p"><span class="pre">]</span></span><span class="p"><span class="pre">]</span></span><span class="p"><span class="pre">]</span></span></span></em><span class="sig-paren">)</span> <span class="sig-return"><span class="sig-return-icon">&#x2192;</span> <span class="sig-return-typehint"><span class="pre">ndarray</span></span></span><a class="reference internal" href="_modules/quapy/functional.html#condsoftmax"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.functional.condsoftmax" title="Permalink to this definition"></a></dt>
<dd></dd></dl>
<dd><p>Applies the softmax function only to vectors that do not represent valid distributions.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><p><strong>prevalences</strong> – array-like of shape <cite>(n_classes,)</cite> or of shape <cite>(n_samples, n_classes,)</cite> with prevalence values</p>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>np.ndarray representing a valid distribution</p>
</dd>
</dl>
</dd></dl>
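The behaviour described for condsoftmax can be illustrated with a small NumPy sketch. The validity check is an assumption inferred from the docstring (a row is "valid" when it is non-negative and sums to 1); QuaPy's actual implementation may differ:

```python
import numpy as np

def condsoftmax_sketch(prevalences):
    # Rows that already are valid distributions pass through unchanged;
    # every other row is mapped through a numerically stable softmax.
    p = np.atleast_2d(np.asarray(prevalences, dtype=float))
    valid = np.isclose(p.sum(axis=1), 1.0) & (p >= 0).all(axis=1)
    out = p.copy()
    if (~valid).any():
        z = p[~valid] - p[~valid].max(axis=1, keepdims=True)  # stability shift
        e = np.exp(z)
        out[~valid] = e / e.sum(axis=1, keepdims=True)
    return out
```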
<dl class="py function">
<dt class="sig sig-object py" id="quapy.functional.counts_from_labels">
@@ -1882,6 +1948,20 @@ positive class <cite>p</cite> comes down to computing:</p>
<span class="sig-prename descclassname"><span class="pre">quapy.functional.</span></span><span class="sig-name descname"><span class="pre">ternary_search</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">loss</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">Callable</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_classes</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">int</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/functional.html#ternary_search"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.functional.ternary_search" title="Permalink to this definition"></a></dt>
<dd></dd></dl>
<dl class="py function">
<dt class="sig sig-object py" id="quapy.functional.uniform_prevalence">
<span class="sig-prename descclassname"><span class="pre">quapy.functional.</span></span><span class="sig-name descname"><span class="pre">uniform_prevalence</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">n_classes</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/functional.html#uniform_prevalence"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.functional.uniform_prevalence" title="Permalink to this definition"></a></dt>
<dd><p>Returns a vector representing the uniform distribution for <cite>n_classes</cite></p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><p><strong>n_classes</strong> – number of classes</p>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>np.ndarray with all values 1/n_classes</p>
</dd>
</dl>
</dd></dl>
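For reference, the documented behaviour of uniform_prevalence is a one-liner, and its sampling counterpart (uniform_prevalence_sampling, listed just below) can be sketched with the standard sorted-cut-points method for drawing uniformly from the probability simplex. The sampling sketch is an assumption; QuaPy may implement it differently:

```python
import numpy as np

def uniform_prevalence(n_classes):
    # all values equal to 1/n_classes, as documented above
    return np.full(n_classes, 1.0 / n_classes)

def uniform_prevalence_sampling(n_classes, size=1):
    # sorted uniform cut points in [0, 1]; consecutive differences give
    # points distributed uniformly over the probability simplex
    cuts = np.sort(np.random.rand(size, n_classes - 1), axis=1)
    edges = np.hstack([np.zeros((size, 1)), cuts, np.ones((size, 1))])
    return np.diff(edges, axis=1)
```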
<dl class="py function">
<dt class="sig sig-object py" id="quapy.functional.uniform_prevalence_sampling">
<span class="sig-prename descclassname"><span class="pre">quapy.functional.</span></span><span class="sig-name descname"><span class="pre">uniform_prevalence_sampling</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">n_classes</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">int</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">size</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">int</span></span><span class="w"> </span><span class="o"><span class="pre">=</span></span><span class="w"> </span><span class="default_value"><span class="pre">1</span></span></em><span class="sig-paren">)</span> <span class="sig-return"><span class="sig-return-icon">&#x2192;</span> <span class="sig-return-typehint"><span class="pre">ndarray</span></span></span><a class="reference internal" href="_modules/quapy/functional.html#uniform_prevalence_sampling"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.functional.uniform_prevalence_sampling" title="Permalink to this definition"></a></dt>

View File

@@ -458,6 +458,13 @@ non-probabilistic quantifiers. The default one is “decision_function”.</p>
<li><p><strong>data</strong> – a <a class="reference internal" href="quapy.data.html#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.data.base.LabelledCollection</span></code></a> consisting of the training data</p></li>
<li><p><strong>fit_classifier</strong> – whether to train the learner (default is True). Set to False if the
learner has been trained outside the quantifier.</p></li>
<li><p><strong>val_split</strong> – specifies the data used for generating classifier predictions. This specification
can be made as a float in (0, 1), indicating the proportion of a stratified held-out validation set to
be extracted from the training set; as an integer (default 5), indicating that the predictions
are to be generated in a <cite>k</cite>-fold cross-validation manner (with this integer indicating the value
of <cite>k</cite>); or as a collection defining the specific set of data to use for validation.
Alternatively, this set can be specified at fit time by indicating the exact set of data
on which the predictions are to be generated.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
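The three accepted forms of val_split described in this hunk can be summarized with a small dispatch sketch. This is a hypothetical helper, not QuaPy code; it only illustrates how each type of value is interpreted:

```python
def resolve_val_split(val_split, n_train):
    # float in (0, 1): fraction of the training set held out for validation
    if isinstance(val_split, float) and 0.0 < val_split < 1.0:
        return ("held-out", int(round(n_train * val_split)))
    # integer k: predictions generated via k-fold cross-validation
    if isinstance(val_split, int):
        return ("k-fold", val_split)
    # anything else: an explicit collection of validation data
    return ("explicit collection", val_split)
```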
@@ -587,6 +594,13 @@ as instances, the label predictions issued by the classifier and, as labels, the
<li><p><strong>data</strong> – a <a class="reference internal" href="quapy.data.html#quapy.data.base.LabelledCollection" title="quapy.data.base.LabelledCollection"><code class="xref py py-class docutils literal notranslate"><span class="pre">quapy.data.base.LabelledCollection</span></code></a> consisting of the training data</p></li>
<li><p><strong>fit_classifier</strong> – whether to train the learner (default is True). Set to False if the
learner has been trained outside the quantifier.</p></li>
<li><p><strong>val_split</strong> – specifies the data used for generating classifier predictions. This specification
can be made as a float in (0, 1), indicating the proportion of a stratified held-out validation set to
be extracted from the training set; as an integer (default 5), indicating that the predictions
are to be generated in a <cite>k</cite>-fold cross-validation manner (with this integer indicating the value
of <cite>k</cite>); or as a collection defining the specific set of data to use for validation.
Alternatively, this set can be specified at fit time by indicating the exact set of data
on which the predictions are to be generated.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
@@ -2085,7 +2099,7 @@ This function should return the (float) score to be minimized.</p>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method._threshold_optim.ThresholdOptimization">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method._threshold_optim.</span></span><span class="sig-name descname"><span class="pre">ThresholdOptimization</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">classifier</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">BaseEstimator</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">val_split</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">5</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_jobs</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/method/_threshold_optim.html#ThresholdOptimization"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method._threshold_optim.ThresholdOptimization" title="Permalink to this definition"></a></dt>
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method._threshold_optim.</span></span><span class="sig-name descname"><span class="pre">ThresholdOptimization</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">classifier</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">BaseEstimator</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">val_split</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_jobs</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/method/_threshold_optim.html#ThresholdOptimization"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method._threshold_optim.ThresholdOptimization" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <a class="reference internal" href="#quapy.method.aggregative.BinaryAggregativeQuantifier" title="quapy.method.aggregative.BinaryAggregativeQuantifier"><code class="xref py py-class docutils literal notranslate"><span class="pre">BinaryAggregativeQuantifier</span></code></a></p>
<p>Abstract class of Threshold Optimization variants for <code class="xref py py-class docutils literal notranslate"><span class="pre">ACC</span></code> as proposed by
<a class="reference external" href="https://dl.acm.org/doi/abs/10.1145/1150402.1150423">Forman 2006</a> and
@@ -2276,7 +2290,7 @@ This function should return the (float) score to be minimized.</p>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.base.OneVsAllGeneric">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.base.</span></span><span class="sig-name descname"><span class="pre">OneVsAllGeneric</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">binary_quantifier</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_jobs</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/method/base.html#OneVsAllGeneric"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.base.OneVsAllGeneric" title="Permalink to this definition"></a></dt>
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.base.</span></span><span class="sig-name descname"><span class="pre">OneVsAllGeneric</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">binary_quantifier</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><span class="pre">BaseQuantifier</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_jobs</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/method/base.html#OneVsAllGeneric"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.base.OneVsAllGeneric" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <a class="reference internal" href="#quapy.method.base.OneVsAll" title="quapy.method.base.OneVsAll"><code class="xref py py-class docutils literal notranslate"><span class="pre">OneVsAll</span></code></a>, <a class="reference internal" href="#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><code class="xref py py-class docutils literal notranslate"><span class="pre">BaseQuantifier</span></code></a></p>
<p>Allows any binary quantifier to perform quantification on single-label datasets. The method maintains one binary
quantifier for each class, and then l1-normalizes the outputs so that the class prevalence values sum up to 1.</p>
@@ -2317,7 +2331,7 @@ quantifier for each class, and then l1-normalizes the outputs so that the class
<dl class="py function">
<dt class="sig sig-object py" id="quapy.method.base.newOneVsAll">
<span class="sig-prename descclassname"><span class="pre">quapy.method.base.</span></span><span class="sig-name descname"><span class="pre">newOneVsAll</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">binary_quantifier</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_jobs</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/method/base.html#newOneVsAll"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.base.newOneVsAll" title="Permalink to this definition"></a></dt>
<span class="sig-prename descclassname"><span class="pre">quapy.method.base.</span></span><span class="sig-name descname"><span class="pre">newOneVsAll</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">binary_quantifier</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><a class="reference internal" href="#quapy.method.base.BaseQuantifier" title="quapy.method.base.BaseQuantifier"><span class="pre">BaseQuantifier</span></a></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_jobs</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/method/base.html#newOneVsAll"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.base.newOneVsAll" title="Permalink to this definition"></a></dt>
<dd></dd></dl>
</section>
@@ -2996,6 +3010,595 @@ any quantification method should beat.</p>
</dd></dl>
</section>
<section id="quapy-method-composable-module">
<h2>quapy.method.composable module<a class="headerlink" href="#quapy-method-composable-module" title="Permalink to this heading"></a></h2>
<span class="target" id="module-quapy.method.composable"></span><p>This module allows the composition of quantification methods from loss functions and feature transformations. This functionality is realized through an integration of the qunfold package: <a class="reference external" href="https://github.com/mirkobunse/qunfold">https://github.com/mirkobunse/qunfold</a>.</p>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.BlobelLoss">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">BlobelLoss</span></span><a class="reference internal" href="_modules/qunfold/losses.html#BlobelLoss"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.BlobelLoss" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">FunctionLoss</span></code></p>
<p>The loss function of RUN (Blobel, 1985).</p>
<p>This loss function models a likelihood function under the assumption of independent Poisson-distributed elements of <cite>q</cite> with Poisson rates <cite>M*p</cite>.</p>
</dd></dl>
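<p>The likelihood model behind this loss can be sketched in a few lines of numpy (an illustrative sketch of the math, not the qunfold implementation; the names <cite>blobel_loss</cite>, <cite>M</cite>, <cite>p</cite>, and <cite>q</cite> are chosen here for exposition):</p>

```python
import numpy as np

def blobel_loss(p, M, q):
    """Negative Poisson log-likelihood (up to an additive constant), as in
    RUN (Blobel, 1985): each element of q is assumed to be Poisson-distributed
    with rate (M @ p)."""
    rates = M @ p
    return np.sum(rates - q * np.log(rates))

# toy example: q was generated exactly from M @ p_true
M = np.array([[0.8, 0.3], [0.2, 0.7]])   # column-stochastic transfer matrix
p_true = np.array([0.6, 0.4])
q = M @ p_true

# the loss is minimized at the true prevalence vector
assert blobel_loss(p_true, M, q) < blobel_loss(np.array([0.4, 0.6]), M, q)
```

<p>At <cite>p_true</cite> the Poisson rates coincide with <cite>q</cite>, which is where the negative log-likelihood attains its minimum.</p>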
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.CVClassifier">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">CVClassifier</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">estimator</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_estimators</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">5</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">random_state</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/sklearn.html#CVClassifier"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.CVClassifier" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">BaseEstimator</span></code>, <code class="xref py py-class docutils literal notranslate"><span class="pre">ClassifierMixin</span></code></p>
<p>An ensemble of classifiers that are trained from cross-validation folds.</p>
<p>All objects of this type have a fixed attribute <cite>oob_score = True</cite> and, when trained, a fitted attribute <cite>self.oob_decision_function_</cite>, just like scikit-learn bagging classifiers.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>estimator</strong> A classifier that implements the API of scikit-learn.</p></li>
<li><p><strong>n_estimators</strong> (<em>optional</em>) The number of stratified cross-validation folds. Defaults to <cite>5</cite>.</p></li>
<li><p><strong>random_state</strong> (<em>optional</em>) The random state for stratification. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
</dl>
<p class="rubric">Examples</p>
<p>Here, we create an instance of ACC that trains a logistic regression classifier with 10 cross-validation folds.</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">ACC</span><span class="p">(</span><span class="n">CVClassifier</span><span class="p">(</span><span class="n">LogisticRegression</span><span class="p">(),</span> <span class="mi">10</span><span class="p">))</span>
</pre></div>
</div>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.CVClassifier.fit">
<span class="sig-name descname"><span class="pre">fit</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">y</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/sklearn.html#CVClassifier.fit"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.CVClassifier.fit" title="Permalink to this definition"></a></dt>
<dd></dd></dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.CVClassifier.predict">
<span class="sig-name descname"><span class="pre">predict</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/sklearn.html#CVClassifier.predict"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.CVClassifier.predict" title="Permalink to this definition"></a></dt>
<dd></dd></dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.CVClassifier.predict_proba">
<span class="sig-name descname"><span class="pre">predict_proba</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/sklearn.html#CVClassifier.predict_proba"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.CVClassifier.predict_proba" title="Permalink to this definition"></a></dt>
<dd></dd></dl>
</dd></dl>
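<p>The idea behind the fitted <cite>oob_decision_function_</cite> attribute can be illustrated with plain scikit-learn (an illustrative sketch only; <cite>cross_val_predict</cite> and <cite>make_classification</cite> are standard scikit-learn utilities, while <cite>CVClassifier</cite> additionally keeps the per-fold classifiers so it can predict on new data):</p>

```python
# Out-of-bag posterior estimates via stratified cross-validation: each
# training item receives a prediction from a fold that did not see it.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=200, random_state=0)
oob = cross_val_predict(
    LogisticRegression(), X, y, cv=10, method="predict_proba"
)
assert oob.shape == (200, 2)  # one posterior estimate per training item
```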
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.ClassTransformer">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">ClassTransformer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">classifier</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">is_probabilistic</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">fit_classifier</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#ClassTransformer"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.ClassTransformer" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractTransformer</span></code></p>
<p>A classification-based feature transformation.</p>
<p>This transformation can either be probabilistic (using the posterior predictions of a classifier) or crisp (using the class predictions of a classifier). It is used in ACC, PACC, CC, PCC, and SLD.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>classifier</strong> A classifier that implements the API of scikit-learn.</p></li>
<li><p><strong>is_probabilistic</strong> (<em>optional</em>) Whether probabilistic or crisp predictions of the <cite>classifier</cite> are used to transform the data. Defaults to <cite>False</cite>.</p></li>
<li><p><strong>fit_classifier</strong> (<em>optional</em>) Whether to fit the <cite>classifier</cite> when this quantifier is fitted. Defaults to <cite>True</cite>.</p></li>
</ul>
</dd>
</dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.ClassTransformer.fit_transform">
<span class="sig-name descname"><span class="pre">fit_transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">y</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_classes</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#ClassTransformer.fit_transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.ClassTransformer.fit_transform" title="Permalink to this definition"></a></dt>
<dd><p>This abstract method has to fit the transformer and to return the transformation of the input data.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Implementations of this abstract method should check the sanity of labels by calling <cite>_check_y(y, n_classes)</cite> and they must set the property <cite>self.p_trn = class_prevalences(y, n_classes)</cite>.</p>
</div>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix to which this transformer will be fitted.</p></li>
<li><p><strong>y</strong> The labels to which this transformer will be fitted.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a transfer matrix <cite>M</cite> or a transformation <cite>(f(X), y)</cite>. Defaults to <cite>True</cite>.</p></li>
<li><p><strong>n_classes</strong> (<em>optional</em>) The number of expected classes. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A transfer matrix <cite>M</cite> if <cite>average==True</cite> or a transformation <cite>(f(X), y)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.ClassTransformer.transform">
<span class="sig-name descname"><span class="pre">transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#ClassTransformer.transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.ClassTransformer.transform" title="Permalink to this definition"></a></dt>
<dd><p>This abstract method has to transform the data <cite>X</cite>.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix that will be transformed.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a vector <cite>q</cite> or a transformation <cite>f(X)</cite>. Defaults to <cite>True</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A vector <cite>q = f(X).mean(axis=0)</cite> if <cite>average==True</cite> or a transformation <cite>f(X)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
</dd></dl>
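<p>What the two averaging modes compute can be sketched with numpy, assuming crisp one-hot predictions (an illustrative sketch mirroring ACC's misclassification-rate matrix, not the qunfold implementation):</p>

```python
import numpy as np

y_true = np.array([0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1])      # crisp classifier predictions
f_X = np.eye(2)[y_pred]                 # f(X): one-hot rows

# fit_transform(average=True): column j of M is the mean of f(X) over class j
M = np.stack([f_X[y_true == j].mean(axis=0) for j in range(2)], axis=1)

# transform(average=True): q is the mean prediction over the sample
q = f_X.mean(axis=0)

assert np.allclose(M[:, 0], [2/3, 1/3])  # (mis)classification rates of class 0
assert np.allclose(q, [0.4, 0.6])
```

<p>The prevalence estimate is then the solution <cite>p</cite> of the (approximate) linear system <cite>q = M p</cite>, found by minimizing the chosen loss.</p>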
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.CombinedLoss">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">CombinedLoss</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="o"><span class="pre">*</span></span><span class="n"><span class="pre">losses</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">weights</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/losses.html#CombinedLoss"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.CombinedLoss" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractLoss</span></code></p>
<p>The weighted sum of multiple losses.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>*losses</strong> An arbitrary number of losses to be added together.</p></li>
<li><p><strong>weights</strong> – (<em>optional</em>) An array of weights by which the losses are scaled.</p></li>
</ul>
</dd>
</dl>
</dd></dl>
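<p>The weighted-sum semantics can be sketched with plain Python callables (an illustrative sketch; the actual loss classes operate on differentiable terms, not bare functions):</p>

```python
import numpy as np

# two toy component losses and their weights
losses = [lambda p: np.sum(p ** 2), lambda p: np.sum(np.abs(p))]
weights = [0.5, 2.0]

def combined(p):
    """Weighted sum of the component losses."""
    return sum(w * l(p) for w, l in zip(weights, losses))

p = np.array([0.3, 0.7])
assert np.isclose(combined(p), 0.5 * 0.58 + 2.0 * 1.0)
```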
<dl class="py function">
<dt class="sig sig-object py" id="quapy.method.composable.ComposableQuantifier">
<span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">ComposableQuantifier</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">loss</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">transformer</span></span></em>, <em class="sig-param"><span class="o"><span class="pre">**</span></span><span class="n"><span class="pre">kwargs</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/quapy/method/composable.html#ComposableQuantifier"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.ComposableQuantifier" title="Permalink to this definition"></a></dt>
<dd><p>A generic quantification / unfolding method that solves a linear system of equations.</p>
<p>This class represents any quantifier that can be described in terms of a loss function, a feature transformation, and a regularization term. In this implementation, the loss is minimized through unconstrained second-order minimization. Valid probability estimates are ensured through a soft-max trick by Bunse (2022).</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>loss</strong> – An instance of a loss class from <cite>quapy.method.composable</cite>.</p></li>
<li><p><strong>transformer</strong> – An instance of a transformer class from <cite>quapy.method.composable</cite>.</p></li>
<li><p><strong>solver</strong> (<em>optional</em>) The <cite>method</cite> argument in <cite>scipy.optimize.minimize</cite>. Defaults to <cite>“trust-ncg”</cite>.</p></li>
<li><p><strong>solver_options</strong> (<em>optional</em>) The <cite>options</cite> argument in <cite>scipy.optimize.minimize</cite>. Defaults to <cite>{“gtol”: 1e-8, “maxiter”: 1000}</cite>.</p></li>
<li><p><strong>seed</strong> (<em>optional</em>) A random number generator seed from which a numpy RandomState is created. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
</dl>
<p class="rubric">Examples</p>
<p>Here, we create the ordinal variant of ACC (Bunse et al., 2023). This variant consists of the original feature transformation of ACC and of the original loss of ACC, the latter of which is regularized towards smooth solutions.</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">quapy.method.composable</span> <span class="kn">import</span> <span class="p">(</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="n">ComposableQuantifier</span><span class="p">,</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="n">TikhonovRegularized</span><span class="p">,</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="n">LeastSquaresLoss</span><span class="p">,</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="n">ClassTransformer</span><span class="p">,</span>
<span class="gp">&gt;&gt;&gt; </span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">sklearn.ensemble</span> <span class="kn">import</span> <span class="n">RandomForestClassifier</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">o_acc</span> <span class="o">=</span> <span class="n">ComposableQuantifier</span><span class="p">(</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="n">TikhonovRegularized</span><span class="p">(</span><span class="n">LeastSquaresLoss</span><span class="p">(),</span> <span class="mf">0.01</span><span class="p">),</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="n">ClassTransformer</span><span class="p">(</span><span class="n">RandomForestClassifier</span><span class="p">(</span><span class="n">oob_score</span><span class="o">=</span><span class="kc">True</span><span class="p">))</span>
<span class="gp">&gt;&gt;&gt; </span><span class="p">)</span>
</pre></div>
</div>
<p>Here, we perform hyper-parameter optimization with the ordinal ACC.</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">quapy</span><span class="o">.</span><span class="n">model_selection</span><span class="o">.</span><span class="n">GridSearchQ</span><span class="p">(</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="n">model</span> <span class="o">=</span> <span class="n">o_acc</span><span class="p">,</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="n">param_grid</span> <span class="o">=</span> <span class="p">{</span> <span class="c1"># try both splitting criteria</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="s2">&quot;transformer__classifier__estimator__criterion&quot;</span><span class="p">:</span> <span class="p">[</span><span class="s2">&quot;gini&quot;</span><span class="p">,</span> <span class="s2">&quot;entropy&quot;</span><span class="p">],</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="p">},</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="c1"># ...</span>
<span class="gp">&gt;&gt;&gt; </span><span class="p">)</span>
</pre></div>
</div>
<p>To use a classifier that does not provide the <cite>oob_score</cite> argument, such as logistic regression, you have to configure a cross-validation of this classifier. Here, we employ 10 cross-validation folds; the default is 5.</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">quapy.method.composable</span> <span class="kn">import</span> <span class="n">CVClassifier</span>
<span class="gp">&gt;&gt;&gt; </span><span class="kn">from</span> <span class="nn">sklearn.linear_model</span> <span class="kn">import</span> <span class="n">LogisticRegression</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">acc_lr</span> <span class="o">=</span> <span class="n">ComposableQuantifier</span><span class="p">(</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="n">LeastSquaresLoss</span><span class="p">(),</span>
<span class="gp">&gt;&gt;&gt; </span> <span class="n">ClassTransformer</span><span class="p">(</span><span class="n">CVClassifier</span><span class="p">(</span><span class="n">LogisticRegression</span><span class="p">(),</span> <span class="mi">10</span><span class="p">))</span>
<span class="gp">&gt;&gt;&gt; </span><span class="p">)</span>
</pre></div>
</div>
</dd></dl>
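<p>The soft-max trick mentioned above can be sketched in numpy (an illustrative sketch; the exact parameterization, with one latent coordinate pinned to zero to remove redundancy, is my reading of Bunse (2022) and should be checked against the qunfold source):</p>

```python
import numpy as np

def softmax_trick(l):
    """Map an unconstrained latent vector l (length C-1) to a valid
    prevalence vector of length C, as used to avoid constrained optimization."""
    l = np.concatenate(([0.0], l))   # pin one coordinate to remove redundancy
    e = np.exp(l - l.max())          # numerically stable soft-max
    return e / e.sum()

p = softmax_trick(np.array([0.5, -1.0]))
assert np.isclose(p.sum(), 1.0) and np.all(p >= 0)  # always a valid prevalence
```

<p>Because every latent vector maps to a point on the probability simplex, the second-order minimization can run unconstrained.</p>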
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.DistanceTransformer">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">DistanceTransformer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">metric</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'euclidean'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">preprocessor</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#DistanceTransformer"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.DistanceTransformer" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractTransformer</span></code></p>
<p>A distance-based feature transformation, as it is used in <cite>EDx</cite> and <cite>EDy</cite>.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>metric</strong> (<em>optional</em>) The metric with which the distance between data items is measured. Can take any value that is accepted by <cite>scipy.spatial.distance.cdist</cite>. Defaults to <cite>“euclidean”</cite>.</p></li>
<li><p><strong>preprocessor</strong> (<em>optional</em>) Another <cite>AbstractTransformer</cite> that is called before this transformer. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
</dl>
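<p>The kind of feature this transformation produces can be sketched with scipy (an illustrative sketch of EDx-style features, assuming each item is represented by its mean distance to the training items of every class; not the qunfold implementation):</p>

```python
import numpy as np
from scipy.spatial.distance import cdist

X_trn = np.array([[0.0], [1.0], [10.0], [11.0]])
y_trn = np.array([0, 0, 1, 1])
X_tst = np.array([[0.5], [10.5]])

# pairwise distances between test and training items
D = cdist(X_tst, X_trn, metric="euclidean")

# feature j of an item: its mean distance to the training items of class j
f = np.stack([D[:, y_trn == j].mean(axis=1) for j in range(2)], axis=1)

assert f.shape == (2, 2)
assert f[0, 0] < f[0, 1]   # the first test item lies closer to class 0
```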
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.DistanceTransformer.fit_transform">
<span class="sig-name descname"><span class="pre">fit_transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">y</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_classes</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#DistanceTransformer.fit_transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.DistanceTransformer.fit_transform" title="Permalink to this definition"></a></dt>
<dd><p>This abstract method has to fit the transformer and to return the transformation of the input data.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Implementations of this abstract method should check the sanity of labels by calling <cite>_check_y(y, n_classes)</cite> and they must set the property <cite>self.p_trn = class_prevalences(y, n_classes)</cite>.</p>
</div>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix to which this transformer will be fitted.</p></li>
<li><p><strong>y</strong> The labels to which this transformer will be fitted.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a transfer matrix <cite>M</cite> or a transformation <cite>(f(X), y)</cite>. Defaults to <cite>True</cite>.</p></li>
<li><p><strong>n_classes</strong> (<em>optional</em>) The number of expected classes. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A transfer matrix <cite>M</cite> if <cite>average==True</cite> or a transformation <cite>(f(X), y)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.DistanceTransformer.transform">
<span class="sig-name descname"><span class="pre">transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#DistanceTransformer.transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.DistanceTransformer.transform" title="Permalink to this definition"></a></dt>
<dd><p>Transforms the data <cite>X</cite>.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix that will be transformed.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a vector <cite>q</cite> or a transformation <cite>f(X)</cite>. Defaults to <cite>True</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A vector <cite>q = f(X).mean(axis=0)</cite> if <cite>average==True</cite> or a transformation <cite>f(X)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
</dd></dl>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.EnergyKernelTransformer">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">EnergyKernelTransformer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">preprocessor</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#EnergyKernelTransformer"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.EnergyKernelTransformer" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractTransformer</span></code></p>
<p>A kernel-based feature transformation, as it is used in <cite>KMM</cite>, that uses the <cite>energy</cite> kernel:</p>
<blockquote>
<div><p>k(x_1, x_2) = ||x_1|| + ||x_2|| - ||x_1 - x_2||</p>
</div></blockquote>
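<p>The kernel above can be computed directly with NumPy; the following is a minimal illustrative sketch (the function name is hypothetical, not part of the quapy/qunfold API):</p>

```python
import numpy as np

def energy_kernel(x_1, x_2):
    """Energy kernel k(x_1, x_2) = ||x_1|| + ||x_2|| - ||x_1 - x_2||."""
    return (
        np.linalg.norm(x_1)
        + np.linalg.norm(x_2)
        - np.linalg.norm(x_1 - x_2)
    )
```

<p>For identical inputs, the distance term vanishes and the kernel reduces to twice the norm of the input.</p>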
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>The methods of this transformer do not support setting <cite>average=False</cite>.</p>
</div>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><p><strong>preprocessor</strong> (<em>optional</em>) Another <cite>AbstractTransformer</cite> that is called before this transformer. Defaults to <cite>None</cite>.</p>
</dd>
</dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.EnergyKernelTransformer.fit_transform">
<span class="sig-name descname"><span class="pre">fit_transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">y</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_classes</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#EnergyKernelTransformer.fit_transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.EnergyKernelTransformer.fit_transform" title="Permalink to this definition"></a></dt>
<dd><p>Fits the transformer and returns the transformation of the input data.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Implementations of this abstract method should check the sanity of labels by calling <cite>_check_y(y, n_classes)</cite> and they must set the property <cite>self.p_trn = class_prevalences(y, n_classes)</cite>.</p>
</div>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix to which this transformer will be fitted.</p></li>
<li><p><strong>y</strong> The labels to which this transformer will be fitted.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a transfer matrix <cite>M</cite> or a transformation <cite>(f(X), y)</cite>. Defaults to <cite>True</cite>.</p></li>
<li><p><strong>n_classes</strong> (<em>optional</em>) The number of expected classes. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A transfer matrix <cite>M</cite> if <cite>average==True</cite> or a transformation <cite>(f(X), y)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.EnergyKernelTransformer.transform">
<span class="sig-name descname"><span class="pre">transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#EnergyKernelTransformer.transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.EnergyKernelTransformer.transform" title="Permalink to this definition"></a></dt>
<dd><p>Transforms the data <cite>X</cite>.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix that will be transformed.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a vector <cite>q</cite> or a transformation <cite>f(X)</cite>. Defaults to <cite>True</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A vector <cite>q = f(X).mean(axis=0)</cite> if <cite>average==True</cite> or a transformation <cite>f(X)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
</dd></dl>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.EnergyLoss">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">EnergyLoss</span></span><a class="reference internal" href="_modules/qunfold/losses.html#EnergyLoss"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.EnergyLoss" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">FunctionLoss</span></code></p>
<p>The loss function of EDx (Kawakubo et al., 2016) and EDy (Castaño et al., 2022).</p>
<p>This loss function computes the energy distance between two samples.</p>
</dd></dl>
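<p>For intuition, the energy distance between two samples can be sketched in plain NumPy as the V-statistic 2·E‖x−y‖ − E‖x−x′‖ − E‖y−y′‖; this is an illustrative function, not the implementation used by the library:</p>

```python
import numpy as np

def energy_distance(X, Y):
    """Energy distance between two samples X and Y (rows are items)."""
    def mean_dist(A, B):
        # mean Euclidean distance over all pairs of rows from A and B
        return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1).mean()
    return 2 * mean_dist(X, Y) - mean_dist(X, X) - mean_dist(Y, Y)
```

<p>The distance is zero for identical samples and grows as the two samples separate.</p>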
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.GaussianKernelTransformer">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">GaussianKernelTransformer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">sigma</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">1</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">preprocessor</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#GaussianKernelTransformer"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.GaussianKernelTransformer" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractTransformer</span></code></p>
<p>A kernel-based feature transformation, as used in <cite>KMM</cite>, based on the <cite>gaussian</cite> kernel:</p>
<blockquote>
<div><p>k(x, y) = exp(-||x - y||^2 / (2σ^2))</p>
</div></blockquote>
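<p>The full kernel matrix over two samples follows directly from this formula; the sketch below is for illustration only (the function name is hypothetical, not part of the quapy/qunfold API):</p>

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||X[i] - Y[j]||^2 / (2*sigma^2))."""
    # squared Euclidean distances between all pairs of rows
    sq_dists = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1) ** 2
    return np.exp(-sq_dists / (2 * sigma**2))
```

<p>Each diagonal entry of <cite>gaussian_kernel(X, X)</cite> equals one, since the distance of a point to itself is zero.</p>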
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>sigma</strong> (<em>optional</em>) A smoothing parameter of the kernel function. Defaults to <cite>1</cite>.</p></li>
<li><p><strong>preprocessor</strong> (<em>optional</em>) Another <cite>AbstractTransformer</cite> that is called before this transformer. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
</dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.GaussianKernelTransformer.fit_transform">
<span class="sig-name descname"><span class="pre">fit_transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">y</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_classes</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#GaussianKernelTransformer.fit_transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.GaussianKernelTransformer.fit_transform" title="Permalink to this definition"></a></dt>
<dd><p>Fits the transformer and returns the transformation of the input data.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Implementations of this abstract method should check the sanity of labels by calling <cite>_check_y(y, n_classes)</cite> and they must set the property <cite>self.p_trn = class_prevalences(y, n_classes)</cite>.</p>
</div>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix to which this transformer will be fitted.</p></li>
<li><p><strong>y</strong> The labels to which this transformer will be fitted.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a transfer matrix <cite>M</cite> or a transformation <cite>(f(X), y)</cite>. Defaults to <cite>True</cite>.</p></li>
<li><p><strong>n_classes</strong> (<em>optional</em>) The number of expected classes. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A transfer matrix <cite>M</cite> if <cite>average==True</cite> or a transformation <cite>(f(X), y)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.GaussianKernelTransformer.transform">
<span class="sig-name descname"><span class="pre">transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#GaussianKernelTransformer.transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.GaussianKernelTransformer.transform" title="Permalink to this definition"></a></dt>
<dd><p>Transforms the data <cite>X</cite>.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix that will be transformed.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a vector <cite>q</cite> or a transformation <cite>f(X)</cite>. Defaults to <cite>True</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A vector <cite>q = f(X).mean(axis=0)</cite> if <cite>average==True</cite> or a transformation <cite>f(X)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
</dd></dl>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.GaussianRFFKernelTransformer">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">GaussianRFFKernelTransformer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">sigma</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">1</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_rff</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">1000</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">preprocessor</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">seed</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#GaussianRFFKernelTransformer"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.GaussianRFFKernelTransformer" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractTransformer</span></code></p>
<p>An efficient approximation of the <cite>GaussianKernelTransformer</cite>, as used in <cite>KMM</cite>, based on random Fourier features.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>sigma</strong> (<em>optional</em>) A smoothing parameter of the kernel function. Defaults to <cite>1</cite>.</p></li>
<li><p><strong>n_rff</strong> (<em>optional</em>) The number of random Fourier features. Defaults to <cite>1000</cite>.</p></li>
<li><p><strong>preprocessor</strong> (<em>optional</em>) Another <cite>AbstractTransformer</cite> that is called before this transformer. Defaults to <cite>None</cite>.</p></li>
<li><p><strong>seed</strong> (<em>optional</em>) Controls the randomness of the random Fourier features. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
</dl>
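<p>The idea behind random Fourier features can be sketched as follows: sample frequencies from the Gaussian kernel's spectral density, then map each item to a cosine feature vector whose inner products approximate the kernel. This is a textbook sketch under the stated assumptions, not the library's implementation:</p>

```python
import numpy as np

def gaussian_rff(X, sigma=1.0, n_rff=1000, seed=0):
    """Random Fourier features z(x) with z(x)@z(y) ~ exp(-||x-y||^2 / (2*sigma^2))."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # frequencies drawn from the spectral density of the Gaussian kernel
    W = rng.normal(scale=1.0 / sigma, size=(n_features, n_rff))
    # random phase offsets
    b = rng.uniform(0, 2 * np.pi, size=n_rff)
    return np.sqrt(2.0 / n_rff) * np.cos(X @ W + b)
```

<p>As <cite>n_rff</cite> grows, the inner product of two feature vectors converges to the exact Gaussian kernel value, at a cost linear rather than quadratic in the number of items.</p>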
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.GaussianRFFKernelTransformer.fit_transform">
<span class="sig-name descname"><span class="pre">fit_transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">y</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_classes</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#GaussianRFFKernelTransformer.fit_transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.GaussianRFFKernelTransformer.fit_transform" title="Permalink to this definition"></a></dt>
<dd><p>Fits the transformer and returns the transformation of the input data.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Implementations of this abstract method should check the sanity of labels by calling <cite>_check_y(y, n_classes)</cite> and they must set the property <cite>self.p_trn = class_prevalences(y, n_classes)</cite>.</p>
</div>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix to which this transformer will be fitted.</p></li>
<li><p><strong>y</strong> The labels to which this transformer will be fitted.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a transfer matrix <cite>M</cite> or a transformation <cite>(f(X), y)</cite>. Defaults to <cite>True</cite>.</p></li>
<li><p><strong>n_classes</strong> (<em>optional</em>) The number of expected classes. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A transfer matrix <cite>M</cite> if <cite>average==True</cite> or a transformation <cite>(f(X), y)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.GaussianRFFKernelTransformer.transform">
<span class="sig-name descname"><span class="pre">transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#GaussianRFFKernelTransformer.transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.GaussianRFFKernelTransformer.transform" title="Permalink to this definition"></a></dt>
<dd><p>Transforms the data <cite>X</cite>.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix that will be transformed.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a vector <cite>q</cite> or a transformation <cite>f(X)</cite>. Defaults to <cite>True</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A vector <cite>q = f(X).mean(axis=0)</cite> if <cite>average==True</cite> or a transformation <cite>f(X)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
</dd></dl>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.HellingerSurrogateLoss">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">HellingerSurrogateLoss</span></span><a class="reference internal" href="_modules/qunfold/losses.html#HellingerSurrogateLoss"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.HellingerSurrogateLoss" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">FunctionLoss</span></code></p>
<p>The loss function of HDx and HDy (González-Castro et al., 2013).</p>
<p>This loss function computes the average of the squared Hellinger distances between feature-wise (or class-wise) histograms. Note that the original HDx and HDy by González-Castro et al. (2013) use the regular, rather than the squared, Hellinger distance. The regular distance is problematic for numerical optimization because it is not everywhere twice differentiable.</p>
</dd></dl>
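<p>The squared Hellinger distance between two histograms is a simple closed-form expression; the following sketch (an illustration, not the library's implementation) shows the quantity this surrogate loss averages:</p>

```python
import numpy as np

def squared_hellinger(p, q):
    """Squared Hellinger distance between two histograms p and q."""
    # normalize to probability vectors
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    return 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)
```

<p>The value is 0 for identical histograms and 1 for histograms with disjoint support; omitting the outer square root of the regular Hellinger distance is what avoids the differentiability issue mentioned above.</p>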
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.HistogramTransformer">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">HistogramTransformer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">n_bins</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">preprocessor</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">unit_scale</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#HistogramTransformer"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.HistogramTransformer" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractTransformer</span></code></p>
<p>A histogram-based feature transformation, as used in <cite>HDx</cite> and <cite>HDy</cite>.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>n_bins</strong> The number of bins in each feature.</p></li>
<li><p><strong>preprocessor</strong> (<em>optional</em>) Another <cite>AbstractTransformer</cite> that is called before this transformer. Defaults to <cite>None</cite>.</p></li>
<li><p><strong>unit_scale</strong> (<em>optional</em>) Whether to scale each output to sum to one. If <cite>False</cite>, each output sums to the number of features. Defaults to <cite>True</cite>.</p></li>
</ul>
</dd>
</dl>
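<p>The idea of a histogram-based item transformation can be sketched as follows: each item is encoded by one-hot bin indicators per feature, so that averaging the encodings over a sample yields feature-wise histograms. The sketch assumes features scaled to [0, 1] and equal-width bins; it is an illustration of the idea, not the library's implementation:</p>

```python
import numpy as np

def histogram_encoding(X, n_bins, unit_scale=True):
    """One-hot bin indicators per feature, assuming features lie in [0, 1]."""
    bins = np.clip((X * n_bins).astype(int), 0, n_bins - 1)
    n, d = X.shape
    f = np.zeros((n, d * n_bins))
    for j in range(d):
        # mark the bin that feature j of each item falls into
        f[np.arange(n), j * n_bins + bins[:, j]] = 1.0
    if unit_scale:
        f /= d  # each row sums to one instead of to the number of features
    return f
```

<p>Averaging <cite>f</cite> over the rows of a sample produces the concatenated feature-wise histograms that distances such as the Hellinger surrogate operate on.</p>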
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.HistogramTransformer.fit_transform">
<span class="sig-name descname"><span class="pre">fit_transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">y</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_classes</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#HistogramTransformer.fit_transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.HistogramTransformer.fit_transform" title="Permalink to this definition"></a></dt>
<dd><p>Fits the transformer and returns the transformation of the input data.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Implementations of this abstract method should check the sanity of labels by calling <cite>_check_y(y, n_classes)</cite> and they must set the property <cite>self.p_trn = class_prevalences(y, n_classes)</cite>.</p>
</div>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix to which this transformer will be fitted.</p></li>
<li><p><strong>y</strong> The labels to which this transformer will be fitted.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a transfer matrix <cite>M</cite> or a transformation <cite>(f(X), y)</cite>. Defaults to <cite>True</cite>.</p></li>
<li><p><strong>n_classes</strong> (<em>optional</em>) The number of expected classes. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A transfer matrix <cite>M</cite> if <cite>average==True</cite> or a transformation <cite>(f(X), y)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.HistogramTransformer.transform">
<span class="sig-name descname"><span class="pre">transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#HistogramTransformer.transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.HistogramTransformer.transform" title="Permalink to this definition"></a></dt>
<dd><p>Transforms the data <cite>X</cite>.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix that will be transformed.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a vector <cite>q</cite> or a transformation <cite>f(X)</cite>. Defaults to <cite>True</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A vector <cite>q = f(X).mean(axis=0)</cite> if <cite>average==True</cite> or a transformation <cite>f(X)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
</dd></dl>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.KernelTransformer">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">KernelTransformer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">kernel</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#KernelTransformer"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.KernelTransformer" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractTransformer</span></code></p>
<p>A general kernel-based feature transformation, as used in <cite>KMM</cite>. If you intend to use a Gaussian or energy kernel, prefer their dedicated and more efficient implementations over this class.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>The methods of this transformer do not support setting <cite>average=False</cite>.</p>
</div>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><p><strong>kernel</strong> A callable that will be used as the kernel. Must follow the signature <cite>(X[y==i], X[y==j]) -&gt; scalar</cite>.</p>
</dd>
</dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.KernelTransformer.fit_transform">
<span class="sig-name descname"><span class="pre">fit_transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">y</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_classes</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#KernelTransformer.fit_transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.KernelTransformer.fit_transform" title="Permalink to this definition"></a></dt>
<dd><p>Fits the transformer and returns the transformation of the input data.</p>
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>Implementations of this abstract method should check the sanity of labels by calling <cite>_check_y(y, n_classes)</cite> and they must set the property <cite>self.p_trn = class_prevalences(y, n_classes)</cite>.</p>
</div>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix to which this transformer will be fitted.</p></li>
<li><p><strong>y</strong> The labels to which this transformer will be fitted.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a transfer matrix <cite>M</cite> or a transformation <cite>(f(X), y)</cite>. Defaults to <cite>True</cite>.</p></li>
<li><p><strong>n_classes</strong> (<em>optional</em>) The number of expected classes. Defaults to <cite>None</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A transfer matrix <cite>M</cite> if <cite>average==True</cite> or a transformation <cite>(f(X), y)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
<dl class="py method">
<dt class="sig sig-object py" id="quapy.method.composable.KernelTransformer.transform">
<span class="sig-name descname"><span class="pre">transform</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">X</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#KernelTransformer.transform"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.KernelTransformer.transform" title="Permalink to this definition"></a></dt>
<dd><p>This abstract method has to transform the data <cite>X</cite>.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>X</strong> The feature matrix that will be transformed.</p></li>
<li><p><strong>average</strong> (<em>optional</em>) Whether to return a vector <cite>q</cite> or a transformation <cite>f(X)</cite>. Defaults to <cite>True</cite>.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>A vector <cite>q = f(X).mean(axis=0)</cite> if <cite>average==True</cite> or a transformation <cite>f(X)</cite> if <cite>average==False</cite>.</p>
</dd>
</dl>
</dd></dl>
</dd></dl>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.LaplacianKernelTransformer">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">LaplacianKernelTransformer</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">sigma</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">1</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/transformers.html#LaplacianKernelTransformer"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.LaplacianKernelTransformer" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <a class="reference internal" href="#quapy.method.composable.KernelTransformer" title="qunfold.transformers.KernelTransformer"><code class="xref py py-class docutils literal notranslate"><span class="pre">KernelTransformer</span></code></a></p>
<p>A kernel-based feature transformation, as it is used in <cite>KMM</cite>, that uses the <cite>laplacian</cite> kernel.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><p><strong>sigma</strong> (<em>optional</em>) A smoothing parameter of the kernel function. Defaults to <cite>1</cite>.</p>
</dd>
</dl>
<dl class="py property">
<dt class="sig sig-object py" id="quapy.method.composable.LaplacianKernelTransformer.kernel">
<em class="property"><span class="pre">property</span><span class="w"> </span></em><span class="sig-name descname"><span class="pre">kernel</span></span><a class="headerlink" href="#quapy.method.composable.LaplacianKernelTransformer.kernel" title="Permalink to this definition"></a></dt>
<dd></dd></dl>
</dd></dl>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.LeastSquaresLoss">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">LeastSquaresLoss</span></span><a class="reference internal" href="_modules/qunfold/losses.html#LeastSquaresLoss"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.LeastSquaresLoss" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">FunctionLoss</span></code></p>
<p>The loss function of ACC (Forman, 2008), PACC (Bella et al., 2019), and ReadMe (Hopkins &amp; King, 2010).</p>
<p>This loss function computes the sum of squares of element-wise errors between <cite>q</cite> and <cite>M*p</cite>.</p>
</dd></dl>
<dl class="py class">
<dt class="sig sig-object py" id="quapy.method.composable.TikhonovRegularization">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">TikhonovRegularization</span></span><a class="reference internal" href="_modules/qunfold/losses.html#TikhonovRegularization"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.TikhonovRegularization" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <code class="xref py py-class docutils literal notranslate"><span class="pre">AbstractLoss</span></code></p>
<p>Tikhonov regularization, as proposed by Blobel (1985).</p>
<p>This regularization promotes smooth solutions. This behavior is often required in ordinal quantification and in unfolding problems.</p>
</dd></dl>
<dl class="py function">
<dt class="sig sig-object py" id="quapy.method.composable.TikhonovRegularized">
<span class="sig-prename descclassname"><span class="pre">quapy.method.composable.</span></span><span class="sig-name descname"><span class="pre">TikhonovRegularized</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">loss</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">tau</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">0.0</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="_modules/qunfold/losses.html#TikhonovRegularized"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#quapy.method.composable.TikhonovRegularized" title="Permalink to this definition"></a></dt>
<dd><p>Add TikhonovRegularization (Blobel, 1985) to any loss.</p>
<p>Calling this function is equivalent to calling</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">CombinedLoss</span><span class="p">(</span><span class="n">loss</span><span class="p">,</span> <span class="n">TikhonovRegularization</span><span class="p">(),</span> <span class="n">weights</span><span class="o">=</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="n">tau</span><span class="p">])</span>
</pre></div>
</div>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><ul class="simple">
<li><p><strong>loss</strong> An instance from <cite>qunfold.losses</cite>.</p></li>
<li><p><strong>tau</strong> (<em>optional</em>) The regularization strength. Defaults to 0.</p></li>
</ul>
</dd>
<dt class="field-even">Returns<span class="colon">:</span></dt>
<dd class="field-even"><p>An instance of <cite>CombinedLoss</cite>.</p>
</dd>
</dl>
<p class="rubric">Examples</p>
<p>The regularized loss of RUN (Blobel, 1985) is:</p>
<div class="doctest highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">TikhonovRegularization</span><span class="p">(</span><span class="n">BlobelLoss</span><span class="p">(),</span> <span class="n">tau</span><span class="p">)</span>
</pre></div>
</div>
</dd></dl>
</section>
<section id="module-quapy.method">
<span id="module-contents"></span><h2>Module contents<a class="headerlink" href="#module-quapy.method" title="Permalink to this heading"></a></h2>

View File

@ -46,6 +46,15 @@
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<ul>
<li class="toctree-l1"><a class="reference internal" href="wiki/Datasets.html">Datasets</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Evaluation.html">Evaluation</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/ExplicitLossMinimization.html">Explicit Loss Minimization</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Methods.html">Quantification Methods</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Model-Selection.html">Model Selection</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Plotting.html">Plotting</a></li>
<li class="toctree-l1"><a class="reference internal" href="wiki/Protocols.html">Protocols</a></li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="modules.html">quapy</a></li>
</ul>

File diff suppressed because one or more lines are too long

View File

@ -10,8 +10,10 @@ import pathlib
import sys
from os.path import join
quapy_path = join(pathlib.Path(__file__).parents[2].resolve().as_posix(), 'quapy')
wiki_path = join(pathlib.Path(__file__).parents[0].resolve().as_posix(), 'wiki')
print(f'quapy path={quapy_path}')
sys.path.insert(0, quapy_path)
sys.path.insert(0, wiki_path)
project = 'QuaPy: A Python-based open-source framework for quantification'

View File

@ -21,6 +21,23 @@ GitHub
QuaPy is hosted in GitHub at `https://github.com/HLT-ISTI/QuaPy <https://github.com/HLT-ISTI/QuaPy>`_
Wiki Documents
--------------
In this section you can find useful information concerning different aspects of QuaPy, with examples:
.. toctree::
:maxdepth: 1
wiki/Datasets
wiki/Evaluation
wiki/ExplicitLossMinimization
wiki/Methods
wiki/Model-Selection
wiki/Plotting
wiki/Protocols
.. toctree::
:maxdepth: 2
:caption: Contents:

View File

@ -0,0 +1,440 @@
# Datasets
QuaPy makes available several datasets that have been used in the
quantification literature, as well as an interface that allows
anyone to import their own custom datasets.
A _Dataset_ object in QuaPy is roughly a pair of _LabelledCollection_ objects,
one playing the role of the training set, another the test set.
_LabelledCollection_ is a data class consisting of the (iterable)
instances and labels. This class handles most of the sampling functionality in QuaPy.
Take a look at the following code:
```python
import quapy as qp
import quapy.functional as F
instances = [
'1st positive document', '2nd positive document',
'the only negative document',
'1st neutral document', '2nd neutral document', '3rd neutral document'
]
labels = [2, 2, 0, 1, 1, 1]
data = qp.data.LabelledCollection(instances, labels)
print(F.strprev(data.prevalence(), prec=2))
```
The output shows the class prevalences (with 2-digit precision):
```
[0.17, 0.50, 0.33]
```
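Under the hood, the prevalence of a collection is simply the normalized count of each class label. A minimal numpy sketch of the idea (not QuaPy's actual implementation):

```python
import numpy as np

labels = np.array([2, 2, 0, 1, 1, 1])
# prevalence = per-class counts, normalized by the collection size
prevalence = np.bincount(labels, minlength=3) / len(labels)
# prevalence is approximately [0.167, 0.500, 0.333]
```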
One can easily produce new samples at desired class prevalence values:
```python
sample_size = 10
prev = [0.4, 0.1, 0.5]
sample = data.sampling(sample_size, *prev)
print('instances:', sample.instances)
print('labels:', sample.labels)
print('prevalence:', F.strprev(sample.prevalence(), prec=2))
```
Which outputs:
```
instances: ['the only negative document' '2nd positive document'
'2nd positive document' '2nd neutral document' '1st positive document'
'the only negative document' 'the only negative document'
'the only negative document' '2nd positive document'
'1st positive document']
labels: [0 2 2 1 2 0 0 0 2 2]
prevalence: [0.40, 0.10, 0.50]
```
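Conceptually, sampling at a target prevalence amounts to drawing about `round(prev_i * size)` instances from each class, with replacement when a class is under-represented. A simplified sketch of the idea (QuaPy's own implementation handles rounding and edge cases more carefully):

```python
import numpy as np

def sampling_index(labels, size, prevalence, seed=0):
    # draw round(prev_i * size) indexes from each class, with replacement;
    # note: in general the rounded counts may not add up exactly to `size`
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    index = []
    for cls, prev in enumerate(prevalence):
        cls_idx = np.flatnonzero(labels == cls)
        index.extend(rng.choice(cls_idx, size=int(round(prev * size)), replace=True))
    return np.asarray(index)

labels = np.array([2, 2, 0, 1, 1, 1])
idx = sampling_index(labels, size=10, prevalence=[0.4, 0.1, 0.5])
sample_prev = np.bincount(labels[idx], minlength=3) / len(idx)
# sample_prev == [0.4, 0.1, 0.5]
```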
Samples can be made consistent across different runs (e.g., to test
different methods on the exact same samples) by sampling and retaining
the indexes, which can then be used to regenerate the same sample:
```python
index = data.sampling_index(sample_size, *prev)
for method in methods:
sample = data.sampling_from_index(index)
...
```
However, generating samples for evaluation purposes is tackled in QuaPy
by means of the evaluation protocols (see the dedicated entries in the Wiki
for [evaluation](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation) and
[protocols](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols)).
## Reviews Datasets
Three datasets of reviews about Kindle devices, the Harry Potter series, and
the well-known IMDb movie reviews can be fetched using a unified interface.
For example:
```python
import quapy as qp
data = qp.datasets.fetch_reviews('kindle')
```
These datasets have been used in:
```
Esuli, A., Moreo, A., & Sebastiani, F. (2018, October).
A recurrent neural network for sentiment quantification.
In Proceedings of the 27th ACM International Conference on
Information and Knowledge Management (pp. 1775-1778).
```
The list of review ids is available in:
```python
qp.datasets.REVIEWS_SENTIMENT_DATASETS
```
Some statistics of the available datasets are summarized below:
| Dataset | classes | train size | test size | train prev | test prev | type |
|---|:---:|:---:|:---:|:---:|:---:|---|
| hp | 2 | 9533 | 18399 | \[0.018, 0.982\] | \[0.065, 0.935\] | text |
| kindle | 2 | 3821 | 21591 | \[0.081, 0.919\] | \[0.063, 0.937\] | text |
| imdb | 2 | 25000 | 25000 | \[0.500, 0.500\] | \[0.500, 0.500\] | text |
## Twitter Sentiment Datasets
QuaPy provides 11 Twitter datasets for sentiment analysis.
The raw text is not accessible; the documents are made available
in tf-idf format. Each dataset presents two splits: a train/val
split for model selection purposes, and a train+val/test split
for model evaluation. The following code exemplifies how to load
a Twitter dataset for model selection.
```python
import quapy as qp
data = qp.datasets.fetch_twitter('gasp', for_model_selection=True)
```
The datasets were used in:
```
Gao, W., & Sebastiani, F. (2015, August).
Tweet sentiment: From classification to quantification.
In 2015 IEEE/ACM International Conference on Advances in
Social Networks Analysis and Mining (ASONAM) (pp. 97-104). IEEE.
```
Three of the datasets (semeval13, semeval14, and semeval15) share the
same training set (semeval), meaning that the training split one would get
when requesting any of them is the same. The dataset "semeval" can only
be requested with "for_model_selection=True".
The lists of the Twitter datasets' ids can be consulted in:
```python
# a list of 11 dataset ids that can be used for model selection or model evaluation
qp.datasets.TWITTER_SENTIMENT_DATASETS_TEST
# 9 dataset ids in which "semeval13", "semeval14", and "semeval15" are replaced with "semeval"
qp.datasets.TWITTER_SENTIMENT_DATASETS_TRAIN
```
Some details can be found below:
| Dataset | classes | train size | test size | features | train prev | test prev | type |
|---|:---:|:---:|:---:|:---:|:---:|:---:|---|
| gasp | 3 | 8788 | 3765 | 694582 | [0.421, 0.496, 0.082] | [0.407, 0.507, 0.086] | sparse |
| hcr | 3 | 1594 | 798 | 222046 | [0.546, 0.211, 0.243] | [0.640, 0.167, 0.193] | sparse |
| omd | 3 | 1839 | 787 | 199151 | [0.463, 0.271, 0.266] | [0.437, 0.283, 0.280] | sparse |
| sanders | 3 | 2155 | 923 | 229399 | [0.161, 0.691, 0.148] | [0.164, 0.688, 0.148] | sparse |
| semeval13 | 3 | 11338 | 3813 | 1215742 | [0.159, 0.470, 0.372] | [0.158, 0.430, 0.412] | sparse |
| semeval14 | 3 | 11338 | 1853 | 1215742 | [0.159, 0.470, 0.372] | [0.109, 0.361, 0.530] | sparse |
| semeval15 | 3 | 11338 | 2390 | 1215742 | [0.159, 0.470, 0.372] | [0.153, 0.413, 0.434] | sparse |
| semeval16 | 3 | 8000 | 2000 | 889504 | [0.157, 0.351, 0.492] | [0.163, 0.341, 0.497] | sparse |
| sst | 3 | 2971 | 1271 | 376132 | [0.261, 0.452, 0.288] | [0.207, 0.481, 0.312] | sparse |
| wa | 3 | 2184 | 936 | 248563 | [0.305, 0.414, 0.281] | [0.282, 0.446, 0.272] | sparse |
| wb | 3 | 4259 | 1823 | 404333 | [0.270, 0.392, 0.337] | [0.274, 0.392, 0.335] | sparse |
## UCI Machine Learning
### Binary datasets
A set of 32 datasets from the [UCI Machine Learning repository](https://archive.ics.uci.edu/ml/datasets.php)
used in:
```
Pérez-Gállego, P., Quevedo, J. R., & del Coz, J. J. (2017).
Using ensembles for problems with characterizable changes
in data distribution: A case study on quantification.
Information Fusion, 34, 87-100.
```
The list does not exactly coincide with that used in Pérez-Gállego et al. 2017
since we were unable to find the datasets with ids "diabetes" and "phoneme".
These datasets can be loaded by calling, e.g.:
```python
import quapy as qp
data = qp.datasets.fetch_UCIBinaryDataset('yeast', verbose=True)
```
This call will return a _Dataset_ object in which the training and
test splits are randomly drawn, in a stratified manner, from the whole
collection at 70% and 30%, respectively. The _verbose=True_ option indicates
that the dataset description should be printed in standard output.
The original data is not split,
and some papers submit the entire collection to a kFCV validation.
To accommodate these practices, one can first instantiate
the entire collection, and then create a generator that returns one
training+test dataset at a time, following a kFCV protocol:
```python
import quapy as qp
collection = qp.datasets.fetch_UCIBinaryLabelledCollection("yeast")
for data in qp.data.Dataset.kFCV(collection, nfolds=5, nrepeats=2):
...
```
The above code conducts a 2x5FCV evaluation on the "yeast" dataset.
All datasets come in numerical form (dense matrices); some statistics
are summarized below.
| Dataset | classes | instances | features | prev | type |
|---|:---:|:---:|:---:|:---:|---|
| acute.a | 2 | 120 | 6 | [0.508, 0.492] | dense |
| acute.b | 2 | 120 | 6 | [0.583, 0.417] | dense |
| balance.1 | 2 | 625 | 4 | [0.539, 0.461] | dense |
| balance.2 | 2 | 625 | 4 | [0.922, 0.078] | dense |
| balance.3 | 2 | 625 | 4 | [0.539, 0.461] | dense |
| breast-cancer | 2 | 683 | 9 | [0.350, 0.650] | dense |
| cmc.1 | 2 | 1473 | 9 | [0.573, 0.427] | dense |
| cmc.2 | 2 | 1473 | 9 | [0.774, 0.226] | dense |
| cmc.3 | 2 | 1473 | 9 | [0.653, 0.347] | dense |
| ctg.1 | 2 | 2126 | 22 | [0.222, 0.778] | dense |
| ctg.2 | 2 | 2126 | 22 | [0.861, 0.139] | dense |
| ctg.3 | 2 | 2126 | 22 | [0.917, 0.083] | dense |
| german | 2 | 1000 | 24 | [0.300, 0.700] | dense |
| haberman | 2 | 306 | 3 | [0.735, 0.265] | dense |
| ionosphere | 2 | 351 | 34 | [0.641, 0.359] | dense |
| iris.1 | 2 | 150 | 4 | [0.667, 0.333] | dense |
| iris.2 | 2 | 150 | 4 | [0.667, 0.333] | dense |
| iris.3 | 2 | 150 | 4 | [0.667, 0.333] | dense |
| mammographic | 2 | 830 | 5 | [0.514, 0.486] | dense |
| pageblocks.5 | 2 | 5473 | 10 | [0.979, 0.021] | dense |
| semeion | 2 | 1593 | 256 | [0.901, 0.099] | dense |
| sonar | 2 | 208 | 60 | [0.534, 0.466] | dense |
| spambase | 2 | 4601 | 57 | [0.606, 0.394] | dense |
| spectf | 2 | 267 | 44 | [0.794, 0.206] | dense |
| tictactoe | 2 | 958 | 9 | [0.653, 0.347] | dense |
| transfusion | 2 | 748 | 4 | [0.762, 0.238] | dense |
| wdbc | 2 | 569 | 30 | [0.627, 0.373] | dense |
| wine.1 | 2 | 178 | 13 | [0.669, 0.331] | dense |
| wine.2 | 2 | 178 | 13 | [0.601, 0.399] | dense |
| wine.3 | 2 | 178 | 13 | [0.730, 0.270] | dense |
| wine-q-red | 2 | 1599 | 11 | [0.465, 0.535] | dense |
| wine-q-white | 2 | 4898 | 11 | [0.335, 0.665] | dense |
| yeast | 2 | 1484 | 8 | [0.711, 0.289] | dense |
### Issues:
All datasets are downloaded automatically the first time they are requested, and
stored in the _quapy_data_ folder for faster subsequent reuse.
However, some datasets require special actions that, at the moment, are not fully
automated.
* Datasets with ids "ctg.1", "ctg.2", and "ctg.3" (_Cardiotocography Data Set_) load
an Excel file, which requires the user to install the _xlrd_ Python module in order
to open it.
* The dataset with id "pageblocks.5" (_Page Blocks Classification (5)_) needs to
open a "unix compressed file" (extension .Z), which cannot be opened directly with
standard Python packages like gzip or zipfile. This file has to be uncompressed manually
using OS-dependent software. Information on how to do so is printed the first
time the dataset is invoked.
* It is a good idea to ignore the datasets _acute.a_, _acute.b_ and _balance.2_: the former two
are very easy (many classifiers would score 100% accuracy), while the latter is extremely difficult
(there is probably some problem with this dataset: the errors it tends to produce are orders of magnitude
greater than for other datasets, which has a disproportionate impact on the average performance).
### Multiclass datasets
A collection of 5 multiclass datasets from the [UCI Machine Learning repository](https://archive.ics.uci.edu/ml/datasets.php). The datasets were first used
in [this paper](https://arxiv.org/abs/2401.00490) and can be instantiated as follows:
```python
import quapy as qp
data = qp.datasets.fetch_UCIMulticlassLabelledCollection('dry-bean', test_split=0.4, verbose=True)
```
There are no pre-defined train-test partitions for these datasets, but you can easily create your own with the
`split_stratified` method, e.g., `data.split_stratified()`. This is equivalent
to calling the following method directly:
```python
data = qp.datasets.fetch_UCIMulticlassDataset('dry-bean', min_test_split=0.4, verbose=True)
```
The datasets correspond to all the datasets that can be retrieved from the platform
using the following filters:
* datasets for classification
* more than 2 classes
* containing at least 1,000 instances
* can be imported using the Python API.
Some statistics about these datasets are displayed below:
| **Dataset** | **classes** | **train size** | **test size** |
|------------------|:-----------:|:--------------:|:-------------:|
| dry-bean | 7 | 9527 | 4084 |
| wine-quality | 7 | 3428 | 1470 |
| academic-success | 3 | 3096 | 1328 |
| digits | 10 | 3933 | 1687 |
| letter | 26 | 14000 | 6000 |
## LeQua 2022 Datasets
QuaPy also provides the datasets used for the LeQua 2022 competition.
In brief, there are 4 tasks (T1A, T1B, T2A, T2B) having to do with text quantification
problems. Tasks T1A and T1B provide documents in vector form, while T2A and T2B provide
raw documents instead.
Tasks T1A and T2A are binary sentiment quantification problems, while tasks T1B and T2B
are multiclass quantification problems consisting of estimating the class prevalence
values of 28 different merchandise products.
Every task consists of a training set, a set of validation samples (for model selection)
and a set of test samples (for evaluation). QuaPy returns this data as a LabelledCollection
(training) and two generation protocols (for validation and test samples), as follows:
```python
import quapy as qp

training, val_generator, test_generator = qp.datasets.fetch_lequa2022(task=task)
```
See `lequa2022_experiments.py` in the examples folder for further details on how to
carry out experiments using these datasets.
The datasets are downloaded only once, and stored for fast reuse.
Some statistics are summarized below:
| Dataset | classes | train size | validation samples | test samples | docs by sample | type |
|---------|:-------:|:----------:|:------------------:|:------------:|:----------------:|:--------:|
| T1A | 2 | 5000 | 1000 | 5000 | 250 | vector |
| T1B | 28 | 20000 | 1000 | 5000 | 1000 | vector |
| T2A | 2 | 5000 | 1000 | 5000 | 250 | text |
| T2B | 28 | 20000 | 1000 | 5000 | 1000 | text |
For further details on the datasets, we refer to the original
[paper](https://ceur-ws.org/Vol-3180/paper-146.pdf):
```
Esuli, A., Moreo, A., Sebastiani, F., & Sperduti, G. (2022).
A Detailed Overview of LeQua@CLEF 2022: Learning to Quantify.
```
## IFCB Plankton dataset
IFCB is a dataset of plankton species in water samples hosted on [Zenodo](https://zenodo.org/records/10036244).
This dataset is based on the data publicly available at the [WHOI-Plankton repo](https://github.com/hsosik/WHOI-Plankton),
and the scripts for the processing are available at [P. González's repo](https://github.com/pglez82/IFCB_Zenodo).
This dataset comes with precomputed features for testing quantification algorithms.
Some statistics:
| | **Training** | **Validation** | **Test** |
|-----------------|:------------:|:--------------:|:--------:|
| samples | 200 | 86 | 678 |
| total instances | 584474 | 246916 | 2626429 |
| mean per sample | 2922.3 | 2871.1 | 3873.8 |
| min per sample | 266 | 59 | 33 |
| max per sample | 6645 | 7375 | 9112 |
The number of features is 512, while the number of classes is 50.
In terms of prevalence, the mean is 0.020, the minimum is 0, and the maximum is 0.978.
The dataset can be loaded for model selection (`for_model_selection=True`, thus returning the training and validation)
or for test (`for_model_selection=False`, thus returning the training+validation and the test).
Additionally, the training can be interpreted as a list (a generator) of samples (`single_sample_train=False`)
or as a single training set (`single_sample_train=True`).
Example:
```python
train, val_gen = qp.datasets.fetch_IFCB(for_model_selection=True, single_sample_train=True)
# ... model selection
train, test_gen = qp.datasets.fetch_IFCB(for_model_selection=False, single_sample_train=True)
# ... train and evaluation
```
## Adding Custom Datasets
QuaPy provides data loaders for simple formats dealing with
text, following the format:
```
class-id \t first document's pre-processed text \n
class-id \t second document's pre-processed text \n
...
```
and sparse representations of the form:
```
{-1, 0, or +1} col(int):val(float) col(int):val(float) ... \n
...
```
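A minimal loader for the tab-separated text format above could look as follows (a sketch only; adapt the encoding and error handling to your data):

```python
def simple_text_loader(path):
    # parses lines of the form "<class-id>\t<document text>"
    instances, labels = [], []
    with open(path, 'rt', encoding='utf-8') as fin:
        for line in fin:
            line = line.rstrip('\n')
            if not line:
                continue
            label, text = line.split('\t', maxsplit=1)
            labels.append(int(label))
            instances.append(text)
    return instances, labels
```

A function like this can be passed as the _loader_func_ discussed below.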
The code in charge of loading a _LabelledCollection_ is:
```python
@classmethod
def load(cls, path:str, loader_func:callable):
return LabelledCollection(*loader_func(path))
```
indicating that any _loader_func_ (e.g., a user-defined one) that
returns valid arguments for initializing a _LabelledCollection_ object allows
loading any collection. In particular, the _LabelledCollection_ receives as
arguments the instances (as an iterable) and the labels (as an iterable); additionally,
the number of classes can be specified (it would otherwise be
inferred from the labels, but that requires at least one example of
every class to be present in the collection).
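To see why the number of classes sometimes needs to be given explicitly, note that inferring the classes from the labels silently misses any class with no examples, as this toy check shows:

```python
import numpy as np

labels = np.array([0, 0, 1])   # suppose class 2 exists but is absent from this sample
inferred = np.unique(labels)   # inference from labels only finds classes 0 and 1
# inferred == [0, 1]; class 2 would be silently missed
```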
The same _loader_func_ can be passed to a Dataset, along with two
paths, in order to create a training and test pair of _LabelledCollection_,
e.g.:
```python
import quapy as qp
train_path = '../my_data/train.dat'
test_path = '../my_data/test.dat'
def my_custom_loader(path):
with open(path, 'rb') as fin:
...
return instances, labels
data = qp.data.Dataset.load(train_path, test_path, my_custom_loader)
```
### Data Processing
QuaPy implements a number of preprocessing functions in the package _qp.data.preprocessing_, including:
* _text2tfidf_: tfidf vectorization
* _reduce_columns_: reducing the number of columns based on term frequency
* _standardize_: transforms the column values into z-scores (i.e., subtracts the mean and divides by the standard deviation, so
that the column values have zero mean and unit variance)
* _index_: transforms textual tokens into lists of numeric ids

View File

@ -0,0 +1,159 @@
# Evaluation
Quantification is an appealing tool in scenarios of dataset shift,
and particularly in scenarios of prior-probability shift.
That is, the interest in estimating the class prevalences arises
under the belief that those class prevalences might have changed
with respect to the ones observed during training.
In other words, one could simply return the training prevalence
as a predictor of the test prevalence if this change is assumed
to be unlikely (as is the case in general scenarios of
machine learning governed by the iid assumption).
In brief, quantification requires dedicated evaluation protocols,
which are implemented in QuaPy and explained here.
## Error Measures
The module quapy.error implements the most popular error measures for quantification, e.g., mean absolute error (_mae_), mean relative absolute error (_mrae_), among others. For each such measure (e.g., _mrae_) there are corresponding functions (e.g., _rae_) that do not average the results across samples.
Some errors of classification are also available, e.g., accuracy error (_acce_) or F-1 error (_f1e_).
The error functions implement the following interface, e.g.:
```python
mae(true_prevs, prevs_hat)
```
in which the first argument is a ndarray containing the true
prevalences, and the second argument is another ndarray with
the estimations produced by some method.
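For reference, _mae_ boils down to the mean absolute difference between the two prevalence arrays. A plain-numpy sketch (not QuaPy's actual implementation):

```python
import numpy as np

def mae(true_prevs, prevs_hat):
    # mean absolute error between true and estimated prevalence vectors
    return np.abs(np.asarray(true_prevs) - np.asarray(prevs_hat)).mean()

err = mae([0.5, 0.3, 0.2], [0.1, 0.3, 0.6])  # (0.4 + 0.0 + 0.4) / 3
```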
Some error functions, e.g., _mrae_, _mkld_, and _mnkld_, are
smoothed for numerical stability. In those cases, there is a
third argument, e.g.:
```python
def mrae(true_prevs, prevs_hat, eps=None): ...
```
indicating the value for the smoothing parameter epsilon.
Traditionally, this value is set to 1/(2T) in the literature,
with T the sample size. One can either pass this value
to the function each time, or set QuaPy's environment
variable _SAMPLE_SIZE_ once and omit this argument
thereafter (recommended);
e.g.:
```python
qp.environ['SAMPLE_SIZE'] = 100 # once for all
true_prev = np.asarray([0.5, 0.3, 0.2]) # let's assume 3 classes
estim_prev = np.asarray([0.1, 0.3, 0.6])
error = qp.error.mrae(true_prev, estim_prev)
print(f'mrae({true_prev}, {estim_prev}) = {error:.3f}')
```
will print:
```
mrae([0.500, 0.300, 0.200], [0.100, 0.300, 0.600]) = 0.914
```
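The smoothing itself can be sketched as replacing each prevalence vector p by (p + eps) / (eps * |C| + 1) before computing the relative errors. The following is a sketch of this standard definition (QuaPy's own implementation may differ in details), which reproduces the 0.914 value shown above:

```python
import numpy as np

def smoothed(p, eps):
    # additive smoothing: avoids zero denominators in relative errors
    p = np.asarray(p)
    return (p + eps) / (eps * len(p) + 1)

def rae(true_prev, estim_prev, eps):
    p, p_hat = smoothed(true_prev, eps), smoothed(estim_prev, eps)
    return np.mean(np.abs(p_hat - p) / p)

eps = 1 / (2 * 100)  # the customary 1/(2T) value, for sample size T=100
err = rae([0.5, 0.3, 0.2], [0.1, 0.3, 0.6], eps)  # approx. 0.914
```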
Finally, it is possible to instantiate QuaPy's quantification
error functions from strings using, e.g.:
```python
error_function = qp.error.from_name('mse')
error = error_function(true_prev, estim_prev)
```
## Evaluation Protocols
An _evaluation protocol_ is an evaluation procedure that uses
one specific _sample generation protocol_ to generate many
samples, typically characterized by widely varying amounts of
_shift_ with respect to the original distribution, which are then
used to evaluate the performance of a (trained) quantifier.
These protocols are explained in more detail in a dedicated [entry
in the wiki](Protocols.md). For the time being, let us assume we have already
chosen and instantiated one specific such protocol, which we here
simply call _prot_. Let us also assume our model is called
_quantifier_ and that our evaluation measure of choice is
_mae_. The evaluation comes down to:
```python
mae = qp.evaluation.evaluate(quantifier, protocol=prot, error_metric='mae')
print(f'MAE = {mae:.4f}')
```
It is often desirable to evaluate our system using more than one
evaluation measure. In this case, it is convenient to generate
a _report_. A report in QuaPy is a dataframe accounting for all the
true prevalence values along with the corresponding prevalence values
estimated by the quantifier, and the error each estimate gives
rise to.
```python
report = qp.evaluation.evaluation_report(quantifier, protocol=prot, error_metrics=['mae', 'mrae', 'mkld'])
```
Since the report is a pandas dataframe, it is straightforward to visualize all the results
and compute averaged values, e.g.:
```python
pd.set_option('display.expand_frame_repr', False)
report['estim-prev'] = report['estim-prev'].map(F.strprev)
print(report)
print('Averaged values:')
print(report.mean())
```
This will produce an output like:
```
true-prev estim-prev mae mrae mkld
0 [0.308, 0.692] [0.314, 0.686] 0.005649 0.013182 0.000074
1 [0.896, 0.104] [0.909, 0.091] 0.013145 0.069323 0.000985
2 [0.848, 0.152] [0.809, 0.191] 0.039063 0.149806 0.005175
3 [0.016, 0.984] [0.033, 0.967] 0.017236 0.487529 0.005298
4 [0.728, 0.272] [0.751, 0.249] 0.022769 0.057146 0.001350
... ... ... ... ... ...
4995 [0.72, 0.28] [0.698, 0.302] 0.021752 0.053631 0.001133
4996 [0.868, 0.132] [0.888, 0.112] 0.020490 0.088230 0.001985
4997 [0.292, 0.708] [0.298, 0.702] 0.006149 0.014788 0.000090
4998 [0.24, 0.76] [0.220, 0.780] 0.019950 0.054309 0.001127
4999 [0.948, 0.052] [0.965, 0.035] 0.016941 0.165776 0.003538
[5000 rows x 5 columns]
Averaged values:
mae 0.023588
mrae 0.108779
mkld 0.003631
dtype: float64
```
Alternatively, we can simply generate all the predictions by:
```python
true_prevs, estim_prevs = qp.evaluation.prediction(quantifier, protocol=prot)
```
All the evaluation functions implement specific optimizations for speeding-up
the evaluation of aggregative quantifiers (i.e., of instances of _AggregativeQuantifier_).
The optimization comes down to generating classification predictions (either crisp or soft)
only once for the entire test set, and then applying the sampling procedure to the
predictions, instead of generating samples of instances and then computing the
classification predictions every time. This is only possible when the protocol
is an instance of _OnLabelledCollectionProtocol_. The optimization is only
carried out when the number of classification predictions thus generated would be
smaller than the number of predictions required for the entire protocol; e.g.,
if the original dataset contains 1M instances, but the protocol is such that it would
at most generate 20 samples of 100 instances, then it would be preferable to postpone the
classification for each sample. This behaviour is indicated by setting
_aggr_speedup="auto"_. Conversely, when indicating _aggr_speedup="force"_, QuaPy will
precompute all the predictions irrespective of the number of instances and number of samples.
Finally, this can be deactivated by setting _aggr_speedup=False_. Note that this optimization
is not only applied for the final evaluation, but also for the internal evaluations carried
out during _model selection_. Since these are typically many, the heuristic can help reduce the
execution time a lot.
# Explicit Loss Minimization
QuaPy makes available several Explicit Loss Minimization (ELM) methods, including
SVM(Q), SVM(KLD), SVM(NKLD), SVM(AE), or SVM(RAE).
These methods require first downloading the
[svmperf](http://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html)
package, applying the patch
[svm-perf-quantification-ext.patch](./svm-perf-quantification-ext.patch), and compiling the sources.
The script [prepare_svmperf.sh](prepare_svmperf.sh) takes care of all of this. Simply run:
```
./prepare_svmperf.sh
```
The resulting directory [svm_perf_quantification](./svm_perf_quantification) contains the
patched version of _svmperf_ with quantification-oriented losses.
The [svm-perf-quantification-ext.patch](https://github.com/HLT-ISTI/QuaPy/blob/master/prepare_svmperf.sh) is an extension of the patch made available by
[Esuli et al. 2015](https://dl.acm.org/doi/abs/10.1145/2700406?casa_token=8D2fHsGCVn0AAAAA:ZfThYOvrzWxMGfZYlQW_y8Cagg-o_l6X_PcF09mdETQ4Tu7jK98mxFbGSXp9ZSO14JkUIYuDGFG0)
that allows SVMperf to optimize for
the _Q_ measure as proposed by [Barranquero et al. 2015](https://www.sciencedirect.com/science/article/abs/pii/S003132031400291X)
and for the _KLD_ and _NKLD_ measures as proposed by [Esuli et al. 2015](https://dl.acm.org/doi/abs/10.1145/2700406?casa_token=8D2fHsGCVn0AAAAA:ZfThYOvrzWxMGfZYlQW_y8Cagg-o_l6X_PcF09mdETQ4Tu7jK98mxFbGSXp9ZSO14JkUIYuDGFG0).
This patch extends the above one by also allowing SVMperf to optimize for
_AE_ and _RAE_.
See [Methods.md](Methods.md) for more details and code examples.
# Quantification Methods
Quantification methods can be categorized as belonging to
`aggregative` and `non-aggregative` groups.
Most methods included in QuaPy at the moment are of type `aggregative`
(though we plan to add many more methods in the near future), i.e.,
are methods characterized by the fact that
quantification is performed as an aggregation function of the individual
products of classification.
Any quantifier in QuaPy should extend the class `BaseQuantifier`,
and implement some abstract methods:
```python
@abstractmethod
def fit(self, data: LabelledCollection): ...
@abstractmethod
def quantify(self, instances): ...
```
The meaning of these functions should be familiar to anyone
accustomed to working with scikit-learn, since the class structure of QuaPy
is directly inspired by scikit-learn's _Estimators_. Functions
`fit` and `quantify` are used to train the model and to provide
class estimations (the reason why
scikit-learn's structure has not been adopted _as is_ in QuaPy is
that scikit-learn's `predict` function is expected to return
one output for each input element --e.g., a predicted label for each
instance in a sample-- while in quantification the output for a sample
is one single array of class prevalences).
Quantifiers also extend from scikit-learn's `BaseEstimator`, in order
to simplify the use of `set_params` and `get_params` used in
[model selector](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection).
## Aggregative Methods
All quantification methods are implemented as part of the
`qp.method` package. In particular, `aggregative` methods are defined in
`qp.method.aggregative`, and extend `AggregativeQuantifier(BaseQuantifier)`.
The methods that any `aggregative` quantifier must implement are:
```python
@abstractmethod
def aggregation_fit(self, classif_predictions: LabelledCollection, data: LabelledCollection): ...

@abstractmethod
def aggregate(self, classif_predictions: np.ndarray): ...
```
These two functions replace the `fit` and `quantify` methods, since those
come with default implementations. The `fit` function is provided and amounts to:
```python
def fit(self, data: LabelledCollection, fit_classifier=True, val_split=None):
self._check_init_parameters()
classif_predictions = self.classifier_fit_predict(data, fit_classifier, predict_on=val_split)
self.aggregation_fit(classif_predictions, data)
return self
```
Note that this function fits the classifier and generates the predictions. This is assumed
to be a routine common to all aggregative quantifiers, and is provided by QuaPy. What remains
to be done is to define the `aggregation_fit` function, which takes as input the classifier predictions
and the original training data (the latter is typically unused). The classifier predictions
can be:
- confidence scores: quantifiers inheriting directly from `AggregativeQuantifier`
- crisp predictions: quantifiers inheriting from `AggregativeCrispQuantifier`
- posterior probabilities: quantifiers inheriting from `AggregativeSoftQuantifier`
- _anything_: custom quantifiers overriding the `classify` method
Note also that the `fit` method calls `_check_init_parameters`; this function is meant to be
overridden (if needed) and allows the method to quickly raise an exception upon any inconsistency
found in the `__init__` arguments, thus avoiding breaking only after training the classifier and generating
predictions.
Similarly, the function `quantify` is provided, and amounts to:
```python
def quantify(self, instances):
classif_predictions = self.classify(instances)
return self.aggregate(classif_predictions)
```
in which, in most cases, only the function `aggregate` needs to be overridden.
Aggregative quantifiers are expected to maintain a classifier (which is
accessed through the `@property` `classifier`). This classifier is
given as input to the quantifier, and can either be already fit
on external data (in which case the `fit_learner` argument should
be set to False) or be fit by the quantifier's `fit` method (default).
The above patterns (in training: fit the classifier, then fit the aggregation;
at test: classify, then aggregate) allow QuaPy to optimize many internal procedures.
In particular, the model selection routine takes advantage of this two-step process
and generates classifiers only for the valid combinations of hyperparameters of the
classifier, and then _clones_ these classifiers and explores the combinations
of hyperparameters that are specific to the quantifier (this can result in huge
time savings).
Concerning the inference phase, this two-step process allows the evaluation of many
standard protocols (e.g., the [artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation)) to be
carried out very efficiently. The reason is that the entire set can be pre-classified
once, and the quantification estimations for different samples can directly
reuse these predictions, without requiring each element to be classified every time.
QuaPy leverages this property to speed-up any procedure having to do with
quantification over samples, as is customarily done in model selection or
in evaluation.
### The Classify & Count variants
QuaPy implements the four CC variants, i.e.:
* _CC_ (Classify & Count), the simplest aggregative quantifier; one that
simply relies on the label predictions of a classifier to deliver class estimates.
* _ACC_ (Adjusted Classify & Count), the adjusted variant of CC.
* _PCC_ (Probabilistic Classify & Count), the probabilistic variant of CC that
relies on the soft estimations (or posterior probabilities) returned by a (probabilistic) classifier.
* _PACC_ (Probabilistic Adjusted Classify & Count), the adjusted variant of PCC.
The following code serves as a complete example using CC equipped
with an SVM as the classifier:
```python
import quapy as qp
import quapy.functional as F
from sklearn.svm import LinearSVC
training, test = qp.datasets.fetch_twitter('hcr', pickle=True).train_test
# instantiate a classifier learner, in this case a SVM
svm = LinearSVC()
# instantiate a Classify & Count with the SVM
# (an alias is available in qp.method.aggregative.ClassifyAndCount)
model = qp.method.aggregative.CC(svm)
model.fit(training)
estim_prevalence = model.quantify(test.instances)
```
The same code could be used to instantiate an ACC, by simply replacing
the instantiation of the model with:
```python
model = qp.method.aggregative.ACC(svm)
```
Note that the adjusted variants (ACC and PACC) need to estimate
some parameters for performing the adjustment (e.g., the
_true positive rate_ and the _false positive rate_ in case of
binary classification) that are estimated on a validation split
of the labelled set. In this case, the `__init__` method of
ACC defines an additional parameter, `val_split`. If this parameter
is set to a float in [0,1] representing a fraction (e.g., 0.4)
then that fraction of labelled data (e.g., 40%)
will be used for estimating the parameters for adjusting the
predictions. This parameter can also be set with an integer,
indicating that the parameters should be estimated by means of
_k_-fold cross-validation, for which the integer indicates the
number _k_ of folds (the default value is 5). Finally, `val_split` can be set to a
specific held-out validation set (i.e., an instance of `LabelledCollection`).
The specification of `val_split` can be
postponed to the invocation of the `fit` method (if `val_split` was also
set in the constructor, the one specified at fit time prevails),
e.g.:
```python
model = qp.method.aggregative.ACC(svm)
# perform 5-fold cross validation for estimating ACC's parameters
# (overrides the default val_split=0.4 in the constructor)
model.fit(training, val_split=5)
```
The following code illustrates the case in which PCC is used:
```python
model = qp.method.aggregative.PCC(svm)
model.fit(training)
estim_prevalence = model.quantify(test.instances)
print('classifier:', model.classifier)
```
In this case, QuaPy will print:
```
The learner LinearSVC does not seem to be probabilistic. The learner will be calibrated.
classifier: CalibratedClassifierCV(base_estimator=LinearSVC(), cv=5)
```
The first output indicates that the learner (`LinearSVC` in this case)
is not a probabilistic classifier (i.e., it does not implement the
`predict_proba` method) and so, the classifier will be converted to
a probabilistic one through [calibration](https://scikit-learn.org/stable/modules/calibration.html).
As a result, the classifier that is printed in the second line points
to a `CalibratedClassifierCV` instance. Note that calibration can only
be applied to hard classifiers when `fit_learner=True`; an exception
will be raised otherwise.
Lastly, everything we said about ACC and PCC
applies to PACC as well.
_New in v0.1.9_: quantifiers ACC and PACC now have three additional arguments: `method`, `solver` and `norm`:
* Argument `method` specifies how to solve, for `p`, the linear system `q = Mp` (where `q` is the unadjusted counts for the
test sample, `M` contains the class-conditional unadjusted counts --i.e., the missclassification rates-- and `p` is the
sought prevalence vector):
* option `"inversion"`: attempts to invert matrix `M`, thus solving `Minv q = p`. In degenerate cases, this
inversion may not exist. In such cases, the method defaults to returning `q` (the unadjusted counts)
* option `"invariant-ratio"`: uses the invariant ratio estimator system proposed in Remark 5 of
[Vaz, A.F., Izbicki F. and Stern, R.B. "Quantification Under Prior Probability Shift: the Ratio Estimator
and its Extensions", in Journal of Machine Learning Research 20 (2019)](https://jmlr.csail.mit.edu/papers/volume20/18-456/18-456.pdf).
* Argument `solver` specifies how to solve the linear system.
* `"exact-raise"` solves the system of linear equations and raises an exception if the system is not solvable
* `"exact-cc"` returns the original unadjusted count if the system is not solvable
* `"minimize"` minimizes the L2 norm of `|Mp-q|`. This one generally works better, and is the
default parameter. More details can be consulted in
[Bunse, M. "On Multi-Class Extensions of Adjusted Classify and Count",
in proceedings of the 2nd International Workshop on Learning to Quantify: Methods and Applications (LQ 2022),
ECML/PKDD 2022, Grenoble (France)](https://lq-2022.github.io/proceedings/CompleteVolume.pdf).
* Argument `norm` specifies how to normalize the estimate `p` when the vector lies outside of the probability simplex.
Options are:
* `"clip"` which clips the values to range `[0, 1]` and then L1-normalizes the vector
* `"mapsimplex"` which projects the results onto the probability simplex, as proposed by Vaz et al. in
[Remark 5 of Vaz et al. (2019)](https://jmlr.csail.mit.edu/papers/volume20/18-456/18-456.pdf). This implementation
relies on [Mathieu Blondel's `projection_simplex_sort`](https://gist.github.com/mblondel/6f3b7aaad90606b98f71)
* `"condsoftmax"` applies softmax normalization only if the prevalence vector lies outside of the probability simplex.
#### BayesianCC (_New in v0.1.9_!)
The `BayesianCC` is a variant of ACC introduced in
[Ziegler, A. and Czyż, P. "Bayesian quantification with black-box estimators", arXiv (2023)](https://arxiv.org/abs/2302.09159),
which models the probabilities `q = Mp` using latent random variables with weak Bayesian priors, rather than
plug-in probability estimates. In particular, it uses Markov Chain Monte Carlo sampling to find the values of
`p` compatible with the observed quantities.
The `aggregate` method returns the posterior mean and the `get_prevalence_samples` method can be used to find
uncertainty around `p` estimates (conditional on the observed data and the trained classifier)
and is suitable for problems in which the `q = Mp` matrix is nearly non-invertible.
Note that this quantification method requires `val_split` to be a `float` and the installation of additional dependencies (`$ pip install quapy[bayes]`) needed to run Markov chain Monte Carlo sampling. Markov chain Monte Carlo is slower than matrix-inversion methods, but is guaranteed to sample proper probability vectors, so no clipping strategies are required.
An example presenting how to run the method and use posterior samples is available in `examples/bayesian_quantification.py`.
### Expectation Maximization (EMQ)
The Expectation Maximization Quantifier (EMQ), also known as
the SLD, is available at `qp.method.aggregative.EMQ` or via the
alias `qp.method.aggregative.ExpectationMaximizationQuantifier`.
The method is described in:
_Saerens, M., Latinne, P., and Decaestecker, C. (2002). Adjusting the outputs of a classifier
to new a priori probabilities: A simple procedure. Neural Computation, 14(1):21–41._
EMQ works with a probabilistic classifier (if the classifier
given as input is a hard one, a calibration will be attempted).
Although this method was originally proposed for improving the
posterior probabilities of a probabilistic classifier, rather than
for improving the estimation of prior probabilities, EMQ
almost always ranks among the most effective quantifiers in the
experiments we have carried out.
An example of use can be found below:
```python
import quapy as qp
from sklearn.linear_model import LogisticRegression
dataset = qp.datasets.fetch_twitter('hcr', pickle=True)
model = qp.method.aggregative.EMQ(LogisticRegression())
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```
_New in v0.1.7_: EMQ now accepts two new parameters in its construction method.
The first is `exact_train_prev`, which allows using the true training prevalence as the starting
prevalence estimate (default behaviour), or instead an approximation of it as
suggested by [Alexandari et al. (2020)](http://proceedings.mlr.press/v119/alexandari20a.html)
(by setting `exact_train_prev=False`).
The other parameter is `recalib`, which allows indicating a calibration method, among those
proposed by [Alexandari et al. (2020)](http://proceedings.mlr.press/v119/alexandari20a.html),
including Bias-Corrected Temperature Scaling, Vector Scaling, etc.
See the API documentation for further details.
### Hellinger Distance y (HDy)
Implementation of the method based on the Hellinger Distance y (HDy) proposed by
[González-Castro, V., Alaiz-Rodríguez, R., and Alegre, E. (2013). Class distribution
estimation based on the Hellinger distance. Information Sciences, 218:146–164.](https://www.sciencedirect.com/science/article/pii/S0020025512004069)
It is implemented in `qp.method.aggregative.HDy` (also accessible
through the alias `qp.method.aggregative.HellingerDistanceY`).
This method works with a probabilistic classifier (hard classifiers
can be used as well and will be calibrated) and requires a validation
set to estimate the parameters of the mixture model. Just like
ACC and PACC, this quantifier receives a `val_split` argument
in the constructor (or in the fit method, in which case the previous
value is overridden) that can either be a float indicating the proportion
of training data to be taken as the validation set (in a random
stratified split), or a validation set (i.e., an instance of
`LabelledCollection`) itself.
HDy was proposed as a binary quantification method, and the implementation
provided in QuaPy accepts only binary datasets.
The following code shows an example of use:
```python
import quapy as qp
from sklearn.linear_model import LogisticRegression
# load a binary dataset
dataset = qp.datasets.fetch_reviews('hp', pickle=True)
qp.data.preprocessing.text2tfidf(dataset, min_df=5, inplace=True)
model = qp.method.aggregative.HDy(LogisticRegression())
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```
_New in v0.1.7:_ QuaPy now provides an implementation of the generalized
"Distribution Matching" approaches for multiclass, inspired by the framework
of [Firat (2016)](https://arxiv.org/abs/1606.00868). One can instantiate
a variant of HDy for multiclass quantification as follows:
```python
multiclassHDy = qp.method.aggregative.DMy(classifier=LogisticRegression(), divergence='HD', cdf=False)
```
_New in v0.1.7:_ QuaPy now provides an implementation of the "DyS"
framework proposed by [Maletzke et al (2020)](https://ojs.aaai.org/index.php/AAAI/article/view/4376)
and the "SMM" method proposed by [Hassan et al (2019)](https://ieeexplore.ieee.org/document/9260028)
(thanks to _Pablo González_ for the contributions!)
### Threshold Optimization methods
_New in v0.1.7:_ QuaPy now implements Forman's threshold optimization methods;
see, e.g., [(Forman 2006)](https://dl.acm.org/doi/abs/10.1145/1150402.1150423)
and [(Forman 2008)](https://link.springer.com/article/10.1007/s10618-008-0097-y).
These include: T50, MAX, X, Median Sweep (MS), and its variant MS2.
### Explicit Loss Minimization
Explicit Loss Minimization (ELM) represents a family of methods
based on structured output learning, i.e., quantifiers relying on
classifiers that have been optimized to target a
quantification-oriented evaluation measure.
The original methods are implemented in QuaPy as Classify & Count (CC)
quantifiers that use Joachims' [SVMperf](https://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html)
as the underlying classifier, properly set to optimize for the desired loss.
In QuaPy, this can be achieved by calling the functions:
* `newSVMQ`: returns the quantification method called SVM(Q) that optimizes for the metric _Q_ defined
in [_Barranquero, J., Díez, J., and del Coz, J. J. (2015). Quantification-oriented learning based
on reliable classifiers. Pattern Recognition, 48(2):591–604._](https://www.sciencedirect.com/science/article/pii/S003132031400291X)
* `newSVMKLD` and `newSVMNKLD`: return the quantification methods SVM(KLD) and SVM(NKLD), standing for
Kullback-Leibler Divergence and Normalized Kullback-Leibler Divergence, as proposed in [_Esuli, A. and Sebastiani, F. (2015).
Optimizing text quantifiers for multivariate loss functions.
ACM Transactions on Knowledge Discovery and Data, 9(4):Article 27._](https://dl.acm.org/doi/abs/10.1145/2700406)
* `newSVMAE` and `newSVMRAE`: return the quantification methods SVM(AE) and SVM(RAE), which optimize
for the (Mean) Absolute Error and the (Mean) Relative Absolute Error, respectively, as first used by
[_Moreo, A. and Sebastiani, F. (2021). Tweet sentiment quantification: An experimental re-evaluation. PLOS ONE 17 (9), 1-23._](https://arxiv.org/abs/2011.02552)

The last two methods (SVM(AE) and SVM(RAE)) have been implemented in
QuaPy in order to make ELM variants available for what are nowadays
considered the best-behaved evaluation metrics in quantification.
In order to make these models work, you would need to run the script
`prepare_svmperf.sh` (distributed along with QuaPy), which
downloads `SVMperf`'s source code, applies the patch that
implements the quantification-oriented losses, and compiles the
sources.
If you want to add any custom loss, you would need to modify
the source code of `SVMperf` in order to implement it, and
assign a valid loss code to it. Then you must re-compile
the whole thing and instantiate the quantifier in QuaPy
as follows:
```python
# you can either set the path to your custom svm_perf_quantification implementation
# in the environment variable, or as an argument to the constructor of ELM
qp.environ['SVMPERF_HOME'] = './path/to/svm_perf_quantification'
# assign an alias to your custom loss and the id you have assigned to it
svmperf = qp.classification.svmperf.SVMperf
svmperf.valid_losses['mycustomloss'] = 28
# instantiate the ELM method indicating the loss
model = qp.method.aggregative.ELM(loss='mycustomloss')
```
All ELM methods are binary quantifiers, since they rely on `SVMperf`, which
currently supports only binary classification.
ELM variants (any binary quantifier in general) can be extended
to operate in single-label scenarios trivially by adopting a
"one-vs-all" strategy (as, e.g., in
[_Gao, W. and Sebastiani, F. (2016). From classification to quantification in tweet sentiment
analysis. Social Network Analysis and Mining, 6(19):1–22_](https://link.springer.com/article/10.1007/s13278-016-0327-z)).
In QuaPy this is possible by using the `OneVsAll` class.
There are two ways for instantiating this class, `OneVsAllGeneric` that works for
any quantifier, and `OneVsAllAggregative` that is optimized for aggregative quantifiers.
In general, you can simply use the `newOneVsAll` function and QuaPy will choose
the more convenient of the two.
```python
import quapy as qp
from quapy.method.aggregative import newSVMQ
from quapy.method.base import newOneVsAll
# load a single-label dataset (this one contains 3 classes)
dataset = qp.datasets.fetch_twitter('hcr', pickle=True)
# let qp know where svmperf is
qp.environ['SVMPERF_HOME'] = '../svm_perf_quantification'
model = newOneVsAll(newSVMQ(), n_jobs=-1)  # run the binary quantifiers in parallel
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```
Check the examples [explicit_loss_minimization.py](../examples/explicit_loss_minimization.py)
and [one_vs_all.py](../examples/one_vs_all.py) for more details.
### Kernel Density Estimation methods (KDEy)
_New in v0.1.8_: QuaPy now provides implementations for the three variants
of KDE-based methods proposed in
_[Moreo, A., González, P. and del Coz, J.J., 2023.
Kernel Density Estimation for Multiclass Quantification.
arXiv preprint arXiv:2401.00490.](https://arxiv.org/abs/2401.00490)_.
The variants differ in the divergence metric to be minimized:
- KDEy-HD: minimizes the (squared) Hellinger Distance and solves the problem via a Monte Carlo approach
- KDEy-CS: minimizes the Cauchy-Schwarz divergence and solves the problem via a closed-form solution
- KDEy-ML: minimizes the Kullback-Leibler divergence and solves the problem via maximum-likelihood
These methods are specifically devised for multiclass problems (although they can tackle
binary problems too).
All KDE-based methods depend on the hyperparameter `bandwidth` of the kernel. Typical values
that can be explored in model selection lie in the range [0.01, 0.25]. The methods' performance
varies smoothly with variations of this hyperparameter.
## Meta Models
By _meta_ models we mean quantification methods that are defined on top of other
quantification methods, and that thus belong squarely to neither the aggregative nor
the non-aggregative group (indeed, _meta_ models could use quantifiers from either
group).
_Meta_ models are implemented in the `qp.method.meta` module.
### Ensembles
QuaPy implements (some of) the variants proposed in:
* [_Pérez-Gállego, P., Quevedo, J. R., & del Coz, J. J. (2017).
Using ensembles for problems with characterizable changes in data distribution: A case study on quantification.
Information Fusion, 34, 87-100._](https://www.sciencedirect.com/science/article/pii/S1566253516300628)
* [_Pérez-Gállego, P., Castano, A., Quevedo, J. R., & del Coz, J. J. (2019).
Dynamic ensemble selection for quantification tasks.
Information Fusion, 45, 1-15._](https://www.sciencedirect.com/science/article/pii/S1566253517303652)
The following code shows how to instantiate an Ensemble of 30 _Adjusted Classify & Count_ (ACC)
quantifiers operating with a _Logistic Regressor_ (LR) as the base classifier, and using the
_average_ as the aggregation policy (see the original article for further details).
The last parameter indicates to use all processors for parallelization.
```python
import quapy as qp
from quapy.method.aggregative import ACC
from quapy.method.meta import Ensemble
from sklearn.linear_model import LogisticRegression
dataset = qp.datasets.fetch_UCIBinaryDataset('haberman')
model = Ensemble(quantifier=ACC(LogisticRegression()), size=30, policy='ave', n_jobs=-1)
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```
Other aggregation policies implemented in QuaPy include:
* 'ptr' for applying a dynamic selection based on the training prevalence of the ensemble's members
* 'ds' for applying a dynamic selection based on the Hellinger Distance
* _any valid quantification measure_ (e.g., 'mse') for performing a static selection based on
the performance estimated for each member of the ensemble in terms of that evaluation metric.
When using any of the above options, it is important to set the `red_size` parameter, which
indicates the number of members to retain.
Please check the [model selection](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection)
wiki if you want to optimize the hyperparameters of the ensemble for classification or quantification.
### The QuaNet neural network
QuaPy offers an implementation of QuaNet, a deep learning model presented in:
[_Esuli, A., Moreo, A., & Sebastiani, F. (2018, October).
A recurrent neural network for sentiment quantification.
In Proceedings of the 27th ACM International Conference on
Information and Knowledge Management (pp. 1775-1778)._](https://dl.acm.org/doi/abs/10.1145/3269206.3269287)
This model requires `torch` to be installed.
QuaNet also requires a classifier that can provide embedded representations
of the inputs.
In the original paper, QuaNet was tested using an LSTM as the base classifier.
In the following example, we show an instantiation of QuaNet that instead uses a CNN as a probabilistic classifier, taking its last-layer representation as the document embedding:
```python
import quapy as qp
from quapy.method.meta import QuaNet
from quapy.classification.neural import NeuralClassifierTrainer, CNNnet
# use samples of 100 elements
qp.environ['SAMPLE_SIZE'] = 100
# load the kindle dataset as text, and convert words to numerical indexes
dataset = qp.datasets.fetch_reviews('kindle', pickle=True)
qp.data.preprocessing.index(dataset, min_df=5, inplace=True)
# the text classifier is a CNN trained by NeuralClassifierTrainer
cnn = CNNnet(dataset.vocabulary_size, dataset.n_classes)
learner = NeuralClassifierTrainer(cnn, device='cuda')
# train QuaNet
model = QuaNet(learner, device='cuda')
model.fit(dataset.training)
estim_prevalence = model.quantify(dataset.test.instances)
```
# Model Selection
As a supervised machine learning task, quantification methods
can strongly depend on a good choice of model hyper-parameters.
The process whereby those hyper-parameters are chosen is
typically known as _Model Selection_, and consists of
testing different settings and picking the one that performed
best on a held-out validation set in terms of some given
evaluation measure.
## Targeting a Quantification-oriented loss
The task being optimized determines the evaluation protocol,
i.e., the criteria according to which the performance of
any given method for solving it is to be assessed.
As a task in its own right, quantification should impose
its own model selection strategies, i.e., strategies
aimed at finding appropriate configurations
specifically designed for the task of quantification.
Quantification has long been regarded as an add-on of
classification, and thus the model selection strategies
customarily adopted in classification have simply been
applied to quantification (see the next section).
It has been argued in [Moreo, Alejandro, and Fabrizio Sebastiani.
Re-Assessing the "Classify and Count" Quantification Method.
ECIR 2021: Advances in Information Retrieval, pp. 75–91.](https://link.springer.com/chapter/10.1007/978-3-030-72240-1_6)
that specific model selection strategies should
be adopted for quantification. That is, model selection
strategies for quantification should target
quantification-oriented losses and be tested in a variety
of scenarios exhibiting different degrees of prior
probability shift.
The class _qp.model_selection.GridSearchQ_ implements a grid-search exploration over the space of
hyper-parameter combinations that [evaluates](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation)
each combination of hyper-parameters by means of a given quantification-oriented
error metric (e.g., any of the error functions implemented
in _qp.error_) and according to a
[sampling generation protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols).
The following is an example (also included in the examples folder) of model selection for quantification:
```python
import quapy as qp
from quapy.protocol import APP
from quapy.method.aggregative import DMy
from sklearn.linear_model import LogisticRegression
import numpy as np
"""
In this example, we show how to perform model selection on a DistributionMatching quantifier.
"""
model = DMy(LogisticRegression())
qp.environ['SAMPLE_SIZE'] = 100
qp.environ['N_JOBS'] = -1 # explore hyper-parameters in parallel
training, test = qp.datasets.fetch_reviews('imdb', tfidf=True, min_df=5).train_test
# The model will be returned by the fit method of GridSearchQ.
# Every combination of hyper-parameters will be evaluated by confronting the
# quantifier thus configured against a series of samples generated by means
# of a sample generation protocol. For this example, we will use the
# artificial-prevalence protocol (APP), that generates samples with prevalence
# values in the entire range of values from a grid (e.g., [0, 0.1, 0.2, ..., 1]).
# We devote 30% of the dataset for this exploration.
training, validation = training.split_stratified(train_prop=0.7)
protocol = APP(validation)
# We will explore a classification-dependent hyper-parameter (e.g., the 'C'
# hyper-parameter of LogisticRegression) and a quantification-dependent hyper-parameter
# (e.g., the number of bins in a DistributionMatching quantifier).
# Classifier-dependent hyper-parameters have to be marked with a prefix "classifier__"
# in order to let the quantifier know this hyper-parameter belongs to its underlying
# classifier.
param_grid = {
'classifier__C': np.logspace(-3, 3, 7),
'nbins': [8, 16, 32, 64],
}
model = qp.model_selection.GridSearchQ(
model=model,
param_grid=param_grid,
protocol=protocol,
error='mae', # the error to optimize is the MAE (a quantification-oriented loss)
refit=True, # retrain on the whole labelled set once done
verbose=True # show information as the process goes on
).fit(training)
print(f'model selection ended: best hyper-parameters={model.best_params_}')
model = model.best_model_
# evaluation in terms of MAE
# we use the same evaluation protocol (APP) on the test set
mae_score = qp.evaluation.evaluate(model, protocol=APP(test), error_metric='mae')
print(f'MAE={mae_score:.5f}')
```
In this example, the system outputs:
```
[GridSearchQ]: starting model selection with self.n_jobs =-1
[GridSearchQ]: hyperparams={'classifier__C': 0.01, 'nbins': 64} got mae score 0.04021 [took 1.1356s]
[GridSearchQ]: hyperparams={'classifier__C': 0.01, 'nbins': 32} got mae score 0.04286 [took 1.2139s]
[GridSearchQ]: hyperparams={'classifier__C': 0.01, 'nbins': 16} got mae score 0.04888 [took 1.2491s]
[GridSearchQ]: hyperparams={'classifier__C': 0.001, 'nbins': 8} got mae score 0.05163 [took 1.5372s]
[...]
[GridSearchQ]: hyperparams={'classifier__C': 1000.0, 'nbins': 32} got mae score 0.02445 [took 2.9056s]
[GridSearchQ]: optimization finished: best params {'classifier__C': 100.0, 'nbins': 32} (score=0.02234) [took 7.3114s]
[GridSearchQ]: refitting on the whole development set
model selection ended: best hyper-parameters={'classifier__C': 100.0, 'nbins': 32}
MAE=0.03102
```
## Targeting a Classification-oriented loss
Optimizing a model for quantification can be
computationally costly.
In aggregative methods, one could alternatively try to optimize
the classifier's hyper-parameters for classification.
Although this is theoretically suboptimal, many articles in
quantification literature have opted for this strategy.
In QuaPy, this is achieved by simply instantiating the
classifier learner as a GridSearchCV from scikit-learn.
The following code illustrates how to do that:
```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from quapy.method.aggregative import DistributionMatching

learner = GridSearchCV(
    LogisticRegression(),
    param_grid={'C': np.logspace(-4, 5, 10), 'class_weight': ['balanced', None]},
    cv=5)
model = DistributionMatching(learner).fit(dataset.training)
```
However, this is conceptually flawed, since the model should be
optimized for the task at hand (quantification), and not for a surrogate task (classification),
i.e., the model should be requested to deliver low quantification errors, rather
than low classification errors.

# Plotting
The module _qp.plot_ implements some basic plotting functions
that can help analyse the performance of a quantification method.
All plotting functions receive as inputs the outcomes of
some experiments and include, for each experiment,
the following three main arguments:
* _method_names_ a list containing the names of the quantification methods
* _true_prevs_ a list containing matrices of true prevalences
* _estim_prevs_ a list containing matrices of estimated prevalences
(should be of the same shape as the corresponding matrix in _true_prevs_)
Note that a method (as indicated by a name in _method_names_) can
appear more than once. This could occur when various datasets are
involved in the experiments. In this case, all experiments for the
method will be merged and the plot will represent the method's
performance across various datasets.
This is a very simple example of a valid input for the plotting functions:
```python
method_names = ['classify & count', 'EMQ', 'classify & count']
true_prevs = [
np.array([[0.5, 0.5], [0.25, 0.75]]),
    np.array([[0.0, 1.0], [0.25, 0.75], [0.0, 1.0]]),
    np.array([[0.0, 1.0], [0.25, 0.75], [0.0, 1.0]]),
]
estim_prevs = [
np.array([[0.45, 0.55], [0.6, 0.4]]),
np.array([[0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]),
    np.array([[0.1, 0.9], [0.3, 0.7], [0.0, 1.0]]),
]
```
in which _classify & count_ has been tested on two datasets and
the _EMQ_ method has been tested on only one dataset. In the first
experiment, only two (binary) samples have been quantified,
while in the second and third experiments three samples have
been quantified.
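Since these inputs are easy to get wrong, a quick sanity check can catch shape mismatches and invalid prevalence vectors early. The following helper is a standalone sketch (it is not part of QuaPy's API):

```python
import numpy as np

def check_plot_inputs(method_names, true_prevs, estim_prevs):
    # the three lists must be aligned, each pair of matrices must agree in
    # shape, and every row must be a valid prevalence vector
    # (non-negative values summing to 1)
    assert len(method_names) == len(true_prevs) == len(estim_prevs)
    for true, estim in zip(true_prevs, estim_prevs):
        assert true.shape == estim.shape
        for prevs in (true, estim):
            assert np.all(prevs >= 0) and np.allclose(prevs.sum(axis=1), 1)

check_plot_inputs(
    ['classify & count'],
    [np.array([[0.5, 0.5], [0.25, 0.75]])],
    [np.array([[0.45, 0.55], [0.6, 0.4]])],
)
```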
In general, we would like to test the performance of the
quantification methods across different scenarios showcasing
the accuracy of the quantifier in predicting class prevalences
for a wide range of prior distributions. This can easily be
achieved by means of the
[artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols)
that is implemented in QuaPy.
The following code shows how to perform one simple experiment
in which the 4 _CC-variants_, all equipped with a linear SVM, are
applied to one binary dataset of reviews about _Kindle_ devices and
tested across the entire spectrum of class priors (taking 21 splits
of the interval [0,1], i.e., using prevalence steps of 0.05, and
generating 100 random samples at each prevalence).
```python
import quapy as qp
from quapy.protocol import APP
from quapy.method.aggregative import CC, ACC, PCC, PACC
from sklearn.svm import LinearSVC
qp.environ['SAMPLE_SIZE'] = 500
def gen_data():
def base_classifier():
return LinearSVC(class_weight='balanced')
def models():
yield 'CC', CC(base_classifier())
yield 'ACC', ACC(base_classifier())
yield 'PCC', PCC(base_classifier())
yield 'PACC', PACC(base_classifier())
train, test = qp.datasets.fetch_reviews('kindle', tfidf=True, min_df=5).train_test
method_names, true_prevs, estim_prevs, tr_prevs = [], [], [], []
for method_name, model in models():
model.fit(train)
true_prev, estim_prev = qp.evaluation.prediction(model, APP(test, repeats=100, random_state=0))
method_names.append(method_name)
true_prevs.append(true_prev)
estim_prevs.append(estim_prev)
tr_prevs.append(train.prevalence())
return method_names, true_prevs, estim_prevs, tr_prevs
method_names, true_prevs, estim_prevs, tr_prevs = gen_data()
```
The plots that can be generated are explained below.
## Diagonal Plot
The _diagonal_ plot shows a very insightful view of the
quantifier's performance. It plots the predicted class
prevalence (in the y-axis) against the true class prevalence
(in the x-axis). Unfortunately, it is limited to binary quantification,
although one can simply generate as many _diagonal_ plots as there
are classes by indicating which class should be considered the
target of the plot.
The following call will produce the plot:
```python
qp.plot.binary_diagonal(method_names, true_prevs, estim_prevs, train_prev=tr_prevs[0], savepath='./plots/bin_diag.png')
```
The last argument is optional and indicates the path where to save
the plot (the file extension will determine the format; typical extensions
are '.png' or '.pdf'). If this path is not provided, then the plot
will be shown but not saved.
The resulting plot should look like:
![diagonal plot on Kindle](./wiki_examples/selected_plots/bin_diag.png)
Note that in this case, we are also indicating the training
prevalence, which is plotted on the diagonal as a cyan dot.
The color bands indicate the standard deviations of the predictions,
and can be hidden by setting the argument _show_std=False_ (see
the complete list of arguments in the documentation).
Finally, note how most quantifiers, and especially the "unadjusted"
variants CC and PCC, are strongly biased towards the
prevalence seen during training.
## Quantification bias
This plot aims at evincing the bias that any quantifier
displays with respect to the training prevalences by
means of [box plots](https://en.wikipedia.org/wiki/Box_plot).
This plot can be generated by:
```python
qp.plot.binary_bias_global(method_names, true_prevs, estim_prevs, savepath='./plots/bin_bias.png')
```
and should look like:
![bias plot on Kindle](./wiki_examples/selected_plots/bin_bias.png)
The box plots show some interesting facts:
* all methods are biased towards the training prevalence, but especially
so CC and PCC (an unbiased quantifier would have a box centered at 0)
* the bias is always positive, indicating that all methods tend to
overestimate the positive class prevalence
* CC and PCC have high variability, while ACC and especially PACC exhibit
lower variability.
Again, these plots could be generated for experiments ranging across
different datasets, and the plot will merge all data accordingly.
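For intuition, the signed quantity summarized by each box can be sketched as follows (an illustration, not QuaPy's internal code):

```python
import numpy as np

# the bias shown in the box plots is, per test sample, the signed
# difference between the estimated and the true prevalence of the
# positive class; positive values indicate overestimation
true_prevs = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
estim_prevs = np.array([[0.7, 0.3], [0.6, 0.4], [0.3, 0.7]])
bias = estim_prevs[:, 1] - true_prevs[:, 1]  # one signed value per sample
```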
Another illustrative example consists of training different CC
quantifiers at different (artificially sampled) training prevalences.
For this example, we generate training samples of 5000
documents containing 10%, 20%, ..., 90% of positives from the
IMDb dataset, and generate the bias plot again.
This example can be run by rewriting the _gen_data()_ function
like this:
```python
def gen_data():
train, test = qp.datasets.fetch_reviews('imdb', tfidf=True, min_df=5).train_test
model = CC(LinearSVC())
method_data = []
for training_prevalence in np.linspace(0.1, 0.9, 9):
training_size = 5000
        # since the problem is binary, it suffices to specify the negative prevalence (the positive one is constrained)
train_sample = train.sampling(training_size, 1-training_prevalence)
model.fit(train_sample)
true_prev, estim_prev = qp.evaluation.prediction(model, APP(test, repeats=100, random_state=0))
        method_name = 'CC$_{' + f'{int(100*training_prevalence)}' + r'\%}$'
method_data.append((method_name, true_prev, estim_prev, train_sample.prevalence()))
return zip(*method_data)
```
and the plot should now look like:
![bias plot on IMDb](./wiki_examples/selected_plots/bin_bias_cc.png)
which clearly shows a negative bias for CC variants trained on
data containing more negatives (i.e., < 50%) and positive biases
in cases containing more positives (i.e., >50%). The CC trained
at 50% behaves as an unbiased estimator of the positive class
prevalence.
The function _qp.plot.binary_bias_bins_ allows the user to
generate box plots broken down by bins of true test prevalence.
To this aim, an argument _nbins_ is passed, which indicates
how many isometric subintervals to take. For example,
the following plot is produced for _nbins=3_:
![bias plot on IMDb](./wiki_examples/selected_plots/bin_bias_bin_cc.png)
Interestingly enough, the seemingly unbiased estimator (CC at 50%) happens to display
a positive bias (or a tendency to overestimate) in cases of low prevalence
(i.e., when the true prevalence of the positive class is below 33%),
and a negative bias (or a tendency to underestimate) in cases of high prevalence
(i.e., when the true prevalence is beyond 67%).
Out of curiosity, the diagonal plot for this experiment looks like:
![diag plot on IMDb](./wiki_examples/selected_plots/bin_diag_cc.png)
showing pretty clearly the dependency of CC on the prior probabilities
of the labeled set it was trained on.
## Error by Drift
The plots discussed above are useful for analysing and comparing
the performance of different quantification methods, but are
limited to the binary case. The "error by drift" is a plot
that shows the error in predictions as a function of the
(prior probability) drift between each test sample and the
training set. Interestingly, the error and drift can both be measured
in terms of any evaluation measure for quantification (like the
ones available in _qp.error_) and can thus be computed
irrespectively of the number of classes.
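Conceptually, the two axes can be sketched as follows (a simplified standalone illustration using absolute error, not QuaPy's implementation):

```python
import numpy as np

# for each test sample, the drift (x-axis) is the absolute error between
# the training prevalence and the sample's true prevalence, while the
# quantification error (y-axis) is the absolute error between the true
# and the estimated prevalences of that sample
def ae(p, q):
    return np.abs(p - q).mean(axis=-1)

train_prev = np.array([0.5, 0.5])
true_prevs = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
estim_prevs = np.array([[0.7, 0.3], [0.5, 0.5], [0.3, 0.7]])

drift = ae(train_prev, true_prevs)    # amount of prior probability shift
error = ae(true_prevs, estim_prevs)   # quantification error per sample
```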
The following shows how to generate the plot for the 4 CC variants,
using 10 bins for the drift
and _absolute error_ as the measure of the error (the
drift in the x-axis is always computed in terms of _absolute error_ since
other errors are harder to interpret):
```python
qp.plot.error_by_drift(method_names, true_prevs, estim_prevs, tr_prevs,
error_name='ae', n_bins=10, savepath='./plots/err_drift.png')
```
![diag plot on IMDb](./wiki_examples/selected_plots/err_drift.png)
Note that all methods work reasonably well in cases of low prevalence
drift (i.e., any CC-variant is a good quantifier whenever the IID
assumption is approximately preserved). The higher the drift, the worse
those quantifiers tend to perform, although it is clear that PACC
yields the lowest error for the most difficult cases.
Remember that any plot can be generated _across many datasets_, and
that this would probably result in a more solid comparison.
In those cases, however, it is likely that the variances of each
method get higher, to the detriment of the visualization.
We recommend setting _show_std=False_ in those cases
in order to hide the color bands.

# Protocols
_New in v0.1.7!_
Quantification methods are expected to behave robustly in the presence of
shift. For this reason, quantification methods need to be confronted with
samples exhibiting widely varying amounts of shift.
_Protocols_ implement specific ways for generating such samples.
In QuaPy, a protocol is an instance of _AbstractProtocol_ implementing a
`__call__` method that returns a generator yielding a tuple _(sample, prev)_
at each iteration. The protocol can also implement a _total()_ method that
reports the total number of samples the protocol generates.
Protocols can inherit from _AbstractStochasticSeededProtocol_, the class of
protocols that generate samples stochastically, but that can be set with
a seed in order to allow for replicating the exact same samples. This is important
for evaluation purposes, since we typically require all our methods be evaluated
on the exact same test samples in order to allow for a fair comparison.
Indeed, the seed is set by default to 0, since this is the most commonly
desired behaviour. Indicate _random_state=None_ to allow different sequences of samples to be
generated every time the protocol is invoked.
Protocols that also inherit from _OnLabelledCollectionProtocol_ are such that
samples are generated from a _LabelledCollection_ object (e.g., a test collection,
or a validation collection). These protocols also allow for generating sequences of
_LabelledCollection_ instead of _(sample, prev)_ by indicating
_return_type='labelled_collection'_ instead of the default value _return_type='sample_prev'_.
For a more technical explanation on _AbstractStochasticSeededProtocol_ and
_OnLabelledCollectionProtocol_, see the "custom_protocol.py" provided in the
example folder.
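To make the interface concrete, here is a minimal protocol-like class (a standalone sketch with no QuaPy dependency; a real protocol would inherit from _AbstractProtocol_, as the example in the repository does):

```python
import numpy as np

class UniformSampleProtocol:
    """Yields `repeats` random samples of fixed size, each paired with its
    true prevalence vector, illustrating the (sample, prev) contract."""

    def __init__(self, X, y, sample_size=100, repeats=10, random_state=0):
        self.X, self.y = X, y
        self.sample_size = sample_size
        self.repeats = repeats
        self.random_state = random_state

    def __call__(self):
        rng = np.random.default_rng(self.random_state)
        n_classes = len(np.unique(self.y))
        for _ in range(self.repeats):
            # draw a uniform random sample and compute its true prevalence
            idx = rng.choice(len(self.y), size=self.sample_size, replace=False)
            prev = np.bincount(self.y[idx], minlength=n_classes) / self.sample_size
            yield self.X[idx], prev

    def total(self):
        return self.repeats
```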
QuaPy provides implementations of most popular sample generation protocols
used in literature. This is the subject of the following sections.
## Artificial-Prevalence Protocol
The "artificial-sampling protocol" (APP) proposed by
[Forman (2005)](https://link.springer.com/chapter/10.1007/11564096_55)
is likely the most popular protocol used for quantification evaluation.
In APP, a test set is used to generate samples at
desired prevalence values covering the full spectrum.
In APP, the user specifies the number
of (equally distant) points to be generated from the interval [0,1];
in QuaPy this is achieved by setting _n_prevpoints_.
For example, if _n_prevpoints=11_ then, for each class, the prevalence values
[0., 0.1, 0.2, ..., 1.] will be used. This means that, for two classes,
the number of different prevalence values will be 11 (since, once the prevalence
of one class is determined, the other one is constrained). For 3 classes,
the number of valid combinations can be obtained as 11 + 10 + ... + 1 = 66.
In general, the number of valid combinations that will be produced for a given
value of n_prevpoints can be consulted by invoking
_num_prevalence_combinations_, e.g.:
```python
import quapy.functional as F
n_prevpoints = 21
n_classes = 4
n = F.num_prevalence_combinations(n_prevpoints, n_classes, n_repeats=1)
```
in this example, _n=1771_. Note the last argument, _n_repeats_, which
indicates the number of samples that will be generated for any
valid combination (typical values are, e.g., 1 for a single sample,
or 10 or higher for computing standard deviations or performing statistical
significance tests).
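The underlying count follows from a stars-and-bars argument, and can be verified with a standalone computation (a sketch independent of QuaPy's implementation):

```python
from math import comb

def count_prev_combinations(n_prevpoints, n_classes, n_repeats=1):
    # prevalence grids with step 1/(n_prevpoints-1) summing to 1 correspond
    # to weak compositions of (n_prevpoints-1) into n_classes non-negative
    # parts: a stars-and-bars count, times the repeats per combination
    return comb(n_prevpoints - 1 + n_classes - 1, n_classes - 1) * n_repeats

print(count_prev_combinations(11, 2))   # 11 (binary case above)
print(count_prev_combinations(11, 3))   # 66
print(count_prev_combinations(21, 4))   # 1771
```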
One can instead work the other way around, i.e., one could decide on a
maximum budget of evaluations and get the number of prevalence points that
will give rise to a number of evaluations close to, but not higher than,
this budget. This can be achieved with the function
_get_nprevpoints_approximation_, e.g.:
```python
budget = 5000
n_prevpoints = F.get_nprevpoints_approximation(budget, n_classes, n_repeats=1)
n = F.num_prevalence_combinations(n_prevpoints, n_classes, n_repeats=1)
print(f'by setting n_prevpoints={n_prevpoints} the number of evaluations for {n_classes} classes will be {n}')
```
this will produce the following output:
```
by setting n_prevpoints=30 the number of evaluations for 4 classes will be 4960
```
The following code shows an example of usage of APP for model selection
and evaluation:
```python
import quapy as qp
from quapy.method.aggregative import ACC
from quapy.protocol import APP
import numpy as np
from sklearn.linear_model import LogisticRegression
qp.environ['SAMPLE_SIZE'] = 100
qp.environ['N_JOBS'] = -1
# define an instance of our custom quantifier
quantifier = ACC(LogisticRegression())
# load the IMDb dataset
train, test = qp.datasets.fetch_reviews('imdb', tfidf=True, min_df=5).train_test
# model selection
train, val = train.split_stratified(train_prop=0.75)
quantifier = qp.model_selection.GridSearchQ(
quantifier,
param_grid={'classifier__C': np.logspace(-2, 2, 5)},
protocol=APP(val) # <- this is the protocol we use for generating validation samples
).fit(train)
# default values are n_prevalences=21, repeats=10, random_state=0; this is equivalent to:
# val_app = APP(val, n_prevalences=21, repeats=10, random_state=0)
# quantifier = GridSearchQ(quantifier, param_grid, protocol=val_app).fit(train)
# evaluation with APP
mae = qp.evaluation.evaluate(quantifier, protocol=APP(test), error_metric='mae')
print(f'MAE = {mae:.4f}')
```
Note that APP is an instance of _AbstractStochasticSeededProtocol_ and that the
_random_state_ is by default set to 0, meaning that all the generated validation
samples will be consistent for all the combinations of hyperparameters being tested.
Note also that the _sample_size_ is not indicated when instantiating the protocol;
in such cases QuaPy takes the value of _qp.environ['SAMPLE_SIZE']_.
This protocol is useful for testing a quantifier under conditions of
_prior probability shift_.
## Sampling from the unit-simplex, the Uniform-Prevalence Protocol (UPP)
Generating all possible combinations from a grid of prevalence values (APP) in
the multiclass setting is cumbersome, and when the number of classes increases it rapidly
becomes impractical. In some cases, it is preferable to generate a fixed number
of samples displaying prevalence values that are uniformly drawn from the unit-simplex,
that is, so that every legitimate distribution is equally likely. The main drawback
of this approach is that we are not guaranteed that all classes have been tested
in the entire range of prevalence values. The main advantage is that every possible
prevalence value can be drawn (this was not possible with standard APP, since values
not included in the grid are never tested). Yet another advantage is that we can
control the computational burden every evaluation incurs, by deciding in advance
the number of samples to generate.
The UPP protocol implements this idea by relying on the Kraemer algorithm
for sampling from the unit-simplex as many vectors of prevalence values as indicated
in the _repeats_ parameter. UPP can be instantiated as:
```python
protocol = qp.protocol.UPP(test, repeats=100)
```
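For intuition, the Kraemer sampling step itself can be sketched in a few lines of NumPy (a simplified standalone illustration, not QuaPy's actual implementation):

```python
import numpy as np

def uniform_simplex_sampling(n_classes, repeats, random_state=0):
    # Kraemer's method: sort n_classes-1 uniform cut points in [0,1] and
    # take the gaps between consecutive points (padded with 0 and 1); the
    # resulting vectors are uniformly distributed over the unit simplex
    rng = np.random.default_rng(random_state)
    cuts = np.sort(rng.uniform(size=(repeats, n_classes - 1)), axis=-1)
    cuts = np.hstack([np.zeros((repeats, 1)), cuts, np.ones((repeats, 1))])
    return np.diff(cuts, axis=-1)

prevs = uniform_simplex_sampling(n_classes=4, repeats=100)
# every row is a valid prevalence vector: non-negative and summing to 1
```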
This is the most convenient protocol for datasets
containing many classes; see, e.g.,
[LeQua (2022)](https://ceur-ws.org/Vol-3180/paper-146.pdf),
and is useful for testing a quantifier under conditions of
_prior probability shift_.
## Natural-Prevalence Protocol
The "natural-prevalence protocol" (NPP) comes down to generating samples drawn
uniformly at random from the original labelled collection. This protocol has
sometimes been used in literature, although it is now considered to be deprecated,
due to its limited capability to generate interesting amounts of shift.
All other things being equal, this protocol can be used just like APP or UPP,
and is instantiated via:
```python
protocol = qp.protocol.NPP(test, repeats=100)
```
## Other protocols
Other protocols exist in QuaPy, and new ones will be added to the `qp.protocol` module.
