broken submodule fix

trying pca reduction for cifar 100
log densities for kde
2026-02-12 13:14:59 +01:00 · 2026-01-31 00:11:19 +01:00 · 2026-01-27 17:58:53 +01:00 · 2026-01-23 18:05:23 +01:00 · 2026-01-20 10:53:16 +01:00 · 2026-01-15 09:41:37 +01:00
36 changed files with 3272 additions and 352 deletions
--- a/.gitmodules
+++ b/.gitmodules
@ -0,0 +1,3 @@
+[submodule "result_table"]
+	path = result_table
+	url = gitea@gitea-s2i2s.isti.cnr.it:moreo/result_table.git
--- a/BayesianKDEy/TODO.txt
+++ b/BayesianKDEy/TODO.txt
@ -1,7 +1,40 @@
+- Things to try:
+    - Why not optmize the calibration of the classifier, instead of the classifier as a component of the quantifier?
+    - init chain helps? [seems irrelevant in MAPLS...]
+    - Aitchison kernel is better?
+    - other classifiers?
+    - optimize classifier?
+    - use all datasets?
+    - improve KDE on wine-quality?
 - Add other methods that natively provide uncertainty quantification methods?
+    Ratio estimator
+    Card & Smith
 - MPIW (Mean Prediction Interval Width): is the average of the amplitudes (w/o aggregating coverage whatsoever)
 - Implement Interval Score or Winkler Score
 - analyze across shift
- add Bayesian EM
- optimize also C and class_weight?
+- add Bayesian EM:
+    - https://github.com/ChangkunYe/MAPLS/blob/main/label_shift/mapls.py
+    - take this opportunity to add RLLS:
+        https://github.com/Angie-Liu/labelshift
+        https://github.com/ChangkunYe/MAPLS/blob/main/label_shift/rlls.py
+- add CIFAR10 and MNIST? Maybe consider also previously tested types of shift (tweak-one-out, etc.)? from RLLS paper
+    - https://github.com/Angie-Liu/labelshift/tree/master
+    - https://github.com/Angie-Liu/labelshift/blob/master/cifar10_for_labelshift.py
+    - Note: MNIST is downloadable from https://archive.ics.uci.edu/dataset/683/mnist+database+of+handwritten+digits
+    - Seem to be some pretrained models in:
+        https://github.com/geifmany/cifar-vgg
+        https://github.com/EN10/KerasMNIST
+        https://github.com/tohinz/SVHN-Classifier
+- consider prior knowledge in experiments:
+    - One scenario in which our prior is uninformative (i.e., uniform)
+    - One scenario in which our prior is wrong (e.g., alpha-prior = (3,2,1), protocol-prior=(1,1,5))
+    - One scenario in which our prior is very good (e.g., alpha-prior = (3,2,1), protocol-prior=(3,2,1))
+    - Do all my baseline methods come with the option to inform a prior?
+- consider different bandwidths within the bayesian approach?
+- how to improve the coverage (or how to increase the amplitude)?
+    - Added temperature-calibration, improve things.
+    - Is temperature-calibration actually not equivalent to using a larger bandwidth in the kernels?
+- consider W as a measure of quantification error (the current e.g., w-CI is the winkler...)
+- optimize also C and class_weight? [I don't think so, but could be done easily now]

+- remove wikis from repo
--- a/BayesianKDEy/_bayeisan_kdey.py
+++ b/BayesianKDEy/_bayeisan_kdey.py
@ -1,16 +1,24 @@
-from numpy.ma.core import shape
 from sklearn.base import BaseEstimator
 import numpy as np
+from sklearn.decomposition import PCA

-import quapy.util
+from BayesianKDEy.commons import ILRtransformation, in_simplex
 from quapy.method._kdey import KDEBase
 from quapy.method.confidence import WithConfidenceABC, ConfidenceRegionABC
-from quapy.functional import CLRtransformation, ILRtransformation
+from quapy.functional import CLRtransformation
 from quapy.method.aggregative import AggregativeSoftQuantifier
 from tqdm import tqdm
 import quapy.functional as F
-#import emcee
+import emcee
+from collections.abc import Iterable
+from numbers import Number

+import jax
+import jax.numpy as jnp
+from jax.scipy.special import logsumexp
+import numpyro
+import numpyro.distributions as dist
+from numpyro.infer import MCMC, NUTS


 class BayesianKDEy(AggregativeSoftQuantifier, KDEBase, WithConfidenceABC):
@ -33,6 +41,8 @@ class BayesianKDEy(AggregativeSoftQuantifier, KDEBase, WithConfidenceABC):
    :param region: string, set to `intervals` for constructing confidence intervals (default), or to
        `ellipse` for constructing an ellipse in the probability simplex, or to `ellipse-clr` for
        constructing an ellipse in the Centered-Log Ratio (CLR) unconstrained space.
+    :param prior: an array-list with the alpha parameters of a Dirichlet prior, or the string 'uniform'
+        for a uniform, uninformative prior (default)
    :param verbose: bool, whether to display progress bar
    """
    def __init__(self,
@ -49,7 +59,11 @@ class BayesianKDEy(AggregativeSoftQuantifier, KDEBase, WithConfidenceABC):
                 explore='simplex',
                 step_size=0.05,
                 temperature=1.,
-                 verbose: bool = False):
+                 engine='numpyro',
+                 prior='uniform',
+                 reduce=None,
+                 verbose: bool = False, 
+                 **kwargs):

        if num_warmup <= 0:
            raise ValueError(f'parameter {num_warmup=} must be a positive integer')
@ -57,9 +71,16 @@ class BayesianKDEy(AggregativeSoftQuantifier, KDEBase, WithConfidenceABC):
            raise ValueError(f'parameter {num_samples=} must be a positive integer')
        assert explore in ['simplex', 'clr', 'ilr'], \
            f'unexpected value for param {explore=}; valid ones are "simplex", "clr", and "ilr"'
-        assert temperature>0., f'temperature must be >0'
+        assert ((isinstance(prior, str) and prior == 'uniform') or
+                (isinstance(prior, Iterable) and all(isinstance(v, Number) for v in prior))), \
+            f'wrong type for {prior=}; expected "uniform" or an array-like of real values'
+        # assert temperature>0., f'temperature must be >0'
+        assert engine in ['rw-mh', 'emcee', 'numpyro']

        super().__init__(classifier, fit_classifier, val_split)
+        assert all(k.startswith('classifier__') for k in kwargs.keys()), 'unexpected kwargs; must start with "classifier__"'
+        self.classifier.set_params(**{k.replace('classifier__', ''):v for k,v in kwargs.items()})  # <- improve
+
        self.bandwidth = KDEBase._check_bandwidth(bandwidth, kernel)
        self.kernel = self._check_kernel(kernel)
        self.num_warmup = num_warmup
@ -70,15 +91,35 @@ class BayesianKDEy(AggregativeSoftQuantifier, KDEBase, WithConfidenceABC):
        self.explore = explore
        self.step_size = step_size
        self.temperature = temperature
+        self.engine = engine
+        self.prior = prior
+        self.reduce = reduce
        self.verbose = verbose

    def aggregation_fit(self, classif_predictions, labels):
+        if self.reduce is not None:
+            self.pca = PCA(n_components=self.reduce)
+            classif_predictions = self.pca.fit_transform(classif_predictions)
+            #print(f'reduce to ', classif_predictions.shape)
        self.mix_densities = self.get_mixture_components(classif_predictions, labels, self.classes_, self.bandwidth, self.kernel)
+        #print('num mix ', len(self.mix_densities))
        return self

-    def aggregate(self, classif_predictions):
-        # self.prevalence_samples = self._bayesian_kde(classif_predictions, init=None, verbose=self.verbose)
-        self.prevalence_samples = self._bayesian_emcee(classif_predictions)
+    def aggregate(self, classif_predictions: np.ndarray):
+        if hasattr(self, 'pca'):
+            classif_predictions = self.pca.transform(classif_predictions)
+
+        if self.engine == 'rw-mh':
+            if self.prior != 'uniform':
+                raise RuntimeError('prior is not yet implemented in rw-mh')
+            self.prevalence_samples = self._bayesian_kde(classif_predictions, init=None, verbose=self.verbose)
+        elif self.engine == 'emcee':
+            if self.prior != 'uniform':
+                raise RuntimeError('prior is not yet implemented in emcee')
+            self.prevalence_samples = self._bayesian_emcee(classif_predictions)
+        elif self.engine == 'numpyro':
+            self.ilr = ILRtransformation(jax_mode=True)
+            self.prevalence_samples = self._bayesian_numpyro(classif_predictions)
        return self.prevalence_samples.mean(axis=0)

    def predict_conf(self, instances, confidence_level=None) -> (np.ndarray, ConfidenceRegionABC):
@ -187,6 +228,8 @@ class BayesianKDEy(AggregativeSoftQuantifier, KDEBase, WithConfidenceABC):
    def _bayesian_emcee(self, X_probs):
        ndim = X_probs.shape[1]
        nwalkers = 32
+        if nwalkers < (ndim*2):
+            nwalkers = ndim * 2 + 1

        f = CLRtransformation()

@ -198,6 +241,7 @@ class BayesianKDEy(AggregativeSoftQuantifier, KDEBase, WithConfidenceABC):

        kdes = self.mix_densities
        test_densities = np.asarray([self.pdf(kde_i, X_probs, self.kernel) for kde_i in kdes])
+        # test_densities_unconstrained = [f(t) for t in test_densities]

        # p0 = np.random.normal(nwalkers, ndim)
        p0 = F.uniform_prevalence_sampling(ndim, nwalkers)
@ -211,5 +255,97 @@ class BayesianKDEy(AggregativeSoftQuantifier, KDEBase, WithConfidenceABC):
        samples = f.inverse(samples)
        return samples

-def in_simplex(x):
-    return np.all(x >= 0) and np.isclose(x.sum(), 1)
+    def _bayesian_numpyro(self, X_probs):
+        #print("bayesian_numpyro", X_probs.shape)
+        kdes = self.mix_densities
+        # test_densities = np.asarray(
+        #     [self.pdf(kde_i, X_probs, self.kernel) for kde_i in kdes]
+        # )
+        test_log_densities = np.asarray(
+            [self.pdf(kde_i, X_probs, self.kernel, log_densities=True) for kde_i in kdes]
+        )
+        #print(f'min={np.min(test_log_densities)}')
+        #print(f'max={np.max(test_log_densities)}')
+        #print("bayesian_numpyro", test_log_densities.shape)
+        #print("len kdes ", len(kdes))
+        # import sys
+        # sys.exit(0)
+
+        n_classes = len(kdes)
+        if isinstance(self.prior, str) and self.prior == 'uniform':
+            alpha = [1.] * n_classes
+        else:
+            alpha = self.prior
+
+        kernel = NUTS(self._numpyro_model)
+        mcmc = MCMC(
+            kernel,
+            num_warmup=self.num_warmup,
+            num_samples=self.num_samples,
+            num_chains=1,
+            progress_bar=self.verbose,
+        )
+
+        rng_key = jax.random.PRNGKey(self.mcmc_seed)
+        mcmc.run(rng_key, test_log_densities=test_log_densities, alpha=alpha)
+
+        samples_z = mcmc.get_samples()["z"]
+
+        # back to simplex
+        samples_prev = np.asarray(self.ilr.inverse(np.asarray(samples_z)))
+
+        return samples_prev
+
+    def _numpyro_model(self, test_log_densities, alpha):
+        """
+        test_densities: shape (n_classes, n_instances,)
+        """
+        n_classes = test_log_densities.shape[0]
+
+        # sample in unconstrained R^(n_classes-1)
+        z = numpyro.sample(
+            "z",
+            dist.Normal(0.0, 1.0).expand([n_classes - 1])
+        )
+
+        prev = self.ilr.inverse(z)  # simplex, shape (n_classes,)
+
+        # for numerical stability:
+        # eps = 1e-10
+        # prev_safe = jnp.clip(prev, eps, 1.0)
+        # prev_safe = prev_safe / jnp.sum(prev_safe)
+        # prev = prev_safe
+
+        # from jax import debug
+        # debug.print("prev = {}", prev)
+        # numpyro.factor(
+        #     "check_prev",
+        #     jnp.where(jnp.all(prev > 0), 0.0, -jnp.inf)
+        # )
+
+        # prior
+        if alpha is not None:
+            alpha = jnp.asarray(alpha)
+            numpyro.factor(
+                'log_prior', dist.Dirichlet(alpha).log_prob(prev)
+            )
+        # if alpha is None, then this corresponds to a weak logistic-normal prior
+
+        # likelihood
+        # test_densities = jnp.array(test_densities)
+        # likelihoods = jnp.dot(prev, test_densities)
+        # likelihoods = jnp.clip(likelihoods, 1e-12, jnp.inf) # numerical stability
+        # numpyro.factor(
+        #     "loglik", (1.0 / self.temperature) * jnp.sum(jnp.log(likelihoods))
+        # )
+
+        log_likelihood = jnp.sum(
+            logsumexp(jnp.log(prev)[:, None] + test_log_densities, axis=0)
+        )
+        numpyro.factor(
+            "loglik", (1.0 / self.temperature) * log_likelihood
+        )
+
+
+
+
--- a/BayesianKDEy/_bayesian_mapls.py
+++ b/BayesianKDEy/_bayesian_mapls.py
@ -0,0 +1,322 @@
+import jax.numpy as jnp
+import numpy as np
+import numpyro
+import numpyro.distributions as dist
+from numpyro.infer import MCMC, NUTS, HMC
+import jax.random as random
+from sklearn.base import BaseEstimator
+from jax.scipy.special import logsumexp
+from BayesianKDEy.commons import ILRtransformation
+from quapy.method.aggregative import AggregativeSoftQuantifier
+from quapy.method.confidence import WithConfidenceABC, ConfidenceRegionABC
+import quapy.functional as F
+
+
+class BayesianMAPLS(AggregativeSoftQuantifier, WithConfidenceABC):
+    """
+
+    :param classifier:
+    :param fit_classifier:
+    :param val_split:
+    :param exact_train_prev: set to True (default) for using the true training prevalence as the initial
+        observation; set to False for computing the training prevalence as an estimate of it, i.e., as the
+        expected value of the posterior probabilities of the training instances.
+    :param num_samples:
+    :param mcmc_seed:
+    :param confidence_level:
+    :param region:
+    """
+
+    def __init__(self,
+                 classifier: BaseEstimator = None,
+                 fit_classifier=True,
+                 val_split: int = 5,
+                 exact_train_prev=True,
+                 num_warmup: int = 500,
+                 num_samples: int = 1_000,
+                 mcmc_seed: int = 0,
+                 confidence_level: float = 0.95,
+                 region: str = 'intervals',
+                 temperature=1.,
+                 prior='uniform',
+                 mapls_chain_init=True,
+                 verbose=False
+                 ):
+
+        if num_samples <= 0:
+            raise ValueError(f'parameter {num_samples=} must be a positive integer')
+        super().__init__(classifier, fit_classifier, val_split)
+        self.exact_train_prev = exact_train_prev
+        self.num_warmup = num_warmup
+        self.num_samples = num_samples
+        self.mcmc_seed = mcmc_seed
+        self.confidence_level = confidence_level
+        self.region = region
+        self.temperature = temperature
+        self.prior = prior
+        self.mapls_chain_init = mapls_chain_init
+        self.verbose = verbose
+
+    def aggregation_fit(self, classif_predictions, labels):
+        self.train_post = classif_predictions
+        if self.exact_train_prev:
+            self.train_prevalence = F.prevalence_from_labels(labels, classes=self.classes_)
+        else:
+            self.train_prevalence = F.prevalence_from_probabilities(classif_predictions)
+        self.ilr = ILRtransformation(jax_mode=True)
+        return self
+
+    def aggregate(self, classif_predictions: np.ndarray):
+        n_test, n_classes = classif_predictions.shape
+
+        pi_star, lam = mapls(
+            self.train_post,
+            test_probs=classif_predictions,
+            pz=self.train_prevalence,
+            return_lambda=True
+        )
+
+        # pi_star: MAP in simplex shape (n_classes,) and convert to ILR space
+        z0 = self.ilr(pi_star)
+
+        if self.prior == 'uniform':
+            alpha = [1.] * n_classes
+        elif self.prior == 'map':
+            alpha_0 = alpha0_from_lamda(lam, n_test=n_test, n_classes=n_classes)
+            alpha = [alpha_0] * n_classes
+        elif self.prior == 'map2':
+            lam2 = get_lamda(
+                test_probs=classif_predictions,
+                pz=self.train_prevalence,
+                q_prior=pi_star,
+                dvg=kl_div
+            )
+            alpha_0 = alpha0_from_lamda(lam2, n_test=n_test, n_classes=n_classes)
+            alpha = [alpha_0] * n_classes
+        else:
+            alpha = self.prior
+
+        kernel = NUTS(self.model)
+        mcmc = MCMC(
+            kernel,
+            num_warmup=self.num_warmup,
+            num_samples=self.num_samples,
+            num_chains=1,
+            progress_bar=self.verbose
+        )
+
+        mcmc.run(
+            random.PRNGKey(self.mcmc_seed),
+            test_posteriors=classif_predictions,
+            alpha=alpha,
+            init_params={"z": z0} if self.mapls_chain_init else None
+        )
+
+        samples = mcmc.get_samples()["z"]
+        self.prevalence_samples = self.ilr.inverse(samples)
+        return self.prevalence_samples.mean(axis=0)
+
+    def predict_conf(self, instances, confidence_level=None) -> (np.ndarray, ConfidenceRegionABC):
+        if confidence_level is None:
+            confidence_level = self.confidence_level
+        classif_predictions = self.classify(instances)
+        point_estimate = self.aggregate(classif_predictions)
+        samples = self.prevalence_samples  # available after calling "aggregate" function
+        region = WithConfidenceABC.construct_region(samples, confidence_level=confidence_level, method=self.region)
+        return point_estimate, region
+
+    def log_likelihood(self, test_classif, test_prev, train_prev):
+        # n_test = test_classif.shape[0]
+        log_w = jnp.log(test_prev) - jnp.log(train_prev)
+        # return (1/n_test) * jnp.sum(
+        #     logsumexp(jnp.log(test_classif) + log_w, axis=-1)
+        # )
+        return jnp.sum(
+            logsumexp(jnp.log(test_classif) + log_w, axis=-1)
+        )
+
+    def model(self, test_posteriors, alpha):
+        test_posteriors = jnp.array(test_posteriors)
+        n_classes = test_posteriors.shape[1]
+
+        # prior in ILR
+        z = numpyro.sample(
+            "z",
+            dist.Normal(jnp.zeros(n_classes-1), 1.0)
+        )
+
+        # back to simplex
+        prev = self.ilr.inverse(z)
+        train_prev = jnp.array(self.train_prevalence)
+
+        # prior
+        alpha = jnp.array(alpha)
+        numpyro.factor(
+            "dirichlet_prior",
+            dist.Dirichlet(alpha).log_prob(prev)
+        )
+
+        # Likelihood
+        numpyro.factor(
+            "likelihood",
+            (1.0 / self.temperature) * self.log_likelihood(test_posteriors, test_prev=prev, train_prev=train_prev)
+        )
+
+
+# adapted from https://github.com/ChangkunYe/MAPLS/blob/main/label_shift/mapls.py
+
+def mapls(train_probs: np.ndarray,
+          test_probs: np.ndarray,
+          pz: np.ndarray,
+          qy_mode: str = 'soft',
+          max_iter: int = 100,
+          init_mode: str = 'identical',
+          lam: float = None,
+          dvg_name='kl',
+          return_lambda=False
+          ):
+    r"""
+    Implementation of Maximum A Posteriori Label Shift,
+    for Unknown target label distribution estimation
+
+    Given source domain P(Y_s=i|X_s=x) = f(x) and P(Y_s=i),
+    estimate targe domain P(Y_t=i) on test set
+
+    """
+    # Sanity Check
+    cls_num = len(pz)
+    assert test_probs.shape[-1] == cls_num
+    if type(max_iter) != int or max_iter < 0:
+        raise Exception('max_iter should be a positive integer, not ' + str(max_iter))
+
+    # Setup d(p,q) measure
+    if dvg_name == 'kl':
+        dvg = kl_div
+    elif dvg_name == 'js':
+        dvg = js_div
+    else:
+        raise Exception('Unsupported distribution distance measure, expect kl or js.')
+
+    # Set Prior of Target Label Distribution
+    q_prior = np.ones(cls_num) / cls_num
+    # q_prior = pz.copy()
+
+    # Lambda estimation-------------------------------------------------------#
+    if lam is None:
+        # logging.info('Data shape: %s, %s' % (str(train_probs.shape), str(test_probs.shape)))
+        # logging.info('Divergence type is %s' % (dvg))
+        lam = get_lamda(test_probs, pz, q_prior, dvg=dvg, max_iter=max_iter)
+        # logging.info('Estimated lambda value is %.4f' % lam)
+    # else:
+    #     logging.info('Assigned lambda is %.4f' % lam)
+
+    # EM Algorithm Computation
+    qz = mapls_EM(test_probs, pz, lam, q_prior, cls_num,
+                  init_mode=init_mode, max_iter=max_iter, qy_mode=qy_mode)
+
+    if return_lambda:
+        return qz, lam
+    else:
+        return qz
+
+
+def mapls_EM(probs, pz, lam, q_prior, cls_num, init_mode='identical', max_iter=100, qy_mode='soft'):
+    # Normalize Source Label Distribution pz
+    pz = np.array(pz) / np.sum(pz)
+    # Initialize Target Label Distribution qz
+    if init_mode == 'uniform':
+        qz = np.ones(cls_num) / cls_num
+    elif init_mode == 'identical':
+        qz = pz.copy()
+    else:
+        raise ValueError('init_mode should be either "uniform" or "identical"')
+
+    # Initialize w
+    w = (np.array(qz) / np.array(pz))
+    # EM algorithm with MAP estimation----------------------------------------#
+    for i in range(max_iter):
+        # print('w shape ', w.shape)
+
+        # E-Step--------------------------------------------------------------#
+        mapls_probs = normalized(probs * w, axis=-1, order=1)
+
+        # M-Step--------------------------------------------------------------#
+        if qy_mode == 'hard':
+            pred = np.argmax(mapls_probs, axis=-1)
+            qz_new = np.bincount(pred.reshape(-1), minlength=cls_num)
+        elif qy_mode == 'soft':
+            qz_new = np.mean(mapls_probs, axis=0)
+        # elif qy_mode == 'topk':
+        #     qz_new = Topk_qy(mapls_probs, cls_num, topk_ratio=0.9, head=0)
+        else:
+            raise Exception('mapls mode should be either "soft" or "hard". ')
+        # print(np.shape(pc_probs), np.shape(pred), np.shape(cls_num_list_t))
+
+        # Update w with MAP estimation of Target Label Distribution qz
+        # qz = (qz_new + alpha) / (N + np.sum(alpha))
+        qz = lam * qz_new + (1 - lam) * q_prior
+        qz /= qz.sum()
+        w = qz / pz
+
+    return qz
+
+
+def get_lamda(test_probs, pz, q_prior, dvg, max_iter=50):
+    K = len(pz)
+
+    # MLLS estimation of source and target domain label distribution
+    qz_pred = mapls_EM(test_probs, pz, 1, 0, K, max_iter=max_iter)
+
+    TU_div = dvg(qz_pred, q_prior)
+    TS_div = dvg(qz_pred, pz)
+    SU_div = dvg(pz, q_prior)
+    # logging.info('weights are, TU_div %.4f, TS_div %.4f, SU_div %.4f' % (TU_div, TS_div, SU_div))
+
+    SU_conf = 1 - lam_forward(SU_div, lam_inv(dpq=0.5, lam=0.2))
+    TU_conf = lam_forward(TU_div, lam_inv(dpq=0.5, lam=SU_conf))
+    TS_conf = lam_forward(TS_div, lam_inv(dpq=0.5, lam=SU_conf))
+    # logging.info('weights are, unviform_weight %.4f, differ_weight %.4f, regularize weight %.4f'
+    #              % (TU_conf, TS_conf, SU_conf))
+
+    confs = np.array([TU_conf, 1 - TS_conf])
+    w = np.array([0.9, 0.1])
+    lam = np.sum(w * confs)
+
+    # logging.info('Estimated lambda is: %.4f', lam)
+
+    return lam
+
+
+def lam_inv(dpq, lam):
+    return (1 / (1 - lam) - 1) / dpq
+
+
+def lam_forward(dpq, gamma):
+    return gamma * dpq / (1 + gamma * dpq)
+
+
+def kl_div(p, q, eps=1e-12):
+    p = np.asarray(p, dtype=float)
+    q = np.asarray(q, dtype=float)
+
+    mask = p > 0
+    return np.sum(p[mask] * np.log(p[mask] / (q[mask] + eps)))
+
+
+def js_div(p, q):
+    assert (np.abs(np.sum(p) - 1) < 1e-6) and (np.abs(np.sum(q) - 1) < 1e-6)
+    m = (p + q) / 2
+    return kl_div(p, m) / 2 + kl_div(q, m) / 2
+
+
+def normalized(a, axis=-1, order=2):
+    r"""
+    Prediction Normalization
+    """
+    l2 = np.atleast_1d(np.linalg.norm(a, order, axis))
+    l2[l2 == 0] = 1
+    return a / np.expand_dims(l2, axis)
+
+
+def alpha0_from_lamda(lam, n_test, n_classes):
+    return 1+n_test*(1-lam)/(lam*n_classes)
--- a/BayesianKDEy/bandwidth_and_dimensionality.py
+++ b/BayesianKDEy/bandwidth_and_dimensionality.py
@ -0,0 +1,78 @@
+from sklearn.neighbors import KernelDensity
+import quapy.functional as F
+import numpy as np
+
+import numpy as np
+import matplotlib.pyplot as plt
+from sklearn.neighbors import KernelDensity
+import quapy.functional as F
+
+# aitchison=True
+aitchison=False
+clr = F.CLRtransformation()
+
+# h = 0.1
+# dims = list(range(5, 100, 5))
+dims = [10, 28, 100]
+
+center_densities = []
+vertex_densities = []
+
+center_densities_scaled = []
+vertex_densities_scaled = []
+
+
+for n in dims:
+    h0 = 0.4
+    simplex_center = F.uniform_prevalence(n)
+    simplex_vertex = np.asarray([.9] + [.1/ (n - 1)] * (n - 1), dtype=float)
+
+    # KDE trained on a single point (the center)
+    kde = KernelDensity(bandwidth=h0)
+    X = simplex_center[None, :]
+    if aitchison:
+        X = clr(X)
+    kde.fit(X)
+
+    X = np.vstack([simplex_center, simplex_vertex])
+    if aitchison:
+        X = clr(X)
+    density = np.exp(kde.score_samples(X))
+
+    center_densities.append(density[0])
+    vertex_densities.append(density[1])
+
+    h1= h0 * np.sqrt(n / 2)
+    # KDE trained on a single point (the center)
+    kde = KernelDensity(bandwidth=h1)
+    X = simplex_center[None, :]
+    if aitchison:
+        X = clr(X)
+    kde.fit(X)
+
+    X = np.vstack([simplex_center, simplex_vertex])
+    if aitchison:
+        X = clr(X)
+    density = np.exp(kde.score_samples(X))
+
+    center_densities_scaled.append(density[0])
+    vertex_densities_scaled.append(density[1])
+
+# Plot
+plt.figure(figsize=(6*4, 4*4))
+plt.plot(dims, center_densities, marker='o', label='Center of simplex')
+plt.plot(dims, vertex_densities, marker='s', label='Vertex of simplex')
+plt.plot(dims, center_densities_scaled, marker='o', label='Center of simplex (scaled)')
+plt.plot(dims, vertex_densities_scaled, marker='s', label='Vertex of simplex (scaled)')
+
+plt.xlabel('Number of classes (simplex dimension)')
+# plt.ylim(min(center_densities+vertex_densities), max(center_densities+vertex_densities))
+plt.ylabel('Kernel density')
+plt.yscale('log')  # crucial to see anything meaningful
+plt.title(f'KDE density vs dimension (bandwidth = {h0}) in {"Simplex" if not aitchison else "ILR-space"}')
+plt.legend()
+plt.grid(alpha=0.3)
+
+plt.tight_layout()
+plt.show()
+
--- a/BayesianKDEy/commons.py
+++ b/BayesianKDEy/commons.py
@ -0,0 +1,195 @@
+import os
+from functools import lru_cache
+from pathlib import Path
+
+from jax import numpy as jnp
+from sklearn.base import BaseEstimator
+from sklearn.decomposition import PCA
+
+import quapy.functional as F
+
+import quapy as qp
+import numpy as np
+
+from quapy.method.aggregative import KDEyML
+from quapy.functional import ILRtransformation
+from scipy.stats import entropy
+
+RESULT_DIR = Path('results')
+
+
+# utils
+def experiment_path(dir:Path, dataset_name:str, method_name:str):
+    os.makedirs(dir, exist_ok=True)
+    return dir/f'{dataset_name}__{method_name}.pkl'
+
+
+def normalized_entropy(p):
+    """
+    Normalized Shannon entropy in [0, 1]
+    p: array-like, prevalence vector (sums to 1)
+    """
+    p = np.asarray(p)
+    H = entropy(p) # Shannon entropy
+    H_max = np.log(len(p))
+    return np.clip(H / H_max, 0, 1)
+
+
+def antagonistic_prevalence(p, strength=1):
+    ilr = ILRtransformation()
+    z = ilr(p)
+    z_ant = - strength * z
+    p_ant = ilr.inverse(z_ant)
+    return p_ant
+
+
+"""
+class KDEyScaledB(KDEyML):
+    def __init__(self, classifier: BaseEstimator=None, fit_classifier=True, val_split=5, bandwidth=1., random_state=None):
+        super().__init__(
+            classifier=classifier, fit_classifier=fit_classifier, val_split=val_split, bandwidth=bandwidth,
+            random_state=random_state, kernel='gaussian'
+        )
+    
+    def aggregation_fit(self, classif_predictions, labels):
+        if not hasattr(self, '_changed'):
+            def scale_bandwidth(n_classes, beta=0.5):            
+                return self.bandwidth * np.power(n_classes, beta)
+            n_classes = len(set(y))
+            scaled = scale_bandwidth(n_classes)
+            print(f'bandwidth scaling: {self.bandwidth:.4f} => {scaled:.4f}')
+            self.bandwidth = scaled     
+            self._changed = True   
+        return super().aggregation_fit(classif_predictions, labels)
+"""
+class KDEyScaledB(KDEyML):
+    def __init__(self, classifier: BaseEstimator=None, fit_classifier=True, val_split=5, bandwidth=1., random_state=None):
+        super().__init__(
+            classifier=classifier, fit_classifier=fit_classifier, val_split=val_split, bandwidth=bandwidth,
+            random_state=random_state, kernel='gaussian'
+        )
+
+class KDEyFresh(KDEyML):
+    def __init__(self, classifier: BaseEstimator=None, fit_classifier=True, val_split=5, bandwidth=1., random_state=None):
+        super().__init__(
+            classifier=classifier, fit_classifier=fit_classifier, val_split=val_split, bandwidth=bandwidth,
+            random_state=random_state, kernel='gaussian'
+        )
+
+class KDEyReduce(KDEyML):
+    def __init__(self, classifier: BaseEstimator=None, fit_classifier=True, val_split=5, bandwidth=1., n_components=10, random_state=None):
+        super().__init__(
+            classifier=classifier, fit_classifier=fit_classifier, val_split=val_split, bandwidth=bandwidth,
+            random_state=random_state, kernel='gaussian'
+        )
+        self.n_components=n_components
+
+    def aggregation_fit(self, classif_predictions, labels):
+        self.pca = PCA(n_components=self.n_components)
+        classif_predictions = self.pca.fit_transform(classif_predictions)
+        return super().aggregation_fit(classif_predictions, labels)
+
+    def aggregate(self, posteriors: np.ndarray):
+        posteriors = self.pca.transform(posteriors)
+        return super().aggregate(posteriors)
+    
+
+class KDEyCLR(KDEyML):
+    def __init__(self, classifier: BaseEstimator=None, fit_classifier=True, val_split=5, bandwidth=1., random_state=None):
+        super().__init__(
+            classifier=classifier, fit_classifier=fit_classifier, val_split=val_split, bandwidth=bandwidth,
+            random_state=random_state, kernel='aitchison'
+        )
+
+class KDEyCLR2(KDEyML):
+    def __init__(self, classifier: BaseEstimator=None, fit_classifier=True, val_split=5, bandwidth=1., random_state=None):
+        super().__init__(
+            classifier=classifier, fit_classifier=fit_classifier, val_split=val_split, bandwidth=bandwidth,
+            random_state=random_state, kernel='aitchison'
+        )
+
+
+class KDEyILR(KDEyML):
+    def __init__(self, classifier: BaseEstimator=None, fit_classifier=True, val_split=5, bandwidth=1., random_state=None):
+        super().__init__(
+            classifier=classifier, fit_classifier=fit_classifier, val_split=val_split, bandwidth=bandwidth,
+            random_state=random_state, kernel='ilr'
+        )
+
+
+class ILRtransformation(F.CompositionalTransformation):
+    def __init__(self, jax_mode=False):
+        self.jax_mode = jax_mode
+
+    def array(self, X):
+        if self.jax_mode:
+            return jnp.array(X)
+        else:
+            return np.asarray(X)
+
+    def __call__(self, X):
+        X = self.array(X)
+        X = qp.error.smooth(X, self.EPSILON)
+        k = X.shape[-1]
+        V = self.array(self.get_V(k))
+        logp = jnp.log(X) if self.jax_mode else np.log(X)
+        return logp @ V.T
+
+    def inverse(self, Z):
+        Z = self.array(Z)
+        k_minus_1 = Z.shape[-1]
+        k = k_minus_1 + 1
+        V = self.array(self.get_V(k))
+        logp = Z @ V
+        p = jnp.exp(logp) if self.jax_mode else np.exp(logp)
+        p = p / jnp.sum(p, axis=-1, keepdims=True) if self.jax_mode else p / np.sum(p, axis=-1, keepdims=True)
+        return p
+
+    @lru_cache(maxsize=None)
+    def get_V(self, k):
+        def helmert_matrix(k):
+            """
+            Returns the (k x k) Helmert matrix.
+            """
+            H = np.zeros((k, k))
+            for i in range(1, k):
+                H[i, :i] = 1
+                H[i, i] = -(i)
+                H[i] = H[i] / np.sqrt(i * (i + 1))
+            # row 0 stays zeros; will be discarded
+            return H
+
+        def ilr_basis(k):
+            """
+            Constructs an orthonormal ILR basis using the Helmert submatrix.
+            Output shape: (k-1, k)
+            """
+            H = helmert_matrix(k)
+            V = H[1:, :]  # remove first row of zeros
+            return V
+
+        return ilr_basis(k)
+
+
+def in_simplex(x, atol=1e-8):
+    x = np.asarray(x)
+
+    non_negative = np.all(x >= 0, axis=-1)
+    sum_to_one = np.isclose(x.sum(axis=-1), 1.0, atol=atol)
+
+    return non_negative & sum_to_one
+
+
+class MockClassifierFromPosteriors(BaseEstimator):
+    def __init__(self):
+        pass
+
+    def fit(self, X, y):
+        self.classes_ = sorted(np.unique(y))
+        return self
+
+    def predict(self, X):
+        return np.argmax(X, axis=1)
+
+    def predict_proba(self, X):
+        return X
--- a/BayesianKDEy/data/mnist_basiccnn/info_mnist_basiccnn.txt
+++ b/BayesianKDEy/data/mnist_basiccnn/info_mnist_basiccnn.txt
@ -0,0 +1,39 @@
+Dataset: mnist
+Model: basiccnn
+Number of classes: 10
+Train samples: 42000
+Validation samples: 18000
+Test samples: 10000
+Validation accuracy: 0.9895
+Test accuracy: 0.9901
+Github repository: ....
+
+Data Loading Instructions:
+The extracted features, predictions, targets, and logits are saved in .npz files.
+To load the data in Python:
+
+import numpy as np
+
+# Load training data
+train_data = np.load('mnist_basiccnn_train_out.npz')
+train_features = train_data['features']
+train_predictions = train_data['predictions']
+train_targets = train_data['targets']
+train_logits = train_data['logits']
+
+# Load validation data
+val_data = np.load('mnist_basiccnn_val_out.npz')
+val_features = val_data['features']
+val_predictions = val_data['predictions']
+val_targets = val_data['targets']
+val_logits = val_data['logits']
+
+# Load test data
+test_data = np.load('mnist_basiccnn_test_out.npz')
+test_features = test_data['features']
+test_predictions = test_data['predictions']
+test_targets = test_data['targets']
+test_logits = test_data['logits']
+
+Note: features are the output of the feature extractor (before the final classification layer),
+predictions are softmax probabilities, targets are the true labels, and logits are the raw model outputs.
--- a/BayesianKDEy/data/mnist_basiccnn/mnist_basiccnn_test_out.npz
+++ b/BayesianKDEy/data/mnist_basiccnn/mnist_basiccnn_test_out.npz
--- a/BayesianKDEy/data/mnist_basiccnn/mnist_basiccnn_train_out.npz
+++ b/BayesianKDEy/data/mnist_basiccnn/mnist_basiccnn_train_out.npz
--- a/BayesianKDEy/data/mnist_basiccnn/mnist_basiccnn_val_out.npz
+++ b/BayesianKDEy/data/mnist_basiccnn/mnist_basiccnn_val_out.npz
--- a/BayesianKDEy/datasets.py
+++ b/BayesianKDEy/datasets.py
@ -0,0 +1,345 @@
+import inspect
+from abc import ABC, abstractmethod
+from functools import lru_cache
+
+import numpy as np
+from pathlib import Path
+import quapy as qp
+
+from commons import MockClassifierFromPosteriors
+from quapy.protocol import UPP
+from quapy.data import LabelledCollection
+from quapy.method.aggregative import EMQ
+
+
+
+def fetchVisual(modality, dataset, net, data_home='./data'):    
+    MODALITY = ('features', 'predictions', 'logits')
+    assert modality in MODALITY, f'unknown modality, valid ones are {MODALITY}'
+
+    file_prefix = f'{dataset}_{net}'
+    data_home = Path(data_home) / file_prefix    
+
+    # Load training data
+    train_data = np.load(data_home/f'{file_prefix}_train_out.npz')
+    train_X = train_data[modality]
+    train_y = train_data['targets']
+
+    # Load validation data
+    val_data = np.load(data_home/f'{file_prefix}_val_out.npz')
+    val_X = val_data[modality]
+    val_y = val_data['targets']
+
+    # Load test data
+    test_data = np.load(data_home/f'{file_prefix}_test_out.npz')
+    test_X = test_data[modality]
+    test_y = test_data['targets']
+
+    train = LabelledCollection(train_X, train_y)
+    val = LabelledCollection(val_X, val_y, classes=train.classes_)
+    test = LabelledCollection(test_X, test_y, classes=train.classes_)
+
+    def show_prev_stats(data:LabelledCollection):
+        p = data.prevalence()
+        return f'prevs in [{min(p)*100:.3f}%, {max(p)*100:.3f}%]'
+    print(f'loaded {dataset} ({modality=}): '
+          f'#train={len(train)}({show_prev_stats(train)}), '
+          f'#val={len(val)}({show_prev_stats(val)}), '
+          f'#test={len(test)}({show_prev_stats(test)}), '
+          f'#features={train_X.shape[1]}, '
+          f'#classes={len(set(train_y))}')
+
+    return train, val, test
+
+
+def fetchMNIST(modality, data_home='./data'):
+    return fetchVisual(modality, dataset='mnist', net='basiccnn', data_home=data_home)
+
+def fetchCIFAR100coarse(modality, data_home='./data'):
+    return fetchVisual(modality, dataset='cifar100coarse', net='resnet18', data_home=data_home)
+
+def fetchCIFAR100(modality, data_home='./data'):
+    return fetchVisual(modality, dataset='cifar100', net='resnet18', data_home=data_home)
+
+def fetchCIFAR10(modality, data_home='./data'):
+    return fetchVisual(modality, dataset='cifar10', net='resnet18', data_home=data_home)
+
+def fetchFashionMNIST(modality, data_home='./data'):
+    return fetchVisual(modality, dataset='fashionmnist', net='basiccnn', data_home=data_home)
+                       
+def fetchSVHN(modality, data_home='./data'):
+    return fetchVisual(modality, dataset='svhn', net='resnet18', data_home=data_home)
+
+
+
+class DatasetHandler(ABC):
+
+    def __init__(self, name):
+        self.name = name
+
+    @classmethod
+    def get_defaults(cls):
+        sig = inspect.signature(cls.__init__)
+
+        defaults = {
+            name: param.default
+            for name, param in sig.parameters.items()
+            if param.default is not inspect.Parameter.empty
+        }
+
+        return defaults
+
+    @abstractmethod
+    def get_training(self): ...
+
+    @abstractmethod
+    def get_train_testprot_for_eval(self): ...
+
+    @abstractmethod
+    def get_train_valprot_for_modsel(self): ...
+
+    @classmethod
+    @abstractmethod
+    def get_datasets(cls): ...
+
+    @classmethod
+    def iter(cls, **kwargs):
+        for name in cls.get_datasets():
+            yield cls(name, **kwargs)
+
+    def __repr__(self):
+        return f'{self.__class__.__name__}({self.name})'
+
+    @classmethod
+    @abstractmethod
+    def is_binary(self):  ...
+
+
+class VisualDataHandler(DatasetHandler):
+
+    def __init__(self, name, n_val_samples=100, n_test_samples=100, sample_size=500, random_state=0):
+        # mode features : feature-based, the idea is to learn a LogisticRegression on top
+        # mode predictions : posterior probabilities
+        # assert modality in ['features', 'predictions'], f'unknown {modality=}'
+        super().__init__(name=name)
+        modality = 'predictions'
+        if name.endswith('-f'):
+            modality = 'features'
+        elif name.endswith('-l'):
+            modality = 'logits'
+        self.modality = modality
+        self.n_val_samples = n_val_samples
+        self.n_test_samples = n_test_samples
+        self.sample_size = sample_size
+        self.random_state = random_state
+
+    @lru_cache(maxsize=None)
+    def dataset(self):
+        name = self.name.lower()
+        name = name.replace('-f', '')
+        name = name.replace('-l', '')
+
+        if name=='mnist':
+            data = fetchMNIST(modality=self.modality)
+        elif name=='cifar100coarse':
+            data = fetchCIFAR100coarse(modality=self.modality)
+        elif name=='cifar100':
+            data = fetchCIFAR100(modality=self.modality)
+        elif name=='cifar10':
+            data = fetchCIFAR10(modality=self.modality)
+        elif name=='fashionmnist':
+            data = fetchFashionMNIST(modality=self.modality)
+        elif name=='svhn':
+            data = fetchSVHN(modality=self.modality)
+        else:
+            raise ValueError(f'unknown dataset {name}')
+
+        # the training set was used to extract features;
+        # we use the validation portion as a training set for quantifiers
+        net_train, val, test = data
+        train, val = val.split_stratified(train_prop=0.6, random_state=self.random_state)
+        return train, val, test
+
+    def get_training(self):
+        train, val, test = self.dataset()
+        return train
+
+    def get_validation(self):
+        train, val, test = self.dataset()
+        return val
+
+    def get_train_testprot_for_eval(self):
+        train, val, test = self.dataset()
+        test_prot = UPP(test, sample_size=self.sample_size, repeats=self.n_test_samples, random_state=self.random_state)
+        return train+val, test_prot
+
+    def get_train_valprot_for_modsel(self):
+        train, val, test = self.dataset()
+        val_prot = UPP(val, sample_size=self.sample_size, repeats=self.n_val_samples, random_state=self.random_state)
+        return train, val_prot
+
+    @classmethod
+    def get_datasets(cls):
+        datasets = ['cifar100coarse', 'cifar10', 'mnist', 'fashionmnist', 'svhn'] #+ ['cifar100']
+        # datasets_feat = [f'{d}-f' for d in datasets]
+        datasets_feat = [f'{d}-l' for d in datasets]
+        return datasets_feat # + datasets
+        
+
+    @classmethod
+    def iter(cls, **kwargs):
+        for name in cls.get_datasets():
+            yield cls(name, **kwargs)
+
+    def __repr__(self):
+        return f'{self.name}'
+
+    @classmethod
+    def is_binary(self):
+        return False
+
+
+class CIFAR100Handler(VisualDataHandler):
+    def __init__(self, name, n_val_samples=100, n_test_samples=100, sample_size=2000, random_state=0):
+        super().__init__(name=name, n_val_samples=n_val_samples, n_test_samples=n_test_samples, sample_size=sample_size, random_state=random_state)
+
+    @classmethod
+    def get_datasets(cls):
+        datasets = ['cifar100']
+        # datasets_feat = [f'{d}-f' for d in datasets]
+        datasets_feat = [f'{d}-l' for d in datasets]
+        return datasets_feat # + datasets
+
+
+# LeQua multiclass tasks
+class LeQuaHandler(DatasetHandler):
+
+    DATASETS = ['LeQua2022', 'LeQua2024']
+
+    def __init__(self, name):
+        super().__init__(name)
+        self.sample_size = 1000
+
+    def get_training(self):
+        return self.dataset()[0]
+
+    def get_train_testprot_for_eval(self):
+        training, _, test_generator = self.dataset()
+        return training, test_generator
+
+    def get_train_valprot_for_modsel(self):
+        training, val_generator, _ = self.dataset()
+        return training, val_generator
+
+    @lru_cache(maxsize=None)
+    def dataset(self):
+        if self.name=='LeQua2022':
+            return qp.datasets.fetch_lequa2022(task='T1B')
+        elif self.name=='LeQua2024':
+            return qp.datasets.fetch_lequa2024(task='T2')
+        else:
+            raise ValueError(f'unexpected dataset name {self.name}; valid ones are {self.DATASETS}')
+
+    def __repr__(self):
+        return self.name
+
+    @classmethod
+    def iter(cls):
+        for name in cls.DATASETS:
+            yield cls(name)
+
+    @classmethod
+    def is_binary(self):
+        return False
+
+    @classmethod
+    def get_datasets(cls):
+        return cls.DATASETS
+
+
+class UCIDatasetHandler(DatasetHandler, ABC):
+    DATASETS = []
+
+    def __init__(self, name, n_val_samples, n_test_samples, sample_size, random_state):
+        super().__init__(name=name)
+        self.sample_size = sample_size
+        self.n_val_samples = n_val_samples
+        self.n_test_samples = n_test_samples
+        self.random_state = random_state
+
+    def get_training(self):
+        return self.dataset().training
+
+    def get_train_testprot_for_eval(self):
+        training, test = self.dataset().train_test
+        test_generator = qp.protocol.UPP(test, sample_size=self.sample_size, repeats=self.n_test_samples,
+                                         random_state=self.random_state)
+        return training, test_generator
+
+    def get_train_valprot_for_modsel(self):
+        training = self.dataset().training
+        training, val = training.split_stratified(train_prop=0.6, random_state=self.random_state)
+        val_generator = qp.protocol.UPP(val, sample_size=self.sample_size, repeats=self.n_val_samples,
+                                        random_state=self.random_state)
+        return training, val_generator
+
+    @classmethod
+    def get_datasets(cls):
+        return cls.DATASETS
+
+    @classmethod
+    def iter(cls):
+        for name in cls.DATASETS:
+            yield cls(name)
+
+
+class UCIMulticlassHandler(UCIDatasetHandler):
+
+    DATASETS = sorted([d for d in qp.datasets.UCI_MULTICLASS_DATASETS if d not in frozenset(['hcv', 'poker_hand'])])
+
+    def __init__(self, name, n_val_samples=100, n_test_samples=100, sample_size=1000, random_state=0):
+        super().__init__(name, n_val_samples, n_test_samples, sample_size, random_state)
+
+    @lru_cache(maxsize=None)
+    def dataset(self):
+        return qp.datasets.fetch_UCIMulticlassDataset(self.name, min_class_support=0.01)
+
+    def __repr__(self):
+        return f'{self.name}(UCI-multiclass)'
+
+    @classmethod
+    def is_binary(self):
+        return False
+
+
+class UCIBinaryHandler(DatasetHandler):
+
+    DATASETS = qp.datasets.UCI_BINARY_DATASETS.copy()
+
+    def __init__(self, name, n_val_samples=100, n_test_samples=100, sample_size=500, random_state=0):
+        super().__init__(name, n_val_samples, n_test_samples, sample_size, random_state)
+
+    @lru_cache(maxsize=None)
+    def dataset(self):
+        return qp.datasets.fetch_UCIBinaryDataset(self.name)
+
+    def __repr__(self):
+        return f'{self.name}(UCI-binary)'
+
+
+    @classmethod
+    def is_binary(self):
+        return True
+
+
+if __name__ == '__main__':
+    train, val, test = fetchMNIST(modality='predictions')
+    cls = MockClassifierFromPosteriors()
+    cls.fit(*train.Xy)
+    # q = KDEyML(classifier=cls, fit_classifier=False)
+    # q = PACC(classifier=cls, fit_classifier=False)
+    q = EMQ(classifier=cls, fit_classifier=False)
+    q.fit(*val.Xy)
+    test_prot = UPP(test, repeats=100, sample_size=500)
+    report = qp.evaluation.evaluation_report(q, test_prot, verbose=True)
+    print(report.mean(numeric_only=True))
--- a/BayesianKDEy/error_expected.py
+++ b/BayesianKDEy/error_expected.py
@ -0,0 +1,9 @@
+import quapy as qp
+
+from quapy.error import mae
+import quapy.functional as F
+
+for n in range(2,100,5):
+    a = F.uniform_prevalence_sampling(n_classes=n, size=10_000)
+    b = F.uniform_prevalence_sampling(n_classes=n, size=10_000)
+    print(f'{n=} ae={mae(a, b):.5f}')
--- a/BayesianKDEy/full_experiments.py
+++ b/BayesianKDEy/full_experiments.py
@ -1,50 +1,31 @@
-import os
-import warnings
-from os.path import join
 from pathlib import Path

-from sklearn.calibration import CalibratedClassifierCV
-from sklearn.linear_model import LogisticRegression as LR
-from sklearn.model_selection import GridSearchCV, StratifiedKFold
+from sklearn.linear_model import LogisticRegression
 from copy import deepcopy as cp
 import quapy as qp
-from BayesianKDEy._bayeisan_kdey import BayesianKDEy
+from BayesianKDEy.commons import KDEyReduce
+from BayesianKDEy.methods import get_experimental_methods, MethodDescriptor
+from _bayeisan_kdey import BayesianKDEy
+from _bayesian_mapls import BayesianMAPLS
+from commons import experiment_path, KDEyCLR, RESULT_DIR, MockClassifierFromPosteriors, KDEyScaledB, KDEyFresh
+# import datasets
+from datasets import LeQuaHandler, UCIMulticlassHandler, DatasetHandler, VisualDataHandler, CIFAR100Handler
+from temperature_calibration import temp_calibration
 from build.lib.quapy.data import LabelledCollection
-from quapy.method.aggregative import DistributionMatchingY as DMy, AggregativeQuantifier, EMQ
-from quapy.method.base import BinaryQuantifier, BaseQuantifier
+from quapy.method.aggregative import DistributionMatchingY as DMy, AggregativeQuantifier, EMQ, CC
 from quapy.model_selection import GridSearchQ
 from quapy.data import Dataset
-# from BayesianKDEy.plot_simplex import plot_prev_points, plot_prev_points_matplot
-from quapy.method.confidence import ConfidenceIntervals, BayesianCC, PQ, WithConfidenceABC, AggregativeBootstrap
-from quapy.functional import strprev
+from quapy.method.confidence import BayesianCC, AggregativeBootstrap
 from quapy.method.aggregative import KDEyML, ACC
 from quapy.protocol import UPP
-import quapy.functional as F
 import numpy as np
 from tqdm import tqdm
-from scipy.stats import dirichlet
 from collections import defaultdict
 from time import time
-from sklearn.base import clone, BaseEstimator


-class KDEyCLR(KDEyML):
-    def __init__(self, classifier: BaseEstimator=None, fit_classifier=True, val_split=5, bandwidth=1., random_state=None):
-        super().__init__(
-            classifier=classifier, fit_classifier=fit_classifier, val_split=val_split, bandwidth=bandwidth,
-            random_state=random_state, kernel='aitchison'
-        )

-
-class KDEyILR(KDEyML):
-    def __init__(self, classifier: BaseEstimator=None, fit_classifier=True, val_split=5, bandwidth=1., random_state=None):
-        super().__init__(
-            classifier=classifier, fit_classifier=fit_classifier, val_split=val_split, bandwidth=bandwidth,
-            random_state=random_state, kernel='ilr'
-        )
-
-
-def methods():
+def methods___depr():
    """
    Returns a tuple (name, quantifier, hyperparams, bayesian/bootstrap_constructor), where:
    - name: is a str representing the name of the method (e.g., 'BayesianKDEy')
@ -53,43 +34,86 @@ def methods():
    - bayesian/bootstrap_constructor: is a function that instantiates the bayesian o bootstrap method with the
        quantifier with optimized hyperparameters
    """
-    acc_hyper = {}
-    emq_hyper = {'calib': ['nbvs', 'bcts', 'ts', 'vs']}
-    hdy_hyper = {'nbins': [3,4,5,8,16,32]}
-    kdey_hyper = {'bandwidth': [0.001, 0.005, 0.01, 0.05, 0.1, 0.2]}
-    kdey_hyper_clr = {'bandwidth': [0.05, 0.1, 0.5, 1., 2., 5.]}

+    Cls = LogisticRegression
+    cls_hyper = {'classifier__C': np.logspace(-4,4,9), 'classifier__class_weight': ['balanced', None]}
+    val_split = 5  # k-fold cross-validation
+    cc_hyper = cls_hyper
+    acc_hyper = cls_hyper
+    # emq_hyper = {'calib': ['nbvs', 'bcts', 'ts', 'vs'], **cls_hyper}
+    hdy_hyper = {'nbins': [3,4,5,8,16,32], **cls_hyper}
+    kdey_hyper = {'bandwidth': np.logspace(-3, -1, 10), **cls_hyper}
+    kdey_hyper_clr = {'bandwidth': np.logspace(-2, 2, 10), **cls_hyper}
+    band ={'bandwidth':np.logspace(-3,-1,10)}
    multiclass_method = 'multiclass'
    only_binary = 'only_binary'
    only_multiclass = 'only_multiclass'

-    # yield 'BootstrapACC', ACC(LR()), acc_hyper, lambda hyper: AggregativeBootstrap(ACC(LR()), n_test_samples=1000, random_state=0), multiclass_method
-    # yield 'BayesianACC', ACC(LR()), acc_hyper, lambda hyper: BayesianCC(LR(), mcmc_seed=0), multiclass_method
-
-    yield 'BootstrapEMQ', EMQ(LR(), on_calib_error='backup', val_split=5), emq_hyper, lambda hyper: AggregativeBootstrap(EMQ(LR(), on_calib_error='backup', calib=hyper['calib'], val_split=5), n_test_samples=1000, random_state=0), multiclass_method
-
-    # yield 'BootstrapHDy', DMy(LR()), hdy_hyper, lambda hyper: AggregativeBootstrap(DMy(LR(), **hyper), n_test_samples=1000, random_state=0), multiclass_method
-    # yield 'BayesianHDy', DMy(LR()), hdy_hyper, lambda hyper: PQ(LR(), stan_seed=0, **hyper), only_binary
-    #
-    # yield 'BootstrapKDEy', KDEyML(LR()), kdey_hyper, lambda hyper: AggregativeBootstrap(KDEyML(LR(), **hyper), n_test_samples=1000, random_state=0, verbose=True), multiclass_method
-    # yield 'BayesianKDEy', KDEyML(LR()), kdey_hyper, lambda hyper: BayesianKDEy(mcmc_seed=0, **hyper), multiclass_method
-    # yield 'BayesianKDEy*', KDEyCLR(LR()), kdey_hyper_clr, lambda hyper: BayesianKDEy(kernel='aitchison', mcmc_seed=0, **hyper), multiclass_method
-    # yield 'BayKDEy*CLR', KDEyCLR(LR()), kdey_hyper_clr, lambda hyper: BayesianKDEy(kernel='aitchison', mcmc_seed=0, explore='clr', step_size=.15, **hyper), multiclass_method
-    # yield 'BayKDEy*CLR2', KDEyCLR(LR()), kdey_hyper_clr, lambda hyper: BayesianKDEy(kernel='aitchison', mcmc_seed=0, explore='clr', step_size=.05, **hyper), multiclass_method
-    # yield 'BayKDEy*ILR', KDEyCLR(LR()), kdey_hyper_clr, lambda hyper: BayesianKDEy(kernel='aitchison', mcmc_seed=0, explore='ilr', step_size=.15, **hyper), only_multiclass
-    # yield 'BayKDEy*ILR2', KDEyILR(LR()), kdey_hyper_clr, lambda hyper: BayesianKDEy(kernel='ilr', mcmc_seed=0, explore='ilr', step_size=.1, **hyper), only_multiclass
+    # surrogate quantifiers
+    cc = CC(Cls())
+    acc = ACC(Cls(), val_split=val_split)
+    hdy = DMy(Cls(), val_split=val_split)
+    kde_gau = KDEyML(Cls(), val_split=val_split)
+    kde_gau_scale = KDEyScaledB(Cls(), val_split=val_split)
+    kde_gau_pca = KDEyReduce(Cls(), val_split=val_split, n_components=5)
+    kde_gau_pca10 = KDEyReduce(Cls(), val_split=val_split, n_components=10)
+    kde_ait = KDEyCLR(Cls(), val_split=val_split)
+    emq = EMQ(Cls(), exact_train_prev=False, val_split=val_split)


-def model_selection(train: LabelledCollection, point_quantifier: AggregativeQuantifier, grid: dict):
+    # Bootstrap approaches:
+    # --------------------------------------------------------------------------------------------------------
+    # yield 'BootstrapCC', cc, cc_hyper, lambda hyper: AggregativeBootstrap(CC(Cls()), n_test_samples=1000, random_state=0), multiclass_method
+    #yield 'BootstrapACC', acc, acc_hyper, lambda hyper: _AggregativeBootstrap(ACC(Cls()), n_test_samples=1000, random_state=0), multiclass_method
+    #yield 'BootstrapEMQ', emq, on_calib_error='backup', val_split=5), emq_hyper, lambda hyper: _AggregativeBootstrap(EMQ(Cls(), on_calib_error='backup', calib=hyper['calib'], val_split=5), n_test_samples=1000, random_state=0), multiclass_method
+    #yield 'BootstrapHDy', hdy, hdy_hyper, lambda hyper: _AggregativeBootstrap(DMy(Cls(), **hyper), n_test_samples=1000, random_state=0), multiclass_method
+    #yield 'BootstrapKDEy', kde_gau, kdey_hyper, lambda hyper: _AggregativeBootstrap(KDEyML(Cls(), **hyper), n_test_samples=1000, random_state=0, verbose=True), multiclass_method
+
+    # Bayesian approaches: (*=temp calibration auto threshold and coverage sim to nominal; +=temp calibration w/o amplitude coverage, for winkler criterion, !=same but alpha=0.005 for winkler)
+    # --------------------------------------------------------------------------------------------------------
+    # yield 'BayesianACC', acc, acc_hyper, lambda hyper: BayesianCC(Cls(), val_split=val_split, mcmc_seed=0), multiclass_method
+    # yield 'BayesianACC*', acc, acc_hyper, lambda hyper: BayesianCC(Cls(), val_split=val_split, temperature=None, mcmc_seed=0), multiclass_method
+    # yield 'BayesianACC+', acc, acc_hyper, lambda hyper: BayesianCC(Cls(), val_split=val_split, temperature=None, mcmc_seed=0), multiclass_method
+    # yield 'BayesianACC!', acc, acc_hyper, lambda hyper: BayesianCC(Cls(), val_split=val_split, temperature=None, mcmc_seed=0), multiclass_method
+    #yield 'BayesianHDy', hdy, hdy_hyper, lambda hyper: PQ(Cls(), val_split=val_split, stan_seed=0, **hyper), only_binary
+    # yield f'BaKDE-Ait-numpyro', kde_ait, kdey_hyper_clr, lambda hyper: BayesianKDEy(Cls(), kernel='aitchison', mcmc_seed=0, engine='numpyro', val_split=val_split,  **hyper), multiclass_method
+    #yield f'BaKDE-Gau-numpyro', kde_gau, kdey_hyper, lambda hyper: BayesianKDEy(Cls(), kernel='gaussian', mcmc_seed=0, engine='numpyro', val_split=val_split,  **hyper), multiclass_method
+    #yield f'BaKDE-Gau-scale', kde_gau_scale, kdey_hyper, lambda hyper: BayesianKDEy(Cls(), kernel='gaussian', mcmc_seed=0, engine='numpyro', val_split=val_split,  **hyper), multiclass_method
+    #yield f'BaKDE-Gau-pca5', kde_gau_pca, band, lambda hyper: BayesianKDEy(Cls(), reduce=5, kernel='gaussian', mcmc_seed=0, engine='numpyro', val_split=val_split,  **hyper), multiclass_method
+    #yield f'BaKDE-Gau-pca5*', kde_gau_pca, band, lambda hyper: BayesianKDEy(Cls(), reduce=5, temperature=None, kernel='gaussian', mcmc_seed=0, engine='numpyro', val_split=val_split,  **hyper), multiclass_method
+    #yield f'BaKDE-Gau-pca10', kde_gau_pca10, band, lambda hyper: BayesianKDEy(Cls(), reduce=10, kernel='gaussian', mcmc_seed=0, engine='numpyro', val_split=val_split,  **hyper), multiclass_method
+    #yield f'BaKDE-Gau-pca10*', kde_gau_pca10, band, lambda hyper: BayesianKDEy(Cls(), reduce=10, temperature=None, kernel='gaussian', mcmc_seed=0, engine='numpyro', val_split=val_split,  **hyper), multiclass_method
+    # yield f'BaKDE-Gau-H0', KDEyFresh(Cls(), bandwidth=0.4), cls_hyper, lambda hyper: BayesianKDEy(Cls(), bandwidth=0.4, kernel='gaussian', mcmc_seed=0, engine='numpyro', **hyper), multiclass_method
+    # yield f'BaKDE-Gau-H1', KDEyFresh(Cls(), bandwidth=1.), cls_hyper, lambda hyper: BayesianKDEy(Cls(), bandwidth=1., kernel='gaussian', mcmc_seed=0, engine='numpyro', **hyper), multiclass_method
+    # yield f'BaKDE-Gau-H2', KDEyFresh(Cls(), bandwidth=1.5), cls_hyper, lambda hyper: BayesianKDEy(Cls(), bandwidth=1.5,
+    #                                                                                          kernel='gaussian',
+    #                                                                                          mcmc_seed=0,
+    #                                                                                          engine='numpyro',
+    #                                                                                          **hyper), multiclass_method
+    # yield f'BaKDE-Ait-T*', kde_ait, kdey_hyper_clr, lambda hyper: BayesianKDEy(Cls(),kernel='aitchison', mcmc_seed=0, engine='numpyro', temperature=None, val_split=val_split, **hyper), multiclass_method
+    # yield f'BaKDE-Ait-T!', kde_ait, kdey_hyper_clr, lambda hyper: BayesianKDEy(Cls(),kernel='aitchison', mcmc_seed=0, engine='numpyro', temperature=None, val_split=val_split, **hyper), multiclass_method
+    #yield f'BaKDE-Gau-T*', kde_gau, kdey_hyper, lambda hyper: BayesianKDEy(Cls(), kernel='gaussian', mcmc_seed=0, engine='numpyro', temperature=None, val_split=val_split, **hyper), multiclass_method
+    #yield f'BaKDE-Gau-T!', kde_gau, kdey_hyper, lambda hyper: BayesianKDEy(Cls(), kernel='gaussian', mcmc_seed=0, engine='numpyro', temperature=None, val_split=val_split, **hyper), multiclass_method
+    # yield 'BayEMQ', emq, acc_hyper, lambda hyper: BayesianMAPLS(Cls(), prior='uniform', temperature=1, exact_train_prev=False, val_split=val_split), multiclass_method
+    # yield 'BayEMQ*', emq, acc_hyper, lambda hyper: BayesianMAPLS(Cls(), prior='uniform', temperature=None, exact_train_prev=False, val_split=val_split), multiclass_method
+    # yield 'BayEMQ!', emq, acc_hyper, lambda hyper: BayesianMAPLS(Cls(), prior='uniform', temperature=None, exact_train_prev=False, val_split=val_split), multiclass_method
+    # yield 'BaEMQ', emq, acc_hyper, lambda hyper: BayesianMAPLS(Cls(**{k.replace('classifier__', ''): v for k, v in hyper.items()}), prior='uniform', temperature=1, exact_train_prev=False, val_split=val_split), multiclass_method
+
+    # yield 'BaACC!', acc, acc_hyper, lambda hyper: BayesianCC(Cls(**{k.replace('classifier__', ''): v for k, v in hyper.items()}), temperature=None, mcmc_seed=0), multiclass_method
+    # yield 'BaEMQ!', emq, acc_hyper, lambda hyper: BayesianMAPLS(Cls(**{k.replace('classifier__', ''): v for k, v in hyper.items()}), prior='uniform', temperature=None, exact_train_prev=False), multiclass_method
+
+
+def model_selection(dataset: DatasetHandler, point_quantifier: AggregativeQuantifier, grid: dict):
    with qp.util.temp_seed(0):
+        point_quantifier = cp(point_quantifier)
        print(f'performing model selection for {point_quantifier.__class__.__name__} with grid {grid}')
        # model selection
        if len(grid)>0:
-            train, val  = train.split_stratified(train_prop=0.6, random_state=0)
+            train, val_prot = dataset.get_train_valprot_for_modsel()
            mod_sel = GridSearchQ(
                model=point_quantifier,
                param_grid=grid,
-                protocol=qp.protocol.UPP(val, repeats=250, random_state=0),
+                protocol=val_prot,
                refit=False,
                n_jobs=-1,
                verbose=True
@ -101,93 +125,103 @@ def model_selection(train: LabelledCollection, point_quantifier: AggregativeQuan
        return best_params


-def experiment(dataset: Dataset, point_quantifier: AggregativeQuantifier, method_name:str, grid: dict, withconf_constructor, hyper_choice_path: Path):
-    with qp.util.temp_seed(0):
+def temperature_calibration(dataset: DatasetHandler, uncertainty_quantifier):
+    temperature = None
+    if hasattr(uncertainty_quantifier, 'temperature'): 
+        if uncertainty_quantifier.temperature is None:
+            print('calibrating temperature')
+            train, val_prot = dataset.get_train_valprot_for_modsel()
+            temp_grid=[1., .5, 1.5, 2., 5., 10., 100., 1000.]
+            temperature = temp_calibration(uncertainty_quantifier, train, val_prot, temp_grid=temp_grid, n_jobs=-1, amplitude_threshold=1., criterion='winkler')
+            uncertainty_quantifier.temperature = temperature
+        else:
+            temperature = uncertainty_quantifier.temperature
+    return temperature
+    

-        training, test = dataset.train_test
+def experiment(dataset: DatasetHandler, method: MethodDescriptor, hyper_choice_path: Path):
+
+    with qp.util.temp_seed(0):

        # model selection
        best_hyperparams = qp.util.pickled_resource(
-            hyper_choice_path, model_selection, training, cp(point_quantifier), grid
+            hyper_choice_path, model_selection, dataset, method.surrogate_quantifier(), method.hyper_parameters
        )
+        print(f'{best_hyperparams=}')

        t_init = time()
-        withconf_quantifier = withconf_constructor(best_hyperparams).fit(*training.Xy)
+        uncertainty_quantifier = method.uncertainty_aware_quantifier(best_hyperparams)
+        temperature = temperature_calibration(dataset, uncertainty_quantifier)
+        training, test_generator = dataset.get_train_testprot_for_eval()
+        uncertainty_quantifier.fit(*training.Xy)
        tr_time = time() - t_init

        # test
        train_prevalence = training.prevalence()
        results = defaultdict(list)
-        test_generator = UPP(test, repeats=100, random_state=0)
-        for i, (sample_X, true_prevalence) in tqdm(enumerate(test_generator()), total=test_generator.total(), desc=f'{method_name} predictions'):
+        pbar = tqdm(enumerate(test_generator()), total=test_generator.total())
+        for i, (sample_X, true_prevalence) in pbar:
            t_init = time()
-            point_estimate, region = withconf_quantifier.predict_conf(sample_X)
+            point_estimate, region = uncertainty_quantifier.predict_conf(sample_X)
            ttime = time()-t_init
+
            results['true-prevs'].append(true_prevalence)
            results['point-estim'].append(point_estimate)
            results['shift'].append(qp.error.ae(true_prevalence, train_prevalence))
            results['ae'].append(qp.error.ae(prevs_true=true_prevalence, prevs_hat=point_estimate))
            results['rae'].append(qp.error.rae(prevs_true=true_prevalence, prevs_hat=point_estimate))
+            results['sre'].append(qp.error.sre(prevs_true=true_prevalence, prevs_hat=point_estimate, prevs_train=train_prevalence))
            results['coverage'].append(region.coverage(true_prevalence))
            results['amplitude'].append(region.montecarlo_proportion(n_trials=50_000))
            results['test-time'].append(ttime)
            results['samples'].append(region.samples)

+            pbar.set_description(
+                f'{method.name} '
+                f'MAE={np.mean(results["ae"]):.5f} '
+                f'W={np.mean(results["sre"]):.5f} '
+                f'Cov={np.mean(results["coverage"]):.5f} '
+                f'AMP={np.mean(results["amplitude"]):.5f}'
+            )
+
        report = {
            'optim_hyper': best_hyperparams,
            'train_time': tr_time,
            'train-prev': train_prevalence,
-            'results': {k:np.asarray(v) for k,v in results.items()}
+            'results': {k:np.asarray(v) for k,v in results.items()},
+            'temperature': temperature
        }

        return report


-def experiment_path(dir:Path, dataset_name:str, method_name:str):
-    os.makedirs(dir, exist_ok=True)
-    return dir/f'{dataset_name}__{method_name}.pkl'
-
-
 if __name__ == '__main__':

-    binary = {
-        'datasets': qp.datasets.UCI_BINARY_DATASETS,
-        'fetch_fn': qp.datasets.fetch_UCIBinaryDataset,
-        'sample_size': 500
-    }
+    result_dir = RESULT_DIR

-    multiclass = {
-        'datasets': qp.datasets.UCI_MULTICLASS_DATASETS,
-        'fetch_fn': qp.datasets.fetch_UCIMulticlassDataset,
-        'sample_size': 1000
-    }
+    for data_handler in [LeQuaHandler]:#, UCIMulticlassHandler, LeQuaHandler, VisualDataHandler, CIFAR100Handler]:
+        for dataset in data_handler.iter():
+            qp.environ['SAMPLE_SIZE'] = dataset.sample_size
+            print(f'dataset={dataset.name}')

-    result_dir = Path('./results')
+            problem_type = 'binary' if dataset.is_binary() else 'multiclass'

-    for setup in [binary, multiclass]: # [binary, multiclass]:
-        qp.environ['SAMPLE_SIZE'] = setup['sample_size']
-        for data_name in setup['datasets']:
-            print(f'dataset={data_name}')
-            # if data_name=='breast-cancer' or data_name.startswith("cmc") or data_name.startswith("ctg"):
-            #     print(f'skipping dataset: {data_name}')
-            #     continue
-            data = setup['fetch_fn'](data_name)
-            is_binary = data.n_classes==2
-            result_subdir = result_dir / ('binary' if is_binary else 'multiclass')
-            hyper_subdir  = result_dir / 'hyperparams' / ('binary' if is_binary else 'multiclass')
-            for method_name, method, hyper_params, withconf_constructor, method_scope in methods():
-                if method_scope == 'only_binary' and not is_binary:
+            for method in get_experimental_methods():
+                # skip combination?
+                if method.binary_only() and not dataset.is_binary():
                    continue
-                if method_scope == 'only_multiclass' and is_binary:
-                    continue
-                result_path = experiment_path(result_subdir, data_name, method_name)
-                hyper_path  = experiment_path(hyper_subdir, data_name, method.__class__.__name__)
+
+                result_path = experiment_path(result_dir / problem_type, dataset.name, method.name)
+                hyper_path  = experiment_path(result_dir / 'hyperparams' / problem_type, dataset.name, method.surrogate_quantifier_name())
+
                report = qp.util.pickled_resource(
-                    result_path, experiment, data, method, method_name, hyper_params, withconf_constructor, hyper_path
+                    result_path, experiment, dataset, method, hyper_path
                )
-                print(f'dataset={data_name}, '
-                      f'method={method_name}: '
-                      f'mae={report["results"]["ae"].mean():.3f}, '
+
+                print(f'dataset={dataset.name}, '
+                      f'method={method.name}: '
+                      f'mae={report["results"]["ae"].mean():.5f}, '
+                      f'W={report["results"]["sre"].mean():.5f}, '
                      f'coverage={report["results"]["coverage"].mean():.5f}, '
                      f'amplitude={report["results"]["amplitude"].mean():.5f}, ')

--- a/BayesianKDEy/generate_results.py
+++ b/BayesianKDEy/generate_results.py
@ -1,15 +1,25 @@
 import pickle
 from collections import defaultdict
-
+import matplotlib.pyplot as plt
+import seaborn as sns
 from joblib import Parallel, delayed
 from tqdm import tqdm
 import pandas as pd
 from glob import glob
 from pathlib import Path
 import quapy as qp
-from error import dist_aitchison
-from quapy.method.confidence import ConfidenceIntervals
+from BayesianKDEy.commons import RESULT_DIR
+from BayesianKDEy.datasets import LeQuaHandler, UCIMulticlassHandler, VisualDataHandler, CIFAR100Handler
+from comparison_group import SelectGreaterThan, SelectByName, SelectSmallerThan
+from format import FormatModifierSelectColor
+from quapy.error import dist_aitchison
+from quapy.method.confidence import ConfidenceIntervals, ConfidenceIntervalsILR, ConfidenceIntervalsCLR
 from quapy.method.confidence import ConfidenceEllipseSimplex, ConfidenceEllipseCLR, ConfidenceEllipseILR, ConfidenceIntervals, ConfidenceRegionABC
+import quapy.functional as F
+from result_path.src.table import LatexTable
+import numpy as np
+import pandas as pd
+from itertools import chain

 pd.set_option('display.max_columns', None)
 pd.set_option('display.width', 2000)
@ -19,6 +29,19 @@ pd.set_option("display.precision", 4)
 pd.set_option("display.float_format", "{:.4f}".format)


+# methods = None  # show all methods
+methods = ['BoCC',
+           'BaACC!',
+           'BaEMQ!',
+           'BaKDE-Gau-T!',
+           'BaKDE-Ait-T!',
+           'BaKDE-Ait-T!2'
+           #'BootstrapACC',
+           #'BootstrapHDy',
+           #'BootstrapKDEy',
+           #'BootstrapEMQ'
+           ]
+
 def region_score(true_prev, region: ConfidenceRegionABC):
    amp = region.montecarlo_proportion(50_000)
    if true_prev in region:
@ -36,11 +59,15 @@ def compute_coverage_amplitude(region_constructor, **kwargs):

    def process_one(samples, true_prevs):
        region = region_constructor(samples, **kwargs)
-        if isinstance(region, ConfidenceIntervals):
+        if isinstance(region, ConfidenceIntervals) or isinstance(region, ConfidenceIntervalsCLR) or isinstance(region, ConfidenceIntervalsILR):
            winkler = region.mean_winkler_score(true_prevs)
+            # winkler_e = region.mean_winkler_score(true_prevs, add_ae=True)
+            cov_soft = region.coverage_soft(true_prevs)
        else:
            winkler = None
-        return region.coverage(true_prevs), region.montecarlo_proportion(), winkler
+            # winkler_e = None
+            cov_soft = None
+        return region.coverage(true_prevs), region.montecarlo_proportion(), winkler, cov_soft

    out = Parallel(n_jobs=3)(
        delayed(process_one)(samples, true_prevs)
@ -52,8 +79,8 @@ def compute_coverage_amplitude(region_constructor, **kwargs):
    )

    # unzip results
-    coverage, amplitude, winkler = zip(*out)
-    return list(coverage), list(amplitude), list(winkler)
+    coverage, amplitude, winkler, cov_soft = zip(*out)
+    return list(coverage), list(amplitude), list(winkler), list(cov_soft)


 def update_pickle(report, pickle_path, updated_dict:dict):
@ -64,107 +91,317 @@ def update_pickle(report, pickle_path, updated_dict:dict):

 def update_pickle_with_region(report, file, conf_name, conf_region_class, **kwargs):
    if f'coverage-{conf_name}' not in report:
-        covs, amps, winkler = compute_coverage_amplitude(conf_region_class, **kwargs)
+        covs, amps, winkler, cov_soft = compute_coverage_amplitude(conf_region_class, **kwargs)

        update_fields = {
            f'coverage-{conf_name}': covs,
            f'amplitude-{conf_name}': amps,
-            f'winkler-{conf_name}': winkler
+            f'winkler-{conf_name}': winkler,
+            f'coverage-soft-{conf_name}': cov_soft
        }

        update_pickle(report, file, update_fields)

-methods = None  # show all methods
-# methods = ['BayesianACC', 'BayesianKDEy']

-for setup in ['multiclass']:
-    path = f'./results/{setup}/*.pkl'
-    table = defaultdict(list)
-    for file in tqdm(glob(path), desc='processing results', total=len(glob(path))):
-        file = Path(file)
-        dataset, method = file.name.replace('.pkl', '').split('__')
-        if methods is not None and method not in methods:
+def pareto_front(df, x_col, y_col, maximize_y=True, minimize_x=True):
+    """
+    Returns a boolean mask indicating whether each row is Pareto-optimal.
+    """
+    X = df[x_col].values
+    Y = df[y_col].values
+
+    is_pareto = np.ones(len(df), dtype=bool)
+
+    for i in range(len(df)):
+        if not is_pareto[i]:
            continue
-        report = pickle.load(open(file, 'rb'))
-        results = report['results']
-        n_samples = len(results['ae'])
-        table['method'].extend([method.replace('Bayesian','Ba').replace('Bootstrap', 'Bo')] * n_samples)
-        table['dataset'].extend([dataset] * n_samples)
-        table['ae'].extend(results['ae'])
-        table['rae'].extend(results['rae'])
-        # table['c-CI'].extend(results['coverage'])
-        # table['a-CI'].extend(results['amplitude'])
+        for j in range(len(df)):
+            if i == j:
+                continue

-        update_pickle_with_region(report, file, conf_name='CI', conf_region_class=ConfidenceIntervals, bonferroni_correction=True)
-        update_pickle_with_region(report, file, conf_name='CE', conf_region_class=ConfidenceEllipseSimplex)
-        update_pickle_with_region(report, file, conf_name='CLR', conf_region_class=ConfidenceEllipseCLR)
-        update_pickle_with_region(report, file, conf_name='ILR', conf_region_class=ConfidenceEllipseILR)
+            better_or_equal_x = X[j] <= X[i] if minimize_x else X[j] >= X[i]
+            better_or_equal_y = Y[j] >= Y[i] if maximize_y else Y[j] <= Y[i]

-        table['c-CI'].extend(report['coverage-CI'])
-        table['a-CI'].extend(report['amplitude-CI'])
-        table['w-CI'].extend(report['winkler-CI'])
+            strictly_better = (
+                (X[j] < X[i] if minimize_x else X[j] > X[i]) or
+                (Y[j] > Y[i] if maximize_y else Y[j] < Y[i])
+            )

-        table['c-CE'].extend(report['coverage-CE'])
-        table['a-CE'].extend(report['amplitude-CE'])
+            if better_or_equal_x and better_or_equal_y and strictly_better:
+                is_pareto[i] = False
+                break

-        table['c-CLR'].extend(report['coverage-CLR'])
-        table['a-CLR'].extend(report['amplitude-CLR'])
+    return is_pareto

-        table['c-ILR'].extend(report['coverage-ILR'])
-        table['a-ILR'].extend(report['amplitude-ILR'])
+def plot_coverage_vs_amplitude(
+    df,
+    coverage_col,
+    amplitude_col="a-CI",
+    method_col="method",
+    dataset_col=None,
+    error_col=None,
+    error_threshold=None,
+    nominal_coverage=0.95,
+    title=None,
+):
+    df_plot = df.copy()

-        table['aitch'].extend(qp.error.dist_aitchison(results['true-prevs'], results['point-estim']))
-        # table['aitch-well'].extend(qp.error.dist_aitchison(results['true-prevs'], [ConfidenceEllipseILR(samples).mean_ for samples in results['samples']]))
-        # table['aitch'].extend()
-        table['reg-score-ILR'].extend(
-            [region_score(true_prev, ConfidenceEllipseILR(samples)) for true_prev, samples in zip(results['true-prevs'], results['samples'])]
-        )
+    # Optional error filtering
+    if error_col is not None and error_threshold is not None:
+        df_plot = df_plot[df_plot[error_col] <= error_threshold]
+
+    # Compute Pareto front
+    pareto_mask = pareto_front(
+        df_plot,
+        x_col=amplitude_col,
+        y_col=coverage_col,
+        maximize_y=True,
+        minimize_x=True
+    )
+
+    plt.figure(figsize=(7, 6))
+
+    # Base scatter
+    sns.scatterplot(
+        data=df_plot,
+        x=amplitude_col,
+        y=coverage_col,
+        hue=method_col,
+        # style=dataset_col,
+        alpha=0.6,
+        s=60,
+        legend=True
+    )
+
+    # Highlight Pareto front
+    plt.scatter(
+        df_plot.loc[pareto_mask, amplitude_col],
+        df_plot.loc[pareto_mask, coverage_col],
+        facecolors='none',
+        edgecolors='black',
+        s=120,
+        linewidths=1.5,
+        label="Pareto front"
+    )
+
+    # Nominal coverage line
+    plt.axhline(
+        nominal_coverage,
+        linestyle="--",
+        color="gray",
+        linewidth=1,
+        label="Nominal coverage"
+    )
+
+    plt.xlabel("Amplitude (fraction of simplex)")
+    plt.ylabel("Coverage")
+    plt.ylim(0, 1.05)
+
+    if title is not None:
+        plt.title(title)
+
+    plt.legend(bbox_to_anchor=(1.05, 1), loc="upper left")
+    plt.tight_layout()
+    plt.show()
+
+
+def nicer_method(name:str):
+    replacements = {
+        # 'Bayesian': 'Ba',
+        'Bootstrap': 'Bo',
+        '-numpyro': '',
+        'emcee': 'emc',
+        '-T*': '*',
+        '-T!': '',
+        '!': '',
+        '-Ait': r'$^{(\mathrm{Ait})}$',
+        '-Gau': r'$^{(\mathrm{Gau})}$'
+    }
+    for k, v in replacements.items():
+        name = name.replace(k,v)
+    return name
+
+
+def nicer_data(name:str):
+    replacements = {
+        'cifar': 'CIFAR',
+        '-l': '',
+        'mnist': 'MNIST',
+        'fashionmnist': 'fashionMNIST',
+        'svhn': 'SVHN',
+        '100coarse': '100(20)',
+    }
+    for k, v in replacements.items():
+        name = name.replace(k, v)
+    return name
+
+
+base_dir = RESULT_DIR
+
+table = defaultdict(list)
+n_classes = {}
+tr_size   = {}
+tr_prev   = {}
+
+dataset_class = [UCIMulticlassHandler, CIFAR100Handler, VisualDataHandler, LeQuaHandler]
+dataset_order = []
+for handler in dataset_class:
+    for dataset in handler.iter():
+        dataset_order.append(dataset.name)
+        train = dataset.get_training()
+        n_classes[dataset.name] = train.n_classes
+        tr_size[dataset.name] = len(train)
+        tr_prev[dataset.name] = F.strprev(train.prevalence())
+
+
+problem_type = 'multiclass'
+path = f'./{base_dir}/{problem_type}/*.pkl'
+    
+for file in tqdm(glob(path), desc='processing results', total=len(glob(path))):
+    file = Path(file)
+    dataset, method = file.name.replace('.pkl', '').split('__')
+    if (method not in methods) or (dataset not in dataset_order):
+        continue
+
+    report = pickle.load(open(file, 'rb'))
+    results = report['results']
+    n_samples = len(results['ae'])
+    table['method'].extend([nicer_method(method)] * n_samples)
+    table['dataset'].extend([nicer_data(dataset)] * n_samples)
+    table['ae'].extend(results['ae'])
+    table['rae'].extend(results['rae'])
+    # table['c-CI'].extend(results['coverage'])
+    # table['a-CI'].extend(results['amplitude'])
+
+    # update_pickle_with_region(report, file, conf_name='CI-ILR', conf_region_class=ConfidenceIntervalsILR, bonferroni_correction=True)
+    # update_pickle_with_region(report, file, conf_name='CI-CLR', conf_region_class=ConfidenceIntervalsCLR, bonferroni_correction=True)
+    update_pickle_with_region(report, file, conf_name='CI', conf_region_class=ConfidenceIntervals, bonferroni_correction=True)
+    update_pickle_with_region(report, file, conf_name='CInb', conf_region_class=ConfidenceIntervals, bonferroni_correction=False)  # no Bonferroni-correction
+    # update_pickle_with_region(report, file, conf_name='CE', conf_region_class=ConfidenceEllipseSimplex)
+    # update_pickle_with_region(report, file, conf_name='CLR', conf_region_class=ConfidenceEllipseCLR)
+    # update_pickle_with_region(report, file, conf_name='ILR', conf_region_class=ConfidenceEllipseILR)
+
+    conf_bonferroni = 'CI'
+    conf_name='CInb'
+    table['c-CI'].extend(report[f'coverage-{conf_bonferroni}'])  # the true coverage is better measured with Bonferroni-correction
+    table['w-CI'].extend(report[f'winkler-{conf_name}'])
+    table['cs-CI'].extend(report[f'coverage-soft-{conf_name}'])
+    table['a-CI'].extend(report[f'amplitude-{conf_name}'])
+
+    # table['aitch'].extend(qp.error.dist_aitchison(results['true-prevs'], results['point-estim']))  # not in this paper...
+    table['SRE'].extend(qp.error.sre(results['true-prevs'], results['point-estim'], report['train-prev'], eps=0.001))



-    df = pd.DataFrame(table)
+# remove datasets with more than max_classes classes
+# max_classes = 25
+# min_train   = 500
+# ignore_datasets = ['poker_hand', 'hcv']
+# for data_name, n in n_classes.items():
+#     if n > max_classes:
+#         df = df[df["dataset"] != data_name]
+# for data_name, n in tr_size.items():
+#     if n < min_train:
+#         df = df[df["dataset"] != data_name]
+# for data_name, n in tr_size.items():
+#     if data_name in ignore_datasets:
+#         df = df[df["dataset"] != data_name]

-    n_classes = {}
-    tr_size   = {}
-    for dataset in df['dataset'].unique():
-        fetch_fn = {
-            'binary': qp.datasets.fetch_UCIBinaryDataset,
-            'multiclass': qp.datasets.fetch_UCIMulticlassDataset
-        }[setup]
-        data = fetch_fn(dataset)
-        n_classes[dataset] = data.n_classes
-        tr_size[dataset] = len(data.training)

-    # remove datasets with more than max_classes classes
-    # max_classes = 30
-    # min_train   = 1000
-    # for data_name, n in n_classes.items():
-    #     if n > max_classes:
-    #         df = df[df["dataset"] != data_name]
-    # for data_name, n in tr_size.items():
-    #     if n < min_train:
-    #         df = df[df["dataset"] != data_name]

-    for region in ['ILR']: # , 'CI', 'CE', 'CLR', 'ILR']:
-        if setup == 'binary' and region=='ILR':
-            continue
-        # pv = pd.pivot_table(
-        #     df, index='dataset', columns='method', values=['ae', f'c-{region}', f'a-{region}'], margins=True
-        # )
+
+df = pd.DataFrame(table)
+df['a-CI'] *= 100
+df['c-CI'] *= 100
+df['cs-CI'] *= 100
+
+for region in ['CI']: #, 'CLR', 'ILR', 'CI']:
+    if problem_type == 'binary' and region=='ILR':
+        continue
+    for column in [f'a-{region}', 'ae', 'SRE', f'c-{region}', f'cs-{region}']: # f'w-{region}'
        pv = pd.pivot_table(
-            df, index='dataset', columns='method', values=[
-                #f'w-{region}',
-                # 'ae',
-                # 'rae',
-                # f'aitch',
-                # f'aitch-well'
-                'reg-score-ILR',
-            ], margins=True
+            df, index='dataset', columns='method', values=column, margins=True
        )
        pv['n_classes'] = pv.index.map(n_classes).astype('Int64')
        pv['tr_size'] = pv.index.map(tr_size).astype('Int64')
-        pv = pv.drop(columns=[col for col in pv.columns if col[-1] == "All"])
-        print(f'{setup=}')
+        #pv['tr-prev'] = pv.index.map(tr_prev)
+        pv = pv.drop(columns=[col for col in pv.columns if col == "All" or col[-1]=='All'])
+        print(f'{problem_type=} {column=}')
        print(pv)
        print('-'*80)

+        latex = LatexTable.from_dataframe(df, method='method', benchmark='dataset', value=column, name=column)
+        latex.format.configuration.show_std = False
+        #latex.reorder_methods([nicer_method(m) for m in methods])
+        latex.reorder_benchmarks([nicer_data(d) for d in dataset_order])
+        if column in ['ae', 'SRE']:
+            latex.format.configuration.lower_is_better = True
+            latex.format.configuration.stat_test = 'wilcoxon'
+            #latex.format.configuration.stat_test = None
+            # latex.format.configuration.show_std = True
+
+        if column in [f'c-{region}', f'cs-{region}']:
+            latex.format.configuration.lower_is_better = False
+            latex.format.configuration.stat_test = None
+            latex.format.configuration.with_color = False
+            latex.format.configuration.best_in_bold = False
+            latex.format.configuration.with_rank = False
+            latex.format.configuration.mean_prec = 0
+            latex.add_format_modifier(
+                format_modifier=FormatModifierSelectColor(
+                    comparison=SelectGreaterThan(reference_selector=89, input_selector=SelectByName())
+                )
+            )
+        if column in [f'a-{region}']:
+            latex.format.configuration.lower_is_better = True
+            latex.format.configuration.stat_test = None
+            latex.format.configuration.with_color = False
+            latex.format.configuration.best_in_bold = False
+            latex.format.configuration.mean_prec = 2
+            latex.add_format_modifier(
+                format_modifier=FormatModifierSelectColor(
+                    comparison=SelectSmallerThan(reference_selector=11, input_selector=SelectByName())
+                )
+            )
+            # latex.add_format_modifier(
+            #     format_modifier=FormatModifierSelectColor(
+            #         comparison=SelectSmallerThan(reference_selector=0.01, input_selector=SelectByName()),
+            #         intensity=50
+            #     )
+            # )
+
+        latex.format.configuration.resizebox=.5
+        latex.latexPDF(pdf_path=f'./tables/{latex.name}.pdf')
+
+
+df = df[df['method']!='BaACC']
+df = df[df['method']!='BaACC*']
+df = df[df['method']!='BaACC+']
+df = df[df['method']!='BaKDE-Ait*']
+df = df[df['method']!='BaKDE-Gau*']
+df = df[df['method']!='BayEMQ*']
+grouped = df.groupby(["method", "dataset"])
+agg = grouped.agg(
+    ae_mean=("ae", "mean"),
+    ae_std=("ae", "std"),
+    sre_mean=("SRE", "mean"),
+    sre_std=("SRE", "std"),
+    coverage_mean=("c-CI", "mean"),
+    coverage_std=("c-CI", "std"),
+    coverage_soft_mean=("cs-CI", "mean"),
+    amplitude_mean=("a-CI", "mean"),
+    amplitude_std=("a-CI", "std"),
+).reset_index()
+
+#plot_coverage_vs_amplitude(
+#    agg,
+#    coverage_col="coverage_soft_mean",
+#    amplitude_col="amplitude_mean",
+#    method_col="method",
+#    dataset_col="dataset",
+#    nominal_coverage=0.95,
+#    title="Marginal coverage vs amplitude"
+#)
+
+
+#print('RESTITUIR EL WILCOXON')
--- a/BayesianKDEy/map_experiments.py
+++ b/BayesianKDEy/map_experiments.py
@ -0,0 +1,169 @@
+import os.path
+from pathlib import Path
+import pandas as pd
+from sklearn.linear_model import LogisticRegression
+from copy import deepcopy as cp
+import quapy as qp
+from BayesianKDEy.commons import KDEyReduce
+from _bayeisan_kdey import BayesianKDEy
+from _bayesian_mapls import BayesianMAPLS
+from commons import experiment_path, KDEyCLR, RESULT_DIR, MockClassifierFromPosteriors, KDEyScaledB, KDEyFresh
+# import datasets
+from datasets import LeQuaHandler, UCIMulticlassHandler, DatasetHandler, VisualDataHandler, CIFAR100Handler
+from method.confidence import ConfidenceIntervals
+from temperature_calibration import temp_calibration
+from build.lib.quapy.data import LabelledCollection
+from quapy.method.aggregative import DistributionMatchingY as DMy, AggregativeQuantifier, EMQ, CC
+from quapy.model_selection import GridSearchQ
+from quapy.data import Dataset
+from quapy.method.confidence import BayesianCC, AggregativeBootstrap
+from quapy.method.aggregative import KDEyML, ACC
+from quapy.protocol import UPP
+import numpy as np
+from tqdm import tqdm
+from collections import defaultdict
+from time import time
+
+
+def methods(data_handler: DatasetHandler):
+    """
+    Returns a tuple (name, quantifier, hyperparams, bayesian/bootstrap_constructor), where:
+    - name: is a str representing the name of the method (e.g., 'BayesianKDEy')
+    - quantifier: is the base model (e.g., KDEyML())
+    - hyperparams: is a dictionary for the quantifier (e.g., {'bandwidth': [0.001, 0.005, 0.01, 0.05, 0.1, 0.2]})
+    - bayesian/bootstrap_constructor: is a function that instantiates the bayesian o bootstrap method with the
+        quantifier with optimized hyperparameters
+    """
+    if isinstance(data_handler, VisualDataHandler):
+        Cls = LogisticRegression
+        cls_hyper = {}
+    else:
+        Cls = LogisticRegression
+        cls_hyper = {'classifier__C': np.logspace(-4, 4, 9), 'classifier__class_weight': ['balanced', None]}
+    kdey_hyper = {'bandwidth': np.logspace(-3, -1, 10), **cls_hyper}
+    kdey_hyper_larger = {'bandwidth': np.logspace(-1, 0, 10), **cls_hyper}
+    kdey_hyper_clr = {'bandwidth': np.logspace(-2, 2, 10), **cls_hyper}
+
+    # surrogate quantifiers
+    kde_gau_scale = KDEyScaledB(Cls())
+
+    yield 'KDEy-G-exp', kdey_hyper, KDEyML(Cls())
+    # yield 'KDEy-G-exp2', kdey_hyper_larger, KDEyML(Cls())
+    # yield 'KDEy-G-log', kdey_hyper, KDEyML(Cls(), logdensities=True)
+    yield 'KDEy-Ait', kdey_hyper_clr, KDEyCLR(Cls())
+
+
+def model_selection(dataset: DatasetHandler, point_quantifier: AggregativeQuantifier, grid: dict):
+    with qp.util.temp_seed(0):
+        if isinstance(point_quantifier, KDEyScaledB) and 'bandwidth' in grid:
+            def scale_bandwidth(bandwidth, n_classes, beta=0.5):
+                return bandwidth * np.power(n_classes, beta)
+
+            n = dataset.get_training().n_classes
+            grid['bandwidth'] = [scale_bandwidth(b, n) for b in grid['bandwidth']]
+            print('bandwidth scaled')
+        print(f'performing model selection for {point_quantifier.__class__.__name__} with grid {grid}')
+        # model selection
+        if len(grid) > 0:
+            train, val_prot = dataset.get_train_valprot_for_modsel()
+            mod_sel = GridSearchQ(
+                model=point_quantifier,
+                param_grid=grid,
+                protocol=val_prot,
+                refit=False,
+                n_jobs=-1,
+                verbose=True
+            ).fit(*train.Xy)
+            best_params = mod_sel.best_params_
+        else:
+            best_params = {}
+
+        return best_params
+
+
+def experiment(dataset: DatasetHandler,
+               point_quantifier: AggregativeQuantifier,
+               method_name: str,
+               grid: dict,
+               hyper_choice_path: Path):
+
+    with qp.util.temp_seed(0):
+        # model selection
+        best_hyperparams = qp.util.pickled_resource(
+            hyper_choice_path, model_selection, dataset, cp(point_quantifier), grid
+        )
+        print(f'{best_hyperparams=}')
+
+        t_init = time()
+        training, test_generator = dataset.get_train_testprot_for_eval()
+        point_quantifier.fit(*training.Xy)
+        tr_time = time() - t_init
+
+        # test
+        train_prevalence = training.prevalence()
+        results = defaultdict(list)
+        pbar = tqdm(enumerate(test_generator()), total=test_generator.total())
+        for i, (sample_X, true_prevalence) in pbar:
+            t_init = time()
+            point_estimate = point_quantifier.predict(sample_X)
+            ttime = time() - t_init
+
+            results['true-prevs'].append(true_prevalence)
+            results['point-estim'].append(point_estimate)
+            results['shift'].append(qp.error.ae(true_prevalence, train_prevalence))
+            results['ae'].append(qp.error.ae(prevs_true=true_prevalence, prevs_hat=point_estimate))
+            results['rae'].append(qp.error.rae(prevs_true=true_prevalence, prevs_hat=point_estimate))
+            results['sre'].append(qp.error.sre(prevs_true=true_prevalence, prevs_hat=point_estimate, prevs_train=train_prevalence))
+            results['test-time'].append(ttime)
+
+            pbar.set_description(
+                f'{method_name} MAE={np.mean(results["ae"]):.5f} W={np.mean(results["sre"]):.5f}')
+
+        report = {
+            'optim_hyper': best_hyperparams,
+            'train_time': tr_time,
+            'train-prev': train_prevalence,
+            'results': {k: np.asarray(v) for k, v in results.items()},
+        }
+
+        return report
+
+
+if __name__ == '__main__':
+
+    result_dir = Path('results_map')
+
+    reports = defaultdict(list)
+
+    for data_handler in [UCIMulticlassHandler]:  # , UCIMulticlassHandler, LeQuaHandler, VisualDataHandler, CIFAR100Handler]:
+        for dataset in data_handler.iter():
+            qp.environ['SAMPLE_SIZE'] = dataset.sample_size
+            # print(f'dataset={dataset.name}')
+
+            problem_type = 'binary' if dataset.is_binary() else 'multiclass'
+            for method_name, hyper_params, quantifier in methods(dataset):
+
+                result_path = experiment_path(result_dir / problem_type, dataset.name, method_name)
+                hyper_path = experiment_path(result_dir / 'hyperparams' / problem_type, dataset.name, method_name)
+
+                # if os.path.exists(result_path):
+                report = qp.util.pickled_resource(
+                    result_path, experiment, dataset, quantifier, method_name, hyper_params, hyper_path
+                )
+                reports['dataset'].append(dataset.name)
+                reports['method'].append(method_name)
+                reports['MAE'].append(report["results"]["ae"].mean())
+                reports['SRE'].append(report["results"]["sre"].mean())
+                reports['h'].append(report["optim_hyper"]["bandwidth"])
+
+                print(f'dataset={dataset.name}, '
+                      f'method={method_name}: '
+                      f'mae={reports["MAE"][-1]:.5f}, '
+                      f'W={reports["SRE"][-1]:.5f} '
+                      f'h={reports["h"][-1]}')
+
+    pv = pd.DataFrame(reports).pivot_table(values=['MAE', 'SRE', 'h'], index='dataset', columns='method', margins=True)
+    print(pv)
+
+
+
--- a/BayesianKDEy/methods.py
+++ b/BayesianKDEy/methods.py
@ -0,0 +1,168 @@
+from abc import ABC, abstractmethod
+import numpy as np
+from sklearn.linear_model import LogisticRegression
+
+from BayesianKDEy._bayeisan_kdey import BayesianKDEy
+from BayesianKDEy._bayesian_mapls import BayesianMAPLS
+from BayesianKDEy.commons import KDEyCLR, KDEyCLR2
+from quapy.method.aggregative import CC, ACC, EMQ, DMy, KDEyML
+from quapy.method.base import BaseQuantifier
+from quapy.method.confidence import AggregativeBootstrap, BayesianCC
+
+
+def get_experimental_methods():
+    #yield BootsCC()
+    #yield BayACC()
+    #yield BayEMQ()
+    #yield BayKDEyGau()
+    #yield BayKDEyAit()
+    yield BayKDEyAit2()
+
+
+# commons
+# ------------------------------------------------------------
+Cls = LogisticRegression
+cls_hyper = {
+    'classifier__C': np.logspace(-4,4,9),
+    'classifier__class_weight': ['balanced', None]
+}
+hdy_hyper = {'nbins': [3, 4, 5, 8, 9, 10, 12, 14, 16, 32], **cls_hyper}
+kdey_hyper = {'bandwidth': np.logspace(-3, -1, 10), **cls_hyper}
+kdey_hyper_clr = {'bandwidth': np.logspace(-2, 2, 10), **cls_hyper}
+
+def hyper2cls(hyperparams):
+    return {k.replace('classifier__', ''): v for k, v in hyperparams.items()}
+
+
+# method descriptor logic
+# ------------------------------------------------------------
+class MethodDescriptor(ABC):
+    def __init__(self,
+                 name: str,
+                 surrogate_quantifier: BaseQuantifier,
+                 hyper_parameters: dict,
+                 ):
+        self.name = name
+        self.surrogate_quantifier_ = surrogate_quantifier
+        self.hyper_parameters = hyper_parameters
+
+    @abstractmethod
+    def binary_only(self): ...
+
+    def surrogate_quantifier(self):
+        return self.surrogate_quantifier_
+
+    def surrogate_quantifier_name(self):
+        return self.surrogate_quantifier_.__class__.__name__
+
+    @abstractmethod
+    def uncertainty_aware_quantifier(self, hyperparameters): ...
+
+
+class MulticlassMethodDescriptor(MethodDescriptor):
+
+    def binary_only(self):
+        return False
+
+
+# specific methods definitions
+# ------------------------------------------------------------
+# ------------------------------------------------------------
+
+# Bootstrap approaches:
+# ------------------------------------------------------------
+class BootsCC(MulticlassMethodDescriptor):
+    def __init__(self):
+        super().__init__(name='BoCC', surrogate_quantifier=CC(Cls()), hyper_parameters=cls_hyper)
+
+    def uncertainty_aware_quantifier(self, hyperparameters):
+        quantifier = CC(Cls()).set_params(**hyperparameters)
+        return AggregativeBootstrap(quantifier, n_test_samples=1000, random_state=0)
+
+
+# class BootsACC(MulticlassMethodDescriptor):
+#     def __init__(self):
+#         super().__init__(name='BoACC', surrogate_quantifier=ACC(Cls()), hyper_parameters=cls_hyper)
+#
+#     def uncertainty_aware_quantifier(self, hyperparameters):
+#         quantifier = ACC(Cls()).set_params(**hyperparameters)
+#         return AggregativeBootstrap(quantifier, n_test_samples=1000, random_state=0)
+#
+#
+# class BootsEMQ(MulticlassMethodDescriptor):
+#     def __init__(self):
+#         super().__init__(name='BoEMQ', surrogate_quantifier=EMQ(Cls(), exact_train_prev=False), hyper_parameters=cls_hyper)
+#
+#     def uncertainty_aware_quantifier(self, hyperparameters):
+#         quantifier = EMQ(Cls(), exact_train_prev=False).set_params(**hyperparameters)
+#         return AggregativeBootstrap(quantifier, n_test_samples=1000, random_state=0)
+
+
+# class BootsHDy(MethodDescriptor):
+#     def __init__(self):
+#         super().__init__(name='BoHDy', surrogate_quantifier=DMy(Cls()), hyper_parameters=hdy_hyper)
+#
+#     def uncertainty_aware_quantifier(self, hyperparameters):
+#         quantifier = DMy(Cls()).set_params(**hyperparameters)
+#         return AggregativeBootstrap(quantifier, n_test_samples=1000, random_state=0)
+#
+#     def binary_only(self):
+#         return True
+
+# class BootsKDEy(MulticlassMethodDescriptor):
+#     def __init__(self):
+#         super().__init__(name='BoKDEy', surrogate_quantifier=KDEyML(Cls()), hyper_parameters=kdey_hyper)
+#
+#     def uncertainty_aware_quantifier(self, hyperparameters):
+#         quantifier = KDEyML(Cls()).set_params(**hyperparameters)
+#         return AggregativeBootstrap(quantifier, n_test_samples=1000, random_state=0)
+
+
+# Bayesian approaches:
+# ------------------------------------------------------------
+class BayACC(MulticlassMethodDescriptor):
+    def __init__(self):
+        super().__init__(name='BaACC!', surrogate_quantifier=ACC(Cls()), hyper_parameters=cls_hyper)
+
+    def uncertainty_aware_quantifier(self, hyperparameters):
+        classifier = Cls(**hyper2cls(hyperparameters))
+        return BayesianCC(classifier, temperature=None, mcmc_seed=0)  # is actually a Bayesian variant of ACC
+
+
+class BayEMQ(MulticlassMethodDescriptor):
+    def __init__(self):
+        super().__init__(name='BaEMQ!', surrogate_quantifier=EMQ(Cls(), exact_train_prev=False), hyper_parameters=cls_hyper)
+
+    def uncertainty_aware_quantifier(self, hyperparameters):
+        classifier = Cls(**hyper2cls(hyperparameters))
+        return BayesianMAPLS(classifier, prior='uniform', temperature=None, exact_train_prev=False)
+
+
+class BayKDEyGau(MulticlassMethodDescriptor):
+    def __init__(self):
+        kdey_hyper = {'bandwidth': np.logspace(-3, -1, 10), **cls_hyper}
+        super().__init__(name='BaKDE-Gau-T!', surrogate_quantifier=KDEyML(Cls()), hyper_parameters=kdey_hyper)
+
+    def uncertainty_aware_quantifier(self, hyperparameters):
+        return BayesianKDEy(Cls(), kernel='gaussian', temperature=None, mcmc_seed=0, **hyperparameters)
+
+
+class BayKDEyAit(MulticlassMethodDescriptor):
+    def __init__(self):
+        kdey_hyper = {'bandwidth': np.logspace(-2, 2, 10), **cls_hyper}
+        super().__init__(name='BaKDE-Ait-T!', surrogate_quantifier=KDEyCLR(Cls()), hyper_parameters=kdey_hyper)
+
+    def uncertainty_aware_quantifier(self, hyperparameters):
+        return BayesianKDEy(Cls(), kernel='aitchison', temperature=None, mcmc_seed=0, **hyperparameters)
+    
+
+class BayKDEyAit2(MulticlassMethodDescriptor):
+    def __init__(self):
+        kdey_hyper = {'bandwidth': np.linspace(0.05, 2., 10), **cls_hyper}
+        super().__init__(name='BaKDE-Ait-T!2', surrogate_quantifier=KDEyCLR2(Cls()), hyper_parameters=kdey_hyper)
+
+    def uncertainty_aware_quantifier(self, hyperparameters):
+        return BayesianKDEy(Cls(), kernel='aitchison', temperature=None, mcmc_seed=0, **hyperparameters)
+
+
+
--- a/BayesianKDEy/plot_simplex.py
+++ b/BayesianKDEy/plot_simplex.py
@ -4,10 +4,14 @@ from pathlib import Path

 import numpy as np
 import matplotlib.pyplot as plt
-from matplotlib.colors import ListedColormap
+from matplotlib.colors import ListedColormap, LinearSegmentedColormap
 from scipy.stats import gaussian_kde
+from sklearn.preprocessing import MinMaxScaler

+from BayesianKDEy.commons import antagonistic_prevalence, in_simplex
 from method.confidence import (ConfidenceIntervals as CI,
+                               ConfidenceIntervalsCLR as CICLR,
+                               ConfidenceIntervalsILR as CIILR,
                               ConfidenceEllipseSimplex as CE,
                               ConfidenceEllipseCLR as CLR,
                               ConfidenceEllipseILR as ILR)
@ -36,7 +40,10 @@ def get_region_colormap(name="blue", alpha=0.40):
 def plot_prev_points(samples=None,
                     show_samples=True,
                     true_prev=None,
-                     point_estim=None, train_prev=None, show_mean=True, show_legend=True,
+                     point_estim=None,
+                     train_prev=None,
+                     show_mean=True,
+                     show_legend=True,
                     region=None,
                     region_resolution=1000,
                     confine_region_in_simplex=False,
@ -100,9 +107,7 @@ def plot_prev_points(samples=None,
        else:
            in_simplex = np.full(shape=(region_resolution, region_resolution), fill_value=True, dtype=bool)

-        # --- Colormap 0 → blanco, 1 → rojo semitransparente ---
-
-        # iterar sobre todas las regiones
+        # iterate over regions
        for (rname, rfun) in region_list:
            mask = np.zeros_like(in_simplex, dtype=float)
            valid_pts = pts_bary[in_simplex]
@ -127,7 +132,7 @@ def plot_prev_points(samples=None,
        else:
            raise ValueError(f'show_mean should either be a boolean (if True, then samples must be provided) or '
                             f'the mean point itself')
-    if train_prev is not None:
+    if true_prev is not None:
        ax.scatter(*cartesian(true_prev), s=10, alpha=1, label='true-prev', edgecolors='black')
    if point_estim is not None:
        ax.scatter(*cartesian(point_estim), s=10, alpha=1, label='KDEy-estim', edgecolors='black')
@ -210,49 +215,346 @@ def plot_prev_points_matplot(points):
    ax.axis('off')
    plt.show()

+# -------- new function
+
+def cartesian(p):
+    dim = p.shape[-1]
+    p = np.atleast_2d(p)
+    x = p[:, 1] + p[:, 2] * 0.5
+    y = p[:, 2] * np.sqrt(3) / 2
+    return x, y
+
+
+def barycentric_from_xy(x, y):
+    """
+    Given cartesian (x,y) in simplex returns baricentric coordinates (p1,p2,p3).
+    """
+    p3 = 2 * y / np.sqrt(3)
+    p2 = x - 0.5 * p3
+    p1 = 1 - p2 - p3
+    return np.stack([p1, p2, p3], axis=-1)
+
+
+def plot_regions(ax, region_layers, resolution, confine):
+    xs = np.linspace(-0.2, 1.2, resolution)
+    ys = np.linspace(-0.2, np.sqrt(3)/2 + 0.2, resolution)
+    grid_x, grid_y = np.meshgrid(xs, ys)
+
+    pts_bary = barycentric_from_xy(grid_x, grid_y)
+
+    if confine:
+        mask_simplex = np.all(pts_bary >= 0, axis=-1)
+    else:
+        mask_simplex = np.ones(grid_x.shape, dtype=bool)
+
+    for region in region_layers:
+        mask = np.zeros_like(mask_simplex, dtype=float)
+        valid_pts = pts_bary[mask_simplex]
+        mask_vals = np.array([float(region["fn"](p)) for p in valid_pts])
+        mask[mask_simplex] = mask_vals
+
+        ax.pcolormesh(
+            xs, ys, mask,
+            shading="auto",
+            cmap=get_region_colormap(region.get("color", "blue")),
+            alpha=region.get("alpha", 0.3),
+            label=region.get("label", None),
+        )
+
+
+
+
+def plot_points(ax, point_layers):
+    for layer in point_layers:
+        pts = layer["points"]
+        style = layer.get("style", {})
+        ax.scatter(
+            *cartesian(pts),
+            label=layer.get("label", None),
+            **style
+        )
+
+def plot_density(
+    ax,
+    density_fn,
+    resolution,
+    densecolor="blue",
+    alpha=1,
+    high_contrast=True  # maps the density values to [0,1] to enhance visualization with the cmap
+):
+    xs = np.linspace(-0.2, 1.2, resolution)
+    ys = np.linspace(-0.2, np.sqrt(3)/2 + 0.2, resolution)
+    grid_x, grid_y = np.meshgrid(xs, ys)
+
+    white_to_color = LinearSegmentedColormap.from_list(
+        "white_to_color",
+        ["white", densecolor]
+    )
+
+    pts_bary = barycentric_from_xy(grid_x, grid_y)
+
+    density = np.zeros(grid_x.shape)
+    flat_pts = pts_bary.reshape(-1, 3)
+
+    density_vals = np.array(density_fn(flat_pts))
+    #density_vals = np.array([density_fn(p) for p in flat_pts])
+    if high_contrast:
+        min_v, max_v = np.min(density_vals), np.max(density_vals)
+        density_vals = (density_vals - min_v)/(max_v-min_v)
+    density[:] = density_vals.reshape(grid_x.shape)
+
+
+    ax.pcolormesh(
+        xs,
+        ys,
+        density,
+        shading="auto",
+        cmap=white_to_color,
+        alpha=alpha,
+    )
+
+
+def plot_simplex(
+    point_layers=None,
+    region_layers=None,
+    density_function=None,
+    density_color="#1f77b4", # blue from tab10
+    density_alpha=1,
+    resolution=1000,
+    confine_region_in_simplex=False,
+    show_legend=True,
+    save_path=None,
+):
+    fig, ax = plt.subplots(figsize=(6, 6))
+
+    if density_function is not None:
+        plot_density(
+            ax,
+            density_function,
+            resolution,
+            density_color,
+            density_alpha,
+        )
+
+    if region_layers:
+        plot_regions(ax, region_layers, resolution, confine_region_in_simplex)
+
+    if point_layers:
+        plot_points(ax, point_layers)
+
+    # simplex edges
+    triangle = np.array([[0,0],[1,0],[0.5,np.sqrt(3)/2],[0,0]])
+    ax.plot(triangle[:,0], triangle[:,1], color="black")
+
+    # labels
+    ax.text(-0.05, -0.05, "Y=1", ha="right", va="top")
+    ax.text(1.05, -0.05, "Y=2", ha="left", va="top")
+    ax.text(0.5, np.sqrt(3)/2 + 0.05, "Y=3", ha="center", va="bottom")
+
+    ax.set_aspect("equal")
+    ax.axis("off")
+
+    if show_legend:
+        ax.legend(loc="center left", bbox_to_anchor=(1.05, 0.5))
+
+    plt.tight_layout()
+    if save_path:
+        os.makedirs(Path(save_path).parent, exist_ok=True)
+        plt.savefig(save_path, bbox_inches="tight")
+    else:
+        plt.show()
+
+
+def gaussian_kernel(p, mu=np.array([0.6, 0.2, 0.2]), sigma=0.5):
+    return np.exp(-np.sum((p - mu) ** 2) / (2 * sigma ** 2))
+
+
+def plot_kernels():
+    from quapy.method._kdey import KDEBase
+
+    kernel = 'aitchison'
+
+    def kernel(p, center, bandwidth=0.1, kernel='gaussian', within_simplex=False):
+        assert within_simplex or kernel == 'gaussian', f'cannot compute {kernel=} outside the simplex'
+        X = np.asarray(center).reshape(-1, 3)
+        p = np.asarray(p).reshape(-1, 3)
+        KDE = KDEBase()
+        kde = KDE.get_kde_function(X, bandwidth=bandwidth, kernel=kernel)
+
+        if within_simplex:
+            density = np.zeros(shape=p.shape[0])
+            p_mask = in_simplex(p)
+            density_vals = KDE.pdf(kde, p[p_mask], kernel=kernel)
+            density[p_mask] = density_vals
+        else:
+            density = KDE.pdf(kde, p, kernel=kernel)
+        return density
+
+    plt.rcParams.update({
+        'font.size': 15,
+        'axes.titlesize': 12,
+        'axes.labelsize': 10,
+        'xtick.labelsize': 8,
+        'ytick.labelsize': 8,
+        'legend.fontsize': 9,
+    })
+
+    dot_style = {"color": "red", "alpha": 1, "s": 100, 'linewidth': 1.5, 'edgecolors': "white"}
+    # center = np.asarray([[0.05, 0.03, 0.92],[0.5, 0.4, 0.1]])
+    center = np.asarray([0.33, 0.33, 0.34])
+    point_layer = [
+        {"points": center, "label": "kernels", "style": dot_style},
+    ]
+    density_fn = lambda p: kernel(p, center, bandwidth=0.2, kernel='gaussian', within_simplex=False)
+    plot_simplex(point_layers=point_layer, density_function=density_fn, show_legend=False,
+                 save_path='./plots/kernels/gaussian_center.png')
+
+    density_fn = lambda p: kernel(p, center, bandwidth=.6, kernel='aitchison', within_simplex=True)
+    plot_simplex(point_layers=point_layer, density_function=density_fn, show_legend=False,
+                 save_path='./plots/kernels/aitchison_center.png')
+
+    center = np.asarray([0.05, 0.05, 0.9])
+    point_layer[0]['points'] = center
+    density_fn = lambda p: kernel(p, center, bandwidth=0.2, kernel='gaussian', within_simplex=False)
+    plot_simplex(point_layers=point_layer, density_function=density_fn, show_legend=False,
+                 save_path='./plots/kernels/gaussian_vertex_005_005_09.png')
+
+    density_fn = lambda p: kernel(p, center, bandwidth=1, kernel='aitchison', within_simplex=True)
+    plot_simplex(point_layers=point_layer, density_function=density_fn, show_legend=False,
+                 save_path='./plots/kernels/aitchison_vertex_005_005_09.png')
+
+    center = np.asarray([0.45, 0.45, 0.1])
+    point_layer[0]['points'] = center
+    density_fn = lambda p: kernel(p, center, bandwidth=0.2, kernel='gaussian', within_simplex=False)
+    plot_simplex(point_layers=point_layer, density_function=density_fn, show_legend=False,
+                 save_path='./plots/kernels/gaussian_border_045_045_01.png')
+
+    density_fn = lambda p: kernel(p, center, bandwidth=.7, kernel='aitchison', within_simplex=True)
+    plot_simplex(point_layers=point_layer, density_function=density_fn, show_legend=False,
+                 save_path='./plots/kernels/aitchison_border_045_045_01.png')
+
+
 if __name__ == '__main__':
    np.random.seed(1)

    n = 1000
-    # alpha = [3,5,10]
-    alpha = [10,1,1]
+    alpha = [15,10,7]
    prevs = np.random.dirichlet(alpha, size=n)

    def regions():
-        confs = [0.99, 0.95, 0.90]
-        yield 'CI', [(f'{int(c*100)}%', CI(prevs, confidence_level=c).coverage) for c in confs]
+        confs = [0.9, 0.95, 0.99]
+        # yield 'CI', [(f'{int(c*100)}%', CI(prevs, confidence_level=c).coverage) for c in confs]
        # yield 'CI-b', [(f'{int(c * 100)}%', CI(prevs, confidence_level=c, bonferroni_correction=True).coverage) for c in confs]
+        # yield 'CI-CLR', [(f'{int(c * 100)}%', CICLR(prevs, confidence_level=c).coverage) for c in confs]
+        # yield 'CI-CLR-b', [(f'{int(c * 100)}%', CICLR(prevs, confidence_level=c, bonferroni_correction=True).coverage) for c in confs]
+        # yield 'CI-ILR', [(f'{int(c * 100)}%', CIILR(prevs, confidence_level=c).coverage) for c in confs]
+        yield 'CI-ILR-b', [(f'{int(c * 100)}%', CIILR(prevs, confidence_level=c, bonferroni_correction=True).coverage) for c in confs]
        # yield 'CE', [(f'{int(c*100)}%', CE(prevs, confidence_level=c).coverage) for c in confs]
-        # yield 'CLR', [(f'{int(c*100)}%', CLR(prevs, confidence_level=c).coverage) for c in confs]
-        # yield 'ILR', [(f'{int(c*100)}%', ILR(prevs, confidence_level=c).coverage) for c in confs]
-
-    # resolution = 1000
-    # alpha_str = ','.join([f'{str(i)}' for i in alpha])
-    # for crname, cr in regions():
-    #     plot_prev_points(prevs, show_mean=True, show_legend=False, region=cr, region_resolution=resolution,
-    #                      color='blue',
-    #                      save_path=f'./plots/simplex_{crname}_alpha{alpha_str}_res{resolution}.png',
-    #                      )
-
-
-    def regions():
-        confs = [0.99, 0.95, 0.90]
-        yield 'CI', [(f'{int(c*100)}%', CI(prevs, confidence_level=c).coverage) for c in confs]
-        # yield 'CI-b', [(f'{int(c * 100)}%', CI(prevs, confidence_level=c, bonferroni_correction=True).coverage) for c in confs]
-        # yield 'CE', [(f'{int(c*100)}%', CE(prevs, confidence_level=c).coverage) for c in confs]
-        # yield 'CLR', [(f'{int(c*100)}%', CLR(prevs, confidence_level=c).coverage) for c in confs]
-        # yield 'ILR', [(f'{int(c*100)}%', ILR(prevs, confidence_level=c).coverage) for c in confs]
+        # yield 'CE-CLR', [(f'{int(c*100)}%', CLR(prevs, confidence_level=c).coverage) for c in confs]
+        # yield 'CE-ILR', [(f'{int(c*100)}%', ILR(prevs, confidence_level=c).coverage) for c in confs]

    resolution = 1000
    alpha_str = ','.join([f'{str(i)}' for i in alpha])
-    region  = ILR(prevs, confidence_level=.99)
-    p = np.asarray([0.1, 0.8, 0.1])
-    plot_prev_points(prevs, show_samples=False,
-                     show_mean=region.mean_,
-                     # show_mean=prevs.mean(axis=0),
-                     show_legend=False, region=[('', region.coverage)], region_resolution=resolution,
-                     color='blue',
-                     true_prev=p,
-                     train_prev=region.closest_point_in_region(p),
-                     save_path=f'./plots3/simplex_ilr.png',
-                     )
+
+    dot_style = {"color": "gray", "alpha": .5, "s": 15, 'linewidth': .25, 'edgecolors': "black"}
+    point_layer = [
+        {"points": prevs, "label": "points", "style": dot_style},
+    ]
+
+    for crname, cr in regions():
+        region = [{'fn': fn, 'alpha':.6, 'label':label} for label, fn in cr]
+
+        plot_simplex(point_layers=point_layer, region_layers=region, show_legend=False, resolution=resolution, save_path=f'./plots/regions/{crname}.png')
+
+    # def regions():
+    #     confs = [0.99, 0.95, 0.90]
+    #     yield 'CI', [(f'{int(c*100)}%', CI(prevs, confidence_level=c).coverage) for c in confs]
+        # yield 'CI-b', [(f'{int(c * 100)}%', CI(prevs, confidence_level=c, bonferroni_correction=True).coverage) for c in confs]
+        # yield 'CE', [(f'{int(c*100)}%', CE(prevs, confidence_level=c).coverage) for c in confs]
+        # yield 'CLR', [(f'{int(c*100)}%', CLR(prevs, confidence_level=c).coverage) for c in confs]
+        # yield 'ILR', [(f'{int(c*100)}%', ILR(prevs, confidence_level=c).coverage) for c in confs]
+
+    # resolution = 100
+    # alpha_str = ','.join([f'{str(i)}' for i in alpha])
+    # region  = CI(prevs, confidence_level=.95, bonferroni_correction=True)
+    # p = None # np.asarray([0.1, 0.8, 0.1])
+    # plot_prev_points(prevs,
+    #                  show_samples=True,
+    #                  show_mean=None,
+    #                  # show_mean=prevs.mean(axis=0),
+    #                  show_legend=False,
+    #                  # region=[('', region.coverage)],
+    #                  # region_resolution=resolution,
+    #                  color='blue',
+    #                  true_prev=p,
+    #                  # train_prev=region.closest_point_in_region(p),
+    #                  save_path=f'./plots/prior_test/uniform.png',
+    #                  )
+
+    plt.rcParams.update({
+        'font.size': 10,
+        'axes.titlesize': 12,
+        'axes.labelsize': 10,
+        'xtick.labelsize': 8,
+        'ytick.labelsize': 8,
+        'legend.fontsize': 9,
+    })
+
+    n = 1000
+    train_style = {"color": "blue", "alpha": 0.5, "s":15, 'linewidth':0.5, 'edgecolors':None}
+    test_style = {"color": "red", "alpha": 0.5, "s": 15, 'linewidth': 0.5, 'edgecolors': None}
+
+
+    # train_prevs = np.random.dirichlet(alpha=[1, 1, 1], size=n)
+    # test_prevs = np.random.dirichlet(alpha=[1, 1, 1], size=n)
+    # plot_simplex(
+    #     point_layers=[
+    #         {"points": train_prevs, "label": "train", "style": train_style},
+    #         {"points": test_prevs, "label": "test", "style": test_style},
+    #     ],
+    #     save_path=f'./plots/prior_test/uniform.png'
+    # )
+    #
+    # alpha = [40, 10, 10]
+    # train_prevs = np.random.dirichlet(alpha=alpha, size=n)
+    # test_prevs = np.random.dirichlet(alpha=alpha, size=n)
+    # plot_simplex(
+    #     point_layers=[
+    #         {"points": train_prevs, "label": "train", "style": train_style},
+    #         {"points": test_prevs, "label": "test", "style": test_style},
+    #     ],
+    #     save_path=f'./plots/prior_test/informative.png'
+    # )
+
+    # train_prevs = np.random.dirichlet(alpha=[8, 1, 1], size=n)
+    # test_prevs = np.random.dirichlet(alpha=[1, 8, 1], size=n)
+    # plot_simplex(
+    #     point_layers=[
+    #         {"points": train_prevs, "label": "train", "style": train_style},
+    #         {"points": test_prevs, "label": "test", "style": test_style},
+    #     ],
+    #     save_path=f'./plots/prior_test/wrong.png'
+    # )
+
+    # p = 0.6
+    #
+    # K = 3
+    # # alpha = [p] + [(1. - p) / (K - 1)] * (K - 1)
+    # alpha = [0.095, 0.246, 0.658]  # connect-4
+    # alpha = np.array(alpha)
+    #
+    #
+    # for c in [50, 500, 5_000]:
+    #     alpha_tr = alpha * c
+    #     alpha_te = antagonistic_prevalence(alpha, strength=1) * c
+    #     train_prevs = np.random.dirichlet(alpha=alpha_tr, size=n)
+    #     test_prevs = np.random.dirichlet(alpha=alpha_te, size=n)
+    #     plot_simplex(
+    #         point_layers=[
+    #             {"points": train_prevs, "label": "train", "style": train_style},
+    #             {"points": test_prevs, "label": "test", "style": test_style},
+    #         ],
+    #         save_path=f'./plots/prior_test/concentration_{c}.png'
+    #     )
+
+    # plot_kernels()
--- a/BayesianKDEy/prior_effect.py
+++ b/BayesianKDEy/prior_effect.py
@ -0,0 +1,297 @@
+from collections import defaultdict
+import pandas as pd
+
+import model_selection
+import quapy as qp
+from BayesianKDEy._bayeisan_kdey import BayesianKDEy
+from BayesianKDEy.temperature_calibration import temp_calibration
+from commons import *
+from data import Dataset
+from protocol import DirichletProtocol
+from quapy.method.confidence import BayesianCC
+from quapy.method.aggregative import ACC, AggregativeQuantifier
+from sklearn.linear_model import LogisticRegression as LR
+from copy import deepcopy as cp
+from tqdm import tqdm
+from full_experiments import model_selection
+from itertools import chain
+
+
+def select_imbalanced_datasets(top_m=10):
+    datasets_prevs = []
+    # choose top-m imbalanced datasets
+    for data_name in multiclass['datasets']:
+        data_prev = multiclass['fetch_fn'](data_name).training.prevalence()
+        balance = normalized_entropy(data_prev)
+        datasets_prevs.append((data_name, balance))
+    datasets_prevs.sort(key=lambda x: x[1])
+    data_selected = [data_name for data_name, balance in datasets_prevs[:top_m]]
+    return data_selected
+
+
+def methods():
+    acc_hyper = {}
+    kdey_hyper = {'bandwidth': [0.001, 0.005, 0.01, 0.05, 0.1, 0.2]}
+    kdey_hyper_clr = {'bandwidth': [0.05, 0.1, 0.5, 1., 2., 5.]}
+
+    yield 'BayesianACC', ACC(LR()), acc_hyper, lambda hyper: BayesianCC(LR(), mcmc_seed=0, prior='uniform')
+    yield f'BaKDE-Ait', KDEyCLR(LR()), kdey_hyper_clr, lambda hyper: BayesianKDEy(kernel='aitchison', mcmc_seed=0, engine='numpyro', temperature=None, prior='uniform', **hyper)
+    yield f'BaKDE-Ait-T2', KDEyCLR(LR()), kdey_hyper_clr, lambda hyper: BayesianKDEy(kernel='aitchison', mcmc_seed=0,
+                                                                                  engine='numpyro', temperature=2.,
+                                                                                  prior='uniform', **hyper)
+    yield f'BaKDE-Ait-T1', KDEyCLR(LR()), kdey_hyper_clr, lambda hyper: BayesianKDEy(kernel='aitchison', mcmc_seed=0,
+                                                                                  engine='numpyro', temperature=1.,
+                                                                                  prior='uniform', **hyper)
+
+
+def run_test(test, alpha_test, alpha_train, concentration, prior_type, bay_quant, train_prev, dataset_name, method_name, results):
+    test_generator = DirichletProtocol(test, alpha=alpha_test, repeats=100, random_state=0)
+    for i, (sample_X, true_prev) in tqdm(enumerate(test_generator()), total=test_generator.total(),
+                                               desc=f'{method_name} {prior_type} alpha with {concentration=}'):
+        estim_prev, region = bay_quant.predict_conf(sample_X)
+
+        results['dataset'].append(dataset_name)
+        results['method_name'].append(method_name)
+        results['prior-type'].append(prior_type)
+        results['train-prev'].append(train_prev)
+        results['concentration'].append(concentration)
+        results['train-alpha'].append(alpha_train)
+        results['test-alpha'].append(alpha_test)
+        results['true-prevs'].append(true_prev)
+        results['point-estim'].append(estim_prev)
+        results['shift'].append(qp.error.ae(true_prev, train_prev))
+        results['ae'].append(qp.error.ae(prevs_true=true_prev, prevs_hat=estim_prev))
+        results['sre'].append(qp.error.sre(prevs_true=true_prev, prevs_hat=estim_prev, prevs_train=train_prev))
+        results['rae'].append(qp.error.rae(prevs_true=true_prev, prevs_hat=estim_prev))
+        results['coverage'].append(region.coverage(true_prev))
+        results['amplitude'].append(region.montecarlo_proportion(n_trials=50_000))
+        results['samples'].append(region.samples)
+
+
+
+def experiment(dataset: Dataset,
+               dataset_name: str,
+               point_quantifier: AggregativeQuantifier,
+               grid: dict,
+               bay_constructor,
+               method_name:str,
+               hyper_choice_path: Path):
+
+    with qp.util.temp_seed(0):
+
+        training, test = dataset.train_test
+
+        # model selection
+        best_hyperparams = qp.util.pickled_resource(
+            hyper_choice_path, model_selection, training, cp(point_quantifier), grid
+        )
+
+        bay_quant = bay_constructor(best_hyperparams)
+        if hasattr(bay_quant, 'temperature') and bay_quant.temperature is None:
+            train, val = data.training.split_stratified(train_prop=0.6, random_state=0)
+            temperature = temp_calibration(bay_quant, train, val, temp_grid=[.5, 1., 1.5, 2., 5., 10., 100.], n_jobs=-1)
+            bay_quant.temperature = temperature
+        bay_quant.fit(*training.Xy)
+
+        # test
+        train_prev = training.prevalence()
+        results = defaultdict(list)
+
+        for concentration in [50, 500, 5_000]:
+            alpha_train = train_prev * concentration
+            bay_quant.prior = alpha_train
+
+            # informative prior
+            alpha_test_informative = alpha_train
+            prior_type = 'informative'
+            run_test(test, alpha_test_informative, alpha_train, concentration, prior_type, bay_quant, train_prev, dataset_name, method_name, results)
+
+            # informative prior
+            alpha_test_wrong = antagonistic_prevalence(train_prev, strength=0.25) * concentration
+            prior_type = 'wrong'
+            run_test(test, alpha_test_wrong, alpha_train, concentration, prior_type, bay_quant, train_prev, dataset_name, method_name, results)
+
+        return results
+
+
+def concat_reports(reports):
+    final_report = {
+        k: list(chain.from_iterable(report[k] for report in reports))
+        for k in reports[0]
+    }
+    df = pd.DataFrame(final_report)
+    return df
+
+
+def error_vs_concentration_plot(df, err='ae', save_path=None):
+    import seaborn as sns
+    import matplotlib.pyplot as plt
+
+    sns.set_theme(style="whitegrid", context="paper")
+
+    fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
+
+    for ax, prior in zip(axes, ['informative', 'wrong']):
+        sub = df[df['prior-type'] == prior]
+
+        sns.lineplot(
+            data=sub,
+            x='concentration',
+            y=err,
+            hue='method_name',
+            marker='o',
+            errorbar='se',  # o 'sd'
+            ax=ax
+        )
+
+        ax.set_xscale('log')
+        ax.set_title(f'Prior: {prior}')
+        ax.set_xlabel('Concentration')
+        ax.set_ylabel('M'+err.upper())
+
+    plt.tight_layout()
+    if save_path is None:
+        plt.show()
+    else:
+        os.makedirs(Path(save_path).parent, exist_ok=True)
+        plt.savefig(save_path)
+
+
+def coverage_vs_concentration_plot(df, save_path=None):
+    import seaborn as sns
+    import matplotlib.pyplot as plt
+
+    fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
+
+    for ax, prior in zip(axes, ['informative', 'wrong']):
+        sub = df[df['prior-type'] == prior]
+
+        sns.lineplot(
+            data=sub,
+            x='concentration',
+            y='coverage',
+            hue='method_name',
+            marker='o',
+            errorbar='se',
+            ax=ax
+        )
+
+        ax.set_xscale('log')
+        ax.set_ylim(0, 1.05)
+        ax.set_title(f'Prior: {prior}')
+        ax.set_xlabel('Concentration')
+        ax.set_ylabel('Coverage')
+
+    plt.tight_layout()
+    if save_path is None:
+        plt.show()
+    else:
+        os.makedirs(Path(save_path).parent, exist_ok=True)
+        plt.savefig(save_path)
+
+
+def amplitude_vs_concentration_plot(df, save_path=None):
+    import seaborn as sns
+    import matplotlib.pyplot as plt
+
+    fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
+
+    for ax, prior in zip(axes, ['informative', 'wrong']):
+        sub = df[df['prior-type'] == prior]
+
+        sns.lineplot(
+            data=sub,
+            x='concentration',
+            y='amplitude',
+            hue='method_name',
+            marker='o',
+            errorbar='se',
+            ax=ax
+        )
+
+        ax.set_xscale('log')
+        ax.set_title(f'Prior: {prior}')
+        ax.set_xlabel('Concentration')
+        ax.set_ylabel('Amplitude')
+
+    plt.tight_layout()
+    if save_path is None:
+        plt.show()
+    else:
+        os.makedirs(Path(save_path).parent, exist_ok=True)
+        plt.savefig(save_path)
+
+
+def coverage_vs_amplitude_plot(df, save_path=None):
+    import seaborn as sns
+    import matplotlib.pyplot as plt
+
+    agg = (
+        df
+        .groupby(['prior-type', 'method_name', 'concentration'])
+        .agg(
+            coverage=('coverage', 'mean'),
+            amplitude=('amplitude', 'mean')
+        )
+        .reset_index()
+    )
+
+    fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
+
+    for ax, prior in zip(axes, ['informative', 'wrong']):
+        sub = agg[agg['prior-type'] == prior]
+
+        sns.scatterplot(
+            data=sub,
+            x='amplitude',
+            y='coverage',
+            hue='method_name',
+            style='concentration',
+            s=80,
+            ax=ax
+        )
+
+        ax.set_ylim(0, 1.05)
+        ax.set_title(f'Prior: {prior}')
+        ax.set_xlabel('Amplitude')
+        ax.set_ylabel('Coverage')
+        ax.axhline(0.95, linestyle='--', color='gray', alpha=0.7)
+
+    plt.tight_layout()
+    if save_path is None:
+        plt.show()
+    else:
+        os.makedirs(Path(save_path).parent, exist_ok=True)
+        plt.savefig(save_path)
+
+
+if __name__ == '__main__':
+    result_dir = Path('./results/prior_effect')
+    selected = select_imbalanced_datasets(10)
+    print(f'selected datasets={selected}')
+    qp.environ['SAMPLE_SIZE'] = multiclass['sample_size']
+    reports = []
+    for data_name in selected:
+        data = multiclass['fetch_fn'](data_name)
+        for method_name, surrogate_quant, hyper_params, bay_constructor in methods():
+            result_path = experiment_path(result_dir, data_name, method_name)
+            hyper_path = experiment_path(result_dir/'hyperparams', data_name, surrogate_quant.__class__.__name__)
+
+            print(f'Launching {method_name} in dataset {data_name}')
+            report = qp.util.pickled_resource(
+                result_path, experiment, data, data_name, surrogate_quant, hyper_params, bay_constructor, method_name, hyper_path
+            )
+            reports.append(report)
+
+    # concat all reports as a dataframe
+    df = concat_reports(reports)
+
+    # for data_name in selected:
+    #     print(data_name)
+    #     df_ = df[df['dataset']==data_name]
+    df_ = df
+    error_vs_concentration_plot(df_, save_path='./plots/prior_effect/error_vs_concentration.pdf')
+    coverage_vs_concentration_plot(df_, save_path='./plots/prior_effect/coverage_vs_concentration.pdf')
+    amplitude_vs_concentration_plot(df_, save_path='./plots/prior_effect/amplitude_vs_concentration.pdf')
+    coverage_vs_amplitude_plot(df_, save_path='./plots/prior_effect/coverage_vs_amplitude.pdf')
+
--- a/BayesianKDEy/single_experiment_debug.py
+++ b/BayesianKDEy/single_experiment_debug.py
@ -1,4 +1,5 @@
 import os
+import pickle
 import warnings
 from os.path import join
 from pathlib import Path
@ -9,7 +10,8 @@ from sklearn.model_selection import GridSearchCV, StratifiedKFold
 from copy import deepcopy as cp
 import quapy as qp
 from BayesianKDEy._bayeisan_kdey import BayesianKDEy
-from BayesianKDEy.full_experiments import experiment, experiment_path, KDEyCLR
+from BayesianKDEy.full_experiments import experiment
+from BayesianKDEy.commons import experiment_path, KDEyCLR
 from build.lib.quapy.data import LabelledCollection
 from quapy.method.aggregative import DistributionMatchingY as DMy, AggregativeQuantifier
 from quapy.method.base import BinaryQuantifier, BaseQuantifier
@ -27,7 +29,7 @@ from scipy.stats import dirichlet
 from collections import defaultdict
 from time import time
 from sklearn.base import clone, BaseEstimator
-
+from temperature_calibration import temp_calibration

 def methods():
    """
@ -50,7 +52,8 @@ def methods():
    # for T in [10, 20, 50, 100., 500]:
    #     yield f'BaKDE-CLR-T{T}', KDEyCLR(LR()), kdey_hyper_clr, lambda hyper: BayesianKDEy(kernel='aitchison', explore='ilr', mcmc_seed=0, temperature=T, num_warmup=3000, num_samples=1000, step_size=.1, **hyper),

-    yield f'BaKDE-emcee', KDEyML(LR()), kdey_hyper, lambda hyper: BayesianKDEy(mcmc_seed=0, num_warmup=100, num_samples=100, step_size=.1, **hyper),
+    # yield f'BaKDE-emcee', KDEyML(LR()), kdey_hyper, lambda hyper: BayesianKDEy(mcmc_seed=0, num_warmup=100, num_samples=100, step_size=.1, **hyper),
+    yield f'BaKDE-numpyro', KDEyML(LR()), kdey_hyper, lambda hyper: BayesianKDEy(mcmc_seed=0, engine='numpyro', **hyper)



@ -68,8 +71,8 @@ if __name__ == '__main__':

    setup = multiclass
    # data_name = 'isolet'
-    # data_name = 'cmc'
-    data_name = 'abalone'
+    # data_name = 'academic-success'
+    data_name = 'poker_hand'

    qp.environ['SAMPLE_SIZE'] = setup['sample_size']
    print(f'dataset={data_name}')
@ -86,6 +89,11 @@ if __name__ == '__main__':
              f'coverage={report["results"]["coverage"].mean():.5f}, '
              f'amplitude={report["results"]["amplitude"].mean():.5f}, ')

+        # best_hyperparams = pickle.load(open(hyper_path, 'rb'))
+        # method = withconf_constructor(best_hyperparams)
+        #
+        # train, val = data.training.split_stratified(train_prop=0.6, random_state=0)
+        # temp_calibration(method, train, val, amplitude_threshold=0.1475, temp_grid=[2., 10., 100., 1000.])#temp_grid=[1., 1.25, 1.5, 1.75, 2.])



--- a/BayesianKDEy/temperature_bandwidth_corr.py
+++ b/BayesianKDEy/temperature_bandwidth_corr.py
@ -0,0 +1,129 @@
+import os.path
+import pickle
+from collections import defaultdict
+from pathlib import Path
+import pandas as pd
+import quapy as qp
+from BayesianKDEy._bayeisan_kdey import BayesianKDEy
+from BayesianKDEy.commons import experiment_path
+from quapy.protocol import UPP
+import numpy as np
+from tqdm import tqdm
+from joblib import Parallel, delayed
+from itertools import product
+
+# is there a correspondence between smaller bandwidths and overconfident likelihoods? if so, a high temperature
+# after calibration might be correlated; this script aims at analyzing this trend
+
+datasets = qp.datasets.UCI_MULTICLASS_DATASETS
+
+def show(results, values):
+    df_res = pd.DataFrame(results)
+    df_mean = (
+        df_res
+        .groupby(['bandwidth', 'temperature'], as_index=False)
+        .mean(numeric_only=True)
+    )
+    pivot = df_mean.pivot(
+        index='bandwidth',
+        columns='temperature',
+        values=values
+    )
+    import matplotlib.pyplot as plt
+
+
+    plt.imshow(pivot, origin='lower', aspect='auto')
+    plt.colorbar(label=values)
+
+    plt.xticks(range(len(pivot.columns)), pivot.columns)
+    plt.yticks(range(len(pivot.index)), pivot.index)
+
+    plt.xlabel('Temperature')
+    plt.ylabel('Bandwidth')
+    plt.title(f'{values} heatmap')
+
+    plt.savefig(f'./plotcorr/{values}.png')
+    plt.cla()
+    plt.clf()
+
+def run_experiment(
+    bandwidth,
+    temperature,
+    train,
+    test,
+    dataset,
+):
+    qp.environ['SAMPLE_SIZE'] = 1000
+    bakde = BayesianKDEy(
+        engine='numpyro',
+        bandwidth=bandwidth,
+        temperature=temperature,
+    )
+    bakde.fit(*train.Xy)
+
+    test_generator = UPP(test, repeats=20, random_state=0)
+
+    rows = []
+
+    for i, (sample, prev) in enumerate(
+        tqdm(
+            test_generator(),
+            desc=f"bw={bandwidth}, T={temperature}",
+            total=test_generator.total(),
+            leave=False,
+        )
+    ):
+        point_estimate, region = bakde.predict_conf(sample)
+
+        rows.append({
+            "bandwidth": bandwidth,
+            "temperature": temperature,
+            "dataset": dataset,
+            "sample": i,
+            "mae": qp.error.mae(prev, point_estimate),
+            "cov": region.coverage(prev),
+            "amp": region.montecarlo_proportion(n_trials=50_000),
+        })
+
+    return rows
+
+
+bandwidths = [0.001, 0.005, 0.01, 0.05, 0.1]
+temperatures = [0.5, 0.75, 1., 2., 5.]
+
+res_dir = './plotcorr/results'
+os.makedirs(res_dir, exist_ok=True)
+
+all_rows = []
+for i, dataset in enumerate(datasets):
+    if dataset in ['letter', 'isolet']: continue
+    res_path = f'{res_dir}/{dataset}.pkl'
+    if os.path.exists(res_path):
+        print(f'loading results from pickle {res_path}')
+        results_data_rows = pickle.load(open(res_path, 'rb'))
+    else:
+        print(f'{dataset=} [complete={i}/{len(datasets)}]')
+        data = qp.datasets.fetch_UCIMulticlassDataset(dataset)
+        train, test = data.train_test
+        jobs = list(product(bandwidths, temperatures))
+        results_data_rows = Parallel(n_jobs=-1,backend="loky")(
+            delayed(run_experiment)(bw, T, train, test, dataset) for bw, T in jobs
+        )
+        pickle.dump(results_data_rows, open(res_path, 'wb'), pickle.HIGHEST_PROTOCOL)
+    all_rows.extend(results_data_rows)
+
+results = defaultdict(list)
+for rows in all_rows:
+    for row in rows:
+        for k, v in row.items():
+            results[k].append(v)
+
+show(results, 'mae')
+show(results, 'cov')
+show(results, 'amp')
+
+
+
+
+
+
--- a/BayesianKDEy/temperature_calibration.py
+++ b/BayesianKDEy/temperature_calibration.py
@ -0,0 +1,116 @@
+from build.lib.quapy.data import LabelledCollection
+from quapy.method.confidence import WithConfidenceABC
+from quapy.protocol import AbstractProtocol
+import numpy as np
+from tqdm import tqdm
+import quapy as qp
+from joblib import Parallel, delayed
+import copy
+
+
+def temp_calibration(method:WithConfidenceABC,
+                     train:LabelledCollection,
+                     val_prot:AbstractProtocol,
+                     temp_grid=[.5, 1., 1.5, 2., 5., 10., 100.],
+                     nominal_coverage=0.95,
+                     amplitude_threshold=1.,
+                     criterion='winkler',
+                     n_jobs=1,
+                     verbose=True):
+
+    assert (amplitude_threshold == 'auto' or (isinstance(amplitude_threshold, float)) and amplitude_threshold <= 1.), \
+        f'wrong value for {amplitude_threshold=}, it must either be "auto" or a float <= 1.0.'
+    assert criterion in {'auto', 'winkler'}, f'unknown {criterion=}; valid ones are auto or winkler'
+
+    if amplitude_threshold=='auto':
+        n_classes = train.n_classes
+        amplitude_threshold = .1/np.log(n_classes+1)
+
+    if isinstance(amplitude_threshold, float) and amplitude_threshold > 0.1:
+        print(f'warning: the {amplitude_threshold=} is too large; this may lead to uninformative regions')
+
+    def _evaluate_temperature_job(job_id, temp):
+        # if verbose:
+        #     print(f'\tstarting exploration with temperature={temp}...')
+
+        local_method = copy.deepcopy(method)
+        local_method.temperature = temp
+
+        coverage = 0
+        amplitudes = []
+        winklers = []
+        # errs = []
+
+        pbar = tqdm(enumerate(val_prot()), position=job_id, total=val_prot.total(), disable=not verbose)
+
+        for i, (sample, prev) in pbar:
+            point_estim, conf_region = local_method.predict_conf(sample)
+
+            if prev in conf_region:
+                coverage += 1
+
+            amplitudes.append(conf_region.montecarlo_proportion(n_trials=50_000))
+            winkler = None
+            if criterion=='winkler':
+                winkler = conf_region.mean_winkler_score(true_prev=prev, alpha=0.005)
+            winklers.append(winkler)
+
+            # errs.append(qp.error.mae(prev, point_estim))
+            pbar.set_description(
+                f'job={job_id} T={temp}: '
+                f'coverage={coverage/(i+1)*100:.2f}% '
+                f'amplitude={np.mean(amplitudes)*100:.4f}% '
+                + f'winkler={np.mean(winklers):.4f}%' if criterion=='winkler' else ''
+            )
+
+        mean_coverage = coverage / val_prot.total()
+        mean_amplitude = np.mean(amplitudes)
+        winkler_mean = np.mean(winklers) if criterion=='winkler' else None
+
+        # if verbose:
+        #     print(
+        #         f'Temperature={temp} got '
+        #         f'coverage={mean_coverage*100:.2f}% '
+        #         f'amplitude={mean_amplitude*100:.2f}% '
+        #         + f'winkler={winkler_mean:.4f}' if criterion == 'winkler' else ''
+        #     )
+
+        return temp, mean_coverage, mean_amplitude, winkler_mean
+
+    temp_grid = sorted(temp_grid)
+    method.fit(*train.Xy)
+
+    raw_results = Parallel(n_jobs=n_jobs, backend="loky")(
+        delayed(_evaluate_temperature_job)(job_id, temp)
+        for job_id, temp in tqdm(enumerate(temp_grid), disable=not verbose)
+    )
+    results = [
+        (temp, cov, amp, wink)
+        for temp, cov, amp, wink in raw_results
+        if amp < amplitude_threshold
+    ]
+
+    chosen_temperature = 1.
+    if len(results) > 0:
+        if criterion=='winkler':
+            # choose min winkler
+            chosen_temperature, ccov, camp, cwink = min(results, key=lambda x: x[3])
+        else:
+            # choose best coverage (regardless of amplitude), i.e., closest to nominal
+            chosen_temperature, ccov, camp, cwink = min(results, key=lambda x: abs(x[1]-nominal_coverage))
+
+    if verbose:
+        print(
+            f'\nChosen_temperature={chosen_temperature:.2f} got '
+            f'coverage={ccov*100:.2f}% '
+            f'amplitude={camp*100:.4f}% '
+            + f'winkler={cwink:.4f}' if criterion=='winkler' else ''
+        )
+
+    return chosen_temperature
+
+
+
+
+
+
--- a/CHANGE_LOG.txt
+++ b/CHANGE_LOG.txt
@ -1,13 +1,21 @@
 Change Log 0.2.1
 -----------------

- Added squared ratio error.
+- Added mechanisms for Temperature Calibration for coverage
+- Added MAPLS and BayesianEMQ
+- Added DirichletProtocol, which allows to generate samples according to a parameterized Dirichlet prior.
 - Improved efficiency of confidence regions coverage functions
 - Added Precise Quantifier to WithConfidence methods (a Bayesian adaptation of HDy)
+- Added Temperature parameter to BayesianCC
 - Improved documentation of confidence regions.
 - Added ReadMe method by Daniel Hopkins and Gary King
 - Internal index in LabelledCollection is now "lazy", and is only constructed if required.
- I have added dist_aitchison and mean_dist_aitchison as a new error evaluation metric
+- Added new error metrics:
+    - dist_aitchison and mean_dist_aitchison as a new error evaluation metric.
+    - squared ratio error.
+- Improved numerical stability of KDEyML through logsumexp; useful for cases with large number of classes, where
+    densities for small bandwidths may become huge.
+

 Change Log 0.2.0
 -----------------
--- a/TODO.txt
+++ b/TODO.txt
@ -1,6 +1,11 @@
 Adapt examples; remaining: example 4-onwards
 not working: 15 (qunfold)

+Unify ConfidenceIntervalsTransformation with ConfidenceEllipseTransformation
+
+Unify functionality of withconfidence methods; the predict_conf has a clear similar structure across all 
+    variants, and should be unified in the super class
+
 Solve the warnings issue; right now there is a warning ignore in method/__init__.py:

 Add 'platt' to calib options in EMQ?
@ -46,14 +51,25 @@ Para quitar el labelledcollection de los métodos:
                - proporción en [0,1]
        - fit_classifier=False:

-
-
+- [TODO] add RLSbench?:
+    - https://arxiv.org/pdf/2302.03020
+    - https://github.com/acmi-lab/RLSbench
+- [TODO] check Table shift
+    - https://proceedings.neurips.cc/paper_files/paper/2023/hash/a76a757ed479a1e6a5f8134bea492f83-Abstract-Datasets_and_Benchmarks.html
+- [TODO] have a look at TorchDrift
+    - https://torchdrift.org/
+- [TODO] have a look at
+    - https://github.com/SeldonIO/alibi-detect/
+- [TODO] check if the KDEyML variant with sumlogexp is slower than the original one, or check whether we can explore
+    an unconstrained space in which the parameter is already the log(prev); maybe also move to cvxq
+- [TODO] why not simplifying the epsilon of RAE? at the end, it is meant to smooth the denominator for avoiding div 0
 - [TODO] document confidence in manuals
 - [TODO] Test the return_type="index" in protocols and finish the "distributing_samples.py" example
 - [TODO] Add EDy (an implementation is available at quantificationlib)
 - [TODO] add ensemble methods SC-MQ, MC-SQ, MC-MQ
 - [TODO] add HistNetQ
- [TODO] add CDE-iteration and Bayes-CDE methods
+- [TODO] add CDE-iteration (https://github.com/Arctickirillas/Rubrication/blob/master/quantification.py#L593 or
+    Schumacher's code) and Bayes-CDE methods
 - [TODO] add Friedman's method and DeBias
 - [TODO] check ignore warning stuff
    check https://docs.python.org/3/library/warnings.html#temporarily-suppressing-warnings
--- a/docs/source/manuals/datasets.md
+++ b/docs/source/manuals/datasets.md
@ -294,7 +294,7 @@ The datasets correspond to a part of the datasets that can be retrieved from the
 * containing at least 1,000 instances
 * can be imported using the Python API. 

-Some statistics about these datasets are displayed below :
+Some statistics about these datasets (after applying default filters) are displayed below :

 | **Dataset** | **classes** | **instances** | **features** | **prevs** | **type** |
 |:------------|:-----------:|:-------------:|:------------:|:----------|:--------:|
--- a/quapy/data/_lequa.py
+++ b/quapy/data/_lequa.py
@ -99,6 +99,9 @@ class SamplesFromDir(AbstractProtocol):
            sample, _ = self.load_fn(os.path.join(self.path_dir, f'{id}.txt'))
            yield sample, prevalence

+    def total(self):
+        return len(self.true_prevs)
+

 class LabelledCollectionsFromDir(AbstractProtocol):

@ -113,6 +116,9 @@ class LabelledCollectionsFromDir(AbstractProtocol):
            lc = LabelledCollection.load(path=collection_path, loader_func=self.load_fn)
            yield lc

+    def total(self):
+        return len(self.true_prevs)
+

 class ResultSubmission:

--- a/quapy/data/datasets.py
+++ b/quapy/data/datasets.py
@ -663,8 +663,8 @@ def fetch_UCIMulticlassLabelledCollection(dataset_name, data_home=None, min_clas
    :param dataset_name: a dataset name
    :param data_home: specify the quapy home directory where the dataset will be dumped (leave empty to use the default
        ~/quay_data/ directory)
-    :param min_class_support: minimum number of istances per class. Classes with fewer instances
-        are discarded (deafult is 100)
+    :param min_class_support: integer or float, the minimum number or proportion of istances per class.
+        Classes with fewer instances are discarded (deafult is 100).
    :param standardize: indicates whether the covariates should be standardized or not (default is True). 
    :param verbose: set to True (default is False) to get information (stats) about the dataset
    :return: a :class:`quapy.data.base.LabelledCollection` instance
@ -673,7 +673,12 @@ def fetch_UCIMulticlassLabelledCollection(dataset_name, data_home=None, min_clas
        f'Name {dataset_name} does not match any known dataset from the ' \
        f'UCI Machine Learning datasets repository (multiclass). ' \
        f'Valid ones are {UCI_MULTICLASS_DATASETS}'
-    
+
+    assert (min_class_support is None or
+            ((isinstance(min_class_support, int) and min_class_support>=0) or
+            (isinstance(min_class_support, float) and 0. <= min_class_support < 1.))), \
+        f'invalid value for {min_class_support=}; expected non negative integer or float in [0,1)'
+
    if data_home is None:
        data_home = get_quapy_home()
    
@ -738,27 +743,44 @@ def fetch_UCIMulticlassLabelledCollection(dataset_name, data_home=None, min_clas
        print(f'Loading UCI Muticlass {dataset_name} ({fullname})')

    file = join(data_home, 'uci_multiclass', dataset_name+'.pkl')
-    
+
+    def dummify_categorical_features(df_features, dataset_id):
+        CATEGORICAL_FEATURES = {
+            158: ["S1", "C1", "S2", "C2", "S3", "C3", "S4", "C4", "S5", "C5"],  # poker_hand
+        }
+
+        categorical = CATEGORICAL_FEATURES.get(dataset_id, [])
+
+        X = df_features.copy()
+        if categorical:
+            X[categorical] = X[categorical].astype("category")
+            X = pd.get_dummies(X, columns=categorical, drop_first=True)
+
+        return X
+
    def download(id, name):
        df = fetch_ucirepo(id=id)

-        df.data.features = pd.get_dummies(df.data.features, drop_first=True)
-        X, y = df.data.features.to_numpy(dtype=np.float64), df.data.targets.to_numpy().squeeze()
+        X_df = dummify_categorical_features(df.data.features, id)
+        X = X_df.to_numpy(dtype=np.float64)
+        y = df.data.targets.to_numpy().squeeze()

-        assert y.ndim == 1, 'more than one y'
+        assert y.ndim == 1, f'error: the dataset {id=} {name=} has more than one target variable'

        classes = np.sort(np.unique(y))
        y = np.searchsorted(classes, y)
        return LabelledCollection(X, y)

-    def filter_classes(data: LabelledCollection, min_ipc):
-        if min_ipc is None:
-            min_ipc = 0
+    def filter_classes(data: LabelledCollection, min_class_support):
+        if min_class_support is None or min_class_support == 0.:
+            return data
+        if isinstance(min_class_support, float):
+            min_class_support = int(len(data) * min_class_support)
        classes = data.classes_
-        # restrict classes to only those with at least min_ipc instances
-        classes = classes[data.counts() >= min_ipc]
+        # restrict classes to only those with at least min_class_support instances
+        classes = classes[data.counts() >= min_class_support]
        # filter X and y keeping only datapoints belonging to valid classes
-        filter_idx = np.in1d(data.y, classes)
+        filter_idx = np.isin(data.y, classes)
        X, y = data.X[filter_idx], data.y[filter_idx]
        # map classes to range(len(classes))
        y = np.searchsorted(classes, y)
--- a/quapy/error.py
+++ b/quapy/error.py
@ -139,7 +139,8 @@ def sre(prevs_true, prevs_hat, prevs_train, eps=0.):
     :math:`\\mathcal{Y}` are the classes of interest
    :param prevs_true: array-like, true prevalence values
    :param prevs_hat: array-like, estimated prevalence values
-    :param prevs_train: array-like, training prevalence values
+    :param prevs_train: array-like with the training prevalence values, or a single prevalence vector when
+        all the prevs_true and prevs_hat are computed on the same training set.
    :param eps: float, for smoothing the prevalence values. It is 0 by default (no smoothing), meaning the
        training prevalence is expected to be >0 everywhere.
    :return: the squared ratio error
@ -149,6 +150,8 @@ def sre(prevs_true, prevs_hat, prevs_train, eps=0.):
    prevs_train = np.asarray(prevs_train)
    assert prevs_true.shape == prevs_hat.shape, f'wrong shape {prevs_true.shape=} vs {prevs_hat.shape=}'
    assert prevs_true.shape[-1]==prevs_train.shape[-1], 'wrong shape for training prevalence'
+    if prevs_true.ndim == 2 and prevs_train.ndim == 1:
+        prevs_train = np.tile(prevs_train, reps=(prevs_true.shape[0], 1))
    if eps>0:
        prevs_true = smooth(prevs_true, eps)
        prevs_hat = smooth(prevs_hat, eps)
@ -157,12 +160,13 @@ def sre(prevs_true, prevs_hat, prevs_train, eps=0.):
    N = prevs_true.shape[-1]
    w = prevs_true / prevs_train
    w_hat = prevs_hat / prevs_train
-    return (1./N) * (w - w_hat)**2.
+    return (1./N) * np.sum((w - w_hat)**2., axis=-1)


 def msre(prevs_true, prevs_hat, prevs_train, eps=0.):
    """
-    Returns the mean across all experiments. See :function:`sre`.
+    Returns the mean :function:`sre` across all experiments.
+
    :param prevs_true: array-like, true prevalence values of shape (n_experiments, n_classes,) or (n_classes,)
    :param prevs_hat: array-like, estimated prevalence values of shape equal to prevs_true
    :param prevs_train: array-like, training prevalence values of  (n_experiments, n_classes,) or (n_classes,)
--- a/quapy/functional.py
+++ b/quapy/functional.py
@ -282,7 +282,7 @@ def l1_norm(prevalences: ArrayLike) -> np.ndarray:
    """
    n_classes = prevalences.shape[-1]
    accum = prevalences.sum(axis=-1, keepdims=True)
-    prevalences = np.true_divide(prevalences, accum, where=accum > 0)
+    prevalences = np.true_divide(prevalences, accum, where=accum > 0, out=None)
    allzeros = accum.flatten() == 0
    if any(allzeros):
        if prevalences.ndim == 1:
--- a/quapy/method/_bayesian.py
+++ b/quapy/method/_bayesian.py
@ -33,7 +33,7 @@ P_TEST_C: str = "P_test(C)"
 P_C_COND_Y: str = "P(C|Y)"


-def model(n_c_unlabeled: np.ndarray, n_y_and_c_labeled: np.ndarray) -> None:
+def model_bayesianCC(n_c_unlabeled: np.ndarray, n_y_and_c_labeled: np.ndarray, temperature:float, alpha: np.ndarray) -> None:
    """
    Defines a probabilistic model in `NumPyro <https://num.pyro.ai/>`_.

@ -47,22 +47,53 @@ def model(n_c_unlabeled: np.ndarray, n_y_and_c_labeled: np.ndarray) -> None:
    K = len(n_c_unlabeled)
    L = len(n_y_labeled)

-    pi_ = numpyro.sample(P_TEST_Y, dist.Dirichlet(jnp.ones(L)))
+    pi_ = numpyro.sample(P_TEST_Y, dist.Dirichlet(jnp.asarray(alpha, dtype=jnp.float32)))
    p_c_cond_y = numpyro.sample(P_C_COND_Y, dist.Dirichlet(jnp.ones(K).repeat(L).reshape(L, K)))

-    with numpyro.plate('plate', L):
-        numpyro.sample('F_yc', dist.Multinomial(n_y_labeled, p_c_cond_y), obs=n_y_and_c_labeled)
+    if temperature==1:
+        # original implementation
+        with numpyro.plate('plate', L):
+            numpyro.sample('F_yc', dist.Multinomial(n_y_labeled, p_c_cond_y), obs=n_y_and_c_labeled)

-    p_c = numpyro.deterministic(P_TEST_C, jnp.einsum("yc,y->c", p_c_cond_y, pi_))
-    numpyro.sample('N_c', dist.Multinomial(jnp.sum(n_c_unlabeled), p_c), obs=n_c_unlabeled)
+        p_c = numpyro.deterministic(P_TEST_C, jnp.einsum("yc,y->c", p_c_cond_y, pi_))
+        numpyro.sample('N_c', dist.Multinomial(jnp.sum(n_c_unlabeled), p_c), obs=n_c_unlabeled) 
+    else:
+        # with temperature modification
+        with numpyro.plate('plate_y', L):
+            logp_F = dist.Multinomial(
+                n_y_labeled,
+                p_c_cond_y
+            ).log_prob(n_y_and_c_labeled)
+
+        numpyro.factor(
+            'F_yc_loglik',
+            jnp.sum(logp_F) / temperature
+        )
+        p_c = numpyro.deterministic(
+            P_TEST_C,
+            jnp.einsum("yc,y->c", p_c_cond_y, pi_)
+        )
+
+        # Likelihood datos no etiquetados
+        logp_N = dist.Multinomial(
+            jnp.sum(n_c_unlabeled),
+            p_c
+        ).log_prob(n_c_unlabeled)
+
+        numpyro.factor(
+            'N_c_loglik',
+            logp_N / temperature
+        )


-def sample_posterior(
-    n_c_unlabeled: np.ndarray,
-    n_y_and_c_labeled: np.ndarray,
-    num_warmup: int,
-    num_samples: int,
-    seed: int = 0,
+def sample_posterior_bayesianCC(
+        n_c_unlabeled: np.ndarray,
+        n_y_and_c_labeled: np.ndarray,
+        num_warmup: int,
+        num_samples: int,
+        alpha: np.ndarray,
+        temperature = 1.,
+        seed: int = 0,
 ) -> dict:
    """
    Samples from the Bayesian quantification model in NumPyro using the
@ -74,17 +105,20 @@ def sample_posterior(
        with entry `(y, c)` being the number of instances labeled as class `y` and predicted as class `c`.
    :param num_warmup: the number of warmup steps.
    :param num_samples: the number of samples to draw.
+    :param alpha: a `np.ndarray` of shape `(n_classes,)` with the alpha parameters of the
+        Dirichlet prior
    :seed: the random seed.
    :return: a `dict` with the samples. The keys are the names of the latent variables.
    """
+
    mcmc = numpyro.infer.MCMC(
-        numpyro.infer.NUTS(model),
+        numpyro.infer.NUTS(model_bayesianCC),
        num_warmup=num_warmup,
        num_samples=num_samples,
        progress_bar=False
    )
    rng_key = jax.random.PRNGKey(seed)
-    mcmc.run(rng_key, n_c_unlabeled=n_c_unlabeled, n_y_and_c_labeled=n_y_and_c_labeled)
+    mcmc.run(rng_key, n_c_unlabeled=n_c_unlabeled, n_y_and_c_labeled=n_y_and_c_labeled, temperature=temperature, alpha=alpha)
    return mcmc.get_samples()


--- a/quapy/method/_kdey.py
+++ b/quapy/method/_kdey.py
@ -5,7 +5,7 @@ from sklearn.neighbors import KernelDensity
 import quapy as qp
 from quapy.method.aggregative import AggregativeSoftQuantifier
 import quapy.functional as F
-
+from scipy.special import logsumexp
 from sklearn.metrics.pairwise import rbf_kernel


@ -29,9 +29,10 @@ class KDEBase:
        assert bandwidth in KDEBase.BANDWIDTH_METHOD or isinstance(bandwidth, float), \
            f'invalid bandwidth, valid ones are {KDEBase.BANDWIDTH_METHOD} or float values'
        if isinstance(bandwidth, float):
-            assert kernel!='gaussian' or (0 < bandwidth < 1),  \
-                ("the bandwidth for a Gaussian kernel in KDEy should be in (0,1), "
-                 "since this method models the unit simplex")
+            #assert kernel!='gaussian' or (0 < bandwidth < 1),  \
+            #    ("the bandwidth for a Gaussian kernel in KDEy should be in (0,1), "
+            #     "since this method models the unit simplex")
+            pass
        return bandwidth

    @classmethod
@ -61,7 +62,7 @@ class KDEBase:

        return KernelDensity(bandwidth=bandwidth).fit(X)

-    def pdf(self, kde, X, kernel):
+    def pdf(self, kde, X, kernel, log_densities=False):
        """
        Wraps the density evalution of scikit-learn's KDE. Scikit-learn returns log-scores (s), so this
        function returns :math:`e^{s}`
@ -76,7 +77,11 @@ class KDEBase:
        elif kernel == 'ilr':
            X = self.ilr_transform(X)

-        return np.exp(kde.score_samples(X))
+        log_density = kde.score_samples(X)
+        if log_densities:
+            return log_density
+        else:
+            return np.exp(log_density)

    def get_mixture_components(self, X, y, classes, bandwidth, kernel):
        """
@ -93,6 +98,8 @@ class KDEBase:
        for cat in classes:
            selX = X[y==cat]
            if selX.size==0:
+                print(f'WARNING: empty class {cat}')
+                raise ValueError(f'WARNING: empty class {cat}')
                selX = [F.uniform_prevalence(len(classes))]

            class_cond_X.append(selX)
@ -169,15 +176,26 @@ class KDEyML(AggregativeSoftQuantifier, KDEBase):
        :return: a vector of class prevalence estimates
        """
        with qp.util.temp_seed(self.random_state):
-            epsilon = 1e-10
+            epsilon = 1e-12
            n_classes = len(self.mix_densities)
-            test_densities = [self.pdf(kde_i, posteriors, self.kernel) for kde_i in self.mix_densities]
+            if n_classes>=30:
+                # new version: improves numerical stability with logsumexp, at the cost of optimization efficiency.
+                # needed if the number of classes is large (approx >= 30) because densities tend to grow exponentially
+                test_log_densities = [self.pdf(kde_i, posteriors, self.kernel, log_densities=True) for kde_i in self.mix_densities]

-            def neg_loglikelihood(prev):
-                # test_mixture_likelihood = sum(prev_i * dens_i for prev_i, dens_i in zip (prev, test_densities))
-                test_mixture_likelihood = prev @ test_densities
-                test_loglikelihood = np.log(test_mixture_likelihood + epsilon)
-                return  -np.sum(test_loglikelihood)
+                def neg_loglikelihood(prev):
+                    prev = np.clip(prev, epsilon, 1.0)
+                    test_loglikelihood = logsumexp(np.log(prev)[:,None] + test_log_densities, axis=0)
+                    return -np.sum(test_loglikelihood)
+            else:
+                # original implementation
+                test_densities = [self.pdf(kde_i, posteriors, self.kernel) for kde_i in self.mix_densities]
+
+                def neg_loglikelihood(prev):
+                   # prev = np.clip(prev, epsilon, 1.0)
+                   test_mixture_likelihood = prev @ test_densities
+                   test_loglikelihood = np.log(test_mixture_likelihood + epsilon)
+                   return -np.sum(test_loglikelihood)

            return F.optim_minimize(neg_loglikelihood, n_classes)

--- a/quapy/method/aggregative.py
+++ b/quapy/method/aggregative.py
@ -163,6 +163,7 @@ class AggregativeQuantifier(BaseQuantifier, ABC):

        :param X: array-like of shape `(n_samples, n_features)`, the training instances
        :param y: array-like of shape `(n_samples,)`, the labels
+        :return: a tuple (predictions, labels)
        """
        self._check_classifier(adapt_if_necessary=self.fit_classifier)

--- a/quapy/method/confidence.py
+++ b/quapy/method/confidence.py
@ -1,3 +1,6 @@
+from numbers import Number
+from typing import Iterable
+
 import numpy as np
 from joblib import Parallel, delayed
 from sklearn.base import BaseEstimator
@ -139,7 +142,7 @@ class WithConfidenceABC(ABC):
    """
    Abstract class for confidence regions.
    """
-    METHODS = ['intervals', 'ellipse', 'ellipse-clr']
+    REGION_TYPE = ['intervals', 'ellipse', 'ellipse-clr', 'ellipse-ilr']

    @abstractmethod
    def predict_conf(self, instances, confidence_level=0.95) -> (np.ndarray, ConfidenceRegionABC):
@ -185,6 +188,8 @@ class WithConfidenceABC(ABC):
            region = ConfidenceEllipseSimplex(prev_estims, confidence_level=confidence_level)
        elif method == 'ellipse-clr':
            region = ConfidenceEllipseCLR(prev_estims, confidence_level=confidence_level)
+        elif method == 'ellipse-ilr':
+            region = ConfidenceEllipseILR(prev_estims, confidence_level=confidence_level)

        if region is None:
            raise NotImplementedError(f'unknown method {method}')
@ -336,6 +341,7 @@ class ConfidenceIntervals(ConfidenceRegionABC):
    """
    def __init__(self, samples, confidence_level=0.95, bonferroni_correction=False):
        assert 0 < confidence_level < 1, f'{confidence_level=} must be in range(0,1)'
+        assert samples.ndim == 2, 'unexpected shape; must be (n_bootstrap_samples, n_classes)'

        samples = np.asarray(samples)

@ -378,6 +384,10 @@ class ConfidenceIntervals(ConfidenceRegionABC):

        return proportion

+    def coverage_soft(self, true_value):
+        within_intervals = np.logical_and(self.I_low <= true_value, true_value <= self.I_high)
+        return np.mean(within_intervals.astype(float))
+
    def __repr__(self):
        return '['+', '.join(f'({low:.4f}, {high:.4f})' for (low,high) in zip(self.I_low, self.I_high))+']'

@ -385,25 +395,30 @@ class ConfidenceIntervals(ConfidenceRegionABC):
    def n_dim(self):
        return len(self.I_low)

-    def winkler_scores(self, true_prev):
+    def winkler_scores(self, true_prev, alpha=None, add_ae=False):
        true_prev = np.asarray(true_prev)
        assert true_prev.ndim == 1, 'unexpected dimensionality for true_prev'
        assert len(true_prev)==self.n_dim, \
            f'unexpected number of dimensions; found {true_prev.ndim}, expected {self.n_dim}'

-        def winkler_score(low, high, true_val, alpha):
+        def winkler_score(low, high, true_val, alpha, center):
            amp = high-low
-            scale_cost = 1./alpha
+            scale_cost = 2./alpha
            cost = np.max([0, low-true_val], axis=0) + np.max([0, true_val-high], axis=0)
-            return amp + scale_cost*cost
+            err = 0
+            if add_ae:
+                err = abs(true_val - center)
+            return amp + scale_cost*cost + err

+        alpha = alpha or self.alpha
        return np.asarray(
-            [winkler_score(low_i, high_i, true_v, self.alpha)
-                for (low_i, high_i, true_v) in zip(self.I_low, self.I_high, true_prev)]
+            [winkler_score(low_i, high_i, true_v, alpha, center)
+                for (low_i, high_i, true_v, center) in zip(self.I_low, self.I_high, true_prev, self.point_estimate())]
        )

-    def mean_winkler_score(self, true_prev):
-        return np.mean(self.winkler_scores(true_prev))
+    def mean_winkler_score(self, true_prev, alpha=None, add_ae=False):
+        return np.mean(self.winkler_scores(true_prev, alpha=alpha, add_ae=add_ae))
+


 class ConfidenceEllipseSimplex(ConfidenceRegionABC):
@ -481,8 +496,8 @@ class ConfidenceEllipseTransformed(ConfidenceRegionABC):
        samples = np.asarray(samples)
        self.transformation = transformation
        Z = self.transformation(samples)
-        # self.mean_ = np.mean(samples, axis=0)
-        self.mean_ = self.transformation.inverse(np.mean(Z, axis=0))
+        self.mean_ = np.mean(samples, axis=0)
+        # self.mean_ = self.transformation.inverse(np.mean(Z, axis=0))
        self.conf_region_z = ConfidenceEllipseSimplex(Z, confidence_level=confidence_level)
        self._samples = samples
        self.alpha = 1.-confidence_level
@ -544,7 +559,99 @@ class ConfidenceEllipseILR(ConfidenceEllipseTransformed):



+class ConfidenceIntervalsTransformed(ConfidenceRegionABC):
+    """
+    Instantiates a Confidence Interval region in a transformed space.

+    :param samples: np.ndarray of shape (n_bootstrap_samples, n_classes)
+    :param confidence_level: float, the confidence level (default 0.95)
+    :param bonferroni_correction: bool (default False), if True, a Bonferroni correction
+        is applied to the significance level (`alpha`) before computing confidence intervals.
+        The correction consists of replacing `alpha` with `alpha/n_classes`. When
+        `n_classes=2` the correction is not applied because there is only one verification test
+        since the other class is constrained. This is not necessarily true for n_classes>2.
+    """
+
+    def __init__(self, samples, transformation: CompositionalTransformation, confidence_level=0.95, bonferroni_correction=False):
+        samples = np.asarray(samples)
+        self.transformation = transformation
+        Z = self.transformation(samples)
+        self.mean_ = np.mean(samples, axis=0)
+        # self.mean_ = self.transformation.inverse(np.mean(Z, axis=0))
+        self.conf_region_z = ConfidenceIntervals(Z, confidence_level=confidence_level, bonferroni_correction=bonferroni_correction)
+        self._samples = samples
+        self.alpha = 1.-confidence_level
+
+    @property
+    def samples(self):
+        return self._samples
+
+    def point_estimate(self):
+        """
+        Returns the point estimate, the center of the ellipse.
+
+        :return: np.ndarray of shape (n_classes,)
+        """
+        # The inverse of the CLR does not coincide with the true mean, because the geometric mean
+        # requires smoothing the prevalence vectors and this affects the softmax (inverse);
+        # return self.clr.inverse(self.mean_) # <- does not coincide
+        return self.mean_
+
+    def coverage(self, true_value):
+        """
+        Checks whether a value, or a sets of values, are contained in the confidence region. The method computes the
+        fraction of these that are contained in the region, if more than one value is passed. If only one value is
+        passed, then it either returns 1.0 or 0.0, for indicating the value is in the region or not, respectively.
+
+        :param true_value: a np.ndarray of shape (n_classes,) or shape (n_values, n_classes,)
+        :return: float in [0,1]
+        """
+        transformed_values = self.transformation(true_value)
+        return self.conf_region_z.coverage(transformed_values)
+
+    def coverage_soft(self, true_value):
+        transformed_values = self.transformation(true_value)
+        return self.conf_region_z.coverage_soft(transformed_values)
+
+    def winkler_scores(self, true_prev, alpha=None, add_ae=False):
+        transformed_values = self.transformation(true_prev)
+        return self.conf_region_z.winkler_scores(transformed_values, alpha=alpha, add_ae=add_ae)
+
+    def mean_winkler_score(self, true_prev, alpha=None, add_ae=False):
+        transformed_values = self.transformation(true_prev)
+        return self.conf_region_z.mean_winkler_score(transformed_values, alpha=alpha, add_ae=add_ae)
+
+
+class ConfidenceIntervalsCLR(ConfidenceIntervalsTransformed):
+    """
+    Instantiates a Confidence Intervals in the Centered-Log Ratio (CLR) space.
+
+    :param samples: np.ndarray of shape (n_bootstrap_samples, n_classes)
+    :param confidence_level: float, the confidence level (default 0.95)
+    :param bonferroni_correction: bool (default False), if True, a Bonferroni correction
+        is applied to the significance level (`alpha`) before computing confidence intervals.
+        The correction consists of replacing `alpha` with `alpha/n_classes`. When
+        `n_classes=2` the correction is not applied because there is only one verification test
+        since the other class is constrained. This is not necessarily true for n_classes>2.
+    """
+    def __init__(self, samples, confidence_level=0.95, bonferroni_correction=False):
+        super().__init__(samples, CLRtransformation(), confidence_level=confidence_level, bonferroni_correction=bonferroni_correction)
+
+
+class ConfidenceIntervalsILR(ConfidenceIntervalsTransformed):
+    """
+    Instantiates a Confidence Intervals in the Isometric-Log Ratio (CLR) space.
+
+    :param samples: np.ndarray of shape (n_bootstrap_samples, n_classes)
+    :param confidence_level: float, the confidence level (default 0.95)
+    :param bonferroni_correction: bool (default False), if True, a Bonferroni correction
+        is applied to the significance level (`alpha`) before computing confidence intervals.
+        The correction consists of replacing `alpha` with `alpha/n_classes`. When
+        `n_classes=2` the correction is not applied because there is only one verification test
+        since the other class is constrained. This is not necessarily true for n_classes>2.
+    """
+    def __init__(self, samples, confidence_level=0.95, bonferroni_correction=False):
+        super().__init__(samples, ILRtransformation(), confidence_level=confidence_level, bonferroni_correction=bonferroni_correction)



@ -606,23 +713,35 @@ class AggregativeBootstrap(WithConfidenceABC, AggregativeQuantifier):
        self.verbose = verbose

    def aggregation_fit(self, classif_predictions, labels):
-        data = LabelledCollection(classif_predictions, labels, classes=self.classes_)
+
        self.quantifiers = []
        if self.n_train_samples==1:
            self.quantifier.aggregation_fit(classif_predictions, labels)
            self.quantifiers.append(self.quantifier)
        else:
-            # model-based bootstrap (only on the aggregative part)
-            n_examples = len(data)
-            full_index = np.arange(n_examples)
-            with qp.util.temp_seed(self.random_state):
-                for i in range(self.n_train_samples):
-                    quantifier = copy.deepcopy(self.quantifier)
-                    index = resample(full_index, n_samples=n_examples)
-                    classif_predictions_i = classif_predictions.sampling_from_index(index)
-                    data_i = data.sampling_from_index(index)
-                    quantifier.aggregation_fit(classif_predictions_i, data_i)
-                    self.quantifiers.append(quantifier)
+            if classif_predictions is None or labels is None:
+                # The entire dataset was consumed for classifier training, implying there is no need for training
+                # an aggregation function. If the bootstrap method was configured to train different aggregators
+                # (i.e., self.n_train_samples>1), then an error is raise. Otherwise, the method ends.
+                if self.n_train_samples > 1:
+                    raise ValueError(
+                        f'The underlying quantifier ({self.quantifier.__class__.__name__}) has consumed, all training '
+                        f'data, meaning the aggregation function needs none, but {self.n_train_samples=} is > 1, which '
+                        f'is inconsistent.'
+                    )
+            else:
+                # model-based bootstrap (only on the aggregative part)
+                data = LabelledCollection(classif_predictions, labels, classes=self.classes_)
+                n_examples = len(data)
+                full_index = np.arange(n_examples)
+                with qp.util.temp_seed(self.random_state):
+                    for i in range(self.n_train_samples):
+                        quantifier = copy.deepcopy(self.quantifier)
+                        index = resample(full_index, n_samples=n_examples)
+                        classif_predictions_i = classif_predictions.sampling_from_index(index)
+                        data_i = data.sampling_from_index(index)
+                        quantifier.aggregation_fit(classif_predictions_i, data_i)
+                        self.quantifiers.append(quantifier)
        return self

    def aggregate(self, classif_predictions: np.ndarray):
@ -648,8 +767,8 @@ class AggregativeBootstrap(WithConfidenceABC, AggregativeQuantifier):
        return prev_estim, conf

    def aggregate_conf(self, classif_predictions: np.ndarray, confidence_level=None):
-        if confidence_level is None:
-            confidence_level = self.confidence_level
+        confidence_level = confidence_level or self.confidence_level
+

        n_samples = classif_predictions.shape[0]
        prevs = []
@ -660,11 +779,8 @@ class AggregativeBootstrap(WithConfidenceABC, AggregativeQuantifier):
                    for i in range(self.n_test_samples)
                )
                prevs.extend(results)
-                # for i in tqdm(range(self.n_test_samples), desc='resampling', total=self.n_test_samples, disable=not self.verbose):
-                #     sample_i = resample(classif_predictions, n_samples=n_samples)
-                #     prev_i = quantifier.aggregate(sample_i)
-                #     prevs.append(prev_i)

+        prevs = np.array(prevs)
        conf = WithConfidenceABC.construct_region(prevs, confidence_level, method=self.region)
        prev_estim = conf.point_estimate()

@ -724,6 +840,8 @@ class BayesianCC(AggregativeCrispQuantifier, WithConfidenceABC):
    :param region: string, set to `intervals` for constructing confidence intervals (default), or to
        `ellipse` for constructing an ellipse in the probability simplex, or to `ellipse-clr` for
        constructing an ellipse in the Centered-Log Ratio (CLR) unconstrained space.
+    :param prior: an array-list with the alpha parameters of a Dirichlet prior, or the string 'uniform'
+        for a uniform, uninformative prior (default)
    """
    def __init__(self,
                 classifier: BaseEstimator=None,
@ -733,12 +851,17 @@ class BayesianCC(AggregativeCrispQuantifier, WithConfidenceABC):
                 num_samples: int = 1_000,
                 mcmc_seed: int = 0,
                 confidence_level: float = 0.95,
-                 region: str = 'intervals'):
+                 region: str = 'intervals',
+                 temperature = 1., 
+                 prior = 'uniform'):

        if num_warmup <= 0:
            raise ValueError(f'parameter {num_warmup=} must be a positive integer')
        if num_samples <= 0:
            raise ValueError(f'parameter {num_samples=} must be a positive integer')
+        assert ((isinstance(prior, str) and prior == 'uniform') or
+                (isinstance(prior, Iterable) and all(isinstance(v, Number) for v in prior))), \
+            f'wrong type for {prior=}; expected "uniform" or an array-like of real values'

        if _bayesian.DEPENDENCIES_INSTALLED is False:
            raise ImportError("Auxiliary dependencies are required. "
@ -750,6 +873,8 @@ class BayesianCC(AggregativeCrispQuantifier, WithConfidenceABC):
        self.mcmc_seed = mcmc_seed
        self.confidence_level = confidence_level
        self.region = region
+        self.temperature = temperature
+        self.prior = prior

        # Array of shape (n_classes, n_predicted_classes,) where entry (y, c) is the number of instances
        # labeled as class y and predicted as class c.
@ -780,11 +905,19 @@ class BayesianCC(AggregativeCrispQuantifier, WithConfidenceABC):

        n_c_unlabeled = F.counts_from_labels(classif_predictions, self.classifier.classes_).astype(float)

-        self._samples = _bayesian.sample_posterior(
+        n_classes = len(self.classifier.classes_)
+        if isinstance(self.prior, str) and self.prior == 'uniform':
+            alpha = np.asarray([1.] * n_classes)
+        else:
+            alpha = np.asarray(self.prior)
+
+        self._samples = _bayesian.sample_posterior_bayesianCC(
            n_c_unlabeled=n_c_unlabeled,
            n_y_and_c_labeled=self._n_and_c_labeled,
            num_warmup=self.num_warmup,
            num_samples=self.num_samples,
+            alpha=alpha,
+            temperature=self.temperature,
            seed=self.mcmc_seed,
        )
        return self._samples
--- a/quapy/protocol.py
+++ b/quapy/protocol.py
@ -171,7 +171,7 @@ class AbstractStochasticSeededProtocol(AbstractProtocol):
        return sample


-class OnLabelledCollectionProtocol:
+class OnLabelledCollectionProtocol(AbstractStochasticSeededProtocol):
    """
    Protocols that generate samples from a :class:`qp.data.LabelledCollection` object.
    """
@ -229,8 +229,17 @@ class OnLabelledCollectionProtocol:
        elif return_type=='index':
            return lambda lc,params:params

+    def sample(self, index):
+        """
+        Realizes the sample given the index of the instances.

-class APP(AbstractStochasticSeededProtocol, OnLabelledCollectionProtocol):
+        :param index: indexes of the instances to select
+        :return: an instance of :class:`qp.data.LabelledCollection`
+        """
+        return self.data.sampling_from_index(index)
+
+
+class APP(OnLabelledCollectionProtocol):
    """
    Implementation of the artificial prevalence protocol (APP).
    The APP consists of exploring a grid of prevalence values containing `n_prevalences` points (e.g.,
@ -311,15 +320,6 @@ class APP(AbstractStochasticSeededProtocol, OnLabelledCollectionProtocol):
            indexes.append(index)
        return indexes

-    def sample(self, index):
-        """
-        Realizes the sample given the index of the instances.
-
-        :param index: indexes of the instances to select
-        :return: an instance of :class:`qp.data.LabelledCollection`
-        """
-        return self.data.sampling_from_index(index)
-
    def total(self):
        """
        Returns the number of samples that will be generated
@ -329,7 +329,7 @@ class APP(AbstractStochasticSeededProtocol, OnLabelledCollectionProtocol):
        return F.num_prevalence_combinations(self.n_prevalences, self.data.n_classes, self.repeats)


-class NPP(AbstractStochasticSeededProtocol, OnLabelledCollectionProtocol):
+class NPP(OnLabelledCollectionProtocol):
    """
    A generator of samples that implements the natural prevalence protocol (NPP). The NPP consists of drawing
    samples uniformly at random, therefore approximately preserving the natural prevalence of the collection.
@ -365,15 +365,6 @@ class NPP(AbstractStochasticSeededProtocol, OnLabelledCollectionProtocol):
            indexes.append(index)
        return indexes

-    def sample(self, index):
-        """
-        Realizes the sample given the index of the instances.
-
-        :param index: indexes of the instances to select
-        :return: an instance of :class:`qp.data.LabelledCollection`
-        """
-        return self.data.sampling_from_index(index)
-
    def total(self):
        """
        Returns the number of samples that will be generated (equals to "repeats")
@ -383,7 +374,7 @@ class NPP(AbstractStochasticSeededProtocol, OnLabelledCollectionProtocol):
        return self.repeats


-class UPP(AbstractStochasticSeededProtocol, OnLabelledCollectionProtocol):
+class UPP(OnLabelledCollectionProtocol):
    """
    A variant of :class:`APP` that, instead of using a grid of equidistant prevalence values,
    relies on the Kraemer algorithm for sampling unit (k-1)-simplex uniformly at random, with
@ -423,14 +414,53 @@ class UPP(AbstractStochasticSeededProtocol, OnLabelledCollectionProtocol):
            indexes.append(index)
        return indexes

-    def sample(self, index):
+    def total(self):
        """
-        Realizes the sample given the index of the instances.
+        Returns the number of samples that will be generated (equals to "repeats")

-        :param index: indexes of the instances to select
-        :return: an instance of :class:`qp.data.LabelledCollection`
+        :return: int
        """
-        return self.data.sampling_from_index(index)
+        return self.repeats
+
+
+class DirichletProtocol(OnLabelledCollectionProtocol):
+    """
+    A protocol that establishes a prior Dirichlet distribution for the prevalence of the samples.
+    Note that providing an all-ones vector of Dirichlet parameters is equivalent to invoking the
+    APP protocol (although each protocol will generate a different series of samples given a
+    fixed seed, since the implementation is different).
+
+    :param data: a `LabelledCollection` from which the samples will be drawn
+    :param alpha: an array-like of shape (n_classes,) with the parameters of the Dirichlet distribution
+    :param sample_size: integer, the number of instances in each sample; if None (default) then it is taken from
+        qp.environ["SAMPLE_SIZE"]. If this is not set, a ValueError exception is raised.
+    :param repeats: the number of samples to generate. Default is 100.
+    :param random_state: allows replicating samples across runs (default 0, meaning that the sequence of samples
+        will be the same every time the protocol is called)
+    :param return_type: set to "sample_prev" (default) to get the pairs of (sample, prevalence) at each iteration, or
+        to "labelled_collection" to get instead instances of LabelledCollection
+    """
+
+    def __init__(self, data: LabelledCollection, alpha, sample_size=None, repeats=100, random_state=0,
+                 return_type='sample_prev'):
+        assert len(alpha)>1, 'wrong parameters: alpha must be an array-like of shape (n_classes,)'
+        super(DirichletProtocol, self).__init__(random_state)
+        self.data = data
+        self.alpha = alpha
+        self.sample_size = qp._get_sample_size(sample_size)
+        self.repeats = repeats
+        self.random_state = random_state
+        self.collator = OnLabelledCollectionProtocol.get_collator(return_type)
+
+    def samples_parameters(self):
+        """
+        Return all the necessary parameters to replicate the samples.
+
+        :return: a list of indexes that realize the sampling
+        """
+        prevs = np.random.dirichlet(self.alpha, size=self.repeats)
+        indexes = [self.data.sampling_index(self.sample_size, *prevs_i) for prevs_i in prevs]
+        return indexes

    def total(self):
        """
@ -450,7 +480,7 @@ class DomainMixer(AbstractStochasticSeededProtocol):
    :param sample_size: integer, the number of instances in each sample; if None (default) then it is taken from
        qp.environ["SAMPLE_SIZE"]. If this is not set, a ValueError exception is raised.
    :param repeats: int, number of samples to draw for every mixture rate
-    :param prevalence: the prevalence to preserv along the mixtures. If specified, should be an array containing
+    :param prevalence: the prevalence to preserve along the mixtures. If specified, should be an array containing
        one prevalence value (positive float) for each class and summing up to one. If not specified, the prevalence
        will be taken from the domain A (default).
    :param mixture_points: an integer indicating the number of points to take from a linear scale (e.g., 21 will
--- a/quapy/tests/test_methods.py
+++ b/quapy/tests/test_methods.py
@ -3,6 +3,7 @@ import unittest

 from sklearn.linear_model import LogisticRegression

+import BayesianKDEy.commons
 import quapy as qp
 from quapy.method.aggregative import ACC
 from quapy.method.meta import Ensemble
@ -47,7 +48,7 @@ class TestMethods(unittest.TestCase):
            learner.fit(*dataset.training.Xy)

            for model in AGGREGATIVE_METHODS:
-                if not dataset.binary and model in BINARY_METHODS:
+                if not BayesianKDEy.commons.binary and model in BINARY_METHODS:
                    print(f'skipping the test of binary model {model.__name__} on multiclass dataset {dataset.name}')
                    continue

@ -61,7 +62,7 @@ class TestMethods(unittest.TestCase):
        for dataset in TestMethods.datasets:

            for model in NON_AGGREGATIVE_METHODS:
-                if not dataset.binary and model in BINARY_METHODS:
+                if not BayesianKDEy.commons.binary and model in BINARY_METHODS:
                    print(f'skipping the test of binary model {model.__name__} on multiclass dataset {dataset.name}')
                    continue

@ -76,7 +77,7 @@ class TestMethods(unittest.TestCase):

        base_quantifier = ACC(LogisticRegression())
        for dataset, policy in itertools.product(TestMethods.datasets, Ensemble.VALID_POLICIES):
-            if not dataset.binary and policy == 'ds':
+            if not BayesianKDEy.commons.binary and policy == 'ds':
                print(f'skipping the test of binary policy ds on non-binary dataset {dataset}')
                continue

--- a/quapy/tests/test_protocols.py
+++ b/quapy/tests/test_protocols.py
@ -2,6 +2,7 @@ import unittest
 import numpy as np

 import quapy.functional
+from protocol import DirichletProtocol
 from quapy.data import LabelledCollection
 from quapy.protocol import APP, NPP, UPP, DomainMixer, AbstractStochasticSeededProtocol

@ -138,6 +139,31 @@ class TestProtocols(unittest.TestCase):

        self.assertNotEqual(samples1, samples2)

+    def test_dirichlet_replicate(self):
+        data = mock_labelled_collection()
+        p = DirichletProtocol(data, alpha=[1,2,3,4], sample_size=5, repeats=10, random_state=42)
+
+        samples1 = samples_to_str(p)
+        samples2 = samples_to_str(p)
+
+        self.assertEqual(samples1, samples2)
+
+        p = DirichletProtocol(data, alpha=[1,2,3,4], sample_size=5, repeats=10, random_state=0)
+
+        samples1 = samples_to_str(p)
+        samples2 = samples_to_str(p)
+
+        self.assertEqual(samples1, samples2)
+
+    def test_dirichlet_not_replicate(self):
+        data = mock_labelled_collection()
+        p = DirichletProtocol(data, alpha=[1,2,3,4], sample_size=5, repeats=10, random_state=None)
+
+        samples1 = samples_to_str(p)
+        samples2 = samples_to_str(p)
+
+        self.assertNotEqual(samples1, samples2)
+
    def test_covariate_shift_replicate(self):
        dataA = mock_labelled_collection('domA')
        dataB = mock_labelled_collection('domB')
--- a/1
+++ b/1
@ -0,0 +1 @@
+Subproject commit 9d433e3e35b4d111a3914a1e7d3257a8fcf24a9b
Author	SHA1	Message	Date
Alejandro Moreo Fernandez	bb5763da76	broken submodule fix	2026-02-12 13:14:59 +01:00
Alejandro Moreo Fernandez	81472b9d25	trying pca reduction for cifar 100	2026-01-31 00:11:19 +01:00
Alejandro Moreo Fernandez	877bfb2b18	log densities for kde	2026-01-27 17:58:53 +01:00
Alejandro Moreo Fernandez	839496fb8e	added mnist	2026-01-23 18:05:23 +01:00
Alejandro Moreo Fernandez	a6336218e2	adding lequa datasets	2026-01-20 10:53:16 +01:00
Alejandro Moreo Fernandez	9ae65ab09a	mapls working	2026-01-15 09:41:37 +01:00
Alejandro Moreo Fernandez	a511f577c9	reducing strength for antagonic prior	2026-01-13 18:19:09 +01:00
Alejandro Moreo Fernandez	93c33fe237	adding plots to prior test	2026-01-13 17:29:40 +01:00
Alejandro Moreo Fernandez	300b8e6423	bugfix	2026-01-13 15:01:02 +01:00
Alejandro Moreo Fernandez	934b09fa66	bugfix	2026-01-13 14:36:20 +01:00
Alejandro Moreo Fernandez	ce51766944	bayesianCC with custom prior	2026-01-13 13:08:07 +01:00
Alejandro Moreo Fernandez	47dc6acc75	bayesianCC with custom prior	2026-01-13 13:05:42 +01:00
Alejandro Moreo Fernandez	724e1b13a0	bugfix	2026-01-13 12:32:28 +01:00
Alejandro Moreo Fernandez	ca981836b4	some refactor and prior effect script	2026-01-13 12:25:37 +01:00
Alejandro Moreo Fernandez	de7c04a380	some refactor and prior effect script	2026-01-13 12:20:02 +01:00
Alejandro Moreo Fernandez	323cd383f9	some refactor and prior effect script	2026-01-13 12:19:05 +01:00
Alejandro Moreo Fernandez	4b7fc77e90	improved plots	2026-01-12 15:51:40 +01:00
Alejandro Moreo Fernandez	17c17ffd0f	bugfix	2026-01-11 19:07:01 +01:00
Alejandro Moreo Fernandez	c6fb46cf70	added prior	2026-01-11 19:00:13 +01:00
Alejandro Moreo Fernandez	ae9503a43b	calibrating temperature and coming back to Aitchison kernel	2026-01-09 17:20:05 +01:00
Alejandro Moreo Fernandez	e33b291357	adding nuts to kdey	2026-01-07 18:21:46 +01:00
				`@ -0,0 +1 @@`
				`Subproject commit 9d433e3e35b4d111a3914a1e7d3257a8fcf24a9b`