QuaPy/examples/lequa2022_experiments.py

import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression
import quapy as qp
import quapy.functional as F
from quapy.data.datasets import LEQUA2022_SAMPLE_SIZE, fetch_lequa2022
from quapy.evaluation import evaluation_report
from quapy.method.aggregative import EMQ
from quapy.model_selection import GridSearchQ
import pandas as pd


task = 'T1A'

qp.environ['SAMPLE_SIZE'] = LEQUA2022_SAMPLE_SIZE[task]
training, val_generator, test_generator = fetch_lequa2022(task=task)

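# define the quantifier: EMQ operates on posterior probabilities, so the classifier is wrapped in a calibrator
learner = CalibratedClassifierCV(LogisticRegression())
quantifier = EMQ(learner=learner)

# model selection: explore the hyperparameter grid on the validation samples
# note: 'C' and 'class_weight' are hyperparameters of LogisticRegression; because the learner is wrapped in
# CalibratedClassifierCV, the keys may need a nested prefix, depending on the QuaPy and scikit-learn versions in use
param_grid = {'C': np.logspace(-3, 3, 7), 'class_weight': ['balanced', None]}
model_selection = GridSearchQ(quantifier, param_grid, protocol=val_generator, n_jobs=-1, refit=False, verbose=True)
quantifier = model_selection.fit(training)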

# evaluation
report = evaluation_report(quantifier, protocol=test_generator, error_metrics=['mae', 'mrae', 'mkld'], verbose=True)

# printing results
pd.set_option('display.expand_frame_repr', False)
report['estim-prev'] = report['estim-prev'].map(F.strprev)
print(report)

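print('Averaged values:')
# the report also holds non-numeric columns (e.g., the prevalence strings), so average only the numeric ones
print(report.mean(numeric_only=True))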

# Commit history:
# 2022-11-04 (15:04:36, 15:06:08, 15:15:12 +01:00): full example of training, model selection, and evaluation using the lequa2022 dataset with the new protocols
# 2022-12-12 (09:34:09 +01:00): adding the possibility to estimate the training prevalence, instead of using the true training prevalence, as a starting point in emq