examples and readme update

This commit is contained in:
Andrea Pedrotti 2024-03-12 13:52:34 +01:00
parent ba609ee459
commit b322435423
3 changed files with 8 additions and 3 deletions

View File

@ -1,3 +1,7 @@
#!bin/bash
# carica il dataset da examples/dataset/sample-dataset.csv
# le predicions vengono salvate in exampels/results/sample-dataset_<timestamp>.csv
# --category_map specifica il path del file di mapping. Nel file di output sono salvate i nomi delle categorie predette.
python infer.py --datapath examples/dataset/sample-dataset.csv --outdir examples/results --category_map examples/dataset/dataset-mapping.csv

View File

@ -103,7 +103,7 @@ if __name__ == "__main__":
parser = ArgumentParser()
parser.add_argument("--datapath", required=True, type=str, help="path to csv file containing the documents to be classified")
parser.add_argument("--outdir", type=str, default="results/inference-preds", help="path to store csv file containing gfun predictions")
parser.add_argument("--category_map", type=str, default="models/category_mappers/rai-mapping.csv", help="path to csv file containing the mapping from label name to label id [str: id]")
parser.add_argument("--category_map", type=str, default=None, help="path to csv file containing the mapping from label name to label id [str: id]")
parser.add_argument("--nlabels", type=int, default=28)
parser.add_argument("--muse_dir", type=str, default="embeddings", help="path to muse embeddings")
parser.add_argument("--trained_gfun", type=str, default="rai_pmt_mean_231029", help="name of the trained gfun instance")

View File

@ -9,7 +9,7 @@ mkdir resources
# optional
mkdir models/category_mappers
```
In `models`, scaricare i modelli pre-trained condivisi. La directory `models` contiene 4 subdir `metaclassifier, vgfs, vectorizer, category_mappers`.
In `models`, scaricare i modelli pre-trained condivisi. La directory `models` contiene 3 subdir `metaclassifier, vgfs, vectorizer`.
In `resources` estrarre i muse-embeddings.
In `models/category_mappers` estrarre il file csv che contiene il mapping da category label a category id (opzionale).
@ -22,13 +22,14 @@ python infer.py --datapth <path/to/the/csv_file.csv>
I risultati saranno salvati di default nella cartella `results/inference-preds`, in un file csv denominato a seconda input file specificato in `--datapath` + il timetamp della run (e.g., `<csv_file>_<240312_13345>.csv`) (è possibile cambiare directory di output tramite `--outdir <my/output/dir/>`)
NB: per ottenere i nomi (stringhe) delle classi predette è necessario specificare il path del file csv che contiene il mapping class id -> class label (argomento `--category_map`).
```
optional arguments:
-h, --help show this help message and exit
--datapath path to csv file containing the documents to be classified
--outdir path to store csv file containing gfun predictions (default=results/inference-preds)
--category_map path to csv file containing the mapping from label name to label id [str: id] (default=models/category_mappers/rai-mapping.csv)
--category_map path to csv file containing the mapping from label name to label id [str: id] (default=None)
--nlabels number of target classes defined in the annotation schema (default=28)
--muse_dir path to muse embeddings
--trained_gfun name of the trained gfun instance