# Generalized Funnelling (gFun)

## Requirements

```commandline
transformers==2.11.0
pandas==0.25.3
numpy==1.17.4
joblib==0.14.0
tqdm==4.50.2
pytorch_lightning==1.1.2
torch==1.3.1
nltk==3.4.5
scipy==1.3.3
rdflib==4.2.2
torchtext==0.4.0
scikit_learn==0.24.1
```

## Usage

```commandline
usage: main.py [-h] [-o CSV_DIR] [-x] [-w] [-m] [-b] [-g] [-c] [-n NEPOCHS]
               [-j N_JOBS] [--muse_dir MUSE_DIR] [--gru_wce]
               [--gru_dir GRU_DIR] [--bert_dir BERT_DIR] [--gpus GPUS]
               dataset

Run generalized funnelling, A. Moreo, A. Pedrotti and F. Sebastiani (2020).

positional arguments:
  dataset               Path to the dataset

optional arguments:
  -h, --help            show this help message and exit
  -o, --output          result file (default ../csv_logs/gfun/gfun_results.csv)
  -x, --post_embedder   deploy posterior probabilities embedder to compute document embeddings
  -w, --wce_embedder    deploy (supervised) Word-Class embedder to compute document embeddings
  -m, --muse_embedder   deploy (pretrained) MUSE embedder to compute document embeddings
  -b, --bert_embedder   deploy multilingual Bert to compute document embeddings
  -g, --gru_embedder    deploy a GRU in order to compute document embeddings
  -c, --c_optimize      optimize SVMs C hyperparameter
  -j, --n_jobs          number of parallel jobs, default is -1 i.e., all
  --nepochs_rnn         number of max epochs to train Recurrent embedder (i.e., -g), default 150
  --nepochs_bert        number of max epochs to train Bert model (i.e., -b), default 10
  --patience_rnn        set early stop patience for the RecurrentGen, default 25
  --patience_bert       set early stop patience for the BertGen, default 5
  --batch_rnn           set batchsize for the RecurrentGen, default 64
  --batch_bert          set batchsize for the BertGen, default 4
  --muse_dir            path to the MUSE polylingual word embeddings (default ../embeddings)
  --gru_wce             deploy WCE embedding as embedding layer of the GRU View Generator
  --rnn_dir             set the path to a pretrained RNN model (i.e., -g view generator)
  --bert_dir            set the path to a pretrained mBERT model (i.e., -b view generator)
  --gpus                specifies how many GPUs to use per node
```
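
As an illustration of how the flags above combine, the following is a minimal sketch that assembles a command line for `main.py`. It is not part of gFun: the helper name `build_gfun_command` is hypothetical, `path/to/dataset` is a placeholder, and it assumes `python` is on the `PATH` and that the command is run from the directory containing `main.py`. Only flags listed in the help text are used.

```python
import shlex


def build_gfun_command(dataset, post=False, wce=False, muse=False,
                       optimize_c=False, n_jobs=None, muse_dir=None):
    """Return the argv list for a main.py run with the selected view generators.

    Hypothetical helper for illustration; it only mirrors flags from the help text.
    """
    cmd = ["python", "main.py", dataset]
    # Boolean switches: each one enables a view generator or SVM C optimization.
    switches = [("-x", post), ("-w", wce), ("-m", muse), ("-c", optimize_c)]
    cmd += [flag for flag, enabled in switches if enabled]
    # Options that take a value are added only when explicitly set,
    # so the defaults of main.py apply otherwise.
    if n_jobs is not None:
        cmd += ["-j", str(n_jobs)]
    if muse_dir is not None:
        cmd += ["--muse_dir", muse_dir]
    return cmd


if __name__ == "__main__":
    # Placeholder dataset path; posterior-probability and MUSE view generators.
    argv = build_gfun_command("path/to/dataset", post=True, muse=True,
                              n_jobs=4, muse_dir="../embeddings")
    # Print a shell-ready command line; pass argv to subprocess.run to execute it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The returned list can be handed to `subprocess.run` unchanged, which avoids shell-quoting issues when the dataset or embedding paths contain spaces.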