This page lists open-source frameworks, datasets, and benchmarks developed or co-authored by our group.

💻 Frameworks & Toolkits

  • GAN-BERT – Few-shot adversarial learning on Transformers (ACL 2020).
  • GAN-BERT-PyTorch – PyTorch/Hugging Face port of GAN-BERT (ACL 2020).
  • GroundedSRL4HRI – Synthetic multimodal dataset and framework for Grounded Semantic Role Labeling in Human–Robot Interaction (EMNLP 2025).
  • Sanskrit Voyager – Unified web platform for interactive reading and linguistic analysis of Sanskrit texts (EMNLP 2025 System Demonstrations).
  • UniTor@BioASQ – Biomedical QA system and benchmark for the BioASQ@CLEF 2025 challenge (CLEF 2025).
  • WikiGame-LLM-Eval – Reproducible pipeline for evaluating LLMs on Wikipedia graph navigation (CLiC-it 2025).
  • MM-IGLU-Dialogues – Multimodal benchmark for dialogue planning in 3D grounded environments (ACL 2025).
  • BacKGen – Background Knowledge Generator (Analogy-Angle@ACL 2025).
  • dats – Data augmentation for NLP (NAACL 2022).
  • MT-GANBERT – Multi-task + GAN-BERT for sustainable NLP (NL4AI 2021).
  • KeLP – Kernel-based Learning Platform for scalable ML (JMLR 2017).
  • EthicalNN – Ethics by Design framework in PyTorch (AIxIA 2022).
  • GrUT – Semantic parsing for Human–Robot Interaction (AIxIA 2022).
  • LU4R – Adaptive spoken language understanding for robots (IJCAI 2016).
  • ACLPUB2 – ACL proceedings generation tool.

📚 Datasets & Benchmarks

  • ExtremITA – Instruction-tuned LLM for Italian (EVALITA 2023).
  • MM-IGLU – Multimodal grounded understanding benchmark (COLING 2024).
  • MM-IGLU-IT – Italian benchmark for grounded instruction following (AIxIA 2024).
  • U-DepPLLaMA – Universal dependency parsing with LLMs (IJCoL 2024).
  • FEVER-it – Italian fact-checking dataset & pipeline (CLiC-it 2024).
  • HuRIC – Human–Robot Interaction Corpus 2.0 (AI Journal 2020).
  • GQA-it – 1M+ Visual Question Answering pairs in Italian (CLiC-it 2021).
  • mscoco-it – 600K captions for Italian Image Captioning (IJCoL 2019).
  • msr-vtt-it – 200K Italian video caption pairs (IJCoL 2019).
  • SQuAD-it – 60K Q/A triples for reading comprehension (AIxIA 2018).
  • ABSITA – Tourism opinion mining dataset (EVALITA 2018).
  • SENTIPOLC – 10K annotated Italian tweets (EVALITA 2016).

🎓 Teaching Material