VANT : A Visual Analytics System for Refining Parallel Corpora in Neural Machine Translation
Cited 1 time in Web of Science
Cited 2 times in Scopus
- Authors
- Issue Date
- 2022-04
- Publisher
- IEEE COMPUTER SOC
- Citation
- 2022 IEEE 15TH PACIFIC VISUALIZATION SYMPOSIUM (PACIFICVIS 2022), pp.181-185
- Abstract
- The quality of the parallel corpora used to train a Neural Machine Translation (NMT) model can critically influence the model's performance. Various approaches for refining parallel corpora have been introduced, but there is still much room for improvement, such as in the efficiency and quality of refinement. We introduce VANT, a novel visual analytics system for refining parallel corpora used in training an NMT model. Our system helps users readily detect and filter noisy parallel corpora by (1) aiding the quality estimation of individual sentence pairs within the corpora by providing diverse quality metrics (e.g., cosine similarity, BLEU, length ratio) and (2) allowing users to visually examine and manage the corpora based on the pre-computed metric scores. The system's effectiveness and usefulness are demonstrated through a qualitative user study on real-world datasets with eight participants, including four domain experts.
- ISSN
- 2165-8765