VANT: A Visual Analytics System for Refining Parallel Corpora in Neural Machine Translation

Cited 1 time in Web of Science; cited 2 times in Scopus
Authors

Park, Sebeom; Lee, Soohyun; Kim, Youngtaek; Jeon, Hyeon; Jung, Seokweon; Bok, Jinwook; Seo, Jinwook

Issue Date
2022-04
Publisher
IEEE COMPUTER SOC
Citation
2022 IEEE 15TH PACIFIC VISUALIZATION SYMPOSIUM (PACIFICVIS 2022), pp.181-185
Abstract
The quality of the parallel corpora used to train a Neural Machine Translation (NMT) model can critically influence the model's performance. Various approaches for refining parallel corpora have been introduced, but there is still much room for improvement, such as in the efficiency and quality of refinement. We introduce VANT, a novel visual analytics system for refining parallel corpora used in training an NMT model. Our system helps users readily detect and filter noisy parallel corpora by (1) aiding the quality estimation of individual sentence pairs within the corpora by providing diverse quality metrics (e.g., cosine similarity, BLEU, length ratio) and (2) allowing users to visually examine and manage the corpora based on the pre-computed metric scores. Our system's effectiveness and usefulness are demonstrated through a qualitative user study on real-world datasets with eight participants, including four domain experts.
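
As a rough illustration of the kind of per-pair quality metrics the abstract mentions, the following minimal Python sketch computes a length ratio and a cosine similarity for each sentence pair and keeps only the pairs that pass both thresholds. It is not VANT's implementation. The function names, the whitespace tokenization, the threshold values, and the assumption that sentence embeddings are supplied by some external multilingual encoder are all hypothetical choices made for this example. BLEU is left out to keep the sketch short.

```python
import math
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]


def length_ratio(source: str, target: str) -> float:
    """Shorter-to-longer ratio of whitespace token counts, in [0, 1]."""
    n_src, n_tgt = len(source.split()), len(target.split())
    if n_src == 0 or n_tgt == 0:
        return 0.0  # an empty side is treated as the worst possible ratio
    return min(n_src, n_tgt) / max(n_src, n_tgt)


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_pairs(
    pairs: List[Pair],
    embeddings: List[Tuple[Sequence[float], Sequence[float]]],
    min_ratio: float = 0.5,   # hypothetical threshold
    min_cosine: float = 0.7,  # hypothetical threshold
) -> List[Pair]:
    """Keep sentence pairs whose metrics meet both thresholds.

    `embeddings[i]` holds the (source, target) sentence vectors for `pairs[i]`,
    assumed to come from an external multilingual encoder.
    """
    kept = []
    for (src, tgt), (e_src, e_tgt) in zip(pairs, embeddings):
        if (length_ratio(src, tgt) >= min_ratio
                and cosine_similarity(e_src, e_tgt) >= min_cosine):
            kept.append((src, tgt))
    return kept
```

In an interactive setting such as the one the abstract describes, fixed thresholds like these would presumably be replaced by values the user adjusts while inspecting the score distributions.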
ISSN
2165-8765
URI
https://hdl.handle.net/10371/185826
DOI
https://doi.org/10.1109/PacificVis53943.2022.00029
Files in This Item:
There are no files associated with this item.

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.