colibrita-semeval-2014.tar.bz2 | 1.99GB |
Type: Dataset
Tags: machine translation, translation assistance, nlp, computational linguistics, semeval
Bibtex:
@article{,
  title = {The Role of Context information in L2 Translation Assistance (Data Set)},
  journal = {Language Resources and Evaluation (submitted, pending acceptation)},
  author = {Maarten van Gompel and Antal van den Bosch},
  year = {2014},
  url = {},
  license = {CC-SA},
  abstract = {We investigate to what extent L2 context information can aid the translation of L1 fragments in an L2 context, and what techniques are most suitable. The task is framed in the context of second language learning, where translation assistance systems enable language learners to write in their target language whilst allowing them to fall back to their native language in case the correct word or expression is not known. These code switches are subsequently translated to L2 given the L2 context. We focus on two approaches: a classifier-based approach, and one rooted in Statistical Machine Translation. Various mixtures between the two are investigated. In doing so, we provide valuable insights on how to best tackle the task presented at SemEval 2014. We zoom in on the role of context information (in L2) and of the L2 language model, and investigate the incorporation of memory-based classifiers as a means of better disambiguating the L1 fragments. We find Statistical Machine Translation to be the most adequate solution to the problem, and show how it can be applied with a cross-lingual context. Integrating classifiers in such a framework may lead to small improvements in translation quality, but there is considerable overlap with the benefits of the L2 language model.}
}