Using unsupervised clustering techniques, this study explores sentence alignment patterns in a parallel corpus of Norwegian source texts and Spanish translations, the NSPC (Hareide and Hofland 2012). The results show that three sentence alignment strategies dominate: one-to-one correspondence, merging two sentences into one, and removing sentences altogether (omission). These strategies are intricately correlated with the variables translator, author, and genre. However, we show how visualization techniques for cluster analyses make it possible to tease apart these interactions and to assess their relative importance. Our results indicate that non-fiction texts allow translators more freedom in the treatment of sentences than do texts written by professional authors of fiction. The style of the author appears to play only a secondary role, but it is especially important in fiction.
Keywords: corpus-based translation, cluster analysis, parallel corpora, corpus alignment, unidirectional bilingual corpus
Copyright (c) 2013 Gard Buen Jenset, Lidun Hareide
This work is licensed under a Creative Commons Attribution 3.0 International License.