Encoding a parallel corpus: The TRIS corpus experience

Carla Parra Escartín

doi:10.15845/bells.v3i1.362

Authors

Carla Parra Escartín University of Bergen

Abstract

This paper focuses on one of the many aspects to be taken into account when developing a new corpus: its encoding. During the compilation of the corpus of the Technical Regulations Information System (the TRIS corpus), several encoding issues arose. The author discusses the encoding options available, the decisions taken, and the strategies followed. The paper also reviews standards for character encoding and corpus markup and explains how these were integrated into the compilation of the TRIS corpus.

Published

2013-04-10

How to Cite

Parra Escartín, Carla. 2013. “Encoding a Parallel Corpus: The TRIS Corpus Experience”. Bergen Language and Linguistics Studies 3 (1). https://doi.org/10.15845/bells.v3i1.362.

Issue

Vol. 3 No. 1 (2013): The many facets of corpus linguistics in Bergen - in honour of Knut Hofland

Section

Articles

License

This work is licensed under a Creative Commons Attribution 3.0 Unported License.