Guidelines for the Linguistic Data Consortium’s
Language Translation Evaluation Project
for Translation of Chinese to English
2) The files, or segments of files the team translated
Goal
Our goal is to support the development of automatic means
of evaluating translation quality. To this end, we need a
number of different translations of the same source material.

The Translation Team
A single translation “team” must be used to translate all of
the source language data. This team must consist of at least
two members:
1) A Chinese-dominant bilingual who does the initial
translation
2) An English-dominant bilingual who proofreads
and edits the output of the first translator.
The team may use the following aids:
1) An automatic machine translation system
2) A translation memory system.
The translation team must not change during translation,
and the team must be fully documented. Documentation
includes:
1) The name (or pseudonym), native language,
second languages, age and years of translation
experience of the translator(s)
2) The order of processing (i.e. the name of the
person who performs the first pass, second pass,
etc.)

Chinese Source Text
Each story has SGML tags added at the beginning to aid
automatic processing, as follows:
<DOC docid=XXXXXXXXXX>
--Segment 1--
{Chinese text to be translated}
--Segment 2--
{Chinese text to be translated}
--Segment 3--
{Chinese text to be translated}
Each story is divided at sentence boundaries (originally
marked with a Chinese period, except for the headline). A
story is organized into records of Chinese text separated by
blank lines: each sentence is preceded and followed by a
blank line, and each segment has a number associated with it.
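
For teams that process the source files automatically, the layout described in these guidelines (a <DOC docid=...> tag followed by numbered "--Segment N--" records separated by blank lines) can be read with a few lines of code. The following is only a minimal sketch in Python, not part of the official guidelines. The function name parse_story and the sample text are hypothetical, and the sketch assumes that each segment marker stands alone on its own line and that any other tag-only line can be ignored.

```python
import re

# Patterns for the two kinds of marker lines shown in the guidelines.
DOC_RE = re.compile(r"^<DOC\s+docid=([^>\s]+)>$")
SEGMENT_RE = re.compile(r"^--Segment (\d+)--$")


def parse_story(text):
    """Return (docid, {segment_number: segment_text}) for one story.

    Hypothetical helper: blank lines are treated purely as separators,
    and every non-marker line is attached to the most recent segment.
    """
    docid = None
    segments = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines only separate records
        doc_match = DOC_RE.match(line)
        if doc_match:
            docid = doc_match.group(1)
            continue
        seg_match = SEGMENT_RE.match(line)
        if seg_match:
            current = int(seg_match.group(1))
            segments[current] = []
            continue
        if line.startswith("<") and line.endswith(">"):
            continue  # assumption: skip any other tag-only line
        if current is not None:
            segments[current].append(line)
    # Join multi-line segments with newlines; no spaces are inserted.
    return docid, {num: "\n".join(lines) for num, lines in segments.items()}


# Made-up sample story, used only to exercise the sketch.
sample = (
    "<DOC docid=XXXXXXXXXX>\n"
    "\n"
    "--Segment 1--\n"
    "\n"
    "这是标题\n"
    "\n"
    "--Segment 2--\n"
    "\n"
    "这是第一句。\n"
)

docid, segments = parse_story(sample)
for number in sorted(segments):
    print(docid, number, segments[number])
```

Keying the result by segment number makes it straightforward to check that a translation contains exactly the same segment numbers as its source story.
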
English Translation File Format
The English translation of each source story is to be
rendered as plain ASCII text, with enclosing SGML tags
that preserve the attributes of the original story, as
illustrated below:
3) The name and version numb