Guidelines on preparing the source text
Improving the quality of machine translation (MT) is, of
course, mainly the task of its developers. However, users
can also contribute to acceptable results, because the
quality of machine translation depends directly on the
quality of the source text supplied.
The guidelines below will certainly not solve every problem
of machine translation, but they can help to win a few
points in the contest between the computer and natural language.
- Avoid misprints and spelling errors! A machine translator
cannot correct errors or recognize misspelled words
(spell-checking programs are very useful for this
purpose).
- Mind your punctuation marks! A missing or, conversely,
redundant punctuation mark can prevent an electronic
translator from correctly understanding the syntactic
structure of the sentence.
End-of-paragraph marks are automatically deleted by the
program, so two lines become one line. It is therefore
necessary to put a full stop (.) at the end of each
sentence.
- Place diacritics correctly!
Remark: as a rule, an electronic translator cannot recognize
a word containing the Russian letter ё, nor words
written with stress marks.
- Observe the case of letters! A lowercase letter in a word
can easily become a capital one (for example, at the
beginning of a sentence or in a heading), and this is
taken into account when MT systems are developed. A capital
letter, on the contrary, seldom becomes a lowercase one,
and in most cases this is connected with the formation of
a new word, for example when a proper noun passes into
the class of common nouns (xerox, etc.). Because
the word Internet is usually written with a capital
letter, there is no point in complaining (as one author
of a message in the guest book of the server www.translate.ru
does) that "there isn't the word Internet in
your dictionary".
Besides, there are languages in which an initial capital
letter in principle changes the part of speech a word
belongs to. An obvious example is German, in which nouns
are written with a capital letter both at the beginning
and in the middle of a sentence. Compare these translations:
"wie funktioniert das übersetzen mit dem "clipboard"?"
- "How it works translate with "clipboard"?"
or
"Wie funktioniert das Übersetzen mit dem "clipboard"?"
- "How does the clipboard translation work?"
- Try to use simple syntactic constructions with direct
word order.
For example, the first place in the sentence should be
taken by the subject or its group (I, you, he, my cat,
my chief, the son of my girlfriend).
The second place is taken by the predicate expressed by a
verb (want, know, like).
These should be followed by adverbial modifiers expressed
by different parts of speech.
Many more guidelines on how to make natural-language text
more "digestible" for the computer can be found at:
http://alemeln.narod.ru/progper2.html
- Try to avoid omitting function words (even where the
grammar allows it). Here is an example. The English
sentence "Your e-mail address is the address other people
use to send e-mail messages to you" will be translated
into Russian as a barely comprehensible text:
"Ваш адрес электронной почты - адрес другое использование
людей, чтобы послать почтовые сообщения Вам."
After restoring the single omitted word, the conjunction that:
"Your e-mail address is the address that other people use
to send e-mail messages to you", we get a quite correct
version:
"Ваш адрес электронной почты - адрес, который другие люди
используют, чтобы послать почтовые сообщения Вам."
- Use only conventional abbreviations! Incorrect translation
of an abbreviation is only part of the problem. Even a
single untranslated word can prevent the electronic
translator from correctly analyzing the syntactic structure
of the sentence (abbreviations take part in syntactic links
along with ordinary words).
An abbreviation whose spelling coincides with a frequently
used word can lead to unpleasant consequences.
For example, the Russian abbreviation ПО (software) is
written the same way as the Russian preposition по (on)
(letter case plays no role in this example, since a
preposition may be written in capital letters, for example
in a heading). Therefore, we regret to say, the translation
of the phrase "Я часто использую это ПО" consistently
comes out as "I frequently use it ON." On the other hand,
if you take the trouble to write "Я часто использую это
программное обеспечение", the translation will be
"I frequently use this software."
- Avoid slang expressions! Of course, we are not speaking
about criminal slang (though we may assume that users of
MT systems could use it too). In informal communication,
law-abiding native speakers also quite often use words,
expressions and constructions that do not belong to the
literary norm ("Люди, решите траблу! Не могу зарегить
мыло!" (literary norm: Please help me solve the problem -
I can't register an e-mail account)). On the one hand,
such words appear in speech earlier than in dictionaries.
On the other hand, it is not always advisable to add
neologisms to the dictionary: for most users of MT systems,
for example, the word "мыло" (soap) denotes a detergent.
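Some of the guidelines above (closing each sentence with a full stop, avoiding ё and stress marks, spelling out ambiguous abbreviations) are mechanical enough to be applied before the text is submitted. The Python sketch below is only one possible illustration and is not part of any MT product. The function names and the abbreviation table are hypothetical, replacing ё with е is an assumed workaround rather than something the guidelines prescribe, and each input line is assumed to be one paragraph.

```python
import re
import unicodedata

# Hypothetical table; its single entry is the ПО example from the guidelines.
ABBREVIATIONS = {"ПО": "программное обеспечение"}

STRESS_MARK = "\u0301"  # combining acute accent, often used to mark stress


def normalize_letters(text):
    """Drop stress marks and replace ё/Ё with е/Е (an assumed workaround)."""
    text = text.replace(STRESS_MARK, "")
    # Compose "е + combining diaeresis" into a single ё so one replace suffices.
    text = unicodedata.normalize("NFC", text)
    return text.replace("\u0451", "\u0435").replace("\u0401", "\u0415")


def expand_abbreviations(text, table=ABBREVIATIONS):
    """Spell out abbreviations that match as whole words, case-sensitively.

    Lowercase "по" is left alone, but a preposition typed in capitals
    (for example in a heading) would still be expanded by mistake.
    """
    for abbreviation, full_form in table.items():
        pattern = r"(?<!\w)" + re.escape(abbreviation) + r"(?!\w)"
        text = re.sub(pattern, full_form, text)
    return text


def close_paragraphs(text):
    """Append a full stop to every non-empty line that lacks end punctuation."""
    fixed = []
    for line in text.splitlines():
        line = line.rstrip()
        if line and line[-1] not in ".!?:;":
            line += "."
        fixed.append(line)
    return "\n".join(fixed)


def prepare_source(text):
    """Apply the three clean-up steps in sequence."""
    return close_paragraphs(expand_abbreviations(normalize_letters(text)))


if __name__ == "__main__":
    print(prepare_source("Я часто использую это ПО"))
```

The steps are kept as separate functions so that any of them can be skipped; a text with headings in capital letters, for instance, may be better served without the abbreviation step.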
Babblefish Language Lessons
BBC - Languages
Lessons in many languages
The Virtual CALL Library
Computer Aided Language Learning Software
single-serving.com
Quickly learn essential phrases and words for travelling,
in easy single-serving doses! Great for beginners!
Holiday Phrases
A great list of essential holiday phrases. Now with mp3 downloads!
In many languages.
Phrasebase Language Learning Resources
Your Conversational Language Learning Resource Center and
Community
Language Resources
Grammar guide, and much more.
Wordchamp.com
Provides members with shared open content, exercises for
language learning, free teacher resources, and personal
tools to assist anyone in the day-to-day use of a foreign
language.
Spanish Resources
Learn about Spanish culture, check out the
Spanish grammar guide, and much more.
French Resources
Learn about French culture, check out the
French grammar guide, and much more.
German Resources
Learn about German culture, check out the
German grammar guide, and much more.
English Resources
Learn about English culture, check out the
English grammar guide, and much more.
WordNet®
A large lexical database of English, developed under the
direction of George A. Miller. Nouns, verbs, adjectives
and adverbs are grouped into sets of cognitive synonyms
(synsets), each expressing a distinct concept. Synsets
are interlinked by means of conceptual-semantic and lexical
relations.
Verbix Verb Conjugator
Here is a verb conjugator that conjugates the verbs of
over 50 different languages for you.
Verb charts: English, French, German, Italian, Spanish