Author: name hidden by the user, 08 May 2012, 20:02, coursework
The object of study in this coursework is the lexical features of translating scientific-style texts in the field of machine translation from English into Ukrainian.
The aim of the study is to translate a scientific text and to identify the main lexical features and difficulties of the translation.
Introduction
Chapter 1: The translator's background knowledge in the field of machine translation
Chapter 2: Text of the translation
Chapter 3: Translator's commentary
Conclusions
References
Appendix A
Appendix B
References
1. Грицанов А.А. Новейший философский словарь / А.А.Грицанов//. - Мн.: Изд. В.М. Скакун, Минск, 1998
2. Карабан В. І. Посібник-довідник з перекладу наукової і технічної літератури : Частина 1 / Карабан В. І. – Флоренція-Страсбург-Гранада-
3. Комиссаров В.Н. Теория перевода./ В.Н. Комиссаров// Лингвистические аспекты теории перевода – Учебник для институтов и факультетов иностранного языка - М.: Высш. шк., 1990. – с. 117-253
4. Мейрамова С.А. Семантизация терминологической лексики в обучении чтению и переводу научно-технической литературы / Мейрамова С.А. // Мышление – язык – лингводидактика : Научно-практическое пособие – Алматы, 2000 – Ч.1. – с. 121-137
5. Мурот В. П. Функциональный стиль / В. П. Мурот // Лингвистический Энциклопедический Словарь. – М. : Советская энциклопедия, 1990. – С. 567-568.
6. Солганик Г.Я. Стилистика текста./ Г.Я. Солганик// Учебное пособие по стилистике текста - М.: Флинта, Наука, 1997.—256 с.
7. Троянская Е. С. Полевая структура научного стиля и его жанровых разновидностей / Е. С. Троянская // Общие и частные проблемы функциональных стилей. – М. : Наука, 1986. – С. 16-28.
8. Bar-Hillel Yehoshua, "Automatic Translation of Languages". / Yehoshua Bar-Hillel// Advances in Computers, vol. 1 М. Hebrew University, Jerusalem, Israel 1960. - p.91-163
Appendix A
English-Ukrainian dictionary
adjust | упорядковувати |
AI (artificial intelligence) | ШІ (штучний інтелект) |
All Purpose Electronic Computer (APEC) | Універсальний електронний комп'ютер (УЕК) |
ambiguous | двозначний |
approach | спосіб,метод,підхід |
assume | припускати |
attained | досягнутий |
Automatic Language Processing Advisory Committee (ALPAC) | Консультативний комітет з питань автоматичної обробки мови |
automatic machine translation | автоматичний машинний переклад |
bilingual | Двомовний |
capture | Схопити,захопити |
case-based | заснований на прецедентах |
Closely related | споріднений |
Cognitive operation | Когнітивний процес |
Comprehensive knowledge | Всебічні знання |
computational power | обчислювальна потужність |
Computer software | Комп’ютерне програмне забезпечення |
computer-aided translation | автоматизований переклад |
context – embedded | контекстуально залежний |
corpus of data | корпус,масив даних |
corpus technique | корпусний метод |
correlation of meaning | урахування смислових зв’язків |
correspondence | співвідношення |
counterparts | Еквіваленти |
customization | Налаштування |
decode | Розшифровувати |
denote | позначати |
derive | отримувати |
Dictionary entries | Словникові статті |
dictionary-based | заснований на застосуванні словників |
digital computer | цифрова обчислювальна машина |
disambiguation | усунення неоднозначності |
distinguish | розрізняти |
domain | Галузь |
engine | Програмний механізм |
evaluate | оцінювати |
example-based | заснований на прикладах |
formulaic language | Шаблонна мова |
generate | утворювати |
grammar-based | заснований на граматиці |
Grammatical and lexical exigencies | Граматичні та лексичні вимоги |
Human intervention | Втручання людини |
Hybrid Machine translation (HMT) | Гібридний машинний переклад |
idioms | Ідіоми |
implementation | реалізація |
In-principle obstacles | Принципові перешкоди |
input sentence | речення на мові оригіналу |
Interactive translation | інтерактивний переклад |
Interlingua | мова-посередник |
Interlingual | Міжмовний |
intermediary representation | проміжна репрезентація |
Interpret | Витлумачити |
Language-independent representation | Репрезентація,незалежна від перекладацької мови |
leverage | використовувати (переваги) |
lexicon | словник |
linguist | Лінгвіст |
Linguistic rule | Лінгвістичне правило |
Linguistic typology | Лінгвістична типологія |
machine Translation (MT) | Машинний переклад |
machine translation software | програмне забезпечення для машинного перекладу |
Machine-aided human translation (MAHT) | Людський переклад з участю ЕОМ (ЛП з ЕОМ) |
methodology | Методика |
Morphological analysis | Морфологічний аналіз |
native speaker | носій мови |
natural language | природна мова |
normalization | Нормалізація |
Original sentence | речення на мові оригіналу |
Output | синтезовуваний текст, кінцевий варіант перекладу |
Paradigm | різновид,парадигма |
Pattern | шаблон |
post-process | піддавати подальшій обробці |
Pre-process (data) | Попередньо обробляти (дані) |
presume | передбачити |
re-encoding | перекодування |
regularity | закономірність |
Retrieve | Вилучати |
rudimentary translation | елементарний переклад |
rule-based | заснований на системі правил |
scholar | вчений |
scope | сфера |
semantics | Семантика |
shallow and deep approaches | поверхневі та глибинні підходи |
shallow-transfer MT | поверховий МП (машинний переклад) |
Skilled linguist | Кваліфікований лінгвіст |
source language | мова оригіналу |
Source text | Текст оригіналу |
Statistical machine translation (SMT) | Статистичний машинний переклад (СМП) |
statistical model | статистична модель (машинного перекладу) |
Statistical technique | Статистичний метод |
sub-field | Підрозділ |
Substantial funding | Значні інвестиції |
substitution of words | заміна слів |
syntax | синтаксис |
target language | мова-реципієнт |
text corpora | корпуси текстів |
trace back | простежувати |
transfer-based | заснований на принципі переміщення |
unambiguously | Однозначно |
Usher (in) | Сповістити про початок |
word-by-word | Дослівно |
Ukrainian-English dictionary
автоматичний машинний переклад | automatic machine translation |
автоматизований переклад | computer-aided translation |
Вилучати | Retrieve |
Витлумачити | Interpret |
використовувати (переваги) | leverage |
Всебічні знання | Comprehensive knowledge |
Втручання людини | Human intervention |
вчений | scholar |
Галузь | domain |
Гібридний машинний переклад | Hybrid Machine translation (HMT) |
Граматичні та лексичні вимоги | Grammatical and lexical exigencies |
двозначний | ambiguous |
Двомовний | bilingual |
Дослівно | word-by-word |
досягнутий | attained |
Еквіваленти | counterparts |
елементарний переклад | rudimentary translation |
закономірність | regularity |
заміна слів | substitution of words |
заснований на прикладах | example-based |
заснований на граматиці | grammar-based |
заснований на застосуванні словників | dictionary-based |
заснований на прецедентах | case-based |
заснований на принципі переміщення | transfer-based |
заснований на системі правил | rule-based |
Значні інвестиції | Substantial funding |
Ідіоми | idioms |
інтерактивний переклад | Interactive translation |
Кваліфікований лінгвіст | Skilled linguist |
Когнітивний процес | Cognitive operation |
Консультативний комітет з питань автоматичної обробки мови | Automatic Language Processing Advisory Committee (ALPAC) |
Комп’ютерне програмне забезпечення | Computer software |
контекстуально залежний | context – embedded |
корпус,масив даних | corpus of data |
корпуси текстів | text corpora |
корпусний метод | corpus technique |
Лінгвіст | linguist |
Лінгвістична типологія | Linguistic typology |
Лінгвістичне правило | Linguistic rule |
Людський переклад з участю ЕОМ (ЛП з ЕОМ) | Machine-aided human translation (MAHT) |
Машинний переклад | machine Translation (MT) |
Методика | methodology |
Міжмовний | Interlingual |
мова оригіналу | source language |
мова-посередник | Interlingua |
мова-реципієнт | target language |
Морфологічний аналіз | Morphological analysis |
Налаштування | customization |
усунення неоднозначності | disambiguation |
Нормалізація | normalization |
носій мови | native speaker |
обчислювальна потужність | computational power |
Однозначно | unambiguously |
отримувати | derive |
оцінювати | evaluate |
передбачити | presume |
перекодування | re-encoding |
Підрозділ | sub-field |
поверхневі та глибинні підходи | shallow and deep approaches |
поверховий МП (машинний переклад) | shallow-transfer MT |
позначати | denote |
піддавати подальшій обробці | post-process |
Попередньо обробляти (дані) | Pre-process (data) |
Принципові перешкоди | In-principle obstacles |
припускати | assume |
природна мова | natural language |
програмне забезпечення для машинного перекладу | machine translation software |
Програмний механізм | engine |
проміжна репрезентація | intermediary representation |
простежувати | trace back |
реалізація | implementation |
Репрезентація,незалежна від перекладацької мови | Language-independent representation |
речення на мові оригіналу | input sentence |
речення на мові оригіналу | Original sentence |
різновид,парадигма | Paradigm |
розрізняти | distinguish |
Розшифровувати | decode |
Семантика | semantics |
синтаксис | syntax |
синтезовуваний текст, кінцевий варіант перекладу | Output |
словник | lexicon |
Словникові статті | Dictionary entries |
співвідношення | correspondence |
Сповістити про початок | Usher (in) |
споріднений | Closely related |
спосіб,метод,підхід | approach |
статистична модель (машинного перекладу) | statistical model |
Статистичний машинний переклад (СМП) | Statistical machine translation (SMT) |
Статистичний метод | Statistical technique |
сфера | scope |
Схопити,захопити | capture |
Текст оригіналу | Source text |
Універсальний електронний комп'ютер (УЕК) | All Purpose Electronic Computer (APEC) |
упорядковувати | adjust |
урахування смислових зв’язків | correlation of meaning |
утворювати | generate |
цифрова обчислювальна машина | digital computer |
шаблон | Pattern |
Шаблонна мова | formulaic language |
ШІ (штучний інтелект) | AI (artificial intelligence) |
Appendix B
Source text
Machine translation
Machine translation, sometimes referred to by the abbreviation MT (not to be confused with computer-aided translation, machine-aided human translation (MAHT) or interactive translation), is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. On a basic level, MT performs simple substitution of words in one natural language for words in another, but that alone usually cannot produce a good translation of a text, because recognition of whole phrases and their closest counterparts in the target language is needed. Solving this problem with corpus and statistical techniques is a rapidly growing field that is leading to better translations, handling differences in linguistic typology, translation of idioms, and the isolation of anomalies.
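The difference between plain word substitution and phrase recognition can be illustrated with a minimal Python sketch. The two lookup tables and the function names `word_by_word` and `phrase_first` are assumptions made for this illustration; they are not taken from any real MT system.

```python
# Toy English->Ukrainian tables; all entries are illustrative assumptions.
WORDS = {
    "machine": "машина",
    "translation": "переклад",
    "source": "джерело",
    "text": "текст",
}
PHRASES = {
    ("machine", "translation"): "машинний переклад",
    ("source", "text"): "текст оригіналу",
}


def word_by_word(tokens):
    """Substitute each word independently; unknown words pass through."""
    return [WORDS.get(tok, tok) for tok in tokens]


def phrase_first(tokens, max_len=2):
    """Greedy longest match: try known phrases before single words."""
    out, i = [], 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):
            chunk = tuple(tokens[i:i + n])
            if n > 1 and chunk in PHRASES:
                out.append(PHRASES[chunk])
                i += n
                break
            if n == 1:
                # No phrase matched here: fall back to the single word.
                out.append(WORDS.get(tokens[i], tokens[i]))
                i += 1
    return out


print(word_by_word(["machine", "translation"]))
print(phrase_first(["machine", "translation"]))
```

Word-by-word substitution renders "machine translation" as two unrelated nouns, while the phrase lookup returns the established term. Real systems use much larger phrase inventories, usually derived from corpora.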
Current machine translation software often allows for customization by domain or profession (such as weather reports), improving output by limiting the scope of allowable substitutions. This technique is particularly effective in domains where formal or formulaic language is used. It follows that machine translation of government and legal documents more readily produces usable output than conversation or less standardized text. Improved output quality can also be achieved by human intervention: for example, some systems are able to translate more accurately if the user has unambiguously identified which words in the text are names. With the assistance of these techniques, MT has proven useful as a tool to assist human translators and, in a very limited number of cases, can even produce output that can be used as is (e.g., weather reports).
The progress and potential of machine translation have been much debated throughout its history. Since the 1950s, a number of scholars have questioned the possibility of achieving fully automatic machine translation of high quality. Some critics claim that there are in-principle obstacles to automating the translation process.
The idea of machine translation may be traced back to the 17th century. In 1629, René Descartes proposed a universal language, with equivalent ideas in different tongues sharing one symbol. The Georgetown experiment (1954) involved fully automatic translation of over sixty Russian sentences into English. The experiment was a great success and ushered in an era of substantial funding for machine-translation research. The authors claimed that within three to five years, machine translation would be a solved problem. Real progress was much slower, however, and after the ALPAC report (1966), which found that the ten-year-long research had failed to fulfill expectations, funding was greatly reduced. Beginning in the late 1980s, as computational power increased and became less expensive, more interest was shown in statistical models for machine translation.
The idea of using digital computers for translation of natural languages was proposed as early as 1946 by A. D. Booth and possibly others. Warren Weaver wrote an important memorandum "Translation" in 1949. The Georgetown experiment was by no means the first such application, and a demonstration was made in 1954 on the APEC machine at Birkbeck College (University of London) of a rudimentary translation of English into French. Several papers on the topic were published at the time, as were articles in popular journals. A similar application, also pioneered at Birkbeck College at the time, was reading and composing Braille texts by computer.
As for the human translation process, it may be described as:
1. Decoding the meaning of the source text;
2. Re-encoding this meaning in the target language.
Behind this ostensibly simple procedure lies a complex cognitive operation. To decode the meaning of the source text in its entirety, the translator must interpret and analyse all the features of the text, a process that requires in-depth knowledge of the grammar, semantics, syntax, idioms, etc., of the source language, as well as the culture of its speakers. The translator needs the same in-depth knowledge to re-encode the meaning in the target language.
Therein lies the challenge in machine translation: how to program a computer that will "understand" a text as a person does, and that will "create" a new text in the target language that "sounds" as if it has been written by a person.
This problem may be approached in a number of ways.
Approaches
Machine translation can use a method based on linguistic rules, which means that words will be translated in a linguistic way: the most suitable words of the target language will replace the ones in the source language. It is often argued that the success of machine translation requires the problem of natural language understanding to be solved first. Generally, rule-based methods parse a text, usually creating an intermediary, symbolic representation, from which the text in the target language is generated. According to the nature of the intermediary representation, an approach is described as interlingual machine translation or transfer-based machine translation. These methods require extensive lexicons with morphological, syntactic, and semantic information, and large sets of rules. Given enough data, machine translation programs often work well enough for a native speaker of one language to get the approximate meaning of what is written by a native speaker of the other. The difficulty is getting enough data of the right kind to support the particular method. For example, the large multilingual corpus of data needed for statistical methods to work is not necessary for the grammar-based methods. But then, the grammar-based methods need a skilled linguist to carefully design the grammar that they use. To translate between closely related languages, a technique referred to as shallow-transfer machine translation may be used.
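The rule-based pipeline described above (analysis into an intermediary representation, then generation of the target text) can be sketched in a few lines of Python. Everything here is a toy assumption: the English-French mini-lexicon, the tag names, the single reordering rule and the function names are hypothetical and serve only to show the three stages.

```python
# Hypothetical English->French mini-lexicon: word -> (part of speech, translation).
# Both nouns are feminine, so the fixed article "la" happens to be correct.
LEXICON = {
    "the": ("DET", "la"),
    "red": ("ADJ", "rouge"),
    "car": ("NOUN", "voiture"),
    "house": ("NOUN", "maison"),
}


def analyse(sentence):
    """Analysis: turn the source text into (word, tag) pairs."""
    return [(w, LEXICON[w][0]) for w in sentence.lower().split()]


def transfer(tagged):
    """Transfer: map the source pattern ADJ NOUN onto the target pattern NOUN ADJ."""
    out, i = [], 0
    while i < len(tagged):
        if (i + 1 < len(tagged)
                and tagged[i][1] == "ADJ"
                and tagged[i + 1][1] == "NOUN"):
            out.extend([tagged[i + 1], tagged[i]])
            i += 2
        else:
            out.append(tagged[i])
            i += 1
    return out


def generate(tagged):
    """Generation: replace each source word with its target-language entry."""
    return " ".join(LEXICON[w][1] for w, _ in tagged)


def translate(sentence):
    return generate(transfer(analyse(sentence)))


print(translate("the red car"))
```

A word missing from the lexicon raises `KeyError` in this sketch, which hints at why such methods need extensive lexicons and large rule sets.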
The most popular approaches to machine translation are:
Rule-based machine translation (RBMT) is a general term that denotes machine translation systems based on linguistic information about source and target languages, basically retrieved from dictionaries and grammars covering the main semantic, morphological, and syntactic regularities of each language respectively. Given input sentences (in some source language), an RBMT system transforms them into output sentences (in some target language) on the basis of morphological, syntactic, and semantic analysis of both the source and the target languages involved in a concrete translation task.
Transfer-based machine translation rests on the idea that an intermediate representation capturing the "meaning" of the original sentence is needed in order to generate the correct translation. The way in which transfer-based machine translation systems work varies substantially, but in general they follow the same pattern: they apply sets of linguistic rules which are defined as correspondences between the structure of the source language and that of the target language.
Interlingual machine translation is one of the classic approaches to machine translation. In this approach, the source language text, i.e. the text to be translated, is transformed into an interlingua, an abstract language-independent representation. The target language is then generated from the interlingua. Within the rule-based machine translation paradigm, the interlingual approach is an alternative to the direct approach and the transfer approach.
Dictionary-based translation uses a method based on dictionary entries, which means that the words will be translated as a dictionary does: word by word, usually without much correlation of meaning between them.
Statistical machine translation (SMT) is a machine translation paradigm where translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. The statistical approach contrasts with the rule-based approaches to machine translation as well as with example-based machine translation.
Example-based machine translation (EBMT) is often characterized by its use of a bilingual corpus with parallel texts as its main knowledge base at run-time. It is essentially translation by analogy and can be viewed as an implementation of the case-based reasoning approach of machine learning.
Hybrid machine translation (HMT) leverages the strengths of statistical and rule-based translation methodologies. There are several approaches to HMT:
a) Rules post-processed by statistics: translations are performed using a rule-based engine. Statistics are then used in an attempt to adjust/correct the output from the rules engine.
b) Statistics guided by rules: Rules are used to pre-process data in an attempt to better guide the statistical engine. Rules are also used to post-process the statistical output to perform functions such as normalization. This approach has a lot more power, flexibility and control when translating.
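Of the approaches listed above, the statistical one is perhaps the easiest to illustrate numerically. The sketch below estimates word-translation probabilities from a tiny parallel corpus using a simplified form of IBM Model 1 (expectation-maximisation without a NULL word). The choice of IBM Model 1 is made here for illustration and is not named in the source text. The three-sentence German-English corpus and the function names are likewise assumptions.

```python
from collections import defaultdict

# Tiny parallel corpus (German -> English); the sentences are illustrative assumptions.
CORPUS = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]


def train_model1(corpus, iterations=20):
    """Estimate t(e | f) with EM, starting from a uniform table."""
    e_vocab = {e for _, es in corpus for e in es}
    t = defaultdict(lambda: 1.0 / len(e_vocab))  # keyed by (e, f)
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of each (e, f) link
        total = defaultdict(float)  # expected count of links involving f
        # E-step: share each target word among the source words of its sentence.
        for fs, es in corpus:
            for e in es:
                norm = sum(t[(e, f)] for f in fs)
                for f in fs:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalise the expected counts per source word.
        t = defaultdict(float, {(e, f): c / total[f] for (e, f), c in count.items()})
    return t


def best_translation(t, f):
    """Return the target word with the highest estimated probability for f."""
    candidates = {e: p for (e, src), p in t.items() if src == f}
    return max(candidates, key=candidates.get)


table = train_model1(CORPUS)
for word in ("das", "haus", "buch", "ein"):
    print(word, "->", best_translation(table, word))
```

No dictionary is supplied: the pairings emerge only from which words co-occur across sentence pairs, which is what "parameters derived from the analysis of bilingual text corpora" amounts to at the smallest scale.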
There are, however, some serious problems in machine translation, such as word-sense disambiguation, which concerns finding a suitable translation when a word can have more than one meaning. The problem was first raised in the 1950s by Yehoshua Bar-Hillel. He pointed out that without a "universal encyclopedia", a machine would never be able to distinguish between the two meanings of a word. Today there are numerous approaches designed to overcome this problem. They can be approximately divided into "shallow" approaches and "deep" approaches.
Shallow approaches assume no knowledge of the text. They simply apply statistical methods to the words surrounding the ambiguous word. Deep approaches presume a comprehensive knowledge of the word. So far, shallow approaches have been more successful.
The ideal deep approach would require the translation software to do all the research necessary for this kind of disambiguation on its own; but this would require a higher degree of AI than has yet been attained. A shallow approach which simply guessed at the sense of the ambiguous English phrase would have a reasonable chance of guessing wrong fairly often. A shallow approach that involves "ask the user about each ambiguity" would automate only about 25% of a professional translator's job, leaving the harder 75% still to be done by a human.
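A shallow approach of the kind described above, applying statistics to the words surrounding the ambiguous word, might look like the following Python sketch. The two senses of "bank", their hand-picked context words and the smoothed scoring scheme are assumptions chosen for illustration. A practical system would learn such counts from a large corpus.

```python
import math
from collections import Counter

# Hand-labelled context words for two senses of "bank"; illustrative assumptions.
TRAINING = {
    "river_bank": ["water", "river", "shore", "fishing", "mud"],
    "financial_bank": ["money", "loan", "account", "deposit", "interest"],
}


def guess_sense(context, training=TRAINING):
    """Pick the sense whose training words best match the surrounding words."""
    vocab = {w for words in training.values() for w in words}
    best_sense, best_score = None, -math.inf
    for sense, words in training.items():
        counts = Counter(words)
        # Add-one smoothing keeps unseen context words from zeroing the score.
        score = sum(
            math.log((counts[w] + 1) / (len(words) + len(vocab)))
            for w in context
        )
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


print(guess_sense("he sat on the bank of the river fishing".split()))
print(guess_sense("the bank approved the loan".split()))
```

The function uses no knowledge of what the text means. When none of the surrounding words appear in the training lists, the scores tie and the first sense is returned, which is one way a shallow method ends up guessing wrong.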
Machine translation systems and output can be evaluated along numerous dimensions. The intended use of the translation, characteristics of the MT software, the nature of the translation process, etc., all affect how one evaluates MT systems and their output.
There are various means for evaluating the output quality of machine translation systems. The oldest is the use of human judges to assess a translation's quality. Even though human evaluation is time-consuming, it is still the most reliable way to compare different systems such as rule-based and statistical systems. Automated means of evaluation include BLEU, NIST and METEOR.
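To give a sense of how automated metrics work, the sketch below computes the clipped ("modified") n-gram precision that underlies BLEU. It is only one component: the full metric also combines several n-gram orders with a geometric mean and applies a brevity penalty, both omitted here. The function names are chosen for this illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n=1):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in any single reference translation."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, c in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], c)
    clipped = sum(min(c, max_ref[gram]) for gram, c in cand_counts.items())
    return clipped / sum(cand_counts.values())


reference = "the cat is on the mat".split()
degenerate = "the the the the the the the".split()
print(modified_precision(degenerate, [reference]))  # 2 of 7 unigrams credited
```

Clipping is what prevents a degenerate output that repeats one common word from scoring well. Without it, every "the" in the candidate would count as a match.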
Relying exclusively on unedited machine translation ignores the fact that communication in human language is context-embedded and that it takes a person to comprehend the context of the original text with a reasonable degree of probability. It is certainly true that even purely human-generated translations are prone to error. Therefore, to ensure that a machine-generated translation will be useful to a human being and that publishable-quality translation is achieved, such translations must be reviewed and edited by a human. The late Claude Piron, a long-time translator for the United Nations and the World Health Organization, wrote that machine translation, at its best, automates the easier part of a translator's job; the harder and more time-consuming part usually involves doing extensive research to resolve ambiguities in the source text, which the grammatical and lexical exigencies of the target language require to be resolved.