Machine Translation Project
Machine translation
Machine translation is an automatic translation system that uses advanced computational linguistic
analysis to process source documents and create target texts without human intervention (Quah,
2006). Research (Arnold, 2003; Austermuhl, 2001; Bhattacharyya, 2015; Cronin, 2003; Garrison & Anderson,
2003; Hutchins, 2005; O’Hagan & Ashworth, 2002) indicates that the technologies used to develop
machine translation continue to open new possibilities for it to affect daily professional and
social life. As technology has advanced, the need for machine translation has
increased.
For example, the Internet has connected people of different languages and cultures around the world in such
a way as to make machine translation inevitable for translating webpages and social networking sites.
Moreover, with the spread of personal computers, machine translation is now available for various purposes.
Machine translation systems serve professional and non-professional translators in many fields.
However, while strenuous research efforts have been made to develop machine translation systems,
strategies for teaching and learning machine translation as an academic discipline are still needed.
Teaching and learning machine translation is an arduous task due to its complex characteristics which require
pedagogic knowledge in various fields including linguistics, translation studies, mathematics, statistics, and
computational sciences. Because machine translation is a multidisciplinary field, avenues for teaching machine
translation, specifically to students who study it for the first time, need to be explored. Hence, the purpose of
this research was to investigate the effectiveness of using project-based teaching strategies for teaching machine
translation. Project-based methodologies are cognitive teaching tactics that are founded on brain research.
Project-based teaching strategies can also be integrated into problem-based and inquiry-based techniques, and
since the three approaches are closely connected with information processing learning theories, they can be
implemented in the machine translation classroom.
Research questions:
This research includes a number of technical terms that are used specifically in the areas
of machine translation and education. The following are the definitions of such terms:
Alignment: is the process of binding a source-language segment to its corresponding
target-language segment in order to create a new translation memory database or to add to an
existing one (Quah, 2006).
Computer-Aided Translation (CAT): machine translation that is used in localization (i.e.
customization) industries (Hutchins & Somers, 1992).
Educational technology: refers to the use of technological tools in learning, including
machines, networking, and media (Richey, 2008).
Example-based MT: this method relies on a bilingual database of example phrases derived
from a large corpus of texts and their translations (Sumita & Imamura, 2002).
Filter: is a feature that converts a source language text from one format into another to
give the translator the flexibility to work with texts of different formats; hence a translation-
friendly format contains only a written text without any accompanying graphics. In order to
obtain such a format, an import filter separates a text from its formatting code (Esselink,
2000).
Fully Automatic High-quality Machine Translation (FAHQMT): MT performed without any intervention of a human being during the
process of translation (Bar-Hillel, 1960, 2003).
Fuzzy Matching: occurs when an old and a new source-language segment are similar but not exactly identical due to differences in
language usage (Esselink, 1998).
Human-Aided Machine Translation (HAMT): A system wherein the computer is responsible for producing the translation, with
human monitoring at many stages during the process of translation (Hutchins & Somers, 1992).
Hybrid Machine Translation (HMT): is based on using statistical and rule-based translation methodologies (Hutchins & Somers,
1992).
Machine translation (MT): An automatic translation system with no human intervention that makes use of an advanced computational
linguistic analysis to process source documents and automatically create target texts (Quah, 2006).
Machine-Aided Translation (MAT): is used by the software community, which develops machine tools in order to perform the tasks of
translation (Hutchins & Somers, 1992).
Machine-Aided Human Translation (MAHT): refers to the act of translation as cooperation between the human translator and the
machine. The focus of machine-aided human translation is on the human translator who uses a number of tools such as a grammar checker,
which examines the grammatical errors that appear because they do not conform to the pre-determined set of grammatical rules stored for a
particular language (Hutchins & Somers, 1992).
Machine-aided Translation (MAT): The use of computer programs by translators to help them during the translation process (Hutchins
& Somers, 1992).
Perfect Matching: is the exact match which occurs when a new source-language segment is completely identical to the old segment
found in the database, including inflections, spelling, and punctuation (Austermuhl, 2001).
Project-based teaching methodologies: are cognitive teaching strategies that are founded on brain research, and are connected with
various learning theories such as information processing, inquiry-based, problem-based, constructivism, connectivism, and cognitive
apprenticeship, among others (Schunk, 2011).
Segmentation: is the process of breaking a text up into units consisting of a word or a string of words that is linguistically acceptable;
and this process is needed in order for a translation memory to perform the process of fuzzy and perfect matching (Quah, 2006).
Translation Memory (TM): is a multilingual text archive containing (segmented, aligned, parsed and classified) multilingual texts,
allowing storage and retrieval of multilingual text segments against various search conditions (EAGLES, 1996).
Terminology Management Systems: refers to a systematic arrangement of concepts within a special language; because this system is
based on concepts rather than terms, it is systematic rather than alphabetical (Bononnon, 2000).
Workbench/Workstation: is a single integrated system that is made up of a number of translation tools and resources such as electronic
dictionaries, terminology databases, a translation memory, an alignment tool, a tag filter, a terminology management system, and spelling and
grammar-checkers (Quah, 2006).
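The terms segmentation, alignment, perfect matching, and fuzzy matching defined above can be illustrated with a minimal Python sketch. It is not taken from any particular translation memory product: the function names (segment, align, lookup), the sample sentences, the position-based alignment, the similarity measure (the standard library's difflib.SequenceMatcher), and the 0.75 fuzzy-match threshold are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def segment(text):
    """Split text into sentence-like segments after '.', '!' or '?'.

    Assumption: real tools use richer, language-specific segmentation rules.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def align(source_segments, target_segments):
    """Bind each source segment to the target segment at the same position.

    Assumption: the two texts have a one-to-one segment correspondence.
    """
    if len(source_segments) != len(target_segments):
        raise ValueError("this sketch only aligns texts with equal segment counts")
    return dict(zip(source_segments, target_segments))


def lookup(memory, new_segment, threshold=0.75):
    """Return (match_type, stored_source, stored_translation).

    A perfect match requires an identical string, punctuation included.
    A fuzzy match is the most similar stored segment at or above the
    (hypothetical) threshold.
    """
    if new_segment in memory:
        return ("perfect", new_segment, memory[new_segment])
    best_source, best_score = None, 0.0
    for stored in memory:
        score = SequenceMatcher(None, stored, new_segment).ratio()
        if score > best_score:
            best_source, best_score = stored, score
    if best_source is not None and best_score >= threshold:
        return ("fuzzy", best_source, memory[best_source])
    return ("none", None, None)


# Build a tiny translation memory from one aligned English-French text pair.
source_text = "The printer is out of paper. Press the green button."
target_text = "L'imprimante n'a plus de papier. Appuyez sur le bouton vert."
memory = align(segment(source_text), segment(target_text))

print(lookup(memory, "Press the green button."))  # identical segment
print(lookup(memory, "Press the red button."))    # similar segment
print(lookup(memory, "Hello world"))              # unrelated segment
```

In this sketch a fuzzy match returns the stored translation of the closest old segment, which a translator would then edit by hand; the threshold controls how similar a segment must be before it is offered at all.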
The Development of Machine Translation
According to Wilks (2009), the period from the 1400s to the 1600s was the age of discovery, which necessitated enhancing communication
among speakers of different languages. In the 17th century, Leibniz and Descartes proposed their research on codes to relate words between
languages. In the 1930s, Georges Artsrouni developed an automatic bilingual dictionary. In 1949, Warren Weaver presented his Translation
Memorandum, which was the first proposal on computer-based machine translation. Weaver was influenced by McCulloch and Pitts’ (1943)
theory on mathematical modeling of the neural structure of the human brain when he proposed the applicability of cryptographic methods.
The concept of cryptography is related to Claude Shannon’s information theory, which is concerned with the basic
statistical properties of communication. The most significant outcome of Weaver’s Translation Memorandum was the decision in 1951 at
the Massachusetts Institute of Technology to appoint the logician Bar-Hillel to research the use of mathematical formulae in machine
translation.
In the 1950s, researchers at Georgetown University experimented with a fully automatic translation of more than sixty Russian
sentences into English. Bar-Hillel (1953) argued that MT systems did not operate effectively because the focus was on translating words
rather than the meanings of the texts, and that FAHQMT should not be the goal of machine translation researchers.
The actual progress of MT applications began with the Automatic Language Processing Advisory Committee’s (ALPAC, 1966) report,
which contained an evaluation of ten years of research, pointing to the feasibility of high-quality machine translation. The first MT research in
the 1960s and 1970s depended on linguistic theories of translation, and the research was conducted using empirical trial-and-error methods,
adopting statistical analysis of grammatical and lexical regularities among different languages. Thus, the first MT generation applied a
word-for-word translation method, working with only 250 words, six grammar rules, and 49 sentences. Nevertheless, the US government funded
large-scale projects to develop MT systems.
The second-generation MT systems (from the mid-1960s until the 1980s) used a large lexicon and a small syntax. The number of installed MT
systems increased, including mainframe systems such as SYSTRAN, LOGOS, ARIANE-G5, METAL, and METEO. The ALPS
(Automatic Language Processing System) was developed, using multilingual terminological data manipulation systems. SYSTRAN was
widely used, and the METEO system was in operation in Canada from 1982 to 2001. The Pan American Health Organization built two mainframe
systems: from Spanish into English (SPANAM); and from English into Spanish (ENGSPAN). According to Guerra (2000, p.74), these systems
were successful for translating conference documents, scientific papers, training materials and technical brief reports. Gouadec (2007) noted
that the spread of Internet activities and the increase in personal computers changed communication practices beyond paper files to
enhance the use of email, word processing, translation memory tools, and terminology management systems. Thus, the progress
that MT made increased the number of professional and non-professional translators and the amount of localization work.
Participants’ Technology Skills
Conclusions
In conclusion, the focus of this quantitative–qualitative research was on the problems and
solutions for machine translation as an academic course at higher education institutions.
Data analysis results pointed to the positive impact of integrating project-based teaching
methods into teaching machine translation on students’ performance.
Engaging students in creative projects would not only help them to improve their academic
achievements, but would also play an effective role in developing new tools for automated
translation. The discussions on the research data analysis include content analysis of students’
projects and how they can put machine translation theories into practice. The review of the
literature contains a thorough analysis of previous research on machine translation and
project-based teaching and learning approaches, and the application of educational
technology in the classroom.
The contribution of this research is derived from three specific areas: integrating education
research into teaching machine translation to motivate students to improve their performance;
employing educational technology to bridge the gap between theories and practice of machine
translation; and providing an implementation of creative teaching in machine translation
through presenting students’ creative projects.