Machine Translation Postediting - Translator's views needed
Thread poster: hzhang
hzhang Local time: 01:32 English > Chinese
For those of you who specialize in/have experience in Machine Translation Postediting, could you tell me what it's like working on MT post-editing/how you find working on these? I am referring to actual Machine Translation software (not machine translation performed from free online tools).
Since we're seeing an increase in such requests from clients, we are trying to get the translators' views on this.
Your help is appreciated.
Thanks,
Sandra
[Subject edited by staff or moderator 2009-04-21 14:23 GMT]
It is just an infant yet to be born! | Apr 21, 2009
Hi,
In the last 4 months, I have received some machine translations for post-editing. The clients did not mention that these were machine translations, but having been in this business for a long time, I could tell that this was not human work. I reported back to those clients about the translation and gave them detailed feedback about language quality, the absence of grammar and all related issues. I also did not charge them for this report. I just tried to tell them, in different words, that 'this is not a man or a young man or a boy or even a baby; it is just an infant yet to be born, so what do we call "it"?'
Working on this job was just as good as re-translation. I asked them to reschedule it; two of them asked me to start with a fresh translation, while the other went to someone else!
Post-editing vs Human Translation | Apr 21, 2009
hzhang wrote:
I am referring to actual Machine Translation software (not machine translation performed from free online tools).
There really is not a whole lot of difference. Many of the free on-line tools are simply earlier versions of marketed software.
As a translator, I would prefer to translate the job from scratch rather than trying to re-work/check sentences written by a computer. If I were forced to do it, I would charge more for this service rather than less.
Charge more than for direct translation | Apr 22, 2009
Your question implies the rather odd idea that a machine translation of a text is somehow more understandable than the original text, so that editing it should be cheaper than a good translation.
I have defended MT on the ProZ site, but only as part of an expert system programmed by the translator him/herself.
Otherwise the use of machine translation will only make things more difficult and will almost certainly increase the likelihood of mistranslations.
[Edited at 2009-04-22 05:24 GMT]
DZiW (X) Ukraine English > Russian + ... human-vs-machine | Apr 22, 2009
When I was a student, I also tried to make my job easier and used to post-edit MT output.
First, depending on the context, it did pretty well;
Second, I could still refer to the original source, so it was easy;
Third, I saved some 30 minutes or so by not typing the whole text;
Fourth, most terms were translated correctly and consistently;
and so forth: some MT sentences were more concise and techy than I initially expected, and you can bet there were NO misspellings at all.
But later, when I began learning CAT tools, I abandoned MT, though an MT suggestion could still be useful.
Cheers
[Edited at 2009-04-22 09:15 GMT]
Depends on quality | Apr 22, 2009
About 1/3rd of my turnover is editing/proofreading and to be honest:
I really don't care who produced the translation, a human or a machine.
I always have a look at the translation first and decide if it is worth editing/proofreading/rescuing. If not, I inform the customer that a retranslation is required and that he/she need not waste my time with translations from this source in the future.
Works very well, and the amount of proofreading work I get is increasing. In my experience, customers are OK if you tell them that a translation is not worth any additional effort. With regard to projects that are announced as MT (Machine Translation) processed, I have a look first too, and very often I do not consider these projects fit for further processing.
Please note that TM (Translation memory) based translations are a completely different beast. I do love to work on projects for clients that do use well structured and approved TMs, they do make my job much easier.
Siegfried
Philippe Etienne Spain Local time: 07:32 ProZ.com member English > French Text suitability | Apr 22, 2009
From my experience, MT can be helpful, but I have only worked in the following setting (irrespective of the agency customer):
- EN>FR
- suitable texts
- massive projects (multimillion-word EN>FR projects from agencies),
- upstream priming with human translations (agency in-house or outsourced) and glossary compilation to "train" the thing, feed it with appropriate terminology, set it up and fine-tune it along the way
- someone dedicated to MT management and maintenance during the project
- MT paired with TM
In such conditions, MT can be productive.
But even with the MT system at peak performance on suitable texts, I found that post-editing MT still requires far more time than editing a human translation of acceptable quality.
I flatly turn down any MT-prepared stand-alone text that I feel doesn't meet my requirements in terms of MT usefulness. Without the early steps and ongoing refining, raw MT output is probably no better than any of the free online tools.
Lastly, post-editing MT is not very gratifying and I feel constrained, but if paid sensibly, I tend to forget about these drawbacks. Once used to this task, I got a detailed overview of clauses/patterns where MT fails/succeeds and can detect possible improvements to the MT system. Not only does it ultimately translate into a higher hourly rate when paid by the source word, but I almost get a sense of the MT system's way of "thinking", and this is interesting.
Jeff Allen France Local time: 07:32 Multiple languages + ... progressively added functionality in MT packages beyond free online portal system | May 11, 2009
hzhang wrote:
I am referring to actual Machine Translation software (not machine translation performed from free online tools).
Jeff Whittaker wrote:
There really is not a whole lot of difference. Many of the free on-line tools are simply earlier versions of marketed software.
Jeff,
This is not quite the difference between the free online MT portals and the paid packaged software and enterprise level systems.
The online portals are more or less the stripped down, non-customizable base engines with a standard dictionary. The standard dictionary is updated regularly with additional terms, and the engine with additional grammar rules, when necessary. Depending on the vendor, they have different criteria for what vocabulary/terms/grammar rules to add to the system, and at what frequency.
As for the free portals, it does depend on the commercial deal with the portal site. If it brings money to the MT vendor, then they would naturally update the standard dictionary more frequently.
The packaged software (often 2-5 different levels) introduces features progressively: the lower-cost packages offer basic-level features, or features with limited thresholds, while the professional and expert-level packages offer the unlimited and advanced features. The enterprise-level server versions have 1 or more different levels as well, depending on factors such as system set-up, scalability and how the modules are accessed. Basically, the more the money, the more the features, and the higher the level of sophistication of the features for configuration and linguistic customization.
I provided examples of the different levels of such portals and of the progressively added features in MT packaged software in my Localization World presentation in 2004:
http://www.geocities.com/mtpostediting/LWBonn2004-A05JeffAllen.pdf
Jeff
Rod Walters Japan Local time: 15:32 Japanese > English Unpromising language pair | May 11, 2009
Philippe rightly points out that MT is a full process, and the last step depends heavily on the prior steps.
However, another thing to consider is language pair. By its nature, MT works better in language pairs where the structure is already similar. MT really struggles in Japanese to English, even with extensive preprocessing, and I'd be surprised if the same weren't true with Chinese too.
One thing to ask the clients is why the increase in MT, and what other avenues towards efficiency gains they are pursuing.
Jeff Allen France Local time: 07:32 Multiple languages + ... upfront preparation makes all the difference | May 11, 2009
Philippe Etienne wrote:
From my experience, MT can be helpful, but I have only worked in the following setting (irrespective of the agency customer):
- EN>FR
- suitable texts
- massive projects (multimillion-word EN>FR projects from agencies),
- upstream priming with human translations (agency in-house or outsourced) and glossary compilation to "train" the thing, feed it with appropriate terminology, set it up and fine-tune it along the way
- someone dedicated to MT management and maintenance during the project
- MT paired with TM
But even with the MT system at peak performance on suitable texts, I found that post-editing MT still requires far more time than editing a human translation of acceptable quality.
I flatly turn down any MT-prepared stand-alone text that I feel doesn't meet my requirements in terms of MT usefulness. Without the early steps and ongoing refining, raw MT output is probably no better than any of the free online tools.
Hi Philippe,
You are in fact very correct on this last point. Content that has not been prepared and has not gone through the customization stage for dictionary preparation does produce the same translation as the online portal of the same MT vendor. The paid commercial systems are what allow the user to customize the translation process and make the output better than the freebie version.
As for the postediting effort, it really depends on who did the preparation tasks, how well they are trained in it, the coverage that they attempt to achieve in a given timeframe, how the terminology identification and extraction is done, etc. MT projects can:
1) not be prepared at all
2) be poorly prepared
3) be fairly well prepared
4) be prepared with high level of precision to produce excellent processed content
On one of my own MT projects, I did a first pass at postediting (EN > FR) with no dictionary building (because the text would not be reused) at 6000 words in 6 hours, and then gave it to a second posteditor, whose changes amounted to only 11% of the final published version.
On another project (FR>EN), I spent 7.5 hours in preparation and interactive and dictionary building and reiterative translation validation. Then 30 minutes of postediting. It was a critical document of 10K words (with a significantly high level of terminology reusable in other documents in the company, so the dictionary work was very worthwhile). The final version went through multiple internal departments for approval, including the business unit head (a very technical person who requested the translation), and then on to the customer, where it was used to conduct the procedure described in the document. The customer came back and bought more software and services for the subsequent years.
Caveat for both projects: I was a subject matter expert in source and target languages for the topics covered in both projects, and had excellent level of mastery of the MT tool to conduct the dictionary customization tasks.
Jeff
Jeff Allen France Local time: 07:32 Multiple languages + ... MT for Asian languages versus European languages | May 12, 2009
Rod Walters wrote:
However, another thing to consider is language pair. By its nature, MT works better in language pairs where the structure is already similar. MT really struggles in Japanese to English, even with extensive preprocessing, and I'd be surprised if the same weren't true with Chinese too.
One thing to ask the clients is, why the increase in MT, and what other avenues towards efficiency gains are they pursuing.
Thanks Rod for that comment. Yes, fully agree with you.
Someone asked me about this last week and I stated that inter-European language pairs usually have better quality than European-Asian language pairs. Much depends on the specific system that is used, and the language direction.
More work tends to be done by MT vendors on language pairs with EN as the source, because of the higher potential to use the MT system with EN as the source text or as a pivot language.
An additional point is the maturity of the specific language direction (i.e., how many versions of that specific language direction engine have been released). A language direction which has been on the market for 10+ years with 2-5 major versions and many minor versions and patches will naturally have a higher potential for good translatability than a new language direction that has just been put on the market with the most recent major version.
Jeff
MT English -> Chinese (or VV) | May 12, 2009
Rod Walters wrote:
... , and I'd be surprised if the same weren't true with Chinese too.
Oh? why do you say that...
I have no knowledge of this area, but as far as I know Chinese and English are structurally similar, so perhaps Chinese would be more suited to MT from English than Japanese is. I have no idea really. What did you have in mind, Rod?
Look at the text first | May 12, 2009
Siegfried Armbruster wrote:
I always have a look at the translation first and decide if it is worth editing/proofreading/rescuing. If not, I inform the customer that a retranslation is required and that he/she need not waste my time with translations from this source in the future.
Indeed. When quoting for MT post-editing, I would always ask to see a sample of the translation.
MT output won't have typos, but there will be serious errors in the meaning of sentences (misinterpreted grammar) and in terminology, and they will cost you a lot of time. It surely feels like reviewing the work of a translator with a faulty knowledge of the source language and no experience in the field. It will force you to retranslate many sentences from scratch if you are picky enough --being picky is good in a translator!
What are the sources? | May 12, 2009
Jeff Allen wrote:
On another project (FR>EN), I spent 7.5 hours in preparation and interactive and dictionary building and reiterative translation validation. Then 30 minutes of postediting.
Jeff, your replies are always interesting reading. I thank you for that. I also appreciate the fact that you are patient with us, human translators, who complain about the poor quality of some MT output we see.
Let me ask you these quick questions to better understand what we are talking about when you say "preparation":
- What is the source of terminology information for a typical project?
- Does the customer supply you with a glossary you can feed into the system?
- In case the customer does not supply the terms, how do you pinpoint, research, and feed the terms into the system, and can you do that in languages you don't use?
Jeff Allen France Local time: 07:32 Multiple languages + ... MT preparation steps | May 12, 2009
Jeff Allen wrote:
On another project (FR>EN), I spent 7.5 hours in preparation and interactive and dictionary building and reiterative translation validation. Then 30 minutes of postediting.
Tomás Cano Binder, CT wrote:
Jeff, your replies are always interesting reading. I thank you for that. I also appreciate the fact that you are patient with us, human translators, who complain about the poor quality of some MT output we see.
Tomás,
Thanks for your comments. It is good to hear that the info is helpful.
I come from the translation profession, and early in my career got involved in MT and TM tools (back when some of the first large industrial implementations of these began) and have participated in the development, testing, implementation and training of a variety of types of MT tools. I also led a group of translators for 2 years on 5 languages to build up terminology databases and translation memories to train MT systems, and sold and ran translation and localization services for a couple of years.
So, I've been on the human translation side as well as the MT development side (not just a computer geek trying to break the Star Trek communication barrier with a new technology), but always working toward implementing TM and MT as tools to aid translators to be productive. And I've used several MT systems regularly for many years.
Tomás Cano Binder, CT wrote:
Let me ask you these quick questions to better understand what we are talking about when you say "preparation":
- What is the source of terminology information for a typical project?
- Does the customer supply you with a glossary you can feed into the system?
- In case the customer does not supply the terms, how do you pinpoint, research, and feed the terms into the system, and can you do that in languages you don't use?
For point 1: What is the source of terminology information for a typical project
Analyze the source language content:
- First, terminology identification is no different from what a human translator (HT) would do for a project (not speaking for Statistical MT systems). However, more time is spent identifying a longer list of terms than you would for a typical HT project, in order to reduce the number of potential mistranslations by the MT system.
- Second, the real difference is at the terminology analysis stage. For an HT project, you want to create a list of terms and create the best translation per term. For an MT project, you want to create a list of candidate terms, but then carefully optimize it (understanding how the parts of the terms interact with the natural grammar rules of the MT system) in order not to create terminology/dictionary entries that the system would naturally produce on its own. You also try to pinpoint specific words that, when coded in 1 or more ways, can work in combination with the grammar rules and allow the posteditor to choose one or the other interactively during the translation processing phase. It really is not as complicated as it seems, and with a lot of practice, you can identify the most frequent terms and their related terms, and enrich the dictionary so that the time spent is more than repaid by the time saved on the postediting task.
But don't create too many terms in the dictionary. It's a temptation, and many have done it that way. It's a possible path to take, but it does not necessarily produce better quality for the time spent.
Some very concrete examples of terminology are in the target language output of my article indicated below on MT dictionary building.
For point 2: Does the customer supply you with a glossary you can feed into the system
Often not. I've described in my articles on the topic that the key is to do a good analysis of the source text, identify all terminology/dictionary candidates, create a list of them (in Excel), and reduce the list down to those that are really needed. One of my projects below shows that I reduced the candidate list of 1100 terms down to 570 in total: terms to translate and terms not to translate (the latter also need to be coded, because the MT system does try to translate them, but wrongly).
So the terminology analysis method allowed me to reduce the terminology translation phase time and costs by nearly 50%, but to achieve the same general level of quality with the same list of terms.
One important item is that if a term only appears once, and it or related terms will be very infrequent, then it's generally not worth creating the dictionary entry. Rather catch it during the postediting phase.
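The frequency rule described here (skip dictionary entries for terms that occur only once and leave them for postediting) can be sketched roughly as follows. This is only an illustration, not the procedure from the articles, which was done manually in Word and Excel; the function names, the `min_count` threshold and the sample text are all hypothetical.

```python
import re
from collections import Counter


def count_terms(text, candidates):
    """Count case-insensitive whole-phrase occurrences of each candidate term."""
    lowered = text.lower()
    counts = Counter()
    for term in candidates:
        # Lookarounds keep "seal" from matching inside a longer word such as "sealant".
        pattern = r"(?<!\w)" + re.escape(term.lower()) + r"(?!\w)"
        counts[term] = len(re.findall(pattern, lowered))
    return counts


def split_candidates(text, candidates, min_count=2):
    """Split candidates into terms worth a dictionary entry and terms to
    leave for postediting. min_count=2 is an assumed threshold."""
    counts = count_terms(text, candidates)
    keep = [t for t in candidates if counts[t] >= min_count]
    defer = [t for t in candidates if counts[t] < min_count]
    return keep, defer


# Hypothetical source text and candidate list, for illustration only.
sample = (
    "Open the relief valve. Check the relief valve seal. "
    "Replace the flow sensor if the relief valve leaks."
)
keep, defer = split_candidates(sample, ["relief valve", "flow sensor", "seal"])
print("dictionary entries:", keep)
print("catch during postediting:", defer)
```

A plain count like this ignores the "related terms" part of the rule, so in practice the deferred list would still need a human look before anything is dropped.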
For point 3: how do you pinpoint, research, and feed the terms into the system
This is described in the 6-page paper on the project I mentioned: pages 3-4 give all the details you are looking for.
http://www.geocities.com/mtpostediting/Jeff-Allen-AMTA2004-paper_v1.01.pdf
Ah, sorry, I said 7.5 hours for all. It was 7.5 hours for the first set of 4000 words of content, and then 1.5 hours for the additional 4000 words.
And a more in-depth project conducted in 2006:
http://www.geocities.com/mtpostediting/MT-dictionary-building-casestudy-jeff-allen_v100.pdf
Page 5 describes in detail (all logged in an Excel sheet) the tasks conducted during 32 separate sessions, each ranging from 10 min to 90 min. If you read that page with the Excel sheet, you can see the sequence of steps very clearly and how much time each step took. I also indicated the number of dictionary entries completed during each session.
The most important thing about this project in 2006 is that I "intentionally" conducted the entire terminology identification and analysis steps in a very manual way with Word and Excel. The purpose was to show a baseline time, which can very easily be sped up significantly by using any of the terminology extraction tools available on the market today. It was important to show that the identification and analysis methodology is the key, and that other tools can simply accelerate the process, but will not be a replacement for the method itself.
Point 4: can you do that in languages you don't use?
The source content identification and analysis is the same. It can be adapted based on the target language, but you can always start with the same initial terminology candidate list.
And you always need an HT to do the terminology translation. I described a way to put this into place, for a language I do not know, with a step-by-step procedure, in:
Getting started with Machine Translation
https://www.multilingual.com/downloads/screenSupp69.pdf
And as for preparation time, all of the time and effort was logged and documented on these projects in those articles. And the overall translation time is always less than it would be at the standard translation speed of 2000-3000 words per day (not using TM or speech recognition software).
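As a rough check on that comparison, the figures quoted in this thread can be put on a common words-per-hour scale. This is a sketch under stated assumptions: the 8-hour working day is not given anywhere in the thread, project A covers only the first postediting pass, and the posts do not say whether the corrected 7.5 + 1.5 hours for project B include the 30 minutes of postediting.

```python
def words_per_hour(words, hours):
    """Simple throughput: words processed per hour of logged effort."""
    return words / hours


# Figures as stated in the posts above.
project_a = words_per_hour(6000, 6)                  # EN>FR first postediting pass, no dictionary building
project_b = words_per_hour(4000 + 4000, 7.5 + 1.5)   # FR>EN, using the corrected hours

HOURS_PER_DAY = 8  # assumption: working-day length is not stated in the thread

# "Standard" human translation speed of 2000-3000 words per day, per hour.
baseline_low = 2000 / HOURS_PER_DAY
baseline_high = 3000 / HOURS_PER_DAY

print(f"Project A: {project_a:.0f} words/hour")
print(f"Project B: {project_b:.0f} words/hour")
print(f"Baseline:  {baseline_low:.0f}-{baseline_high:.0f} words/hour")
```

On these assumptions both projects come out well above the baseline range, but the numbers rest on one person's logged time as a subject matter expert with strong command of the MT tool, so they say little about other setups.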
Important to note that it is also possible to combine TM and speech tools with MT. There are however different levels of (in)compatibility between the various TM and MT software packages, so it requires fine-tuning the workflow process to use them together.
Jeff