Differences between versions of Lucene tokenizers
Thread poster: CafeTran Training (X)
Netherlands
Jul 10, 2016

When I create a new project in OmegaT 3.6, it uses version 3.0 of the Lucene tokenizer for the source language, German, and version 3.6 for the target language, Dutch.

Since I can also manually select version 3.6 of the Lucene tokenizer for German as the source language, I'd like to know what the differences are between versions 3.0 and 3.6 of the Lucene tokenizer for German.


 
Didier Briel
France
3.0 provides better stemming
Jul 10, 2016

CafeTran Training wrote:
When I create a new project in OmegaT 3.6, it'll use version 3.0 of the Lucene tokenizer for the source language German and version 3.6 for the target language Dutch.

Since I see that I can also manually select version 3.6 of the Lucene tokenizer for the source language German, I'd like to learn what the differences are between the versions 3.0 and 3.6 of the Lucene tokenizer for German.

According to translators who work from German, 3.0 uses a better stemming algorithm than 3.1 and later.

You can read the thread that prompted making this behaviour configurable here:
https://groups.yahoo.com/neo/groups/OmegaT/conversations/topics/28375

In OmegaT 4.0, selecting the behaviour won't be necessary. All the tokenizers perform correctly except German, for which we found a way of replicating the tokenizer 3.0 behaviour.

Didier
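
For readers wondering why the choice of stemmer matters for glossary and fuzzy matching, here is a minimal sketch. It is only an illustration: the two stemmers below (`light_stem` and `folding_stem`) are hypothetical toy rules invented for this example and do not reproduce the algorithm of any Lucene version. The point is just that two stemmers can reduce the same German word forms to different stems, which changes which words are treated as matches for a term.

```python
# Toy illustration only: these rules are NOT Lucene's German stemmers.
SUFFIXES = ("en", "er", "es", "e", "n", "s")
MIN_STEM_LEN = 4  # assumed threshold so short words such as "Haus" stay intact


def light_stem(word: str) -> str:
    """Hypothetical stemmer: lowercase, then strip at most one suffix."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= MIN_STEM_LEN:
            return w[: -len(suffix)]
    return w


def folding_stem(word: str) -> str:
    """Hypothetical stemmer: fold umlauts first, then apply light_stem."""
    folded = word.lower().replace("ä", "a").replace("ö", "o").replace("ü", "u")
    return light_stem(folded)


def matches(term: str, words: list[str], stem) -> list[str]:
    """Return the words whose stem equals the stem of the glossary term."""
    target = stem(term)
    return [w for w in words if stem(w) == target]


segment_words = ["Haus", "Hauses", "Häuser"]
light_hits = matches("Haus", segment_words, light_stem)
folding_hits = matches("Haus", segment_words, folding_stem)
print(light_hits)
print(folding_hits)
```

With these toy rules, the first stemmer misses the plural "Häuser" for the term "Haus", while the second one finds all three forms. Real stemmers differ in subtler ways, but the effect on matching is of the same kind.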


 

