Very different analysis results of same source for different languages
Thread poster: Mikhail Kropotov
Mikhail Kropotov
Germany
English > Russian
Mar 9, 2015

I'm using MemoQ 2013 R2 to coordinate translations of the same source file into four languages: Russian, French, German and Spanish. The source file includes software strings in .po format for localization.

I've had the file completely translated into all four languages and all translations have been saved to their respective TMs. Then the development team had to change a couple of strings in the source file. Running statistics on the new updated file, which is essentially all the same as before, I get two very different kinds of analysis results.

For Russian, Spanish and French I get:

Type          Segments   Source words
=====================================
All                782           2962
Repetition           0              0
101%               765           2896
100%                12             35
95%-99%              0              0
85%-94%              0              0
75%-84%              2              7
50%-74%              2              9
No match             1             15

But for German, on the same source file, I get:

Type          Segments   Source words
=====================================
All                728           2698
Repetition          11             25
101%               227            807
100%               302            881
95%-99%             18             42
85%-94%             26            136
75%-84%             65            346
50%-74%            112            519
No match            21            206
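
A quick arithmetic check on the two tables above, as a minimal Python sketch. The names `ROWS_RU_ES_FR`, `ROWS_DE` and `breakdown_sum` are illustrative only and are not part of memoQ; the numbers are copied from the tables as posted. The idea is simply to sum the per-category rows and compare them with each "All" row.

```python
# Per-category rows copied from the tables above: (segments, source words).
ROWS_RU_ES_FR = {
    "Repetition": (0, 0),
    "101%": (765, 2896),
    "100%": (12, 35),
    "95%-99%": (0, 0),
    "85%-94%": (0, 0),
    "75%-84%": (2, 7),
    "50%-74%": (2, 9),
    "No match": (1, 15),
}
ROWS_DE = {
    "Repetition": (11, 25),
    "101%": (227, 807),
    "100%": (302, 881),
    "95%-99%": (18, 42),
    "85%-94%": (26, 136),
    "75%-84%": (65, 346),
    "50%-74%": (112, 519),
    "No match": (21, 206),
}

# "All" rows exactly as they appear in the post.
REPORTED_ALL_RU_ES_FR = (782, 2962)
REPORTED_ALL_DE = (728, 2698)


def breakdown_sum(rows):
    """Return (total segments, total source words) over all category rows."""
    segments = sum(seg for seg, _ in rows.values())
    words = sum(word for _, word in rows.values())
    return segments, words


if __name__ == "__main__":
    for label, rows, reported in (
        ("RU/ES/FR", ROWS_RU_ES_FR, REPORTED_ALL_RU_ES_FR),
        ("DE", ROWS_DE, REPORTED_ALL_DE),
    ):
        print(label, "breakdown sum:", breakdown_sum(rows), "reported All:", reported)
```

With the figures as posted, the German category rows add up to 782 segments and 2962 words, the same totals as for the other three languages, while the German "All" row reads 728 and 2698. That may mean the match distribution, rather than the overall volume, is the part that really differs.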

Could someone please explain this drastic difference, or maybe tell me where to look next to understand what causes it?

Thank you in advance for any ideas.

[Edited at 2015-03-09 15:11 GMT]
 
Rossana Triaca
Uruguay
English > Spanish
Segmentation Rules Mar 10, 2015

The first thing that comes to mind, given the different number of segments and words, is an issue with the segmentation rules.

Are you sure you are using the same segmentation rules on the source file for all the analyses? These are usually determined by the source language, but if you opened or edited the file and changed the language (or the encoding, or the wrapping) midway, that could explain the difference.
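
To illustrate why segmentation rules matter for the counts, here is a minimal Python sketch. The two regular expressions and the sample text are made up for the illustration; they are not memoQ's actual segmentation rules. The same text is split once at line ends only and once also after sentence-final punctuation.

```python
import re

# Hypothetical sample: two lines of UI text.
SAMPLE = "Save changes? Click OK to continue.\nFile not found: check the path."

# Hypothetical rule A: break only at line ends.
RULE_LINE = r"\n"

# Hypothetical rule B: also break after . ? ! : when followed by whitespace.
RULE_SENTENCE = r"(?<=[.?!:])\s+|\n"


def segment(text, pattern):
    """Split text at every match of pattern and drop empty pieces."""
    return [piece.strip() for piece in re.split(pattern, text) if piece.strip()]


if __name__ == "__main__":
    for name, rule in (("line breaks only", RULE_LINE), ("sentence breaks", RULE_SENTENCE)):
        pieces = segment(SAMPLE, rule)
        print(name, len(pieces), pieces)
```

The first rule yields 2 segments and the second yields 4 for the same text, so a TM filled under one rule set would offer few exact matches to a document segmented under the other.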


 
USTranslation
USA
English
Homogeneity/reps take precedence Mar 10, 2015

Hello, Mikhail!

Differences in analysis numbers may come from several things:

First, double-check that your TM really contains the source segments you are analyzing: use "Export to TMX", then view the exported file with Okapi Olifant (http://okapi.sourceforge.net/downloads.html).
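
If you prefer a scripted check over a viewer, the same idea can be sketched in Python with the standard library. This is only a sketch under assumptions: the function names, the inline sample TMX and the language codes are invented for the example, and a real export would be read from a file (for instance with `ET.parse`) and may contain inline tags.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Simplified, hypothetical TMX content standing in for a real export.
SAMPLE_TMX = """<tmx version="1.4">
  <header/>
  <body>
    <tu>
      <tuv xml:lang="en-US"><seg>Save changes?</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Änderungen speichern?</seg></tuv>
    </tu>
  </body>
</tmx>"""


def tmx_source_segments(tmx_text, source_lang="en"):
    """Collect the text of every <seg> whose <tuv> is in the source language."""
    root = ET.fromstring(tmx_text)
    sources = set()
    for tuv in root.iter("tuv"):
        lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
        if lang.lower().startswith(source_lang.lower()):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also picks up text nested inside inline elements.
                sources.add("".join(seg.itertext()).strip())
    return sources


def missing_from_tm(document_segments, tmx_text, source_lang="en"):
    """Return the document segments that have no identical source text in the TMX."""
    in_tm = tmx_source_segments(tmx_text, source_lang)
    return [s for s in document_segments if s.strip() not in in_tm]


if __name__ == "__main__":
    print(missing_from_tm(["Save changes?", "Open file"], SAMPLE_TMX))
```

Any segment this reports as missing cannot come back as a 100% or 101% match, so a long list for one language and a short one for the others would point at the TM rather than at the analysis settings.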

If all your translations are there, check the following:

The "Project TMs and corpora" checkbox may be unchecked, causing the analysis to bypass the TM.
The "Homogeneity" checkbox may be checked, enabling fuzzy matches from within the project itself, without a TM.
The "Repetitions take precedence over 100% matches" checkbox may be unchecked, causing all repetitions to be counted as 100% matches.

Make sure "Project TMs and corpora" is checked. "Homogeneity" should be unchecked. "Repetitions take precedence over 100%" should be checked. "Disable cross-file repetitions" should also be checked.

Some of these options may only be available in memoQ 2014 R2 but I'm not sure.

Good luck!

Nick Lambson
U.S. Translation Company