Tuesday, 7 December 2010

Improving translation consistency with dictionary display in Rosetta

MULTIPLE DICTIONARIES

A few weeks ago I read an article about 10 problems that should be solved in Ubuntu. One of the problems was that there are two strings for the same thing: Recycle bin and Trash can. This really made me smile. While everyone agrees this is a problem, it is nothing compared to the challenges non-English users face. For almost every English word there are multiple translations in use in Slovenian FLOSS software (the situation is similar in the other languages I am at least partially fluent in). Additionally, there are many strings that were invented to describe a function of the software and don't exist outside the software world (I am talking about strings such as anycast or demultiplexer). These strings have no direct translations. To make things even more interesting, some strings have multiple meanings in English (such as N/A, which can mean either not available or not applicable).

If we are to offer the same quality of Ubuntu experience to non-English users, we need to unify translation strings. The question "How?" immediately pops up. Use a dictionary. Which one? Google Translate? It is not specialized for FLOSS software (and is sometimes very wrong). Use the national Evoterm dictionary? It looks too Windows-oriented. Then use some FLOSS dictionary. But which one? As always with volunteer projects, people have various opinions and hence sometimes choose different options when translating.

In the past there were separate Slovenian dictionaries (in some cases more a set of strings accepted within the project than a formally published dictionary) for Gnome, KDE, Firefox, OpenOffice etc., and each Linux distribution also used its own translation strings. Each group had its own dictionary, and these dictionaries were at least slightly incompatible with one another.

UNITING DICTIONARIES

To solve that problem we decided to adopt a common, unifying dictionary. To speed up the word approval process we decided to start on a small scale. This was about one year ago. Initially there was cooperation between the Gnome and Ubuntu translator groups, and later KDE also joined the fun :). The result is a dictionary, Pojmovnik (please ignore the certificate warning), that is ambitiously trying to combine all of the local FLOSS dictionaries into a single dictionary. Currently it holds some 860 words, which means more than two words have been added per day on average.

TRANSLATION QUALITY ASSURANCE

Of course the dictionary needs to be used, old translations need to be fixed, and terminology needs to be synchronized. To do that we spent the majority of the Gnome 2.32 and Ubuntu 10.10 release cycle reviewing all .po files, removing old terminology and adding new, and also fixing hundreds of minor bugs such as misspellings that had lingered for ages. As a result, Ubuntu 10.10 is dramatically more polished language-wise than previous releases. The work was relatively easy for Gnome packages, where we could use translation tools. Of course they weren't 100% effective, so a line-by-line review was still required. In Launchpad, however, there was no possibility of using such tools.
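To give an idea of what a tool-assisted terminology review can look like, here is a minimal Python sketch. It is not the tooling we actually used. It assumes the msgid/msgstr pairs have already been extracted from a .po file, and the function name, the glossary format and the sample strings are all invented for illustration. Because Slovenian words are inflected, the glossary maps each English term to a stem of the accepted translation rather than to the full word, which is a crude workaround and will still produce false alarms.

```python
import re

def find_suspect_entries(entries, glossary):
    """Return (msgid, msgstr, term) triples that deserve a manual look.

    entries  -- iterable of (msgid, msgstr) pairs taken from a .po file
    glossary -- dict mapping an English term to a stem of the accepted
                translation (hypothetical format, e.g. {"file": "datotek"})
    """
    suspects = []
    for msgid, msgstr in entries:
        if not msgstr:
            continue  # untranslated strings are a different problem
        for term, stem in glossary.items():
            # Whole-word, case-insensitive match of the English term.
            # Plurals such as "files" are NOT matched by this simple pattern.
            term_used = re.search(rf"\b{re.escape(term)}\b", msgid, re.IGNORECASE)
            if term_used and stem.lower() not in msgstr.lower():
                suspects.append((msgid, msgstr, term))
    return suspects
```

Calling `find_suspect_entries(pairs, {"file": "datotek"})` returns only the entries whose English text mentions "file" while the translation lacks the accepted stem. The result is a list of hints for a human reviewer, not a list of confirmed errors.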

Additionally, we wanted a way to easily suggest accepted terminology within Launchpad, as switching tabs and searching for strings is not practical. Besides improving translation quality, this could significantly increase the pace of translation.

We also discussed this at #ubuntu-translators but couldn't find a fast and easy solution.

DICTIONARY DISPLAY IN ROSETTA

Then we started searching for alternatives and found a nice Firefox extension, EHTip, which is perfect for solving our problem. The extension uses a local dictionary that can be prepared with a simple syntax: "English word=Foreign word". Each string has to be on its own line and the file has to be saved in UTF-8 encoding, for example with Gedit. One of our tech-savvy users created a bash script to transform the dictionary wiki page into the syntax required by the Firefox extension. An advantage of the extension is that it can also be used on other online translation pages such as Pootle. It's worth noting that the extension author is planning to have a Chrome/Chromium extension done by January 2011. We have also created detailed step-by-step instructions with screenshots on how to set up this solution. Several translators have started using it since (we created the final version only last week), and they report that it has made their translation workflow much smoother and more efficient. Such success in such a short time is due to its simplicity of use: move the mouse over an English word and a pop-up with the translation appears. It can't be easier. See image below.
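For teams that want to prepare such a dictionary file themselves, the format is simple enough to generate with a few lines of code. The following is a minimal Python sketch, not the bash script mentioned above. It assumes the glossary has already been extracted from the wiki as (English, translation) pairs, and the function names and the skip rules are my own assumptions rather than requirements of the extension.

```python
from pathlib import Path

def to_ehtip_lines(pairs):
    """Turn (english, translation) pairs into 'English word=Foreign word' lines."""
    lines = []
    seen = set()
    for english, translation in pairs:
        english, translation = english.strip(), translation.strip()
        # Skip incomplete entries and terms that would break the key=value syntax.
        if not english or not translation or "=" in english:
            continue
        key = english.lower()
        if key in seen:
            continue  # keep only the first translation of each term
        seen.add(key)
        lines.append(f"{english}={translation}")
    return lines

def write_dictionary(pairs, path):
    """Write one entry per line, encoded as UTF-8."""
    text = "\n".join(to_ehtip_lines(pairs)) + "\n"
    Path(path).write_text(text, encoding="utf-8")
```

For example, `write_dictionary([("file", "datoteka"), ("folder", "mapa")], "pojmovnik.txt")` would produce a two-line file in the required format. The file name here is only an example.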



As we are very proud of our solution, we are sharing it with everyone so that other translation teams can use it as well.

If you are a translator, please let us know how you handle this. Is terminology a problem in your language? How are you trying to solve it, and how successful are you? Please share your thoughts in the comments, as we are very keen to hear them. :)
