In 2012–2014, Vilnius University implemented the project "The Development of the English-Lithuanian-English and French-Lithuanian-French Machine Translation Systems Based on Statistical Methods", financed by the Structural Funds of the European Union. The project produced the machine translation (MT) system ALPMAVIS and made the online statistical MT service (https://www.versti.eu/) available to the public; the service is also accessible through the integrated Lithuanian language and writing resources information system "Raštija.lt" (www.raštija.lt). Creating and developing MT systems is a modern intellectual challenge of interest not only to the academic community but also to society as a whole, which relies on modern information technologies. In 2013, artificial neural networks were applied to MT, and the idea of using graphics processing units to compute neural networks opened opportunities to solve real tasks, including machine translation. Neural MT uses millions of artificial neurons; MT is increasingly associated with the development of artificial intelligence, and translation quality is steadily approaching that of a human translator.
The new opportunities led to the improvement of the machine translation system of Vilnius University. The project team, led by Dr. Arūnas Samuilis, completed a new project "Improvement and Development of Machine Translation Systems and Localization Services" and created a new open and free translation environment. The following work was performed:
New technologies and additional linguistic resources have been developed to improve the quality of previous MT systems:
- Newly developed solutions were integrated into the machine translation infrastructure of Vilnius University, enabling the MT system to learn automatically from translations edited by users. As a result, MT output improves each time a translation is edited. Importantly, users who translate and edit text benefit immediately, because no separate, time-consuming system training process is needed.
- Additional linguistic resources (texts and dictionaries, lists of terms, a Lithuanian thesaurus, pre-editing tools, controlled language methods, etc.) have been developed, processed and revised to improve the quality of the previous MT system.
- To make the translation system more versatile and more widely applicable (including for professional work), fluent full-text translation was supplemented by more accurate dictionary-based translation of individual words and phrases.
- While collecting and processing linguistic resources, special attention was paid to texts in the fields of medicine, law and communication.
- An MT plugin has been developed for the OpenOffice/LibreOffice office suite; it communicates with the www.versti.eu machine translation systems and translates users’ texts.
- Neural network technologies have been applied, making it possible to use them to raise the quality of the existing MT systems.
- The following machine translation language pairs have been installed in the existing infrastructure: Lithuanian-English-Lithuanian, Lithuanian-French-Lithuanian, Lithuanian-Polish-Lithuanian, Lithuanian-Russian-Lithuanian and Lithuanian-German-Lithuanian. These language pairs were selected based on the real needs of society.
- The MT infrastructure has been adapted to provide e-government services, because MT solutions and tools must be not only publicly available to users but also easily adaptable to such services. Programs that run on the client's computer and/or on a server have been developed; they translate the information published by institutions providing e-government services (for example, "epaslaugos.lt") and present it in the chosen language. Field-specific MT systems have also been developed and integrated into the service. Installation of infrastructure for the provision of MT e-services has been prepared.
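The idea in the first list item, that an edited translation benefits the user immediately without a separate training run, can be illustrated with a minimal Python sketch. It is not the ALPMAVIS or versti.eu implementation: it replaces model updating with a simple exact-match memory of user corrections, and the names `PostEditMemory`, `record_edit` and `placeholder_mt` are hypothetical.

```python
from typing import Callable, Dict


class PostEditMemory:
    """Toy wrapper that remembers user-corrected translations.

    Assumption: corrections are reused only for (normalised) exact matches.
    A real adaptive MT system would update the translation model itself.
    """

    def __init__(self, base_translate: Callable[[str], str]) -> None:
        self._base_translate = base_translate
        self._corrections: Dict[str, str] = {}

    @staticmethod
    def _key(source: str) -> str:
        # Normalise case and whitespace so trivially different inputs match.
        return " ".join(source.lower().split())

    def translate(self, source: str) -> str:
        # Prefer a stored user correction; otherwise call the base engine.
        corrected = self._corrections.get(self._key(source))
        if corrected is not None:
            return corrected
        return self._base_translate(source)

    def record_edit(self, source: str, edited: str) -> None:
        # Store the user's edited translation; it takes effect immediately.
        self._corrections[self._key(source)] = edited


def placeholder_mt(text: str) -> str:
    # Hypothetical stand-in for a real MT engine; it only tags the input.
    return f"[MT] {text}"
```

In this sketch, a call to `record_edit` changes the result of the very next `translate` call for the same sentence, which mirrors the "no separate training process" property described above, although only for sentences that have already been seen.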
Speech recognition and synthesis solutions developed in the VU project "Development of Lithuanian Speech Managed Services – LIEPA 2" have been integrated into the machine translation platform. Users of www.versti.eu can therefore enter Lithuanian text by voice, correct it, translate it into the chosen languages, hear it, correct the translated text and distribute it through other communication channels (e.g. transfer it to a text editor, send it by e-mail or share it through social channels).