How often do people around the world struggle to find websites translated into their native languages, especially if they don’t speak English? About a year ago, Amazon announced that it would add 22 new “rare” languages to its translation service. This news reflects an overall trend: a growing number of such languages are being brought into automatic translation.
Machine translation (MT) engines require enormous bilingual corpora, which tend to be gathered from the web. However, online content is not distributed equally across languages. This is why less content is available online in Hindi or Slovenian (“low-resource languages”) than in English or German (“high-resource languages”), even though Hindi is one of the most widely spoken languages in the world!
High-resource languages are well suited to MT because training data is plentiful. However, low-resource languages are now gaining ground as well, for the following reasons:
1. Developing MT technologies
Neural networks and deep learning are two cornerstones of contemporary MT. Neural MT works well and grows more accurate over time, even for grammatically dissimilar language pairs such as Japanese and English, where it achieves an impressive level of quality.
2. Focus shift and market diversification
Because neural MT for high-resource languages is already well developed, investing in low-resource languages has become feasible. Meanwhile, alongside the development of neural MT, many new market players have appeared that focus on a specific content type or on less widespread languages (e.g. Baidu for Chinese or Yandex for Russian).
Technology has always helped us overcome language barriers. Now the time has come for Eastern Europe, Scandinavia, and Asia: foreign companies targeting these markets will soon see localization costs fall as MT grows.
To learn more details, you may check out the source article on the topic here.