Submissions/Fast localization of articles to catch up with major Wikipedias

From Wikimania 2013 • Hong Kong

After careful consideration, the programme committee has decided not to accept the submission below at this time. Thank you to the author(s) for participating in the Wikimania 2013 programme submission process; we still hope to see you at Wikimania this August.

Submission no.
1013
Subject no.
A2
Title of the submission

Fast localization of articles to catch up with major Wikipedias

Type of submission

presentation

Author of the submission

Marek Blahuš

Country of origin

Czech Republic

Affiliation

Esperanto and Free Knowledge, Education@Internet, Wikimedia Czech Republic, Wikimedia Poland scholarship

E-mail address

wikimania2013@blahus.cz

Username

Blahma

Personal homepage or blog
Abstract

Minor Wikipedias are having a tough time catching up with Wikipedias in major languages, in terms of both size and quality. Translation of articles could be made faster by letting editors review high-quality machine translations. If the interface is easy to use (seamlessly integrated into Wikipedia) and the machine can learn from the human-made corrections, both readers and technology profit. We present the results of a Wikimedia workshop in which the WikiTrans machine translation engine was incorporated into Wikipedia by means of a gadget. Relevant general challenges are also discussed.

Detailed proposal

Because of factors such as a limited number of speakers or the negative influence of the digital divide, most Wikipedias will never be able to catch up with the English Wikipedia in size and quality. Similarly, Wikipedias in minor languages may never escape the vicious circle of being too small to attract the contributors who would help them grow. This is especially true in regions where a few prominent supraterritorial languages overshadow a multitude of local languages, as is the case in the Asia-Pacific region and Africa. Machine translation has often been suggested as a technological solution to this kind of language inequality; however, to the best of our knowledge, no current implementation of machine translation in Wikipedia has yet provided both high-quality translations and ease of use at the same time. The presentation sums up the conclusions of a Wikipedia workshop on machine translation held on May 1-5, 2013, in Slovakia. There, Wikimedians, programmers and the author of the WikiTrans machine translation engine came together and designed a Wikipedia gadget that is easy to use and integrates the machine translation software seamlessly into the encyclopedia. Whenever a user of a non-English Wikipedia tries to access a nonexistent article, its machine translation from English is shown instead, and the user can start revising the machine-translated text with a single click. All editing takes place within the Wikipedia page and a preview is available at any moment, yet the solution still lets the machine translation engine collect valuable data from the human revision of its original output. As a result, both parties profit from the integration: Wikipedia obtains a new article, and the translation engine receives feedback from which it can learn to translate better the next time.
Because WikiTrans is built on the Constraint Grammar paradigm, targeted improvements to its algorithm are possible (unlike with statistical machine translation). Its current performance in translating from English to Esperanto (http://epo.wikitrans.net/) has been assessed by the community as surpassing the threshold beyond which it is easier to revise a machine translation than to translate from scratch. Related challenges, such as language-relative notability, terminological constraints and intercultural translatability, will also be discussed.
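
The gadget workflow described above (fall back to a machine translation when the local article is missing, then log what the human reviser changed) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the workshop's actual gadget code: the names `Page`, `FeedbackLog` and `open_page` are hypothetical, the `translate` argument stands in for a call to a translation engine such as WikiTrans, and articles are assumed to share the same title key in both languages.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Page:
    title: str
    text: str
    is_machine_draft: bool  # True when the text is an unreviewed machine translation


@dataclass
class FeedbackLog:
    """Hypothetical store of (machine fragment, human correction) pairs."""
    corrections: List[Tuple[str, str]] = field(default_factory=list)

    def record(self, machine_text: str, revised_text: str) -> None:
        # Compare word by word and keep only the fragments the reviser changed.
        mt_words = machine_text.split()
        rev_words = revised_text.split()
        matcher = SequenceMatcher(a=mt_words, b=rev_words, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag != "equal":
                self.corrections.append(
                    (" ".join(mt_words[i1:i2]), " ".join(rev_words[j1:j2]))
                )


def open_page(
    title: str,
    local_articles: Dict[str, str],
    english_articles: Dict[str, str],
    translate: Callable[[str], str],
) -> Optional[Page]:
    """Return the local article, or a machine-translated draft if it is missing."""
    if title in local_articles:
        return Page(title, local_articles[title], is_machine_draft=False)
    if title in english_articles:
        # Assumption: the same title identifies the article in both wikis.
        draft = translate(english_articles[title])
        return Page(title, draft, is_machine_draft=True)
    return None  # nothing to show in either language
```

In this sketch, a reviser's saved text would be passed to `FeedbackLog.record` together with the original draft, so that the engine's maintainers receive only the changed fragments rather than whole articles. Word-level diffing is a simplification chosen for brevity; how WikiTrans actually consumes corrections is not specified here.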

Track
  • Wikis in Asia
Length of presentation/talk

25 Minutes

Language of presentation/talk

English

Will you attend Wikimania if your submission is not accepted?

Yes

Slides or further information (optional)
Special requests


Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with four tildes. (~~~~).

  1. আমীর ঈ. অহরোনি (talk) 20:27, 5 March 2013 (UTC)
  2. Leysan Gilmutdinova (talk) 09:58, 14 March 2013 (UTC)
  3. To what extent is it possible? I would like to know the history of what has already been done. Blue Rasberry (talk) 20:12, 1 April 2013 (UTC)