Submissions/Fast localization of articles to catch up with major Wikipedias
After careful consideration, the programme committee has decided not to accept the submission below at this time. Thank you to the author(s) for participating in the Wikimania 2013 programme submission; we hope to still see you at Wikimania this August.
- Submission no.
- Subject no.
- Title of the submission
Fast localization of articles to catch up with major Wikipedias
- Type of submission
- Author of the submission
- Country of origin
Esperanto and Free Knowledge, Education@Internet, Wikimedia Czech Republic, Wikimedia Poland scholarship
- E-mail address
- Personal homepage or blog
Minor Wikipedias are having a tough time trying to catch up with Wikipedias in major languages, both in size and in quality. Translation of articles could be made faster by giving editors high-quality machine translations to review. If the interface is easy to use (seamlessly integrated into Wikipedia) and the machine can learn from the human-made corrections, both readers and the technology profit. We present the results of a Wikimedia workshop in which the WikiTrans machine translation engine was incorporated into Wikipedia by means of a gadget. Relevant general challenges are also discussed.
- Detailed proposal
Owing to factors such as a limited number of speakers or the negative influence of the digital divide, most Wikipedias will never be able to catch up with the English Wikipedia in size and quality. Similarly, Wikipedias in minor languages may never leave the vicious circle of being too small to attract the additional contributors who would help them grow. This is especially true in regions where a few prominent supraterritorial languages overshadow a large number of local languages, as is the case in the Asia-Pacific region or Africa. Machine translation has often been suggested as a technological solution for this kind of language inequality; however, to the best of our knowledge, no current implementation of machine translation in Wikipedia has yet provided both high-quality input and ease of use at the same time. The presentation sums up the conclusions of a Wikipedia workshop on machine translation held on May 1-5, 2013, in Slovakia. There, Wikimedians, programmers and the author of the WikiTrans machine translation engine came together and designed a Wikipedia gadget that is easy to use and gives the experience of seamless integration of the machine translation software into the encyclopedia. Whenever the user of a non-English Wikipedia attempts to access a nonexistent article, its machine translation from English is shown instead, and the user can start revising the machine-translated text with a single click. All editing takes place within the Wikipedia page and a preview is possible at any moment, yet the solution still allows the machine translation engine to collect valuable data from the human revision of its original output. As a result, both parties profit from such integration: Wikipedia obtains a new article, and the translator receives feedback from which it can learn to translate better the next time.
Because WikiTrans is built on the Constraint Grammar paradigm, specifically crafted improvements to its algorithm are possible (unlike with statistical machine translation). Its current performance in translating from English to Esperanto (http://epo.wikitrans.net/) has been assessed by the community as surpassing the threshold beyond which it is easier to revise a machine translation than to start translating from scratch. Related challenges, such as language-relative notability, terminological constraints and intercultural translatability, will also be discussed.
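The workflow described in the proposal (show a machine translation when an article does not exist, let the user revise it in place, and return the revision to the engine as feedback) can be sketched roughly as follows. This is only an illustrative model under stated assumptions, not the gadget's actual code: the class `TranslationGadget`, its methods, and the stand-in `toy_translate` function are all hypothetical, and the wiki is simulated with an in-memory dictionary rather than real Wikipedia or WikiTrans calls.

```python
from dataclasses import dataclass, field
from typing import Callable


def toy_translate(title: str) -> str:
    """Hypothetical stand-in for a machine translation engine."""
    return f"[MT draft of {title}]"


@dataclass
class TranslationGadget:
    """Illustrative model of the fallback-and-feedback flow (all names assumed)."""

    # Simulated local wiki: article title -> article text.
    local_articles: dict
    # Any function that returns a machine-translated draft for a title.
    machine_translate: Callable[[str], str]
    # Collected (title, machine output, human revision) triples.
    feedback: list = field(default_factory=list)

    def view(self, title: str) -> dict:
        """Return the local article, or a machine draft if it does not exist."""
        if title in self.local_articles:
            return {"source": "local", "text": self.local_articles[title]}
        return {"source": "machine", "text": self.machine_translate(title)}

    def save_revision(self, title: str, machine_text: str, revised_text: str) -> None:
        """Store the revised article and keep the pair for the engine to learn from."""
        self.local_articles[title] = revised_text
        self.feedback.append((title, machine_text, revised_text))


# Example run: one existing article, one missing article that gets revised.
gadget = TranslationGadget({"Kato": "Kato estas besto."}, toy_translate)
draft = gadget.view("Hundo")  # article is missing, so a machine draft is shown
gadget.save_revision("Hundo", draft["text"], "Hundo estas besto.")
```

Keeping the machine output next to the human revision is what would let a rule-based engine trace a correction back to the rule that produced the error; how WikiTrans actually consumes such data is not specified in the proposal.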
- Wikis in Asia
- Length of presentation/talk
- Language of presentation/talk
- Will you attend Wikimania if your submission is not accepted?
- Slides or further information (optional)
- Special requests
If you are interested in attending this session, please sign with your username below. This will help reviewers decide which sessions are of high interest. Sign with four tildes (~~~~).