Submissions/Cantonese and other Chinese dialect Wikipedias: successes and challenges

From Wikimania 2013 • Hong Kong

After careful consideration, the programme committee has decided not to accept the submission below at this time. Thank you to the author(s) for participating in the Wikimania 2013 programme submission process; we hope to still see you at Wikimania this August.

Submission no.
1009
Subject no.
A5
Title of the submission
Cantonese and other Chinese dialect Wikipedias: an introduction in English
Type of submission
presentation
Author of the submission
David Chan (陳惠明)
Country of origin
UK, Wales
Affiliation
Bangor University
E-mail address
d.chan@bangor.ac.uk
Username
Divec
Personal homepage or blog
http://techiaith.bangor.ac.uk
Abstract

Mandarin is China's dominant dialect, but there are over 300 million speakers of other varieties of Chinese. These varieties can be written, but speakers of all varieties tend to use Standard Written Chinese (based on Mandarin) in formal contexts. Written Cantonese and the other written varieties can therefore be seen as minority languages.

Cantonese Wikipedia breaks new ground in formal Written Cantonese publication. This unique resource documents Cantonese vocabulary, and can assist visually impaired readers, Cantonese learners and language researchers. Other Chinese dialect Wikipedias address the additional challenge of standardizing the dialect's written form.

This talk outlines the successes of Chinese dialect Wikipedias, and the challenges facing them.

Detailed proposal

This presentation will look at:

  • What "Chinese" is, how the dialects differ, and how Chinese writing works
  • Cantonese Wikipedia
    • How it breaks new ground
    • How it documents Cantonese technical vocabulary
    • Why it is a uniquely valuable resource
    • How it helps visually impaired speakers
    • How it helps Cantonese learners and language researchers
    • Whether it causes sociolinguistic change
  • Other Chinese dialect Wikipedias, and orthographic change
  • What can be learned from non-Chinese small Wikipedias such as Welsh Wikipedia

Chinese divides into several major dialect groups. The largest, Mandarin, acts as a lingua franca, and modern Standard Written Chinese is based on it. However, there are over 300 million speakers of other dialects which are not intelligible to Mandarin speakers.

Cantonese is one of the major Chinese varieties, spoken in Hong Kong, mainland China, and around the world. It has over 70 million speakers and a major international film industry. It also has a written form, Written Cantonese, which is used informally; Standard Written Chinese is preferred for formal contexts and communication with non-Cantonese speakers.

Cantonese Wikipedia can be regarded as a minority language project in spite of the huge number of Cantonese speakers: it is around half the size of Welsh Wikipedia, though Welsh has fewer than a million speakers. Network effects apply because Cantonese speakers can use the Standard Written Chinese Wikipedia. Moreover, sociolinguistic restrictions on the use of Written Cantonese tend to confine it to informal contexts or quoted speech. These project development issues mirror the experiences of other lesser-used Wikipedias.

Despite these issues (or because of them), Cantonese Wikipedia is a unique and valuable resource, providing an encyclopedic corpus of Written Cantonese unmatched by any other resource. It can be shown that editors place particular emphasis on documenting the rich technical and terminological vocabulary of Cantonese, which is otherwise primarily an oral phenomenon.

There are a number of other Chinese dialect Wikipedias (Min-Nan, Min-Dong, Gan, Wu, Hakka), offering a similar resource for other varieties of Chinese. These projects experience the same issues as Cantonese Wikipedia, but with an additional challenge: their varieties of Chinese have a much less well-established written form. As a result, there are greater difficulties around intelligibility and sociolinguistic acceptability. Viewed positively, this challenge makes these Wikipedias even more groundbreaking, since they provide a corpus of written dialect material where nothing comparable exists.

Track
Wikis in Asia
Length of presentation/talk
25 minutes
Language of presentation/talk
English
Will you attend Wikimania if your submission is not accepted?
Yes
Slides or further information (optional)
Special requests
would prefer to present on Saturday or Sunday


Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with four tildes (~~~~).

  1. ESanders (WMF) (talk) 23:14, 30 April 2013 (UTC)
  2. Sbouterse (talk) 00:35, 1 May 2013 (UTC)
  3. Graham87 (talk) 10:25, 5 May 2013 (UTC)
  4. Jeromy-Yu Chan, COIC (talk) 17:51, 5 May 2013 (UTC)
  5. Add your username here.