Submissions/Datafying Wikimedia

From Wikimania 2013 • Hong Kong
See also: Submissions/Wikimetrics for a maturing encyclopaedia, Submissions/The UserMetrics API: Measuring participation in Wikimedia projects

This is an accepted submission for Wikimania 2013.

Watch session on YouTube

Submission no.
Subject no.
Title of the submission
Datafying Wikimedia: Data products and services to empower our communities.
Type of submission
Author of the submission
  • Diederik van Liere
  • Dario Taraborelli
Country of origin
  • Canada
  • USA
E-mail address
Personal homepage or blog

Building privacy-aware services and products around the data that the Wikimedia projects collect and generate is one of the priorities of the Analytics Team. We would like to discuss a participatory design model for these services and products: gathering requirements and design ideas at an early stage and involving community members throughout the entire lifecycle of each new service (designing, prototyping, testing, productizing and extending it). This session will provide a forum for community members and Wikimedia staffers to discuss a viable model for data product design and prioritization, to brainstorm individual data product ideas, use cases and requirements, and to discuss the overall privacy and data licensing implications of these services.

Detailed proposal

WMF aspires to become a more data-inspired organization. To achieve this goal, we have started developing new tools and infrastructure to help us collect and analyze data about Wikimedia projects. Examples of these new tools include EventLogging (a general-purpose MediaWiki extension to measure how users interact with MediaWiki's interface), Limn (a visualization tool to generate charts and easily embeddable reports in a user-friendly way), UserMetrics (a platform to measure user activity and perform cohort analysis) and Kraken (a distributed computing and data-services platform).

In this session, we will provide a brief showcase of existing projects, discuss processes and tools for participating in the design of new services, and illustrate three types of new services/products whose user stories are being drafted by the Analytics Team:

  • Pageview APIs (a public API for pageviews and a near-real-time pageview stream)
  • RecentChanges analytics (a tool to analyze and filter historical RecentChanges data)
  • Task repository API (an API exposing articles in need of various kinds of help)

Finally, we would like to solicit feedback on the privacy and data licensing implications of existing and future analytics projects.

  • Technology and Infrastructure
Length of presentation/talk
50 minutes (given the scope of the proposal, we would like to request a long session, to host both a presentation and a roundtable discussion)
Language of presentation/talk
Will you attend Wikimania if your submission is not accepted?
Slides or further information (optional)
Special requests

Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers decide which sessions are of high interest. Sign with four tildes (~~~~).

  1. Howief (talk) 20:05, 29 April 2013 (UTC)
  2. Daniel Mietchen (talk) 00:55, 30 April 2013 (UTC)
  3. Ocaasi (talk) 16:20, 30 April 2013 (UTC)
  4. Multichill (talk) 14:48, 4 May 2013 (UTC)
  5. LVilla (WMF) (talk) 05:49, 5 May 2013 (UTC)
  6. Nennes (talk) 18:57, 1 June 2013 (UTC)