Building a flexible crowdsourcing platform using open standards
The National Library of Wales (NLW) has for many years run crowdsourcing projects that capture data about its collections from volunteers. Its crowdsourcing platform, known within the Library as ‘Torf’, has given rise to a number of very successful projects, particularly for the annotation of historic photographs, where volunteers are invited to identify people, events, places and so on.
The Library was an early adopter of IIIF and a leader in that community, and much of its digitised material is provided in IIIF form. This made NLW an ideal collaborator for building a generic crowdsourcing platform, one whose inputs and outputs both conform to open standards, that could meet the Library's requirements for Torf and also be used by other organisations. This underlying platform became known as ‘Madoc’, named after a figure from Welsh folklore who also resonated with our other key collaborator at the time: the Indigenous Digital Archive (IDA).
The Madoc platform was optimised for simple tasks but was also designed to be extensible to capture complex data, using a concept called the ‘capture model’ - a description of the information the contributors are to provide, and of the user interface the platform should generate to capture it. This design decision has been key in making Madoc powerful enough to support a wide range of crowdsourcing needs. That in turn has secured the key objective of adoption beyond the original collaborators, and has ensured that the platform continues to meet the founding organisations' needs as they have evolved: both NLW and IDA are now planning projects on Madoc 2.0, where they will benefit from additional product features and refinements.
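To make the capture model idea concrete, the following is a minimal sketch of how one description could drive both the data to be collected and the form shown to contributors. It is not Madoc's actual schema: the class names `Field` and `CaptureModel`, the widget names "text" and "region", and the example project are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """One piece of information a contributor is asked to provide."""
    name: str          # key under which the contributed value is stored
    label: str         # prompt shown to the contributor
    input_type: str    # hypothetical widget name, e.g. "text" or "region"
    required: bool = False


@dataclass
class CaptureModel:
    """Hypothetical capture model: what to collect and how to ask for it."""
    title: str
    fields: list[Field] = field(default_factory=list)

    def form_spec(self) -> list[dict]:
        """Derive a minimal form description that a UI layer could render."""
        return [
            {"name": f.name, "label": f.label,
             "widget": f.input_type, "required": f.required}
            for f in self.fields
        ]

    def missing_required(self, submission: dict) -> list[str]:
        """Return the names of required fields that are absent or empty."""
        return [f.name for f in self.fields
                if f.required and not submission.get(f.name)]


# Illustrative model for a photograph annotation project (invented example).
photo_model = CaptureModel(
    title="Identify a person in a photograph",
    fields=[
        Field("person", "Who is shown?", "text", required=True),
        Field("region", "Draw a box around them", "region", required=True),
        Field("notes", "Anything else?", "text"),
    ],
)
```

Because the form is derived from the model rather than written by hand, a more complex project would, in this sketch, only need a longer list of fields rather than new interface code.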
A typical NLW project
Digitised material that forms the subject matter of a typical Torf crowdsourcing project - archives, manuscripts, periodicals, printed books, photography and artwork collections - is published as IIIF by the Library. The data and content generated by users of the platform from that material are saved in the form of W3C Web Annotations.
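As an illustration of that output format, the sketch below builds a single annotation that tags a rectangular region of an image, following the W3C Web Annotation Data Model. The helper name `tag_annotation`, the example.org canvas URL and the tag text are placeholders, and the exact shape of the annotations Torf produces may differ.

```python
import json


def tag_annotation(canvas_id: str, xywh: tuple[int, int, int, int], text: str) -> dict:
    """Build a Web Annotation dict tagging a rectangular region of a canvas."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        # The contribution itself: a plain-text body.
        "body": {"type": "TextualBody", "value": text, "format": "text/plain"},
        # What it is about: a region of the IIIF canvas, given as a media fragment.
        "target": {
            "type": "SpecificResource",
            "source": canvas_id,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={x},{y},{w},{h}",
            },
        },
    }


# Placeholder canvas identifier and tag, for illustration only.
example = tag_annotation(
    "https://example.org/iiif/photo-1/canvas/1", (10, 20, 300, 400), "Harbour"
)
print(json.dumps(example, indent=2))
```

Since the target is an ordinary IIIF canvas URI plus a standard selector, an annotation like this can in principle be read by any tool that understands the two standards, not only by the platform that created it.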
Torf crowdsourcing projects focus on a single set of IIIF resources, e.g., a particular collection of photographs, or a chosen set of archive material. Each project has its own web identity (a site with a distinct theme or branding), capture models, and editorial content. Administrators of a project define the capture models that project should use to collect contributions from users.
Volunteers see a visual overview of the project material through thumbnails and hierarchical navigation. The structure of the source IIIF collections and manifests is reflected in the navigation of the project site. Every source image has a web page, and as they browse, volunteers can see the contributions and comments that others have already made on each image. They can log in and start to make annotation contributions of their own, and view how their actions change the progress totals for the whole project.
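A rough sketch of how site navigation might mirror the source material is shown below. It assumes IIIF Presentation API 3.0 style JSON in which child collections and manifests are nested under `items` and labels are language maps. The function names and sample data are invented, and real collections usually reference their children by `id`, so each child would need fetching first.

```python
def first_label(resource: dict) -> str:
    """Pick the first available string from a IIIF language-map label."""
    for values in resource.get("label", {}).values():
        if values:
            return values[0]
    return resource.get("id", "")


def nav_tree(resource: dict) -> dict:
    """Turn a Collection or Manifest into a nested navigation node."""
    node = {"title": first_label(resource), "type": resource["type"], "children": []}
    # Only collections contribute further navigation levels; a manifest is a leaf here.
    if resource["type"] == "Collection":
        node["children"] = [nav_tree(item) for item in resource.get("items", [])]
    return node


# Invented sample data with placeholder identifiers.
collection = {
    "id": "https://example.org/iiif/collection/top",
    "type": "Collection",
    "label": {"en": ["Photographs"]},
    "items": [
        {"id": "https://example.org/iiif/album-1/manifest", "type": "Manifest",
         "label": {"en": ["Album 1"]}},
        {"id": "https://example.org/iiif/collection/sub", "type": "Collection",
         "label": {"cy": ["Casgliad"]},
         "items": [
             {"id": "https://example.org/iiif/album-2/manifest", "type": "Manifest",
              "label": {"en": ["Album 2"]}},
         ]},
    ],
}

tree = nav_tree(collection)
```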
As mentioned above, NLW continues to be a committed user of Madoc, with new projects planned. One will convert a repository of Welsh folk songs from an index card system to an online database. The Library is also exploring how to enable volunteers to help improve the metadata describing the audiovisual material held in the National Broadcast Archive.
Main image: © Llyfrgell Genedlaethol Cymru / National Library of Wales