Moonsheep digitizes massive collections of documents into structured data through crowdsourcing and cutting-edge technology.
Poland and Hungary are among the few remaining European countries where asset declaration forms can only be filled in by hand and are then scanned and published. This makes digital analysis of the documents difficult and, as a result, over 90% of submitted documents are never checked. Moonsheep is a tool that lets volunteers transcribe scanned documents and build massive collections of transcribed data from them. The information is then converted into spreadsheets, CSV files, or JSON APIs.
- Governments can bring their archived data into the digital age and move towards greater transparency and efficiency;
- NGOs can more easily investigate and analyse large amounts of paper and PDF sources, keep governments accountable, uncover corruption, and track money flows;
- GLAM (galleries, libraries, archives and museums) can improve metadata and digitise content, to extract new meaning from their exhibits and items;
- Researchers can crowdsource and automate their data collection, to broaden their research horizon.
The tool was co-created with support from TransparenCEE Network and TechSoup. Its creators also partnered with organisations that have past experience in the topic: Engine Room, who organized two replication sprints and performed a thorough evaluation of existing tools; Open Data Kosovo, who supported Engine Room in the Quien Compro implementation and recently created Decode Darfur (a microtasking website for Amnesty International); and K-Monitor, who have practical experience with transcribing and verifying data using Vagyonnyilatkozatok. The co-creation process is described in more detail by TransparenCEE Network on their website.
DSI4EU’s ePaństwo Foundation is currently supporting organisations in Hungary, Ukraine, Poland, Romania and Russia to implement the tool to tackle local needs.
Moonsheep’s impact and reach depend on its ability to engage users. Through their work, K-Monitor showed that building an open database is possible with support from volunteers: “With a few dozens volunteers we liberated the data (more than 2000 pages of scanned PDFs) just hours after the publication and published it in an open, searchable and comparable database,” said Attila Juhasz from K-Monitor.
The tool is now being scaled to Romania (Code for Romania) and Ukraine (OPORA), and its creators are testing different funding models, including commercializing some services.
Founded in 2017 by the TransparenCEE Network. International project coordinated by ePaństwo Foundation.
Case study date: June 2018