My experience building a 100% Open Source based Open Data platform

During the great OKFest 2014, we were lucky to meet again the folks from EWMI and Open Development Cambodia (ODC), a not-for-profit organization advocating for transparency that we got to know during the Open Steps Journey. Since 2011, the team at ODC has been doing amazing work sharing data with journalists, researchers and human rights activists so they can count on openly licensed information to support their activities in the South-east Asian country. At OKFest, they told us about EWMI’s Open Development Initiative and their plans for what has now become the Open Development Mekong project, an open data and news portal providing content about 5 countries of the Mekong region in South-east Asia. Back then, they were looking for somebody who could give a hand with conceiving and implementing the platform. That’s how I got engaged in this challenging project, which has been keeping me busy for the last 9 months.

I’m writing this article to share my personal experience participating in a 100% Open Source project, within an agile and extremely collaborative environment, whose outcomes in terms of information, knowledge and code are meant to be reused by the community.

The project’s requirements and its architecture

ODC’s site already features lots of information, datasets and visualizations. The team has done great work getting the most out of WordPress, the CMS software the site is built upon. However, since the main expectations for this new iteration of the platform were to host much more machine-readable data and expose it through both a web interface and an API, a specific framework for storing, managing and exposing datasets was needed. After analysing the options available, we decided to implement an instance of CKAN, which is an Open Source solution for building Open Data portals. Coordinated by Open Knowledge and actively maintained by a great community of developers worldwide, it was definitely a good choice. Being Open Source means not only that we could deploy it for free, but also that we could use plenty of extensions developed by the community and get our questions answered by the developers in the #CKAN channel on IRC or directly on the GitHub repositories where the project is maintained.
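To give an idea of what exposing datasets through an API looks like in practice, here is a minimal sketch in Python that builds a search request for CKAN's Action API (the package_search action) and reads dataset titles out of the JSON response. The base URL is a hypothetical placeholder, not OD Mekong's real address, and the helper names are mine; this is an illustration, not code from the project.

```python
import json
from urllib.parse import urlencode

# Hypothetical portal address, used only for illustration.
CKAN_BASE_URL = "https://data.example.org"


def build_search_url(query, rows=5, base_url=CKAN_BASE_URL):
    """Build the URL for a full-text dataset search via CKAN's package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from the JSON text returned by package_search."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    # Fall back to the dataset's URL-friendly name when no title is set.
    return [d.get("title") or d["name"] for d in payload["result"]["results"]]


# A hand-written sample shaped like a package_search response (not real data).
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "sample-dataset-one", "title": "Sample dataset one"},
            {"name": "sample-dataset-two", "title": ""},
        ],
    },
})

if __name__ == "__main__":
    print(build_search_url("forest cover", rows=2))
    print(dataset_titles(SAMPLE_RESPONSE))
```

Fetching the URL with any HTTP client and passing the response body to dataset_titles would list the matching datasets; the sample response above only stands in for a real reply.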

Analogous to ODC, the OD Mekong project should present a large amount of news, data and visualizations in a comprehensive manner, allowing users to search within the content and share it on social networks or among friends. Taking into consideration that the editorial team already had expertise working with WordPress and the fact that it is a widely used, community-supported Open Source CMS, we went ahead and deployed a multi-site network instance, featuring one site for the whole region (Mekong) and one site for each of the countries (Cambodia, Thailand, Laos, Vietnam, Myanmar). The theme chosen for the front-end, called JEO and developed specifically for Geo-Journalism sites, provides a set of great features to geo-localize, visualize, and share news content. Since OD Mekong’s team works intensively with geo-referenced information (an instance of Geoserver is also part of the architecture), JEO proved to be a great starting point, and thanks to the work of its developers, lots of features could be used out-of-the-box.

To facilitate the complex work-flow of OD Mekong’s editorial team, many WordPress plug-ins were used for aggregating content automatically, presenting featured information in a visual way or allowing users to provide feedback. We also developed WPCKAN, a WordPress plug-in which allows content to be pulled and pushed between CKAN and WordPress, the main elements of OD Mekong’s architecture. Although it is extensively used across the whole OD Mekong site, this plug-in has been developed generically, so other folks out there can re-use it in similar scenarios.
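For readers curious about what the "push" direction involves at the HTTP level, the following Python sketch prepares a request for CKAN's package_create action, which creates a dataset from a JSON body and expects the API key in the Authorization header. It does not reflect how WPCKAN itself is implemented (WPCKAN is a WordPress plug-in, so it is written in PHP); the function name, URL, key and dataset values are all made up for illustration.

```python
import json
from urllib.request import Request


def build_create_request(base_url, api_key, name, title, owner_org=None):
    """Prepare (but do not send) a POST request for CKAN's package_create action."""
    body = {"name": name, "title": title}
    if owner_org is not None:
        # Many CKAN instances require datasets to belong to an organization.
        body["owner_org"] = owner_org
    return Request(
        f"{base_url}/api/3/action/package_create",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": api_key,  # CKAN reads the API key from this header
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical values; nothing is sent over the network here.
    request = build_create_request(
        "https://data.example.org",
        "my-secret-api-key",
        name="sample-report",
        title="Sample report",
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the prepared request to urllib.request.urlopen would actually send it; keeping construction and sending separate makes the payload easy to inspect before anything is published.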

Working in a collaborative environment

Since the beginning, OD Mekong’s intention has been to become a platform where multiple organizations from the region that share common goals can work together. This is not an easy task, and it has shaped many of the decisions taken during conception and development.

This collaborative process has been taking place (and will continue) at different levels:

  • Organizations participate in the content creation process. Once credentials are granted, datasets can be uploaded to the CKAN instance, and news, articles or reports to the specific country sites. In order to ensure the quality of the content, a vetting system has been conceived which allows site administrators to review it before it gets published.
  • Developers from the community can contribute to the development of the platform. All code repositories are available on Open Development Mekong’s GitHub site, and provisioning scripts based on Vagrant and Ansible, both open source technologies, are available for everyone to reproduce OD Mekong’s architecture with just one command.
  • Since this is an interregional endeavour, all components of the architecture need to have multilingual capabilities. For that, much of the content and many pieces of the software needed to be translated. Within OD Mekong, the localization process relied on Transifex, a web-based translation platform that gives teams the possibility to translate and review software collaboratively. Although not open source anymore, Transifex is free for Open Source projects. I would like to highlight here that the OD Mekong team contributed to the translation of CKAN version 2.2 into Khmer, Thai and Vietnamese. Bravo!!

It is also very important to highlight the benefits of documenting every process, every work-flow and every small tutorial in order to share the knowledge with the rest of the team, thus avoiding having to communicate the same information repeatedly. For that, a wiki was set up at the beginning of the development process to store all the knowledge around the project. Currently, the contents of OD Mekong’s wiki are still private, but after being reviewed the information will be made publicly available soon, so stay tuned!

An amazing professional (but also personal) experience

Leaving the technical aspects aside and turning to the human side, I can only say that for me, working on this project has marked a milestone in my professional career. I have had the pleasure of working with an amazing team from which I have learned tons of new things, not only about software development but also about system administration, human rights advocacy, copyright law, project management, communication and much more. All within the best work atmosphere, even when deadlines were approaching and the GitHub issues started to pile up dramatically :) .

This is why I want to thank Terry Parnell, Eric Chuk, Mishari Muqbil, HENG Huy Eng, CHAN Penhleak, Nikita Umnov and Dan Bishton for the great time and for everything I learned.

Learn more

As part of the ambassador programme at Infogr.am, yesterday I hosted a skill-sharing session where I explain, this time on video, my experience with this project. Watch it to discover more…
