My experience building a 100% Open Source based Open Data platform

During the great OKFest 2014, we were lucky to meet again the folks from EWMI and Open Development Cambodia (ODC), a not-for-profit organization advocating for transparency that we got to know during the Open Steps Journey. Since 2011, the team at ODC has been doing amazing work sharing data with journalists, researchers and human rights activists so they can count on openly licensed information to support their activities in the South-east Asian country. At OKFest, they told us about EWMI’s Open Development Initiative and their plans for what has now become the Open Development Mekong project, an open data and news portal providing content about 5 countries of the Mekong region in South-east Asia. Back then, they were looking for somebody who could lend a hand in conceiving and implementing the platform. That’s how I got involved in this challenging project, which has kept me busy for the last 9 months.

I’m writing this article to share my personal experience participating in a 100% Open Source project, within an agile and extremely collaborative environment, whose outcomes in terms of information, knowledge and code are meant to be reused by the community.

The project’s requirements and its architecture

ODC’s site already features lots of information, datasets and visualizations. The team has done great work getting the most out of WordPress, the CMS software the site is built upon. However, since the main expectation for this new iteration of the platform was to host much more machine-readable data and expose it through both a web interface and an API, a specific framework for storing, managing and exposing datasets was needed. After analysing the options currently out there, we decided to implement an instance of CKAN, an Open Source solution for building Open Data portals. Coordinated by Open Knowledge and strongly maintained by a great worldwide community of developers, it was definitely a good choice. Being Open Source means not only that we could deploy it for free, but also that we could use plenty of extensions developed by the community and get our questions answered by the developers on the #CKAN channel on IRC or directly on the GitHub repositories where the project is maintained.
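To give an idea of what exposing datasets through an API looks like, here is a minimal Python sketch against CKAN’s Action API. The `package_search` action and the `success`/`result` envelope are part of CKAN; the base URL `https://data.example.org`, the helper names and the sample response are assumptions made for illustration and do not describe OD Mekong’s actual setup.

```python
import json
from urllib.parse import urlencode, urljoin


def build_search_url(base_url, query, rows=10):
    """Build the URL of a CKAN `package_search` request.

    `base_url` is a hypothetical CKAN instance, e.g. https://data.example.org
    """
    endpoint = urljoin(base_url.rstrip("/") + "/", "api/3/action/package_search")
    return endpoint + "?" + urlencode({"q": query, "rows": rows})


def parse_search_response(body):
    """Return (total matches, dataset titles) from a package_search JSON body."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    result = payload["result"]
    return result["count"], [dataset["title"] for dataset in result["results"]]


if __name__ == "__main__":
    # Made-up response body with the same shape CKAN returns.
    sample = json.dumps({
        "success": True,
        "result": {"count": 1, "results": [{"title": "Sample dataset"}]},
    })
    print(build_search_url("https://data.example.org", "forest cover", rows=5))
    print(parse_search_response(sample))
```

The sketch deliberately stops short of sending the request, so it runs without network access; any HTTP client can fetch the URL it builds.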

Like ODC’s site, the OD Mekong project should present a great amount of news, data and visualizations in a comprehensive manner, allowing users to search within the large amount of content and share it on social networks or among friends. Taking into consideration that the editorial team already had expertise working with WordPress, and the fact that it is a widely used, community-supported Open Source CMS, we went ahead and deployed a multi-site network instance, featuring one site for the whole region (Mekong) and one site for each of the countries (Cambodia, Thailand, Laos, Vietnam, Myanmar). The theme chosen for the front-end, called JEO and developed specifically for Geo-Journalism sites, provides a set of great features to geo-localize, visualize and share news content. Since OD Mekong’s team works intensively with geo-referenced information (an instance of Geoserver is also part of the architecture), JEO proved to be a great starting point and, thanks to the work of its developers, lots of features could be used out-of-the-box.

To facilitate the complex work-flow of OD Mekong’s editorial team, many WordPress plug-ins were used for aggregating content automatically, presenting featured information in a visual way or allowing users to provide feedback. We also developed WPCKAN, a WordPress plug-in which allows content to be pulled and pushed between CKAN and WordPress, the main elements of OD Mekong’s architecture. Although it is used extensively across the whole OD Mekong site, this plug-in has been developed generically, so other folks out there can re-use it in similar scenarios.

Working in a collaborative environment

Since the beginning, OD Mekong’s intention has been to become a platform where multiple organizations from the region which share common goals can work together. This is not an easy task and has shaped many of the decisions taken during conception and development.

This collaborative process has been taking place (and will continue) at different levels:

  • Organizations participate in the content creation process. Once credentials are granted, datasets can be uploaded to the CKAN instance, and news, articles or reports to the specific country sites. In order to ensure the quality of the content, a vetting system has been conceived which allows site administrators to review it before it gets published.
  • Developers from the community can contribute to the development of the platform. All code repositories are available on Open Development Mekong’s GitHub site, and provisioning scripts based on Vagrant and Ansible, both open source technologies, are available for everyone to reproduce OD Mekong’s architecture with just one command.
  • Since this is an interregional endeavour, all components of the architecture need to have multilingual capabilities. For that, much of the content and many pieces of the software needed to be translated. Within OD Mekong, the localization process relied on Transifex, a web-based translation platform that gives teams the possibility to translate and review software collaboratively. Although no longer open source, Transifex is free for Open Source projects. I would like to highlight here that the OD Mekong team contributed to the translation of CKAN version 2.2 into Khmer, Thai and Vietnamese. Bravo!!

It is also very important to highlight the benefits of documenting every process, every work-flow and every small tutorial in order to share the knowledge with the rest of the team, thus avoiding having to communicate the same information repeatedly. For that, a wiki was set up at the beginning of the development process to store all the knowledge around the project. Currently, the contents of OD Mekong’s wiki are still private, but after being reviewed the information will be made publicly available soon, so stay tuned!

An amazing professional (but also personal) experience

Leaving the technical aspect aside and going more into human values, I can only say that, for me, working on this project has marked a milestone in my professional career. I have had the pleasure of working with an amazing team from which I have learned tons of new things, not only related to software development but also to system administration, human rights advocacy, copyright law, project management, communication and much more. All within the best work atmosphere, even when deadlines were approaching and the GitHub issues started to pile up dramatically :) .

This is why I want to thank Terry Parnell, Eric Chuk, Mishari Muqbil, HENG Huy Eng, CHAN Penhleak, Nikita Umnov and Dan Bishton for the great time and everything I learned.

Learn more

As part of the ambassador programme at Infogr.am, yesterday I hosted a skill-sharing session in which I explain, this time on video, my experience within this project. Watch it to discover more…

Introducing the new Open Knowledge directory with PLP Profiles

During Open Steps’ journey around the world discovering Open Knowledge initiatives, the existence of a global community of like-minded individuals and groups became clear. Across the 24 countries we visited, we met people working on Open Knowledge related projects in every single one of them. Currently, thanks to social networks, blogs, discussion groups and newsletters, this community manages to stay connected and get organized across borders. However, getting to meet the right people can be a difficult task for somebody without an overview of who is who and who is doing what, especially in a foreign country.

My travel companion, Margo Thierry, and I started building a contact list as we met new amazing people during this great journey, and finally realized that sharing this information would have a positive impact. That’s how the Open Knowledge directory came to life, with its aim of increasing the visibility of Open Knowledge projects and helping to forge collaborations between individuals and organizations across borders.

After some iterations, we are now releasing a new version which not only features a new user interface with better usability but also sets a base for continuous development that aims to fulfill the goals of connecting people, monitoring the status of Open Knowledge worldwide and raising awareness about relevant projects and initiatives worth discovering.

One of the main features of this version is the implementation of Portable Linked Profiles, PLP for short. In case you did not read the article I wrote about the inspiring GET-D conference last month, where I spoke about it for the first time, you may like to know that PLP allows you to create a profile with your basic contact information that you can use and share. By basic contact information I mean the kind of information you are used to typing into dozens of online forms. Whether you are registering on social networks, accessing web services or leaving feedback in forums, it is always the same information: Name, Email, Address, Website, Facebook, Twitter, etc… PLP tries to address this issue but also, and most importantly, allows you to own your data and decide where you want it to be stored.

By implementing PLP, this directory no longer makes use of the old Google Form and now allows users to edit their data and keep it up-to-date easily. For the sake of re-usability and interoperability, it makes listing your profile in another directory as easy as pasting the URI of your profile into it, done! If you want to know more about PLP, kindly head to the home page or to the GitHub repository with the documentation. PLP is Open Source software and is based on Open Web Standards and Common Vocabularies.
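The idea of a profile that lives at a URI and is listed in directories by reference can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the JSON-LD shape with schema.org terms, the `make_profile` helper, the `Directory` class and the example URI are all hypothetical and are not taken from the PLP code base, whose actual format is described in its own documentation.

```python
import json


def make_profile(uri, name, email, website):
    """Build a hypothetical profile document (assumed JSON-LD with schema.org terms)."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "@id": uri,  # the profile is identified by the URI where its owner hosts it
        "name": name,
        "email": email,
        "url": website,
    }


class Directory:
    """Hypothetical directory that stores only profile URIs, never copies of the data."""

    def __init__(self):
        self._uris = []

    def add_listing(self, uri):
        """List a profile by its URI; return False if it is already listed."""
        if not uri.startswith(("http://", "https://")):
            raise ValueError("a profile URI must be an http(s) address")
        if uri in self._uris:
            return False
        self._uris.append(uri)
        return True

    def listings(self):
        return list(self._uris)


if __name__ == "__main__":
    uri = "https://example.org/profiles/jane"  # made-up address
    profile = make_profile(uri, "Jane Doe", "jane@example.org", "https://example.org")
    directory = Directory()
    directory.add_listing(uri)
    print(json.dumps(profile, indent=2))
    print(directory.listings())
```

Because the directory keeps only the URI, the owner can edit the profile where it is hosted and every directory listing it points to the same up-to-date data.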

We invite you now to register on our Open Knowledge directory if you are not there yet, or to update your information if you are. This directory is meant to be continuously improved, so please drop us a line if you have any feedback. We’ll appreciate it.

Discussing the hottest topics of the decentralized web at GET-D

“If you want to go fast, go alone. If you want to go far, go together.” is a more than adequate motto chosen by GET-D’s organisers to give character to this event, a conference aiming to explore the status, possibilities and challenges of the decentralized web. In its first edition, GET-D took place between the 17th and 19th of September in the amazing Agora Collective space in Berlin-Neukölln.

The decentralized web is a relatively new topic for many, as is my case, and completely unknown to the vast majority of internet users. If you belong to the latter group, let me explain briefly what I understand by this term: the internet that most people use today (let me call it the mainstream web) is structured in a centralized manner, and a huge percentage of the information is stored in big data centres and routed through servers owned by gigantic corporations. This makes it possible for all of us to enjoy great services such as our favourite social networks, search engines and cloud storage services, but it has several negative implications such as poor interoperability between information sources and, as you might already be aware, governments accessing your private data.

In contrast to the current infrastructure, the decentralized web proposes a much more democratic approach, where logic and storage are more evenly balanced across the nodes of the network. Going back to GET-D’s motto, this idea also strongly supports the principles of collaboration because, in order to make things work, every node needs to work with the others. Last but not least, the re-use of resources (be they digital information or physical assets) is also one of the main benefits of this approach.

What can we expect from a new and decentralized web?

As part of GET-D’s programme, we had the opportunity to discover very interesting projects that bring a new perspective to aspects of our current digital lives. To mention just a few, we enjoyed the presence of the folks developing Mailpile, a free, ad-free and Open Source email client that you can run on your local machine or server so you actually have total control of your data. Or Leihbar, a platform that tries to shift our consumer society towards a sharing economy. Leihbar envisions a network of boxes spread throughout cities, where users can have access to all kinds of products for particular occasions: from a projector to watch a movie, through tools for fixing your bike, to an inflatable boat to enjoy a day at the lake. This way, we do not need to buy stuff that we are going to use just from time to time; we share it with others.

The Internet of Things (IoT) is also a hot topic nowadays. We are seeing how all kinds of devices are becoming connected to the internet. Cars, public infrastructure or even coffee machines are now capable of interacting with the digital world and with each other, in a decentralized manner. At GET-D, a couple of IoT-related projects were presented, starting with RiotOS, a free LGPL-licensed operating system for the devices the IoT is being built upon, and Gatesense, a project which encourages the community to imagine and shape the future of this field. With such a vast number of devices generating tons of information, initiatives are also being launched to help us manage it efficiently. One example is Jolocom, a distributed visualisation tool which helps users make sense of complex connections between persons, projects, sensors and devices from the Internet of Things.

Hackathon: after theory comes coding

I personally enjoyed the hacking sessions. Running in parallel to a series of interesting presentations and hangouts with folks working on decentralized web projects around the globe, they shaped the 3 days we spent at GET-D. Together with other participants, I worked on a project I would like to introduce here. Portable Linked Profiles (PLP) are a set of components which offer an easy way for users, organisations and venues to create their public data and, most importantly, host it wherever they want. Thanks to its modular design and its Open Source nature, developers can create applications on top of PLP. These applications (named Browsers) would be something like our Open Knowledge directory, which aggregates and maps contact information of individuals and organisations working on Open Knowledge worldwide. Expect more details about this on our blog soon.

Stay tuned for more GET-D

This first edition already had very good outcomes, and the great thing is that there will be more to come. The topic of the decentralized web is still young, and more research, discussion and implementation are needed. As we experienced, such an event offers a perfect environment for this, and we are looking forward to attending the next editions of GET-D.