My experience building a 100% Open Source based Open Data platform

During the great OKFest 2014, we were lucky to meet again the folks from EWMI and Open Development Cambodia (ODC), a not-for-profit organization advocating for transparency that we had got to know during the Open Steps Journey. Since 2011, the team at ODC has been doing amazing work sharing data with journalists, researchers and human rights activists so they can count on openly licensed information to support their activities in the South-east Asian country. At OKFest, they told us about EWMI’s Open Development Initiative and their plans for what has now become the Open Development Mekong project, an open data and news portal providing content about 5 countries of the Mekong region in South-east Asia. Back then, they were looking for somebody who could give a hand with conceiving and implementing the platform. That’s how I got involved in this challenging project, which has kept me busy for the last 9 months.

I’m writing this article to share my personal experience participating in a 100% Open Source project, within an agile and extremely collaborative environment whose outcomes in terms of information, knowledge and code are meant to be reused by the community.

The project’s requirements and its architecture

ODC’s site already features lots of information, datasets and visualizations. The team has done great work getting the most out of WordPress, the CMS software the site is built upon. However, since the main expectations for this new iteration of the platform were to host much more machine-readable data and expose it through both a web interface and an API, a specific framework for storing, managing and exposing datasets was needed. After analysing the options currently available, we decided to implement an instance of CKAN, an Open Source solution for building Open Data portals. Coordinated by Open Knowledge and actively maintained by a great worldwide community of developers, it was definitely a good choice. Being Open Source not only means that we could deploy it for free, but also that we could use plenty of extensions developed by the community and get our questions answered by the developers on the #CKAN channel on IRC or directly in the GitHub repositories where the project is maintained.
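To give an idea of what "exposing data through an API" looks like in practice, here is a minimal sketch in Python that searches a CKAN portal through its Action API (the `package_search` action). The portal address `https://data.example.org` is a placeholder and the helper names are my own, not part of OD Mekong’s code base.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical portal address; replace it with a real CKAN instance.
CKAN_BASE_URL = "https://data.example.org"


def build_search_url(base_url, query, rows=5):
    """Build a URL for CKAN's package_search action (Action API v3)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(payload):
    """Extract dataset titles from a decoded package_search response."""
    if not payload.get("success"):
        raise RuntimeError("CKAN reported a failed request")
    return [dataset["title"] for dataset in payload["result"]["results"]]


def search_datasets(base_url, query, rows=5):
    """Query the portal over HTTP and return the matching dataset titles."""
    with urlopen(build_search_url(base_url, query, rows)) as response:
        return dataset_titles(json.load(response))
```

Calling `search_datasets(CKAN_BASE_URL, "forest cover")` against a real portal would return the titles of the first matching datasets. The URL building and the response parsing are kept separate so that each part can be checked without network access.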

Analogous to ODC, the OD Mekong project should present a great amount of news, data and visualizations in a comprehensive manner, allowing users to search within the large amount of content and share it on social networks or among friends. Taking into consideration that the editorial team already had expertise working with WordPress and that it is a widely used, community-supported Open Source CMS, we went ahead and deployed a multi-site network instance, featuring one site for the whole region (Mekong) and one site for each of the countries (Cambodia, Thailand, Laos, Vietnam, Myanmar). The theme chosen for the front-end, called JEO and developed specifically for Geo-Journalism sites, provides a set of great features to geo-localize, visualize and share news content. Since OD Mekong’s team works intensively with geo-referenced information (an instance of Geoserver is also part of the architecture), JEO proved to be a great starting point and, thanks to the work of its developers, lots of features could be used out-of-the-box.

To facilitate the complex work-flow of OD Mekong’s editorial team, many WordPress plug-ins were used for aggregating content automatically, presenting featured information in a visual way or allowing users to provide feedback. We also developed WPCKAN, a WordPress plug-in which allows content to be pulled and pushed between CKAN and WordPress, the main elements of OD Mekong’s architecture. Although it is used extensively across the whole OD Mekong site, this plug-in has been developed generically, so other folks out there can re-use it in similar scenarios.

Working in a collaborative environment

Since the beginning, OD Mekong’s intention has been to become a platform where multiple organizations from the region that share common goals can work together. This is not an easy task and has conditioned many of the decisions taken during conception and development.

This collaborative process has been taking place (and will continue) at different levels:

  • Organizations participate in the content creation process. Once credentials are granted, datasets can be uploaded to the CKAN instance, and news, articles or reports to the specific country sites. To ensure the quality of the content, a vetting system has been conceived which allows site administrators to review it before it gets published.
  • Developers from the community can contribute to the development of the platform. All code repositories are available on Open Development Mekong’s GitHub site, and provisioning scripts based on Vagrant and Ansible, both open source technologies, are available for everyone to reproduce OD Mekong’s architecture with just one command.
  • Since this is an interregional endeavour, all components of the architecture need to have multilingual capabilities. For that, much of the content and many pieces of the software needed to be translated. Within OD Mekong, the localization process relied on Transifex, a web-based translation platform that allows teams to translate and review software collaboratively. Although no longer open source, Transifex is free for Open Source projects. I would like to highlight here that the OD Mekong team contributed to the translation of CKAN version 2.2 into Khmer, Thai and Vietnamese. Bravo!!

It is also very important to highlight the benefits of documenting every process, every work-flow and every small tutorial in order to share the knowledge with the rest of the team, thus avoiding having to communicate the same information repeatedly. For that, a wiki was set up at the beginning of the development process to store all the knowledge around the project. Currently, the contents of OD Mekong’s wiki are still private, but after being reviewed the information will be made publicly available soon, so stay tuned!

An amazing professional (but also personal) experience

Leaving the technical aspects aside and turning to human values, I can only say that working on this project has marked a milestone in my professional career. I have had the pleasure of working with an amazing team from which I have learned tons of new things, not only related to software development but also to system administration, human rights advocacy, copyright law, project management, communication and much more. All within the best work atmosphere, even when deadlines were approaching and the GitHub issues started to pile up dramatically :).

This is why I want to thank Terry Parnell, Eric Chuk, Mishari Muqbil, HENG Huy Eng, CHAN Penhleak, Nikita Umnov and Dan Bishton for the great time and all the learnings.

Learn more

As part of the ambassador programme at Infogr.am, I hosted a skill-sharing session yesterday where I explained, this time on video, my experience within this project. Watch it to discover more…

India Open Data Summit, 2015

Open Knowledge India, with support from the National Council of Education Bengal and the Open Knowledge micro grants, organised the India Open Data Summit on February 28. It was the first Data Summit of its kind held in India and was attended by Open Data enthusiasts from all over the country. The event was held at Indumati Sabhagriha, Jadavpur University. Talks and workshops were held throughout the day. The event succeeded in living up to its promise of being a melting pot of ideas.

The attendee list included people from all walks of life. Students, teachers, educationists, environmentalists, scientists, government officials, people’s representatives, lawyers and people from the film industry: everyone was welcomed with open arms to the event. The Chief Guests included the young and talented movie director Bidula Bhattacharjee, Aninda Chatterjee, a prominent lawyer from the Kolkata High Court, educationist Bijan Sarkar and an important political activist, Rajib Ghoshal. Each one of them added value to the event, turning it into a free flow of ideas. The major speakers from the side of Open Knowledge India included Subhajit Ganguly, Priyanka Sen and Supriya Sen. Praloy Halder, who has been working for the restoration of the Sunderbans Delta, also attended the event. Environmental data is a key aspect of the conservation movement in the Sunderbans and it requires special attention.

The talks revolved around Open Science, Open Education, Open Data and Open GLAM. Thinking local and going global was the theme from which the discourse followed. Everything was discussed from an Indian perspective, as many of the challenges faced by India are unique to this part of the world. There were discussions on how the Open Education Project, run by Open Knowledge India, can complement the government’s efforts to bring the light of education to everyone. The push was to build a platform that would offer the Power of Choice to children in matters of educational content. Greater use of Open Data platforms like CKAN was also discussed. Open governance, not only at the national level but also at the level of local governments, was discussed seriously. Everyone agreed that open governance is the way to go in order to reduce corruption. Encouraging the common man to participate in the process of open governance was another key point that was stressed. India is the largest democracy in the world, and a very complex one too. Greater use of the power of the crowd in matters of governance can go a long way towards helping the democracy by uprooting corruption from the very core.

Opening up research data of all kinds was another point that was discussed. India has recently passed legislation ensuring that all government-funded research results will be in the open. A workshop was held to educate researchers about the existing ways of disseminating research results. Further enquiries were made into finding newer and better ways of doing this. Every researcher present resolved to enrich the spirit of Open Science and Open Research. Overall, the India Open Data Summit, 2015 was a grand success in bringing like-minded individuals together and in giving them a shared platform where they can join hands to empower themselves. The first major Open Data Summit in India ended with the promise of keeping the ball rolling. Hopefully, in the near future we will see many more such events all over India.

Open Data in the Philippines: Best practices from disaster relief and transportation mapping

While attending Geeks on a Beach last month, we also spent some time in Manila to visit a few labs and agencies, and had several discussions on the state of open-data in the Philippines.

Quick reminder: open-data is a recent trend for governments, companies and institutions to release their datasets freely, so that users, developers, citizens or consumers can make use of them and create new services (check FixmyStreet for a “citizen 2.0” stint or FlyonTime for a more commercial approach).

The Philippines, a country of 100 million people that we have been exploring, hosts quite a few very good applications of open-data, which also enjoy strong support from the government. Here are some of their creations, with explanations from Ivory Ong, Outreach Lead of Open Data for the Department of Budget and Management of the government of the Philippines.

Key milestones of open-data in the Philippines

The major milestone for open-data in the Philippines was the official launch of data.gov.ph on January 16, 2014 after a 6-month development period. “We have had 500,000 page views as of June this year. We published 650 datasets at the time and already had infographics (static data visualizations and interactive dashboards), which was the unique selling point of our data portal. We were able to push out an additional 150 datasets by May 5, 2014”, says Ivory.

The team also led two government-organized hackathons, #KabantayNgBayan (on budget transparency) and Readysaster (on disaster preparedness), to build awareness of the use and benefits of open government data. Another milestone was a Data Skills Training for civil society organizations, media and government to build capacity in data scraping, cleaning and visualizing.

“Back in June, we likewise conducted our first Open Data training at a city level (Butuan City, Agusan del Norte) where local civil society organizations and local government units created offline visualizations from data disclosed by the Department of Interior and Local Government (DILG) via the Full Disclosure Policy Portal”, she adds.

Mapping the transports in Metro Manila with students embedded on all routes

While talking with Ivory and Levi Tan Ong, one of the co-founders of By Implication, a digital agency, I heard quite a funny story.

Just as in so many emerging markets, the transportation system has grown organically. Except for the MRT or subway systems, where an official map helps to navigate the city, most routes served by local buses (dubbed jeepneys in the Philippines, but one can think of Nairobi’s matatu as well) are unwritten. People just know them; stations are all over the road and nowhere at the same time.

So the Department of Transport launched two initiatives to solve the issue. First, by putting students with GPS plotting software in all the jeepneys and local buses to map their actual routes, and then by releasing the data to have the communities of developers build an app for that. “From the little that I know, this was done because Department of Transport and Communication and its attached agencies have clashing statistics on the exact number of routes”, adds Ivory.

Creative agency By Implication then won the Open Community Award at the Philippines Transit App Challenge with Sakay.ph, an app which helps you to know which combination of transportation to use to go from A to B… quite convenient for the foreigner I am in the gigantic Metro Manila area! The app has been recording about 50,000 requests per month since its inception and, even if there are still some glitches in the data, it’s the first real online map and direction service for Manila.
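To illustrate how mapped routes can be turned into "which combination to take from A to B", here is a rough Python sketch that runs a shortest-path search (Dijkstra’s algorithm) over a toy network. I have no knowledge of how Sakay.ph actually computes its directions; the stops, modes and travel times below are invented for the example.

```python
import heapq

# Hypothetical stops and travel times in minutes; not real Sakay.ph data.
ROUTES = {
    "A": [("B", "jeepney", 15), ("C", "bus", 10)],
    "B": [("D", "MRT", 12)],
    "C": [("B", "jeepney", 3), ("D", "bus", 25)],
    "D": [],
}


def fastest_trip(routes, start, goal):
    """Return (total_minutes, legs) for the quickest trip, or None if unreachable."""
    queue = [(0, start, [])]
    visited = set()
    while queue:
        minutes, stop, legs = heapq.heappop(queue)
        if stop == goal:
            return minutes, legs
        if stop in visited:
            continue
        visited.add(stop)
        for next_stop, mode, cost in routes.get(stop, []):
            if next_stop not in visited:
                new_legs = legs + [(stop, next_stop, mode)]
                heapq.heappush(queue, (minutes + cost, next_stop, new_legs))
    return None
```

With this toy data, `fastest_trip(ROUTES, "A", "D")` combines a bus, a jeepney and the MRT for a total of 25 minutes, which beats both the direct bus connection via C (35 minutes) and the jeepney plus MRT option (27 minutes).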

Where is the Foreign Aid for disaster going? Open Reconstruction will tell

The same agency is also behind Open Reconstruction, an open-data platform which tracks where the aid money went after typhoon Yolanda hit the archipelago in November 2013.

It’s not just storytelling about where funds are allocated, as Levi says: “Several towns asked for money to rebuild infrastructure and housing, but at that time it was a long process of at least 5 steps to get funding, and it was all on paper. So what we provide is a digitalisation of the aid process: first, by streamlining the process of applying for money and making all steps digital and traceable, and in a second step, by releasing this data to the public to increase transparency of the overall aid effort”.

The connection between the agency’s work and the government open-data team seems to work on the topic of foreign aid. Ivory adds that “Context at the time was that there were a lot of news releases saying that humanitarian aid was coming in specifically for Yolanda. There were assumptions that government agencies might be getting funds yet are not using it for its intended purpose. When we finally launched the site and finished the scoping of the information-goods-cash flow [see infographic from the FAiTH site below], we found out that only a small portion went to government and a vast majority went to multilateral agencies such as the UN and the Philippine Red Cross. Public demand died down because of it”.

[Infographic from the FAiTH site: flow of information, goods and cash]

Open Reconstruction is the other half of what the open-data team wanted FAiTH data to be connected to: how the money was spent and if it was used for the intended purpose. It gives anyone, by bringing data to light, a chance to be a watchdog to hold government to account.

What’s next for open-data in the Philippines? Training, training, training

In just a few months, the open-data community has hit quite a few convincing milestones, both with government support and with the involvement of the developer community. There’s still a lot to do, as Ivory tells us, because as in any digitalisation, training, change management and making sure the administration and the public understand and accept this new policy are key.

“I guess this goes back to our first time to run the training to create offline data visualizations back in June. Local government unit representatives who were intimately familiar with local budget data had an easier time to create visualizations and explain it. After the crash course training for free online tools they can use, we went into a workshop proper where they select PDF files from the Full Disclosure Policy Portal (based on the city/municipality they lived in) and proceeded to discuss with their groupmates on how best to visualize it using colored paper and pentel pens.

These actors at a local level are important since they serve as potential information intermediaries who can communicate data into digestible stories that citizens can relate to based on their needs. Citizens who reside in remote or rural areas and are not familiar with government jargon/processes can be informed and empowered if intermediaries exist.

From our initial experience, I think I can propose 4 important must-have skills for intermediaries:

  • technological capacity (i.e. use of ICT) to clean/structure/visualize data
  • good understanding of government vocabulary and process (for data analysis and interpretation)
  • deep knowledge of local / community needs and priorities
  • communication skills, particularly storytelling with data

The last skill is important because stories are easier to understand than technical jargon. Filipinos are very much into knowing what’s what in the lives of family, friends, celebrities, and politicians. Stories trump statistics in this case so learning how to narrate what dataset/s mean can be more useful. If Open Data is to make an impact in the lives of citizens, it must be in a language that is relatable and understandable.”

Written by Martin Pasquier from Innovation Is Everywhere

7 Predictions for “Open Data” in 2015

What’s going to happen to the “open data” movement in 2015?  Here are Dennis D. McDonald’s predictions:

  1. Some high profile open data web sites are going to die. At some sites the lack of updates and lack of use will catch up with them.  Others will see highly publicized discussions of errors and omissions.  For some in the industry this will be a black eye.  For others it will be an “I told you so” moment causing great soul-searching and a re-emphasis on the need for effective program planning.
  2. Greater attention paid to cost, governance, and sustainability. In parallel with the above there will be more attention paid to open data costs, governance, and program sustainability.  Partly this will be in response to the issues raised in (1) and partly because the “movement” is maturing.  As people move beyond the low-hanging-fruit and cherry-picking stage they will be giving more thought to what it takes to manage an open data program effectively.
  3. Greater emphasis on standards, open source, and APIs. This is another aspect of the natural evolution of the movement. Much of the open data movement has relied on “bottom up” innovation and the enthusiasm of a developer community accustomed to operating on the periphery of the tech establishment. Some of this is generational as younger developers move into positions of authority. Some is due to the ease with which data and tools can be obtained and combined by individuals and groups working remotely and collaborating via systems like GitHub.
  4. More focus on economic impacts of open data in developed and developing countries alike. While many open data programs have been justified on the basis of laudable goals such as “transparency” and “civic engagement,” sponsors will inevitably ask questions about “impact” as update costs begin to roll in.  Some of the most important questions are also the simplest to ask but the hardest to answer, such as, “Are the people we hoped would use the data actually using the data?” and “Is using the data doing any good?”
  5. More blurring of the distinctions between public sector and private sector data. One of the basic ideas behind making government data “open” is to allow the public and entrepreneurs to use and combine public data with other data in new and useful ways. It is inevitable that private sector data will come into the mix. When public and private data are combined some interesting intellectual property, ownership, and pricing questions will be raised. Managers must be ready to address questions such as, “Why should I have to pay for a product that contains data I paid to collect via my tax dollars?”
  6. Inclusion of open data features in mainstream ERP, database, middleware, and CRM products. Just as vendors have incorporated social networking and collaboration features with older products, so too will open data features be added to mainstream enterprise products to enable access via file downloads, visualization, and documented APIs. Such features will be justified by the extra utility and engagement they support. Some vendors will incorporate monetization features to make it easier to track and charge for data the new tools expose.
  7. Continued challenges to open data ROI and impact measurement. As those experienced with usage metrics will tell you, it’s not just usage that’s important; it’s the impact of usage that really counts. In the coming year this focus on open data impact measurement will continue to grow. I take that as a good sign.  I also predict that open data impact measurement will continue to be a challenge.  Just as in the web site world it’s easier to measure pageviews than to measure the impacts of the information communicated via the pageviews, so too will it continue to be easier to measure data file downloads and API calls than the impacts the use of the data thus obtained will have.

By Dennis D. McDonald, Ph.D.

Open Spending: Tracking Financial Data worldwide

If you have followed the activities of the OKFN in recent years, you probably already know Open Spending, the community-driven project initiated in 2007 which has grown considerably since then. The idea started with Where Does My Money Go?, a database for UK public financial data, financed by the 4IP (4 Innovation for the Public) fund of the British broadcaster Channel 4. A few years later, in 2011, the initiative was internationalized and Open Spending was born, a worldwide platform which has gone far beyond the British borders. Today, the site shows data from 73 countries, from Bosnia to Uganda, and the visualisation tool Spending Stories could be developed at the same time, thanks to a grant from the Knight Foundation. Speaking of funding, we should not forget the Open Society Foundations, which supports the community-building work, and the Omidyar Network, which funded the research behind the report “Technology for Transparent and Accountable Public Finance”. You guessed it? Everything is Open Source.

Open Spending does not only aggregate worldwide public financial data such as budgets, spending, balance sheets, procurement or employee salaries, giving information on how public money has been spent all over the world and in your own city. It also allows users to visualise the available data directly via Spending Stories and to add new datasets. The community members using and developing the tools come from various backgrounds, and everyone is invited to join. Additionally, articles are regularly posted on the blog to encourage knowledge sharing.

The results so far are very good, since numerous administrations and media outlets have already used the visualisations, such as the city of Berlin and the Guardian. Besides them, independent journalists, civil society activists, students and engaged citizens also take advantage of the datasets, allowing a better understanding of public money.

Analysing journalistic data with detective.io

No doubt, the power of the internet has profoundly changed the way in which journalists gather their information. To keep up with the growing amount of data digitally available, more and more tools for data-journalists are being developed. They help to face the challenge of handling vast amounts of data and subsequently extracting relevant information (here you can find our little collection of useful tools).

One powerful tool is detective.io, a platform that allows you to store and mine all the data you have collected on a precise topic. Developed by Journalism++, a Berlin- and Paris-based agency for data-journalism, it was launched one year ago.

By now, several investigations that used the tool have made headlines in Europe, amongst others The Belarus Network, an investigation about Belarus’ president Alexander Lukashenko and the country’s elite affairs by French news channel France24, and, most notably, The Migrants Files, a database on the more than 25,000 migrants who have died on their way to Europe since 2000. According to the developers at Journalism++, the applied methodology, measuring the actual casualty rate per migration route, has now been picked up by UNHCR and IOM. Another example is a still ongoing investigation on police violence, started by NU.nl, the main news website in the Netherlands.

What does detective.io do?

Basically, detective.io lets you upload and store your data and search for relationships in it with a graph search based on network analysis. The tool, which is open source and still in beta, structures and maps relationships between the subjects of an investigation. These can be a vast number of entities such as organizations, countries, people and events.
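To make the idea of a graph search over relationships a bit more concrete, here is a minimal Python sketch, not detective.io’s actual implementation. It assumes that an investigation is stored as (subject, relationship, object) triples and uses a breadth-first search to find the shortest chain of entities connecting two subjects. All names in the sample data are invented.

```python
from collections import deque

# Hypothetical investigation data: (subject, relationship, object) triples.
RELATIONSHIPS = [
    ("Alice", "director of", "Acme Holdings"),
    ("Acme Holdings", "owns", "Acme Shipping"),
    ("Bob", "employee of", "Acme Shipping"),
    ("Carol", "attended", "Trade Fair 2014"),
]


def build_graph(triples):
    """Index the triples as an undirected adjacency list."""
    graph = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, []).append((obj, relation))
        graph.setdefault(obj, []).append((subject, relation))
    return graph


def find_connection(graph, start, goal):
    """Breadth-first search for the shortest chain of entities linking start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour, _relation in graph.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None
```

For the sample data, `find_connection(build_graph(RELATIONSHIPS), "Alice", "Bob")` walks from Alice through the two companies to Bob, while Alice and Carol have no connection at all and the function returns `None`.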

In its basic version, the tool offers three generic data schemes that help to structure the data you have, for instance on a corporate network, the respective ownerships, branches, individuals involved and so on. To deal with more complex datasets, a customized data scheme is needed. No special skills are needed to use detective.io, but one needs to think hard about which elements of information are needed for the analysis before creating the data structure. However, such custom data schemes are not included in the basic version. The team at Detective.io offers several paid plans that include additional and/or customized data schemes and the corresponding customer support.

There are special offers for NGOs and investigative journalists, too.

One powerful asset of detective.io is that investigations can be shared with collaborators and/or made public. Here you can have a look at what our Open Knowledge Directory looks like on detective.io and explore the relations of organizations and individuals by using the graph search.

Currently, the developers at Journalism++ are working on a new GUI/frontend for detective.io that will allow every user to edit the data schemes by themselves.

Here you can request an account for the beta version, and if you are interested in collaborating on the development of detective.io, you can find the tool’s GitHub here.

Introducing the new Open Knowledge directory with PLP Profiles

During Open Steps’s journey around the world discovering Open Knowledge initiatives, the existence of a global community of like-minded individuals and groups became clear. Across the 24 countries we visited, we met people working on Open Knowledge related projects in every single one of them. Currently, thanks to social networks, blogs, discussion groups and newsletters, this community manages to stay connected and get organized across borders. However, getting to meet the right people can be a difficult task for somebody without an overview of who is who and who is doing what, especially in a foreign country.

My travel companion, Margo Thierry, and I started building a contact list as we met new amazing people during this great journey, and finally realized that sharing this information would have a positive impact. That’s how the Open Knowledge directory came to life, with its aim of increasing the visibility of Open Knowledge projects and helping to forge collaborations between individuals and organizations across borders.

After some iterations, we are now releasing a new version which not only features a new user interface with better usability but also sets a base for continuous development aimed at fulfilling the goals of connecting people, monitoring the status of Open Knowledge worldwide and raising awareness about relevant projects and initiatives worth discovering.

One of the main features of this version is the implementation of Portable Linked Profiles, PLP for short. In case you did not read the article I wrote about the inspiring GET-D conference last month, where I spoke about it for the first time: PLP allows you to create a profile with your basic contact information that you can use and share. By basic contact information I mean the kind of information you are used to typing into dozens of online forms. Whether registering on social networks, accessing web services or leaving feedback in forums, it is always the same information: Name, Email, Address, Website, Facebook, Twitter, etc. PLP tries to address this issue but also, and most importantly, allows you to own your data and decide where you want it to be stored.

By implementing PLP, this directory no longer makes use of the old Google Form and now allows users to edit their data and keep it up-to-date easily. For the sake of re-usability and interoperability, listing your profile in another directory is as easy as pasting the URI of your profile into it, and done! If you want to know more about PLP, kindly head to the home page or to the GitHub repository with the documentation. PLP is Open Source software and is based on Open Web Standards and Common Vocabularies.
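As an illustration of the idea, and not of PLP’s actual data format, here is a small Python sketch that describes a profile as JSON-LD using schema.org’s `Person` vocabulary and lets a directory store nothing but the profile’s URI. The choice of JSON-LD and schema.org, the function names and the example addresses are all assumptions made for this example; please check the PLP documentation for the real schema.

```python
import json


def build_profile(profile_uri, name, email, website, accounts):
    """Assemble a JSON-LD description of a person using schema.org terms."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "@id": profile_uri,
        "name": name,
        "email": email,
        "url": website,
        "sameAs": list(accounts),
    }


def list_profile(directory, profile_uri):
    """A directory keeps only the URI; the owner stores the profile itself."""
    if profile_uri not in directory:
        directory.append(profile_uri)
    return directory


# Hypothetical example values.
profile = build_profile(
    "https://example.org/profiles/jane",
    "Jane Doe",
    "jane@example.org",
    "https://example.org",
    ["https://twitter.com/janedoe"],
)
print(json.dumps(profile, indent=2))
```

The point of the design is that the profile lives wherever its owner decides, and every directory that lists it only needs to remember one URI.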

We invite you now to register on our Open Knowledge directory if you are not there yet or update your information if you are. This directory is meant to be continuously improved so please drop us a line if you have any feedback, we’ll appreciate it.

The best of ConDatos, the top Open Data event of Latin America

Three weeks ago, a most important series of Open Data events took place in Mexico City. The biggest megacity of the whole American continent was chosen to hold the second edition of the regional conference for Open Data, ConDatos, after the success of the 2013 edition in Uruguay.

The main exhibition was enhanced by many parallel conferences, meetings, workshops and hackathons, with the objective of showing that Latin American countries have not only joined the Open Data movement, but have also understood its potential and are determined to make use of it.

ConDatos: A new conference on its way to earn a global reputation

Apart from the well-known Open Knowledge Festival, very few Open Data events reach this size, financially and logistically speaking. The organizers clearly wanted to show the world that Latin America, and especially Mexico, had taken the Open Data turn.

The gathering took place in prestigious cultural venues: the Biblioteca Municipal de Mexico and the Cineteca Nacional, buildings that are big enough to welcome 180 speakers, 1,000 registered attendees and 15 sponsors (such as Google, IBM or Deloitte), and to host 50 conferences over 2 days, according to the information provided by the organizers.

And these figures don’t even take into account the numerous parallel events that took place during the week: workshops, hackathons, a “disconference” and other community meetings, which gathered developers, lawyers, lobbyists, aid workers, entrepreneurs and public officials.

It was clearly meant to be a complete review of the region’s challenges and opportunities, covering diverse themes such as economic development, mapping, journalism, privacy, health, environment, civic engagement, administrative transparency, international politics, data science and open licences.

The finest to exhibit regional potential on Open Data matters

ConDatos gathered most of the international “crème de la crème” of Open Data and Transparency (and also a bit of Open Source): Open Knowledge Foundation, Open Data Institute, Transparency International, Sunlight Foundation, Knight Foundation… and even some high-level representatives of public institutions, such as the OECD and the Secretaries for digital transformation of Mexico, Chile and Uruguay.

Naturally, all the relevant local actors were there, such as Ciudadano Inteligente, Desarrollando América Latina, Argentina’s La Nación datablog, Wingu, and Codeando Mexico. Many of them attended an Open Knowledge Foundation meeting after the conference, which was an occasion to acknowledge the importance of the Open Data community in Latin America: the Argentinian chapter of the Open Knowledge Foundation alone claims about 500 volunteers, and many local groups were represented, such as Costa Rica, El Salvador, Mexico, and Brazil.

Codeando Mexico, the organization responsible for running the first Open Data website in Mexico, told us about some of the very innovative features of their portal, showing that a civil society initiative can be an interesting alternative to governmental portals. Codeando Mexico’s portal uses the OKFN’s open source software, CKAN, and integrates two tools that originated at Google and are highly appreciated by data users: OpenRefine and Google BigQuery (analysis of massive data).

ConDatos 2014 has definitely shown that Latin America is bursting with energy when it comes to Open Data matters. After such a demonstration, the event is likely to earn a reputation and become a reference at the global level. We will see next year in Santiago de Chile, where the 2015 edition will take place.

Beyond the general optimism, reluctance to transparency and lack of startups

While there is clearly a shared optimism about the numerous Open Data initiatives and their potential to bring change and innovation, a few remarks can be made about both active transparency (governments intentionally releasing data) and passive transparency (citizens asking for public information).

Regarding active transparency first, most government open data portals list only a few datasets. El Salvador, for instance, has only 57 datasets on its portal (as a comparison, there are more than 13,000 datasets published in France, and more than 150,000 on the US portal). Chile does a bit better with about 1,200 datasets, but Brazil’s 350 datasets don’t look impressive considering the size of the country and of its administration. Argentina once again seems to be ahead: not only does it have a well-stocked national open data website, but two of its biggest cities have one too: Buenos Aires (26 datasets) and Bahia Blanca (200 datasets).

Regarding passive transparency, many participants complained about the difficulty of accessing public data, even where a transparency law exists. The administration regularly shows reluctance, through excessive paperwork or excessively long processes. In some countries, Open Data advocates even say they fear retaliation if they ask for compromising data.

Besides, startups seemed underrepresented. Although there were a few, like Junar, Socrata and Grupo Inco, almost every speaker represented either an NGO or a public entity, giving the impression that Open Data was only a dialogue between civil society and governments that left the private sector out.

In Europe, startups such as ScraperWiki in the UK, Data Publica in France or Spazio Dati in Italy helped shape the Open Data environment of their respective countries. We can only hope that a data startup movement will start to grow in Latin America and contribute to building a productive Open Data environment.

Visualising Daily Traffic in Santiago metro

The Santiago metro is one of the main transportation systems for the city’s 6.5 million inhabitants. A data visualization of the average daily traffic density in the Santiago metro, made by Data Publica and Inria Chile, helps to better understand how daily traffic is organized in Santiago.

This is an interactive data visualization (dataviz) showing the traffic in the Santiago metro on an average working day, for each half hour of the day. The traffic density is measured by the number of people entering the metro. The data was found on the Chilean government’s Open Data portal: datos.gob.cl.

It shows where passengers come from, regardless of where they are going and what connections they may have made during their trip. The size of a circle is proportional to the number of people entering the corresponding station, and the ranking buttons add up all the data for the day, the morning or the afternoon.

As expected, more entries into the metro are registered in areas with a high density of companies or universities: Santiago Centro, Providencia, Las Condes. Surprisingly, however, the station registering the most entries is far from the center: La Cisterna, with an average of 74,133 entries a day.

The only data available to measure traffic density is the number of people entering each station, originally published by the Undersecretariat of Transport (a department of the Chilean Ministry of Transport and Telecommunications). A lot of interesting data is still missing, for instance the number of people exiting each station, the number of transfers at main hubs (like Baquedano), the number of people taking a bus before or after taking the metro, etc.

To estimate the number of people entering a specific metro station, the Ministry used an Origin-Destination matrix of working-day trips built by the Universidad de Chile. The sample was taken in a regular working week (Monday to Friday) of April 2012.

To estimate the average number of entries for each half hour, the data for every day of the week was added up and then divided by 5, the number of working days in a week.
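The averaging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (day, half-hour slot, entries), the function name `average_entries_per_slot` and the sample figures are invented for the example and do not come from the Ministry's dataset.

```python
from collections import defaultdict


def average_entries_per_slot(records, working_days=5):
    """Sum the entries of each half-hour slot over the week, then divide by 5.

    `records` is an iterable of (day, slot, entries) tuples; this layout is
    an assumption made for the sketch, not the real dataset's format.
    """
    totals = defaultdict(int)
    for _day, slot, entries in records:
        totals[slot] += entries
    return {slot: total / working_days for slot, total in totals.items()}


# Invented sample figures for one station and two half-hour slots.
sample_week = [
    ("Mon", "08:00", 100), ("Tue", "08:00", 120), ("Wed", "08:00", 110),
    ("Thu", "08:00", 90), ("Fri", "08:00", 80),
    ("Mon", "08:30", 50), ("Tue", "08:30", 60), ("Wed", "08:30", 40),
    ("Thu", "08:30", 55), ("Fri", "08:30", 45),
]

averages = average_entries_per_slot(sample_week)
print(averages)
```

With these made-up numbers, the 08:00 slot sums to 500 entries over the week, which gives an average of 100 entries per working day.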

The Ministry of Transport’s datasets were combined with another dataset, the Feed GTFS de Transantiago, in order to obtain the geographical position of each station. The geographical coordinates of the whole Santiago zone were obtained here.
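As a rough illustration of this kind of join, the sketch below attaches coordinates from a GTFS `stops.txt` file to per-station entry counts. The fields `stop_name`, `stop_lat` and `stop_lon` are standard GTFS columns, but matching stations by name is an assumption, the helper names are hypothetical, and the coordinates in the sample are placeholders rather than real positions. Only the La Cisterna figure comes from the article.

```python
import csv
import io


def load_stop_positions(stops_txt):
    """Read GTFS stops.txt content and index (lat, lon) by normalized name."""
    reader = csv.DictReader(io.StringIO(stops_txt))
    return {
        row["stop_name"].strip().lower(): (float(row["stop_lat"]), float(row["stop_lon"]))
        for row in reader
    }


def attach_positions(entries_by_station, positions):
    """Join entry counts with coordinates; report stations that did not match."""
    located, missing = {}, []
    for name, entries in entries_by_station.items():
        key = name.strip().lower()
        if key in positions:
            lat, lon = positions[key]
            located[name] = {"entries": entries, "lat": lat, "lon": lon}
        else:
            missing.append(name)
    return located, missing


# Placeholder stops.txt content: the coordinates are NOT real station positions.
sample_stops = (
    "stop_id,stop_name,stop_lat,stop_lon\n"
    "1,La Cisterna,-33.5,-70.7\n"
    "2,Baquedano,-33.4,-70.6\n"
)

# 74133 is the daily average quoted in the article; the second row is invented.
sample_entries = {"La Cisterna": 74133, "Unknown Station": 10}

located, missing = attach_positions(sample_entries, load_stop_positions(sample_stops))
print(located)
print(missing)
```

Returning the unmatched names separately makes it easy to spot stations whose spelling differs between the two sources, which is a common problem when joining on names instead of identifiers.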

Thanks to the map, and especially the sums of passengers for the day, the morning and the afternoon, it is possible to identify residential areas (more passengers in the morning) and working districts (more passengers in the afternoon). The traffic density is indeed higher in business districts such as Santiago Centro (Universidad de Chile and Los Héroes stations) and Providencia/Las Condes (Tobalaba, Pedro de Valdivia, Escuela Militar, Manquehue stations). The traffic density is lower in the other districts of Santiago, as they are mainly residential.
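The morning versus afternoon reading of the map can be expressed as a small rule of thumb. The sketch below is an assumption-laden illustration, not the method used for the visualization: the noon split, the function names and the sample figures are all invented for the example.

```python
def split_day(entries_by_slot, noon="12:00"):
    """Sum entries before and from noon; slots are zero-padded 'HH:MM' strings."""
    morning = sum(v for slot, v in entries_by_slot.items() if slot < noon)
    afternoon = sum(v for slot, v in entries_by_slot.items() if slot >= noon)
    return morning, afternoon


def classify_station(entries_by_slot):
    """Label a station following the article's reading of the map."""
    morning, afternoon = split_day(entries_by_slot)
    if morning > afternoon:
        return "residential"  # people enter here in the morning to go to work
    if afternoon > morning:
        return "working"  # people enter here in the afternoon to go home
    return "balanced"


# Invented half-hour figures for two hypothetical stations.
suburb_station = {"07:30": 900, "08:00": 1200, "18:00": 400, "18:30": 300}
office_station = {"07:30": 200, "08:00": 300, "18:00": 1500, "18:30": 1100}

print(classify_station(suburb_station))
print(classify_station(office_station))
```

A simple comparison like this ignores hubs such as La Cisterna, where bus connections rather than land use drive the number of entries.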

It is also a tool for identifying some unexpected hubs. For instance, La Cisterna station is rather far from the center, yet it registers the highest number of entries of the whole metro system! It can be concluded that it is a hub within an important bus network covering the whole south-eastern part of Santiago.

This article has been translated by its author Louis Leclerc from Spanish into English. The original article in Spanish can be found here.

Waiting for the new French Digital law

According to the latest UN E-Government Survey, published this year, France is at the top of the list of countries with a high level of e-government development, ranking 1st in Europe and 4th worldwide. The study particularly praises the good integration of e-services through the online platform service-public, launched in 2005, which gives citizens, professionals and associations access to administrative information (on their duties and legal texts, among others), simplifies procedures and provides a large civil service directory. Not to forget Legifrance and vie-publique, which document legal matters and current affairs online. Let’s just say that efforts towards a transparent public administration have been the leitmotiv behind these initiatives.

If we look at the Open Data side, we come to data.gouv.fr, the national Open Data platform launched in December 2011. It is now in its second version, this time developed with CKAN and free of any fees so that the data actually gets re-used. Those fees were one of the black marks listed in the 2013 OKFN Index, which ranked France 16th among 70 countries from all continents. Other negative points included the lack of relevant data such as government spending or budget, and the too-low resolution of maps from the National Institute of Geographic and Forest Information. Thus, although a national Open Data strategy has been in place since 2011, there is still a lot to be done. Above all, a law (currently being drafted) is needed to push local and regional administrations to release their data in an open way, because the situation varies greatly from place to place.

Actually, the French OD movement took root at the local level. It started in Brittany, in western France, where the city of Brest decided in March 2010 to release its geographical data. At the same time Rennes, the region’s main city, launched an OD site dedicated to transport data and, a couple of months later, the first OD platform in France, which was multi-sectoral and contained various web and mobile apps besides the datasets. A similar site in Nantes, then regional initiatives in Loire-Atlantique and Saône-et-Loire, followed during autumn 2011. Today, the map of the local and regional OD movement in France made by LiberTIC shows the commitment of administrations at different levels (regions, cities and even villages, such as Brocas with OpérationLibre) in different parts of the country, as well as the creation of civil society groups.

According to the current draft of the law on decentralization, which requires French towns to release their data as open data, only municipalities with over 3,500 inhabitants will be affected, which means that 92% of them are excluded. In addition, the obligation is limited to data already available electronically, and no format or standard has been specified. In any case, the law has to comply with European Directive 2013/37/EU on the re-use of public sector information, known as the PSI Directive, which strengthens the Open principles and has to be transposed into national law by each EU member country by 18 July 2015. In France, Etalab, a special committee created in 2011 and dedicated to the governmental OD strategy, is in charge of the implementation.

The French FOI law dates back to 1978. It was modified in 2005 by an order, in line with European Directive 2003/98/EC, the first legislative measure to shape the European framework for Open Data, which was amended by the above-mentioned Directive of 2013. In preparing the implementation of the latter through the law on decentralization and another on digital technology, France has appeared very active in recent months, and hopefully that is a good omen for the future. Last April, Etalab organised a national conference on Open Data and Open Government, inviting representatives of the private sector and civil society. The future appointment of a Chief Data Officer was announced (the person is still to be designated), as well as the participation of the French government in the Open Government Partnership (OGP); France will even join the OGP steering committee from 1st October. Last but not least, the Senate published in June a report on access to administrative documents and public data. It supports the efforts made by the government since 2011 to release public data into the public domain, but underlines that the results so far are not up to the actual challenges and do not meet the expectations of civil society either. Too often the data is incomplete or available only in an unfriendly format, its quality varies depending on the administration, and updates and metadata are missing, revealing a lack of resources and a reluctance to embrace Open Data. The report ends with 16 recommendations, such as the use of visualisations to make the data more comprehensible for users, which should be taken into consideration in the preparation of both upcoming laws.