The best of ConDatos, the top Open Data event of Latin America

Three weeks ago, a major series of Open Data events took place in Mexico City. The biggest megacity on the American continent was chosen to host the second edition of the regional Open Data conference, ConDatos, after the success of the 2013 edition in Uruguay.

The main exhibition was complemented by many parallel conferences, meetings, workshops and hackathons, with the aim of showing that Latin American countries have not only joined the Open Data movement but have also understood its potential and are determined to make use of it.

ConDatos: A new conference on its way to earning a global reputation

Apart from the well-known Open Knowledge Festival, very few Open Data events reach this size, financially and logistically speaking. The organizers clearly wanted to show the world that Latin America, and especially Mexico, had taken the Open Data turn.

The gathering took place in prestigious cultural venues: the Biblioteca Municipal de Mexico and the Cineteca Nacional, buildings large enough to accommodate 180 speakers, 1000 registered attendees and 15 sponsors (such as Google, IBM or Deloitte), and to host 50 conferences over 2 days, according to the information provided by the organizers.

And these figures don’t even take into account the numerous parallel events that took place during the week: workshops, hackathons, an “unconference” and other community meetings, which gathered developers, lawyers, lobbyists, aid workers, entrepreneurs and public officials.

It was clearly meant to be a complete review of the region’s challenges and opportunities, covering diverse themes such as economic development, mapping, journalism, privacy, health, environment, civic engagement, administrative transparency, international politics, data science and open licences.

The finest on hand to showcase the region’s Open Data potential

ConDatos gathered most of the international “crème de la crème” of Open Data and Transparency (and also a bit of Open Source): Open Knowledge Foundation, Open Data Institute, Transparency International, Sunlight Foundation, Knight Foundation… and even high-level representatives of public bodies such as the OECD and the secretaries for digital transformation of Mexico, Chile and Uruguay.

Naturally, all the relevant local actors were there, such as Ciudadano Inteligente, Desarrollando América Latina, Argentina’s La Nación datablog, Wingu, and Codeando Mexico. Many of them attended an Open Knowledge Foundation meeting after the conference, which was an occasion to acknowledge the importance of the Open Data community in Latin America: the Argentinian chapter of the Open Knowledge Foundation alone claims about 500 volunteers, and many local groups were represented, including those from Costa Rica, El Salvador, Mexico and Brazil.

Codeando Mexico, the organization behind the first Open Data website in Mexico, told us about some of the very innovative features of their portal, showing that a civil society initiative can be an interesting alternative to governmental portals. Codeando Mexico’s portal uses the OKFN’s open source software, CKAN, and integrates two made-in-Google tools highly appreciated by any data user: OpenRefine and Google BigQuery (analysis of massive data).
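
For readers curious about what a CKAN-based portal offers to developers, here is a minimal Python sketch. It relies on CKAN's Action API, whose `package_list` action returns the names of the datasets on a portal as JSON. The base URL and the sample response below are made up for illustration and are not taken from Codeando Mexico's portal; no network request is made.

```python
import json
from urllib.parse import urljoin

# Hypothetical portal address, used only for illustration.
EXAMPLE_PORTAL = "https://portal.example.org"

# Hand-written sample in the shape CKAN's Action API uses
# ("success" flag plus "result" payload); the dataset names are invented.
SAMPLE_RESPONSE = '{"success": true, "result": ["budget-2014", "bus-stops"]}'


def package_list_url(portal_base):
    """Build the URL of the CKAN 'package_list' action for a portal."""
    return urljoin(portal_base.rstrip("/") + "/", "api/3/action/package_list")


def dataset_names(response_text):
    """Extract the dataset names from a 'package_list' JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("the portal reported a failed request")
    return payload["result"]


print(package_list_url(EXAMPLE_PORTAL))
print(len(dataset_names(SAMPLE_RESPONSE)), "datasets listed")
```

Counting the entries returned by this action is one simple way to compare the size of different CKAN portals.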

ConDatos 2014 has definitely shown that Latin America is bursting with energy when it comes to Open Data matters. After such a demonstration, the event is likely to earn a reputation and become a reference at the global level. We will see next year in Santiago de Chile, where the 2015 edition will take place.

Beyond the general optimism, reluctance towards transparency and a lack of startups

While there is clearly a shared optimism about the numerous Open Data initiatives and their potential to bring change and innovation, a few remarks can be made about both active transparency (governments proactively releasing data) and passive transparency (citizens requesting public information).

On active transparency first, most government open data portals list only a few datasets. El Salvador, for instance, has only 57 datasets on its portal (by comparison, more than 13,000 datasets are published in France, and more than 150,000 on the US portal). Chile does a bit better with about 1200 datasets, but Brazil’s 350 datasets don’t look impressive considering the size of the country and of its administration. Argentina seems once again to be ahead: not only does it have a well-stocked national open data website, but two of its biggest cities have one too: Buenos Aires (26 datasets) and Bahia Blanca (200 datasets).

On passive transparency, many participants complained about the difficulty of accessing public data, even where a transparency law exists. The administration regularly shows reluctance, through excessive paperwork or excessively long processes. In some countries, Open Data advocates even say they fear retaliation if they ask for compromising data.

Besides, startups seemed underrepresented. Although a few were present, such as Junar, Socrata and Grupo Inco, almost every speaker represented either an NGO or a public entity, giving the impression that Open Data was only a dialogue between civil society and governments, leaving the private sector out of it.

In Europe, startups such as ScraperWiki in the UK, Data Publica in France or Spazio Dati in Italy helped shape the Open Data environment of their respective countries. We can only hope that a data startup movement will start to grow in Latin America and contribute to building a productive Open Data environment.

Tabula: Liberating data tables trapped inside PDF files @ Buenos Aires, Argentina

In the context of the Open Data movement, we are currently witnessing how organisations (whether public administrations or private corporations) are increasingly releasing data into the public domain. The intention behind this may be to become more transparent or to encourage developers to build useful applications on top of the published data.

For the sake of re-use, this information should ideally be stored in a well-structured, machine-readable file, formatted as XML, CSV or Excel. However, this is not always the case: although such organisations are willing to share the data, the format is often poorly chosen, which in some cases makes the information useless. This is the case with PDF files. PDF is a format originally designed for content meant to be printed. That is why such files support paging and paper-like sizing and can contain indexes, but in no way achieve the goal of storing large amounts of structured data as we expect from Open Data.

Activists, journalists or researchers wishing to analyse large amounts of information published in PDF files often have to give up because of the effort involved in extracting all the numbers from the files. That is why we want to introduce you to Tabula, a tool that helps extract the information contained in tables inside PDF files.

Developed by Manuel Aristarán with the help of other fellows working on data journalism, Tabula can be installed on any computer (Windows, Mac or Linux) and, as if by magic, extracts the information from tables present in PDF files, exporting it directly to a neatly formatted CSV file. The interface makes the tool really easy to use, allowing the user to “draw” a box to select the relevant information. This saves a lot of valuable time.

However, it is important to note that for now only text-based PDFs are supported, not scanned documents, whose internal structure is significantly different. Support for scanned documents would make the tool extremely powerful and is at the top of the improvements wish-list. Did we mention that Tabula is Open Source? That means you can contribute by improving it if you are a developer (OCR gurus more than welcome!), by suggesting improvements or by giving your feedback as a user.
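
To give an idea of what becomes possible once a table has been exported to CSV, here is a minimal Python sketch using only the standard library. The sample data, the column names and the `column_total` helper are invented for illustration; they are not part of Tabula, and a real export may need different clean-up.

```python
import csv
import io

# Hypothetical sample of a CSV file exported from a PDF table;
# the columns and values are invented for this example.
SAMPLE_EXPORT = 'Item,Amount\nTravel,"1,200.50"\nOffice,300\nOther,\n'


def column_total(csv_text, column):
    """Sum one numeric column of a CSV export, skipping empty cells."""
    total = 0.0
    for row in csv.DictReader(io.StringIO(csv_text)):
        cell = (row.get(column) or "").strip()
        if cell:
            # Assumption: commas are thousands separators, as in "1,200.50".
            total += float(cell.replace(",", ""))
    return total


print(column_total(SAMPLE_EXPORT, "Amount"))
```

With a real file, the same function could be fed the text read from the exported CSV instead of the sample string.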

Meeting @ Cargografías, Buenos Aires, Argentina

As a matter of fact, most of the experts and participants who gather at hackathons and events around Open Data / Open Government come from the IT or media scene. But Open Data and Open Government are not a private club for coders and journalists. You can contribute your two cents whatever you do. Designers are also among the hacktivists initiating and developing such projects, and we very much enjoyed exploring this perspective through the work of Andrés Snitcofsky.

Both a graphic designer and a professor of heuristics at the University of Buenos Aires, Andrés had the idea in September 2011 of building what later became Cargografías, an interactive timeline visualisation of the highest positions in the Argentinian political sphere. Users can search by position or name to explore and easily understand how the political framework is structured, how the different positions of power relate to one another and how senior politicians have replaced one another over the years. The idea arose in the context of Argentina’s economic crisis in 2011 and, although there was too little time to finish it for the presidential elections in October 2011, the tool was eventually completed and proved very useful during the following campaign in 2013.

The project was developed within Hacks/Hackers Buenos Aires (HHBA), a group created around the same time in 2011 that now has nearly 2500 members, making it the biggest Latin American local group of the international grassroots journalism organization. Together with other members of HHBA, Andrés collected the raw data, researching on Wikipedia or scraping information from other relevant sources before sorting it manually into spreadsheets. The current version, 2.0, is the result of this collaboration and, although it already represents a great piece of work, some updates are needed and new features could be added to extend its capabilities. User feedback, built in as a participatory feature, helps to point out what could be improved and which content still needs to be completed.

The initial team has sadly changed and Andrés is now looking for a developer to implement the next version. This is a call for a coder! Cargografías will be released as Open Source as soon as some help (whether from Argentina or elsewhere) is found, since the tool is definitely worth replicating in other countries and political contexts.

Meeting @ GarageLab, Buenos Aires, Argentina

Our aim of documenting Open Knowledge initiatives in Buenos Aires led us this time to GarageLab, a great community-run hackerspace. Dario Weiner, co-founder and coordinator, welcomed us to their fantastic space and gave us lots of details about its origins, philosophy, and past and ongoing projects. Once again, we could feel how such open spaces are the best environment for developing open cultures and knowledge sharing.

This group of enthusiasts (more than forty members today) started meeting around 2009. Since then, their ambitious goal has been to enrich the innovation ecosystem in Argentina from within civil society, contrary to the established assumption that such a thing only happens within the walls of universities or private corporations.

Since its creation, the GarageLab community has worked on a wide spectrum of technology-related fields, which they call BANG: Bits, Atoms, Neurons & Genes. In June 2012 they finally found a space to set up their machines (3D printers, laser cutters and other geekery), and started sharing knowledge through regular workshops and joining forces on projects in topic-based meetings.

What we found most relevant about GarageLab is the problem-solving approach that characterizes this community. As Dario told us, addressing social issues became one of the main focuses shortly after the creation of the group. That is how some of their projects started: a study of the pollution of the Buenos Aires creek, their collaboration with NGOs and advocates to explore maternal mortality, or the production of “happier” and more suitable chairs for children’s schools.

It was also very interesting for us to discover that several of these socially oriented projects are based on the collection, analysis and visualisation of Open Data. And as a matter of fact, GarageLab can be described as a multidisciplinary community, since it is made up not only of programmers but also of designers, journalists and artists.

Meeting @ GCBA, Buenos Aires, Argentina

In our experience, Open Government initiatives are usually implemented first at the national level before being applied in regions and cities. This makes it possible to test and experiment with technologies and mechanisms before adapting them to more local administrative scenarios. The case of Argentina is different, since it is the administration of the capital, the government of the city of Buenos Aires (GCBA), that is pioneering in this area.

Gonzalo Iglesias, Chief of Cabinet at the Open Government Directorate in the Ministry of Modernisation of Buenos Aires, welcomed us to his office. Located in the very centre of the city, the building also serves as a co-working space and a laboratory for experimenting with Open Data and citizen-oriented tools. The first tasks of this young and passionate team, assembled two years ago, were to integrate and improve the existing digital services of the municipality’s various directorates and, at the same time, to devise new mechanisms promoting transparency and empowering citizens. The launch of the Open Data platform following the Open Government legislative decree in 2012 has been one of the main milestones in their work and has served as a reference for similar initiatives in the country, such as that of the city of Bahia Blanca.

The team, made up of fewer than 20 members, focuses its efforts on two main fronts. The first is to assist the numerous municipal sections and agencies in the process of releasing Open Data. This involves sharing know-how and sometimes even advocating for the benefits of sharing public information with citizens. To address these challenges, the team organises a yearly unconference called GobCamp, where civil servants can learn and discuss in small working groups how to make data available and develop Open Government instruments. The second focus of the Open Government Directorate’s team is to ensure that the information released into the public domain is actually in demand and ultimately used. To achieve this, two hackathons and App Challenges, open to everyone interested, have already been organised and have proved quite successful, judging by the impressive number of civic apps designed so far.

Buenos Aires has had a FOIA law since 1998, but an equivalent at the national level is still missing. Last year, however, the national government committed to initiatives such as the Open Government Partnership, created a national platform to host Open Data and offer it to users for download, and even organised a hackathon to encourage developers, designers, journalists and other interested parties to think of ways to turn the available datasets into something valuable for society. But the fact that Argentina is a federal republic, and that the party governing the nation opposes the one governing the capital, makes a collaborative political environment difficult. Closer cooperation, in which everyone could learn from others’ experiences, could definitely accelerate the move towards more Open Government in Argentina.

Meeting @ La Nación Data, Buenos Aires, Argentina

Data journalism is one of the topics we have been following continuously throughout our journey, but until now we had never had the opportunity to visit a newsroom. This finally happened last week with our meeting at La Nación in Buenos Aires. One of the oldest newspapers in Argentina, La Nación has been a technology pioneer in recent years: not only was it one of the first newspapers to launch an online edition, in 1995, but it also has a dedicated and passionate team focusing its work on Open Data. This is actually what brought us there.

La Nación Data (LNData) was founded as an internal section in 2010 with the aim of using and promoting the power of Open Data for journalistic purposes. They created their own Open Data platform, where users can find and download numerous datasets containing valuable information relevant to Argentinian citizens. As Digital Media Researcher Flor Coelho explained to us, there is a huge effort behind each collection of data released on the platform. Key figures, such as those showing the dramatic variation in inflation rates, cannot be found in such a usable form in other sources.

Their efforts have already been recognised with several data journalism awards, such as the Online Journalism Award of the University of Miami and GEN’s Data Journalism Award. The latter was received in 2013 for their project “Gastos del Senado 2004-2013”. By extracting and analysing the data from over 33,000 scanned documents downloaded from governmental sites, the investigation team uncovered several major irregularities involving public funds. We invite you to watch the video below to get more detailed information.

After the impact of these results, and with a large number of documents still to analyse, LNData did not stop there and had the brilliant idea of encouraging citizens to participate in the investigation process. This is how VozData was created, a platform where everyone can help gather information from the remaining papers. Since its launch, over 350 citizens have got involved, freeing the data from more than 3400 scanned documents. The goal is to turn the contents into useful data that can be analysed accurately, thus bringing transparency to how public money is being used. That is reason enough to motivate users to invest their time in this collaborative challenge. The code of the site will be released as Open Source as soon as the last features are implemented and the remaining bugs fixed.

Another example of the great work LNData does in the field of data journalism is its research on the number of casualties caused by the floods that struck the city of La Plata on the 2nd of April 2013. An efficient analysis and visualisation of the information contained in death records revealed more victims than the public authorities had first announced. The publication of the results led to a review of the official number of deaths.

Besides its journalistic research, LNData puts a lot of effort into advocating for Open Data and sharing its experience with others. Events are organised regularly, not only internal training sessions but also public workshops and conferences such as the Datafest. With its third edition taking place in October 2014, this will be a fantastic opportunity for journalists and communication experts, developers, designers and everyone interested in the topic to exchange, learn and come up with new ideas. Save the date if you happen to be in Buenos Aires!