“Open Data Now”: an inspiring read for Open Data entrepreneurs @ New York, USA

Written by Joel Gurin, “Open Data Now” was published early this year and “presents a strategy for success in the coming era of massive data”. The author, a former science journalist who also worked as a consumer advocate before moving into the federal administration, has the experience to give us an overview of how Open Data is affecting both the private and public sectors. Although useful for everyone interested in the topic, this book is especially aimed at entrepreneurs, small business owners and corporate executives willing to build additional value on top of Open Data. Let us not forget that its subtitle reads: “The Secret to Hot Startups, Smart Investing, Savvy Marketing, and Fast Innovation”.

But also for citizens, advocates and researchers

The 14-chapter book begins by defining the concept of Open Data and shares details on how the movement originated in the US. Already in the introduction, the author remarks on the positive impact of Open Data on the private sector, and this focus remains present throughout the book. However, it also describes how Open Data acts as a regulatory mechanism that pushes organisations towards being more transparent. Different cases are presented where data is used to raise awareness of social or local issues, to improve public safety or to denounce irregularities that affect citizens.

But private companies and governments are not the only ones being influenced by this new movement. Scientific research is a field where sharing data is also playing a very important role. Stressing the concept of Open Innovation, chapter ten gives examples of research institutions which are directly benefiting from an increasing amount of data being released. Collaboration between scientists and crowdsourcing strategies are presented as new elements for the success of academic and scientific challenges. Reading these pages, we immediately thought of DNADigest.org, the initiative for sharing genomics data for research that we previously covered.

As the reader will notice, the book focuses on the status of Open Data in the US and UK. Most of the examples come from there. It is clear that both countries are leaders in this global movement, but what about the others? Can the contents of this book be applied to other parts of the world? We were curious about this and asked the author:

  • Mr. Gurin, your research is mainly based on the context of the US and UK. As you mentioned, Open Data is now a global topic and we can find actors on every continent. What was the reason for leaving out other countries? Should we expect a second book with a worldwide approach?

I can’t say yet whether I’ll write a second book – I’m still spreading the word about this one! I focused on the US and UK in Open Data Now because this book is largely focused on business applications of Open Data, and my sense is that those have developed first and most extensively in those two countries. However, the Open Data 500 project (see below) has attracted interest from countries around the world, and we’re now preparing to replicate it in a dozen countries or more. I hope that will help bring a broader international perspective to the field.

Personal data for consumers’ benefit

In the third chapter, the concept of smart disclosure is presented as a tool to help consumers make better decisions and spend money more wisely in areas such as healthcare, energy or education. Furthermore, an efficient use of open governmental data leads to the creation of new business opportunities, some of which are illustrated in these pages.

The same chapter is also dedicated to the value of personal data. Although this information should not be qualified as open, its use offers benefits both for consumers and service providers (e.g. optimizing shopping, helping to choose the best health insurance or finding a suitable house). It is in this part that we learned for the first time about the “Blue Button” and “Green Button” initiatives, which allow patients and consumers in the US to download their medical and energy consumption reports respectively. We asked Mr. Gurin a second question in order to get more information about this:

  • Mr. Gurin, do you think that users are ready to share their personal data with third parties? At what price? Will this kind of data get the same momentum as Open Data has? Is the internet safe enough to allow a sustainable and secure development of this area?

This is a great question, and one that we can’t answer yet. My best guess is that consumers will be attracted to “personal data vaults” – the new technologies for storing your personal data in a secure way – because they promise a way to keep individual data safe and under the user’s control. Once personal data vaults become common, they’ll offer the opportunity for people to share their personal data selectively and securely with third parties who can help them by knowing more about them. Whether we can make the Internet safe enough to prevent serious data breaches, however, remains a question.

“Open Data Now” is not only a book

As the author states, “the world of Open Data is moving fast, and no book on this topic can be completely current”. That’s why Mr. Gurin has created the website opendatanow.com, which contains a blog with news and links to follow the latest developments, debates and opportunities around the topic. We encourage you to visit it to stay updated and also to learn about the Open Data 500 project: a study run by The Governance Lab, where the author serves as senior advisor. It consists of identifying 500 of the US companies that use open government data to generate new business and develop new products and services. The upcoming release is planned for early 2014 and will allow researchers to download the collected data. It is a very interesting idea that will definitely help to monitor the influence of Open Data on the business sector in the US.

Get the book

Feeling interested? You can get the book or read the first chapter here!

HackYourPhd: reporting on Open Science from the US @ Boca Raton/Paris, USA/France

A summer trip through the US to discover and document Open Science projects? When we first heard about HackYourPhd, we were excited to notice how similar the concept of their research is to our own. The idea was initiated last year by two young French researchers, Célya Gruson-Daniel & Guillaume Dumas, and “aims to bring more collaboration, transparency, and openness in the current practices of research.” Célya travelled for three months from Boca Raton (Florida) to Washington DC, gathering information and meeting people and groups active in the Open Science scene.

While this round trip in the US is now over, HackYourPhd is still active and has become an online community where the research continues. Read the interview below with the two people behind this fantastic initiative and discover how the idea came to life, the insights of the trip and what is coming next.

1) Hi Célya & Guillaume, you both co-founded HackYourPhd, a community focused on Open Science which carried out a globetrotting initiative in the US last year. We are really curious how you came up with this idea and would like to know more about it. Don’t forget to introduce yourselves and the concept of Open Science too!

Hi Margo & Alex, thanks for this interview. We discovered your great project a few months ago. Now, we are very happy to help you since it is closely related to what we tried to do last summer with “HackYourPhD aux States”. But before speaking about this Open Science tour across the USA, let us first recall the genesis and the aim of HackYourPhd in general. HackYourPhd is a community which gathers young researchers, PhD and master students, designers, social entrepreneurs, etc. around the issues raised by the Open Science movement. We co-founded this initiative a year ago. The idea of this community emerged from our shared interest in research and its current practices. Guillaume is a postdoc in cognitive science and complex systems. He is also involved in art-science collaborative projects and scientific outreach. Célya is specialized in science communication. After two years as community manager for a scientific social network based on Open Access, she is now working in science communication for different projects related to MOOCs and higher education. We are both strong advocates for Open Science and that is mainly why we came up with HackYourPhd. While Guillaume has tried to integrate Open Science in his practice, Célya wanted to explore its different facets with a PhD. But first, she wanted to meet the multiple actors behind this umbrella term. This is what motivated “HackYourPhd aux States,” the globetrotter initiative per se.

2) Why did it make sense especially in the US to follow and report Open Science projects? Could you imagine yourself doing it in other countries? What about France?

Because it was in the English-speaking countries that the Open Science movement started. It is thus also there that it is the most developed to date, from Open Access (e.g. PLoS) to the hackerspaces (e.g. noisebridge). There is also a big network of entrepreneurs in Open Science, which is specifically an aspect we were interested in. Célya thus decided to first look at the source of the movement and take time (three months) before doing a similar exploration in Europe with shorter missions (e.g. one week). Concerning France, we have already begun to monitor what is taking off, from citizen science to open data and open access. While we certainly have a clearer picture now, the movement is still embryonic. But the movement will also take other forms and that is also what we are interested in. Célya is thinking of doing her PhD in an action-research mode, being both observer and actor in the dynamic construction of the French Open Science movement.

3) From our experience, we could schedule our encounters and events both before starting the journey and on the way. Was it the same for you? How did you select your stops, the projects documented and the persons interviewed? Is Open Science a widespread topic or was it actually difficult to find cases for your research?

Célya already had a blueprint of the big cities and the main path to follow. With the help of the HackYourPhd community, she gathered many contacts and built a first database of locations to visit and people to meet. Before starting, the first leg (San Diego and the Bay Area) was almost fully scheduled. Then, the rest of the trip was set up on the way. A few important meetings were already scheduled of course (e.g. the Center for Open Science, the Mozilla Science Lab, etc.) but along the way, new contacts were suggested spontaneously by the people interviewed. Serendipity is your friend there! Regarding difficulties in finding cases, this depends a lot on the city. While San Francisco was really easy, Boston for example, which is full of nice projects, was nevertheless more challenging.

4) We know it is difficult to point out just one of them … but could you tell us which is your favourite, or one of the most relevant, of the Open Science initiatives you have discovered?

When Célya was in Cambridge, she visited the Institute for Quantitative Social Science. She met the director of Data Science, Mercè Crosas, and her team. Célya discovered the Dataverse Network project. It is one of the most relevant Open Science initiatives she discovered. Indeed, this project combines multiple facets of Open Science. It consists of building a platform allowing any researcher to archive, share and cite their data. It has many functionalities cleverly linking it to other aspects of Open Science (open access journals with OJS, citation, altmetrics, etc.). Here is the interview with Mercè Crosas.

5) As we discussed previously with Fiona Nielsen, sharing knowledge in the scientific domain has a positive impact. After your research, why does Open Science matter and how does it change the way scientists have been working till now?

Open Science provides many ways to increase efficiency in scientific practices. For example, Open Data allows researchers to collaborate better; while this solution seems obvious to many, it becomes a necessity when it comes to big science (e.g. CERN, ENCODE, Blue Brain, etc.). Open Data also means more transparency, which is critical to address the lack of reproducibility or even fraud.

Open Access presents several advantages but the main one remains guaranteeing everyone access to scientific papers. As a journalist, Célya faced the issue of paywalls many times, and this is always frustrating. Last but not least, Open Science opens up new possibilities for collaboration between academia and other spheres (entrepreneurs, civil society, NGOs, etc.). Science is a social and collective endeavour; it thus needs contact with society and must leave its ivory tower. The Open Science movement is profoundly going in that direction, and that is why it matters.

6) As you know, Open Steps focuses on Open Data related projects. Quoting you, “In Seattle, I noticed a strong orientation of Open Science issues around Open Data.”, could you tell us more about this relation and the current situation in the US? Could you point us to any relevant Open Data initiative that we might want to document?

Open Data depends on the scientific field. Indeed, Seattle was a rich environment on that topic, but this is certainly due to the software culture in the city (Amazon, Microsoft, etc.). The Open Data topic is related to Big Data. Thus, the key domains are genetics, neuroscience, and health in general. Lots of projects are interesting. We already mentioned the Dataverse Network, but you may also enjoy the Delsa Global Project (interview with Eugene Kolker) or Sage Bionetworks.

7) There are a lot of sponsors supporting you. Was it easy to convince them? Is that how you finance 100% of the project or do you have other sources of income?

All the sponsors came through the crowdfunding campaign on KissKissBankBank. It was not a question of convincing them; their support simply demonstrated the need to cover the topic of Open Science in France. Their financial help represents 36% of the total amount collected.

There was no other source of income. The trip was not expensive since Célya used collaborative economy solutions (couchsurfing, carpooling, etc.).

8) Now the trip is over … but HackYourPhd is still running. How is it going now?

We are pursuing the daily collaborative curation, with almost a thousand people on our Facebook group. We are also organizing several events, mainly in Paris but with a growing network with other cities and even countries. The community is self-organized but needs some structure. We are currently thinking about this specific issue and hope 2014 will be a great year for the project!

Thank you both!

Mapping Open Data with CartoDB @ Madrid/New York, Spain/USA

If you have been following Open Steps, you know that a large part of the project consists of running a workshop on Open Data visualisation in the different cities visited. In these sessions, after going through some theory, we get hands-on and teach how geo-referenced datasets can be represented on a map. We wanted to teach an easy but powerful tool that could be used by everyone, so we chose CartoDB. And it was a good choice!

Largely based on Open Source software, this online platform has been conceived to serve journalists, designers, scientists and many others in the task of creating beautiful and informative interactive maps. The developers behind the tool have had Open Data in mind since the first days, and the fact is that importing and visualizing datasets couldn’t be easier or faster. In addition, great features such as dynamic visualizations, support for your favourite Open Data formats and the endless possibilities of its JavaScript API allow beginners as well as big organisations (NASA, The Guardian, National Geographic among others) to tell stories with numbers.

Andrew Hill, member of the team, took some time and answered our questions about the creation and philosophy of the tool, its Open Source core and the importance of Open Data for educational, scientific and social development. We invite you to find out more about CartoDB here:

1) Hi Andrew, can you introduce yourself briefly and explain to us what CartoDB is?

Hi, I’m the senior scientist at Vizzuality and CartoDB. CartoDB is our online mapping platform that we built to let people make beautiful interactive maps easily.

2) Your company, Vizzuality, is based between Madrid and New York. What is the story behind its creation? Besides CartoDB, are you working on other products or have other activities?

Vizzuality was created by our co founders, Sergio Alvarez and Javier de la Torre, both from Madrid. Our first office was in Madrid where we started to grow the company. It wasn’t until a couple years later that Javier and I moved to New York to start the office here. The idea was just to grow and explore new collaborations.

Right now, our biggest focus by far is CartoDB. There is a lot of innovation around maps on the web right now and we are really enjoying contributing to it. CartoDB has become more than we could ever have imagined and now we can see so many ways to keep making it more incredible, so I’m sure we’re going to be focused on it for some time to come.

3) Let’s focus on CartoDB, since it is the tool we are teaching in our workshop. Who is currently using it? Journalists, designers, developers? Can you point us to remarkable projects making use of all the possibilities the tool has to offer?

Yeah, all of those people, plus students, governments, city planners, nonprofits, you name it :)

Sure, I think one of the best places to find recent examples is our blog or on Twitter. Some highlights include:

http://illustreets.co.uk/

http://clearstreets.org/

http://here.com/livingcities/

Twitter has been using us for a lot of quick visualizations

http://projects.aljazeera.com/2013/syrias-refugees/index.html

http://sweeten.com/maps

and many more…

4) CartoDB, like the rest of your products, is based on open source software and its code is released to the public domain. What is your motivation behind this decision? For your company and the development of your products, what is the impact of choosing an Open Source license?

We have always been committed to open source. Largely it has to do with our background as a scientific company; working with and interacting with scientific research, it seemed obvious to us that science benefits greatly from open source. Not only does it benefit from it, it almost seems irresponsible to do anything else.

With the importance of maps in society, I feel it also seems irresponsible to rely on black boxes for mapping. CartoDB doesn’t hide anything from you, it is there for you to criticize, improve or change as you need.

5) As we know, Open Source does not necessarily exclude commercial products. What is the business model for your products?

We offer a lot of incentives on top of our hosted service, including our caching, backups, uptime, maintenance, upgrades, etc. With paid hosting plans you also get dedicated support and access to the foremost experts of CartoDB to help you become a better mapper, data visualization expert, or GIS expert on our platform. So there are a lot of benefits that using our hosted platform can bring to businesses and individuals, and we are already seeing that businesses are being built around that. It feels great.

6) Let’s talk about the community around CartoDB. Do you receive feedback from users or from developers to improve the tool? How important is it for an Open Source-based product to count on such contributions?

We have received a lot of feedback from our users including feature requests. We also do our best to contribute to the open source libraries that are used by CartoDB, so it is very much a community effort and that community is what makes it all possible for sure.

7) In our workshop, we teach how to import and visualise Open Data with CartoDB. Is the tool specifically designed to be used with Open Data? In your opinion, why do Open Data and its visualisation matter?

We think about open data when developing CartoDB all the time. I wouldn’t say that is the sole target of our tool development; a lot of private companies are using CartoDB to analyse and map data that is part of a business offering, so not open. However, we think that visualizing open data can be a very powerful method of educating and demonstrating its contents and importance. The title of a recent article about some maps I created shows that I’m not alone in thinking that.

8) We recently saw that you have released great new features (dynamic visualisation, live data feeds, …). How do you set the priorities of the features you are developing? What are the next features you are working on? And in general, what does the future for CartoDB look like?

I’d say we balance three things as best we can when going for new features in CartoDB: what users express they want or need, what we see as improvements that can be made in performance, simplicity or design, and functionality that we see as innovations that we hope users will love :)

Thanks Andrew!

 

Workshop @ Transparency International Cambodia, Phnom Penh, Cambodia

For our last intervention in this busy week in Phnom Penh, we were hosted by Transparency International Cambodia. The office was created in 2010 and, as the organisation does worldwide, its team works actively in the Southeast Asian country promoting transparency and fighting against corruption. A practical example of their activities is the adoption of the platform bribespot.com for campaigning against bribery, sadly a recurrent subject in Cambodian daily life.

The session was a great opportunity to discuss with the team and around thirty attendees (mostly students and Human Rights advocates) how data and its proper visualisation can be used to explore social issues. Methodologies and tools for collecting and sharing information were topics that the participants were interested in learning more about. Along these lines, one of them presented OpenDataKit, an open-source suite of tools that helps organizations author, field, and manage mobile data collection solutions. Also, OKFN’s project CKAN could be a choice for those organisations willing to take the step and release their data following the Open Definition.

Although the concept of Open Data was in general not well known among the participants, the fact is that the way they are already working shares a lot of the principles behind it. The practical part attracted a lot of attention, as we went hands-on with some online visualisation tools: CartoDB and Datawrapper.

Closing our stay in Cambodia, where we met many enthusiastic Human Rights advocates and activists, we now head north and invite you to stay tuned for the next steps.

Slides of the presentation

Workshops @DMC and @GIZ, Phnom Penh, Cambodia

The Department of Media and Communication (DMC) of the Royal University of Phnom Penh is the only educational institution in Cambodia providing a training ground for journalists and communication practitioners. The director and faculty members have a strong interest in Data journalism and we were asked to present the topic at the weekly guest lecture last Friday. We started researching Data journalism some weeks ago when we documented journalism++, so this invitation was a great opportunity to extend our presentation with new material and hold a discussion with around sixty Cambodian DMC students, from all four courses that make up their studies. The interest they showed was great and, although the topic is new, the session was very constructive.

But the day was not over yet, since we conducted another session in the afternoon, this time for the Civil Peace Service (CPS) group of the GIZ, the German national agency for international cooperation, which focuses its work on developing countries. The CPS team in Cambodia partners with Cambodian civil society and government institutions to carry out outreach and education about the Khmer Rouge Tribunal. The expectations of this smaller group of attendees were basically to learn more about the tools and methodologies available to help them work more efficiently with the data they collect. Visualisation and management of data was also a central point of the debate. After speaking about the insights of existing Open Data platforms, we found that NGOs in Phnom Penh working on similar issues could actually benefit from a common database to share documentation. Participants agreed that such a solution could facilitate collaborative work and the way the content they generate gets published.

Slides of the presentation

Meeting @ Open Development Cambodia, Phnom Penh, Cambodia

If you happen to search for Open Data initiatives in Cambodia, Open Development Cambodia is definitely going to appear at the top of the results list. Started in 2011 as a project under the activities of the EWMI and on the way to being registered as an NGO, ODC represents the most active effort in the Southeast Asian country to collect, use and share data for social improvement.

With a strong philosophy of objectivity and independence, the team does not focus on advocacy in particular sectors nor does it pursue any agenda, other than aggregating and offering information to the public in easily accessible forms. Self-defined as an intersection between NGO, media platform, and think tank, ODC concentrates its resources on aggregating data (which necessarily must be already available somewhere in the public domain) and creating objective briefings, maps, and graphics available for everyone to download, analyse and re-use. Sources are quoted and even the methodology employed to create this content is transparent and can be found on their site. That is what can be understood as an open way of working.

Among other content, we learned about their forest cover page. At the heart of the page are animated forest cover change maps developed from the analysis of satellite imagery released into the public domain by NASA. These maps and accompanying graphics provide information about the extent and rate of Cambodia’s forest cover change over the past 40 years. This and other information found on the site has already been used by NGOs, bloggers, journalists, researchers, grassroots groups, rights advocates and even government technocrats and investors to inform their research, reporting, analysis, and planning. As an example, the local rights-focused website SITHI.org uses maps from ODC as base layers on which they add other analysis. An interesting statistic: since its creation, their website has counted visits from users from almost every country and state of the world, although the majority of users are Cambodians.

All this, in a country whose administration is not particularly supportive when it comes to releasing data to the public domain or sharing information with its citizens. It is important to note that there are currently no Freedom of Information laws in Cambodia; an attempt to pass a draft law was even rejected in January 2013. At the time we are writing these lines, there is no Open Data platform initiated or planned by the government.

However, the remarkable work of organisations such as ODC and the presence of a newly created local chapter of the OKFN are examples of the current will to fill the gap and realise a positive development of openness and transparency for Cambodia. Talking about what is to come, the ODC team will add interesting new features to their platform, such as an API, to improve user experience and provide more effective access to their aggregated datasets. The site will also be available in the Khmer language within the next few months.

LocalWiki: Collecting and sharing community knowledge @ San Francisco, USA

“LocalWiki is a grassroots effort to collect, share and open the world’s local knowledge.” This is how the San Francisco-based non-profit organisation defines its interesting initiative, and we absolutely wanted to cover it.

The idea consists of offering a platform that allows communities to collect and share local knowledge, all of this in a collaborative (crowdsourced) way. It all started with DavisWiki in 2004, a community-run wiki with content about the town of Davis in California. Currently, LocalWiki counts over 70 independent projects worldwide, in 9 countries and 7 languages.

As you can see in the following video, the tool is very easy to use and allows users to populate the knowledge database of their community in a quick and accurate way. Inserting text, links, pictures on pages and even on maps has never been so intuitive for a wiki platform and the revisioning system makes it really simple to discover what other users have modified.

[vimeo]https://vimeo.com/32534830[/vimeo]

Would you like to start a LocalWiki for your community? You can contact the team and they will assist you with the process. The technology is released as Open Source, meaning that you can take the code, use it and adapt it if you feel like doing so.

The current state of development already offers lots of useful functionalities. However, since the platform is continuously being improved, some remarkable things are still to come. We recommend reading their blog to discover more about their future plans.

We believe LocalWiki represents the principles behind Knowledge Sharing and the Open Source philosophy at its best and consider it a great piece of software that brings community members to work together. We fully support it!