Mozilla Weekend is coming to Berlin


In less than two weeks, Berlin will be lit up by one of the flagship Mozilla community events: Mozilla Weekend, taking place on the 11th and 12th.
As the name suggests, the whole weekend is dedicated to Mozilla, its products, and its initiatives, especially, but not limited to, Firefox and Firefox OS. Following the German-speaking community meetup in February, Mozilla Weekend aims to cater to new contributors and ease the onboarding process.



The first day of the event (Saturday) will be filled with presentations and will take place at the Wikimedia offices, while the second day will focus on workshops. Also, don't miss out on the AMA (ask me anything) sessions, as the Mozilla leadership will be there!
The variety of presentations offers something for everyone, technical or not. After all, the passion for the open internet is our greatest common ground. You can register for your (free) ticket via Eventbrite at mozweekend.de.
Of course there will be free goodies and drinks, so even if you cannot attend the whole day, feel free to pass by!


 

Wikimedia Office (Tempelhofer Ufer 23-24)


Mozilla Office (Voltastr. 5)


Open Source Conference Albania, OSCAL 2015

This blog post was originally written by Redon Skikuli on his blog and is aggregated here with the author's permission.


OSCAL (Open Source Conference Albania) is the first international conference in Albania organized by Open Labs to promote software freedom, open source software, free culture and open knowledge, concepts that originally started more than 25 years ago.

The second edition of the conference will take place on 9 & 10 May 2015 in Tirana (Godina Liria) and will gather free/libre open source technology users, developers, academics, governmental agencies, and people who share the idea that software should be free and open for the local community and governments to develop and customize to their needs, and that knowledge is communal property, free and open to everyone.

I'm excited, proud and lucky to be part of the organizing team of the second edition of the event, working with a great group of Albanian FLOSS enthusiasts who know how to create quality projects in a decentralized way. This edition is organized in the most decentralized way possible, both in the decision-making process and in the software used to document and plan activities and tasks. These tools include, but are not limited to, Etherpads, Telegram for chat, and WordPress for maintaining the website. Unfortunately, in some cases we also used some proprietary cloud services, but we are planning to change this in the next edition.

Working and making decisions in a decentralized way is not only amazing, but also the key theme of my talk during the first day, and it is the main message we want to share with the participants during OSCAL 2015.

Here is the list of some of this year's inspirational speakers, the agenda, the blog section with all the latest news, a humble guide to Tirana for our friends from abroad, some banners in case you dig the whole thing and want to spread the #OSCAL2015 vibe, and the mobile app, your companion during the event. There will also be competitions, side events related to OpenStreetMap, LibreOffice, Mozilla and Wikipedia, and a massive after-party.

Participation is free of charge, but online registration is required.

Looking forward to seeing the result of months of hard work from the whole team and the amazing volunteers in the second weekend of May 2015!

Mozilla German-speaking Community Meetup 2015 in Berlin

I had the pleasure of being invited to the annual Mozilla German-speaking community meetup in Berlin this year. Although I am based in Albania and not in Germany, Austria or Switzerland, I also contribute to the German community from time to time, having helped out with the Firefox 10th Anniversary campaign and various other things (Firefox has a market share of almost 50% in Germany!).

As I grew up in Germany, I am quite familiar with the culture and speak the language fluently. However, I am mostly unable to put my German to good use in Albania, for obvious reasons, so it always feels good to practice it.

This was my first time in Berlin and my first time in Germany in almost 4 years. I had never visited a Mozilla office before either, so I was really excited about the meetup this year.

Disclaimer: This is a short summary of everything that happened during the community meetup. I am including Michael Kohler's notes from his blog here, simply due to laziness. Kudos to Mexikohler for being so awesome! Check out his blog for the German version as well.




Day 1

The meetup was held from February 20 to February 22, 2015. To facilitate coordination between all volunteers and staff living and working in the German-speaking countries (Germany, Austria, Switzerland), we meet once a year to discuss topics, plans and goals for the year. Furthermore, it's important to meet regularly to have certain discussions in person, since these are faster and more efficient. In total, 27 people attended this meetup.

On Saturday we started the first official day at 10am.

| Start | End | Topic | Duration | Who? |
|-------|-----|-------|----------|------|
| 10:00 | 10:30 | Getting to know each other, Mozilla in general | 30′ | Everyone |
| 10:30 | 12:00 | Introductory Discussions + Mozilla Goals | 1h 30′ | Everyone |
| 12:00 | 13:00 | Discussions / Group Planning | 1h | Groups |
| 13:00 | 14:00 | Lunch in the Office | 1h | Everyone |
| 14:00 | 15:30 | Feedback from the working groups + Discussions | 1h 30′ | Everyone |
| 16:30 | 17:30 | Participation 2015 (English) | 1h | Everyone |
| 17:30 | 19:00 | Community Tiles | 1h 30′ | Everyone |
| 20:00 | 22:00 | Dinner | 2h 30′ | Everyone |

We began the meetup with a short introduction round, since not all of the attendees knew each other. It was nice to see that people from all across the Mozilla projects came to Berlin to discuss and plan the future.

After that, Brian introduced us to Mozilla's goals and plans for 2015. Firefox (more focus on Desktop this year), Firefox OS (user-driven strategy), Content Services (diversifying income) and Webmaker were the focus. To reach our goals for the community, we also need to know Mozilla's overall goals so we can align the two.

To know where we currently stand as a community, we did a SWOT analysis (Strengths, Weaknesses, Opportunities, Threats).


Strengths:

  • L10N: amount of work that was done and the quality of it
  • a lot of different projects are worked on by the community
  • we had more (and more impactful) events in 2013
  • Being spontaneous
  • …

Weaknesses:

  • a lot of work
  • “bus factor”
  • communication
  • not a lot of social media activities
  • weekly meetings aren’t very efficient
  • …

Opportunities:

  • Web Standards
  • Rust
  • Privacy
  • Firefox Student Ambassadors
  • …

Threats:

  • Fragmentation
  • Chrome + Google Services
  • …

 

We split up into different groups to discuss group-specific topics and report back to everybody. The groups were “Localization”, “Developer Engagement / Programming”, “Community Building” and “Websites”.

We discussed the first outcomes of the groups together. Please refer to day 2 to see the results.

Markus, a local developer from Berlin, came by on Saturday. He’d like to organize regular events in Berlin to increase the presence of Mozilla in the city and to build a local community. We like this idea and will support him in 2015!

(Photo: Mario Behling)

After the group discussions, Brian had further information on Participation. Please refer to Mark Surman's blog post for more information about that.

At the end of the official part of the day, we had a discussion about the “Community Tile”. When you open a new tab in a new Firefox profile, you'll see an overview of different sites you can visit. One of these links is reserved for the community. We discussed our proposal and came to the conclusion that we should focus on telling everyone what the German-speaking community does, and especially that there are local people working on Mozilla projects.

 

(Photo: Hagen Halbach)

Want to see who was there? See for yourself!

(Photo: Brian King)

You can find all pictures of the meetup on flickr.




Day 2

On Sunday we once again started at 10am at the Berlin Office.

| Start | End | Topic | Duration | Who? |
|-------|-----|-------|----------|------|
| 10:00 | 13:00 | Plan 2015 / Events / Goals / Roles (etherpad) | 45′ | Everyone |
| 13:00 | 13:45 | Content mozilla.de | 45′ | Everyone |
| 13:45 | 14:15 | IRC Meeting + Summary Meeting | 30′ | Everyone |
| 14:00 | … | Departing or other discussions | … | Everyone |

At first we had the same breakout groups again, this time to evaluate goals for 2015. After that we discussed those together with the whole group and decided on goals.


Localization

The l10n group worked out a few points. First, they updated multiple wiki pages. Second, they discussed several other topics. You can find the overview of topics here.

Goals:

  • Finish the documentation on the wiki
  • Get in touch with the “Localizers in Training”


SUMO

The SUMO group gave an introduction to the new tools. Furthermore, they decided on a few goals.

Goals:

  • Have 90% of all articles on SUMO translated at all times
  • For Firefox releases, all of the top 100 articles should be translated


Programming

Goals:

  • organize a “Mozilla Weekend” (this does not only cover developers)
  • give a talk on Jetpack
  • continue the Rust meetups
  • developer meetups in Berlin
  • recruit 5 new dev contributors

Community Building

In the community building group we talked about different topics. For example, we looked at what's working now and what's not. Furthermore, we talked about Firefox Student Ambassadors and recognition. You can find the overview here.

Goals:

  • have at least 10 FSAs by the end of the year
  • have 2 new Reps in the north of Germany
  • get WoMoz started (this is a difficult task, let's see)
  • finish the visual identity (logo) by the end of Q2
  • have at least 5 events in cities where we have never held events before
  • Mozilla Day / Weekend
  • define an onboarding process
  • a better format for the weekly meeting

Websites

All German Mozilla sites are currently hosted by Kadir. Since Kadir doesn't have enough time to support them, the goal is to move them to Community IT. This was agreed upon at the community meetup. You can find the relevant bug here.

Goals:

  • transfer all sites
  • refresh the mozilla.de content

All these plans and goals are summarized on our Trello board. All German-speaking community members can self-assign a task and work on it. With this board we want to track and work on all our plans.

(Photo: Hagen Halbach)

After that we discussed what features should be on the mozilla.de website. In general, all the content will be updated.


  • product and project overview
  • landing page for the community tile
  • list of events
  • download button
  • link to “contribute”
  • link to the mailing list (no support!)
  • link to the newsletter
  • Planet
  • social media
  • prominent link to SUMO for help
  • link to the dictionaries

(Photo: Hagen Halbach)

At the end, we talked about our weekly meeting and drafted a proposal for how to make it more efficient. The following changes will be made once everything is settled (we're discussing this on the mailing list). Until then, everything stays the same.


  • biweekly instead of weekly
  • Vidyo instead of IRC
  • document everything on the Etherpad so everybody can join without Vidyo (workflow: Etherpad -> Meeting -> Etherpad)
  • the final meeting notes will be copied to the Wiki from the Etherpad

Feedback / Lessons learned

  • planning long-term before events makes sense
  • the office is a good location for these kinds of meetups, but not for bigger ones
  • there is never enough time to discuss everything together, so individual breakouts are necessary

I'd like to thank all the attendees, who took part in very informative and constructive discussions during the weekend. I think we have a lot to do in 2015. If we can keep the motivation from this meetup and work on our defined plans and goals, we'll have a very successful year. You can find all pictures of the meetup on flickr.

Exploring Open Science n°4: DNAdigest interviews Nowomics

This week I would like to introduce you to Richard Smith, founder and software developer of Nowomics. He kindly agreed to answer some questions for our blog post series, and here it is: first-hand information on Nowomics. Keep reading to find out more about this company.


Richard Smith, founder and software developer of Nowomics

1. Could you please give us a short introduction to Nowomics (goals, interests, mission)?

Nowomics is a free website to help life scientists keep up with the latest papers and data relevant to their research. It lets researchers ‘follow’ genes and keywords to build their own news feed of what’s new and popular in their field. The aim is to help scientists discover the most useful information and avoid missing important journal articles, but without spending a lot of their time searching websites.

2. What makes Nowomics unique?

Nowomics tracks new papers, but also other sources of curated biological annotation and experimental data. It can tell you if a gene you work on has new annotation added or has been linked to a disease in a recent study. The aim is to build knowledge of these biological relationships into the software to help scientists navigate and discover information, rather than recommending papers simply by text similarity.

3. When did you realise that a tool such as Nowomics would be of great help to the genomic research community?

I've been building websites and databases for biologists for a long time and have heard from many scientists how hard it is to keep up with the flood of new information. There are around 20,000 biomedical journal articles published every week and hundreds of sources of data online; receiving lots of emails with lists of paper titles isn't a great solution. In social media, interactive news feeds that adapt to an individual are now commonly used as an excellent way to consume large amounts of new information, and I wanted to apply these principles to tracking biology research.

4. Which part of developing the tool did you find most challenging?

As with a lot of software, making sure Nowomics is as useful as possible to users has been the hardest part. It’s quite straightforward to identify a problem and build some software, but making sure the two are correctly aligned to provide maximum value to users has been the difficult part. It has meant trying many things, demonstrating ideas and listening to a lot of feedback. Handling large amounts of data and writing text mining software to identify thousands of biological terms is simple by comparison!
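To give a flavour of what dictionary-based term identification involves, here is a minimal sketch; the synonym table and matching approach are illustrative assumptions, not Nowomics' actual pipeline.

```python
import re

# Toy dictionary-based tagger: map known gene symbols/synonyms
# (hypothetical entries) back to a canonical identifier.
GENE_SYNONYMS = {
    "TP53": "TP53",
    "p53": "TP53",
    "BRCA1": "BRCA1",
}

# One alternation regex; word boundaries avoid matching inside longer words.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(s) for s in GENE_SYNONYMS) + r")\b"
)

def tag_genes(text):
    """Return the canonical gene IDs mentioned in `text`."""
    return {GENE_SYNONYMS[m.group(1)] for m in PATTERN.finditer(text)}

print(tag_genes("Mutations in p53 and BRCA1 predispose to cancer."))
# -> {'TP53', 'BRCA1'}
```

A production system would match many thousands of synonyms and resolve ambiguous names, which is where the real effort goes.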

5. What are your plans for the future of Nowomics? Are you working on adding new features/apps?

There are lots of new features planned. Currently Nowomics focuses on genes/proteins and selected organisms. We’ll soon make this much broader, so scientists will be able to follow diseases, pathways, species, processes and many other keywords. We’re working on how these terms can be combined together for fine grained control of what appears in news feeds. It’s also important to make sharing with colleagues and recommending research extremely simple.

6. Can you think of examples of how Nowomics supports data access and knowledge dissemination within the genomics community?

The first step to sharing data sets and accessing research is for the right people to know they exist. This is exactly what Nowomics was set up to achieve: to benefit both scientists, who need to be alerted to useful information, and those generating or funding research, who want to reach the best possible audience. Hopefully Nowomics will also alert people to relevant shared genomics data in the future.

7. What does ethical data sharing mean to you?

For data that can advance scientific and medical research the most ethical thing to do is to share it with other researchers to help make progress. This is especially true for data resulting from publicly funded research. However, with medical and genomics data the issues of confidentiality and privacy must take priority, and individuals must be aware what their information may be used for.

8. What are the most important things that you think should be done in the field of genetic data sharing?

The challenge is to find a way to unlock the huge potential of sharing genomics data for analysis while respecting the very real privacy concerns. A platform that enables sharing in a secure, controlled manner which preserves privacy and anonymity seems essential; I'm very interested in what DNAdigest is doing in this regard.


Exploring Open Science n°3: DNAdigest interviews NGS logistics

NGS logistics is the next project featured in our blog interviews. We have interviewed Amin Ardeshirdavani, a PhD student involved in the creation of this web-based application. Take a look at the interview to find out why this tool has become very popular within KU Leuven.


1. What is NGS logistics?

NGS-Logistics is a web-based application which accelerates the federated analysis of Next Generation Sequencing data across different centres. NGS-Logistics acts like a real logistics company: you order something on the Internet; the owner processes your request and then ships it through a safe and trusted logistics company. In the case of NGS-Logistics, the goods are human sequence data, and researchers ask for possible variations and their frequency across the whole population. We try to deliver the answers in the fastest and safest possible way.
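As a rough illustration of this federated pattern, a coordinator can ask each centre for summary counts only and pool the answers, so raw sequence data never leaves a centre. The endpoints and JSON fields below are hypothetical, not NGS-Logistics' actual interface.

```python
import json
from urllib.request import urlopen

# Hypothetical per-centre endpoints; the real deployment differs.
CENTRES = [
    "https://centre-a.example.org/variant",
    "https://centre-b.example.org/variant",
]

def federated_frequency(chrom, pos, alt):
    """Pool per-centre allele counts into one population frequency."""
    carriers = total = 0
    for base in CENTRES:
        url = f"{base}?chrom={chrom}&pos={pos}&alt={alt}"
        with urlopen(url) as resp:
            counts = json.load(resp)   # e.g. {"carriers": 3, "samples": 500}
        carriers += counts["carriers"]
        total += counts["samples"]
    return carriers / total if total else 0.0

# federated_frequency("7", 117199644, "G") -> pooled frequency across centres
```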

2. What is your part in NGS logistics?

Right now I am a PhD student at KU Leuven, and the whole idea of my PhD project is designing and developing new data structures for analysing the massive amounts of data produced by Next Generation Sequencing machines. NGS logistics is exactly that. I have done the whole design and development of the application and database. I would also like to acknowledge all the people from KU Leuven, the ESAT IT department, the UZ Leuven IT department and the UZ Genomics Core department who assisted me on this project, especially Erika Souche, for their kind support.

3. When did you first start working on the idea of creating NGS logistics and what made you think it would be something useful?

It was almost three years ago when I had a meeting with my promotor, Professor Yves Moreau, and he had an idea to somehow connect sequencing centres and query their data without moving them into one repository. As a person with an IT background, it wasn't that difficult for me to develop an application, but there were lots of practical issues that needed to be taken care of. The majority of these issues are related to protecting the privacy of individuals, because the data we deal with come from human genome sequencing experiments, and people are rightfully worried about how these data will be used and protected. At the time of my first meeting there was no system in place to share these data, but many people understood the need for this kind of structure and for us to start working on it. As we know, information can be a true scientific goldmine, and by having access to more data we are able to produce more useful information. The novelty of the data, the possibility of sharing this wealth of information and the complexity of this kind of application make me eager to work on this project.

4. How does your open source tool work and who is it designed for?

NGS-Logistics has three modules: the Web Interface, the Access Control List and the Query Manager. The source code of each of these modules, plus the database structure behind them, is available upon simple request. As the modules are being upgraded continuously, I have not made a public repository for the source code yet. However, if someone is interested in access to the source code, it will be our pleasure to provide it, though I do think that the whole idea of data sharing is more important than the source code itself. In any case, it is our pleasure to share with others our experience of the different problems and issues that we had to tackle during the past three years. In general, NGS-Logistics is designed to help researchers save time when they need access to more data. It will help them get a better overview of their questions and, if they need access to the actual data, it will help them find the data sets that best match their cases.

5. Who has access to the system and how do you manage access permissions?

Researchers with a valid email address and affiliation are welcome to register and use the application. This means that we need to know who is querying the data, to prevent systematic queries that might lead to identifying an individual. I spent almost 20 months on the Access Control List (ACL) module. Most of the tasks are controlled and automatically updated by the system itself. Centre admins are responsible for updating the list of samples they want to share with others. PIs and their power users are responsible for grouping samples into data sets and assigning them to users and groups. The ACL has a very rich and user-friendly interface that makes it very easy to learn and use.

6. In what way do you think data sharing should be further improved?

Because of all the concerns around the term “Data Sharing”, I prefer to use the term “Result Sharing”. In our framework, we mostly try to answer very high-level questions like “The prevalence of a certain mutation in different populations”, preventing any private information from leaking out. By having more access to data we can gain more insight and produce more useful information; as Aristotle said: “The whole is greater than the sum of its parts.” On the other hand we always have to be careful about the consequences of sharing.

7. What does ethical data sharing mean to you?

It means everything and nothing. Why? Because ethics really depends on the subject and the location we are talking about. If we talk about sharing weather forecast data, I would say it is not important and does not have any meaning. But when we talk about data produced from human genomes, then we have to be careful. Legal frameworks differ a lot between countries. Some are very restrictive when it comes to dealing with sensitive and private data, whereas others are much less restrictive, mostly because they have different definitions of private data. In most cases, any information that allows us to uniquely identify a person is defined as private information, and as we know, there is a possibility to identify a person by his or her genome sequence. Therefore, I feel that it is very important to keep track of what data is being used by whom, when, at which level and for what reason.


Amin Ardeshirdavani et al. have published this work in Genome Medicine 6:71: “NGS-Logistics: federated analysis of NGS sequence variants across multiple locations”. You can take a look at it here.

Exploring Open Science n°2: DNAdigest interviews SolveBio

DNAdigest continues with the series of interviews. Here we would like to introduce you to Mr Mark Kaganovich, CEO of SolveBio, who agreed to an interview with us. He shared a lot about what SolveBio does and discussed the importance of genomic data sharing with us.


Mark Kaganovich, CEO of SolveBio

1) Could you describe what SolveBio does?

SolveBio delivers the critical reference data used by hospitals and companies to run genomic applications. These applications use SolveBio’s data to predict the effects of slight DNA variants on a person’s health. SolveBio has designed a secure platform for the robust delivery of complex reference datasets. We make the data easy to access so that our customers can focus on building clinical grade molecular diagnostics applications, faster.

2) How did you come up with the idea of building a system that integrates genomic reference data into diagnostic and research applications? And what was the crucial moment when you realised the importance of creating it?

As a graduate student I spent a lot of time parsing, re-formatting, and integrating data just to answer some basic questions in genomics. At the same time (this was about two years ago) it was becoming clear that genomics was going to be an important industry with a yet unsolved IT component. David Caplan (SolveBio’s CTO) and I started hacking away at ways to simplify genome analysis in the anticipation that interpreting DNA would be a significant problem in both research and the clinic. One thing we noticed was that there were no companies or services out there to help out guys like us – people that were programming with genomic data. There were a few attempts at kludgy interfaces for bioinformatics and a number of people were trying to solve the read mapping computing infrastructure problem, but there were no “developer tools” for integrating genomic data. In part, that was because a couple years ago there wasn’t that much data out there, so parsing, formatting, cleaning, indexing, updating, and integrating data wasn’t as big of a problem as it is now (or will be in a few years). We set out to build an API to the world’s genomic data so that other programmers could build amazing applications with the data without having to repeat painful meaningless tasks.

As we started talking to people about our API we realized how valuable a genomic data service is for the clinic. Genomics is no longer solely an academic problem. When we started talking to hospitals and commercial diagnostic labs, that’s when we realized that this is a crucial problem. That’s also when we realized that an API to public data is just the tip of the iceberg. Access to clinical genomic information that can be used as reference data is the key to interpreting DNA as a clinical metric.

3) After the molecular technology revolution made it possible for us to collect large amounts of precise medical data at low cost, another problem appeared to take over. How do you see solving the problem that the data are not in a language doctors can understand?

The molecular technology revolution will make it possible to move from “Intuitive Medicine” to “Precision Medicine”, in the language of Clay Christensen and colleagues in “The Innovator's Prescription”. Molecular markers are much closer to being unique fingerprints of the individual than whatever can be expressed by the English language in a doctor's note. If these markers can be conclusively associated with diagnosis and treatment, medicine will be an order of magnitude better, faster and cheaper than it is now. Doctors can't possibly be expected to read the three billion or so base pairs that make up the genome of every patient and recall which diagnosis and treatment is the best fit in light of the genetic information. This is where the digital revolution, i.e. computing, comes in. Aggregating siloed data while maintaining the privacy of the patients, using bleeding-edge software, will allow doctors to use clinical genomic data to deliver better medicine.

4) What are your plans for the future of SolveBio? Are you working on developing more tools/apps?

Our goal is to be the data delivery system for genomic medicine. We’ve built the tools necessary to integrate data into a genomic medical application, such as a diagnostic tool or variant annotator. We are now building some of these applications to make life easier for people running genetic tests.

5) Do you recognise the problem of limited sharing of genomics data for research and diagnosis? Can you think of an example of how the work of SolveBio supports data access and knowledge sharing within the genomics community?

The information we can glean from DNA sequence is only as good as the reference data that is used for research and diagnostic applications. We are particularly interested in genomics data from the perspective of how linking data from different sources creates the best possible reference for clinical genomics. This is, in a way, a data sharing problem.

I would add, though, that a huge disincentive to distributing data is the privacy, security, liability and branding concerns that clinical and commercial outfits are right to take into account. As a result, we are especially tailoring our platform to address those concerns.

However, even the data that is currently being “shared” openly, largely as a product of the taxpayer-funded academic community, is very difficult and costly to access. Open data isn't free. It involves building and maintaining substantial infrastructure to make sure the data is up to date and to verify quality. SolveBio solves that problem. Developers building DNA interpretation tools no longer have to worry about setting up their data infrastructure. They can integrate data with a few lines of code through SolveBio.
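To illustrate the kind of “few lines of code” integration described here, below is a sketch against a hypothetical reference-data REST API. The host, dataset path, parameters and response fields are invented for illustration; they are not SolveBio's documented API.

```python
import requests  # third-party HTTP client: pip install requests

# Hypothetical reference-data service, for illustration only.
API = "https://api.example-genomic-data.org/v1/datasets"

def variant_annotations(gene, token):
    """Fetch curated variant annotations for a gene from the service."""
    resp = requests.get(
        f"{API}/clinvar/variants",
        params={"gene": gene},
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["results"]

# for record in variant_annotations("BRCA2", token="..."):
#     print(record["variant"], record["clinical_significance"])
```

The point is less the specific calls than the division of labour: the service keeps reference data parsed, indexed and up to date, so application code stays this small.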

6) Which is the most important thing that should be done in the field of genetic data sharing and what does ethical data sharing mean to you?

Ethical data sharing means keeping patient data private and secure. If data is used for research or diagnostic purposes and needs to be transferred among doctors, scientists, or engineers then privacy and security is a key concern. Without privacy and security controls genomic data will never benefit from the aggregate knowledge of programmers and clinicians because patients will be rightly opposed to measuring, let alone distributing, their genomic information. Patient data belongs to the patient. Sometimes clinicians and researchers forget that. I definitely think the single most important thing to get right is the data privacy and security standard. The entire field depends upon it.


Open Spending: Tracking Financial Data worldwide

If you have followed the activities of the OKFN over the last few years, you probably already know Open Spending, the community-driven project initiated in 2007 which has grown considerably since then. The idea started with Where Does My Money Go?, a database for UK public financial data, financed by the 4IP (4 Innovation for the Public) fund of the British broadcaster Channel 4. A few years later, in 2011, the initiative was internationalized and Open Spending was born, a worldwide platform which has gone far beyond the British borders. Today, the site shows data from 73 countries, from Bosnia to Uganda, and the visualisation tool Spending Stories was developed at the same time, thanks to a grant from the Knight Foundation. Speaking of funding, we should not forget the Open Society Foundations, which supports the community building work, and the Omidyar Network, which funded the research behind the report “Technology for Transparent and Accountable Public Finance”. You guessed it: everything is open source.


Open Spending does not only aggregate worldwide public financial data such as budgets, spending, balance sheets, procurement and employee salaries, giving information on how public money has been spent all over the world and in your own city. It also allows users to visualise the available data directly via Spending Stories and to add new datasets. The community members using and developing the tools come from various backgrounds, and everyone is invited to join. Additionally, articles are regularly posted on the blog to encourage sharing knowledge with each other.

The results so far are very good: numerous administrations and media outlets have already used the visualisations, such as the city of Berlin and the Guardian. Besides them, independent journalists, civil society activists, students and engaged citizens also take advantage of the datasets, allowing a better understanding of public money.


DNAdigest Symposium: A tour of Open Science in human genomics research

This past weekend, DNAdigest organized a Symposium on the topic “Open Science in human genomics research – challenges and inspirations”. The event brought together enthusiastic people with a strong interest in the topic, along with the DNAdigest team. We are very pleased to say that the day turned out to be a success, with both participants and organizers enjoying the amazing talks of our speakers and the discussion sessions.

The day started with a short introduction on the topic by Fiona Nielsen.


Then our first speaker, Manuel Corpas, was a source of inspiration to all participants, talking us through the process he went through to fully sequence the whole genomes of himself and his family and to share this data widely with the whole world. Here is a link to the presentation he gave on the day.

The Symposium was organized in the format of an Open Space conference, where everybody got to suggest different topics related to Open Science or to join whichever sounded most interesting. Again, we used HackPad to take notes and capture interesting thoughts throughout the discussions. You can take a look at it here.


We had three more speakers invited to our Symposium. Tim Hubbard (slides) talked about how Genomics England engages the research community, in the shape of genomic scientists and patient communities, to collaborate on both data generation and data analysis of the 100k Genomes Project for the public benefit. Julia Wilson (slides) came as a representative of the Global Alliance. She introduced us to the GA4GH and explained how their work helps to implement standards for data sharing across genomics and health. Last, but not least, was Nick Sireau (slides). He walked us through an eight-step process showing how exactly the scientific community and the patient community can engage in collaborations, and how Open Science (sharing of hypotheses, methods and results throughout the science process) may be either beneficial or challenging in this context.


The event came to an end with a summary of learning points and a round-up by Fiona Nielsen.

We have also made a Storify summary where you can find a collection of all the tweets and most of the photos covering the day. There is also a gallery including all pictures taken by our team members.

Now, to all former and future participants: if you enjoy participating in these events, please donate to DNAdigest by texting DNAD14 £10 to 70070, so that we can continue organizing more of these interactive and exciting events in the future. You can also buy some of our cool DNAdigest T-shirts and mugs from our website shop.

It was great to see you all, and we look forward to welcoming you again for our next events!

DNAdigest team: Fiona, Adrian, Margi, Francis, Sebastian, Xocas and Tim

This event would not have been possible without the contributions of our generous sponsors.


Exploring Open Science: DNAdigest interviews Aridhia

As promised last week in the DNAdigest newsletter, we are giving life to our first blog post interview. Let us introduce you to Mr Rodrigo Barnes, part of the Aridhia team. He kindly agreed to answer our questions about Aridhia and their views on genomic data sharing.


Mr Rodrigo Barnes, CTO of Aridhia

1. You are a part of the Aridhia team. Please, tell us what the goals and the interests of the company are?

Aridhia started with the objective of using health informatics and analytics to improve efficiency and service delivery for healthcare providers, support the management of chronic disease and personalised medicine, and ultimately improve patient outcomes.

Good outcomes had already started to emerge in diabetes and other chronic diseases, through some of the work undertaken by the NHS in Scotland and led by one of our founders, Professor Andrew Morris. This included providing clinicians and patients with access to up-to-date, rich information from different parts of the health system.

Aridhia has since developed new products and services to solve informatics challenges in the clinical and operational aspects of health. As a commercial organisation, we have worked on these opportunities in collaboration with healthcare providers, universities, innovation centres and other industry partners, to ensure that the end products are fit for purpose, and the benefits can be shared between our diverse stakeholders. We have always set high standards for ourselves, not just technically, but particularly when it comes to respecting people’s privacy and doing business with integrity.

2. What is your role in the organisation and how does your work support the mission of the company?

Although my background is in mathematics, I've worked as a programmer in software start-ups for the majority of my career. Since joining Aridhia as one of its first employees, I have designed and developed software for clinical data, often working closely with NHS staff and university researchers. This has been a great opportunity to work on (ethically) good problems and participate in multidisciplinary projects with some very smart, committed and hard-working people.

In the last year, I took on the CTO (Chief Technology Officer) role, which means I have to take a more strategic perspective on the business of health informatics. But I still work directly with customers and enjoy helping them develop new products.

3. What makes Aridhia unique?

We put collaboration at the very heart of everything we do. We work really hard to understand the different perspectives and motivations people bring to a project, and acknowledge expertise in others, but we’re also happy to assert our own contribution. We have also been lucky to have investors who recognise the challenges in this market and support our vision for addressing them.

4. Aridhia has recently won a competition for helping businesses develop new technology to map and analyse genes, and more specifically to support the efforts of the NHS to map the whole genomes of patients with rare diseases or cancer. Which phase are you in now, and have you developed an idea (or even a prototype) that you can tell us more about?

It's a little early to say too much about our product plans, but we have identified a number of aspects within genomic medicine that we feel need to be addressed. Based on our extensive experience in the health field, we think a one-size-fits-all approach won't work when it comes to annotating genomes and delivering that information usefully into the NHS (and similar healthcare settings). There will be different user needs, of course, but there are also IT procurement and deployment challenges to tackle before any smart solution can become common practice in the NHS.

We strongly believe that there is a new generation of annotation products and services waiting to emerge from academic/health collaborations. We believe that clinical groups have the depth of knowledge and the databases of cases that are needed to provide real insight into complex diseases with genetic factors, and we are keen to help these SMEs and spin-outs validate their technology and get them ‘to market’ in the NHS and healthcare settings around the world.

Overall, our initial objective is to help take world-class annotations out of research labs and into operational use in the NHS. Both of these goals are very much in line with Genomics England's mandate to improve health and wealth in the UK.

5. Aridhia is a part of The Kuwait Scotland eHealth Innovation Network (KSeHIN). Can you tell us something more about this project and what your plans for further development are?

Kuwait has one of the highest rates of obesity and diabetes in the world, and the Kuwait Ministry of Health has responsibility for tackling this important issue. We’ve worked with the Dasman Diabetes Centre in Kuwait and the University of Dundee to bring informatics, education and resources to improve diabetes care. The challenge from the initial phase is to scale up to a national system. We think there are good opportunities to work with the Ministry of Health in Kuwait to achieve their goals as well as working with the Dasman’s own genomics and research programmes. This project is an excellent example of the combination of skills and resources needed to make an impact on the burden of chronic disease.

6. Do you recognise the problem of limited sharing of genomics data for research and diagnosis? How does the work of Aridhia support data access and knowledge sharing within the genomics community?

This is a sensitive subject of course, and we have to acknowledge that this is data that can't readily be anonymised. Sharing, if it's permissible, won't follow the patterns we are used to with other types of data. That's why we took an interest in the work DNAdigest is doing.

Earlier in the year, Aridhia launched its collaborative data science platform, AnalytiXagility, which takes a tiered approach to the managed sharing of sensitive data. We make sure that we offer data owners and controllers what they need to feel comfortable sharing data. AnalytiXagility delivers a protocol for negotiation and sharing, backed by a ‘life-cycle’ or ‘lease’ approach to sharing, and audit systems to verify compliance. To date this has been used primarily for clinical, imaging and genomics data.

In a ‘Research Safe Haven’ model, the analysts come to the data, and have access to it for the intended purpose and duration of their project. This system is in place at the Stratified Medicine Scotland Innovation Centre, which already supports projects using genomic and clinical data. The model we are developing for genomic data extends that paradigm of bringing computing to the data. We are taking this step by step and working with partners and customers to strengthen the system.

From a research perspective, the challenges are likely to be related to having enough linked clinical data, but also to having enough samples and controls to get a meaningful result. So we think we will see standards emerging for federated models: research groups will try to apply their analysis against raw genomic data at multiple centres using something like the Global Alliance 4 Genomics and Health API, and then collate the results for analysis under a research safe haven model. We recently joined the Global Alliance and will bring our experience of working with electronic patient records and clinical informatics to the table.
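As a sketch of that federated, bring-the-compute-to-the-data idea, in the spirit of (but not implementing) the GA4GH APIs, each centre might answer only a narrow allele-presence query while a coordinator collates the answers. The endpoints and response fields below are assumptions for illustration.

```python
import json
from urllib.request import urlopen

# Hypothetical "beacon-style" endpoints, one per participating centre.
BEACONS = {
    "centre-a": "https://centre-a.example.org/beacon",
    "centre-b": "https://centre-b.example.org/beacon",
}

def collate_allele_presence(chrom, pos, allele):
    """Ask each centre whether an allele is present; raw data never moves."""
    answers = {}
    for name, base in BEACONS.items():
        url = f"{base}?chrom={chrom}&pos={pos}&allele={allele}"
        with urlopen(url) as resp:
            answers[name] = json.load(resp).get("exists", False)
    return answers  # e.g. {"centre-a": True, "centre-b": False}
```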

7. What are your thoughts on the most important thing that should be done in the field of genetic data sharing?

Trust and transparency are important factors. I am interested in seeing what could be done to establish protocols and accreditations that would give participants visibility of how data is being used and how the benefits are shared.


Giving research data the credit it’s due

In many ways, the currency of the scientific world is publications. Published articles are seen as proof – often by colleagues and future employers – of the quality, relevance and impact of a researcher’s work. Scientists read papers to familiarize themselves with new results and techniques, and then they cite those papers in their own publications, increasing the recognition and spread of the most useful articles. However, while there is undoubtedly a role for publishing a nicely-packaged, (hopefully) well-written interpretation of one’s work, are publications really the most valuable product that we as scientists have to offer one another?

As biology moves more and more towards large-scale, high-throughput techniques – think all of the ‘omics – an increasingly large proportion of researchers’ time and effort is spent generating, processing and analyzing datasets. In genomics, large sequencing consortia like the Human Genome Project or ENCODE were funded in part to generate public resources that could serve as roadmaps to guide future scientists. However, in smaller labs, all too often after a particular set of questions is answered, large datasets end up languishing on a dusty server somewhere. Even for projects whose express purpose is to create a resource for the community, the process of curating, annotating and making data available is a time-consuming and often thankless task.


Current genomics data repositories like GEO and ArrayExpress serve an important role in making datasets available to the public, but they typically contain data that is already described in a published article; citing the dataset is typically secondary to citing the paper. If more, easier-to-use platforms existed for publishing datasets themselves, alongside methods to quantify the use and impact of these datasets, it might help drive a shift away from the mindset of ascribing value purely to journal articles towards a more holistic approach where the actual products of research projects – including datasets as well as code or software tools used to analyse them, in addition to articles – are valued. Such a shift could bring benefits to all levels of biological research, from ensuring that students who toiled for years to produce a dataset get adequate credit for their work, to encouraging greater sharing and reuse of data that might not have made it into a paper but still has the potential to yield scientific insights.

Tools and platforms to do just this are gradually emerging and gaining recognition in the biological community. Figshare is a particularly promising platform that allows for the sharing and discovery of many types of research outputs, including datasets as well as papers, posters and various media formats. Importantly, items uploaded to Figshare are assigned a Digital Object Identifier (DOI), which provides a unique and persistent link to each item and allows it to be easily cited. This is analogous to the treatment of articles on preprint servers such as arXiv and bioRxiv, whose use is also growing in biological disciplines; however, Figshare is more flexible in terms of the types of research output it accepts. In addition to the space and ability to share and cite data, the research community could benefit from better quantification of data citation and impact. Building on the altmetrics movement, which attempts to provide alternative measures of the impact of scientific articles besides the traditional journal impact factor, a new Data-Level Metrics pilot project has recently been announced as a collaboration between PLOS, the California Digital Library and DataONE. The goal of this project is to create a new set of metrics that quantify usage and impact of shared datasets.
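One concrete payoff of assigning DOIs to datasets is that a citation can be generated programmatically: DOI registration agencies such as DataCite and Crossref support content negotiation on doi.org. A minimal sketch, using a placeholder DOI:

```python
import requests  # third-party HTTP client: pip install requests

def citation_for(doi):
    """Resolve a DOI to a formatted citation via DOI content negotiation."""
    resp = requests.get(
        f"https://doi.org/{doi}",
        headers={"Accept": "text/x-bibliography; style=apa"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.text

# Placeholder DOI, not a real dataset:
# print(citation_for("10.1234/example.dataset.1"))
```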

Although slow at times, the biological research community is gradually adapting to the new needs and possibilities that come along with high-throughput datasets. Particularly in the field of genomics, I hope that researchers will continue to push for and embrace innovative ways of sharing their data. If data citation becomes the new standard, it could facilitate collaboration and reproducibility while helping to diversify the range of outputs that scientists consider valuable. Hopefully, the combination of easy-to-use platforms and metrics that capture the impact of non-traditional research outputs will provide incentives to researchers to make their data available and encourage the continued growth of sharing, recognizing and citing biological datasets.