No meaning – no gain: Interpretation is the key

October 5, 2015 By Anni Ahonen-Bishopp, Director of R&D

In September this year, Genomics England introduced “PanelApp”, which helps experts comb through the 100,000 Genomes Project’s rare disease data. It is a crowdsourcing project in which the whole scientific community is invited to take part in interpreting and reviewing variants, with the aim of producing diagnostic-grade gene panels. PanelApp is a very ambitious project, I would say, but definitely the way to go, as no other formal interpretation system yet exists anywhere.

There have been other attempts at collective interpretation, at various levels of expected contribution and in many different formats. One is the Cafe Variome portal, brought to light in the Gen2Phen project some years back. Here the idea is to collect variants and phenotypic evidence from various contributing resources, with an easy API for submissions, and to make remote repositories available for discovery. It is a neat solution, and it keeps growing its community of collaborators, partnership networks, and other experts, who add more value to the content.

The LOVD project provides local tools for locus-specific databases that let experts choose what information they want to expose outside the project, whilst keeping the data in-house. A hub connects these distributed LOVD instances and allows discovery platforms (like Cafe Variome) to tap into the data the LOVDs expose. The toolbox also doubles as an internal information centre.

In both of these concepts the value is in the expert interpretation, not in the data sharing itself. The caveat is that using them requires some effort (you need to install the tool, or reformat your data to fit the communication layer), and this is a deterrent. Unless we have the luxury of an IT team at our command, things need to be made very, very easy for us to use. The other deterrent, unfortunately, is the act of sharing data itself.

Crowdsourcing interpretation work at Genomics England is a wonderful idea. The data is already there, and people are invited to show their expertise. Giving your expert opinion is rather easy and serves a large community, so the requirements for ease of use, motivation, and working on ‘somebody else’s data’ are all met. I believe PanelApp has the makings of a true discovery platform.

Still, the problems persist with isolated data collections, and there is no obvious solution for those. Maybe only these big, centralised efforts will have the chance to become truly meaningful and to propel medical advances. Or maybe someone, somewhere, will come up with another kind of idea and manage to connect the scattered dots into an interpretation app that tells us what’s going on with the genes.

It is not necessary to own or control massive server farms and cellars full of machinery to be at the top of the genomic data food chain. Possessing the information is one thing – getting value out of it is a completely different story. Companies like Congenica and Omicia (recent additions to the Genomics England family) are increasingly interesting for their ability to provide fast, automated services. But even they depend on the larger scheme of building a genomic consensus out of the complex space of evidence and knowledge that is still hidden on experts’ hard drives.



The skill of following what your bioinformaticians are up to

March 12, 2015 By Anni Ahonen-Bishopp, Director of R&D

I very recently read an interesting review in BoneKEy Reports about GWAS and NGS analyses (Alonso et al., 2015 [1]). As a non-bioinformatician working for a bioinformatics company, I have learned by doing and listening. Still, there are many aspects and fine points that I probably fail to grasp in all but the most simplistic GWAS workflows. Therefore, a review addressing the standardisation of the GWAS workflow and the QC of data was very welcome to me.

If you are in a managerial position supervising bioinformatics professionals, but only know the ‘lingo’ at the level of an enthusiastic amateur, a paper like this should be very useful to you. This one takes the slant of comparing GWAS with NGS-based association analysis, which I find very topical, and it walks you through every single step from initial QC to visualisation options.

Once you have walked through the recommended standard workflow in the paper, you will begin to understand why your people are excited about new ways to speed up PCA, and why they can’t agree on MAF thresholds (and what that even means).
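To give a flavour of what a MAF threshold actually is: the minor allele frequency of a variant is simply how often its rarer allele occurs in the cohort, and a common QC step is to drop variants whose MAF falls below some cut-off before association testing. The sketch below is purely illustrative (the variant IDs, genotype coding, and the 0.10 cut-off are made-up examples, not from the paper); the point is that the threshold value itself is exactly what analysts tend to disagree about.

```python
# Illustrative sketch of MAF filtering, a standard GWAS QC step.
# Genotypes are coded 0/1/2 = copies of the alternate allele per sample.

def minor_allele_freq(genotypes):
    """Return the minor allele frequency for one variant."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))  # each sample carries 2 alleles
    return min(alt_freq, 1 - alt_freq)                # the rarer allele's frequency

def filter_by_maf(variants, threshold=0.01):
    """Keep only variants whose MAF meets the chosen threshold."""
    return {vid: g for vid, g in variants.items()
            if minor_allele_freq(g) >= threshold}

# Hypothetical toy cohort of six samples and two variants:
variants = {
    "rs0001": [0, 1, 2, 1, 0, 0],   # MAF = 4/12 ~ 0.33: common, kept
    "rs0002": [0, 0, 0, 0, 0, 1],   # MAF = 1/12 ~ 0.08: rare, dropped at 0.10
}
kept = filter_by_maf(variants, threshold=0.10)
```

Lower the threshold and more rare variants survive into the analysis, at the cost of noisier statistics per variant; raise it and you throw away exactly the rare variation that NGS was meant to capture. Hence the arguments.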

There is a clear lack of standards and recommendations for the QC and association analysis of NGS data, which I initially found quite surprising, as the technology was very rapidly adopted by people dealing with GWAS a wee while ago. However, improvements in sequencing methods have made the QC of NGS data a rapidly moving target for standardisation. A simple lack of understanding of how the human genome works has caused major debates about how variants should be interpreted, which directly affects how one should control for anomalies prior to association analysis. And so on.

It is also important to realise the fundamental differences between SNP arrays and NGS – mainly the vast volume of the latter swamping signals that are still readily detectable by the former. However, NGS-based association has potentially so much more power and sensitivity than the SNP approach that, despite these issues, the community should and will persevere. NGS is here to stay, and will one day be the de facto data source for association analyses.

[1] Alonso N, Lucas G & Hysi P. Big data challenges in bone research: genome-wide association studies and next-generation sequencing. BoneKEy Reports (2015) 4, Article number 635.
