The skill of following what your bioinformaticians are up to
I recently read an interesting review in Nature BoneKEy Reports about GWAS and NGS analyses (Alonso et al. 2015). As a non-bioinformatician working for a bioinformatics company, I have learned by doing and listening. Still, there are many aspects and fine points that I probably fail to grasp in all but the most simplistic GWAS workflows. A review addressing the standardisation of the GWAS workflow and data QC was therefore very welcome.
If you are in a managerial position supervising bioinformatics professionals, but only know the ‘lingo’ at the level of an enthusiastic amateur, a paper like this should be very useful to you. This one compares GWAS with NGS-based association analysis, which I find very topical, and walks you through every step from initial QC to visualisation options.
Once you have walked through the recommended standard workflow in the paper, you will begin to understand why your people are so excited about new ways to speed up PCA, and why they can’t agree on MAF (minor allele frequency) thresholds, and what that even means.
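For the fellow amateurs: a MAF threshold is simply a cut-off below which a variant is considered too rare in your samples to be analysed reliably. Here is a minimal sketch of the idea in Python, not the workflow from the paper. It assumes genotypes are coded as 0, 1 or 2 copies of the alternative allele, with `None` for a missing call. The function names and the 1% cut-off are my own illustrative choices, and the cut-off is exactly the kind of number your people will argue about.

```python
from typing import Optional, Sequence


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """Return the frequency of the rarer allele at one variant.

    Assumes diploid samples coded as 0, 1 or 2 copies of the
    alternative allele; None marks a missing genotype call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this variant")
    # Each called sample contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two is less frequent.
    return min(alt_freq, 1 - alt_freq)


def passes_maf_filter(genotypes: Sequence[Optional[int]],
                      threshold: float = 0.01) -> bool:
    """Keep a variant only if its MAF reaches the (hypothetical) threshold."""
    return minor_allele_frequency(genotypes) >= threshold


# Toy data: one common variant and one seen once in 100 samples.
common_variant = [0, 1, 2, 1, 0, None, 1, 0]
rare_variant = [0] * 99 + [1]

print(passes_maf_filter(common_variant))
print(passes_maf_filter(rare_variant))
```

With the 1% cut-off the rare variant is dropped; lower the threshold to 0.001 and it stays in. That small choice changes which variants ever reach the association test, which is why it gets debated.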
There is a clear lack of standards and recommendations for QC and association analysis of NGS data, which I initially found quite surprising, as the technology was adopted very rapidly by people dealing with GWAS a wee while ago. However, the improvements in sequencing methods have definitely made QC of NGS data a rapidly moving target for standardisation. A simple lack of understanding of how the human genome works has caused major debates about how variants should be interpreted, which directly affects how one should control for anomalies prior to association analysis. And so on.
It is also important to realise the fundamental difference between SNP-based and NGS-based data: mainly, the vast volume of the latter swamps signals that are still detectable with the former. However, NGS potentially has so much more power and sensitivity in association analysis than the SNP approach that, despite these issues, the community should and will persevere with the idea that NGS is here to stay, and will one day be the de facto data source for association analyses.
Big data challenges in bone research: genome-wide association studies and next-generation sequencing. Nerea Alonso, Gavin Lucas & Pirro Hysi. BoneKEy Reports 4, Article number: 635 (2015)