Fine Mapping Causal Variants and Allelic Heterogeneity

On Friday, April 28, 2017, in the CNSI Auditorium, Eleazar Eskin presented ZarLab’s research on fine mapping causal variants and allelic heterogeneity at the 2nd Annual Institute for Quantitative and Computational Biosciences (QCBio) Symposium.

Geneticists use genome-wide association studies (GWAS) to identify genetic variants that cause an individual to exhibit a particular trait or disease. Typically, a GWAS identifies an association signal, which suggests that genetic variants within a region of the genome, known as a locus, are associated with the condition. The process of identifying the actual variant in the region that has an effect on the disease is referred to as “fine mapping.”

In addition to finding the actual variants affecting a disease, fine mapping also seeks to address broader questions about the genetic basis of disease. First, how many causal variants does a locus contain? A disease could be caused by a single variant or by multiple variants that independently affect disease status. We refer to the latter phenomenon as allelic heterogeneity (AH).

Second, when analyzing results from multiple GWASes, can a causal variant identified in one study be assumed causal in other studies? A GWAS can identify many variants that are associated with two or more traits; however, this correlation can be induced by a confounding factor known as linkage disequilibrium (LD), the correlation between nearby variants. Colocalization methods seek to distinguish shared causal variants from distinct ones.
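To make the role of linkage disequilibrium concrete, here is a minimal sketch of one common way to quantify it: the squared correlation (r²) between the alleles at two variants. The function name `ld_r2` and the toy haplotypes are our own illustrative assumptions, not data from any study. A variant in high LD with a causal variant will tend to show an association signal even if it has no effect itself.

```python
def ld_r2(hap_a, hap_b):
    """Squared correlation (r^2) between two biallelic variants.

    hap_a, hap_b: equal-length lists of 0/1 alleles, one entry per haplotype.
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype lists must be non-empty and equal in length")
    p_a = sum(hap_a) / n                                  # allele frequency at variant A
    p_b = sum(hap_b) / n                                  # allele frequency at variant B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n   # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic variant")
    return d * d / denom


# Toy haplotypes (hypothetical): 'tag' mostly travels with 'causal',
# while 'unlinked' is statistically independent of it.
causal = [1, 1, 1, 0, 0, 0, 1, 0]
tag = [1, 1, 1, 0, 0, 0, 0, 0]
unlinked = [1, 0, 0, 1, 0, 1, 1, 0]

print("r^2(causal, tag)      =", ld_r2(causal, tag))
print("r^2(causal, unlinked) =", ld_r2(causal, unlinked))
```

In this toy data the tag variant is strongly correlated with the causal variant, so an association test would flag both. That is why an association signal alone cannot tell us which variant in a locus is causal.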

Farhad Hormozdiari, a recent alumnus of our group and a post-doc at Harvard University, developed several novel approaches for improving the accuracy and efficiency of fine mapping despite the presence of AH in the study population. Hormozdiari’s software packages, CAVIAR, CAVIAR-Genes, and eCAVIAR, quantify the probability that a variant is causal in GWAS and eQTL studies, while allowing for an arbitrary number of causal variants.
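As a rough illustration of what "the probability that a variant is causal" can mean when several causal variants are allowed, the sketch below enumerates every set of up to `max_causal` causal variants, scores each set with a zero-mean multivariate normal likelihood on the association z-scores, and sums posterior weights per variant. This is a toy written for this post, not the CAVIAR implementation. The function names, the prior, the effect-size variance, and the example z-scores and LD matrix are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def mvn_logpdf_zero_mean(z, cov):
    """Log density of N(0, cov) evaluated at z."""
    n = len(z)
    low = cholesky(cov)
    y = []  # forward substitution: solve low * y = z
    for i in range(n):
        y.append((z[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    quad = sum(v * v for v in y)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


def config_cov(ld, causal, effect_var):
    """Covariance of z given a causal set: R + effect_var * R D R,
    where D has ones on the diagonal at the causal positions."""
    n = len(ld)
    return [[ld[i][j] + effect_var * sum(ld[i][c] * ld[c][j] for c in causal)
             for j in range(n)] for i in range(n)]


def posterior_inclusion(z, ld, max_causal=2, prior=0.1, effect_var=25.0):
    """Posterior probability that each variant is causal (toy enumeration)."""
    m = len(z)
    log_w = {}
    for k in range(max_causal + 1):
        for causal in combinations(range(m), k):
            log_prior = k * math.log(prior) + (m - k) * math.log(1.0 - prior)
            log_w[causal] = log_prior + mvn_logpdf_zero_mean(
                z, config_cov(ld, causal, effect_var))
    top = max(log_w.values())  # subtract the max for numerical stability
    total = sum(math.exp(v - top) for v in log_w.values())
    pip = [0.0] * m
    for causal, v in log_w.items():
        weight = math.exp(v - top) / total
        for c in causal:
            pip[c] += weight
    return pip


# Hypothetical locus: variants 0 and 1 are in high LD, variant 2 is nearly independent.
z_scores = [5.0, 4.2, 0.3]
ld_matrix = [[1.0, 0.8, 0.1],
             [0.8, 1.0, 0.1],
             [0.1, 0.1, 1.0]]
print([round(p, 3) for p in posterior_inclusion(z_scores, ld_matrix)])
```

Exhaustive enumeration grows combinatorially with the number of variants and with `max_causal`, so this approach is only practical for small toy loci.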

In his presentation, Eskin summarizes the progress on these problems. A video of the presentation may be found on the QCBio website:

More details about our research in fine mapping are available in the following papers:

Hormozdiari, Farhad; van de Bunt, Martijn; Segrè, Ayellet V; Li, Xiao; Joo, Jong Wha J; Bilow, Michael; Sul, Jae Hoon; Sankararaman, Sriram; Pasaniuc, Bogdan; Eskin, Eleazar

Colocalization of GWAS and eQTL Signals Detects Target Genes. Journal Article

In: Am J Hum Genet, 2016, ISSN: 1537-6605.


Hormozdiari, Farhad; Kichaev, Gleb; Yang, Wen-Yun Y; Pasaniuc, Bogdan; Eskin, Eleazar

Identification of causal genes for complex traits. Journal Article

In: Bioinformatics, 31 (12), pp. i206-i213, 2015, ISSN: 1367-4811.


Hormozdiari, Farhad; Kostem, Emrah; Kang, Eun Yong; Pasaniuc, Bogdan; Eskin, Eleazar

Identifying causal variants at loci with multiple signals of association. Journal Article

In: Genetics, 198 (2), pp. 497-508, 2014, ISSN: 1943-2631.


Hormozdiari F, Zhu A, Kichaev G, Ju CJ, Segrè AV, Joo JW, Won H, Sankararaman S, Pasaniuc B, Shifman S, Eskin E. Widespread allelic heterogeneity in complex traits. The American Journal of Human Genetics. 2017 May 4;100(5):789-802.

Involving undergraduates in genomics research to narrow the education-research gap

Serghei Mangul and Lana Martin, together with Eleazar Eskin, recently wrote a paper describing a model for training undergraduates in bioinformatics. Our paper is available online as a preprint and is under review at a peer-reviewed journal.

The Education-Research Gap in Universities.

While the benefits of undergraduate research experiences (UREs) are recognized for undergraduates, the advantages of UREs for graduate students, post-doctoral scholars, and faculty are not clearly outlined.

Based on our experience mentoring undergraduates in ZarLab, we believe that the analysis of genomic data is particularly well-suited for successful involvement of undergraduates. In computational genomics research, undergraduate trainees who master a particular skill can contribute sufficient work to gain authorship on a peer-reviewed paper.

In our paper, we offer a framework for engaging undergraduates in genomics research while simultaneously improving lab productivity. First, identify particular “low-level” tasks that may take up to a week for an undergraduate to complete. Second, encourage students to “outsource” foundational education needs to workshops, online resources, and review articles. Third, take advantage of department- and campus-wide undergraduate research and training initiatives.

The proposed strategy can be easily reproduced at other institutions, is pedagogically flexible, and is scalable from smaller to larger laboratory sizes. We hope that genomics researchers will involve undergraduates in more computational tasks that benefit both students and senior laboratory members.

Preprint copies of our manuscript are available for download here:

In tandem with this paper, we created an online catalogue of resources and papers aimed at bridging the research-teaching divide in computational genomics:

The full citation of our paper:
Mangul, S., Martin, L. and Eskin, E., 2017. Involving undergraduates in genomics research to narrow the education-research gap. PeerJ Preprints, 5, p.e3149v1.


Benefits of UREs to Research Lab and Undergraduates.

Addressing the Digital Divide in Contemporary Biology: Lessons from Teaching UNIX

Serghei Mangul and Lana Martin, together with Alexander Hoffmann, Matteo Pellegrini, and Eleazar Eskin, recently published a paper describing a workshop model for training scientists who have no computer science background to use UNIX. Our paper is available online as a preprint and will appear in an upcoming “Scientific Life” section of Trends in Biotechnology.

Scientists who are not trained in computer science face an enormous challenge when analyzing high-throughput data. Serghei developed a series of workshops in response to the growing demand among life and medical science researchers to analyze their own data using the command line.

Administered by UCLA’s Institute for Quantitative and Computational Biosciences (QCBio), these workshops are designed to help life and medical science researchers use applications that lack a graphical interface. Our paper presents a training model for these workshops: a flexible approach that can be implemented at any institution to teach the use of command-line tools when the learner has little to no prior knowledge of UNIX.

QCBio currently offers similar workshops to the UCLA community. In tandem with this publication, we created an online catalogue of resources and papers aimed at providing first-time learners with basic knowledge of the command line:

We encourage fellow instructors of Bioinformatics, as well as scientists who are new learners of the command line, to read our paper and share their thoughts! Email us at: lana [dot] martin [at] ucla [dot] edu.


The full citation of our paper:
Mangul, Serghei, Martin, Lana S., Hoffmann, Alexander, Pellegrini, Matteo, and Eskin, Eleazar. Addressing the Digital Divide in Contemporary Biology: Lessons from Teaching UNIX. Trends in Biotechnology; doi: 10.1016/j.tibtech.2017.06.007.

Advance preprint copies of our paper may be downloaded here: