April 9th, Adi Stern: Costs and benefits of mutational robustness

Adi Stern

Adi is a post-doc working with Raul Andino at UCSF and Rasmus Nielsen at UC Berkeley. Her interests lie in understanding the “arms-race” evolution of viruses and their hosts. Adi received a B.Sc. in Biology and Psychology, and an additional B.Sc. in Math, from Tel-Aviv University. In her PhD research with Tal Pupko, she developed phylogenetic models of evolution, and used them to study adaptation in different HIV strains. Next, during a postdoc at the Weizmann Institute with Rotem Sorek, she studied the evolution of the CRISPR antiviral system, and documented diversity of phages and CRISPRs in the human gut microbiome. Currently, Adi combines population genetics theory with experimental evolution to study the constraints governing the evolution of “mutant-clouds” of RNA viruses. In Oct. 2014 Adi is starting a research group at Tel Aviv University, where she plans to continue combining theory with experimental evolution to study how viruses of all types and forms adapt to continuously changing environments. For more information, have a look at her website.

Talk: Costs and benefits of mutational robustness

Hidden genetic variability, in the form of neutral mutations, is thought to facilitate evolution by providing a reservoir of potentially adaptive alleles. Mutational robustness, which is the ability of a population to buffer deleterious mutations, determines the neutral variability in a population. Here we focus on a particular type of robustness, namely multiple viruses replicating in the same cell. During high multiplicity of infection (MOI), complementation between virus variants can buffer the effect of detrimental mutations. We examine this prediction by comparing populations of RNA viruses grown at high or low MOI. Comparison of the minor allele composition of these populations demonstrated that indeed high MOI buffers detrimental mutations. Next, guided by these experimental results, we developed a theoretical framework to compare the evolutionary behavior of robust viral populations and non-robust, i.e. brittle, populations as they adapt to the challenges of an environmental change. We find that robust populations adapt more rapidly but purge novel deleterious mutations more slowly. Brittle populations are better prepared to adapt if neutral alleles in a given environment become predominantly deleterious in a new environment. Whether mutational robustness facilitates or hinders adaptation to a new condition depends on the ratio of beneficial to detrimental mutations hidden within the “neutral” genetic variability in the starting population. We illustrate different types of environmental changes where mutational robustness, or lack thereof, plays a role in viral adaptation. Thus, under certain conditions, diversity may actually be an impediment for viral adaptation.

Seminar details

Wednesday April 9th, 2014
1:00 PM Lunch (sign up below)
1:15 PM Seminar
Location: Clark Center S360
If you would like to speak with Adi, contact Pleuni Pennings (pleuni@stanford.edu)

April 2nd, Melissa Wilson Sayres: Sex-biased evolution and disease

Melissa Wilson Sayres is an evolutionary and computational biologist broadly interested in questions of genome evolution, mutation rate variation, and the consequences for species biology. She analyzes large-scale datasets to study questions relating to sex-specific mutational processes, including how sex chromosomes arise and evolve, and how and why mutation rates differ between the sexes. She also develops models and analyzes experimental data to understand the genomic effects of natural selection, background selection, and convergent evolution.

Wilson Sayres received her B.S. in Medical Mathematics from Creighton University in Omaha, Nebraska, her Ph.D. in Integrative Biology: Bioinformatics & Genomics from The Pennsylvania State University, and currently works as a Miller postdoctoral fellow at the University of California, Berkeley.

Title: Sex-biased evolution and disease

Sex-biased processes occur on a variety of levels, from the differentiation of our sex chromosomes, to population dynamics, to the way that diseases affect each sex. The inundation of genomic and transcriptomic sequences provides the opportunity to apply computational and statistical approaches to understand sex-biased processes. The human sex chromosomes, X and Y, were once an indistinguishable pair of autosomes, but over the past 180 million years have become quite different. The Y has lost 90% of the ancestral gene content, but still retains relics of its ancestral partnership with the X. The Y chromosome, inherited through the genetic paternal line, and being nearly devoid of homologous recombination, also experiences evolutionary processes differently than regions that recombine. As such, studying patterns of genome-wide diversity can provide a unique insight into the history of sex-biased demography and selection acting on the Y chromosome. In addition to sex-biased genomics, many diseases, such as the autoimmune disease Rheumatoid Arthritis (RA), act in a sex-biased manner. RA affects three times as many women as men, and its onset and severity are affected by a complex interaction between genotype and environment. In particular, pregnancy often has an ameliorating effect on RA disease activity. I will discuss our computational approaches to: 1) understand the degradation of the Y, and how this process has affected the X chromosome; 2) illuminate the history of sex-biased demography and selection acting on the Y chromosome; and, 3) evaluate gene expression variation across clinical RA patients in the natural human model system of pregnancy.

Seminar details

Wednesday April 2nd, 2014
1:00 PM Lunch (sign up below)
1:15 PM Seminar
Location: Clark Center S360
If you would like to speak with Melissa, contact Pleuni Pennings (pleuni@stanford.edu)

March 19th, Eyal Elyashiv: Building a population genetic map of the effects of linked selection, with application to Drosophila melanogaster

About Eyal

Eyal Elyashiv, Columbia University

Eyal is a PhD candidate at the Hebrew University of Jerusalem and a visiting student at Columbia University, working under the supervision of Prof. Guy Sella.

Talk: Building a population genetic map of the effects of linked selection, with application to Drosophila melanogaster

Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of such “linked selection” on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and gaining insight into the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a site is polymorphic given local annotations and substitution patterns. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a population genetic map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from ~190 lines in Drosophila melanogaster and show that it reliably predicts diversity levels at the 1Mb scale and helps interpret finer diversity patterns around substitutions in proteins and UTRs. The method outperforms existing ones and allows one to distinguish the contribution of sweeps from other modes of linked selection and to obtain robust estimates of sweep parameters, in particular providing strong evidence for sweeps in UTRs. More generally, our findings indicate that linked selection has had a pronounced effect in reducing diversity levels and increasing their variance in D. melanogaster, and suggest that other modes of selection (e.g. partial and soft sweeps) contribute substantially to these effects.
Our approach presents the advantages of being flexible in the species to which it can be applied, the modes of selection that it can consider and in its ability to readily incorporate ever-improving functional annotations and genetic maps.

Seminar details

Wednesday March 19th, 2014
1:00 PM Lunch (sign up below)
1:15 PM Seminar
Location: Clark Center S360
If you would like to speak with Eyal, contact Jonathan Pritchard (pritch@stanford.edu)

March 12th, Claudia Bank: Shifting fitness landscapes in response to altered environments

About Claudia:

Claudia Bank

Claudia Bank is a postdoc in Jeffrey Jensen’s lab at the École Polytechnique Fédérale de Lausanne in Switzerland, and is currently spending a semester at UC Berkeley as a Simons-Berkeley fellow in the program “Evolutionary Biology and the Theory of Computing”. Following a Master’s degree in Mathematics from Germany, she obtained her PhD in Population Genetics under the supervision of Joachim Hermisson in Vienna, Austria.

Claudia’s research is focused on the study of evolution – and in particular, the population genetics of adaptation and speciation – at the interface between theoretical and empirical biology. The approaches she uses involve theoretical modeling, computational methods, and statistical data analysis.

Talk: Shifting fitness landscapes in response to altered environments

One of the most controversial questions in evolutionary biology is the role of adaptation in molecular evolution. After decades of debate between selectionists and neutralists, new high-throughput methods are beginning to illuminate the full distribution of fitness effects of new mutations. Here, we shed light on the adaptive potential in Saccharomyces cerevisiae by presenting systematic high-throughput fitness measurements for 560 point mutations in a region of Hsp90 under six environmental conditions. Under elevated salinity, we observe numerous beneficial mutations, all of which are observed to be associated with high costs of adaptation. We thus demonstrate that an essential protein can harbor adaptive potential upon an environmental challenge, and report a remarkable fit of the data to Fisher’s geometric model. In addition, we compare the differences in the DFEs resulting from mutations covering 1, 2 and 3 nucleotide steps from the wild type – showing that multiple-step mutations harbor more potential for adaptation in challenging environments, but also tend to be more deleterious in the standard environment. We utilize a Bayesian MCMC modeling framework to evaluate the statistical significance of the results – showing a remarkable accuracy of the experimental approach that allows us, e.g., to identify a deleterious synonymous mutation under standard conditions.

Seminar details

Wednesday March 12th, 2014
1:00 PM Lunch (sign up below)
1:15 PM Seminar
Location: Clark Center S360
If you would like to speak with Claudia, contact Pleuni Pennings (pleuni@stanford.edu)

March 5th, Julia Salzman: Circular RNA is expressed across 1 billion years of evolution

About Julia

Julia Salzman is an Assistant Professor of Biochemistry at the Stanford University School of Medicine and Associate Member of the Stanford Cancer Institute. She received an A.B. in Mathematics magna cum laude from Princeton University and her Ph.D. in Statistics from Stanford University. Dr. Salzman spent one year on the Faculty in the Department of Statistics at Columbia University before returning to Stanford as a Postdoctoral research fellow in the laboratory of Dr. Patrick O. Brown and subsequently joining the faculty at Stanford. She has published broadly in fields including quantum information theory, statistical methodology, computational biology and genetics. Her most significant contributions have been to show that circular RNA is a previously overlooked but ubiquitous component of eukaryotic gene expression programs. Dr. Salzman’s work has been funded by grants from the Division of Mathematical Sciences at the NSF and a K99/R00 award from the NCI. She is a 2014 Alfred P. Sloan Fellow.

The Salzman lab combines biochemical, genetic, algorithmic and statistical approaches to study RNA expression. Our goal is to use high throughput experimental and statistical tools to construct a high dimensional picture of gene regulation, including cis and trans control of the full repertoire of RNAs expressed by cells. Currently, we are focusing on the function and biogenesis of circular RNA, which we recently discovered to be a ubiquitous and uncharacterized component of eukaryotic gene expression. A second major focus is gene expression variation in human cancer. Here, we combine mining of massive public datasets and experimental study of primary tumors and cell lines with bioinformatic and statistical methods. We use the cancer genome as a window into functional roles played by RNA, and are attempting to characterize potential biomarkers.

Talk: Circular RNA is expressed across 1 billion years of evolution

Until recently, circular RNA isoforms expressed from protein coding loci have largely gone unnoticed. Yet, these topologically circular molecules are expressed from a large fraction of human, mouse and fly genes. Since our initial report of widespread RNA circles in humans and mouse, constituting the dominant isoform in hundreds of genes, abundant circular RNAs have been reported in zebrafish, C. elegans and fruit flies; and other groups have confirmed our findings in human and mouse cells. Recently, we have discovered that circular RNAs are expressed in diverse species whose most recent common ancestor existed more than one billion years ago including fungi, a plant and protists. Some of these species have very short introns (~100 nucleotides or shorter) and few documented examples of exon skipping, yet they still produce circular RNAs, making it unlikely that all circular RNAs are by-products of alternative splicing or “piggyback” on signals used in alternative RNA processing. Furthermore, these results indicate that circular RNA may be an ancient, conserved feature of eukaryotic gene expression programs.

Wednesday March 5th, 2014
1:00 PM Lunch
1:15 PM Seminar
Location: Clark Center S360

February 26th, Eilon Sharon: Unraveling gene promoter and 3’ end effects on expression strength and noise using many designed sequences

Eilon Sharon is a postdoc in the labs of Jonathan Pritchard and Hunter Fraser.

About Eilon

“I completed a PhD followed by a one-year postdoctoral position in computational biology at the Weizmann Institute of Science located in Rehovot, Israel, working under the supervision of Prof. Eran Segal in the Departments of Computer Science and Applied Mathematics and Molecular Cell Biology. My PhD studies focused mainly on developing computational methods and devising experimental methods, which use synthetic biology to decipher how transcription regulation is encoded in the yeast genome. I hold a double major BSc in Biology (summa cum laude) and Computer Science (magna cum laude), and also spent two and a half years working for Rosetta Genomics as an algorithms developer, where my team found over 100 novel miRNAs in humans.

During my PhD I developed a technology that accurately measures the induced transcription of thousands of fully designed promoters in a single experiment (Nature Biotechnology 2012). By combining several technologies (synthetic oligo libraries, fluorescent reporter assay, fluorescence-activated cell sorting and deep sequencing) my method provides a ~1000-fold increase in the scale with which the effect of a fully designed sequence on expression can be studied. Analysis of the results produced several insights into the principles of transcriptional regulation. Due to its adaptable nature, the method is currently applied to study a broad range of mappings between genotype and diverse biological phenotypes in the Eran Segal lab.

In two additional projects I showed how yeast ribosomal proteins use transcriptional regulation to compensate for differences in their gene copy number by accurately measuring their promoter-derived expression and modeling their regulatory mechanism (Genome Research 2011); and developed a novel probabilistic method (based on Markov networks) to infer and model TF binding specificities from experimental results, while capturing inter-dependencies between binding positions (PLoS Computational Biology 2008).

On Feb. 1st, 2014 I started a postdoc in the labs of Prof. Jonathan Pritchard and Prof. Hunter Fraser.”

Talk: Unraveling gene promoter and 3’ end effects on expression strength and noise using many designed sequences

Despite extensive research, our understanding of the rules according to which cis-regulatory sequences are converted into gene expression is limited. We devised a method for obtaining parallel, highly accurate gene expression measurements from thousands of designed regulatory sequences. We first applied it to measure the effect on expression level of systematic changes in the location, number, orientation, affinity and organization of transcription-factor binding sites and nucleosome-disfavoring sequences in promoters. Analysis of the results revealed a clear relationship between expression and binding-site multiplicity, as well as dependencies of expression on the distance between transcription-factor binding sites and gene starts. We then applied our method to study promoter effect on noise in gene expression and found that noise levels of promoters with similar mean expression levels can vary over two orders of magnitude. Our results suggest that the effect of promoters on noise is partly mediated by the combination of nonspecific DNA binding and one-dimensional sliding along the DNA that occurs when transcription factors search for their target sites. Lastly, we adapted our method for studying the effect of gene 3’ end sequence on expression and found that the main mechanism by which 3’ end sequences affect expression is mRNA 3’ end processing efficiency and that it is encoded by a single element in yeast gene 3’ end sequences. Our method can be used to study both cis and trans effects of genotype on transcriptional, post-transcriptional and translational control and is now being adapted to other organisms.

Seminar details

Wednesday Feb 26, 2014
1:00 PM Lunch (please sign up here)
1:15 PM Seminar
Location: Clark Center S360
Host: Jonathan Pritchard

February 19th, Olga Sazonova: Functional genomics of vascular smooth muscle cell differentiation

Olga Sazonova

Olga is a postdoc in the labs of Stephen Montgomery and Tom Quertermous.

Functional genomics of vascular smooth muscle cell differentiation

Coronary heart disease (CHD) and other complex human pathologies are products of genetic and environmental factors whose interactions are poorly understood. Genome-wide association studies (GWAS) demonstrate that most disease-associated genetic variants modulate the expression profile of a given gene, not the structure of its protein product. Thus, precise identification of regulatory SNPs and the environment-specific mechanisms of their function is critical for developing novel therapeutics in the post-genomic era. To this end, we have developed a novel computational method to detect gene-environment (GxE) interactions from RNA-Seq data by mapping differential allele-specific expression (dASE) in response to an environmental stimulus. We applied this method to detect dASE in vascular smooth muscle cells (VSMCs) exposed to a healthy or disease-like environment and discovered 72 genes (5% FDR) exhibiting dASE as a function of serum stimulation. Only 28 of these 72 genes were shown to exhibit differential expression (dE), illustrating the power of dASE to reveal environment-responsive transcriptional regulation not captured by conventional differential expression analysis. Further, we found enrichment of genes associated with coronary heart disease by GWAS among dASE genes but not dE genes, and this result further suggests that dASE mapping can reveal novel mechanistic insights about the identity and function of causal variants implicated in disease risk. Our pipeline can be applied to any paired case-control RNA-Seq data set to discover the presence of environment-sensitive regulatory variants and offers a novel and powerful avenue to study GxE interactions in complex human disease.

Seminar details

Wednesday Feb 19, 2014
1:00 PM Lunch
1:15 PM Seminar
Location: Clark Center S360
Host: Stephen Montgomery

Feb 5th, David Golan: Accurate Estimation of Heritability in Case-Control GWAS

David Golan (Tel Aviv University)

About David

David Golan is a PhD candidate in the department of statistics at Tel-Aviv University under the supervision of Prof. Saharon Rosset.

His research spans a wide range of problems in genetics and bioinformatics, ranging from modeling and analysis of deep sequencing data to population genetics problems such as heritability estimation using GWAS data. David is a Colton fellow at Tel-Aviv University and a fellow of the Edmund J. Safra Center for Bioinformatics.

Abstract

Linear mixed effects models (LMMs) have recently gained popularity as the method of choice for estimating heritability from GWAS data. Recent results using LMMs suggest that much of the “missing” heritability can be found in common SNPs with small effects which are unidentified by current-day GWAS due to low power.

However, many of the interesting diseases and disorders studied are rare (typically affecting <1% of the population), and so case-control designs are used, wherein the proportion of cases in a study is usually considerably higher than the proportion of cases in the population.

We show that this over-representation of cases invalidates several key assumptions of LMMs, e.g. the normality and independence of the random effects, and show that ignoring these problems results in shrunken estimates of heritability.

We propose an alternative approach for estimating heritability. We derive the relationship between the genetic similarity and the phenotypic similarity of any two individuals as a function of the heritability, while explicitly conditioning on the fact that both individuals were selected for the study. Our method then entails regressing the pairwise phenotypic similarities on the pairwise genetic similarities and using the slope to obtain an estimate of the heritability. We show, using simulations, that our method yields unbiased estimates which are considerably more accurate than the current state-of-the-art methodology.

Applying our method to several well-studied GWAS yields heritability estimates which are considerably higher than previously published results.
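
The regression step described in the abstract can be illustrated with a minimal sketch. This is not the speaker's method: it deliberately omits the conditioning on case-control selection that the abstract presents as the key contribution, and shows only the plain random-sample version for a quantitative trait, where the slope of pairwise phenotype products on pairwise genetic similarity estimates heritability. The function names (`simulate`, `pairwise_slope`) and all simulation settings are assumptions made for illustration.

```python
import numpy as np


def simulate(n, m, h2, seed=0):
    """Simulate n individuals, m SNPs, and a trait with heritability h2.

    Hypothetical toy setup: every SNP is causal with a small effect.
    """
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.9, size=m)            # allele frequencies
    genotypes = rng.binomial(2, freqs, size=(n, m)).astype(float)
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    effects = rng.normal(0.0, np.sqrt(h2 / m), size=m)
    noise = rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    return genotypes, z @ effects + noise


def pairwise_slope(genotypes, phenotype):
    """Regress pairwise phenotypic similarity on pairwise genetic similarity.

    No ascertainment correction is applied (random-sample assumption).
    """
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    y = (phenotype - phenotype.mean()) / phenotype.std()
    n, m = z.shape
    genetic_sim = z @ z.T / m          # genetic similarity matrix
    pheno_sim = np.outer(y, y)         # products of standardized phenotypes
    pairs = np.triu_indices(n, k=1)    # each distinct pair once
    x, t = genetic_sim[pairs], pheno_sim[pairs]
    xc = x - x.mean()
    # Ordinary least-squares slope (intercept included via centering).
    return float(xc @ (t - t.mean()) / (xc @ xc))


if __name__ == "__main__":
    g, y = simulate(n=1000, m=400, h2=0.5, seed=1)
    print("estimated heritability:", pairwise_slope(g, y))
```

In a case-control sample the cases are over-represented, so this uncorrected slope would not be expected to recover the heritability; that gap is what the approach in the abstract addresses.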

Seminar details

Wednesday Feb 5th, 2014
12:45 PM Lunch: sign up sheet here.
1:15 PM Seminar starts.
Location: Clark Center S360
Host: Jonathan Pritchard
Schedule: Tara Trim (ttrim at stanford.edu)

Jan 29th, Mark Wright (Koni): Local Methods for Evaluating Population Structure and Multiple Admixture in Plant and Animal Populations with Inbreeding

Mark (Koni) Wright, Cornell

About Koni

Mark Wright, known to friends and colleagues by the nickname “Koni”, is a Research Associate in the Department of Plant Breeding and Genetics at Cornell University. Mark studied computer science at Cornell as an undergraduate with interests in high performance computing, distributed computing, network security and cryptography, graduating in 1998. For the following 3 years, Mark continued his undergraduate research job in the Department of Sociology as a professional programmer, developing and optimizing a dynamic microsimulation model of the United States population for policy research. In 2001, Mark opted for a radical career change and took a Programmer/Analyst position at Cornell with Dr. Steven Tanksley. Over the next 3 years he developed the Solanaceae Genomics Network database and website which continues today as the “SOL Genomics Network”. This was his first exposure to biology and genetics. During these next 3.5 years, Mark exercised Cornell employee benefits, taking one course per semester and gradually accumulating a foundational background in biology, genetics, population genetics, and statistics. In the fall of 2004, Mark began pursuing a doctoral degree at Cornell full time, initially in the laboratory and field seeking a “hands on” experience but ultimately returning to analysis of large scale genomics datasets under Dr. Carlos Bustamante. Mark completed his Ph.D. in 2010. Currently, Mark works with Dr. Susan McCouch at Cornell developing and analyzing large genotype and sequence datasets in cultivated Asian rice (Oryza sativa) and its wild progenitor species Oryza rufipogon, as well as related African rice species.
Mark’s research interests include plant and animal domestication, inference of population structure and history from large genomic datasets, detecting signatures of selection at putative domestication loci, simulation studies for the optimal design of multiple parent quantitative trait mapping populations, and the development of high performance computational methods to utilize increasingly larger datasets for these and other purposes. More recently, Mark has developed broader interests in applying methods he has developed to systems other than rice, and an increasing interest in pharmacogenomics and personalized medicine.

Abstract

Cultivated Asian rice is broadly divided into two subspecies, Japonica and Indica rice, recognized morphologically since ancient times. More recently, analysis of molecular genetic markers has further characterized these two groups into 3 Japonica subpopulations and 2 Indica subpopulations. Pairwise FST between these 5 major subpopulations ranges from 0.20 to 0.43. The high level of divergence between the subspecies has been argued to support independent domestications but the presence of only a single haplotype at key domestication loci suggests otherwise. Alternatively, the high degree of subpopulation structure in rice may reflect a high level and complexity of structure in the related wild species Oryza rufipogon that may have been used repeatedly in different geographic regions to adapt domesticated rice imported from a single origin to local ecological conditions. While global ancestry of cultivated rice diversity panels has been extensively studied, population structure in Oryza rufipogon remains largely unexplored. Additionally, while it is known that natural and artificial admixture of cultivated rice populations occurs, there is no extensive study of local genome ancestry in diversity panels and core collections. Using a relatively simple approach we develop a method for unsupervised discovery of subpopulation structure at the genome local level to reveal ancient and recent cross-subpopulation introgressions in cultivated rice and explore the subpopulation structure of Oryza rufipogon. Key considerations were estimating and modeling sample specific inbreeding coefficients along with other model parameters such as subpopulation allele frequencies, and the ability to handle dense marker data on the order of 100,000 to 500,000 SNP markers, or more. 
We apply the method to several different types of rice datasets and find that robust results are obtained, even in the case of low coverage (<0.5X) random-shear NGS data as well as restriction site anchored genotyping-by-sequencing (GBS) data with high missing data rates (>35%), yet no data imputation is required. Additionally, we explore broader applications in other systems such as cattle and horse and find this method confirms and extends published findings which used global model based structure analyses and PCA, but also yields a map of the genome of the genotyped individuals showing introgression tracts in admixed populations. Taken together, these results suggest the method may be useful in a broad range of applications, especially in characterizing large sample collections (e.g., germplasm core collections) with cheap, “dirty” genotyping methods such as GBS or low coverage sequencing, and in new emerging systems of study where, unlike rice, there may be little or no history of population genetic analyses characterizing subpopulation structure.

Seminar details

Wednesday Jan 29th, 2014
12:45 PM Lunch: sign up sheet here.
1:15 PM Seminar starts.
Location: Clark Center S360
Host: Carlos Bustamante
Schedule: Rosario Monge (rmonge at stanford.edu)