Deciphering the code of gene regulation using massively parallel assays of designed sequence libraries (GENEREGULATION)
Start date: Mar 1, 2014
End date: Feb 28, 2019
Many gene expression changes that are associated with disease states have in turn been linked to changes in the genes’ regulatory regions. However, without a ‘regulatory code’ that informs us how DNA sequences determine expression levels, we cannot predict which sequence changes will affect expression, by how much, and by what mechanism.

Here, we aim to arrive at a mechanistic and quantitative understanding of how expression levels are encoded in DNA sequence using a combined experimental and computational approach. To this end, we will construct libraries of >50,000 sequences, fuse them to fluorescent reporters, and integrate them into the genomes of yeast or human cells. We will then develop methods for accurately measuring, in parallel, the expression of each fused sequence within a single experiment, and for measuring the DNA binding state of each sequence at single-cell resolution, resulting in a ~1000-fold increase in the scale at which we can study the effect of sequence on expression.

Notably, we will design our experimental system to be modular, allowing us to propose a highly ambitious yet realistic plan in which we will (1) study the effect of sequence on transcriptional regulation; (2) study its effect on post-transcriptional regulation; (3) unravel the effect of genetic variation across human individuals on expression; (4) quantify how cellular fitness depends on the expression level of individual endogenous genes; and (5) construct a predictive model of the effect of DNA sequence on expression.

Each of our libraries should provide novel insights into a different aspect of gene regulation, leading to new means by which we can interpret whole-genome sequencing data, which are rapidly being collected for many individuals. In particular, our unified model should allow us to predict expression changes among human individuals based only on their genotypic variation, greatly enhancing the ability to identify common or rare sequence variants that may affect molecular function or cause disease.