Browsing by Subject "Next generation sequencing"
Item Addressing intrinsic challenges for next generation sequencing of immunoglobulin repertoires. (2014-05)
Chrysostomou, Constantine; Georgiou, George; Iverson, Brent L; Maynard, Jennifer A; Alper, Hal S; Mullins, Charles B
Antibodies are essential molecules that help to provide immunity against a vast population of environmental pathogens. This antibody-conferred protection depends on genetic diversification mechanisms that produce an impressive repertoire of lymphocytes expressing unique B-cell receptors. The advent of high throughput sequencing has enabled researchers to sequence populations of B-cell receptors at an unprecedented depth. Such investigations can be used to expand our understanding of the mechanistic processes governing adaptive immunity, to characterize immunity-related disorders, and to discover antibodies specific to antigens of interest. However, next generation sequencing of immunological repertoires is not without its challenges. For example, it is especially difficult to identify biologically relevant features within large datasets. Additionally, within the immunology community, there is a severe lack of standardized and easily accessible bioinformatics analysis pipelines. In this work, we present methods which address many of these concerns. First, we present robust statistical methods for the comparison of immunoglobulin repertoires. Specifically, we quantified the overlap between the antibody heavy chain variable domain (VH) repertoires of antibody-secreting plasma cells isolated from the bone marrow, lymph nodes, and spleen lymphoid tissues of immunized mice. Statistical analysis showed significantly more overlap between the bone marrow and spleen VH repertoires as compared to the lymph node repertoires. Moreover, we identified and synthesized antigen-specific antibodies from the repertoire of a mouse that showed a convergence of highly frequent VH sequences in all three tissues.
Second, we introduce a novel algorithm for the rapid and accurate alignment of VH sequences to their respective germline genes. Our tests show that gene assignments reported from this algorithm were more than 99% identical to assignments determined using the well-validated IMGT software, and yet the algorithm is five times faster than an IgBLAST-based analysis. Finally, in an effort to introduce methods for the standardization, transparency, and replication of future repertoire studies, we have built a cloud-based pipeline of bioinformatics tools specific to immunoglobulin repertoire studies. These tools provide solutions for data curation and long-term storage of immunological sequencing data in a database, annotation of sequences with biologically relevant features, and analysis of repertoire experiments.

Item Analysis of Genomic Imprinting of UBE3A in Neurons (2015-05-05)
Hillman, Paul Randolph
Angelman syndrome (AS), chromosome 15q11-q13 duplication syndrome (Dup15q), and Prader-Willi syndrome (PWS) are neurodevelopmental disorders associated with dysregulated expression of imprinted genes located within the human 15q11-q13 imprinted region. Angelman syndrome is caused by loss-of-function or loss-of-expression of the maternally inherited UBE3A allele; Dup15q syndrome is attributed to maternally inherited copy number gains of UBE3A; and paternally inherited deletions of the SNORD116 cluster cause PWS. The UBE3A gene is imprinted in the brain with maternal-specific expression and biallelically expressed in all other cell types. The imprint is regulated by expression of the UBE3A antisense transcript (UBE3A-AS), which is expressed only in neurons and imprinted with paternal-specific expression. The UBE3A-AS represents the 3′ end of a long polycistronic transcript that includes the SNORD116 and SNORD115 gene clusters.
Thus, the genes causing AS, Dup15q, and PWS are transcriptionally linked; however, the functional significance of the neuron-specific imprint is largely unknown. In this dissertation, it was hypothesized that imprinting of UBE3A evolved as a mechanism to negatively regulate UBE3A protein levels in neurons. This hypothesis was tested by examining allelic expression patterns and associated protein levels of the mouse 7c imprinted region, the orthologous region of human 15q11-q13. Analyses revealed that imprinted expression of Ube3a in the brain resulted in elevated RNA and protein levels compared to tissues where Ube3a was biallelically expressed. Likewise, Snord116, Snord115, and Ube3a-AS transcripts were highly expressed in the brain. The elevated Ube3a protein levels in the brain were due to increased maternal-allelic expression during neurogenesis concurrent with paternal-allelic suppression. Analysis of UBE3A expression in the opossum, a metatherian mammal lacking an orthologous imprinted region, showed that the UBE3A imprint did not evolve to negatively regulate UBE3A protein levels in the brain. Extensive alternative splicing of Ube3a-AS was detected in the brain, which generated at least two transcripts containing novel open reading frames. Novel Ube3a alternatively spliced transcripts were also identified in the brain.
Collectively, these data reject the hypothesis that the UBE3A imprint evolved to negatively regulate UBE3A protein levels in the brain; instead, they suggest that the UBE3A imprint may allow co-expression of the UBE3A and SNORD gene cluster in neurons, which may also facilitate or regulate the expression of novel brain-specific UBE3A transcripts.

Item Genome-wide approaches to explore transcriptional regulation in eukaryotes (2014-05)
Park, Daechan; Iyer, Vishwanath R.; Marcotte, Edward M; Paull, Tanya T; Miller, Kyle M; Stevens, Scott W
Transcriptional regulation is a complicated process controlled by numerous factors such as transcription factors (TFs), chromatin remodeling enzymes, nucleosomes, post-transcriptional machineries, and cis-acting DNA sequence. I explored the complex transcriptional regulation in eukaryotes through three distinct studies to comprehensively understand the functional genomics at various steps. Although a variety of high throughput approaches have been developed to understand this complex system on a genome-wide scale with high resolution, a lack of accurate and comprehensive annotation of transcription start sites (TSS) and polyadenylation sites (PAS) has hindered precise analyses even in Saccharomyces cerevisiae, one of the simplest eukaryotes. We developed Simultaneous Mapping Of RNA Ends by sequencing (SMORE-seq) and identified the strongest TSS and PAS of over 90% of yeast genes with single nucleotide resolution. Owing to the high accuracy of TSS identified by SMORE-seq, we detected 150 possibly mis-annotated genes that have a TSS downstream of the annotated start codon. Furthermore, SMORE-seq showed that 5’-capped non-coding RNAs were highly transcribed divergently from TATA-less promoters in wild-type cells under normal conditions. Mapping of DNA-protein interactions is essential to understanding the role of TFs in transcriptional regulation. ChIP-seq is the most widely used method for this purpose.
However, careful attention has not been given to technical bias reflected in final target calling due to the many experimental steps of ChIP-seq, including fixation and shearing of chromatin, immunoprecipitation, sequencing library construction, and computational analysis. While analyzing large-scale ChIP-seq data, we observed that unrelated proteins appeared to bind to the gene bodies of highly transcribed genes across datasets. Control experiments including input, IgG ChIP in untagged cells, and the Golgi factor Mnn10 ChIP also showed strong binding at the same loci, indicating that the signals were derived from bias that is devoid of biological meaning. In addition, the appearance of nucleosomal periodicity in ChIP-seq data for proteins localizing to gene bodies is another bias that can be mistaken for false interactions with nucleosomes. We alleviated these biases by correcting data with proper negative controls, but the biases could not be completely removed. Therefore, caution is warranted in interpreting the results from ChIP-seq. Nucleosome positioning is another critical mechanism of transcriptional regulation. Global mapping of nucleosome occupancy in S. cerevisiae strains deleted for chromatin remodeling complexes has elucidated the role of these complexes on a genome-wide scale. In this study, loss of chromodomain helicase DNA binding protein 1 (Chd1) resulted in severe disorganization of nucleosome positioning. Despite the difficulties of performing ChIP-seq for chromatin remodeling complexes due to their transient and dynamic localization on chromatin, we successfully mapped the genome-wide occupancy of Chd1 and quantitatively showed that Chd1 co-localizes with early transcription elongation factors, but not late transcription elongation factors.
Interestingly, Chd1 occupancy was independent of the methylation levels at H3K36, indicating the necessity of a new working model describing Chd1 localization.

Item Global survey of the immunoglobulin repertoire using next generation sequencing technology (2014-12)
Hoi, Kam Hon; Georgiou, George
Specific and sensitive recognition of foreign agents is a critical attribute of an effective immune system, required for maintaining host protection against challenge from pathogenic cells. In the humoral arm of the immune system, this recognition is carried out by the cell surface bound immunoglobulin-like receptors (BCR) and their soluble forms, i.e., antibodies. Over several million years of evolution, the immune system has adopted several strategies for diversifying the antibody sequence and thus its ability to recognize an astronomical variety of molecules through the combinatorial assembly of a small number of DNA segments or genes. Among these immunoglobulin gene diversification strategies, antibody somatic VDJ recombination and junctional diversity are the fundamental mechanisms in generating a broad range of antibody specificities. Understanding how the genetic diversity of antibodies is affected in health and disease is critical for a wide range of medical applications, from vaccine evaluation to diagnostics and therapeutics discovery. Because of the very large number of distinct antibodies encoded by the more than 100 billion B cells in humans, it is essential to use high throughput next generation sequencing technologies in order to obtain an adequate sampling of the sequences and relative abundance of different antibodies expressed by B cells in clinical samples. The process requires, first, rigorous methods for experimentally determining the sequences of antibodies in a sample and, second, informatics tools designed for distilling this information for practical purposes.
This dissertation describes a variety of experimental approaches and informatics tools developed for the determination and mining of the antibody repertoire. The information from this work has led to major conclusions regarding the nature of the antibody repertoire in healthy individuals, in volunteers following vaccination, and in HIV-1 patients.

Item Understanding coral dispersal (2014-05)
Davies, Sarah Whitney; Matz, Mikhail V.
Understanding the factors influencing species ranges and dispersal is becoming increasingly important as climate change alters species distributions worldwide. If species are to persist, life-history strategies must rapidly evolve to accommodate shifting environments. This dissertation assesses the factors modulating dispersal in corals. First, I examined if there were any systematic differences in settlement between Indo-Pacific and Caribbean coral larvae that might explain Caribbean recruitment failures. No differences were observed; however, I detected significant divergences in settlement cue preferences among coral species across both the Caribbean (Diploria strigosa and Montastraea franksi) and the Indo-Pacific (Acropora tenuis, A. millepora, and Favia lizardensis), even for coral larvae from the same reef. Secondly, I established the extent of coral dispersal between remote reefs. I evaluated the genetic diversity and divergence across Micronesia for two coral species and investigated if these islands served as a connectivity corridor between the Indo-West-Pacific (Coral Triangle) and the Central Pacific. I found isolation-by-distance patterns whose strength depended on species, suggesting these corals are not panmictic across their ranges and that island stepping-stones facilitate gene flow to remote Pacific reefs. Next, I investigated genetic structure of symbionts in these same corals, to see if horizontally transmitted symbionts are less dispersive than their coral hosts.
Symbiont genetic divergence between islands was an order of magnitude larger than host divergence and both host species and environment modulated symbiont composition. These results suggest that symbiont populations are host-specific and associating with local symbionts might be a mechanism for broadly dispersing corals to adapt locally. Lastly, I estimated heritable variation in dispersal-related traits in coral larvae. I observed strong heritable variation in gene expression, as well as parental effects on two phenotypic traits, settlement and fluorescence. I observed that patterns of differential expression in three-day-old larvae predicted variation in settlement and fluorescence two days later. Correlations between proteoglycan expression and settlement suggest that the larval extracellular matrix plays a role in settlement. Down-regulation of ribosomal proteins and differential expression of oxidative stress genes correlated with increasing fluorescence, possibly indicating reduced growth and increased stress. Overall, this dissertation contributes to our knowledge of factors affecting coral dispersal and the potential for evolution of dispersal-related traits.