Field | Value | Language
dc.contributor.advisor | Vikalo, Haris | en
dc.contributor.committeeMember | Dhillon, Inderjit S | en
dc.contributor.committeeMember | Ravikumar, Pradeep | en
dc.contributor.committeeMember | Sanghavi, Sujay | en
dc.contributor.committeeMember | Tewfik, Ahmed | en
dc.creator | Das, Shreepriya | en
dc.date.accessioned | 2016-02-18T14:47:24Z | en
dc.date.accessioned | 2018-01-22T22:29:32Z |
dc.date.available | 2016-02-18T14:47:24Z | en
dc.date.available | 2018-01-22T22:29:32Z |
dc.date.issued | 2015-12 | en
dc.date.submitted | December 2015 | en
dc.identifier | doi:10.15781/T27389 | en
dc.identifier.uri | http://hdl.handle.net/2152/33328 | en
dc.description.abstract | The field of genomics has witnessed tremendous achievements in the past two decades. Advances in sequencing technology have enabled the acquisition of massive amounts of data that reveals information about an individual's genetic blueprint and is revolutionizing the field of molecular biology. Interpreting such data requires solving mathematical (statistical and computational) problems rendered difficult by the complex interacting processes that are characteristic of biological systems; the data is high-dimensional, typically noisy, and often incomplete. Algorithm design in these settings requires a deep understanding of the underlying biological principles, good mathematical abstractions that permit tractable inference, and fast, scalable, and accurate solutions that draw on ideas from diverse fields such as optimization, probability, statistics, and algorithms. This dissertation deals with two such problems in bioinformatics and computational biology. First, for the problem of basecalling for sequencing-by-synthesis (Illumina) platforms, I describe novel, computationally tractable statistical models and signal processing schemes that are fast and have lower error rates than existing state-of-the-art basecallers. Extensions to a soft information exchange setup for joint basecalling and SNP calling are also explored. Next, I describe two novel single-individual haplotyping inference schemes for diploid and polyploid species, one using an (optimal) branch-and-bound framework and the other using (scalable) low-rank semidefinite programming ideas. In addition to improving the quality of basecalling, SNP calling, genotyping, and haplotyping, I also developed user-friendly software that the biological research community can use for various purposes, including cancer genomics and metagenomics studies. | en
dc.format.mimetype | application/pdf | en
dc.language.iso | en | en
dc.subject | Basecalling | en
dc.subject | Haplotyping | en
dc.subject | Bioinformatics | en
dc.subject | Computational biology | en
dc.title | Algorithms for next generation sequencing data analysis | en
dc.type | Thesis | en
dc.description.department | Electrical and Computer Engineering | en
dc.date.updated | 2016-02-18T14:47:24Z | en


Files in this item

There are no files associated with this item.
