Summary of research topics
- Analysis of DNA sequencing data.
We are interested in developing methods to analyze the data
generated by DNA sequencing platforms.
- Genome variation discovery and genotyping: The Human
Genome Project took 15 years and cost 3-10 billion dollars,
but the new high-throughput sequencing (HTS) platforms now make
it possible to resequence
the genome of a human individual in approximately 2 days for
approximately $1,000. We can discover various forms of genomic
variation from single nucleotide polymorphisms to structural
variants by analyzing the observed read properties; however,
this is a computationally difficult problem because HTS platforms
produce either billions of short reads (~100-150 characters long)
with a low error rate or longer reads (10-50 Kb) with a high error
rate, while the human genome is around 3 billion characters
long. The problem is further complicated by the repeats
present in the human genome. We develop novel algorithms to
discover genomic variants comprehensively and quickly, with a focus
on structural variation, using various sequencing technologies,
including short-read, long-read, and linked-read platforms.
- Hardware/software co-design to accelerate genome analysis:
The success of all medical
and genetic applications of DNA sequencing critically depends on
the existence of computational technologies that can process and
analyze the enormous amounts of sequence data quickly and
energy-efficiently, without the need to build large
infrastructures. Our goal is to develop such technologies by
combining the benefits of enhanced software algorithms and
specialized hardware accelerators such as GPGPUs, FPGAs, and ASICs,
as well as the processing-in-memory paradigm.