BEGIN:VCALENDAR
VERSION:2.0
PRODID:icalendar-ruby
CALSCALE:GREGORIAN
X-WR-CALNAME:PhD Preliminary Oral Exam – He Zhang
X-WR-TIMEZONE:Pacific Time (US & Canada)
BEGIN:VEVENT
DTSTAMP:20260517T024024Z
UID:tag:localist.com\,2008:EventInstance_35272243328511
DTSTART:20201203T210000Z
DTEND:20201203T230000Z
DESCRIPTION:Linear-time Algorithms of Partition Function\, Stochastic Sampl
 ing and two-strand Folding for RNA Secondary Structures\n\nMany RNAs fold 
 into multiple structures at equilibrium. The partition function-based meth
 ods are proposed to compute folding ensembles and estimate structure and b
 ase pair probabilities. However\, the classical partition function algorit
 hm scales cubically with sequence length\, and is therefore a slow calcula
 tion for long sequences. We design a linear-time heuristic algorithm\, Lin
 earPartition\, to approximate the partition function and base pairing prob
 abilities\, which is shown to be orders of magnitude faster than Vienna RN
 Afold and CONTRAfold. More interestingly\, the resulting base pairing prob
 abilities are even better correlated with the ground truth structures. Wit
 h LinearPartition\, the estimated base-pairing probabilities provide compa
 ct representations of the exponentially large ensemble\, but they cannot p
 rovide direct and intuitive descriptions\, and cannot be directly used for
  accessibility prediction. The stochastic sampling algorithm\, which
  samples s
 econdary structures according to their probabilities in the Boltzmann ense
 mble\, is widely used\, e.g.\, for accessibility prediction. However\, the
  current sampling algorithm suffers from three limitations: (a) the formul
 ation and implementation of the sampling phase are unnecessarily complicat
 ed\; (b) much redundant work is repeatedly performed in the sampling phase
 \; (c) the partition function runtime scales cubically with the sequence l
 ength. These issues prevent it from being used for full-length viral genom
 es such as SARS-CoV-2. To alleviate these problems\, we first propose a hy
 pergraph framework under which the sampling algorithm can be greatly simpl
 ified. We then present three sampling algorithms under this framework in
  which redundant work is eliminated in the sampling phase. Finally\, we pr
 opose LinearSampling\, the first end-to-end linear-time stochastic samplin
 g algorithm\, which can be used to detect potential regions in SARS-CoV-2
  for diagnostics and treatment. Many RNAs function through RNA-RNA
  interactions
 . LinearSampling is able to provide single strand accessibilities of RNA b
 inding\; however\, two-strand folding\, which can directly predict the str
 uctures with consideration of RNA-RNA interaction\, is also desirable. S
 ome existing tools\, such as RNAhybrid and RNAduplex\, are not only less i
 nformative but also less accurate due to omitting the competition between
  intermolecular and intramolecular base pairs. Another group of tools
  such as RNAup focuses on predicting the binding region rather than
  predicting two-s
 trand co-folding structure. Other tools like RNAcofold are too slow due to
  cubic runtime complexity. To address these issues\, we propose LinearCoFo
 ld\, which is able to predict two-strand folding structure\, partition fun
 ction and base pairing probabilities in linear runtime and space. LinearCo
 Fold is a global co-folding approach without restriction on base pair leng
 th\, and can output both intermolecular and intramolecular base pairs.\n\
 nCo Advisor: Li-Jing (Larry) Cheng\nCo Advisor: Liang Huang\nCommittee: Li
 zhong Chen\nCommittee: Prasad Tadepalli\nGCR: Brett Tyler
LOCATION:
SUMMARY:PhD Preliminary Oral Exam – He Zhang
URL;VALUE=URI:https://events.oregonstate.edu/event/phd_preliminary_oral_exa
 m_he_zhang
CATEGORIES:Lecture or Presentation
END:VEVENT
END:VCALENDAR
