834-BioSeq2Seq to analyze transcriptional regulation (Paper Talk)

22 minutes

Genome annotation currently requires performing dozens of molecular assays in hundreds of cell and tissue samples, an expensive endeavor that is impractical to replicate across all species and conditions of interest. Here, we introduce BioSeq2Seq, a deep learning framework that infers cell-line-specific molecular assays widely used for genome annotation by leveraging a tri-modal input: evolutionarily conserved DNA sequence features, together with cell-line-specific transcriptional activity and directionality captured by a single run-on sequencing assay. BioSeq2Seq enables flexible genome annotation tasks through parameterized configurations of input features and output targets, combined with gradient-guided architectural refinement for specific biological objectives. Our model demonstrates high accuracy across four downstream tasks, improving on state-of-the-art methods by 14.27% in histone modification prediction, 2.50% in functional element identification, and 2.90% in gene expression prediction. In transcription factor binding site (TFBS) prediction, its performance is comparable to that of leading existing approaches. By achieving competitive performance across tasks with input data from a single cell line, BioSeq2Seq provides an efficient and low-cost alternative for genome annotation.
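To make the tri-modal input more concrete, the sketch below shows one plausible way to combine a DNA sequence with strand-specific run-on signal into a single per-position feature matrix. This is an illustration only: the six-channel layout, the function names `one_hot` and `build_input`, and the toy values are assumptions made here and are not taken from the paper, whose actual encoding may differ.

```python
# Hypothetical encoding for illustration; not the authors' implementation.
BASES = "ACGT"

def one_hot(seq):
    """One-hot encode DNA; unknown bases such as N become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def build_input(seq, plus_signal, minus_signal):
    """Stack sequence and strand-specific run-on coverage per position.

    Each row has 6 channels: A, C, G, T, plus-strand signal, minus-strand signal.
    The two signal channels together carry transcriptional activity (magnitude)
    and directionality (which strand the signal comes from).
    """
    if not (len(seq) == len(plus_signal) == len(minus_signal)):
        raise ValueError("sequence and signal tracks must have equal length")
    return [
        row + [float(p), float(m)]
        for row, p, m in zip(one_hot(seq), plus_signal, minus_signal)
    ]

# Toy example: 4 positions with made-up coverage values.
features = build_input("ACGN", [0, 2, 5, 1], [3, 0, 0, 1])
```

Keeping the plus and minus strands as separate channels is one simple way to preserve directionality; a real model would typically also normalize the coverage values and operate on much longer windows.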

References:

  • Zhang Z, Fan X, Zhong J, et al. An end-to-end generalizable deep learning framework to comprehensively analyze transcriptional regulation. Nature Communications, 2026.