AI-Powered De Novo Motif Discovery System For Genomic Sequence Analysis

23 May

Authors: Khushi S Shukla, Prof. Shilpa M, Mohammed Viqar, Mohd Shahnawaz Khan, Prateeksha R Y

Abstract: Accurately identifying regulatory DNA motifs (short, recurring sequences that influence gene expression) is challenging because they are short, variable in sequence, and dependent on the surrounding genomic context. Conventional experimental methods for identifying motifs are time-consuming and do not scale. This study describes a computational workflow for de novo motif discovery that combines statistical and AI methods, including Expectation Maximization, Gibbs sampling, and deep learning, to recognize conserved sequence motifs in genomic data. Because it requires no prior knowledge of motifs, the system identifies candidate transcription factor binding sites and other regulatory elements, deepening our understanding of gene regulation. The method is validated against benchmark datasets and its results are visualized as sequence logos, providing a scalable and interpretable solution for research in functional genomics.
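To make the sequence-logo step concrete, the following is a minimal sketch, not the authors' implementation, of the quantity a logo displays: the per-position information content of a set of aligned motif instances. It assumes the sites are already aligned, of equal length, written in uppercase A/C/G/T, and scored against a uniform background; the small-sample correction used by some logo tools is omitted. The function names and the example sites are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: uppercase DNA, no ambiguity codes


def position_frequencies(sites, pseudocount=0.0):
    """Return per-column base frequencies for equal-length aligned sites."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    matrix = []
    for col in range(width):
        counts = Counter(s[col] for s in sites)
        matrix.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return matrix


def information_content(column):
    """Bits in one column: log2(4) minus Shannon entropy (uniform background)."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return math.log2(len(ALPHABET)) - entropy


# Hypothetical aligned motif instances, for illustration only.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
pfm = position_frequencies(sites)
bits = [information_content(col) for col in pfm]
consensus = "".join(max(col, key=col.get) for col in pfm)
```

A fully conserved column scores 2 bits and a column with all four bases equally frequent scores 0 bits, so the height of each logo stack reflects how strongly that position is constrained. In a discovery workflow, the aligned sites would come from the motif finder (for example the sites selected by Expectation Maximization or Gibbs sampling) rather than being supplied by hand.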

DOI: http://doi.org/