A Study and Analysis of Disease Identification using Genomic Sequence Processing Models: An Empirical Review

Curr Genomics. 2023 Dec 12;24(4):207-235. doi: 10.2174/0113892029269523231101051455.

Abstract

Human gene sequences are considered a primary source of comprehensive information about different body conditions. A wide variety of diseases, including cancer and heart, brain, and genetic conditions, can be pre-empted via efficient analysis of genomic sequences. Researchers have proposed different configurations of machine learning models for processing genomic sequences, and these models vary in their performance and applicability characteristics. Models that use bioinspired optimizations are generally slower but have superior incremental performance, while models that use one-shot learning achieve higher instantaneous accuracy but cannot be scaled to larger disease sets. Because of these variations, it is difficult for genomic system designers to identify optimum models for their application-specific and performance-specific use cases. To address this issue, this text presents a detailed survey of genomic processing models in terms of their functional nuances, application-specific advantages, deployment-specific limitations, and contextual future scopes. Based on this discussion, researchers will be able to identify optimal models for their functional use cases. This text also compares the reviewed models in terms of their quantitative parameter sets, which include classification accuracy, the delay needed to classify long sequences, precision, scalability, and deployment cost; this comparison will assist readers in selecting deployment-specific models for their contextual clinical scenarios. Finally, this text evaluates a novel Genome Processing Efficiency Rank (GPER) for each of these models, which will allow readers to identify models with higher performance and low overheads under real-time scenarios.

Keywords: Genome processing; classification; deep learning; disease; gene network; machine learning.

Publication types

  • Review