Performance Evaluation of BioBERT and CNN Models for Neurological Disorder Detection

16 May

Authors: Abubakar Sadiq Muhammad, Salim Ahmad, Zaharaddeen S. Iro, Abba Dauda

Abstract: Accurate and efficient modeling of neurological disorders remains a significant challenge in clinical neuroscience. With the growing availability of unstructured clinical narratives, natural language processing (NLP) has emerged as a promising avenue for extracting diagnostic signals from text. This study presents a systematic comparison of two deep learning paradigms, transformer-based and convolution-based models, for classifying neurological disorders from clinical notes. Specifically, we fine-tuned BioBERT, a domain-adapted transformer pretrained on biomedical corpora, and trained a Convolutional Neural Network (CNN) under identical experimental conditions: the same dataset, preprocessing pipeline, hyperparameters (learning rate = 2e−5, batch size = 32, maximum sequence length = 130), and evaluation metrics. BioBERT achieved 95.53% accuracy, a 94.38% F1-score, and a ROC-AUC of 0.952, substantially outperforming the CNN (89.62% accuracy, 88.32% F1-score, ROC-AUC of 0.918). We attribute the performance gap not to data or tuning advantages but to fundamental differences in how the two models process language. CNNs rely on local, n-gram-level pattern matching with fixed receptive fields, which limits their ability to resolve long-range dependencies and nuanced clinical expressions (e.g., negation, hedging, comorbidity descriptions). BioBERT, in contrast, combines bidirectional self-attention with domain-specific pretraining to capture contextual semantics, medical terminology, and subtle linguistic markers of pathology. These findings indicate that context-aware, domain-pretrained transformers offer a qualitatively distinct advantage over local-feature extractors such as CNNs in clinical text understanding, supporting their integration into scalable, non-invasive diagnostic support systems.
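For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how accuracy, F1-score, and ROC-AUC can be computed for a binary classifier. It is not the authors' evaluation code. The abstract does not state whether the task is binary or multiclass, nor which F1 averaging was used, so the binary setting, the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions made for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """ROC-AUC as the probability that a random positive example receives
    a higher score than a random negative one (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.6, 0.1, 0.7, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

if __name__ == "__main__":
    print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
    print(f"F1-score = {f1_score(y_true, y_pred):.4f}")
    print(f"ROC-AUC  = {roc_auc(y_true, scores):.4f}")
```

Accuracy and F1 depend on the chosen decision threshold, whereas ROC-AUC is computed from the raw scores and is threshold-independent, which is one reason the three are commonly reported together.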

DOI: https://doi.org/10.5281/zenodo.20234225