Authors: Dr. G Ramasubba Reddy, M Prathap, J. Sunil
Abstract: Natural Language Processing (NLP) has undergone a paradigm shift with the advent of Transformer-based architectures. Traditional RNN and CNN models struggled to capture long-range dependencies, whereas Transformers address this limitation through self-attention mechanisms. This paper presents a comparative study of major Transformer architectures (BERT, RoBERTa, DistilBERT, and GPT-2) with a focus on text classification performance. All models were fine-tuned on a common dataset (IMDb reviews) and evaluated on accuracy, training time, and parameter efficiency. The results highlight the trade-offs between accuracy and computational efficiency, providing guidance on model selection for various NLP applications.