Big Data Analytics For Predictive Healthcare And Early Disease Detection

3 Dec

Authors: Dr. C.K. Gomathy, Thulasi Ram. S, Nandagopal. S

Abstract: Wearable sensors, Electronic Health Records (EHRs), and Internet of Medical Things (IoMT) devices have sharply increased the volume and velocity of healthcare data. Early disease prediction, and the improvement in clinical outcomes that follows from it, depends on analyzing these massive, heterogeneous datasets in near real time. This study argues that Big Data Analytics (BDA), built on scalable, distributed computing architectures, provides the Artificial Intelligence (AI)-supported mechanisms needed to extract timely, actionable clinical insights. It investigates the use of Big Data frameworks, including Hadoop and Apache Spark, in a cloud-based environment for automated disease prediction and risk assessment. A predictive model for the early detection of cardiovascular disorders was developed with Spark MLlib using Random Forest (RF) and Gradient-Boosted Trees (GBT) algorithms. In the experiments, the BDA-driven predictive system improved diagnostic accuracy by 24%, reduced processing time by approximately 80%, and used resources more efficiently than conventional analytical methods. The study concludes that BDA is instrumental to intelligent healthcare decision-making and supports the shift toward personalized medicine and proactive clinical interventions.
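
The abstract names gradient boosting over decision trees as one of the two algorithms but gives no implementation detail. As a rough illustration of the boosting idea only, the following is a minimal single-machine sketch in plain Python. It is not the authors' Spark MLlib pipeline and makes several assumptions that do not come from the study: the feature names (`age`, `systolic_bp`), the eight toy records (invented for illustration, not study data), the use of depth-1 trees ("stumps"), the log-loss gradient, and the fixed learning rate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Fit a depth-1 regression tree: pick the feature/threshold split
    that minimizes squared error against the current residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]  # (feature index, threshold, left value, right value)


def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def fit_gbt(X, y, n_rounds=20, learning_rate=0.5):
    """Gradient boosting for binary labels: each round fits a stump to the
    negative gradient of the log-loss, which is (label - predicted probability)."""
    positive_rate = sum(y) / len(y)
    f0 = math.log(positive_rate / (1.0 - positive_rate))  # initial log-odds
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return f0, learning_rate, stumps


def predict_proba(model, row):
    f0, learning_rate, stumps = model
    score = f0 + learning_rate * sum(stump_predict(s, row) for s in stumps)
    return sigmoid(score)


# Hypothetical toy records: (age, systolic_bp) -> 1 if disorder present.
# These values are made up for illustration and are NOT from the study.
X = [(35, 118), (42, 125), (50, 130), (38, 142),
     (61, 150), (58, 165), (67, 138), (72, 160)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_gbt(X, y)
predictions = [1 if predict_proba(model, row) >= 0.5 else 0 for row in X]
```

On real data the same idea would be run through a distributed library such as Spark MLlib, with held-out data for evaluation; the sketch trains and predicts on the same eight rows only to keep the mechanics visible.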