Linguistic Bias In News Media: Detection Using Sentiment Analysis And Text Mining

12 May

Authors: Avantika, Dr. Dharmbir Yadav

Abstract: News media analysis helps explain how public opinion is formed and how information spreads in contemporary society. With the rapid growth of digital journalism, concerns about political and ideological bias in news articles have grown sharply. Linguistic bias (selective word choice, attitude, sentiment, and framing) often shapes readers' perceptions without altering the factual information. Current approaches to identifying such bias are expensive, subjective, and unsuitable for large data sets. To address this problem, this research developed a system that uses sentiment analysis and text mining to detect linguistic bias in news media, including newspapers, television news broadcasts, and online news. The system automatically analyzes large volumes of news text from a variety of sources and uncovers differences in sentiment and linguistic style between ideologically opposed sources. The framework is a multi-stage pipeline consisting of data ingestion, data pre-processing, feature extraction, sentiment analysis, and machine-learning classification. To identify context-based patterns in the articles, it relies mainly on three linguistic features: term frequency-inverse document frequency (TF-IDF), n-grams, and named entities. Sentiment analysis is performed at both the document and entity level so that the sentiments expressed in the articles can be compared. Naïve Bayes and Support Vector Machine (SVM) models then classify news articles according to their bias orientation.
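As a rough illustration of the TF-IDF and n-gram features named above, the following minimal sketch computes TF-IDF weights over unigrams and bigrams using only the Python standard library. It is not the authors' implementation: the function names (ngrams, tfidf), the whitespace tokenization, the idf variant ln(N / df), and the two example headlines are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs, n_max=2):
    """Compute one TF-IDF weight dictionary per document.

    tf  = term count / number of terms in the document
    idf = ln(N / df), a common textbook variant (an assumption here;
          the paper does not state which variant it uses)
    """
    # Naive lowercase + whitespace tokenization, for illustration only.
    term_lists = [ngrams(d.lower().split(), n_max) for d in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append({
            t: (c / total) * math.log(n_docs / df[t])
            for t, c in counts.items()
        })
    return weights


# Hypothetical headlines on the same topic from two outlets.
docs = [
    "government unveils bold tax reform",
    "government pushes risky tax scheme",
]
weights = tfidf(docs)
```

Terms shared by both headlines ("government", "tax") receive a weight of zero under this idf variant, while outlet-specific wording ("bold", "risky tax") receives a positive weight. That contrast is the kind of signal a bias classifier could draw on.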
To evaluate the system, tests were conducted on a sample of news reports covering similar topics from sources with different ideological leanings. The model's ability to detect biased content was measured using accuracy, precision, recall, and F1-score.
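For readers unfamiliar with the four metrics, the sketch below shows how they are conventionally computed for a two-class problem. The label names ("biased", "neutral"), the function name binary_metrics, and the toy predictions are hypothetical; the paper's actual label set and results are not given here.

```python
def binary_metrics(y_true, y_pred, positive="biased"):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels, invented purely to exercise the function.
y_true = ["biased", "biased", "neutral", "neutral", "biased"]
y_pred = ["biased", "neutral", "neutral", "biased", "biased"]
metrics = binary_metrics(y_true, y_pred)
```

Precision answers "of the articles flagged as biased, how many were," while recall answers "of the biased articles, how many were flagged." F1 is their harmonic mean, which is useful when the classes are imbalanced.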

DOI: https://doi.org/10.5281/zenodo.20132337