Authors: Dr. Pankaj Malik, Sachin Sethiya, Yarthik Soni, Harshita Kushwah, Jhalak Kavadiya
Abstract: Financial reports are often lengthy, complex, and rich in domain-specific terminology, which makes manual analysis time-consuming. This paper proposes an automated summarization framework that uses Natural Language Processing (NLP) techniques to generate concise, informative summaries of financial documents. The system employs a hybrid approach that combines extractive methods (TF-IDF and TextRank) with abstractive transformer-based models such as BART and PEGASUS to improve contextual understanding and coherence. The proposed model was evaluated on benchmark financial datasets, including annual reports and earnings call transcripts. Experimental results show that the hybrid model outperforms traditional extractive and standalone abstractive approaches, achieving a ROUGE-1 score of 0.52, a ROUGE-2 score of 0.31, and a ROUGE-L score of 0.48. The model also improved information retention by approximately 18% and reduced redundancy by 22% compared to baseline methods. The findings indicate that integrating extractive and abstractive techniques improves summarization quality, enabling faster and more accurate financial analysis. The approach can be applied to investment decision-making, financial auditing, and automated reporting systems.
International Journal of Science, Engineering and Technology
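The extractive stage named in the abstract (TF-IDF sentence scoring) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`split_sentences`, `tokenize`, `tfidf_extract`), the choice to treat each sentence as a "document" for the IDF term, and the naive sentence splitter are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): breaks after ., ! or ? followed by whitespace.
    # Real financial text ("Inc.", "approx. 5%") would need a proper tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_extract(text, k=2):
    """Return the k highest-scoring sentences, kept in original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: number of sentences in which each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Sum of (normalized term frequency) * log(n / df) over the sentence.
        score = sum((c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items())
        scores.append(score)

    # Pick the top-k indices by score, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In a hybrid pipeline, the selected sentences could then be concatenated and passed to an abstractive model such as BART or PEGASUS for rewriting. The abstract does not state how the two stages are connected, so that extract-then-abstract wiring is also an assumption.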