YouTube Transcript Summarization

8 Sep

Authors: Dabhoiya Mohamed Amaan, Abhiket H Rajput, Siddiqui Mohd Sohel, Patel Amit, Salman Mohammedhanif Buddha

Abstract: The rapid growth of online video, spearheaded by YouTube, has created an unprecedented pool of information and learning content. However, the length of most videos is an obstacle for viewers who need to extract relevant details quickly. Manually sifting through timestamps and reading full transcripts is time-consuming and inefficient. To address this problem, we propose a Chrome extension that automatically generates short summaries of YouTube video transcripts using Natural Language Processing (NLP). The extension integrates directly with YouTube: it retrieves the transcript from a video, preprocesses the text to remove redundancies, and applies summarization techniques to produce coherent, meaningful summaries. The system supports extractive summarization using lightweight algorithms such as TextRank and abstractive summarization using recent NLP models accessed through APIs. The summary is user-adjustable and can be copied for immediate use or exported as a text or PDF file. An experimental study shows that the tool reduces transcript length by 60 to 70% while preserving key ideas, saving users time and making long video content more accessible. The system serves as an example of applying NLP to everyday browsing in support of digital learning and knowledge acquisition.
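The extractive path described in the abstract can be illustrated with a minimal TextRank-style sketch. This is not the authors' implementation: the function names, the regex-based sentence splitter, the default keep ratio of 0.35 (chosen only to land near the reported 60 to 70% reduction), and the word-count definition of reduction are all assumptions made for illustration. The similarity measure follows the standard TextRank formulation (shared words normalized by the log of the sentence lengths).

```python
import math
import re


def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase word tokens; no stop-word removal in this sketch."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def similarity(a, b):
    """TextRank sentence similarity: shared words over the sum of log lengths."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    overlap = len(set(a) & set(b))
    return overlap / (math.log(len(a)) + math.log(len(b)))


def textrank_summary(text, ratio=0.35, damping=0.85, iterations=50):
    """Keep the top-ranked fraction of sentences, in their original order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= 1:
        return text.strip()
    tokens = [tokenize(s) for s in sentences]
    # Weighted, undirected sentence graph with no self-loops.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0] * n
    # Fixed number of PageRank-style updates (no convergence check here).
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores
    keep = max(1, round(n * ratio))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]
    return " ".join(sentences[i] for i in sorted(ranked))


def reduction(original, summary):
    """Fraction of words removed (one possible definition; an assumption)."""
    total = len(original.split())
    return 1 - len(summary.split()) / total if total else 0.0
```

Calling `textrank_summary(transcript)` on a cleaned transcript string returns the selected sentences joined in their original order, and `reduction(transcript, summary)` gives the fraction of words removed. The actual reduction depends on sentence lengths and on the chosen `ratio`, so it will not necessarily match the figures reported in the study.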

DOI: https://doi.org/10.5281/zenodo.17076157