Topological Data Analysis of Hindi Literary Styles: A Persistent Homology Framework for Author Identification

8 Jun

Authors: Shainaj Khan

Abstract: We introduce a new methodology for Hindi literary stylometry based on Topological Data Analysis (TDA). Unlike traditional frequency-based or neural approaches, TDA captures the shape of a text’s stylistic features across multiple scales. We encode each Hindi sentence as a point in a high-dimensional feature space, construct Vietoris–Rips complexes over the resulting point cloud, and compute persistent homology barcodes that serve as unique topological signatures of an author’s style. This paper presents the first application of persistent homology to Hindi prose or poetry. The framework can be implemented entirely with standard TDA libraries and requires no prior training data. We demonstrate the concept on synthetic Hindi text samples. All definitions, algorithms, and the evaluation metric are original.
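As a rough illustration of the pipeline summarized in the abstract, the sketch below computes only the degree-0 persistent homology barcode of a Vietoris–Rips filtration over a small point cloud. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `h0_barcode`, the two-dimensional feature vectors, and the toy values are all hypothetical, and the filtration scale is taken to be the pairwise Euclidean distance at which an edge appears. Higher-degree barcodes would normally be computed with a standard TDA library rather than by hand.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Degree-0 persistence intervals (birth, death) of a Vietoris-Rips filtration.

    Assumption: an edge enters the filtration at the Euclidean distance
    between its endpoints, so a component dies at the length of the edge
    that first merges it into another component.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        # Find the component representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by the scale at which they appear.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj          # two components merge at this scale
            bars.append((0.0, dist)) # one of them dies here
    bars.append((0.0, math.inf))     # the last component never dies
    return bars


# Hypothetical per-sentence feature vectors (invented values, for illustration only).
toy_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
bars = h0_barcode(toy_points)
for birth, death in bars:
    print(f"[{birth}, {death})")
```

In this toy cloud the long finite bar reflects the gap between two clusters of points; in the stylometric setting, bars of this kind would be read as coarse structure in an author's sentence-level features.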

DOI: http://doi.org/10.5281/zenodo.20584182