LIP AND SIGN ASSISTANT (LISA) USING HYBRID ARCHITECTURE

24 Apr

Authors: Rohith G., Praveen R., Sandhya R., Jaraline

Abstract: This project, titled LISA (Lip and Sign Assistant), presents a full-stack solution for the real-time, speaker-independent translation of lip movements and hand gestures into text and audio. Video is acquired through a webcam-based hardware setup that provides a controlled environment for data capture. The system implements a six-stage methodology, beginning with Digital Signal Processing (DSP) for efficient video signal conditioning and feature engineering: raw frames are smoothed with a 2D Gaussian filter and normalized in the YCbCr color space before spatial and temporal features, including the Mouth Aspect Ratio (MAR) and Hu Moments, are extracted. These compact features form a low-dimensional input to the Machine Learning classification stage, which uses a Convolutional Neural Network (CNN) for sequence recognition. The output is delivered through a complete Application Layer consisting of a Mobile App for real-time text/audio feedback and a secure Web Portal for medical record management and progress tracking via User ID access. Tested on a 20-sign vocabulary, the system achieved an overall CNN classification accuracy of 92.3%, demonstrating the effectiveness of the computationally efficient hybrid DSP-ML architecture. This work validates a scalable, accessible solution for bridging communication gaps for the deaf and mute community.
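To illustrate the DSP and feature-engineering stages summarized above, the following is a minimal Python sketch using OpenCV and NumPy. The library calls are standard, but the kernel size, the lip-landmark index layout used for the MAR, and the log scaling of the Hu moments are illustrative assumptions, not the authors' exact implementation.

```python
import cv2
import numpy as np

def preprocess_frame(frame_bgr):
    """Condition a raw webcam frame: 2D Gaussian smoothing, then YCbCr
    conversion with the luma channel scaled to [0, 1]."""
    smoothed = cv2.GaussianBlur(frame_bgr, (5, 5), sigmaX=1.0)   # 2D Gaussian filter (assumed 5x5 kernel)
    ycrcb = cv2.cvtColor(smoothed, cv2.COLOR_BGR2YCrCb)          # OpenCV stores this space as YCrCb
    ycrcb = ycrcb.astype(np.float32)
    ycrcb[..., 0] /= 255.0                                       # normalize the luma (Y) channel
    return ycrcb

def mouth_aspect_ratio(lip_points):
    """MAR = mean vertical lip opening / horizontal lip width.
    `lip_points` is an (N, 2) array of mouth landmarks; the index layout
    (corners at 0 and 4, upper/lower pairs at 2/6 and 3/5) is an assumption."""
    vertical = (np.linalg.norm(lip_points[2] - lip_points[6]) +
                np.linalg.norm(lip_points[3] - lip_points[5])) / 2.0
    horizontal = np.linalg.norm(lip_points[0] - lip_points[4])
    return vertical / horizontal

def hu_moment_features(binary_mask):
    """Seven Hu moments of a segmented hand/mouth region, log-scaled so the
    widely ranged values are comparable in magnitude."""
    moments = cv2.moments(binary_mask)
    hu = cv2.HuMoments(moments).flatten()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-12)
```

For the classification stage, a compact 1D CNN over per-frame feature sequences with a 20-way softmax head is one plausible realization of the architecture described in the abstract; the layer sizes, sequence length, and feature dimension below are assumptions for illustration only.

```python
from tensorflow.keras import layers, models

def build_classifier(seq_len=30, feat_dim=8, num_signs=20):
    """Sequence classifier over low-dimensional per-frame features
    (e.g., MAR plus seven Hu moments) for a 20-sign vocabulary."""
    model = models.Sequential([
        layers.Input(shape=(seq_len, feat_dim)),
        layers.Conv1D(32, 3, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(64, 3, activation="relu"),
        layers.GlobalAveragePooling1D(),
        layers.Dense(num_signs, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```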

DOI: https://doi.org/10.5281/zenodo.19729396