PalmPilot: A Touchless Gesture Control System

12 Jun

Authors: Nitin Dhawas, Ritesh Sirpor, Yash Patil, Jamal Siddiqui

Abstract: Recent advances in computer vision and edge-optimized neural inference make it practical to deploy high-fidelity hand-tracking pipelines on commodity hardware, enabling entirely touchless human-computer interaction. This paper presents a vision-based, touchless gesture-controlled operating system interface that converts natural hand movements captured by a standard RGB webcam into OS-level input events. The proposed framework integrates Google's MediaPipe Hands for real-time 21-landmark detection, OpenCV for frame acquisition and preprocessing, a discrete-time Kalman filter for tremor suppression, and PyAutoGUI/pynput for platform-native event dispatch. Nine interaction primitives are supported: cursor movement, left/right click, double-click, drag-and-drop, bidirectional scrolling, three-finger Task View invocation, and pinch-to-zoom. A frame-persistence
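
The tremor-suppression stage named in the abstract can be illustrated with a minimal sketch. The abstract does not state the filter's state model or noise parameters, so the code below assumes the simplest case: an independent one-dimensional Kalman filter per axis with a random-walk (constant-position) model. The class names `ScalarKalman` and `CursorSmoother` and the default variances are hypothetical, chosen only for illustration, and the inputs are assumed to be normalized fingertip coordinates in [0, 1] of the kind a hand-landmark detector reports.

```python
class ScalarKalman:
    """1-D discrete-time Kalman filter, random-walk state model (assumed)."""

    def __init__(self, process_var=1e-3, measurement_var=1e-1):
        self.q = process_var      # how fast the true position may drift
        self.r = measurement_var  # how noisy each landmark reading is
        self.x = None             # current state estimate
        self.p = 1.0              # variance of the estimate

    def update(self, z):
        """Fold one measurement z into the estimate and return it."""
        if self.x is None:
            # First sample: nothing to smooth against yet.
            self.x = z
            return self.x
        # Predict: the state is unchanged, uncertainty grows by q.
        self.p += self.q
        # Correct: blend prediction and measurement by the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k
        return self.x


class CursorSmoother:
    """Applies one independent filter per axis (hypothetical helper)."""

    def __init__(self, process_var=1e-3, measurement_var=1e-1):
        self.fx = ScalarKalman(process_var, measurement_var)
        self.fy = ScalarKalman(process_var, measurement_var)

    def update(self, x, y):
        return self.fx.update(x), self.fy.update(y)


if __name__ == "__main__":
    smoother = CursorSmoother()
    # Simulated hand tremor: the fingertip jitters around (0.5, 0.5).
    for i in range(10):
        jitter = 0.02 if i % 2 == 0 else -0.02
        sx, sy = smoother.update(0.5 + jitter, 0.5 - jitter)
        print(f"raw=({0.5 + jitter:.3f}, {0.5 - jitter:.3f}) "
              f"smoothed=({sx:.3f}, {sy:.3f})")
```

In this sketch, raising `measurement_var` relative to `process_var` smooths more aggressively at the cost of cursor lag, while lowering it makes the cursor more responsive but lets more jitter through. A constant-velocity model would track fast deliberate movements better, but whether the paper uses one is not stated in the abstract.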

DOI: http://doi.org/