The Challenge
Live events such as the IPL impose unusual constraints: high footfall, variable lighting, no opportunity for per-user calibration, and no tolerance for latency, errors, or downtime.
The brief called for something that 'just works' the moment any user walks up.
The Solution
The solution is a real-time computer vision pipeline running on edge GPU hardware that recognizes user gestures and translates them into game inputs with sub-100ms latency.
High-speed cameras feed live video into a custom processing pipeline. Computer vision models run on GPU-accelerated edge devices, eliminating cloud round-trip latency.
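To make the sub-100ms target concrete, here is a minimal sketch of a per-frame latency budget check. The stage names and the stand-in functions are hypothetical placeholders, not the deployed pipeline, and the worst-case formula (one full frame interval plus the sum of stage times) is an assumption chosen for illustration.

```python
import time

LATENCY_BUDGET_MS = 100.0  # the sub-100ms target stated in the text


def frame_interval_ms(fps: float) -> float:
    """Time between consecutive frames at a given capture rate."""
    return 1000.0 / fps


def run_stages(frame, stages):
    """Run each (name, fn) stage in order and record its duration in ms."""
    timings = {}
    data = frame
    for name, fn in stages:
        start = time.perf_counter()
        data = fn(data)
        timings[name] = (time.perf_counter() - start) * 1000.0
    return data, timings


def within_budget(timings, capture_fps, budget_ms=LATENCY_BUDGET_MS):
    """Assumed worst case: one full frame interval of capture delay plus all stages."""
    total = frame_interval_ms(capture_fps) + sum(timings.values())
    return total <= budget_ms, total


# Hypothetical stand-in stages; real ones would wrap the camera SDK and the models.
stages = [
    ("preprocess", lambda f: f),
    ("pose_model", lambda f: {"keypoints": f}),
    ("gesture_to_input", lambda r: "JUMP"),
]

result, timings = run_stages([0.0, 1.0], stages)
ok, total_ms = within_budget(timings, capture_fps=60)
print(result, ok, round(total_ms, 1))
```

At 60 FPS the capture interval alone is about 16.7 ms, which leaves roughly 83 ms for everything else. A cloud round trip would take up a large share of that, which is the motivation for running the models at the edge.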
System Architecture
- High-speed camera input (60+ FPS)
- GPU-accelerated edge processing unit
- Computer vision models for gesture and skeletal tracking
- Real-time game engine integration
- Optimized for variable lighting and crowd conditions
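One common way to connect a fast camera to a slower inference stage without latency building up is a "latest frame wins" buffer: if a new frame arrives before the previous one was processed, the older frame is dropped. The sketch below illustrates that idea only. It is an assumption about how such a hand-off could work, not a description of the deployed system, and `LatestFrameBuffer`, `process`, `infer`, and `emit` are hypothetical names.

```python
from collections import deque


class LatestFrameBuffer:
    """Holds only the newest frame; older unprocessed frames are discarded."""

    def __init__(self):
        self._slot = deque(maxlen=1)
        self.dropped = 0

    def push(self, frame):
        if self._slot:
            self.dropped += 1  # the previous frame was never consumed
        self._slot.append(frame)  # maxlen=1 evicts the stale frame

    def pop(self):
        """Return the newest frame, or None if nothing is waiting."""
        return self._slot.pop() if self._slot else None


def process(buffer, infer, emit):
    """Run inference on the newest frame and forward the result to the game."""
    frame = buffer.pop()
    if frame is None:
        return False
    emit(infer(frame))
    return True


buf = LatestFrameBuffer()
events = []
for frame_id in (1, 2, 3):  # three frames arrive before inference is free
    buf.push(frame_id)
process(buf, infer=lambda f: f"gesture@{f}", emit=events.append)
print(events, buf.dropped)
```

The trade-off is deliberate: skipping a stale frame costs a little temporal resolution, while queueing it would add delay to every later input.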
Key Capabilities
- Real-time gesture recognition (sub-100ms latency)
- Multi-user simultaneous interaction
- Robust to variable lighting / event conditions
- Zero-calibration onboarding for users
- Custom game logic integration
- Scalable to multiple stations per venue
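Zero-calibration and multi-user operation can both be illustrated with body-relative thresholds: measuring a gesture in units of the user's own shoulder width makes the same rule apply to a tall adult close to the camera and a child further back. The following is a minimal sketch under that assumption. The `Skeleton` layout, the gesture names, and the 0.5 threshold are all hypothetical and are not taken from the deployed system.

```python
from dataclasses import dataclass


@dataclass
class Skeleton:
    """2D keypoints in image pixels (y grows downward). Hypothetical layout."""
    user_id: int
    left_shoulder: tuple
    right_shoulder: tuple
    left_wrist: tuple
    right_wrist: tuple


def classify(s: Skeleton, raise_ratio: float = 0.5):
    """Return a game input, or None, using body-relative thresholds."""
    shoulder_width = abs(s.right_shoulder[0] - s.left_shoulder[0])
    if shoulder_width == 0:
        return None  # degenerate detection; skip this frame
    shoulder_y = (s.left_shoulder[1] + s.right_shoulder[1]) / 2
    # Height of each wrist above the shoulder line, in shoulder widths.
    left_up = (shoulder_y - s.left_wrist[1]) / shoulder_width
    right_up = (shoulder_y - s.right_wrist[1]) / shoulder_width
    if left_up > raise_ratio and right_up > raise_ratio:
        return "JUMP"
    if right_up > raise_ratio:
        return "SWING_RIGHT"
    if left_up > raise_ratio:
        return "SWING_LEFT"
    return None


def inputs_for_frame(skeletons):
    """Map every tracked user in one frame to an input (or None)."""
    return {s.user_id: classify(s) for s in skeletons}


# Two users at different distances from the camera, tracked in the same frame.
near = Skeleton(1, (400, 300), (600, 300), (380, 150), (620, 150))
far = Skeleton(2, (100, 200), (140, 200), (95, 230), (145, 170))
print(inputs_for_frame([near, far]))
```

Because every threshold is a ratio, no per-user setup step is needed, and each tracked skeleton is classified independently of the others.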
Impact
- Successfully deployed at Tata IPL, one of the most-watched live events in the world
- Operated under high-footfall conditions without performance degradation
- Eliminated touch-based interaction (hygiene + durability advantage)
- Repeat deployments for KKR, TCS, and entertainment companies

