In this talk, Joanna Materzyńska introduces a Python-based deep learning gesture recognition model. The model is deployed on an embedded system, works in real time, and can recognize 25 different hand gestures from a simple webcam stream.
In contrast to traditional vision-based gesture controllers like the Microsoft Kinect, our system requires no depth information. Developing such an architecture is a complex process that requires careful consideration at each step.
Joanna Materzyńska guides you through our process and shares the lessons we learned. This includes: our large-scale crowd-acting operation to collect over 150,000 short video clips, our process for deciding which deep learning framework to use, the development of a network architecture that classifies video clips solely from RGB input frames, the iterations necessary to make the neural network run in real time on embedded devices, and lastly, the discovery and development of playful gesture-based applications.