GSoC’21 RoboComp project: Sign language recognition
14th Aug 2021
Conclusion:
Throughout this project, three components were published: BodyHandJointsDetector, ImageBaseGestureRecognition, PoseBasedGestureRecognition. I also wrote a testing client for each approach.
- BodyHandJointsDetector: component, client. This component uses the lightweight OpenPose model and the MediaPipe library to extract the body and hand skeleton from an image.
- ImageBasedRecognition: component, client. This component implements a WLASL recognizer. Pretrained models are available for this dataset, so we reuse them without any additional training. The image-based approach uses an I3D model for recognition.
- PoseBasedRecognition: component, client. For pose-based recognition, we reuse Pose-TGCN (a graph neural network). This model takes body and hand joints as input and outputs the gesture classes.
The pretrained recognition models were trained on a dataset recorded at 25 frames per second, so our components are expected to work best at this frame rate. Furthermore, for the body/hand joints detector to work well, the video background should be simple (not too many colors or objects).
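Since the pretrained models expect roughly 25 frames per second, a video recorded at another frame rate can be resampled before it is sent to the components. The following is only a minimal sketch of that idea, not code from the components: the function name `resample_indices` and its parameters are hypothetical, and it simply picks (or repeats) source frames by index.

```python
from typing import List


def resample_indices(num_frames: int, src_fps: float, dst_fps: float = 25.0) -> List[int]:
    """Pick source-frame indices that approximate a dst_fps stream.

    Hypothetical helper for illustration; frames are dropped when
    src_fps > dst_fps and repeated when src_fps < dst_fps.
    """
    if src_fps <= 0 or dst_fps <= 0:
        raise ValueError("frame rates must be positive")
    if num_frames <= 0:
        return []
    # Number of output frames covering the same duration at dst_fps.
    num_out = max(1, round(num_frames * dst_fps / src_fps))
    # Distance between consecutive output frames, in source frames.
    step = src_fps / dst_fps
    # Clamp to the last valid index to guard against rounding at the end.
    return [min(num_frames - 1, int(i * step)) for i in range(num_out)]
```

For example, `resample_indices(len(frames), 50.0)` keeps every second frame of a 50 FPS clip, and the selected frames can then be passed to the joints detector or the recognizer.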
Inference acceleration:
Running inference for PyTorch models directly from Python is usually slow. Therefore, I applied several techniques:
- Use C++ code for post-processing.
- Convert the models from PyTorch format to ONNX format.
- Run the converted ONNX models with NVIDIA® TensorRT.
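To check whether each of these steps actually reduces latency, the inference call can be timed before and after the change. The sketch below is an assumption about how one might do this, not the measurement code used in the project: `average_latency_ms` and its parameters are hypothetical, and `infer` stands for any callable that runs one forward pass (for GPU backends it should block until the result is ready, otherwise the timing is too optimistic).

```python
import time
from statistics import mean
from typing import Any, Callable


def average_latency_ms(infer: Callable[[Any], Any], sample: Any,
                       warmup: int = 3, runs: int = 20) -> float:
    """Return the mean wall-clock time of infer(sample) in milliseconds.

    Hypothetical helper for illustration only.
    """
    # Warm-up calls are not timed: the first calls often include
    # one-off costs such as lazy initialization or memory allocation.
    for _ in range(warmup):
        infer(sample)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)
```

Calling it once with the original PyTorch inference function and once with the accelerated one, on the same input, gives two numbers that can be compared directly.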
Pull Requests:
My commits are in the following pull requests:
- For all components: https://github.com/robocomp/robocomp-robolab/pull/77
- For IDSL files: https://github.com/robocomp/robocomp/pull/343
- For blog: https://github.com/robocomp/web/pull/259, https://github.com/robocomp/web/pull/252, https://github.com/robocomp/web/pull/250, https://github.com/robocomp/web/pull/249, https://github.com/robocomp/web/pull/224.
Future work:
I have finished the three components listed above. However, the accuracy of the pose-based recognizer is still low, and unsupervised models have not yet been applied. Therefore, in the future, we would like to:
- Improve the results of the pose-based approach.
- Apply unsupervised techniques to gesture recognition.
Acknowledgements:
The journey of GSoC 2021 has been really interesting. I learned a lot about open source contribution, the RoboComp library, and, for the first time, the sign language recognition problem. I also faced some challenges, which were quite fun to deal with.
I would like to thank Aditya Aggarwal and Kanva Gupta for patiently helping me with this project.