Abstract:
Numerous applications benefit from sign language recognition, including translation tools, interpreting services, video remote interpreting, human-computer interaction, online hand tracking of human communication in desktop settings, real-time multi-person recognition systems, games, virtual reality settings, robot control, and natural language communication. Multimodal data combines information from different sources, such as video, sensors, and electrocardiograms (ECGs), while emotions refer to the non-verbal cues that accompany language use, such as facial expressions and body posture. Integrating these additional sources of information helps to better understand the user's intent, which improves the performance of a sign language recognition model. To build such a model, a dataset of multimodal data and emotions must be collected. This dataset should be diverse and cover different individual/isolated signs, emotions, and body gestures. The model is designed to integrate multimodal data and emotions, which involves combining different machine and deep learning algorithms adapted to the different types of data. In addition, the model must be trained to recognize the different emotions that accompany sign language. Once trained, the model can be evaluated on a held-out test set to assess its performance, with a further evaluation planned on real data (i.e., with signing people). In this paper, we propose a study on using multimodal machine learning for sign language recognition.
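To illustrate the kind of architecture the abstract describes, the following is a minimal, hypothetical sketch of a late-fusion model in PyTorch: one branch encodes video features and another encodes emotion/posture features, and their embeddings are concatenated before a shared sign classifier. All layer sizes, the class count, and the class name MultimodalSignClassifier are placeholder assumptions, not details taken from the study.

import torch
import torch.nn as nn

class MultimodalSignClassifier(nn.Module):
    def __init__(self, video_dim=512, emotion_dim=64, hidden_dim=128, num_classes=50):
        super().__init__()
        # Branch for visual features (e.g., pooled per-clip CNN features).
        self.video_branch = nn.Sequential(nn.Linear(video_dim, hidden_dim), nn.ReLU())
        # Branch for emotion/body-posture features (e.g., facial landmarks, pose keypoints).
        self.emotion_branch = nn.Sequential(nn.Linear(emotion_dim, hidden_dim), nn.ReLU())
        # Fused representation feeds a shared classifier over isolated signs.
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, video_feats, emotion_feats):
        fused = torch.cat(
            [self.video_branch(video_feats), self.emotion_branch(emotion_feats)], dim=-1
        )
        return self.classifier(fused)

# Example forward pass on random tensors standing in for a batch of 8 samples.
model = MultimodalSignClassifier()
logits = model(torch.randn(8, 512), torch.randn(8, 64))
print(logits.shape)  # torch.Size([8, 50])

This sketch shows only the fusion step; in practice, each branch would be preceded by modality-specific feature extractors (e.g., a video backbone and an emotion recognizer), as outlined in the body of the paper.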