Sign-Language Through Camera Lets Alexa Respond to Gestures

Developer Abhishek Singh has created an app that lets Amazon Alexa respond to sign language, using a camera-based system to identify gestures and interpret them as text and speech.

Speech recognition rarely picks up the speech rhythms of deaf users, and a lack of hearing presents a clear challenge to communicating with voice-based assistants.

Singh’s project offers one potential solution, rigging Amazon’s Alexa to respond in text to American Sign Language (ASL).

“If these devices are to become a central way we interact with our homes or perform tasks, then some thought needs to be given to those who cannot hear or speak,” he says. “Seamless design needs to be inclusive in nature.”

The developer trained an AI using the machine-learning platform TensorFlow, repeatedly gesturing in front of a webcam to teach the system the basics of sign language.
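
As a rough illustration of that training step, here is a minimal sketch of the kind of Keras image classifier that could map webcam frames to signs. The layer sizes, the 128-by-128 frame resolution, and the three-word sign vocabulary are illustrative assumptions, not details of Singh's actual model.

import tensorflow as tf

SIGNS = ["hello", "weather", "music"]  # hypothetical sign vocabulary

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(len(SIGNS), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# frames: (N, 128, 128, 3) array of webcam captures, labels: (N,)
# integer indices into SIGNS, gathered by signing repeatedly in
# front of the camera, as Singh describes.
# model.fit(frames, labels, epochs=10)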

Once the system was able to respond to his hand movements, he connected it to Google’s text-to-speech software to read the corresponding words aloud.
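
The speaking step can be sketched in a few lines. This example uses the gTTS Python package, a thin wrapper around Google's text-to-speech service, as a stand-in for whatever TTS component Singh wired up; the sample phrase is hypothetical.

from gtts import gTTS

# Hypothetical output of the gesture classifier above.
recognized_text = "Alexa, what's the weather?"

# Playing this file aloud is what the Echo "hears".
gTTS(recognized_text).save("query.mp3")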

The Amazon Echo reacts, and the computer automatically transcribes its vocal response into text for the user to read. It is a workaround, with the laptop acting as an interpreter between the user and Alexa.
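
The return leg, capturing Alexa's spoken reply and putting it on screen, could look something like the following sketch. It assumes the SpeechRecognition Python package and its free Google recognizer, rather than anything confirmed about Singh's setup.

import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:           # laptop microphone near the Echo
    audio = recognizer.listen(source)     # record Alexa's vocal response
print(recognizer.recognize_google(audio)) # transcribe and display it as text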

Singh said, “There’s no reason that Amazon Show, or any of the camera and screen based voice assistants, couldn’t build this functionality right in. To me that’s probably the ultimate use case of what this prototype shows.”

There have been a number of previous attempts to use AI and image recognition to translate sign language. Microsoft, for example, trialled its motion-sensing Kinect cameras for the purpose, a project that faded once the Kinect was discontinued in 2017.

Nvidia has also explored ways artificial intelligence could be used to automatically caption videos of sign language users, as has the translation software company KinTrans.

Jeffrey Bigham, an expert in human-computer interaction at Carnegie Mellon University, says Singh’s project is “a great proof of concept” but a system fully capable of recognizing sign language would be hard to design “as it requires both computer vision and language understanding that we don’t yet have.”

“Alexa doesn’t really understand English either, of course,” he adds, noting that voice assistants understand only a relatively small set of template phrases.

Alexa Captioning has previously been available to US owners of the Echo Show and Echo Spot. The company is now bringing the feature to users in Canada, Germany, Japan, India, France, Australia, the UK and New Zealand.
