AI-powered sonar on smart glasses tracks gaze and facial expressions

Researchers at Cornell University have developed two technologies that track people's gaze and facial expressions through sonar-like sensing.

The technology is small enough to fit into off-the-shelf smart glasses and virtual reality (VR) or augmented reality (AR) headsets, while consuming significantly less power than similar tools that use cameras.

Both use speakers and microphones mounted on the frames of the glasses to bounce inaudible sound waves off the face and pick up the reflected signals caused by facial and eye movements. One device, GazeTrak, is the first eye-tracking system to rely on acoustic signals. The second, EyeEcho, is the first glasses-based system to continuously and accurately detect facial expressions and recreate them through an avatar in real time.

Ke Li is the principal investigator for the GazeTrak smart eyewear tracking technology.

The devices can run for several hours on a smart glasses battery and for more than a day on a VR headset.

Cheng Zhang, assistant professor of information science in the Cornell Ann S. Bowers College of Computing and Information Science, directs the Smart Computer Interfaces for Future Interactions (SciFi) Lab, which created the new devices.

Ke Li, a doctoral student in information science who led the development of GazeTrak and EyeEcho, said that in a VR environment, detailed facial expressions and eye movements need to be recreated so that users can interact better with one another.

Using audio signals instead of video also reduces privacy concerns, Li said. Many camera-based systems for tracking facial expressions and eye movements exist, both in research and in commercial products such as Vision Pro and Oculus, he said. But not everyone wants cameras on a wearable capturing them and their surroundings all the time.

Li will present “GazeTrak: Exploring Acoustic-based Eye Tracking on a Glass Frame” at the Annual International Conference on Mobile Computing and Networking (MobiCom '24), Sept. 30 to Oct. 4.

“As VR/AR headsets become significantly smaller and eventually become similar to today's smart glasses, privacy concerns related to systems that use video will become increasingly important,” said co-author François Guimbretière, professor of information science in Cornell Bowers CIS and the multicollege Department of Design Tech. Because both technologies are extremely small and power-efficient, they are well suited to lightweight, sleek AR glasses.

For GazeTrak, the researchers positioned one speaker and four microphones around the inside of each eye frame of a pair of glasses to bounce sound waves off the eyeball and the area around the eye and pick up the reflections. The resulting audio signal is fed into a customized deep learning pipeline that uses artificial intelligence to continuously infer the direction of the person's gaze.
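The sensing idea described above can be illustrated with a small sketch. The article does not specify the transmitted signal or the processing steps, so everything here is an assumption made for illustration: a linear frequency sweep from 18 to 21 kHz, a 48 kHz sample rate, a simulated single echo, and cross-correlation to locate that echo. All function names are hypothetical, and the deep learning model that maps such signals to a gaze direction is not shown.

```python
import math

SAMPLE_RATE = 48_000  # Hz; assumed, not taken from the article


def make_chirp(f_start=18_000.0, f_end=21_000.0, duration=0.01,
               sample_rate=SAMPLE_RATE):
    """Linear frequency sweep in a near-inaudible band (assumed values)."""
    n = round(duration * sample_rate)
    sweep_rate = (f_end - f_start) / duration  # Hz per second
    samples = []
    for i in range(n):
        t = i / sample_rate
        phase = 2 * math.pi * (f_start * t + 0.5 * sweep_rate * t * t)
        samples.append(math.sin(phase))
    return samples


def simulate_echo(chirp, delay_samples, attenuation=0.3):
    """Hypothetical microphone signal: a delayed, weakened copy of the chirp."""
    received = [0.0] * (len(chirp) + delay_samples)
    for i, sample in enumerate(chirp):
        received[i + delay_samples] += attenuation * sample
    return received


def echo_profile(received, chirp, max_lag):
    """Cross-correlate the received signal with the chirp for lags 0..max_lag."""
    profile = []
    for lag in range(max_lag + 1):
        acc = 0.0
        for i, sample in enumerate(chirp):
            j = i + lag
            if j < len(received):
                acc += received[j] * sample
        profile.append(acc)
    return profile


def estimate_delay(profile):
    """Return the lag (in samples) with the strongest correlation magnitude."""
    return max(range(len(profile)), key=lambda lag: abs(profile[lag]))


if __name__ == "__main__":
    chirp = make_chirp()
    received = simulate_echo(chirp, delay_samples=12)
    profile = echo_profile(received, chirp, max_lag=40)
    print("estimated echo delay (samples):", estimate_delay(profile))
```

In a real system the reflections would come from many points on a moving surface, so the whole correlation profile, tracked over time, would be a more plausible model input than a single peak. That, too, is an assumption rather than something the article states.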

Although GazeTrak is not yet as accurate as state-of-the-art eye-tracking technology that relies on cameras, the new device is a proof of concept that acoustic signals can also be effective. The researchers believe that with further optimization, they can achieve the same accuracy and reduce the number of speakers and microphones needed.

In EyeEcho's case, one speaker and one microphone are placed next to the hinge of the glasses, pointing downwards to capture the movement of the skin as facial expressions change. The reflected signals are also interpreted using AI.

This technology allows users to make hands-free video calls through their avatars, even in noisy cafes or on the street. While some smart glasses have the ability to recognize faces and distinguish between some specific facial expressions, there are currently none that continuously track facial expressions like EyeEcho.

Li will present this work, “EyeEcho: Continuous and Low-power Facial Expression Tracking on Glasses.”

These two advances have uses beyond enhancing a person's VR experience. GazeTrak could be used with screen readers to read portions of text aloud as people with low vision browse websites.

GazeTrak and EyeEcho may also be useful in diagnosing and monitoring neurodegenerative diseases such as Alzheimer's disease and Parkinson's disease. Patients with these conditions often have abnormal eye movements and less expressive faces, and this type of technology could potentially track the progress of the disease from the comfort of a patient's home.

Several Cornell researchers also contributed to the study, including information science doctoral students Ruidong Zhang, Mose Sakashita and Saif Mahmud; James Chen ’24; Sean Chen ’24; Kenny Liang ’24; and Sicheng Yin, a master's student at the University of Edinburgh.

This research was supported by the National Science Foundation and the IGNITE Innovation Acceleration Program.

Patricia Waldron is a writer for the Cornell Ann S. Bowers College of Computing and Information Science.




