



In a new research paper, Google details the technology behind the impressive Project Starline demo from this year’s I/O conference. Project Starline is essentially a 3D video chat booth that aims to replace a one-on-one 2D video conference call with the experience of actually sitting in front of another person.

It sounds simple, but Google’s research paper highlights how challenging it is to trick your brain into thinking a real person is sitting just a few feet away. Obviously, the image needs to be high resolution and free of distracting artifacts, but it also has to look correct from the viewer’s position within the booth. Audio is another challenge, because the system needs to make a person’s words sound like they are actually coming out of their mouth. And then there is the small matter of eye contact.

Ultimately, the hope is that Project Starline can give users a sense of presence similar to virtual or augmented reality, without requiring them to wear bulky headsets or trackers.

Display unit and its various tracking hardware. Image: Google

The paper details just how much hardware is needed to start solving these problems. The system is built around a large 65-inch 8K panel that runs at 60Hz. Around it, Google’s engineers have arranged three capture pods that can record both color imagery and depth data. The system also includes four additional tracking cameras, four microphones, two speakers, and an infrared projector. In total, color images from four viewpoints and three depth maps are captured, for a total of seven video streams. Audio is captured at 44.1kHz and encoded at 256Kbps.

Obviously, all this hardware produces a lot of data that needs to be transmitted. According to Google, the transmission bandwidth ranges from 30Mbps to 100Mbps, depending on how much texture detail is in the user’s clothing and how much they gesture. That is far more than a standard Zoom call requires, but it is not beyond what a typical office in a major metropolitan area can handle. Project Starline is equipped with four high-end Nvidia graphics cards (two Quadro RTX 6000 cards and two Titan RTXs) to encode and decode all this data. End-to-end latency is reportedly 105.8 ms on average.
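To put those bandwidth figures in perspective, here is a rough back-of-the-envelope sketch in Python. It is not taken from Google's paper: the function name is hypothetical, it assumes decimal units (1 Mbps = 1,000,000 bits per second, so 256Kbps = 0.256Mbps), and it assumes the audio stream is counted inside the 30Mbps to 100Mbps range, which the article does not confirm.

```python
def megabytes_transferred(rate_mbps: float, seconds: float) -> float:
    """Convert a sustained bit rate in Mbps into megabytes sent over a duration."""
    return rate_mbps * seconds / 8  # 8 bits per byte


# Figures quoted in the article; the decimal-unit conversion is an assumption.
LOW_MBPS = 30.0
HIGH_MBPS = 100.0
AUDIO_MBPS = 0.256  # 256Kbps audio stream

for label, rate in (("low end", LOW_MBPS), ("high end", HIGH_MBPS)):
    per_minute_mb = megabytes_transferred(rate, 60)
    # Assumes audio is part of the quoted total, which the article does not state.
    audio_share_pct = AUDIO_MBPS / rate * 100
    print(f"{label}: {per_minute_mb:.0f} MB per minute, "
          f"audio is about {audio_share_pct:.2f}% of the stream")
```

Under these assumptions, a call sends roughly 225 MB to 750 MB of data per minute, and audio makes up well under 1 percent of the total. Almost all of the bandwidth goes to the seven video streams.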

The system consists of a backlight unit and a display unit. Image: Google

According to Google’s research paper, employees who have used Starline at the three sites where it is installed rated it better than traditional video conferencing for creating a sense of presence and personal connection, and for helping with attentiveness and gauging reactions. According to the company, 117 participants held a total of 308 meetings in the telepresence booths over a nine-month period, with an average meeting time of just over 35 minutes.

It all sounds very promising, but there is still no indication of when, or whether, the system might be commercialized. There is also little information on how much Starline’s extensive array of hardware actually costs (although Table 4 in the research paper lists the tracking and display hardware used, if you want to do the sums yourself). For now, Google says it is expanding the availability of Project Starline to more Google offices across the United States.

