During face-to-face conversation, eye contact and gaze direction provide important visual cues that express emotion, attention, and interest. Unfortunately, 2D video teleconferencing does not recreate accurate eye contact, breaking the emotional connection between participants. When a remote participant looks directly into the camera, everyone watching the video stream sees the same image of the participant looking toward them; if the participant looks away from the camera, no one receives the eye gaze. Our one-to-many teleconferencing system uses a novel arrangement of 3D acquisition, transmission, and display technologies to achieve accurate reproduction of gaze direction and eye contact.
The 3D display works by projecting high-speed video onto a rapidly spinning mirror made from brushed aluminum. As the mirror turns, it reflects a different and accurate image to each potential viewer. The size, geometry, and material of the spinning surface have been optimized for the display of a life-sized human face. Its two-sided shape provides two passes of a display surface to each viewer per full rotation, achieving a 30 Hz visual update at 900 rpm. A pair of high-speed DLP projectors project 8,640 1-bit (black or white) frames per second using a specially coded DVI video signal. Instead of rendering a color image, each projector takes a 24-bit color frame of video and displays each bit sequentially as a separate frame. Effectively, the mirror reflects 144 unique views of the scene across a 180-degree field of view with an angular view separation of 1.25 degrees.
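The bit-sequential encoding described above can be illustrated with a minimal sketch: 24 binary frames are packed into one 24-bit color frame and recovered again. The function names and the bit ordering (red channel first, most significant bit first) are assumptions made for illustration; the actual coding of the DVI signal is not specified here.

```python
# Sketch only: pack 24 one-bit frames into a single 24-bit RGB frame.
# Assumption: frame 0 maps to the most significant bit of red, frame 23
# to the least significant bit of blue. The real signal layout may differ.

BITS_PER_COLOR_FRAME = 24


def pack_bitplanes(binary_frames):
    """Combine 24 binary frames (lists of 0/1 pixels) into a list of (r, g, b) pixels."""
    if len(binary_frames) != BITS_PER_COLOR_FRAME:
        raise ValueError("expected exactly 24 binary frames")
    n_pixels = len(binary_frames[0])
    packed = []
    for p in range(n_pixels):
        value = 0
        for frame in binary_frames:
            # Shift earlier frames toward the high bits, append this frame's bit.
            value = (value << 1) | (frame[p] & 1)
        packed.append(((value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF))
    return packed


def unpack_bitplanes(packed):
    """Recover the 24 binary frames from a list of (r, g, b) pixels."""
    frames = [[] for _ in range(BITS_PER_COLOR_FRAME)]
    for r, g, b in packed:
        value = (r << 16) | (g << 8) | b
        for i in range(BITS_PER_COLOR_FRAME):
            frames[i].append((value >> (BITS_PER_COLOR_FRAME - 1 - i)) & 1)
    return frames


# 8,640 binary frames per second correspond to 8640 / 24 color frames per second.
COLOR_FRAMES_PER_SECOND = 8640 // BITS_PER_COLOR_FRAME
```

Under this reading, the 8,640 binary frames per second quoted above amount to 360 24-bit frames per second in total across the two projectors.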
A 2D video feed with a 90-degree field of view allows the remote participant to view the audience interacting with their three-dimensional image on the 3D display. A polarized beam-splitter is used to virtually place the camera close to the position of the eyes of the three-dimensional head. The video from the aligned 3D display camera is transmitted to the remote participant, where it is shown on a geometrically calibrated LCD screen.
To correct the vertical perspective on the 3D display, we use marker-less face detection from OpenCV to track viewers based on the 2D video feed. In this way, the display's horizontal parallax provides binocular stereo with no lag as the viewers move their heads horizontally, while vertical parallax is achieved through tracking.
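As a rough illustration of how a tracked face position could feed a vertical perspective correction, the sketch below converts the center of a detected face rectangle into viewing angles relative to the camera axis. It assumes an ideal pinhole camera with square pixels, a principal point at the image center, and that the 90-degree field of view is horizontal. The function name and parameters are hypothetical, and the detection step itself (performed with OpenCV in the system described) is not shown.

```python
import math


def face_center_to_angles(cx, cy, width, height, hfov_deg=90.0):
    """Return (azimuth, elevation) in degrees for pixel (cx, cy).

    Assumptions: pinhole camera, square pixels, optical axis through the
    image center, and hfov_deg is the horizontal field of view.
    """
    # Focal length in pixels, derived from the horizontal field of view.
    focal_px = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    x = cx - width / 2.0
    y = height / 2.0 - cy  # image rows grow downward, so flip the sign
    azimuth = math.degrees(math.atan2(x, focal_px))
    elevation = math.degrees(math.atan2(y, math.hypot(x, focal_px)))
    return azimuth, elevation
```

An angle alone does not give the viewer's height; a distance estimate (for example, from the apparent size of the face) would also be needed, and that step is left out of this sketch.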
The ICT Graphics Laboratory gratefully acknowledges the hardware support of NVIDIA, which
provided the Quadro FX 5800 cards used in this project to achieve our rendering speeds
of 8,640 frames per second.
SIGGRAPH 2009
3D Teleconferencing Images
How does the display work?
We project high-speed (4,320 frames per second) video of different views of the face onto
a spinning mirror. As the mirror spins, the different views are reflected toward the
corresponding viewpoints around the display. In our teleconferencing system, seventy-two
views of the face, rendered by a PC with a modern graphics card, are projected across a
180-degree field of view. By the time the mirror rotates to reflect images from one eye
position to another, a different view of the scene is projected, so the result appears
to be three-dimensional.
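The answer above implies that each eye receives its own view when the two eyes are farther apart in angle than the spacing between views. A minimal sketch of that arithmetic follows, using the seventy-two views and 180-degree field of view quoted in this answer. The 6.5 cm interpupillary distance, the even spacing of views, and the function names are assumptions made for illustration.

```python
import math

NUM_VIEWS = 72
FIELD_OF_VIEW_DEG = 180.0
VIEW_SPACING_DEG = FIELD_OF_VIEW_DEG / NUM_VIEWS  # 2.5 degrees between views


def view_index(azimuth_deg):
    """Map a viewer azimuth in [-90, 90] degrees to a view index in 0..71.

    Assumes the views are evenly spaced across the field of view.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth outside the 180-degree field of view")
    index = int((azimuth_deg + 90.0) / VIEW_SPACING_DEG)
    return min(index, NUM_VIEWS - 1)  # clamp the +90 degree edge


def eye_separation_deg(distance_m, ipd_m=0.065):
    """Angle between a viewer's two eyes as seen from the display center.

    ipd_m (interpupillary distance) is an assumed typical value.
    """
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))
```

Under these assumptions, a viewer one meter away has eyes about 3.7 degrees apart, more than the 2.5-degree view spacing, so each eye falls in a different view.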
Is this a hologram?
It's similar to the "Holograms" in the Star Wars movies in that it's a three-dimensional
animated image of a person floating in space, and the displayed objects can have the general
appearance of solidity rather than appearing as illuminated volumes of mist. However, no
holographic film or traditional holography is used to achieve the three-dimensional effect.
Could the display be in color?
Yes, we'd just need to modify a three-chip DLP video projector to achieve high-speed
video projection instead of the single-chip projector we've used in our prototype.
We could also use multiple single-chip projectors to achieve the effect. Or, as in
the previous iteration, we could achieve color with a multicolored display surface.
Is this like CNN's "Hologram" they showed during their 2008 presidential election coverage?
No, our system produces a real 3D image you can actually see, whereas CNN's "hologram"
was just a visual effect. To the home viewer, CNN's anchor Wolf Blitzer appeared to
be looking at three-dimensional images of guests Jessica Yellin and Will.i.am,
but this was a video overlay effect done for the home audience with no actual
3D display technology involved. In CNN's studio, Blitzer was actually staring across
empty space toward a standard 2D flat panel television.
What is the spinning surface you're projecting onto?
It's a steep tent shape formed by two 8-by-10-inch panels of brushed aluminum, the
same material found on many kitchen appliances. We use this material since it is
inexpensive and spreads light vertically but not horizontally, which allows people
of any height to see the image formed on the display. It rotates fifteen times per
second, reflecting thirty frames per second of 3D video from the two sides of the tent.
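The rotation figures in this answer can be checked with a short calculation. The sketch below is only arithmetic on the numbers quoted on this page (fifteen rotations per second, two reflective sides, 4,320 projected frames per second); the function names are hypothetical.

```python
def passes_per_second(rev_per_second, sides=2):
    """Number of times a reflective surface sweeps past a viewer each second."""
    return rev_per_second * sides


def frames_per_pass(projector_fps, rev_per_second, sides=2):
    """Binary frames projected during one sweep of one side of the mirror."""
    return projector_fps / passes_per_second(rev_per_second, sides)


# 900 rpm is the same rotation rate expressed per minute.
REV_PER_SECOND = 900 / 60  # 15 rotations per second
```

With fifteen rotations per second and two sides, each viewer is swept thirty times per second, and at 4,320 frames per second the projector emits 144 binary frames during each sweep.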
Could you display a whole body?
Yes, though to display a person life size we would need a larger spinning mirror to
project onto. At the end of our first 3D Display project video, we showed a five-inch-tall
3D image of Bruce (one of our group's researchers) realistically running in place.
Why are you doing this? What is the benefit of a 3D teleconference?
We are hoping to better recreate the natural, rich communication that people experience
in person-to-person conversations, especially the effects of gaze, attention, and eye
contact between a speaker and one or more other people. These effects are not reproduced
in conventional 2D videoconferences: if the speaker looks at the camera, they appear to
be looking at everyone; if they look away, they are looking at no one. With our system,
the speaker can look at any of the people he or she is speaking to, and all participants
see a natural view of the speaker.