Time-Offset Conversations on a Life-Sized Automultiscopic Projector Array

Andrew Jones    Koki Nagano    Jay Busch    Xueming Yu    Hsuan-Yueh Peng    Joseph Barreto    Oleg Alexander    Mark Bolas    Paul Debevec   
USC Institute for Creative Technologies, USA


Figure 1: (left) The anisotropic screen scatters light from each projector into a vertical stripe. The individual stripes can be seen if we reduce the angular density of projectors. Each vertical stripe contains pixels from a different projector.
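The stripe geometry in Figure 1 follows from the screen being (approximately) specular in the horizontal direction: at each screen column, a viewer sees light from whichever projector lies on the extension of the viewing ray behind the screen. The following is a minimal geometric sketch of that assignment under simplifying assumptions of our own (a flat screen at z = 0, projectors in a plane at z = -depth, viewer at z > 0); the coordinates and projector layout are hypothetical illustration, not the paper's calibration.

```python
def hit_x(viewer_x, viewer_z, screen_x, depth):
    """Extend the viewer-to-screen ray behind the screen to z = -depth.

    Assumes the screen lies at z = 0 and viewer_z > 0.
    """
    t = 1.0 + depth / viewer_z          # t = 1 at the screen, t > 1 behind it
    return viewer_x + t * (screen_x - viewer_x)

def visible_projector(viewer_x, viewer_z, screen_x, projector_xs, depth):
    """Index of the projector horizontally closest to the extended ray,
    i.e. the projector whose vertical stripe the viewer sees at screen_x."""
    x = hit_x(viewer_x, viewer_z, screen_x, depth)
    return min(range(len(projector_xs)),
               key=lambda i: abs(projector_xs[i] - x))
```

With a denser projector array, the stripe seen at adjacent screen columns changes projector more frequently, which is why the stripes merge into a continuous image at full angular density.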

 


Abstract:

We present a system for creating and displaying interactive life-sized 3D digital humans based on pre-recorded interviews. We use 30 cameras and an extensive list of questions to record a large set of video responses. Users access videos through a natural conversation interface that mimics face-to-face interaction. Recordings of answers, listening and idle behaviors are linked together to create a persistent visual image of the person throughout the interaction. The interview subjects are rendered using flowed light fields and shown life-size on a special rear-projection screen with an array of 216 video projectors. The display allows multiple users to see different 3D perspectives of the subject in proper relation to their viewpoints, without the need for stereo glasses. The display is effective for interactive conversations since it provides 3D cues such as eye gaze and spatial hand gestures.



Introduction:

What would it be like if you could meet someone you admire, such as your favorite artist, scientist, or even a world leader, and engage in an intimate one-on-one conversation? Face-to-face interaction remains one of the most compelling forms of communication. Unfortunately, in many cases a particular subject may not be available for live conversation. Speakers are both physically and logistically limited in how many people they can personally interact with. Yet the ability to have conversations with important historical figures could have a wide range of applications, from entertainment to education.

Traditional video recording and playback allows speakers to communicate with a broader audience, but at the cost of interactivity. The conversation becomes a one-sided, passive viewing experience once the narrative timeline is fixed in the editing process. Particularly with first-person narratives, it can be especially compelling to look the speaker in the eye, ask questions, and make a personal connection with the narrator and their story. Research has shown that people retain more information through active discussion than through a passive lecture.



Results:

The first full application of this technology was to preserve the experience of in-person interactions with Holocaust survivors. Currently, many survivors visit museums and classrooms to educate, connect with, and inspire students. There is now an urgency to record these interactive narratives for the few remaining survivors before these personal encounters are no longer possible. Through 3D recording, display, and interaction, we seek to maintain a sense of intimacy and presence, and to remain relevant to future generations.

Our first subject was Pinchas Gutter. Mr. Gutter was born in Poland in 1932, lived in the Warsaw ghetto, and survived six concentration camps before being liberated by the Russians in 1945. The interview script was based on the top 500 questions typically asked of Holocaust survivors, along with questions catered to his particular life story. The full dataset includes 1897 questions totaling 18 hours of dialog. These questions are linked to 10492 training questions, providing enough variation to simulate spontaneous and informative conversations. The interactive system was first demonstrated on an 80-inch 2D video screen at the Illinois Holocaust Museum and Education Center [26]. A user study based on the 2D playback found that interactive video inspired students to help others, learn about genocide, and feel they could make a difference. Several students noted that the experience felt like a video teleconference with a live person.
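The mapping from 10492 training questions to 1897 recorded responses suggests a retrieval-style classifier: a spoken question is matched against paraphrase variants of each scripted question, and the best-matching response clip is played. As a hedged illustration only (the paper does not detail the actual matcher here, and all names below are ours), a minimal bag-of-words cosine-similarity matcher looks like this:

```python
import math
from collections import Counter

def tokenize(text):
    """Crude whitespace tokenizer; a real system would normalize further."""
    return text.lower().split()

def cosine(a, b):
    """Cosine similarity between two Counter term-frequency vectors."""
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def best_response(user_question, training_pairs):
    """training_pairs: list of (training_question, response_clip_id).
    Returns the clip id whose training question is most similar."""
    q = Counter(tokenize(user_question))
    scored = [(cosine(q, Counter(tokenize(t))), clip)
              for t, clip in training_pairs]
    return max(scored, key=lambda s: s[0])[1]
```

Because each recorded answer is linked to many paraphrased training questions, even a simple matcher of this kind can tolerate substantial variation in how users phrase a question.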

 



Figure 2: (left) View generated using bilinear interpolation exhibits aliasing. (center) View generated using optical flow interpolation has sharper edges and less aliasing. (right) Closeups of face.
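The optical-flow interpolation of Figure 2 can be sketched as warping both source views toward the target view and cross-fading, rather than blending the unwarped images directly (which produces the ghosting and aliasing of bilinear interpolation). The sketch below uses nearest-neighbor resampling for brevity; the function name and flow convention are ours, not the paper's.

```python
import numpy as np

def flow_interpolate(img0, img1, flow01, t):
    """Synthesize an intermediate view at fraction t in [0, 1] between
    img0 and img1. flow01[y, x] = (dy, dx) is the motion of the pixel
    at (y, x) from img0 to img1. Both warps use backward sampling."""
    h, w = img0.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # Sample img0 at positions displaced backward by t * flow.
    y0 = np.clip(ys - t * flow01[..., 0], 0, h - 1)
    x0 = np.clip(xs - t * flow01[..., 1], 0, w - 1)
    warped0 = img0[y0.round().astype(int), x0.round().astype(int)]
    # Sample img1 at positions displaced forward by (1 - t) * flow.
    y1 = np.clip(ys + (1 - t) * flow01[..., 0], 0, h - 1)
    x1 = np.clip(xs + (1 - t) * flow01[..., 1], 0, w - 1)
    warped1 = img1[y1.round().astype(int), x1.round().astype(int)]
    # Cross-fade the two warped views.
    return (1 - t) * warped0 + t * warped1
```

Because both inputs are warped so that corresponding features align before blending, edges stay sharp at intermediate viewpoints instead of doubling.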


Figure 3: Stereo photograph of subjects on the display from three positions, left-right reversed for cross-fused stereo viewing.



Conclusion:

Simulating natural human interaction is a long-standing problem in computer science. Our system imitates conversations with real human subjects by selecting from a large database of prerecorded 3D video statements. The interface is intuitive, responding to ordinary spoken questions. We further increase the sense of presence by playing back each video clip in 3D on a dense projector array. We envisage that this system could be used to document a wide range of subjects, such as scientists, politicians, or actors, with applications in education and entertainment. We are working to generalize the interview framework to other domains where less prior knowledge exists for each subject. 3D displays such as ours should become increasingly practical in the years to come as the core graphics and image-projection components decrease in price and increase in capability. Our user interaction and rendering algorithms could also be adapted to other types of 3D displays. Our hope is that this technology will provide a new way for people to communicate with each other and with the past.



