The mid-sagittal plane is defined by three points: the point directly in front of you, the point directly behind you, and the point directly overhead. A source of sound in this plane is symmetrically located with respect to your two ears, so the sound waves arriving at your two ears ought to be identical. No matter where the source is in this plane, there ought to be no binaural difference. Nevertheless, you can localize a source of sound that lies in this plane; for instance, you can tell the difference between front and back.

Peter Zhang and Yongfang Zhu set up the VRX experiment in the anechoic room at Michigan State University.

Years of psychoacoustical research have shown that you successfully localize sounds in the mid-sagittal plane because sounds from different directions are filtered differently as they are diffracted around parts of your anatomy, particularly your outer ears. Because your outer ears are relatively small, only sounds with rather short wavelengths are strongly affected by this diffraction. Therefore, sound localization in the mid-sagittal plane is thought to be primarily a high-frequency phenomenon, and also a very delicate one. But of course, your anatomy is not perfectly symmetrical. And you are a lot more sensitive to small differences between the signals at your two ears than to the spectral differences attributed to anatomical filtering. Is it possible that mid-sagittal plane localization depends partly on binaural differences?
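To make the wavelength argument concrete, here is a small sketch that converts frequency to wavelength using the standard relation wavelength = speed of sound / frequency. The speed of sound (343 m/s, roughly room temperature) and the list of frequencies are assumptions chosen for illustration; they are not values taken from the experiment. The point is only that wavelengths shrink to a few centimeters, comparable to the size of an outer ear, at the upper end of the audible range.

```python
# Assumed value: speed of sound in air at roughly room temperature.
SPEED_OF_SOUND_M_PER_S = 343.0


def wavelength_mm(frequency_hz: float) -> float:
    """Return the wavelength in millimeters of a tone at the given frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 * SPEED_OF_SOUND_M_PER_S / frequency_hz


if __name__ == "__main__":
    # Illustrative frequencies only, spanning low to high.
    for f in (200, 1000, 4000, 8000, 16000):
        print(f"{f:>6} Hz -> {wavelength_mm(f):8.1f} mm")
```

Under these assumptions a 200 Hz tone has a wavelength of more than a meter and a half, while a 16,000 Hz tone has a wavelength of only about 21 millimeters.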

This line of thought raises a host of questions: Is it possible to localize sources in this plane based on the signal at only one ear? Given conflicting information at the two ears, do listeners pay more attention to the spectrum at one ear or the other? Is mid-sagittal plane localization exclusively a high-frequency effect, or are there important low-frequency differences caused by diffraction from larger parts of the anatomy?

These questions are addressed by the VRX (Extreme Virtual Reality) experiment. The experiment uses transaural synthesis (cross-talk cancellation) to control the signals in the listener's two ears precisely enough to simulate a source in front of or behind the listener. What is extreme about the experiment is that the synthesis needs to work at the high frequencies of mid-sagittal localization, as high as 16,000 Hz. This requirement sets an unprecedented standard for precision.
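The idea behind cross-talk cancellation can be sketched at a single frequency. This is a minimal illustration under stated assumptions, not the procedure used in the VRX experiment: it assumes two loudspeakers and a hypothetical 2x2 matrix of complex transfer functions from each loudspeaker to each ear canal. The matrix entries, the target ear signals, and all function names are made up for the example. If the ear signals are e = H s, then driving the loudspeakers with s = H⁻¹ d produces the desired ear signals d.

```python
from typing import Tuple

Matrix2 = Tuple[Tuple[complex, complex], Tuple[complex, complex]]
Vector2 = Tuple[complex, complex]


def invert_2x2(h: Matrix2) -> Matrix2:
    """Invert a 2x2 complex matrix; fail if it is (nearly) singular."""
    (a, b), (c, d) = h
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is singular at this frequency")
    return ((d / det, -b / det), (-c / det, a / det))


def apply(m: Matrix2, v: Vector2) -> Vector2:
    """Multiply a 2x2 matrix by a 2-element vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def speaker_signals(h: Matrix2, desired_at_ears: Vector2) -> Vector2:
    """Loudspeaker drive signals that yield the desired ear signals."""
    return apply(invert_2x2(h), desired_at_ears)


if __name__ == "__main__":
    # Hypothetical transfer matrix: rows are ears (left, right),
    # columns are loudspeakers. Off-diagonal terms are the cross-talk.
    h = ((1.0 + 0.0j, 0.4 - 0.2j),
         (0.4 - 0.2j, 1.0 + 0.0j))
    # Hypothetical target signals at the left and right ear.
    desired = (1.0 + 0.0j, 0.5 + 0.5j)

    s = speaker_signals(h, desired)
    reproduced = apply(h, s)  # what the ears would actually receive
    print("speaker signals:", s)
    print("reproduced at ears:", reproduced)
```

In a broadband synthesis this inversion would have to be repeated at every frequency of interest, which is one reason small errors in the transfer functions matter so much at high frequencies.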

Using tiny microphones in the listener's ear canals, Peter Zhang, a graduate student in the Michigan State University Department of Physics and Astronomy, has developed a transaural synthesis technique that precisely simulates sources in front or in back from 200 to 16,000 Hz. Although the listener and the sound sources rest on a wobbly wire grid in an anechoic room, the technique needs to achieve a geometrical precision of better than 1 millimeter. A critical step in the process attempts to train listeners to distinguish between the real source and the synthesized source. After extensive training, listeners still cannot tell the difference. Once this critical test is passed, the experimenter is ready to begin the experiment itself: looking for binaural aspects of sound localization in the mid-sagittal plane.
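A rough calculation suggests why millimeter precision matters at these frequencies. The sketch below estimates the phase shift produced when a sound path changes in length by a small amount, using phase = 360° × frequency × displacement / speed of sound. The speed of sound (343 m/s) and the function name are assumptions for illustration, and this simple path-length model is not the error analysis used in the experiment.

```python
# Assumed value: speed of sound in air at roughly room temperature.
SPEED_OF_SOUND_M_PER_S = 343.0


def phase_error_degrees(frequency_hz: float, displacement_m: float) -> float:
    """Phase shift, in degrees, caused by a change in path length."""
    return 360.0 * frequency_hz * displacement_m / SPEED_OF_SOUND_M_PER_S


if __name__ == "__main__":
    one_millimeter = 0.001
    # The two ends of the frequency range quoted in the text.
    for f in (200, 16000):
        err = phase_error_degrees(f, one_millimeter)
        print(f"{f:>6} Hz: {err:.2f} degrees per millimeter")
```

Under these assumptions a 1 millimeter shift is negligible at 200 Hz (about a fifth of a degree) but amounts to roughly 17 degrees of phase at 16,000 Hz, enough to spoil a cancellation that depends on signals arriving with the right phase.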