For the detailed description in French: first download the first piece (which includes the Table of Contents and Introduction)
Page created: 10/17/2000. First published: 10/24/2000. Last update: ...
This document was written in French, produced as a
PostScript file, cut into pieces, then zipped (see the following paragraphs).
You can also download it in one piece as a zipped PDF file
(about 3 Mo).
Front page, Abstracts, Table Of Contents, and Introduction (pp 1-12) [46 Ko].
Part I : Theoretical Foundations and Preliminary Analysis.
Chapter 1 (pp 13-76) [967 Ko]: Acoustics,
Spatial Perception, and a thorough proof/interpretation of the localisation
theories (Makita, Gerzon) based on the velocity vector V and the energy
vector E. Explicit prediction laws are presented, relating the
characterisation of the propagation (by V and E) to the localisation
cues (esp. ITD). (See also the PowerPoint presentation)
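To make the roles of V and E more concrete, here is a minimal sketch of how the two vectors are usually computed for a horizontal loudspeaker layout, following the standard definitions attributed to Gerzon: V is the gain-weighted mean of the loudspeaker directions, and E is the same mean weighted by the squared gains. The function names, the real-valued (in-phase) gains and the stereo example are assumptions chosen for illustration; they are not taken from the thesis.

```python
import math

def velocity_energy_vectors(gains, azimuths_deg):
    """Return (V, E) as 2D (x, y) tuples for loudspeakers in the horizontal plane.

    Assumes real, in-phase gains. V weights each loudspeaker direction by its
    gain, E by its squared gain (standard Gerzon definitions).
    """
    dirs = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
            for a in azimuths_deg]
    sum_g = sum(gains)
    sum_g2 = sum(g * g for g in gains)
    V = tuple(sum(g * d[k] for g, d in zip(gains, dirs)) / sum_g
              for k in range(2))
    E = tuple(sum(g * g * d[k] for g, d in zip(gains, dirs)) / sum_g2
              for k in range(2))
    return V, E

def polar(vec):
    """Norm and direction (in degrees) of a 2D vector."""
    return math.hypot(vec[0], vec[1]), math.degrees(math.atan2(vec[1], vec[0]))

# Hypothetical example: stereo pair at +/-30 degrees, left gain twice the right.
gains = [1.0, 0.5]
azimuths = [30.0, -30.0]
V, E = velocity_energy_vectors(gains, azimuths)
r_v, theta_v = polar(V)
r_e, theta_e = polar(E)
print(f"V: norm {r_v:.3f}, direction {theta_v:.1f} deg")
print(f"E: norm {r_e:.3f}, direction {theta_e:.1f} deg")
```

In this example the two vectors point in different directions and both have a norm below 1, which is the kind of discrepancy the prediction laws of Chapter 1 are meant to interpret.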
Chapter 2 (pp 77-144) [914 Ko]:
Spatial Reproduction Principles and associated Sound Field Representation
Strategies. Topics discussed: two-channel and multichannel stereophony; traditional
(1st order) ambisonic systems; binaural and transaural, including new strategies
(binaural B-format, double-transaural). More than a classical review, this
chapter presents an analysis of the rendering properties which is based
on the acoustic characterisation of the synthesised field, and on the behaviour
of the localisation cues (esp. ITD) as the head rotates or moves. Three main classes
of virtual sound imaging over loudspeakers are defined. Each of them is
characterised by the compromises between the accuracy/strength of the reproduced
binaural cues and their robustness and naturalness as the head moves.
Part II : Higher Order Ambisonics.
Chapter 3 (pp 145-204) [702 Ko]:
Extension of almost all aspects of Ambisonics to higher orders (2D/3D),
from encoding to decoding. The most important chapter for those
who are interested in high order systems. The extended encoding formalism
is presented, as well as conversion formulae between the existing encoding
conventions. Some extended transformation formulae are also presented.
The optimised decoding solutions, which previously existed only for first order
systems (Gerzon's optimisation for a centred listener, and Malham's solution
for large areas), are generalised into three solution families (called basic,
max rE and in-phase solutions). Underlying concepts and properties
are made explicit: the fundamental properties attached to the truncation of
the spherical harmonic decomposition; the notion of directional sampling
of the spherical harmonic basis and its regularity property (which is involved
in the decoding problem and in the definition of ambisonic microphones).
The PowerPoint presentation I used to defend
my thesis should be a great help in understanding these points.
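As a rough illustration of the encoding/decoding chain and of the three solution families, here is a minimal 2D sketch: a plane wave is encoded into circular harmonics up to a given order, then decoded onto a regular loudspeaker ring with per-order weights. The normalisation of the components, the function names and the test configuration are assumptions made for this example. The max rE and in-phase weights use the commonly quoted 2D forms, cos(m*pi/(2M+2)) and M!^2/((M+m)!(M-m)!), which should be checked against the chapter itself.

```python
import math

def encode_2d(azimuth, order):
    """Components [1, cos(a), sin(a), cos(2a), sin(2a), ...] of a unit plane wave.

    Illustrative normalisation, not necessarily one of the existing conventions.
    """
    comps = [1.0]
    for m in range(1, order + 1):
        comps += [math.cos(m * azimuth), math.sin(m * azimuth)]
    return comps

def order_weights(order, kind="basic"):
    """Per-order weights g_0..g_M for the three decoding families (2D forms)."""
    if kind == "basic":
        return [1.0] * (order + 1)
    if kind == "max_rE":
        return [math.cos(m * math.pi / (2 * order + 2)) for m in range(order + 1)]
    if kind == "in_phase":
        f = math.factorial
        return [f(order) ** 2 / (f(order + m) * f(order - m))
                for m in range(order + 1)]
    raise ValueError(f"unknown decoding family: {kind}")

def decode_regular_2d(comps, n_speakers, weights):
    """Gains for n_speakers evenly spaced on a circle (speaker k at 2*pi*k/N)."""
    order = len(weights) - 1
    gains = []
    for k in range(n_speakers):
        phi = 2 * math.pi * k / n_speakers
        g = weights[0] * comps[0]
        for m in range(1, order + 1):
            g += 2 * weights[m] * (comps[2 * m - 1] * math.cos(m * phi)
                                   + comps[2 * m] * math.sin(m * phi))
        gains.append(g / n_speakers)
    return gains

def energy_vector_norm(gains):
    """Norm rE of the energy vector for the same regular ring."""
    n = len(gains)
    e = sum(g * g for g in gains)
    ex = sum(g * g * math.cos(2 * math.pi * k / n) for k, g in enumerate(gains))
    ey = sum(g * g * math.sin(2 * math.pi * k / n) for k, g in enumerate(gains))
    return math.hypot(ex, ey) / e

# Hypothetical configuration: order 2, 8 loudspeakers, source at 0.3 rad.
order, n_speakers, source_az = 2, 8, 0.3
comps = encode_2d(source_az, order)
results = {}
for kind in ("basic", "max_rE", "in_phase"):
    gains = decode_regular_2d(comps, n_speakers, order_weights(order, kind))
    results[kind] = (gains, energy_vector_norm(gains))
    print(f"{kind:9s} rE = {results[kind][1]:.3f}  min gain = {min(gains):+.3f}")
```

With these weights, the max rE family raises the energy vector norm compared with the basic one, while the in-phase family removes the negative (out-of-phase) loudspeaker gains, at the cost of a wider spread of energy.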
Chapter 4 (pp 205-242) [1150 Ko]:
Evaluation of the rendering for various listening conditions (ideal, i.e.
an individual centred listening position, or collective, i.e. off-centred
listening positions). The objective analyses are corroborated by informal
listening tests. They highlight the contribution of higher orders and optimised
solutions. The removal of some simplifying hypotheses (plane wave hypothesis,
etc.) is also thoroughly discussed, especially regarding: the near field
effect of the loudspeakers (finite distance), its compensation, and the
convergence towards a holophonic system; the effect of a near field encoded
source (spherical wave) when a true acoustic reconstruction cannot be achieved
even for low frequencies (large listening area); the effect of the reproduction
room.
Part III [493 Ko] : Implementation, Tests and Applications.
Chapter 5 (pp 243-274): Some implementation aspects regarding
the applications I developed to experiment and demonstrate various spatialisation
techniques: binaural, transaural, VBAP, Ambisonics (up to the second order,
at the time), room effect synthesis, and some details about the interfaces
(2D/3D). Suggestions are made for using these applications as evaluation
tools.
Chapter 6 (pp 275-292): Applications and perspectives. To begin,
I report on an experiment with one possible use of traditional Ambisonics: encoding
and transmitting multichannel sound (originally 5 channels). Then I discuss
potential applications and perspectives related to high order ambisonics:
mixed decoding (1st+2nd order); 3D impulse responses for room effect synthesis
or for diffuse field analysis, etc.
General Conclusion (pp 293-294)
Bibliography (pp 295-300) and Appendices (pp 301-320) [84 Ko]. Appendix A deals with questions related to the spherical/cylindrical harmonic decomposition(s): mathematical background and memento; the problem of the diffraction by a rigid sphere (used for simulations and illustrations); details on the optimisation of high order ambisonic decoding.
Corrected versions of AES papers [380 Ko] (to be included in appendices B and C).
This PowerPoint presentation was used during the thesis defence
(09/19/00).
It contains many pictures and movies which should make many points
of my thesis easier to understand for everyone (French speakers and others!).
The movies (.AVI) can be downloaded in groups (zip files) and can be played
outside the PowerPoint presentation.
I intend to adapt it as an HTML document (as soon as possible),
with more comments, in French and in English.
Caution:
All figures, pictures and movies were made entirely by myself.
This material is intended for educational purposes. Please do not modify
this document or any part of it without my permission. Please do not use
it for a public presentation without my permission or without an appropriate
acknowledgement.
The zipped presentation [1419 Ko].
First group of movies (21 movies) [1685 Ko].
Second group (2 movies) [1191 Ko].
Third group (2 movies) [966 Ko].
Fourth group (2 movies) [1355 Ko].
Comments:
This material covers the major points of the PostScript document with
roughly the same structure.
Though I had only 45 minutes to defend my thesis, (much) more time
would be required to go through the whole presentation properly.
Depending on what the "reader" is most interested in, he/she can of course
skip some slides and focus on others.
For instance, the first substantial page appears after a few introductory
slides. This is a very didactic page which introduces and describes the
main mechanisms of auditory localisation, focusing on static and dynamic
lateralisation mechanisms. It highlights the requirements for efficient
3D sound reproduction, regarding not only localisation performance
and naturalness, but also the preservation of spatial qualities and impressions,
such as envelopment.
The following slides may also be of great interest...
Advice for use:
Last note:
This presentation does not come with detailed comments yet (I
should have already written them!). But the different points and concepts
are meant to appear in a very progressive and logical way. So don't
hesitate to spend a little time looking at the figures and the movies,
and relating them to the equations or the words that have
just appeared or that appear just after...
An adaptation of the content of my thesis, inspired by my PowerPoint presentation, in HTML and in English. This should have a thematic structure (encoding formulae, decoder definition, etc.), with different viewpoints (a practical approach with directly usable formulae and instructions; a more theoretical approach with the underlying concepts and properties).
More illustrations and movies: 3D rendering, off-centre localisation...
"The experimenter's corner" (with sounds!): binaural simulations of
centred and off-centred listening experiences with various ambisonic renderings...
an audible proof of the contribution of high order and optimised solutions!
(Just give me time to convert some .wav files to mp3)