PhD Stuff
Note: I need to redo this mini-site. The text here isn't exactly wrong, but it could be clearer and better worded.
Main Aim
To investigate the connection between audio and visuals, with a view to creating a musical (or, more strictly, audiovisual) instrument based around this connection.
The Instrument
The instrument (named Ashitaka) is intended to be a musical analogue of a
block of clay: visually, it is represented as an amorphous 'blob' which can
be manipulated the same way clay can (i.e. you can stretch it, squeeze it,
twist it, or mold it with your fingers). The audio side of things will be
based on a physical model of some kind. The original idea was that this
would simplify the mappings considerably but, having investigated mappings
further, that seems to have been wishful thinking. The mapping system
currently being developed combines one perceptual audio parameter and one
perceptual visual parameter into a single integrated audiovisual parameter.
The instrument will then have a number of these audiovisual parameters,
each mapped to the parameters of the performer's interface. See the images
page for some more descriptive block diagrams.
Furthermore, the instrument will exist within a 3D environment in the
computer, where it may interact with other objects or instruments.
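As a rough illustration of the audiovisual parameter idea described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the class name, the linear scaling, and the pairing of spectral brightness with the 'spikiness' of the blob are hypothetical, and are not taken from the actual Ashitaka software.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


def linear(lo: float, hi: float) -> Callable[[float], float]:
    """Return a function mapping a normalised value in [0, 1] onto [lo, hi]."""
    return lambda x: lo + (hi - lo) * x


@dataclass
class AudiovisualParameter:
    """One audio feature and one visual feature driven by a single control."""
    name: str
    to_audio: Callable[[float], float]   # control -> audio feature value
    to_visual: Callable[[float], float]  # control -> visual feature value

    def update(self, control: float) -> Tuple[float, float]:
        # Clamp the performer's input to the normalised range first.
        c = min(1.0, max(0.0, control))
        return self.to_audio(c), self.to_visual(c)


# Hypothetical pairing: spectral brightness (as a filter cutoff in Hz)
# with the 'spikiness' of the blob's surface (0 = smooth, 1 = jagged).
brightness_spikiness = AudiovisualParameter(
    name="brightness/spikiness",
    to_audio=linear(200.0, 8000.0),
    to_visual=linear(0.0, 1.0),
)

# A single value from the performer's interface moves both features together.
cutoff_hz, spikiness = brightness_spikiness.update(0.5)
```

The point of the sketch is only that one input drives the audio and the visual feature at once, so the two are always perceived as moving together. A real mapping would almost certainly be less simple than two linear scalings.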
Technology
X3D will be used as the file format for representing the 3D environment.
To this end, an X3D browser is being developed, which uses Ambisonics to
spatialise the audio of the instrument (and of any other objects in the
environment). The instrument's physical model will be a relatively simple
one (modified by some further audio processing), based on the audio engine
from the Tao physical modelling language. The software will be released
under the GPL, and I hope to release the first version fairly soon.
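To give a feel for what spatialising a sound with Ambisonics involves, here is a small Python sketch that encodes a mono sample into four channels. It assumes traditional first-order B-format (W, X, Y, Z, with W scaled by 1/sqrt(2)); the text above does not say which Ambisonic order or convention the X3D browser actually uses, and the function name is made up for this example.

```python
import math
from typing import Tuple


def encode_first_order(sample: float, azimuth: float,
                       elevation: float) -> Tuple[float, float, float, float]:
    """Encode a mono sample as traditional first-order B-format.

    Angles are in radians. Azimuth is measured anticlockwise from straight
    ahead; elevation is measured upwards from the horizontal plane.
    """
    w = sample / math.sqrt(2.0)                            # omnidirectional
    x = sample * math.cos(azimuth) * math.cos(elevation)   # front-back
    y = sample * math.sin(azimuth) * math.cos(elevation)   # left-right
    z = sample * math.sin(elevation)                       # up-down
    return w, x, y, z


# A source directly in front of the listener ends up only in W and X.
front = encode_first_order(1.0, 0.0, 0.0)

# A source directly to the listener's left ends up only in W and Y.
left = encode_first_order(1.0, math.pi / 2.0, 0.0)
```

Each object in the environment would be encoded like this according to its position relative to the listener, and the encoded signals summed channel by channel before being decoded for whatever speaker layout is available.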
I'm also going to create a hardware interface for the instrument, designed
to offer the same kinds of gesture that are possible with a block of clay.
See the images page for the current prototype and a very out-of-date mockup.
Also, one of the requirements of my funding is that the software must take
advantage of parallel processing hardware (i.e. multiple
processors/hyper-threading), so it's going to be heavily multi-threaded, and
should (hopefully) run more efficiently on a multi-processor machine than
traditional audio software.
Main Ideas
- Motion as the connection between audio and visuals: This idea comes from Michel Chion's notion of Synchresis, from Audio-Vision (amazon.co.uk). There's a paper on it on the Writings page, but the basic idea is that audio and visuals may be connected by their respective motions (assuming those motions are related in some way). For example, imagine a film or video of a pen falling onto a desk. When the pen hits the desk, our experience of the world tells us to expect a corresponding sound of some kind. That sound does not have to be the sound of a pen hitting a desk for our brain to connect sound and image, as long as there is some similarity between the motion of the pen as we see it and the motion of the accompanying sound (by which I mean the sound's amplitude envelope, its spectral content, etc.). This idea forms the basis for the audiovisual mappings, which are themselves the primary focus of this PhD.
- Audiovisual Parameters: This is a mapping approach intended to incorporate the above idea into an instrument, where, as well as the audio and visual elements, there are also the performer's gestures to consider. Essentially, an audiovisual parameter is the combination of a single perceptual audio feature and a single perceptual visual feature, mapped to each other in some way and controlled by a single input from the performer. The idea is that an audiovisual parameter is equivalent to a synthesis parameter in a traditional musical instrument, and that the audiovisual instrument will have a number of these parameters, mapped by various methods to the performer's gestures.
- Gestural Capture: In addition to the above mapping approach, a higher-level strategy will also be employed, primarily to exploit the instrument's 3D accelerometer data. Essentially, gestural capture will be used to identify certain patterns in the performer's gestures; when a pattern is detected, it will trigger a particular state change, or set up certain conditions which will then evolve over time.
- The Instrument as an Ecosystem: Ashitaka is conceived as a kind of ecosystem, made up of various objects (organisms?) which interact with each other and are manipulated by the performer. See here.
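To make the gestural capture idea from the list above a bit more concrete, here is a minimal Python sketch of one possible pattern detector. It is only an illustration under stated assumptions: the 'shake' gesture, the threshold values, and the class names ShakeDetector and InstrumentState are all hypothetical, and the real instrument may recognise quite different patterns in quite different ways.

```python
from collections import deque
from typing import Deque, Iterable, Tuple

Accel = Tuple[float, float, float]  # one 3D accelerometer reading (x, y, z), in g


class ShakeDetector:
    """Detects a 'shake': several strong direction reversals on the x axis."""

    def __init__(self, threshold: float = 1.5, reversals: int = 3,
                 window: int = 20) -> None:
        self.threshold = threshold  # minimum |x| that counts as a strong push
        self.reversals = reversals  # direction changes needed for a detection
        # Recent readings reduced to -1 (strong negative), 0 (weak), +1 (strong positive).
        self.signs: Deque[int] = deque(maxlen=window)

    def feed(self, accel: Accel) -> bool:
        """Add one reading; return True if the pattern has just been completed."""
        x = accel[0]
        if x > self.threshold:
            self.signs.append(1)
        elif x < -self.threshold:
            self.signs.append(-1)
        else:
            self.signs.append(0)
        # Count direction changes among the strong readings in the window.
        strong = [s for s in self.signs if s != 0]
        changes = sum(1 for a, b in zip(strong, strong[1:]) if a != b)
        if changes >= self.reversals:
            self.signs.clear()  # start afresh after a detection
            return True
        return False


class InstrumentState:
    """Hypothetical instrument state that a detected gesture can change."""

    def __init__(self) -> None:
        self.mode = "calm"

    def on_shake(self) -> None:
        self.mode = "agitated"


def run(samples: Iterable[Accel]) -> str:
    """Feed readings through a detector and return the resulting mode."""
    detector, state = ShakeDetector(), InstrumentState()
    for sample in samples:
        if detector.feed(sample):
            state.on_shake()
    return state.mode


shaken = run([(2.0, 0.0, 0.0), (-2.0, 0.0, 0.0), (2.0, 0.0, 0.0), (-2.0, 0.0, 0.0)])
gentle = run([(0.1, 0.0, 0.0)] * 10)
```

Here the detected pattern simply flips a mode, but the same hook could equally start a process that evolves over time, as described in the Gestural Capture item.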