Re: perceptual learning

To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: perceptual learning
From: Peter Meijer <peter.b.l.meijer@xxxxxxxxxxx>
Date: Mon, 3 Apr 2000 13:09:18 +0200
Comments: cc: ward@IHR.GLA.AC.UK
Reply-to: peter.b.l.meijer@xxxxxxxxxxx
Sender: AUDITORY Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx>
Ward writes:

> So, if an object in the visual space were turned to sound for a
> blind person--would it then not be most useful to have the sound
> emanate (perceptually) from the location of the object?

Yes, as long as there is only one object or at most a very
few objects, this approach would be intuitive and have the
advantage of near-instantaneous sensory feedback. Unfortunately,
this "spatial" or "3D" sound scheme that is so popular with games
doesn't scale and generalize well to arbitrary and more complex
images. It breaks down quite rapidly when there are more than a
few objects or shapes in a planar ("photographic") view. Basically,
when you have something like a bright spot on your left and an
identical bright spot on your right, it is hard to keep the
combination from sounding like a single spot right in front of you. With
a simple sine wave to indicate lateral position through ITD and
ILD this becomes readily apparent: the ITD and ILD effects cancel
each other in a left-right symmetrical situation, while pitch is
needed to indicate elevation and is assumed the same for the left
and right spot in this example. Using a more complex spectrum for
a spot does not help here either as long as left-right symmetry
is to be maintained. A "time division multiplexing" scheme (term
coined originally by Paul Bach-y-Rita for his tactile matrix
displays, but in our case it could be called "inverse spectrographic"
or something like that) resolves the problem by breaking the
symmetry while also allowing for a higher image resolution by
spreading the visual information in time, but the resulting dynamic
spectral profile may indeed no longer be quite as intuitive, and
perceptual latency is now an inevitable consequence or artefact.
However, part of the intuitive lateral localization can be restored
by adding stereo panning to the time axis of the general "inverse
spectrographic" image to sound mapping. More on that below.
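
As a rough illustration of the symmetric case described above, here is
a minimal Python sketch. It is a toy model, not the signal path of any
real device: the sample rate, the maximum ITD (0.6 ms), the maximum ILD
(10 dB) and all function names are assumptions chosen for illustration.
Each "spot" is a sine wave whose far-ear copy is delayed and attenuated.
With one spot the two ear signals differ; with two identical spots
placed symmetrically left and right, the ear signals become identical,
so no lateral cue is left.

```python
import math

SAMPLE_RATE = 44100  # Hz (assumed)


def render_spot(freq_hz, azimuth, n_samples, max_itd_s=0.0006, max_ild_db=10.0):
    """Render one sine-wave "spot" as (left, right) sample lists.

    azimuth runs from -1 (far left) to +1 (far right). The ear farther
    from the spot receives a delayed (ITD) and attenuated (ILD) copy.
    """
    itd = abs(azimuth) * max_itd_s
    far_gain = 10 ** (-abs(azimuth) * max_ild_db / 20)
    near, far = [], []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        near.append(math.sin(2 * math.pi * freq_hz * t))
        far.append(far_gain * math.sin(2 * math.pi * freq_hz * (t - itd)))
    # A spot on the left is "near" for the left ear, and vice versa.
    return (near, far) if azimuth < 0 else (far, near)


def mix(spots):
    """Sum several (left, right) renderings into one stereo signal."""
    n = len(spots[0][0])
    left = [sum(s[0][i] for s in spots) for i in range(n)]
    right = [sum(s[1][i] for s in spots) for i in range(n)]
    return left, right


def interaural_difference(left, right):
    """Largest sample-wise difference between the two ear signals."""
    return max(abs(l - r) for l, r in zip(left, right))


# One spot on the left: the ear signals clearly differ.
one = mix([render_spot(1000.0, -0.8, 441)])
# Same pitch, mirrored left and right: the ear signals coincide.
pair = mix([render_spot(1000.0, -0.8, 441), render_spot(1000.0, 0.8, 441)])

print("single spot:", interaural_difference(*one))
print("symmetric pair:", interaural_difference(*pair))
```

The second value comes out as zero (up to rounding), which is the
cancellation of ITD and ILD effects mentioned above.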

By the way, if it is mainly for detecting nearest obstacles rather
than general vision substitution, then a distance-to-pitch mapping
with directional (spatial, 3D) sound can indeed work quite well,
and a number of sonar-based mobility devices for blind people are
in fact making use of this. There is a publication on the auditory
perception of objects by blind persons using sonar devices forthcoming
in the JASA of May 2000.

> Peter has height mapped to frequency and left to right mapped
> to time (would this be right to left in some countries?)

Yes, we'll have a special edition for the Middle-East area
as soon as the market justifies extended regional support. ;-)

> At any rate, it wouldn't be tremendously difficult with headphones
> to have that image shift from left to right acoustically as it does
> visually.

True, and that is what The vOICe "Learning Edition" does
by default: it supplements and reinforces the perception of
left-to-right scanning through stereo panning, using ILD and
ITD. HRTF filtering would not be very useful here because the spectral
shape is already being used to convey visual content in this
image-to-sound mapping. For those interested, an illustration
of this scanning + stereo panning is the sound of an artificial
image showing one period of a bright sine wave trace and ten
little bright squares, in a sound sample lasting just one second
(22K file size) at the URL

   http://www.seeingwithsound.com/voiscopebw.wav

or its two-second "slow-motion" version (44K) at

   http://www.seeingwithsound.com/voiscopebw2.wav

One can make spectrograms of these sound samples to check out
their visual content against what one hears (using a logarithmic
frequency scale for proper results), and use autorepeat or an
equivalent option during playback to hear the short samples a
number of times. Listeners will likely notice that their brain
applies "mental saccades" to focus attention on the different
time-frequency components within the sound as the same sound
sample is being played repeatedly. Hearing it "all at once" in
a single pass appears to be rather hard for an untrained person.
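
The scanning + stereo panning mapping described above (height to
frequency, left-to-right to time, with panning following the scan) can
be sketched in a few lines of Python. This is only a minimal sketch
under stated assumptions, not The vOICe's actual implementation: the
function name, the 500-5000 Hz range, the exponential frequency spacing
and the sample rate are all illustrative choices, and the panning uses
a level difference (ILD) only, leaving out ITD for brevity.

```python
import math


def image_to_sound(image, duration_s=1.0, sample_rate=22050,
                   f_low=500.0, f_high=5000.0):
    """Turn a grayscale image into a stereo left-to-right "scan".

    image is a list of rows (top row first); each row is a list of
    brightness values in [0, 1]. Returns (left, right) sample lists.
    All default parameter values are assumptions for illustration.
    """
    n_rows, n_cols = len(image), len(image[0])
    n_samples = round(duration_s * sample_rate)
    # Height -> frequency: top row highest, exponentially spaced.
    freqs = [f_high * (f_low / f_high) ** (r / max(n_rows - 1, 1))
             for r in range(n_rows)]
    left, right = [], []
    for n in range(n_samples):
        t = n / sample_rate
        # Horizontal position -> time: each column gets an equal time slot.
        col = min(n * n_cols // n_samples, n_cols - 1)
        # Brightness -> loudness of that row's sine component.
        s = sum(image[r][col] * math.sin(2 * math.pi * freqs[r] * t)
                for r in range(n_rows)) / n_rows
        # Panning follows the scan: 0 = hard left, 1 = hard right.
        pan = n / max(n_samples - 1, 1)
        left.append(s * math.cos(pan * math.pi / 2))
        right.append(s * math.sin(pan * math.pi / 2))
    return left, right


# Bright spot top-left, bright spot bottom-right: a high tone heard
# on the left, followed by a low tone heard on the right.
left, right = image_to_sound([[1.0, 0.0],
                              [0.0, 1.0]])
print(len(left), "samples per channel")
```

Because the two spots now occur at different times, the left-right
symmetry is broken, at the cost of the latency of one full scan.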

> Persons who are blind may well have a heightened auditory sensations
> (since all the brain mass most people used to see could be used by
> the other senses)

A few years ago, Cohen et al. showed (or at least made plausible)
through transcranial magnetic stimulation (TMS) experiments that blind
people use part of their visual cortex for tactile processing:

   L. G. Cohen, P. Celnik et al., ``Functional relevance of cross-modal
   plasticity in blind humans,'' Nature, vol. 389, pp. 180-183, Sept. 11,
   1997.

I haven't seen comparable publications on auditory processing in the
visual cortex of blind people yet, but it would seem plausible too
when generalizing from other kinds of neuroscientific findings:
neighbouring cortical areas tend to "take over" the areas that
served amputated limbs, and coordination-affecting RSI (not the more
common painful variant) apparently mixes signals in normally distinct
cortical areas, etc.

Another issue is sensory deprivation: to what extent does part of
the cortex perhaps degenerate for lack of input from the senses? For
instance, a blind man recently regained partial sight through
surgery after over forty years of near-total blindness, and (for
the moment at least) he has significant problems understanding what
he sees with his eyes, although he is highly intelligent and
saw normally until he was three years old. I had the pleasure of
meeting this man last year before his surgery. Now these special
cases of (problems with) "learning to see" can be of great interest
to any attempt to provide some form of vision substitution through
technical means, because the whole chain from the purely technical
to low-level auditory perception, higher-level processing, neural
plasticity and human psychology (for motivation) needs to be covered,
and the weakest link could be just about anywhere along the chain.

Best wishes,

Peter Meijer


Seeing with Sound - The vOICe
http://www.seeingwithsound.com/voice.htm