
Re: [AUDITORY] Gammatone filter bank in MATLABr2019a



Dear Les and all,

this is my (not complete) list:

https://engineering.purdue.edu/~malcolm/interval/1998-010/
https://code.soundsoftware.ac.uk/projects/aimmat
http://legacy.spa.aalto.fi/software/HUTear/
http://amtoolbox.sourceforge.net/
https://uol.de/en/mediphysics/downloads/ (now also at
https://doi.org/10.5281/zenodo.2643400)

Best, Volker

On 17.04.2019 15:10, Bernstein,Leslie wrote:
> Thanks, Volker.  A link to those implementations would be very helpful.
> 
> Les
> 
> On 4/17/2019 4:58 AM, Volker Hohmann wrote:
>> Dear Dick and all,
>>
>> just want to add that the re-synthesis method they apply is not optimal.
>> I would recommend using the Matlab implementations contributed by our
>> community, which have been described properly in citable publications,
>> are readily available and have been running flawlessly for many years
>> under whatever Matlab version came out.
>>
>> Best regards,
>>
>> Volker
>>
>> On 17.04.2019 02:51, Richard F. Lyon wrote:
>>> Bastian,
>>>
>>> That's an interesting distinction that needs to be made, between the
>>> peripheral and "whole system" auditory filter, whether gammatone or
>>> otherwise.  In my book, I say this about that (in Part III – The
>>> Auditory Periphery):
>>>
>>>     13.1 What Is an Auditory Filter?
>>>     The auditory filters that we consider here include both those
>>>     motivated by psychoacoustic experiments, such as detection of tones
>>>     in noise maskers, and those motivated by reproducing the observed
>>>     mechanical response of the basilar membrane or neural response of
>>>     the auditory nerve. One thesis of this work is that a single model
>>>     can do a good job for both of these, and thereby provide a good
>>>     basis for a machine hearing system. Since there are several stages
>>>     of neural processing between the cochlea and our psychoacoustic
>>>     perceptions, it would not be surprising if the best parameters were
>>>     different between these types of models, but it seems likely that
>>>     the linear and nonlinear filtering due to the cochlea plays a
>>>     sufficient role in perception that we may find one set of parameters
>>>     is adequate, at least for a range of machine hearing applications.
>>>
>>>
>>> And to be fair, the gammatone was originally proposed as a model of frog
>>> hearing physiology, and is widely used in cochlear models, even though
>>> Patterson popularized it in the psychoacoustic domain.
>>>
>>> So the MathWorks ought to be more careful about what they say.  I'd have
>>> several other quibbles with their docs (in the Audio Toolbox reference
>>> at https://www.mathworks.com/help/pdf_doc/audio/index.html).
>>>
>>> Quibbles:
>>>
>>> 1. "The gammatoneFilterBank follows the algorithm described in [1] and
>>> first proposed by [2]."  [1] is Slaney's method, a simple filter cascade
>>> based on analyzing the Laplace transform of the gammatone.  [2] is
>>> Patterson et al.'s "Complex Sounds and Auditory Images", a great paper
>>> but it doesn't say one word about how to implement the gammatone (they
>>> did have other implementation papers elsewhere, but not this method and
>>> not here).
>>>
>>> 2. Ref 2 says "the shape of the magnitude characteristic of the
>>> gammatone filter is very similar to that of the roex(p) filter commonly
>>> used to represent the magnitude characteristic of the human auditory
>>> filter."  MathWorks says "The gammatone filter is similar to the roex
>>> filter derived from the notched-noise experiment."  A cursory look at
>>> more recent literature on auditory filters, including Patterson's,
>>> would suggest omitting or at least tempering this claim.  See my book
>>> Chapter 13 or this paper:
>>> https://storage.googleapis.com/pub-tools-public-publication-data/pdf/36895.pdf
>>>
>>> 3. Error where it says b –– bandwidth, set to 1.019*erb2hz(fc).  Either
>>> the documentation is wrong, or the functionality is wrong.  Hopefully
>>> the former.
>>>
>>> 4. The parameterization by only FrequencyRange, NumFilters, and
>>> SampleRate is rather impoverished.  It is not documented whether the
>>> filters match the ERB bandwidth if some of these parameters are changed,
>>> or whether adjacent filters continue to cross over about 3 dB down; you
>>> can't have both, but you might want one or the other, and there's not
>>> enough control to say what you want.  With a few more parameters one
>>> could do useful comparisons, tradeoffs, and tunings of filter numbers,
>>> orders, bandwidths, and phases for example.  With just a few more one
>>> could include better auditory filter variants (that differ only in the
>>> locations of the zeros of the cascaded second-order filters), including
>>> APGF and OZGF.
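
A minimal sketch of the distinction behind quibbles 3 and 4, added here as an illustration; it is not code from this thread or from MathWorks. It assumes the Glasberg and Moore (1990) formula ERB(f) = 24.7*(4.37*f/1000 + 1) Hz, the commonly quoted ERB-rate scale 21.4*log10(1 + 0.00437*f), and the factor 1.019 from the quoted documentation. All function names are hypothetical. The point is that the bandwidth at fc comes from the ERB formula, which is a different function from the ERB-rate-to-Hz conversion, and that fixing the frequency range and the number of filters already fixes the spacing in ERBs.

```python
import math

def erb_bandwidth_hz(fc_hz):
    """ERB in Hz at centre frequency fc_hz (assumed Glasberg & Moore formula)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def hz_to_erb_rate(f_hz):
    """Position on the ERB-rate scale (roughly, number of ERBs below f_hz)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate: ERB-rate value back to frequency in Hz."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) / 0.00437

def gammatone_b_hz(fc_hz, factor=1.019):
    """Bandwidth parameter b as a multiple of the ERB at fc_hz."""
    return factor * erb_bandwidth_hz(fc_hz)

def erb_spaced_centres(f_lo_hz, f_hi_hz, num_filters):
    """Centre frequencies equally spaced on the ERB-rate scale (num_filters >= 2)."""
    e_lo = hz_to_erb_rate(f_lo_hz)
    e_hi = hz_to_erb_rate(f_hi_hz)
    step = (e_hi - e_lo) / (num_filters - 1)
    return [erb_rate_to_hz(e_lo + k * step) for k in range(num_filters)]

# Hypothetical settings: 32 filters between 50 Hz and 8 kHz.
centres = erb_spaced_centres(50.0, 8000.0, 32)
spacing_erbs = (hz_to_erb_rate(8000.0) - hz_to_erb_rate(50.0)) / (32 - 1)

for fc in centres[::8]:
    print(f"fc = {fc:8.1f} Hz   b = {gammatone_b_hz(fc):7.1f} Hz")
print(f"spacing between neighbours: {spacing_erbs:.2f} ERB")
```

Changing the number of filters in this sketch changes the spacing in ERBs but not b, so the crossover level between neighbours moves; keeping the crossover fixed instead would require scaling b away from the ERB value.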
>>>
>>> R2019a also adds gtcc (gammatone cepstral coefficients).  Their
>>> algorithm uses log(energy) before the DCT, instead of the cube root
>>> proposed by the Shao et al. reference, which also uses a slightly
>>> different acronym:  GFCC (gammatone frequency cepstral coefficients). 
>>> Not clear why.  The referenced paper did not really investigate whether
>>> their improvement over mfcc was due to the different frequency scale
>>> (700 Hz mel vs 229 Hz ERB break point between linear and exponential
>>> spacing), or the filter shape (triangle vs gammatone), or the
>>> nonlinearity (log vs cube root), or the domain of implementation
>>> (frequency vs time). With the impoverished parameterizations of these
>>> functions in the audio toolboxes, it's hard to further compare such
>>> things (though the gtcc does allow some of that).  The other gtcc ref
>>> (Rabiner and Schafer) has nothing on gammatone or gtcc or gfcc.
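
A small sketch of two of the confounded factors named above, added here as an illustration; it is not the MathWorks gtcc code and not the Shao et al. method. It assumes the common mel formula 2595*log10(1 + f/700) and the ERB-rate formula 21.4*log10(1 + 4.37*f/1000), whose break point is 1000/4.37, about 229 Hz. The band energies are made-up numbers and the function names are hypothetical. The sketch applies either a log or a cube-root nonlinearity before an unnormalised DCT-II, so the two choices can be compared with everything else held fixed.

```python
import math

MEL_BREAK_HZ = 700.0
ERB_BREAK_HZ = 1000.0 / 4.37  # about 229 Hz

def hz_to_mel(f_hz):
    """Common mel-scale formula (roughly linear below 700 Hz, logarithmic above)."""
    return 2595.0 * math.log10(1.0 + f_hz / MEL_BREAK_HZ)

def hz_to_erb_rate(f_hz):
    """ERB-rate scale (roughly linear below 229 Hz, logarithmic above)."""
    return 21.4 * math.log10(1.0 + f_hz / ERB_BREAK_HZ)

def compress(energies, kind):
    """Apply the chosen nonlinearity to non-negative band energies."""
    if kind == "log":
        # A small floor avoids log(0) for silent bands.
        return [math.log(max(e, 1e-12)) for e in energies]
    if kind == "cuberoot":
        return [e ** (1.0 / 3.0) for e in energies]
    raise ValueError(f"unknown compression: {kind}")

def dct2(x):
    """Unnormalised DCT-II, written out directly for clarity."""
    n = len(x)
    return [
        sum(x[k] * math.cos(math.pi * (k + 0.5) * m / n) for k in range(n))
        for m in range(n)
    ]

# Made-up band energies standing in for one frame of filter-bank output.
band_energies = [0.02, 0.5, 3.0, 1.2, 0.3, 0.05, 0.01, 0.004]

log_cepstrum = dct2(compress(band_energies, "log"))
cuberoot_cepstrum = dct2(compress(band_energies, "cuberoot"))

print("log:      ", [round(c, 3) for c in log_cepstrum[:4]])
print("cube root:", [round(c, 3) for c in cuberoot_cepstrum[:4]])
```

Swapping hz_to_mel for hz_to_erb_rate when placing the bands, while keeping the nonlinearity fixed, would isolate the frequency-scale factor in the same way.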
>>>
>>> I could go on...
>>>
>>> Dick
>>>
>>> On Tue, Apr 16, 2019 at 12:24 AM Bastian Epp
>>> <000000a94eb56441-dmarc-request@xxxxxxxxxxxxxxx> wrote:
>>>
>>>     Dear list,
>>>
>>>     This morning I read through the release notes of MATLAB R2019a and was
>>>     happy to find that there was an implementation of a Gammatone filter
>>>     bank included:
>>>
>>>     "Gammatone Filter Bank: Mimic the human auditory system"
>>>
>>>     With the reference to (among others):
>>>
>>>     Glasberg, Brian R., and Brian CJ Moore. "Derivation of Auditory Filter
>>>     Shapes from Notched-Noise Data." Hearing Research. Vol. 47. Issue 1-2,
>>>     1990, pp. 103–138.
>>>
>>>     This made me quite happy because it is a proper description of what
>>>     Gammatone filter banks most often are used for - to model the frequency
>>>     selectivity of the auditory system (as measured using psychoacoustics).
>>>
>>>     However, on the doc page, they show a picture of the basilar membrane
>>>     on top with the frequency response of the filter bank, suggesting that
>>>     there exists a 1:1 correspondence.
>>>
>>>     Everybody needs a topic to grow old and grumpy on - mine is this: 
>>>
>>>     From my point of view, this is only correct under the (overly strong?)
>>>     assumption that the cochlea is the only place in the auditory system
>>>     underlying the perceptually observed frequency selectivity. Measuring
>>>     "auditory filters" means evaluating the auditory system as a
>>>     whole (the concept of a "neuron" also only makes sense when it is
>>>     embedded in its network). "Cochlear filters" are measured on/in the
>>>     cochlea.
>>>
>>>     Besides the common critiques (linearity, coarse approximation of the
>>>     actual "filter" shape, etc.), the main problem from my point of view is
>>>     that we teach students that we can "measure" the function of a
>>>     "subsystem" (the cochlea) using a method that assesses the function of
>>>     the "whole" system. There are some data sets that suggest a strong
>>>     link, but the "tool" of psychoacoustics simply does not allow such a
>>>     statement.
>>>
>>>     Even though I like the working hypothesis "The brain exists to keep the
>>>     cochlea warm", I think equating cochlear frequency selectivity with
>>>     auditory filters (without explicitly stating the assumption that no(!)
>>>     element along the auditory pathway modifies this frequency selectivity)
>>>     is a point where we could be more careful to avoid misconceptions and
>>>     overly strong conclusions. In most publications and books, this point
>>>     is not explicitly wrong, but not as precise as it could be in my
>>>     opinion.
>>>
>>>     I hope that someone from MathWorks follows this list and considers a
>>>     more careful description in the docs. I would also be happy to compile
>>>     all the constructive arguments that people might have for/against my
>>>     point of view.
>>>
>>>     Have a great day everybody!
>>>
>>>     Bastian
>>>
>>>
>>>
>>>
>>>     -- 
>>>     Bastian Epp
>>>     Associate Professor
>>>
>>>     DTU Healthtech    
>>>     ------------------------------------
>>>     Technical University of Denmark
>>>     Ørsteds Plads
>>>     Building 352, Room 118
>>>     2800 Kgs. Lyngby
>>>     Direct +45 45253953
>>>     bepp@xxxxxx
>>>     http://www.dtu.dk/english
>>>
> 
> 
> -- 
> Leslie R. Bernstein, Ph.D. | Professor
> Depts. of Neuroscience and Surgery (Otolaryngology) | UConn School of
> Medicine
> 263 Farmington Avenue, Farmington, CT 06030-3401
> Office: 860.679.4622 | Fax: 860.679.2495
> 
> 

-- 
---------------------------------------------------------
Prof. Dr. Volker Hohmann
Medizinische Physik and Cluster of Excellence Hearing4all
Universität Oldenburg
D-26111 Oldenburg
Germany

Tel. +49 441 798 5468
FAX  +49 441 798 3902
Email volker.hohmann@xxxxxxxxxxxxxxxx
http://www.uni-oldenburg.de/mediphysik/
http://www.uni-oldenburg.de/auditorische-signalverarbeitung/
Public Key and Key Fingerprint
http://medi.uni-oldenburg.de/members/vh/pubkey_vh_uni.txt
C75A 8A8D 9408 28EE FCFD 20CA 1D9F 23CC BAD2 B967
---------------------------------------------------------