Re: [AUDITORY] Logan's theorem - a challenge (Prof Leslie Smith)


Subject: Re: [AUDITORY] Logan's theorem - a challenge
From:    Prof Leslie Smith  <l.s.smith@xxxxxxxx>
Date:    Mon, 27 Sep 2021 10:34:27 +0100

I sent this originally to Alain de Cheveigné, but perhaps I should have
made it more public. Here goes.

Dear Alain:

I did some related work with my student Madhurananda Pahar some while
ago: it ended up with the publication linked to below.

What we did was to resynthesize speech (or any other sound) from
zero-crossings (positive-going only) in band-limited signals (using the
gammatone filterbank) plus some information about the maximal size of
the signal in the previous half-cycle. In essence, given a surprisingly
small number of channels, plus a little information about the signal
level (i.e. a log-based coding of the signal amplitude in the previous
half-cycle, using 4 or 5 values, called threshold levels in the paper),
one can quite easily make out the speech.

It's not a wonderful paper, and could do with more work and more
examples, and the resynthesis is not particularly straightforward (but
that's not important: what matters is the possibility of resynthesis,
as the brain interprets the AN signal, rather than re-creating it). And
we'd never heard of Logan's theorem (unfortunately!).

Still, I hope this might be of interest. I believe I have the Matlab
code still (but it could do with being reworked).

The paper can be found at
http://www.cs.stir.ac.uk/~lss/recentpapers/PID6701133.pdf

Reference: M. Pahar, L.S. Smith, Coding and Decoding Speech using a
Biologically Inspired Coding System, presented at IEEE SSCI 2020
(virtual conference), 1-4 December 2020. DOI
10.1109/SSCI47803.2020.9308328.

--Leslie Smith

Alain de Cheveigne wrote:
> Hi all,
>
> Here’s a challenge for the young nimble minds on this list, and the old
> and wise.
>
> Logan’s theorem states that a signal can be reconstructed from its zero
> crossings, to a scale, as long as the spectral representation of that
> signal is less than an octave wide. It sounds like magic given that zero
> crossing information is so crude.
> How can the full signal be recovered
> from a sparse series of time values (with signs but no amplitudes)?
> “Band-limited” is clearly a powerful assumption.
>
> Why is this of interest in the auditory context? The band-limited premise
> is approximately valid for each channel of the cochlear filterbank
> (sometimes characterized as a 1/3 octave filter). While cochlear
> transduction is non-linear, Logan’s theorem suggests that any
> information lost due to that non-linearity can be restored, within each
> channel. If so, cochlear transduction is “transparent”, which is
> encouraging for those who like to speculate about neural models of
> auditory processing. An algorithm applicable to the sound waveform can be
> implemented by the brain with similar results, in principle.
>
> Logan’s theorem has been invoked by David Marr for vision and several
> authors for hearing (some refs below). The theorem is unclear as to how
> the original signal should be reconstructed, which is an obstacle to
> formulating concrete models, but in these days of machine learning it
> might be OK to assume that the system can somehow learn to use the
> information, granted that it’s there. The hypothesis has far-reaching
> implications, for example it implies that spectral resolution of central
> auditory processing is not limited by peripheral frequency analysis (as
> already assumed by for example phase opponency or lateral inhibitory
> hypotheses).
>
> Before venturing further along this limb, it’s worth considering some
> issues. First, Logan made clear that his theorem only applies to a
> perfectly band-limited signal, and might not be “approximately valid”
> for a signal that is “approximately band-limited”. No practical
> signal is band-limited, if only because it must be time limited, and thus
> the theorem might conceivably not be applicable at all.
> On the other
> hand, half-wave rectification offers much richer information than zero
> crossings, so perhaps the end result is valid (information preserved) even
> if the theorem is not applicable stricto sensu. Second, there are many
> other imperfections such as adaptation, stochastic sampling to a
> spike-based representation, and so on, that might affect the usefulness of
> the hypothesis.
>
> The challenge is to address some of these loose ends. For example:
> (1) Can the theorem be extended to make use of a halfwave-rectified signal
> rather than zero crossings? Might that allow it to be applicable to
> practical time-limited signals?
> (2) What is the impact of real cochlear filter characteristics,
> adaptation, or stochastic sampling?
> (3) In what sense can one say that the acoustic signal is “available” to
> neural signal processing? What are the limits of that concept?
> (4) Can all this be formulated in a way intelligible by non-mathematical
> auditory scientists?
>
> This is the challenge. The reward is - possibly - a better understanding
> of how our brain hears the world.
>
> Alain
>
> ---
> Logan BF, JR. (1977) Information in the zero crossings of bandpass
> signals. Bell Syst. Tech. J. 56:487–510.
>
> Marr, D. (1982) VISION - A Computational Investigation into the Human
> Representation and Processing of Visual Information. W.H. Freeman and Co,
> republished by MIT press 2010.
>
> Heinz, M.G., Swaminathan J. (2009) Quantifying Envelope and Fine-Structure
> Coding in Auditory Nerve Responses to Chimaeric Speech, JARO 10: 407–423
> DOI: 10.1007/s10162-009-0169-8.
>
> Shamma, S, Lorenzi, C (2013) On the balance of envelope and temporal fine
> structure in the encoding of speech in the early auditory system, J.
> Acoust. Soc. Am. 133, 2818–2833.
>
> Parida S, Bharadwaj H, Heinz MG (2021) Spectrally specific temporal
> analyses of spike-train responses to complex sounds: A unifying framework.
> PLoS Comput Biol 17(2): e1008155.
> https://doi.org/10.1371/journal.pcbi.1008155
>
> de Cheveigné, A. (in press) Harmonic Cancellation, a Fundamental of
> Auditory Scene Analysis. Trends in Hearing (https://psyarxiv.com/b8e5w/).

-- 
Prof Leslie Smith (Emeritus)
Computing Science & Mathematics,
University of Stirling, Stirling FK9 4LA
Scotland, UK
Tel +44 1786 467435
Web: http://www.cs.stir.ac.uk/~lss
Blog: http://lestheprof.com
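
The coding idea described in the message above (times of positive-going
zero crossings in a band-limited channel, plus a coarse log-scale code
for the peak size of the preceding half-cycle) can be sketched roughly
in Python as follows. This is only an illustration under stated
assumptions, not the Matlab code or the scheme of the paper: it handles
a single channel that is assumed to be band-limited already (no
gammatone filterbank), and the function name encode_channel, the
default of 4 levels, the 10 dB step and the reference level are all
hypothetical choices.

```python
import numpy as np


def encode_channel(x, fs, n_levels=4, step_db=10.0, ref=1.0):
    """Encode one (already band-limited) channel as (times, levels).

    times  : times in seconds of positive-going zero crossings
    levels : integer code in 0..n_levels-1 for the peak magnitude of the
             negative half-cycle that ends at each crossing
    Hypothetical illustration; parameters are assumptions, not values
    taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    # Positive-going zero crossings: x[i] < 0 and x[i + 1] >= 0.
    idx = np.flatnonzero((x[:-1] < 0) & (x[1:] >= 0))
    # Linear interpolation between the two samples gives a sub-sample time.
    frac = -x[idx] / (x[idx + 1] - x[idx])
    times = (idx + frac) / fs

    levels = np.zeros(len(idx), dtype=int)
    start = 0
    for k, i in enumerate(idx):
        # All negative samples since the previous positive-going crossing
        # belong to the half-cycle that ends at this crossing.
        amp = -np.min(x[start:i + 1])
        # Number of whole step_db steps below the reference level.
        steps_down = np.floor(-20.0 * np.log10(amp / ref) / step_db)
        levels[k] = int(np.clip(n_levels - 1 - steps_down, 0, n_levels - 1))
        start = i + 1
    return times, levels


if __name__ == "__main__":
    fs = 8000
    t = np.arange(800) / fs
    tone = 0.05 * np.sin(2 * np.pi * 100 * t - 0.3)  # test tone, about -26 dB
    times, levels = encode_channel(tone, fs)
    print(np.diff(times))  # spacing between successive crossings
    print(levels)          # one coarse level code per crossing
```

The first level code may be unreliable, since the half-cycle before the
first crossing can be cut off by the start of the signal. Resynthesis
from such a code is the harder part and is not attempted here.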


This message came from the mail archive
src/postings/2021/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University