Subject: Re: [AUDITORY] Logan's theorem - a challenge
From: Ken Grant <ken.w.grant@xxxxxxxx>
Date: Tue, 28 Sep 2021 04:43:32 -0400

This discussion reminded me of an important and often overlooked paper by Ron Cole and Brian Scott:

Cole, R. A., & Scott, B. (1974). Toward a theory of speech perception. Psychological Review, 81(4), 348–374. https://doi.org/10.1037/h0036656

V/r

Ken

On Tue, Sep 28, 2021 at 2:42 AM Prof Leslie Smith <l.s.smith@xxxxxxxx> wrote:

> I sent this originally to Alain de Cheveigne, but perhaps I should have made
> it more public. Here goes.
>
> Dear Alain:
>
> I did some related work with my student Madhuranda Pahar some while ago:
> it ended up with the publication linked to below.
>
> What we did was to resynthesize speech (or any other sound) from
> zero-crossings (positive-going only) in band-limited signals (using the
> gammatone filterbank) plus some information about the maximal size of the
> signal in the previous half-cycle.
>
> In essence, given a surprisingly small number of channels, plus a little
> information about the signal level (i.e. a log-based coding of the signal
> amplitude in the previous half-cycle, using 4 or 5 values - called
> threshold levels in the paper), one can quite easily make out the speech.
>
> It's not a wonderful paper, and could do with more work and more examples,
> and the resynthesis is not particularly straightforward (but that's not
> important - what matters is the possibility of resynthesis, as the brain
> interprets the AN signal, rather than re-creating it). And we'd never heard
> of Logan's theorem (unfortunately!).
>
> Still, I hope this might be of interest. I believe I have the Matlab code
> still (but it could do with being reworked).
>
> The paper can be found at
> http://www.cs.stir.ac.uk/~lss/recentpapers/PID6701133.pdf
>
> Reference: M. Pahar, L.S. Smith, Coding and Decoding Speech using a
> Biologically Inspired Coding System,
> presented at IEEE SSCI 2020 (virtual conference), 1-4 December 2020. DOI
> 10.1109/SSCI47803.2020.9308328.
>
> --Leslie Smith
>
> Alain de Cheveigne wrote:
> > Hi all,
> >
> > Here's a challenge for the young nimble minds on this list, and the old
> > and wise.
> >
> > Logan's theorem states that a signal can be reconstructed from its zero
> > crossings, to a scale, as long as the spectral representation of that
> > signal is less than an octave wide. It sounds like magic given that zero
> > crossing information is so crude. How can the full signal be recovered
> > from a sparse series of time values (with signs but no amplitudes)?
> > "Band-limited" is clearly a powerful assumption.
> >
> > Why is this of interest in the auditory context? The band-limited premise
> > is approximately valid for each channel of the cochlear filterbank
> > (sometimes characterized as a 1/3 octave filter). While cochlear
> > transduction is non-linear, Logan's theorem suggests that any
> > information lost due to that non-linearity can be restored, within each
> > channel. If so, cochlear transduction is "transparent", which is
> > encouraging for those who like to speculate about neural models of
> > auditory processing. An algorithm applicable to the sound waveform can be
> > implemented by the brain with similar results, in principle.
> >
> > Logan's theorem has been invoked by David Marr for vision and several
> > authors for hearing (some refs below).
> > The theorem is unclear as to how
> > the original signal should be reconstructed, which is an obstacle to
> > formulating concrete models, but in these days of machine learning it
> > might be OK to assume that the system can somehow learn to use the
> > information, granted that it's there. The hypothesis has far-reaching
> > implications; for example, it implies that the spectral resolution of central
> > auditory processing is not limited by peripheral frequency analysis (as
> > already assumed by, for example, phase opponency or lateral inhibitory
> > hypotheses).
> >
> > Before venturing further along this limb, it's worth considering some
> > issues. First, Logan made clear that his theorem only applies to a
> > perfectly band-limited signal, and might not be "approximately valid"
> > for a signal that is "approximately band-limited". No practical
> > signal is band-limited, if only because it must be time-limited, and thus
> > the theorem might conceivably not be applicable at all. On the other
> > hand, half-wave rectification offers much richer information than zero
> > crossings, so perhaps the end result is valid (information preserved) even
> > if the theorem is not applicable stricto sensu. Second, there are many
> > other imperfections such as adaptation, stochastic sampling to a
> > spike-based representation, and so on, that might affect the usefulness of
> > the hypothesis.
> >
> > The challenge is to address some of these loose ends. For example:
> > (1) Can the theorem be extended to make use of a half-wave-rectified signal
> > rather than zero crossings? Might that allow it to be applicable to
> > practical time-limited signals?
> > (2) What is the impact of real cochlear filter characteristics,
> > adaptation, or stochastic sampling?
> > (3) In what sense can one say that the acoustic signal is "available" to
> > neural signal processing? What are the limits of that concept?
> > (4) Can all this be formulated in a way intelligible to non-mathematical
> > auditory scientists?
> >
> > This is the challenge. The reward is - possibly - a better understanding
> > of how our brain hears the world.
> >
> > Alain
> >
> > ---
> > Logan BF, Jr. (1977) Information in the zero crossings of bandpass
> > signals. Bell Syst. Tech. J. 56:487–510.
> >
> > Marr, D. (1982) VISION - A Computational Investigation into the Human
> > Representation and Processing of Visual Information. W.H. Freeman and Co,
> > republished by MIT Press 2010.
> >
> > Heinz, M.G., Swaminathan J. (2009) Quantifying Envelope and Fine-Structure
> > Coding in Auditory Nerve Responses to Chimaeric Speech, JARO 10:407–423.
> > DOI: 10.1007/s10162-009-0169-8.
> >
> > Shamma, S, Lorenzi, C (2013) On the balance of envelope and temporal fine
> > structure in the encoding of speech in the early auditory system, J.
> > Acoust. Soc. Am. 133, 2818–2833.
> >
> > Parida S, Bharadwaj H, Heinz MG (2021) Spectrally specific temporal
> > analyses of spike-train responses to complex sounds: A unifying framework.
> > PLoS Comput Biol 17(2): e1008155.
> > https://doi.org/10.1371/journal.pcbi.1008155
> >
> > de Cheveigné, A. (in press) Harmonic Cancellation, a Fundamental of
> > Auditory Scene Analysis. Trends in Hearing (https://psyarxiv.com/b8e5w/).
> >
>
> --
> Prof Leslie Smith (Emeritus)
> Computing Science & Mathematics,
> University of Stirling, Stirling FK9 4LA
> Scotland, UK
> Tel +44 1786 467435
> Web: http://www.cs.stir.ac.uk/~lss
> Blog: http://lestheprof.com
>

--
Ken W. Grant, Ph.D.
Chief, Scientific and Clinical Studies Section
America Building, Room 5601
Walter Reed National Military Medical Center
4954 North Palmer Road
Bethesda, MD 20889-5630

OFFICE: 301-319-7043
CELL: 301-919-2957

kenneth.w.grant.civ@xxxxxxxx
ken.w.grant@xxxxxxxx