Re: A new paradigm?(On pitch and periodicity (was "correction to post")) (James Johnston)


Subject: Re: A new paradigm?(On pitch and periodicity (was "correction to post"))
From:    James Johnston  <audioskeptic@xxxxxxxx>
Date:    Tue, 6 Sep 2011 22:12:06 -0700
List-Archive:<http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

That's interesting.

One of the best "tonality estimators" ever written (in order to make good masking models for coders you need some estimate of the tone vs. noise character per critical band, well, actually per 1/3 band, but anyhow) was a magnitude-phase predictor (polynomial forward predictor 1 -3 3) that predicted magnitude and phase separately, then took the actual data for the current block from an FFT and measured the error divided by the two magnitudes of the current and predicted.

This worked remarkably well for estimating masking thresholds that were for neither pure tone nor pure noise.

It was very, very ad-hoc. I invented it after trying about a gazillion things.

But, now, the comments from Ranjit seem, to me, to have something in common here.

Which is interesting. I am not going to assert that either is right, but there is a convergence of 'things that work'.

jj

Sorry, sent only to Richard first time :(

On Tue, Sep 6, 2011 at 7:29 PM, Richard F. Lyon <DickLyon@xxxxxxxx> wrote:

> Ranjit,
>
> I commend you on your bravery.
>
> Dick
>
>
> At 12:53 PM -0400 9/6/11, Ranjit Randhawa wrote:
>
>> Dick,
>> I felt that it would be not only negligent on my part but also cowardly to end this thread without at least offering a possible approach to a new "paradigm", even if I am not capable of describing it in excruciating detail at the moment. So, with the greatest trepidation, here goes.
>> If one were to consider a pure sinusoid in the phase domain (one where the axes are x(t) and dx(t)/dt), the locus would be a circle. The area of this circle would give us the magnitude, though how to determine this requires a different approach as the integration over 2pi would be zero.
>> If we consider the product x(t)*dx(t)/dt as the rate of change of energy, it has a sign associated with it, and it then becomes possible to determine this area, though the resulting algorithm would be too simple and would fall apart for more complex signals since we don't know the period. To get a more general approach, it would be better to consider the circle in sectors of harmonically increasing sizes, thereby converting the sinusoid signal to a harmonic series; the area of each sector becomes the magnitude of a related tonal harmonic, with the smaller sectors associated with magnitudes of higher frequencies.
>> We see then that we start with a single sinusoid being considered not as a single-valued entity but as a harmonic series, and this immediately answers two questions. The first is the reported psycho-acoustical behavior whereby some people have indicated an ability to recognize harmonics of a pure tone, shown by others to be possible by using beats. The second is that the extent of the traveling wave can have an explanation, in that the stiffness of the BM would limit activity for higher frequencies, as these frequencies would have smaller areas for lower strength signals, and the TW would grow as this strength increases.
>> More importantly, since we are directly using energy to determine magnitudes, the missing fundamental would show a magnitude and this magnitude would vary depending upon the relative phases of the components. This approach also provides for a much reduced computational method to determine the period, an alternative form of auto-correlation. These two assertions are based on the method chosen for determining the harmonic series, and the one chosen by me was picked from the field of psychology and called "evaluative bivalence", whereby one makes use of the sign associated with the rate of change of energy in the summation process.
>> The modified auto-correlation does work rather well for quasi-periodic signals and I would welcome any suggestions for practical use, since I don't believe it is used by the auditory system. I believe it would fail in the "party room" environment. Other options are possible based on the summation process.
>> There are many consequences of this approach, as it now becomes possible to provide a more exact method to explain source location capabilities, pitch explanations, and I would like to say cochlear functions but have to admit my knowledge at that level is focused only on what has been reported on the behavior of the Traveling Wave. The rest of it is a mystery to me.
>> I would like to apologize if this blurb causes some kind of angst among some in this LIST. It was not the intent. I simply wanted to show that sticking with existing mathematics has not made much progress in being able to explain our original discussion, and that was "The Case of the Missing Fundamental". Thanks for your understanding and kindness and sorry for this delay,
>> Randy Randhawa
>>
>>
>>
>>
>> On 8/4/2011 1:42 PM, Richard F. Lyon wrote:
>>
>>> Randy,
>>>
>>> I'll be the first to agree that linear systems theory is sometimes stretched beyond where it makes sense, and that you need to use nonlinear descriptions to describe pitch perception and most other aspects of hearing, and more so when you get up to cognitive levels.
>>>
>>> I'm sorry to hear that you "gave up on linear systems", because I don't think it's possible to do much sensible with nonlinear systems when you don't have linear systems as a solid base to build on. Certainly at the level of HRTFs, cochlear function, and pitch perception models, a solid understanding of linear systems theory is an indispensable prerequisite. Then, the nonlinear modifications needed to make better models will seem less "tortured".
>>>
>>> Dick
>>>
>>> At 10:33 AM -0400 8/4/11, Ranjit Randhawa wrote:
>>>
>>>> Dear Dick,
>>>> While linear system theories seem to work reasonably well with mechanical systems, I believe they fail when applied to biological systems. Consider that even Helmholtz had to appeal to non-linear processes (never really described) in the auditory system to account for the "missing fundamental" and "combination tones". Both of these psycho-acoustical phenomena have been well established, and explanations for pitch perception are either spectral based or time based, with some throwing in learning and cognition to avoid having to make the harder decision that maybe this field needs a new paradigm. This new paradigm should be able to provide a better model that explains frequency (sound!) analysis in a fashion such that nothing is missing and parameter values can be calculated to explain pitch salience, a subject that seems to be never discussed in pitch perception models.
>>>> Furthermore, such a new approach should also be able to explain why the cochlea is the shape it is, which as far as I can see has never been touched upon by existing signal processing methods. Finally, are these missing components "illusions" that are filled in, so to speak, by our higher level cognitive capabilities? It is remarkable that this so-called filling-in process is as robust as it is, to be more or less common to everyone, and therefore one wonders if all the other illusions are really not illusions but may have a perfectly good basis for their existence. If they were "illusions" one would expect a fair amount of variation in the psycho-acoustic experimental results, I would think.
>>>> I myself gave up on linear systems early in my study of this field and have felt that other systems, e.g.
>>>> switching, may offer a better future explanatory capability, especially when it comes to showing some commonality of signal processing between the visual and the auditory system. To this end, I am quite happy to accept that I do not consider myself an expert in linear system theory.
>>>> Regards,
>>>> Randy Randhawa
>>>>
>>>>
>>>> On 8/2/2011 1:49 PM, Richard F. Lyon wrote:
>>>>
>>>>> At 5:55 PM +0300 8/2/11, ita katz wrote:
>>>>>
>>>>>> The periodicity is determined by the least-common-multiple of the periodicities of the present harmonics, so if (for example) a sound is composed of sines of frequencies 200Hz, 300Hz, and 400Hz, the periods are 5msec, 3 1/3msec, and 2.5msec, so the least-common-multiple is 10msec (2 periods of 5msec, 3 periods of 3.33msec, and 4 periods of 2.5msec), which is of course the periodicity of the sum of the sines, or in other words 100Hz. (actually it is the same as the greatest-common-divisor of the frequencies).
>>>>>>
>>>>>
>>>>> Ita, that explanation is sort of OK, but as written implies that the auditory system has the ability to do number-theory operations on periods (or frequencies), and depends on there being harmonics present and separately measurable.
>>>>>
>>>>> It would be much more robust to say that "The pitch is determined based on an approximately common periodicity of outputs of the cochlea," which I believe is consistent with your intent.
>>>>>
>>>>> Why is this better? First, it doesn't say the periodicity is determined; what is determined is the pitch (even that is a bit of a stretch, but let's go with it). Second, it doesn't depend on whether the signal is periodic, that is, whether harmonics exist. Third, it doesn't depend on being able to isolate and separately characterize components, harmonic or otherwise.
>>>>> Fourth, it doesn't need "multiples" (or divisors), but relies on the property of periodicity that a signal with a given period is also periodic at multiples of that period, so it only needs to look for "common" periodicities--which doesn't require any arithmetic, just simple neural circuits. Fifth, it admits approximation, so that things like "the strike note of a chime" and noise-based pitch can be accommodated. Sixth, it recognizes that the cochlea has a role in pitch perception. It's still not complete or perfect, but I think it presents a better picture of how it actually works, in a form that can be realistically modeled.
>>>>>
>>>>> Is this "tortured use of existing signal processing techniques" as Randy puts it? I don't think so. Is it "a unique way to do frequency analysis and to meet the dictum in biology that 'form follows function'"? Sure, why not? But why call it "frequency analysis"? How about "a unique way to do sound analysis" (if by "unique" we mean common to many animals)?
>>>>>
>>>>> I do have some sympathy for Randy's concern that we are far from a complete understanding, and that hearing aids are not as good as they would be if we understood better, but yes, he sounds way too harsh in overblowing it so. I'm wondering what's behind that, and whether it's just confusion about all the confusing literature on pitch perception, which I agree is a complicated mess -- or is the problem, indicated by Randy's previous posts, just that he doesn't understand basic linear systems and signal processing, and that's why it all seems "tortured"?
>>>>>
>>>>> Dick
>>>>>
>>>>

--
James D. (jj) Johnston
Independent Audio and Electroacoustics Consultant
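
The magnitude-phase predictor described at the top of this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the original coder implementation: it assumes the "1 -3 3" weights apply to the three previous blocks from oldest to newest, that "divided by the two magnitudes" means divided by the sum of the current and predicted magnitudes, and the function names `predict` and `unpredictability` are made up for this sketch. The inputs are the complex FFT values of a single bin in successive blocks.

```python
import cmath


def predict(prev3):
    """Extrapolate one block ahead from three previous values.

    prev3 is (oldest, middle, newest). The weights (1, -3, 3) are
    assumed to be the quadratic forward predictor:
        x_hat = x[t-3] - 3*x[t-2] + 3*x[t-1]
    """
    oldest, middle, newest = prev3
    return oldest - 3.0 * middle + 3.0 * newest


def unpredictability(history, current):
    """Prediction error for one FFT bin, normalised to the range 0..1.

    history: complex values of the bin in the three previous blocks,
             oldest first.
    current: complex value of the same bin in the current block.
    Magnitude and phase are predicted separately, as in the message.
    The normalisation by the SUM of magnitudes is an assumption.
    """
    mag_hat = predict([abs(z) for z in history])
    # Integer weights mean 2*pi phase wraps cancel in the prediction.
    phase_hat = predict([cmath.phase(z) for z in history])
    predicted = cmath.rect(mag_hat, phase_hat)
    denom = abs(current) + abs(mag_hat)
    if denom == 0.0:
        return 0.0
    return abs(current - predicted) / denom


# A steady tone: constant magnitude, phase advancing by a fixed step.
tone = [cmath.rect(2.0, 0.3 + 2.0 * k) for k in range(4)]
tone_score = unpredictability(tone[:3], tone[3])

# A bin that suddenly flips sign is as unpredictable as this measure allows.
flip_score = unpredictability([1 + 0j, 1 + 0j, 1 + 0j], -1 + 0j)
```

With this normalisation a stationary sinusoid scores near 0 and an unpredictable bin scores near 1, which is the tone-versus-noise scale the message refers to. How such a score would be mapped onto a masking threshold is not described in the thread and is left out here.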

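
For reference, the arithmetic in Ita Katz's quoted example (200, 300 and 400 Hz sharing a 10 msec period) can be checked with a few lines. This is only a sketch of the number-theory view, which Dick Lyon's reply argues is not how the auditory system works; it assumes integer frequencies in Hz, and the helper names are hypothetical.

```python
from fractions import Fraction
from math import gcd


def common_fundamental_hz(freqs_hz):
    """Greatest common divisor of integer component frequencies, in Hz."""
    g = 0
    for f in freqs_hz:
        g = gcd(g, f)
    return g


def common_period(freqs_hz):
    """Shortest period shared by all components, in seconds (exact)."""
    return Fraction(1, common_fundamental_hz(freqs_hz))


components = [200, 300, 400]
f0 = common_fundamental_hz(components)       # 100 Hz
period = common_period(components)           # 1/100 s, i.e. 10 msec
cycles = [f * period for f in components]    # 2, 3 and 4 whole cycles
```

The least common multiple of the periods and the greatest common divisor of the frequencies give the same answer, as the quoted text notes. The approach breaks down as soon as the components are only approximately harmonic, which is one of the points raised in the reply.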

This message came from the mail archive
/var/www/postings/2011/
maintained by:
Dan Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University