Subject: Re: About importance of "phase" in sound recognition
From: "reinifrosch@xxxxxxxx" <reinifrosch@xxxxxxxx>
Date: Thu, 7 Oct 2010 09:33:40 +0000
List-Archive: <http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Hello (again) Emad,

Today I redid the synthesizer experiment described yesterday, not with sine tones, but with harmonic complex tones (voice C25, harmonica).

First case: press white key C1 (440.0 Hz), keep it pressed, then press black key C#1 (440.0 Hz, too). Now, from one try to the next, both the loudness and the timbre changed strongly. The period of a 440-Hz tone is T = 1/440 = 0.002272727... second. If, e.g., the delay between the two tones happens to amount to 238.5 T = 0.542045 second, or to 391.5 T = 0.889773 second, then the fundamental (also called first partial or first harmonic) is extinguished, but the first overtone (the second partial or harmonic) is enhanced.

Second case: press key C1 (440.0 Hz), keep it pressed, then press key D1 (441.6 Hz). There is no difference from one try to the next (1.6 beats per second).

Reinhart.

----Original message----
From: emad.burke@xxxxxxxx
Date: 05.10.2010 19:03
To: <AUDITORY@xxxxxxxx>
Subject: Re: About importance of "phase" in sound recognition

Hi Kevin,

Thanks for the reply. The phase definition I am talking about is closest to your third definition. I am talking precisely about what is called "insensitivity to phase": the phase information that is discarded in MFCC feature extraction, which has nevertheless proven to be a successful feature set for speech recognition. This "insensitivity to phase" implies that if you change the order (precedence) of the travelling waves in the cochlear channels relative to each other, perception will not be affected, and that you can add random phases to different channels without affecting perception(?).

That was on the one hand. On the other hand, a couple of years ago there was a publication by a mathematician (Pete Casazza) that reinforced the argument for the phase insensitivity of speech recognition, this time mathematically. Very briefly, it states that if you have a redundant set of magnitude coefficients, then phase does not matter at all; as the paper says, this mathematically confirms the belief held over the years in the speech recognition community about phase insensitivity, ...

There are also some papers on the opposing side. This is basically the source of my confusion.

Emad
[...]

------------------------------------------
Reinhart Frosch,
Dr. phil. nat.,
CH-5200 Brugg.
reinifrosch@xxxxxxxx
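
The arithmetic behind Reinhart's first case can be checked with a small sketch. It is not part of the original message and rests on idealized assumptions: both tones are steady-state, strictly harmonic, and have equal unit amplitude in every partial. The function name `summed_partial_amplitude` is hypothetical. Under those assumptions, partial n of the delayed tone is shifted in phase by 2*pi*n*f0*delay, and two unit phasors with phase difference phi add up to an amplitude of 2*|cos(phi/2)|.

```python
import math

def summed_partial_amplitude(n, f0, delay):
    """Amplitude of partial n when two equal, unit-amplitude copies of a
    harmonic tone (fundamental f0 in Hz) are added with a delay in seconds.
    Idealized assumption: steady-state tones with identical spectra."""
    # Phase difference of partial n between the direct and the delayed tone.
    phase_difference = 2.0 * math.pi * n * f0 * delay
    # Two unit phasors with phase difference phi sum to 2*|cos(phi/2)|.
    return 2.0 * abs(math.cos(phase_difference / 2.0))

f0 = 440.0          # frequency of both keys in the first case (Hz)
T = 1.0 / f0        # period of the 440-Hz tone (s)

# Delays quoted in the message: 238.5 T and 391.5 T.
for periods in (238.5, 391.5):
    delay = periods * T
    amplitudes = [summed_partial_amplitude(n, f0, delay) for n in (1, 2, 3, 4)]
    print(f"delay = {delay:.6f} s:", [round(a, 6) for a in amplitudes])

# Second case: the beat rate is the difference of the two key frequencies.
beat_rate = 441.6 - 440.0
print(f"beat rate = {beat_rate:.1f} per second")
```

For a delay of an odd number of half periods, the odd partials (including the fundamental) come out near 0 and the even partials near 2, which matches the extinguished fundamental and enhanced second partial described above. In the second case the relative phase drifts continuously at the beat rate, so the key-press delay only sets the starting point of the beat cycle, consistent with hearing no difference from one try to the next.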