Subject: Re: perceptual evaluation of cochlear models
From: Mark Cartwright <mcartwright@xxxxxxxx>
Date: Fri, 3 Oct 2014 12:57:31 -0500
List-Archive: <http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Thanks Josh, John, and Rohr! We'll take a look at these.

Cheers,

Mark

On Thu, Oct 2, 2014 at 11:44 AM, Joshua Reiss <joshua.reiss@xxxxxxxx> wrote:
> Hi Mark,
> You probably know about this already, but in these two papers we tried to
> position sources in a stereo mix by reducing the spatial masking. However,
> we didn't use a formal auditory model of spatial masking.
>
> E. Perez Gonzalez and J. D. Reiss, "A real-time semi-autonomous audio
> panning system for music mixing," EURASIP Journal on Advances in Signal
> Processing, vol. 2010, Article ID 436895, pp. 1-10, 2010.
>
> S. Mansbridge, S. Finn and J. D. Reiss, "An Autonomous System for
> Multi-track Stereo Pan Positioning," 133rd AES Convention, San Francisco,
> Oct. 26-29, 2012.
>
>
> From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx>
> on behalf of Mark Cartwright <mcartwright@xxxxxxxx>
> Sent: 02 October 2014 00:36
> To: AUDITORY@xxxxxxxx
> Subject: Re: perceptual evaluation of cochlear models
>
>
> Hello auditory list,
>
>
> So, with regard to Francesco's third item (the role of space), what is the
> latest research on partial loudness and masking as a function of spatial
> position/separation? Does a model exist yet for which an angle of
> separation is input and its effect on partial loudness and masking is
> output (e.g. either a binary yes/no release from masking, or the change to
> the masking threshold, etc.)? Or any similar function...?
>
>
> Thanks!
>
>
> Mark
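[A mapping of the kind Mark asks for does not appear in the thread. As a
concrete starting point, here is a minimal, hypothetical sketch: a smooth
saturating curve for spatial release from masking, in which both the maximum
release (srm_max) and the saturation angle (angle_sat) are placeholder
values, not fitted to any published data.]

import numpy as np

def srm_db(angle_deg, srm_max=12.0, angle_sat=90.0):
    """Toy spatial release from masking: shift of the masked threshold
    (dB) as a function of target/masker angular separation.

    Hypothetical saturating curve: 0 dB release at 0 degrees,
    approaching srm_max near angle_sat. Placeholder shape only."""
    a = np.clip(np.abs(angle_deg), 0.0, 180.0)
    return srm_max * (1.0 - np.exp(-3.0 * a / angle_sat))

# Example: threshold shift at a few separations.
for angle in (0, 15, 45, 90):
    print(f"{angle:3d} deg -> {srm_db(angle):4.1f} dB release")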
>
>
> On Mon, Sep 8, 2014 at 11:44 AM, ftordini@xxxxxxxx <ftordini@xxxxxxxx>
> wrote:
>
>
> Hello Joshua,
> Thank you for the great expansion and for the further reading suggestions.
> I may add three more items to the list, hoping to be clear in my
> formulation.
>
> (1) A perhaps provocative question: is there one loudness, or are there
> several loudnesses? (Is loudness domain-dependent?) Should we continue to
> treat loudness as an invariant percept across classes once we move into
> the more complex domain of real sounds? Rephrasing: once we define an
> ecologically valid taxonomy of real-world sounds (e.g. starting from
> Gaver), can we expect the loudness model we want to improve to be valid
> across (sound) classes? Hard to say. I would venture 'yes', but allowing
> different parameter tunings according to the dominant context (say,
> speech, music, or environmental sounds). [Hidden question: are we ever
> actually purely "naive" listeners?]
>
> (2) A related question: can we jump from the controlled lab environment
> into the wild in a single step? I'd say no. The approach followed by
> EBU/ITU using real-world, long stimuli is highly relevant to the
> broadcasting world, but it is hard to distinguish between energetic and
> informational masking effects using real program material mostly made of
> speech and music. Sets of less informative sources taken from
> environmental, natural sounds may be a good compromise - a starting point
> to address basic expansions of the current loudness model(s). Such
> strategies and datasets are missing (to my knowledge).
>
> (3) The role of space. Physiologically driven models (Moore, Patterson)
> are supported mostly by observations obtained using non-spatialized, or
> dichotic, scenes, chosen to better reveal mechanisms by sorting out the
> spatial confound. However, while spatial cues are considered to play a
> secondary role in scene analysis, spatial release from masking is, on the
> other hand, quite important in partial loudness modeling, at least from
> the energetic masking point of view and especially for complex sources.
> This is even more relevant for asymmetric source distributions. I feel
> there is much to do before we can address this aspect with confidence,
> even limiting the scope to non-moving sources, but more curiosity with
> respect to spatial variables may be valuable when designing listening
> experiments with natural sounds.
> [If you ask a sound engineer working on a movie soundtrack, "Where do you
> start from?", he will start talking about panning, to set the scene using
> his sources (foley, dialogue, music, ...) and **then** adjust levels/EQ
> ...]
>
>
> Best,
> --
> Francesco Tordini
> http://www.cim.mcgill.ca/sre/personnel/
> http://ca.linkedin.com/in/ftordini
>
>
>
>
>
> >----Original message----
> >From: joshua.reiss@xxxxxxxx
> >Date: 06/09/2014 13.43
> >To: "ftordini@xxxxxxxx"<ftordini@xxxxxxxx>,
> "AUDITORY@xxxxxxxx"<AUDITORY@xxxxxxxx>
> >Subject: RE: RE: perceptual evaluation of cochlear models
> >
> >Hi Francesco (and auditory list, in case others are interested),
> >I'm glad to hear that you've been following the intelligent mixing
> research.
> >
> >I'll rephrase your email as a set of related questions...
> >
> >1. Should we extend the concepts of loudness and partial loudness to
> complex material? - Yes, we should. Otherwise, what is it good for? That
> is, what does it matter if we can accurately predict the perceived
> loudness of a pure tone, or the just-noticeable differences between
> pedestal increments for white or pink noise, or the partial loudness of a
> tone in the presence of noise, etc., if we can't predict loudness outside
> artificial laboratory conditions? I suppose it works as validation of an
> auditory model, but it's still very limited.
> >On the other hand, if we can extend the model to complex sounds like
> music, conversations, environmental sounds, etc., then we provide robust
> validation of a general model of human loudness perception. The model can
> then be applied to metering systems, audio production, broadcast
> standards, improved hearing aid design and so on.
> >
> >2. Can we extend the concepts of loudness and partial loudness to complex
> material? - Yes, I think so. Despite all the issues and complexity,
> there's a tremendous amount of consistency in the perception of loudness,
> especially when one considers relative rather than absolute perception.
> Take a TV show and the associated adverts. The soundtracks of both may
> have dialogue, foley, ambience, music, ..., all with levels that vary over
> time. Yet people can consistently identify when the adverts are louder
> than the show. The same is true when someone changes radio stations; and
> in music production, sound engineers are always identifying and dealing
> with masking when there are multiple simultaneous sources.
> >I think that many issues relating to complex material may have a big
> effect on the perception of timbre or the extraction of meaning or
> emotion, but only a minor effect on loudness.
> >
> >3. Can we extend current auditory models of loudness and partial loudness
> to complex material? - Hard to say. The state-of-the-art models based on a
> deep understanding of the human hearing system (Glasberg, Moore et al.;
> Fastl, Zwicker et al.) were not developed with complex material in mind,
> and when they are used with complex material, researchers have reported
> good but far from great agreement with perception. Modification, while
> remaining in agreement with auditory knowledge, shows improvement, but
> more research is needed.
> >On the other hand, we have models based mostly on listening test data,
> but incorporating little auditory knowledge. I'm thinking here of the
> EBU/ITU loudness standards. They are based largely on Gilbert Soulodre's
> excellent listening test results
> >(G. Soulodre, "Evaluation of Objective Loudness Meters," 116th AES
> Convention, 2004), and represent a big improvement on, say, just applying
> a loudness contour to signal RMS. But they generally assume a fixed
> listening level, may overfit the data, are difficult to generalise, and
> rarely give deeper insight into the auditory system. Furthermore, like
> Moore's model, they have also shown some inadequacies when dealing with a
> wider range of content (Pestana, Reiss & Barbosa, "Loudness Measurement of
> Multitrack Audio Content Using Modifications of ITU-R BS.1770," 134th AES
> Convention, 2013).
> >So I think rather than just extend, we may need to modify, improve, and
> go back to the drawing board on some aspects.
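[For readers unfamiliar with the BS.1770 measure discussed above, its core
is compact: a two-stage K pre-filter, a mean square, and a log mapping. The
sketch below is a simplified, ungated, mono version assuming 48 kHz input;
the biquad coefficients are the 48 kHz values given in the standard, while
the full standard adds channel weights and 400 ms gated blocks.]

import numpy as np
from scipy.signal import lfilter

# ITU-R BS.1770 K-weighting at fs = 48 kHz:
# stage 1 is a high-frequency shelf, stage 2 a high-pass (RLB) filter.
STAGE1_B = [1.53512485958697, -2.69169618940638, 1.19839281085285]
STAGE1_A = [1.0, -1.69065929318241, 0.73248077421585]
STAGE2_B = [1.0, -2.0, 1.0]
STAGE2_A = [1.0, -1.99004745483398, 0.99007225036621]

def k_weight(x):
    """Apply the two-stage K pre-filter (48 kHz signals only)."""
    return lfilter(STAGE2_B, STAGE2_A, lfilter(STAGE1_B, STAGE1_A, x))

def loudness_lkfs(x):
    """Ungated BS.1770-style loudness of a mono 48 kHz signal, in LKFS."""
    z = np.mean(k_weight(x) ** 2)  # mean square of the weighted signal
    return -0.691 + 10.0 * np.log10(z)

# Example: a 997 Hz tone at -20 dBFS RMS should read close to -20 LKFS.
t = np.arange(48000) / 48000.0
tone = 0.1 * np.sqrt(2.0) * np.sin(2 * np.pi * 997.0 * t)
print(f"{loudness_lkfs(tone):.1f} LKFS")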
> >
> >4. How could one develop an auditory model of loudness and partial
> loudness for complex material?
> >- Incorporate the validated aspects of prior models, but reassess any
> compromises.
> >- Use listening test results from a wide range of complex material.
> Perhaps a meta-study could be performed, taking listening test results
> from many publications for both model creation and validation.
> >- Build in known aspects of loudness perception that were left out of
> existing models due to resources and the fact that they were built for lab
> scenarios (pure tones, pink noise, sine sweeps...). In particular, I'm
> thinking of forward and backward masking.
> >
> >5. What about JNDs? - I would stay clear of this. I'm not even aware of
> anecdotal evidence suggesting consistency in just-noticeable differences
> for, say, a small change in the level of one source in a mix. And I think
> one can be trained to identify small partial loudness differences. I've
> had conversations with professional mixing engineers who detect a problem
> with a mix that I don't notice until they point it out. But the concept of
> extending JND models to complex material is certainly very interesting.
> >
> >________________________________________
> >From: ftordini@xxxxxxxx <ftordini@xxxxxxxx>
> >Sent: 04 September 2014 15:45
> >To: Joshua Reiss
> >Subject: R: RE: perceptual evaluation of cochlear models
> >
> >Hello Joshua,
> >Interesting, indeed. Thank you.
> >
> >So the question is - to what extent can we stretch the concepts of
> loudness
> >and partial loudness for complex material such as meaningful noise (aka
> music),
> >where attention and preference are likely to play a role, as opposed to
> beeps and
> >sweeps? That is - would you feel comfortable giving a rule of thumb for a
> >JND for partial loudness, to safely rule out other factors when mixing?
> >
> >I was following your intelligent mixing thread - although I've missed the
> >recent one you sent me - and my question above relates to the possibility
> of
> >actually "designing" the fore-/background perception when you do
> automatic mixing
> >using real sounds...
> >I would greatly appreciate any comment from your side.
> >
> >Best wishes,
> >Francesco
> >
> >
> >>----Original message----
> >>From: joshua.reiss@xxxxxxxx
> >>Date: 03/09/2014 16.00
> >>To: "AUDITORY@xxxxxxxx"<AUDITORY@xxxxxxxx>, "Joachim Thiemann"
> ><joachim.thiemann@xxxxxxxx>, "ftordini@xxxxxxxx"<ftordini@xxxxxxxx>
> >>Subject: RE: perceptual evaluation of cochlear models
> >>
> >>Hi Francesco and Joachim,
> >>I collaborated on a paper that involved perceptual evaluation of partial
> >loudness with real-world audio content, where partial loudness is derived
> from
> >the auditory models of Moore, Glasberg et al. It showed that the
> predicted
> >loudness of tracks in multitrack musical audio disagrees with perception,
> but
> >that minor modifications to a couple of parameters in the model would
> result in
> >a much closer match to perceptual evaluation results. See
> >>Z. Ma, J. D. Reiss and D. Black, "Partial Loudness in Multitrack
> Mixing," AES
> >53rd International Conference on Semantic Audio, London, UK, January
> 27-29, 2014.
> >>
> >>And in the following paper, there was some informal evaluation of
> whether
> >Glasberg, Moore et al.'s auditory model of loudness and/or partial
> loudness
> >could be used to mix multitrack musical audio. Though the emphasis was on
> >application rather than evaluation, it also noted issues with the model
> when
> >applied to real-world content. See
> >>D. Ward, J. D. Reiss and C. Athwal, "Multitrack Mixing Using a Model of
> >Loudness and Partial Loudness," 133rd AES Convention, San Francisco,
> Oct. 26-29, 2012.
> >>
> >>These may not be exactly what you're looking for, but hopefully you find
> them
> >interesting.
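[As a rough illustration of the partial loudness idea behind the papers
above: compute per-band excitations for the target and for everything else
in the mix, pass them through a compressive nonlinearity, and take the
loudness of the mix minus the loudness of the masker alone. This is a
deliberate caricature of the Moore/Glasberg formulation - no
absolute-threshold terms, an arbitrary exponent alpha - meant only to show
the structure of the computation, not to reproduce the published model.]

import numpy as np

def erb_space(lo=50.0, hi=15000.0, n=40):
    """Centre frequencies equally spaced on the ERB-rate scale
    (Glasberg & Moore: ERB-rate = 21.4 * log10(0.00437 * f + 1))."""
    rate = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)
    steps = np.linspace(rate(lo), rate(hi), n)
    return (10.0 ** (steps / 21.4) - 1.0) / 4.37e-3

def partial_loudness(e_target, e_masker, alpha=0.2):
    """Loudness of (target + masker) minus loudness of the masker alone,
    summed over bands; e_* are per-band excitation powers."""
    return np.sum((e_target + e_masker) ** alpha - e_masker ** alpha)

# Example: a low-frequency target under a flat broadband masker.
bands = erb_space()
e_t = np.where(bands < 1000.0, 1e6, 0.0)  # target excitation per band
e_m = np.full_like(bands, 1e5)            # masker excitation per band
print(f"partial loudness (arbitrary units): {partial_loudness(e_t, e_m):.1f}")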
> >>________________________________________
> >>From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxx>
> >on behalf of Joachim Thiemann <joachim.thiemann@xxxxxxxx>
> >>Sent: 03 September 2014 07:12
> >>To: AUDITORY@xxxxxxxx
> >>Subject: Re: perceptual evaluation of cochlear models
> >>
> >>Hello Francesco,
> >>
> >>McGill alumnus here - I did a bit of study in this direction; you can
> >>read about it in my thesis:
> >>http://www-mmsp.ece.mcgill.ca/MMSP/Theses/T2011-2013.html#Thiemann
> >>
> >>My argument was that if you have a good auditory model, you should be
> >>able to start from only the model parameters and be able to
> >>reconstruct the original signal with perceptual transparency. I was
> >>looking at this in the context of perceptual coding - a perceptual
> >>coder minus the entropy stage effectively verifies the model. If
> >>artefacts do appear, they can (indirectly) tell you what you are
> >>missing.
> >>
> >>I was specifically looking at gammatone filterbank methods, so there
> >>is no comparison to other schemas - but I hope it is a bit in the
> >>direction you're looking at.
> >>
> >>Cheers,
> >>Joachim.
> >>
> >>On 2 September 2014 20:39, ftordini@xxxxxxxx <ftordini@xxxxxxxx> wrote:
> >>>
> >>> Dear List members,
> >>> I am looking for references on the perceptual evaluation of cochlear
> models -
> >>> taken from an analysis-synthesis point of view, like the work
> introduced in
> >>> Hohmann 2002 ("Frequency analysis and synthesis using a Gammatone
> filterbank," §4.3).
> >>>
> >>> Are you aware of any study that tried to assess the performance of
> >>> gammatone-like filterbanks used as a synthesis model? (I.e., what are
> the
> >>> advantages over MPEG-like schemas?)
> >>>
> >>> All the best,
> >>> Francesco
> >>>
> >>> http://www.cim.mcgill.ca/sre/personnel/
> >>> http://ca.linkedin.com/in/ftordini
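[A minimal sketch of the analysis-resynthesis test Joachim describes,
assuming FIR gammatone kernels and naive summation of the band signals for
synthesis. Hohmann 2002 adds per-band delay and phase alignment, which is
exactly what this naive version lacks; the resulting reconstruction error
is the kind of artefact that reveals what the model is missing. All
parameter choices here are illustrative.]

import numpy as np

def gammatone_fir(fc, fs, dur=0.032, order=4):
    """FIR gammatone kernel t**(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t),
    with b = 1.019 * ERB(fc) (Patterson/Holdsworth), unit energy."""
    t = np.arange(1, int(dur * fs)) / fs
    erb = 24.7 * (4.37e-3 * fc + 1.0)
    g = (t ** (order - 1) * np.exp(-2 * np.pi * 1.019 * erb * t)
         * np.cos(2 * np.pi * fc * t))
    return g / np.sqrt(np.sum(g ** 2))

def analyse(x, fcs, fs):
    """Analysis: one band-limited signal per centre frequency."""
    return [np.convolve(x, gammatone_fir(fc, fs), mode="same") for fc in fcs]

def resynthesise(bands):
    """Naive synthesis: just sum the bands (no phase/delay alignment)."""
    return np.sum(bands, axis=0)

fs = 16000
fcs = np.geomspace(100.0, 6000.0, 30)  # 30 log-spaced channels
x = np.random.randn(fs)                # 1 s of noise as a test signal
y = resynthesise(analyse(x, fcs, fs))
g = np.dot(x, y) / np.dot(y, y)        # best overall gain before comparing
snr = 10 * np.log10(np.sum(x ** 2) / np.sum((x - g * y) ** 2))
print(f"reconstruction SNR: {snr:.1f} dB")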