
Re: sinfa using matlab



Dear All,

My comment is not about HOW to get SINFA working, but WHY you would want to get it working.

Since 1973 we have learned a great deal about phone identification by normal-hearing and hearing-impaired listeners. Bob Bilger was a good friend, and his work represented
an important stepping stone along the path toward building a realistic and correct understanding of human speech processing. But today, in my view, SINFA is not a viable
way to analyze human speech errors. One of the problems with the 1973 analysis was due to the limitations of the computers of that time: all the responses were averaged over
the two main effects, tokens and SNR. This renders the results uninterpretable.

Please share with us your thoughts on what the best methods are today, given what we now know. And I would be happy to do the same.

My view:

I would suggest you look at the alternatives, such as confusion patterns (a confusion pattern is one row of a confusion matrix, viewed as a function of SNR), and, most importantly, go down to
the token level. It is time to give up on distinctive features. They are a production concept, great at classifying different types of speech productions, but they
do not properly get at what human listeners do, especially those with hearing loss, when reporting individual consonants. Bilger and Wang make these points in their JSHR article.
They emphasize individual differences of HI listeners (p. 737), and the secondary role of distinctive features (p. 724) and of hearing level (p. 737). I do not think that multidimensional scaling can give the answers to these questions, as it only works for a limited number of dimensions (2 or 3). Actual confusion data, as a function of SNR, are too complex for a two- or three-dimensional analysis.
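
To make the idea of a confusion pattern concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the data layout (response counts per SNR for one spoken token), the function name, and the example counts are all hypothetical and are not taken from any of the papers cited in this thread.

```python
from typing import Dict, List

# Hypothetical data layout: counts[snr_db][response] = number of times
# listeners reported `response` when ONE spoken token was played at snr_db.
TokenCounts = Dict[float, Dict[str, int]]


def confusion_pattern(counts: TokenCounts) -> Dict[str, List[float]]:
    """For each response label, return its probability at each SNR (ascending)."""
    snrs = sorted(counts)
    labels = sorted({resp for row in counts.values() for resp in row})
    pattern: Dict[str, List[float]] = {label: [] for label in labels}
    for snr in snrs:
        row = counts[snr]
        total = sum(row.values())
        for label in labels:
            # An SNR with no trials yields NaN rather than a misleading zero.
            p = row.get(label, 0) / total if total else float("nan")
            pattern[label].append(p)
    return pattern


# Made-up counts for a single /ta/ token, purely for illustration.
example: TokenCounts = {
    -18.0: {"pa": 6, "ta": 8, "ka": 6},
    -12.0: {"pa": 3, "ta": 14, "ka": 3},
    -6.0: {"pa": 1, "ta": 19, "ka": 0},
    0.0: {"ta": 20},
}

pattern = confusion_pattern(example)
for label, probs in pattern.items():
    print(label, [round(p, 2) for p in probs])
```

Each list is one curve of the pattern: the probability of a given response as the SNR increases. Keeping one such table per token, rather than pooling across tokens, is what preserves the token-level detail argued for above.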

Here are some pointers I suggest you consider; they describe how humans decode CV sounds as a function of SNR.

The Singh analysis explains why and how the articulation index (AI) works.
The Trevino article shows the very large differences in consonant perception in impaired ears. Hearing loss leads to large individual differences that are uncorrelated with hearing thresholds.
The Toscano article is a good place to start.

  • Toscano, Joseph and Allen, Jont B (2014). "Across and within consonant errors for isolated syllables in noise," Journal of Speech, Language, and Hearing Research, Vol 57, pp. 2293-2307; doi:10.1044/2014_JSLHR-H-13-0244

  • Trevino, Andrea C and Allen, Jont B (2012). "Within-Consonant Perceptual Differences in the Hearing Impaired Ear," JASA v134(1); Jul, 2013, pp. 607-617

  • Singh, Riya and Allen, Jont (2012). "The influence of stop consonants' perceptual features on the Articulation Index model," J. Acoust. Soc. Am., Apr, v131, pp. 3051-3068


  • These two publications (listed below) describe the speech cues normal-hearing listeners use when decoding CV sounds. Each token has a threshold we call SNR_90, defined as the SNR where the errors go from zero to 10%. Most speech sounds operate below the Shannon channel capacity limit, where there are zero errors, until the SNR drops to the token error threshold.

    Distinctive features are not a good description of phone perception. The real speech cues are revealed in these papers, and each token has an SNR_90. Bilger and Wang discuss this problem on page 724 of their 1973 JSHR article.

  • Li, F., Trevino, A., Menon, A. and Allen, Jont B (2012). "A psychoacoustic method for studying the necessary and sufficient perceptual cues of American English fricative consonants in noise," J. Acoust. Soc. Am., v132(4), Oct, pp. 2663-2675
  • Li, F., Menon, A. and Allen, Jont B (2010). "A psychoacoustic method to find the perceptual cues of stop consonants in natural speech," J. Acoust. Soc. Am., Apr, pp. 2599-2610

  • If you want to see another view, other than mine, read this, for starters:

    Zaar, Dau, 2015, JASA vol 138, pp 1253-1267

    http://scitation.aip.org/content/asa/journal/jasa/138/3/10.1121/1.4928142
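
As a rough illustration of the SNR_90 threshold mentioned above (the SNR where a token's errors reach 10%, i.e. 90% correct), here is a minimal Python sketch. The function name, the use of linear interpolation between measured SNRs, and the example numbers are my own assumptions for illustration; the cited papers may estimate the threshold differently.

```python
from typing import Optional, Sequence


def snr_90(snrs_db: Sequence[float], p_correct: Sequence[float],
           criterion: float = 0.9) -> Optional[float]:
    """Estimate the SNR (dB) at which proportion correct crosses `criterion`.

    Scans from the highest SNR downward and linearly interpolates across the
    first pair of measured points that straddles the criterion (linear
    interpolation is an assumption of this sketch). Returns None when the
    token is below criterion at the best SNR, or never falls below it within
    the measured range.
    """
    points = sorted(zip(snrs_db, p_correct), reverse=True)  # high SNR first
    if not points or points[0][1] < criterion:
        return None  # not at criterion even at the most favorable SNR
    for (snr_hi, p_hi), (snr_lo, p_lo) in zip(points, points[1:]):
        if p_hi >= criterion > p_lo:
            frac = (criterion - p_lo) / (p_hi - p_lo)
            return snr_lo + frac * (snr_hi - snr_lo)
    return None  # stays at or above criterion over the whole range


# Made-up proportion-correct values for one token, purely for illustration.
snrs = [-18.0, -12.0, -6.0, 0.0]
p_c = [0.40, 0.70, 0.95, 1.00]
threshold = snr_90(snrs, p_c)
print(threshold)
```

Computing this per token, rather than per consonant, keeps the token-to-token spread in thresholds visible instead of averaging it away.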


    Jont Allen


    On 03/26/2016 10:44 AM, gvoysey wrote:

    I have not tried this, but I am willing to bet you can get FIX running on a modern PC with DOSBox, which is a cross-platform MS-DOS emulator. It’s most famous for letting you play very old video games in your web browser (http://playdosgamesonline.com/), but there’s no reason it shouldn’t work just as well for Real Work.

    -graham


    On Sat, Mar 26, 2016 at 5:06 AM, David Jackson Morris <dmorris@xxxxxxxxx> wrote:
    Dear Skyler,

    I have been on a similar search and found an R package by David van Leeuwen that is available on GitHub. Please let me know if you find any other alternatives.

    FIX is really awesome, but every time I want to use it I have to go over to Granny's and boot the Win 95 machine, and she makes me eat poppyseed cake, which makes my tummy sore. . .


    Cheers

    David Jackson Morris, PhD
     
    Københavns Universitet/University of Copenhagen
    INSS/Audiologopædi/Speech Pathology & Audiology
    Byggning 22, 5 sal
    Njalsgade 120
    2300 København S

    Office 22.5.14
    TLF 35328660 

    From: AUDITORY - Research in Auditory Perception [AUDITORY@xxxxxxxxxxxxxxx] on behalf of Skyler Jennings [Skyler.Jennings@xxxxxxxxxxxx]
    Sent: Friday, March 25, 2016 9:15 PM
    To: AUDITORY@xxxxxxxxxxxxxxx
    Subject: sinfa using matlab

    Dear list,

     

    I am writing in search of MATLAB-based software that performs sequential information analysis (SINFA; Wang and Bilger, 1973). I am impressed with the quality of the DOS-based software maintained by UCL called “FIX”; however, it would be more convenient to do the analysis in MATLAB if possible.

     

    I appreciate any help you can offer, whether it be guiding me to publicly available software or sharing software that you’ve developed.

     

    Sincerely,

    Skyler

    --
    Skyler G. Jennings, Ph.D., Au.D. CCC-A
    Assistant Professor
    Department of Communication Sciences and Disorders
    College of Health, University of Utah
    390 South 1530 East
    Suite 1201 BEHS
    Salt Lake City, UT 84112
    801-581-6877 (phone)
    801-581-7955 (fax)
    skyler.jennings@xxxxxxxxxxxx

     




    --
    Graham Voysey
    Boston University College of Engineering
    HRC Research Engineer
    Auditory Biophysics and Simulation Laboratory
    ERB 413