
Re: identification test procedure



To Jim Beauchamp:

I do that kind of experiment quite often, and I always use receiver operating
characteristic (ROC) analysis.  If you need Matlab scripts or C programs
that do such analyses (separating sensitivity and bias), I would be
happy to send them to you.
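
As a rough illustration of what separating sensitivity from bias involves,
here is a minimal Python sketch (it is not the Matlab or C code offered
above). It assumes the standard equal-variance Gaussian model, and it assumes
that a "hit" means answering "real" to a real tone and a "false alarm" means
answering "real" to a synthetic tone. The function name, the example counts,
and the 0.5 correction are illustrative choices only.

```python
from statistics import NormalDist


def sensitivity_and_bias(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) under the equal-variance Gaussian model."""
    # Add 0.5 to each cell (log-linear correction) so that neither rate is
    # exactly 0 or 1, where the inverse normal CDF is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)               # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))    # response bias
    return d_prime, criterion


# Hypothetical counts from one listener.
d, c = sensitivity_and_bias(hits=40, misses=10,
                            false_alarms=15, correct_rejections=35)
print(f"d' = {d:.2f}, c = {c:.2f}")
```

A criterion near zero indicates no preference for either response; a negative
value here means the listener leans toward answering "real".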

Pierre Divenyi wrote:

> The biggest problem with this method is that it does not distinguish
> between discrimination and response bias.

Yes, this is the crux of the matter.  However, I do not agree that
changing from the identification task to the discrimination task is
a good answer, unless you think you want to know about discrimination
rather than identification.  In musical applications of synthetic timbre,
you are rarely interested in whether the listener can discriminate between
a real and synthetic tone; rather you would like to know whether they will
identify what they are hearing as a bona fide piano or not.  Using the
confidence rating-scale ROC method is good, but maybe not necessary.
You can reduce the problem with correct guesses by including more
tones in a given trial (though this begins to feel like discrimination
rather than identification!).  Five tones would reduce the guessing rate
to 20%.
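
For the confidence rating-scale ROC method mentioned above, the usual
bookkeeping can be sketched as follows. This is only an illustration under
stated assumptions: the rating categories run from "surely real" to "surely
synthetic", the counts are made up, and the area is computed with a simple
trapezoid rule rather than a fitted Gaussian ROC.

```python
def rating_roc(real_counts, synth_counts):
    """Build ROC points from confidence-rating counts.

    Both lists are ordered from the most confident "real" rating to the
    most confident "synthetic" rating. Returns (false-alarm rate, hit rate)
    pairs, starting at (0, 0) and ending at (1, 1).
    """
    n_real, n_synth = sum(real_counts), sum(synth_counts)
    points = [(0.0, 0.0)]
    cum_real = cum_synth = 0
    # Each rating boundary is treated as a progressively laxer criterion.
    for r, s in zip(real_counts, synth_counts):
        cum_real += r
        cum_synth += s
        points.append((cum_synth / n_synth, cum_real / n_real))
    return points


def roc_area(points):
    """Trapezoidal area under the ROC (0.5 = chance, 1.0 = perfect)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical counts on a four-point rating scale.
real = [20, 15, 10, 5]     # ratings given to real tones
synth = [5, 10, 15, 20]    # ratings given to synthetic tones
print(roc_area(rating_roc(real, synth)))
```

The area does not depend on where the listener places the rating boundaries,
which is what makes it a bias-free summary of identification performance.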

The Green and Swets (1966) Signal Detection Theory and Psychophysics is a
good introduction, but more practical examples of SDT application can be
found in Swets (1964).  For example, the chapter by Clark (see ref below)
seems helpful for the piano tone study: Two tones followed by a judgment
of which is more realistic might work, but you could glean more info from
each trial if you played five and then asked them to give their second
choice as well.  If you do want to include more than two intervals in your
identification test, you might play five tones and ask the listener to
identify which of the five sounded most synthetic and which sounded most
realistic.
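
One way to score first and second choices in a five-interval trial is
sketched below. It assumes, purely for illustration, that each trial
contains exactly one synthetic tone and that the listener names a first and
a second candidate for it; the tuple layout and function name are
hypothetical. Under pure guessing the first choice is right 1 time in 5, and
a second choice made after a wrong first choice is right 1 time in 4.

```python
def score_trials(trials, n_intervals=5):
    """Score (target, first_choice, second_choice) interval indices."""
    first_correct = sum(1 for t, f, s in trials if f == t)
    # Second choices only matter on trials where the first choice missed.
    first_wrong = [(t, f, s) for t, f, s in trials if f != t]
    second_correct = sum(1 for t, f, s in first_wrong if s == t)
    return {
        "first_rate": first_correct / len(trials),
        "first_chance": 1 / n_intervals,
        "second_rate": (second_correct / len(first_wrong)
                        if first_wrong else None),
        "second_chance": 1 / (n_intervals - 1),
    }


# Hypothetical responses: (position of synthetic tone, 1st choice, 2nd choice)
trials = [(0, 0, 1), (1, 2, 1), (2, 3, 4), (3, 3, 0)]
print(score_trials(trials))
```

A second-choice rate clearly above its chance level suggests that listeners
retain useful information even on trials where their first choice was wrong.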

Clark, F. R. (1964) Confidence Ratings, Second-Choice Responses, and
Confusion Matrices in Intelligibility Tests.  In: J. A. Swets (Ed.), Signal
Detection and Recognition by Human Observers (pp. 620-648). New York:
John Wiley & Sons.

Pierre Divenyi also wrote:

> Actually, you are trying to evaluate the null hypothesis

I also agree that it is somewhat problematic when failing to reject the null
hypothesis is taken as proof that your data reduction scheme succeeded.
The problem is one of how good people are at discriminating subtle
differences like those between different types of pianos.  So if your
synthetic piano tone is close to one of five different pianos that you could
let them hear, would it satisfy you to know that it is as good
a candidate for the concept "piano" as any of the others?  Why should you
regard the failure to identify the difference between one real and one
synthetic set of tones as the primary indicator of success?

I think that the discussion of this issue is quite useful, especially
if you have practical applications in mind.

For your information:

The Swets (1964) and Green and Swets (1966) are available from Amazon.com
for relatively quick delivery:

 "Signal Detection and Recognition by Human Observers"
 John A. Swets; Hardcover; @ $54.95 each
     (Usually available in 4-6 weeks)

 "Signal Detection Theory and Psychophysics"
 David M. Green, John A. Swets; Hardcover; @ $54.95 each
     (Usually available in 4-6 weeks)

Unfortunately, another good one may not be so easy:

 "Detection Theory: A User's Guide"
 Neil A. Macmillan, C. Douglas Creelman; price currently unknown
     (Out of print; availability varies)

Regards,

--
William L. Martens, Ph.D.             EMAIL: wlm@u-aizu.ac.jp
Human Interface Lab                   URL: http://www.u-aizu.ac.jp/~wlm
University of Aizu                    TEL: [+81](242)37-2762
Aizu-Wakamatsu  965-8580, Japan       FAX: [+81](242)37-2549
