Re: AUDITORY Digest - 28 Sep 2006 to 29 Sep 2006 (#2006-215)
What was the original subject of this post?
Do you know of any way to filter email by subject when the usual subject line
contains only a digest number?
This is why it helps those of us who need to sort email at work,
to use the original subject line,
as long as it remains relevant,
instead of the digest number, which tells little.
Using digest as a subject kills my ability to sort and read by subject!
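One possible workaround, sketched in Python: pull the original subjects out of the digest body and sort on those. This is only a sketch. It assumes each message inside a digest starts with its own Date/From/Subject lines and that messages are separated by a line of hyphens. The names `digest_subjects` and `normalize` are made up for this illustration.

```python
import re

# Assumption: messages in a digest are separated by a line of 20+ hyphens,
# and each message carries its own "Subject:" line near the top.
SEPARATOR = re.compile(r"^-{20,}\s*$", re.MULTILINE)
SUBJECT = re.compile(r"^Subject:\s*(.+)$", re.MULTILINE)


def digest_subjects(digest_body: str) -> list[str]:
    """Return the first Subject: line found in each message of a digest body."""
    subjects = []
    for chunk in SEPARATOR.split(digest_body):
        match = SUBJECT.search(chunk)
        if match:
            subjects.append(match.group(1).strip())
    return subjects


def normalize(subject: str) -> str:
    """Strip leading 'Re:' prefixes so replies sort with their thread."""
    return re.sub(r"^(?:re:\s*)+", "", subject, flags=re.IGNORECASE).strip()


example = """Date: Fri, 29 Sep 2006 12:37:34 +0200
From: Toth Laszlo
Subject: reference needed (ASR)

Dear List, ...

------------------------------

Date: Fri, 29 Sep 2006 07:16:41 -0400
From: Christine Rankovic
Subject: Re: reference needed (ASR)

The statement of the reviewer ...
"""

# Unique thread subjects found in the digest, sorted alphabetically.
threads = sorted({normalize(s) for s in digest_subjects(example)})
print(threads)
```

A reply could then go out under the subject from `threads` rather than the digest number.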
--
If you must quote me, please put your comments first.
I have already listened to mine.
I read email with speech.
So it is not possible to scroll past the quotes without listening to them again,
to quickly get to the new information.
Thanks much again as always.
Date: Sat, 30 Sep 2006 06:35:18 -0500
From: Jont Allen <jontalle@xxxxxxxx>
Subject: Re: AUDITORY Digest - 28 Sep 2006 to 29 Sep 2006 (#2006-215)
Dear Laszlo,
Christine is of course correct. I would like to post 3 of many refs. For
more, look in my book "Articulation and Intelligibility" published by
Morgan & Claypool, 2005.
If you are finding differently for an ASR system, then that just shows
that the HMM "Gain" is turned up way too high. By that I mean, it's
ignoring the input to some extent, and looking for words that it can put
together that make some sense, and thus have a combined low entropy
(independent of other phones that it recognizes).
Let me give an example: What if the spoken utterance was:
"The make had a blue type"
and assume the recognizer got, at the phone level:
"the make had a blue type" (100% correct),
then the recognizer would report:
"the man had a blue tie"
Get it?
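To make the "gain" idea concrete, here is a toy sketch in Python. The log-scores and the `lm_weight` parameter are invented for illustration and do not come from any real recognizer. The point is only that a large enough weight on the language score lets the plausible sentence beat the one that actually matches the phones.

```python
# Hypothetical log-scores, invented for illustration only.
# "acoustic": how well the word string matches the (correctly recognized) phones.
# "language": how plausible the word string is under a language model.
candidates = {
    "the make had a blue type": {"acoustic": -1.0, "language": -12.0},
    "the man had a blue tie": {"acoustic": -4.0, "language": -3.0},
}


def decode(lm_weight: float) -> str:
    """Return the word string with the best combined log-score."""
    def score(item):
        _, s = item
        return s["acoustic"] + lm_weight * s["language"]

    return max(candidates.items(), key=score)[0]


# A small weight trusts the input; a large weight trusts the language model.
for weight in (0.1, 1.0):
    print(weight, "->", decode(weight))
```

With these made-up numbers the output flips from the spoken sentence to the "sensible" one once the weight exceeds 1/3.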
What if your relative calls you on the phone, and leaves a message,
that gets transcribed
"a large ant was fried"
and then you listen to the message, and it is really
"your great aunt just died"
That wouldn't be too good, would it?
Maybe not a very good example. Examples are better taken from a real system.
All the examples I have are my personal phone messages (text and wav
files), and they have things in them I can't make public.
But they can be pretty funny sets of errors, I'll tell you!
Tell us what really happened, please. I don't care how off topic it is.
It's not off topic, IMO.
A comment: It is my opinion that ASR people will not report the phone
scores because they don't want their funding sources to dry up. Typically
these phone scores are quite low (compared to human scores, that is),
being in the 50-75% range, with no noise. When the SNR gets "down" to
+10 dB, things are falling apart, and at 0 dB SNR, the scores (in one case
I know) are below chance. Yes, below chance!
Human phone error rates start at somewhere between 1.5 and 2% in
quiet. At +10 dB SNR (AI~0.5), the Miller-Nicely phone error rate was
about 10%. At 0 dB the AI is about 0.2 (Allen 2005, JASA, Fig. 6) which
gives a phone error rate of about 30%. The 50% point is about -6 dB SNR,
and an AI of about 0.06.
We have unpublished results (in review) where we repeated some of this
and found 2% error in quiet (consonants scored from CVs), 10% at -6 dB
SNR, and 50% error at -18 dB SNR. However, we found that there are 3
sets of consonants, with one group of 5 consonants having very high
error. These bias the average numbers way up. The rest of the sounds (11
of them) are much better than what I quote above. One group has an error
of 0.5% in quiet (5 errors per 1000 presentations).
I have run on too long.
Please tell us more!
Jont Allen
REFS:
@article{Bronkhorst93,
author={Bronkhorst, A. W. and Bosman, A. J. and Smoorenburg, G. F.},
title={A model for context effects in speech recognition},
journal={J. Acoust. Soc. Am.},
year={1993},
month=jan,
volume={93},
number={1},
pages={499-509},
note_={} }
@article{Boothroyd88,
author={Boothroyd, A. and Nittrouer, S.},
year={1988},
title={Mathematical treatment of context effects in phoneme
and word recognition},
journal={J. Acoust. Soc. Am.},
volume={84},
number={1},
pages={101-114} }
@inproceedings{Boothroyd93,
author={Boothroyd, A.},
title={Speech perception, sensorineural hearing loss, and hearing aids},
booktitle={Acoustical Factors affecting Hearing aid performance},
editor={Studebaker, G. A. and Hochberg, I.},
publisher={Allyn and Bacon},
address={Boston},
year={1993},
pages={277-299},
note_={} }
AUDITORY automatic digest system wrote:
Date: Fri, 29 Sep 2006 12:37:34 +0200
From: Toth Laszlo <tothl@xxxxxxxxxxxxxxx>
Subject: reference needed (ASR)
Dear List,
I know that speech recognition is a bit off-topic here, but I don't know
of a more proper place to ask this. A reviewer wrote about a paper of
mine that "the fact that better phone recognition does not necessarily
mean better word recognition is already known, and people have been
talking about it very frequently. This should be made clear and properly
referenced in the paper". Unfortunately, I'm personally sure that I've
never seen this written down, because it would have saved me a lot of
work -- but, unfortunately, I had to learn it from my own failures,
so I'm sure I won't be able to recall any references for this. I'm also
unable to figure out how to turn this thing into a reasonable Google
search term (actually, I've just managed to find a reference for just the
opposite - that "better phone recognition undoubtedly leads to better word
recognition"). So, if anyone can tell me any paper stating or showing
results that "better phone recognition does not necessarily mean better
word recognition", I would be very grateful.
Thanks,
Laszlo Toth
Hungarian Academy of Sciences *
Research Group on Artificial Intelligence * "Failure only begins
e-mail: tothl@xxxxxxxxxxxxxxx * when you stop trying"
http://www.inf.u-szeged.hu/~tothl *
------------------------------
Date: Fri, 29 Sep 2006 07:16:41 -0400
From: Christine Rankovic <rankovic@xxxxxxxxxxxxxxxx>
Subject: Re: reference needed (ASR)
The statement of the reviewer--that better phone recognition does not mean
better word recognition--is wrong. It is possible that the reviewer could
support this statement with data from poorly conducted speech recognition
tests like, for example, those conducted with an inadequate number of speech
items, or when mean scores comprise scores of too few listeners.
Christine Rankovic