
Re: [AUDITORY] Why is it that joint speech-enhancement with ASR is not a popular research topic?



And, for conventional HMM-based systems, you get the best performance when the training data is a good match to the material to be recognised. So if enhancing the speech worsens the match, performance will go down.


On 25/06/2018 08:15, Laszlo Toth wrote:
On Sun, 24 Jun 2018, Samer Hijazi wrote:

  It is easy to see that ASR would benefit from speech enhancement, and
speech enhancement would benefit from ASR. But there is very little
research and few publications in this direction compared with the hundreds
of publications on stand-alone ASR. Why is that?
The currently dominant direction in ASR is "end-to-end learning".
That is, to drop any hand-crafted feature extraction step from the
processing chain, and let the deep learning algorithm solve the whole
problem "as is". While many people doubt that this is the right direction
(at least, with the current limited-capability learning algorithms), there
is a strong pressure to prefer these end-to-end models over a two-step
model (I mean enhancement+recognition).

                Laszlo Toth
         Hungarian Academy of Sciences         *
   Research Group on Artificial Intelligence   *   "Failure only begins
      e-mail: tothl@xxxxxxxxxxxxxxx            *    when you stop trying"
      http://www.inf.u-szeged.hu/~tothl        *

--
*** note email is now p.green@xxxxxxxxxx ***
Professor Phil Green
SPandH
Dept of Computer Science
University of Sheffield