
Re: [AUDITORY] Fwd: [AUDITORY] MUSHRA test with open reference; and then without open reference



Hi Hannes:

Thank you for bringing ITU-R BS.2132-0 to my attention—I wasn't aware of it!

I also appreciate you sharing the observations from your experiments.

Best regards,
Arijit




From: AUDITORY - Research in Auditory Perception on behalf of Hannes Helmholz
Sent: Tuesday, August 20, 2024 3:43 PM
To: AUDITORY@xxxxxxxxxxxxxxx
Subject: Re: Fwd: [AUDITORY] MUSHRA test with open reference; and then without open reference

This is an interesting subject. There is a follow-up recommendation that
could apply to your context. There is no catchy abbreviation in the
recommendation, but its developers (Fraunhofer IDMT) endorse the term
MuSCR (multi-stimulus category rating, in the sense of different
categories, i.e., perceptual attributes, being evaluated for several
conditions on separate MUSHRA-like pages/trials).

ITU-R BS.2132-0, “Method for the Subjective Quality Assessment of
Audible Differences of Sound Systems using Multiple Stimuli without a
Given Reference.” International Telecommunication Union, pp. 1–18, 2019.

I'm unaware of studies directly comparing results from MUSHRA and
MuSCR. We exposed subjects to several conditions without a reference
(although hidden high and low "anchor" conditions were included) and
asked them to evaluate the "overall quality" of head-tracked binaural
reproductions from various microphone arrays. I consider this a
relatively complex and open task, since subjects must establish their
own internal reference on each trial/page of conditions.
We observed trends similar to those in the existing literature that
employed MUSHRA in comparable studies. However, there was considerably
greater variation and inconsistency than we have seen in previous
MUSHRA experiments. I would say this is not surprising, given the
nature of the task.

ITU-R BS.2132-0 recommends using expert listeners as a countermeasure to
large variances, which we could not reasonably implement. An alternative
would be to employ more subjects and exclude listeners based on
some quantifiable consistency measure (although this would have to be
evaluated and justified very carefully).
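One possible consistency measure of the kind mentioned above is the
RMS difference between a listener's ratings across two repetitions of
the same set of conditions. The sketch below is purely illustrative:
the function names, the threshold, and the data are my own assumptions,
not something from ITU-R BS.2132-0 or our experiment.

```python
# Hypothetical post-screening sketch: exclude listeners whose ratings
# are inconsistent across repeated presentations of the same conditions.
# The measure (repetition RMS difference) and the threshold are
# illustrative assumptions, not a recommendation from any standard.

def rms_inconsistency(rep1, rep2):
    """Root-mean-square difference between two repetitions of the same trials."""
    assert len(rep1) == len(rep2)
    return (sum((a - b) ** 2 for a, b in zip(rep1, rep2)) / len(rep1)) ** 0.5

def screen_listeners(ratings, threshold=15.0):
    """Keep listeners whose repetition RMS difference is below the threshold.

    `ratings` maps listener -> (repetition_1, repetition_2), each a list
    of scores on a 0-100 MUSHRA-like scale.
    """
    return {
        listener for listener, (r1, r2) in ratings.items()
        if rms_inconsistency(r1, r2) < threshold
    }

ratings = {
    "L01": ([80, 55, 20], [78, 60, 25]),  # consistent across repetitions
    "L02": ([90, 30, 70], [20, 85, 10]),  # erratic across repetitions
}
print(sorted(screen_listeners(ratings)))  # → ['L01']
```

Any such criterion would of course need to be fixed before data
collection and reported alongside the results, so that the exclusion
cannot be tuned to the outcome.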

Best wishes,
/Hannes
PhD Student, Chalmers University of Technology, Gothenburg, Sweden

On 2024-08-20 11:18, Raul Sanchez-Lopez wrote:
> Dear Arijit,
>
> That's an excellent point. So far, I haven't come across any studies
> directly investigating that specific comparison. However, the
> development of MUSHRA likely included such an evaluation. The method is
> defined in ITU-R Recommendation BS.1534-3. The introduction states:
>
> "This Recommendation describes a method for the subjective assessment of
> intermediate audio quality. This method reflects many aspects of
> Recommendation ITU-R BS.1116 and incorporates the same grading scale
> used for picture quality evaluation (i.e. Recommendation ITU-R BT.500)."
>
> While BS.1116 features a hidden reference, it lacks a hidden anchor. To
> assess intermediate audio quality, a new method was developed (more
> details here: https://secure.aes.org/forum/pubs/conferences/?elib=8056).
>
> Crucially, analyzing how the panel utilizes the scale requires both the
> Reference and the Anchor for meaningful difference evaluations. If
> there's no reference, it's simply not MUSHRA but a different type of
> assessment. In MUSHRA, participants compare multiple stimuli and place
> their ratings between the reference and anchor. Without a reference,
> they need to establish one first. This increases the risk of
> inconsistency across repetitions, potentially leading to noisy data.
>
> Would you be willing to share some more details about your experiment?
> I've dealt with similar questions in the past and may be able to offer
> assistance or point you towards someone more experienced, like Force
> Technology, for further guidance.
>
> Best wishes,
>
> ---
> Raul Sanchez-Lopez
> Hearing scientist | Audio Engineer | Technical Audiologist
> Institute of Globally Distributed Open Research and Education
>
> On 2024-08-20 10:52, Raul Sanchez-Lopez wrote:
>> ---------- Forwarded message ---------
>> From: #ARIJIT BISWAS# <000003292f44871c-dmarc-request@xxxxxxxxxxxxxxx>
>> Date: Tue, 20 Aug 2024 at 09:14
>> Subject: [AUDITORY] MUSHRA test with open reference; and then without
>> open reference
>> To: <AUDITORY@xxxxxxxxxxxxxxx>
>>
>>  Dear all:
>>
>>  Is there any research on how subjective ratings in a MUSHRA test
>> might be affected if the same systems (i.e., hidden reference,
>> anchors, codecs under test) are re-evaluated without the ability to
>> compare them against an open reference?
>>
>>  If no such studies/papers exist, can we make any educated guesses
>> about the potential outcomes?
>>  I imagine that, in the absence of a reference, the task for the
>> subjects would become more difficult.
>>  Any other guesses?
>>
>>  Thank you.
>>
>>  Best regards,
>>  Arijit
>>
>>  **Disclaimer** The sender of this email does not represent Nanyang
>> Technological University and this email does not express the views or
>> opinions of the University.
