
Re: [AUDITORY] Software for internet-based auditory testing



Hi Dick,
My lab also has a paper with some recommendations about collecting spoken word recognition data online using mTurk:

Slote, J., & Strand, J. (2016). Conducting spoken word recognition research online: Validation and a new timing method. Behavior Research Methods, 48(2), 553–566. https://doi.org/10.3758/s13428-015-0599-7

https://www.dropbox.com/s/0lniloy1cwit2rh/Slote%2C%20Strand.%202016.%20Conducting%20spoken%20word%20recognition%20research%20online%20Validation%20and%20a%20new%20timing%20method.pdf?dl=0


Hope this is helpful.
Best,
Julia

On Wed, Oct 4, 2017 at 2:52 AM Brecht De Man <b.deman@xxxxxxxxxx> wrote:
Our ‘Web Audio Evaluation Tool’ aims to address several of the points raised here; e.g.
- “inexpensive and simple to program”: free, open source, and with an optional GUI test creator
- "ideally with response times”: all timing information (clicks, plays, …) is logged, and can for instance be visualised as a timeline (https://github.com/BrechtDeMan/WebAudioEvaluationTool/wiki/Features#metrics)
- “good functionality for auditory playback”: based on the Web Audio API (HTML5), so no Flash, Java or other third-party software is needed; very fast response and seamless switching, and very wide compatibility, including mobile devices (a minimal sketch of this kind of playback and logging follows after this list)
- “can be used for all kinds of experiments”: implements a wide variety of standards as presets, based on a few elementary interfaces: vertical and horizontal sliders, Likert, AB(CD…), AB(CD…)X, ranking, and waveform annotation (https://github.com/BrechtDeMan/WebAudioEvaluationTool/wiki/Interfaces). Not so much ‘method of adjustment’ at this time.
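
For readers unfamiliar with the underlying browser API, here is a minimal, hypothetical TypeScript sketch of the kind of thing such a tool does under the hood: decoding two stimuli with the standard Web Audio API, switching between them seamlessly, and logging interaction times. This is not the Web Audio Evaluation Tool's actual code or API; names like loadStimulus and eventLog are illustrative only.

```typescript
// Minimal sketch (not WAET code): seamless A/B switching plus click logging
// using the standard Web Audio API. Assumes a browser context (DOM lib).

const ctx = new AudioContext();
const eventLog: { time: number; event: string }[] = [];   // hypothetical log format

function log(event: string): void {
  // performance.now() gives millisecond timestamps relative to page load
  eventLog.push({ time: performance.now(), event });
}

async function loadStimulus(url: string): Promise<AudioBuffer> {
  const response = await fetch(url);
  const data = await response.arrayBuffer();
  return ctx.decodeAudioData(data);                        // decode to PCM in memory
}

function playWithSwitching(a: AudioBuffer, b: AudioBuffer) {
  // Both stimuli play continuously; switching only changes which gain is
  // audible, so there is no gap or reload when the listener toggles A/B.
  const gainA = ctx.createGain();
  const gainB = ctx.createGain();
  gainB.gain.value = 0;

  for (const [buffer, gain] of [[a, gainA], [b, gainB]] as const) {
    const source = ctx.createBufferSource();
    source.buffer = buffer;
    source.loop = true;
    source.connect(gain).connect(ctx.destination);
    source.start();
  }

  // Returned function can be wired to interface buttons.
  return (which: "A" | "B") => {
    const now = ctx.currentTime;
    gainA.gain.setTargetAtTime(which === "A" ? 1 : 0, now, 0.005); // ~5 ms ramp
    gainB.gain.setTargetAtTime(which === "B" ? 1 : 0, now, 0.005);
    log(`switched to ${which}`);
  };
}
```

The real tool of course wraps this kind of logic in configurable interfaces and result collection; the sketch only illustrates why no plug-ins are needed and why switching can be gapless.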

We welcome any contributions and feature requests, as we aim to make a maximally comprehensive yet elegant and easy-to-use listening test tool through community effort. 

I am not aware of any published use of it on Mechanical Turk - though it’s something I want to try myself soon - but others have integrated it in systems which track progress of several experiments, for instance. We’ve included some functionality to facilitate this, like the ‘returnURL’ attribute which specifies the page to direct to upon test completion. 

All info on
https://github.com/BrechtDeMan/WebAudioEvaluationTool
and 
Nicholas Jillings, Brecht De Man, David Moffat and Joshua D. Reiss, "Web Audio Evaluation Tool: A Browser-Based Listening Test Environment," 12th Sound and Music Computing Conference, July 2015. (http://smcnetwork.org/system/files/SMC2015_submission_88.pdf)

Please send any questions, suggestions or comments you may have to b.deman@xxxxxxxxxx.

Best wishes,

Brecht

________________________________________________

Brecht De Man
Postdoctoral researcher
Centre for Digital Music
Queen Mary University of London

School of Electronic Engineering and Computer Science
Mile End Road
London E1 4NS
United Kingdom

b.deman@xxxxxxxxxx 
Google Scholar | ResearchGate | Academia 




On 4 Oct 2017, at 06:38, Richard F. Lyon <dicklyon@xxxxxxx> wrote:

Many thanks, Sam and Bryan and Kevin and all those who replied privately.

I can see many possible ways forward; just need to get pecking at some...

Dick

On Tue, Oct 3, 2017 at 7:06 PM, kevin woods <kevinwoods@xxxxxxxxxxxxxxx> wrote:
Further to Sam's email, here is a link to a code package we put together to implement our headphone screening task (intended to improve the quality of crowdsourced data): http://mcdermottlab.mit.edu/downloads.html

We have generally found that the quality of data obtained online with our screening procedure is comparable to that of data obtained in the lab on the same experiments. For obvious reasons we have only run experiments where precise stimulus control seems unlikely to be critical. 
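
As I understand the published screening task Kevin links above (Woods et al., 2017, cited later in this thread), the core trick is that a pure tone presented in antiphase across the left and right channels sounds essentially unchanged over headphones but partially cancels acoustically when both channels play over loudspeakers, so loudspeaker listeners misjudge its loudness. The TypeScript sketch below only illustrates that idea; the frequency, level step, duration, and trial structure are my assumptions, not the published parameters (the MIT download has the actual task).

```typescript
// Sketch of the headphone-check principle (assumed design, not the released code):
// one tone is polarity-inverted in one channel. Over headphones it sounds normal;
// over loudspeakers the two channels partially cancel and it sounds quieter.

function makeTone(
  ctx: BaseAudioContext,
  freqHz: number,
  durationS: number,
  gain: number,
  antiphase: boolean
): AudioBuffer {
  const n = Math.round(durationS * ctx.sampleRate);
  const buffer = ctx.createBuffer(2, n, ctx.sampleRate);
  const left = buffer.getChannelData(0);
  const right = buffer.getChannelData(1);
  for (let i = 0; i < n; i++) {
    const s = gain * Math.sin((2 * Math.PI * freqHz * i) / ctx.sampleRate);
    left[i] = s;
    right[i] = antiphase ? -s : s;       // flip polarity in one channel only
  }
  return buffer;
}

// Three intervals for a "which tone is quietest?" judgment (illustrative values;
// present them in random order). Over headphones the attenuated diotic tone is
// quietest; over loudspeakers the antiphase tone tends to cancel and be chosen
// instead, which flags the listener as not wearing headphones.
const audioCtx = new AudioContext();     // in practice, create after a user gesture
const intervals = [
  makeTone(audioCtx, 200, 1.0, 1.0, false),   // diotic, full level
  makeTone(audioCtx, 200, 1.0, 0.5, false),   // diotic, about -6 dB
  makeTone(audioCtx, 200, 1.0, 1.0, true),    // antiphase, full level
];
```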

Please feel free to contact us at kwoods@xxxxxxx with questions.

Sincerely,

Kevin Woods (on behalf of the McDermott Lab, Department of Brain and Cognitive Sciences, MIT)


On Tue, Oct 3, 2017 at 12:59 AM, Samuel Mehr <sam@xxxxxxxxxxxxxxx> wrote:
Dear Dick,

Lots of folks do successful audio-based experiments on Turk and I generally find it to be a good platform for the sort of work you're describing (which is not really what I do, but experimentally is similar enough for the purposes of your question). I've done a few simple listening experiments of the form "listen to this thing, answer some questions about it", and the results directly replicate parallel in-person experiments in my lab, even when Turkers geolocate to lots of far-flung countries. I require subjects to wear headphones and validate that requirement with this great task from Josh McDermott's lab:

Woods, K. J. P., Siegel, M. H., Traer, J., & McDermott, J. H. (2017). Headphone screening to facilitate web-based auditory experiments. Attention, Perception, & Psychophysics, 1–9. https://doi.org/10.3758/s13414-017-1361-2

In a bunch of piloting, passing the headphone screener correlated positively with other checks on Turker compliance, things like "What color is the sky? Please answer incorrectly, on purpose" and "Tell us honestly how carefully you completed this HIT". Basically, if you have a few metrics in an experiment that capture variance on some dimension related to participant quality, you should be able to tell pretty easily which Turkers are actually doing good work and which aren't. Depending on how your ethics approval is set up, you can either pay everyone and filter out bad subjects, or require them to pass some level of quality control to receive payment.
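
A concrete, purely hypothetical version of the "filter out bad subjects" step Sam describes might combine the headphone-check result with catch-trial accuracy into a simple pass/fail rule. The field names and cut-offs below are illustrative, not anything Sam's lab actually uses.

```typescript
// Hypothetical post-hoc exclusion rule combining several quality signals.
// Field names and thresholds are illustrative assumptions.

interface TurkerRecord {
  workerId: string;
  passedHeadphoneCheck: boolean;   // e.g. met criterion on the screener
  catchTrialsCorrect: number;      // "answer incorrectly on purpose"-style items
  catchTrialsTotal: number;
  selfReportedCareful: boolean;    // "how carefully did you complete this HIT?"
}

function keepForAnalysis(r: TurkerRecord): boolean {
  const catchAccuracy = r.catchTrialsCorrect / r.catchTrialsTotal;
  return r.passedHeadphoneCheck && catchAccuracy >= 0.8 && r.selfReportedCareful;
}

// Pay everyone but analyse only records that pass the rule
// (or make payment contingent on it, depending on the ethics protocol).
const usable = (records: TurkerRecord[]) => records.filter(keepForAnalysis);
```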

best
Sam


-- 
Samuel Mehr
Department of Psychology
Harvard University



On Tue, Oct 3, 2017 at 8:57 AM, Richard F. Lyon <dicklyon@xxxxxxx> wrote:
Five years on, are there any updates on experience using Mechanical Turk and such for sound perception experiments?

I've never conducted psychoacoustic experiments myself (other than informal ones on myself), but now I think I have some modeling ideas that need to be tuned and tested with corresponding experimental data.  Is MTurk the way to go?  If it is, are IRB approvals still needed? I don't even know if that applies to me; probably my company has corresponding approval requirements.

I'm interested in things like SNR thresholds for binaural detection and localization of different types of signals and noises -- 2AFC tests whose relative results across conditions would hopefully not be strongly dependent on level or headphone quality.  Are there good MTurk task structures that motivate people to do a good job on these, e.g. by making their space quieter, paying attention, getting more pay as the task gets harder, or just getting to do more similar tasks, etc.?  Can the pay depend on performance?  Or just cut them off when the SNR has been lowered to threshold, so that people with lower thresholds stay on and get paid longer?
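
For concreteness, the "lower the SNR until threshold" idea is usually implemented as an adaptive staircase; below is a generic sketch of a standard 2-down/1-up track, which converges on roughly the 70.7%-correct SNR (Levitt, 1971). It is not tied to any of the tools discussed in this thread, and the starting level, step size, and reversal count are illustrative defaults only.

```typescript
// Generic 2-down/1-up adaptive staircase for a 2AFC detection task.
// Converges near the 70.7%-correct point (Levitt, 1971). Parameter values
// here are illustrative defaults, not recommendations from the thread.

class Staircase {
  private snrDb: number;
  private correctInARow = 0;
  private lastDirection: "up" | "down" | null = null;
  private reversals: number[] = [];

  constructor(startSnrDb = 10, private stepDb = 2, private maxReversals = 12) {
    this.snrDb = startSnrDb;
  }

  get currentSnrDb(): number { return this.snrDb; }

  get done(): boolean { return this.reversals.length >= this.maxReversals; }

  /** Report one trial's outcome and update the SNR for the next trial. */
  update(correct: boolean): void {
    let direction: "up" | "down" | null = null;
    if (correct) {
      this.correctInARow += 1;
      if (this.correctInARow >= 2) {        // two correct in a row: make it harder
        this.correctInARow = 0;
        direction = "down";
      }
    } else {
      this.correctInARow = 0;               // one wrong: make it easier
      direction = "up";
    }
    if (direction !== null) {
      if (this.lastDirection !== null && direction !== this.lastDirection) {
        this.reversals.push(this.snrDb);    // record SNR at each reversal
      }
      this.lastDirection = direction;
      this.snrDb += direction === "down" ? -this.stepDb : this.stepDb;
    }
  }

  /** Threshold estimate: mean SNR over the last few reversals. */
  threshold(lastN = 6): number {
    const tail = this.reversals.slice(-lastN);
    if (tail.length === 0) return this.snrDb;   // not enough data yet
    return tail.reduce((a, b) => a + b, 0) / tail.length;
  }
}
```

Ending the HIT when the track is `done` (and basing pay on trials completed) is one way to realise the "lower thresholds stay on and get paid longer" scheme, subject to whatever the ethics approval allows.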

If anyone in academia has a good setup for human experiments and an interest in collaborating on binaural model improvements, I'd love to discuss that, too, either privately or on the list.

Dick



--
Julia Strand, PhD
Assistant Professor of Psychology
Carleton College               
One North College Street
Northfield, Minnesota 55057

507-222-5637 (office)