
[AUDITORY] Re: [AUDITORY] Tool for automatic syllable segmentation



Dear Rémy,


I would recommend the following workflow:


  1. Use the G2P→MAUS→PHO2SYL pipeline in WebMAUS and select TextGrid as the output format. If you do not use ASR services, you will need to upload an orthographic transcript of the syllables along with each sound file. This is especially recommended if the syllables are nonsensical. The pipeline produces several tiers, one of which is MAS, containing the syllabified chain.
  2. The output of WebMAUS (especially if no ASR service is used) can be somewhat imprecise, because it is based on HMM probabilities. It is therefore wise to manually review and adjust the syllable segmentation (drag the boundaries) in the Praat TextGrid.
  3. You can then use a Praat script to automatically count the intervals on the MAS tier. Once the boundaries have been reviewed, this will give you the correct number of syllables. There are several freely available scripts to help you automate the process. You can also use Parselmouth, a Python interface to Praat, which is a bit more flexible in combination with other Python libraries.
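
As a rough illustration of step 3, here is a minimal Python sketch that counts the labelled intervals on the MAS tier and returns the syllable onsets and an overall rate. It uses only the standard library rather than Praat or Parselmouth, and it rests on several assumptions that are mine, not WebMAUS documentation: the TextGrid is saved in Praat's long text format, the syllable tier is named "MAS", labels contain no double quotes, and pauses are either empty or marked with a label starting with "<" (such as "<p:>"). The function names are hypothetical.

```python
import re

# Matches one interval entry of a long-format Praat TextGrid.
# Assumption: labels contain no escaped double quotes.
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = (\S+)\s*'
    r'xmax = (\S+)\s*'
    r'text = "([^"]*)"'
)


def syllable_intervals(textgrid_text, tier_name="MAS"):
    """Return (start, end, label) for each labelled interval on one tier."""
    # Split the file into one chunk per tier; the file header is dropped.
    chunks = re.split(r'\n\s*item \[\d+\]:', textgrid_text)[1:]
    for chunk in chunks:
        name = re.search(r'name = "([^"]*)"', chunk)
        if name and name.group(1) == tier_name:
            result = []
            for xmin, xmax, label in INTERVAL_RE.findall(chunk):
                label = label.strip()
                # Assumption: empty labels and "<...>" labels are pauses.
                if label and not label.startswith("<"):
                    result.append((float(xmin), float(xmax), label))
            return result
    raise ValueError(f"no tier named {tier_name!r} found")


def syllable_stats(intervals):
    """Return (count, onsets, rate in syllables per second)."""
    if not intervals:
        return 0, [], float("nan")
    onsets = [start for start, _, _ in intervals]
    # Duration from the first syllable onset to the last syllable offset.
    duration = intervals[-1][1] - intervals[0][0]
    rate = len(intervals) / duration if duration > 0 else float("nan")
    return len(intervals), onsets, rate
```

To use it, read the reviewed TextGrid into a string (for example with `open(path, encoding="utf-8").read()`, adjusting the encoding if Praat saved the file as UTF-16) and pass the result of `syllable_intervals` to `syllable_stats`. The rate here is computed over the span from the first onset to the last offset, so leading and trailing silence is excluded but any pauses inside the sequence are not.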

Hope this helps.

Best regards,
Cleo

________________________________________________

Cleopatra Christina Moshona, M.A., M.A.

Research Associate

 

Technische Universität Berlin

Faculty V – Mechanical Engineering and Transport Systems

Institute of Fluid Dynamics and Technical Acoustics

Engineering Acoustics - Psychoacoustics Group

Room: HFT-TA 438
Telephone: +49 (0)30 314-70437

https://www.tu.berlin/akustik


From: AUDITORY - Research in Auditory Perception <AUDITORY@xxxxxxxxxxxxxxx> on behalf of Rémy MASSON <remy.masson@xxxxxxxxxx>
Sent: Wednesday, 18 September 2024 17:52:47
To: AUDITORY@xxxxxxxxxxxxxxx
Subject: [AUDITORY] Tool for automatic syllable segmentation
 

Hello AUDITORY list,

 

We are attempting to do automatic syllable segmentation on a collection of sound files that we use in an experiment. Our stimuli are rapid sequences of syllables (all beginning with a consonant and ending with a vowel) with no underlying semantic meaning and no pauses. We would like to automatically extract the syllable/speech rate and obtain the timestamp of each syllable onset.

 

We are a bit lost on which tool to use. We tried Praat with the Syllable Nuclei v3 script, the software VoiceLab and the website WebMAUS. Unfortunately, none of them produced an estimate of the total number of syllables that consistently matched our manual counts, even after adjusting the parameters.

 

Do you have any advice on how to go further? Do you have any experience in syllable onset extraction?

 

Thank you for your understanding,

 

Rémy MASSON

Research Engineer

Laboratory "Neural coding and neuroengineering of human speech functions" (NeuroSpeech)

Institut de l’Audition – Institut Pasteur (Paris)

Home | Institut de l'audition