Subject: Re: music pitch tracking
From: Arturo Camacho <acamacho@xxxxxxxx>
Date: Sun, 29 Mar 2009 21:22:26 -0600
List-Archive: <http://lists.mcgill.ca/scripts/wa.exe?LIST=AUDITORY>

Last year we published a new monophonic pitch estimator named SWIPE' (see reference below), which was shown to be particularly good with musical instruments: in our tests it reduced the gross error rate by a factor of approximately 2 compared to the best competitor. SWIPE' is available as Matlab (and Octave) code at the following URL:
http://www.cise.ufl.edu/~acamacho/publications/swipep.m

The output is a pitch time series in hertz, which can easily be converted to MIDI note numbers and graphed using any plotting software. It works off-line, not in real time, but with a little work it can be made to run in real time (with some delay, of course).

Briefly, SWIPE' is similar to autocorrelation when analyzed in the frequency domain (i.e., it performs inner products between the power spectrum and cosine kernels), but:

(1) the spectrum is computed using a (well-defined) window size that is proportional to the pitch period (so that the width of the main spectral lobe matches the width of the positive part of the cosine lobe);
(2) the square root of the amplitude spectrum is used instead of its square (hard to explain why in one sentence);
(3) the cosine is multiplied by a decaying envelope that gives each harmonic a weight inversely proportional to its order (i.e., lower-order harmonics are given more weight than higher-order harmonics);
(4) the non-prime lobes of the cosine are removed (to reduce the score of subharmonics of the pitch);
(5) the cosine is divided by its norm to obtain a norm-1 kernel (the benefit is hard to explain in one sentence).

Arturo

Reference:
Camacho, A., and Harris, J. G. (2008). “A sawtooth waveform inspired pitch estimator for speech and music,” Journal of the Acoustical Society of America, vol. 124, pp. 1638-1652.

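For readers who want to try the hertz-to-MIDI conversion mentioned above, here is a minimal sketch in Python. It is not part of swipep.m. The function names `hz_to_midi` and `track_to_midi` are made up for illustration, the A4 = 440 Hz reference is an assumption, and the sketch assumes that frames with no pitch estimate are marked as NaN or as a non-positive value.

```python
import math

A4_HZ = 440.0   # assumed reference tuning for A4
A4_MIDI = 69    # MIDI note number conventionally assigned to A4


def hz_to_midi(f0_hz):
    """Convert a pitch in hertz to a fractional MIDI note number.

    Frames without a usable pitch (NaN or non-positive values, an
    assumption about how the tracker marks them) are returned as NaN.
    """
    if math.isnan(f0_hz) or f0_hz <= 0:
        return float("nan")
    # 12 semitones per octave, measured relative to A4
    return A4_MIDI + 12.0 * math.log2(f0_hz / A4_HZ)


def track_to_midi(pitch_hz):
    """Apply hz_to_midi to every frame of a pitch time series."""
    return [hz_to_midi(f) for f in pitch_hz]


if __name__ == "__main__":
    # Hypothetical pitch track in hertz, with one frame lacking an estimate
    example_track = [220.0, 440.0, float("nan"), 880.0]
    print(track_to_midi(example_track))
```

The result is left fractional rather than rounded, so deviations of a few cents from equal temperament (one cent is 0.01 of a MIDI note number) remain visible when the series is plotted against time.
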
On Sun, Mar 29, 2009 at 11:08 AM, James W. Beauchamp <jwbeauch@xxxxxxxx> wrote:
> Dear List,
>
> While we're on the subject of musical pitch tuning and tracking, since
> this is a subject I've been interested in for a long time, I would like
> to put in my 2 cents (maybe more).
>
> Musical pitch tracking is an old subject. Seashore published some in
> 1932 and Obata and Kobayashi in 1937 and 1938. Tove et al. described
> a system consisting of transistor electronics and a fast chart
> recorder in 1966. The idea is to plot melodies graphically to see
> what kinds of pitch and rhythmic changes occur over a substantial
> duration. As Tove et al. said,
>
> "The need for objective notation of time variations of frequency and
> amplitude in theoretical studies of musical phenomena is obvious and
> should have many applications in investigations of style, rythm (sic),
> and variations and deviations of key, in the study of conventional,
> modern, and folk music, as well as in basic studies of musical
> perception and creation."
>
> InterOcean Systems developed and marketed a real-time music analyzer
> called the Melograph in the 1970s, based on research by Charles
> Seeger at UCLA. It was contained in a compact rack-mounted package
> with an internal chart recorder and sold for about $8000.
> Unfortunately I couldn't afford to buy one.
>
> Since the advent of the computer, a plethora of pitch detectors/trackers
> have been developed, and there's been too many to mention all of them.
> In 1989 at our lab at Univ. of Illinois at Urbana-Champaign, Rob Maher
> developed a nice one for music based on the short-time Fourier transform
> and a method called the "two-way mismatch (TWM) method", as part of his
> PhD thesis on musical sound source separation. This non-real-time program
> is contained in the SNDAN suite of programs that is available for free
> download and compilation on Unix systems (e.g., Linux or Mac OS X). A
> Windows/DOS version is also available.
> (See http://ems.music.uiuc.edu/beaucham/software/sndan/ ) This method also
> generates a chart-recorder-like image of musical pitch vs. time.
>
> Recently Ugar Guney developed a real-time version of the TWM method,
> which again is a free download and is platform independent if you
> have Java installed, called "freqazoid". This is definitely in beta form,
> but, again, it's free.
>
> I've also used the autocorrelation pitch detector in Praat and, after
> converting the frequency output to log form and graphing, have gotten
> similar results. This is also a free download.
>
> What is needed is a system that is accurate to a few cents but can also
> cover a wide range of pitch, at least 3 octaves, but 7 would be great.
> It should be able to handle a wide variety of waveforms, drop outs, a
> fair amount of noise and inharmonicity, and it should be able to handle
> very fast changes in pitch, i.e., it should be able to accurately
> transcribe virtuosic passages (64th notes, etc.), as well as glides,
> vibrato, and portamento.
>
> Besides displaying the data on a log(f0) vs. time chart, the system
> should also be able to generate the data to a file for subsequent
> post-processing research. Conversion to MIDI and musical notation are
> nice features, but these are already available in programs once the
> log(f) data is provided.
>
> I see that G-tune, which has now merged with Peterson Electro-Musical
> Products, has morphed into StroboSoft 2.0. This contains a vast number
> of tuning features, but the feature that I'm most interested in is
> its "pitch graph", which is only provided in the deluxe version ($100).
> Unfortunately, they don't spec the accuracy, range, and speed of this
> graphing tool.
>
> I would also like to mention that our group is also interested in
> polyphonic pitch detection, i.e., simultaneous pitch transcription of
> more than one voice at a time. Some progress in this field has been
> made during the last few years.
> Recently Anssi Klapuri and Tuomas Virtanen have made a good summary of
> these efforts (see below).
>
> Jim Beauchamp
> Univ. of Illinois at Urbana-Champaign
>
> References
>
> Seashore, C. E., "The Vibrato", in Studies in the Psychology of Music,
> Vol. 1, U. of Iowa (1932).
>
> Obata, J. and R. Kobayashi, "A Direct Reading Pitch Recorder and its
> Applications to Music and Speech", J.A.S.A., Vol. 9 (1937).
>
> Obata, J. and R. Kobayashi, "An Apparatus for Direct Recording the
> Pitch and Intensity of Sound", J.A.S.A., Vol. 10 (1938).
>
> Seeger, C., "Toward a universal music sound-writing for musicology",
> Int. Folk Music Council, Vol. 9 (1957).
>
> Tove, P. A., B. Norman, L. Isaksson, and J. Czekajewski,
> "Direct-Recording Frequency and Amplitude Meter for Analysis of Musical
> and Other Sonic Waveforms", J.A.S.A., Vol. 39 (1966).
>
> Boersma, P., "Accurate short-term analysis of the fundamental frequency
> and the harmonics-to-noise ratio of a sampled sound", Proc. Inst.
> Phonetic Sciences, Vol. 17, Amsterdam (1993).
>
> Maher, R. C. and J. W. Beauchamp, "Fundamental frequency estimation of
> musical signals using a Two-Way Mismatch procedure", J.A.S.A.,
> Vol. 95 (1994).
>
> Klapuri, A. and T. Virtanen, "Automatic Music Transcription", Handbook
> of Signal Processing in Acoustics, Vol. 1, Springer (2008).
>

--
__________________________________________________
Arturo Camacho L., Ph.D.
Professor
Escuela de Ingeniería Eléctrica
Universidad de Costa Rica
__________________________________________________