Subject: Re: The natural spectrogram
From: Julius Smith <jos(at)CCRMA.STANFORD.EDU>
Date: Tue, 27 Jan 2004 10:52:05 -0800

At 10:05 AM 1/27/2004, Eckard Blumschein wrote:
>There are many variants of designing the windows and also many designs of
>wavelets but there is only one physiological function of the inner ear and
>only one corresponding natural spectrogram.

Yes, but I believe it is possible to configure and suitably process a
short-time Fourier transform (STFT) to approach this ideal. What's wrong
with "one corresponding natural STFT"? Something along these lines is done
in the best model of time-varying loudness perception I am aware of:

@ARTICLE{GlasbergAndMoore02,
  AUTHOR = "Brian R. Glasberg and Brian C. J. Moore",
  TITLE = "A Model of Loudness Applicable to Time-Varying Sounds",
  JOURNAL = "Journal of the Audio Engineering Society",
  VOLUME = 50,
  NUMBER = 5,
  MONTH = "May",
  PAGES = {331--342},
  YEAR = 2002
}

At 10:05 AM 1/27/2004, Eckard Blumschein wrote:
>At 09:13 27.01.2004 -0800, you wrote:
> >Yes, a "sliding cosine transform" can be used in place of the usual
> >"hopping short-time Fourier transform", and in that case, phase information
> >is contained in the time variation of the sliding transform
> >coefficients. I didn't realize you were doing something like that,
>
>I claim, you are doing the same, at least twice unconsciously in your inner
>ears. I would however argue that neither magnitude-phase representation nor
>time-frequency representation omit information while the usual spectrogram
>is a faulty design that strips off phase. In other words, phase information
>is merely a fictitious component that belongs to an inappropriate model of
>the inner ear. I do not see any justification for attributing it to the
>actual real-valued analysis.
>
> >so my
> >argument was based on different assumptions.
> >Even the short-time Fourier
> >transform hopping by half its window length each frame can be stripped of
> >all phase information and still be used as the basis of a convincing sound
> >synthesis, at least for smoothly changing sounds.
>
>Yes, this is what the usual spectrogram does. Short-time means acceptable
>with respect to temporal resolution too while too short as to resolve low
>frequency. Do you not believe that the natural spectrogram overcomes such
>discrepancy, too? It is distinguished by: "no arbitrary window and no
>trade-off".
>
>There are many variants of designing the windows and also many designs of
>wavelets but there is only one physiological function of the inner ear and
>only one corresponding natural spectrogram.
>
>Eckard
>
>
> >At 03:08 AM 1/26/2004, Eckard Blumschein wrote:
> >>At 12:06 23.01.2004 -0800, Julius Smith wrote:
> >> >At 11:16 AM 1/23/2004, Eckard Blumschein wrote:
> >> >>First of all, forget the wrong idea that the cochlea performs a complex
> >> >>Fourier transform.
> >> >
> >> >This implies phase is discarded.
> >>
> >>No! Do not consider me a moron. You and largely the rest of the world grew
> >>up with the erroneous belief that there is no equivalent alternative to
> >>complex spectral analysis. Complex calculus is indeed tremendously useful.
> >>No matter whether one prefers magnitude and phase or real and imaginary
> >>part, one always has to consider both constituents except for the case one
> >>of them equals zero. Given, a function of time like 2A cos(omega t) does
> >>not have any imaginary part at all. Entrance into complex plane is paid by
> >>mandatory arbitrary omission of A exp(- i omega t) or A exp(i omega t).
> >>Neither the magnitude A nor the phase omega t can be discarded.
> >>At that point, you will object: Aren't anti-symmetrical functions, i.e.
> >>functions of time with odd symmetry like sine, also needed in frequency
> >>analysis?
> >>
> >>No again, on condition, causality has been taken into account.
> >>In brief:
> >>Future signals cannot be analyzed yet. Even sin(omega t) can be continued
> >>as its mirror into fictive future time like an even function. Of course,
> >>this wouldn't hold for its derivative or antiderivative. However, our topic
> >>is just frequency analysis within cochlea.
> >>
> >> >However, phase information does exist as
> >> >the phase of the basilar membrane vibration,...
> >>
> >>I don't take amiss this fallacy. It has to do with the missing natural
> >>justification for fixing any reference point on the time scale. Our ears
> >>are not synchronized with anything. When Descartes introduced Cartesian
> >>coordinates, he imagined a spatially infinite world. Time is
> >>correspondingly believed to also expand from minus infinity to plus
> >>infinity. However, elapsed time definitely ends at the 'NOW' being the only
> >>clever choice for a natural time scale. Take subsequent snapshots of a
> >>sinusoid at NOW each. Try the same with any cochlear pattern. By chance,
> >>you might observe sin or cos. In other words, so called linear phase is
> >>arbitrary as is time. I don't deny that delay or according phase difference
> >>is reasonable with respect to a second signal or a different reference.
> >>Without such reference, a sinusoidal function cannot be identified as
> >>sin, cos or something complex in between, and the reference is lacking in
> >>nature. The only natural reference is the NOW, which is steadily on the
> >>move. This causes the trouble of permanently lagging window position in
> >>case of arbitrarily centered complex Fourier transform.
> >>
> >>
> >> >Since basilar membrane filtering is generally
> >> >modeled as linear, any corresponding short-time-Fourier-transform would
> >> >have to be complex to model basilar membrane filtering.
> >> >Subsequent
> >> >half-wave rectification does not eliminate all phase information,
> >>
> >>An old specialist of power electronics like me cannot retrace how you
> >>imagine rectification of a complex-valued function of time.
> >>
> >>My wife is a teacher for adults. Perhaps she would more heedfully
> >>anticipate what you and many others are feeling rather than thinking. I
> >>will try and elucidate how engineers handle a similar case: Consider an
> >>ideal sinusoidal voltage as a real input into a circuit that may also
> >>contain a first (small) resistor and a reactance in series. Parallel to the
> >>first resistor there are a diode and a much larger second impedance in
> >>series. The voltage across the first resistor is a complex quantity with
> >>respect to the source but pretty independent of the diode. However,
> >>piecewise linear calculation requires to refer to the current through the
> >>diode as a real one. In case of hearing, phase of the stimulus does not
> >>matter since it anyway relates to an arbitrary reference.
> >>
> >>As a rule, recognized experts like you tend to be cautious against
> >>radically uncommon views. Therefore I would like to ask you: Look at
> >>pattern of BM motion (e.g. T. Ren's) or of firing in the auditory nerve.
> >>They do not resemble magnitude, to say nothing of phase. As far as I can
> >>judge, they resemble the pattern of the natural (real-valued) spectrogram.
> >>More in detail: Magnitude cannot account for the different patterns with
> >>rarefaction vs. condensation clicks while positive and negative amplitudes
> >>of the natural spectrogram clearly differ from each other.
> >>
> >>In all, I didn't find any tenable argument in favor of complex cochlear
> >>function. On the other hand, Fourier cosine transform, the natural
> >>spectrogram and joint autocorrelation already resolved a lot of so far
> >>poorly understood questions.
> >>
> >>Incidentally, I recall a textbook denying any difference between time
> >>domain and frequency domain. I do not fully share this opinion. In
> >>particular, I consider it necessary to clearly distinguish between real
> >>world and fictitious complex domain.

_____________________________
Julius O. Smith III <jos(at)ccrma.stanford.edu>
Assoc. Prof. of Music and (by courtesy) Electrical Engineering
CCRMA, Stanford University
http://www-ccrma.stanford.edu/~jos/
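
Two mathematical points raised in the quoted thread can be checked numerically: that 2A cos(omega t) is the sum of A exp(i omega t) and A exp(-i omega t), and that a segment continued as its own mirror image has a purely real (cosine-only) spectrum. The following is only a minimal sketch of those two identities, not a model of the cochlea or of the "natural spectrogram". The helper names `dft` and `even_mirror` and all numeric values are assumptions chosen for illustration.

```python
import cmath
import math


def dft(seq):
    """Plain O(M^2) discrete Fourier transform, for illustration only."""
    m = len(seq)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * n / m) for n, v in enumerate(seq))
        for k in range(m)
    ]


def even_mirror(x):
    """Extend x[0..N-1] to length 2N-1 so that y[n] == y[(-n) mod (2N-1)]."""
    return list(x) + list(reversed(x[1:]))


# Identity from the thread: 2A cos(wt) = A exp(i wt) + A exp(-i wt).
A, w, t = 0.7, 2.0, 0.3  # arbitrary illustrative values
lhs = 2 * A * math.cos(w * t)
rhs = A * cmath.exp(1j * w * t) + A * cmath.exp(-1j * w * t)
identity_error = abs(lhs - rhs)

# A sinusoid segment with an arbitrary starting phase (neither sin nor cos).
N = 16
x = [math.sin(2 * math.pi * 3 * n / N + 0.4) for n in range(N)]

# Once mirrored about n = 0, its DFT has no imaginary part beyond rounding
# error, so only cosine terms are needed to represent it.
spectrum = dft(even_mirror(x))
max_imag = max(abs(c.imag) for c in spectrum)
```

Both `identity_error` and `max_imag` should come out at the level of floating-point rounding. Note that the mirrored sequence is a different signal from the original segment, so this shows only that the mirrored continuation needs no sine terms; it does not settle the disagreement in the thread about whether phase is discarded.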