Re: Speech(or Phase) Reconstruction from Magnitude Spectrum (Mark Hasegawa-Johnson )


Subject: Re: Speech(or Phase) Reconstruction from Magnitude Spectrum
From:    Mark Hasegawa-Johnson  <jhasegaw(at)UIUC.EDU>
Date:    Tue, 1 Feb 2005 15:45:28 -0600

Hi, I'm not sure if it's already been mentioned, but this article
demonstrates the (relatively easy) conditions under which exact signal
reconstruction from the magnitude STFT is possible. I think they also
give an algorithm:

@article{NawQua83,
  abstract="any signal can be reconstructed from magnitude of STFT if
    overlap >= half window length using linear equations based on
    |X|^2=FFT(autocorrelation). Interesting, but pedagogically dangerous
    because it obscures the more general but less efficient DCT
    reconstruction theorem.",
  author="S.~Hamid Nawab and Thomas F. Quatieri and Jae S. Lim",
  journal=tassp,
  keywords="speech coding, digital signal processing",
  number="4",
  pages="986-998",
  title="Signal Reconstruction from Short-Time Fourier Transform Magnitude",
  volume="31",
  year="1983"
}

Matt Flax wrote:

> Hi,
>
> This topic is very signal processing, or DSP. You will find efficient
> solutions by discussing this on the music-dsp e-mail list :
> http://ceait.calarts.edu/mailman/listinfo/music-dsp
>
> Yes you are correct. You do want to 'complexify' the magnitude only
> signal. You are now going down a road which is well trodden, let me
> propose another approach ...
>
> Rather than think about the instantaneous phase of the signal, consider
> how the signal will be processed in sequential blocks .... how do you
> combine blocks (windows) of processed signal ?
> You may want to look into the standard overlap add technique and combine
> it with your current direction.
>
> Back to your topic .... and in a slightly different approach ...
> This complexification can come in many standard forms. They
> include minimum phase, maximum phase, zero phase and also mixed phase.
> The 'phase' relates to how the signal energy is centered in the time domain.
>
> Say you do a zero phase realisation, then the overall signal power will
> fluctuate according to the STFT power in each Fourier block of data.
> So if you keep your block resolution small enough, you should be able to
> get a pretty good signal in the end .... this is in some way connected
> to the question ... "What is the best sized window required to represent
> speech ... ". The answer to that question must be, well, what do you
> want to represent best ?!@# and can be quite a complex issue ...
>
> I attach the opposite of what you want to do ... if you invert this one
> line algorithm then you will find your answer !!! Pretend the signal in
> the script is not in the time domain, but the frequency domain ...
> in other words whatever domain you put into the signal, you get out of
> the algorithm ... time -> time, frequency -> frequency, f(freq) ->
> f(freq) and so on....
>
> Be careful and remember some signals are energy and some signals are
> power ... these are non-linearly related ... so step your
> algorithm carefully from reading in the data to writing it out ...
>
> Matt
> --
> http://www.flatmax.org
>
> MFFM Bit Stream :
> http://sourceforge.net/projects/mffmbitstream/
> Other Projects :
> http://sourceforge.net/search/?type_of_search=soft&words=mffm
>
>
>
> ------------------------------------------------------------------------
>
> %# Copyright 2004 Matt Flax <flatmax(at)ieee.org>
> %# This file is a stand alone tool for generating a zero phase
> %# signal from a complex time signal
> %#
> %# It is free software; you can
> %# redistribute it and/or modify
> %# it under the terms of the GNU General Public License as published by
> %# the Free Software Foundation; either version 2 of the License, or
> %# (at your option) any later version.
> %#
> %# This file is distributed in the hope that it will be useful,
> %# but WITHOUT ANY WARRANTY; without even the implied warranty of
> %# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> %# GNU General Public License for more details.
> %#
> %# You have received a copy of the GNU General Public License
> %# along with this file, if not then please refer to www.gnu.org
> %# to gain access to the GNU GPL license.
>
> function [rSig,cSig]=complexSigToRealSig(complexSig)
>
> %# converts complexSig to rSig, with zero vector cSig returned.
> %# This function is a zero phase implementation.
>
> signal=ifft(sqrt(2*(abs(fft(imag(complexSig))).^2+abs(fft(real(complexSig))).^2)));
> rSig=real(signal);
> cSig=imag(signal);
>
>
> endfunction
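
As a rough illustration of two ideas in this thread, here is a minimal
sketch in Python (not the Octave of the attachment, so that it runs with
the standard library alone). It checks the identity
|X|^2 = FFT(autocorrelation) for a single frame, using a circular
autocorrelation, and it builds a zero-phase signal from a magnitude
spectrum. The function names, the example frame, and the naive O(N^2) DFT
are all assumptions made for illustration; this is neither the
Nawab/Quatieri/Lim algorithm nor Matt's script.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spec):
    """Naive inverse DFT, including the 1/N normalisation."""
    n = len(spec)
    return [
        sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def zero_phase_from_magnitude(mag):
    """Inverse-transform a magnitude spectrum with every phase set to zero."""
    return idft([complex(m, 0.0) for m in mag])


def circular_autocorrelation(x):
    """r[lag] = sum over t of x[t] * x[(t + lag) mod N], for a real frame x."""
    n = len(x)
    return [sum(x[t] * x[(t + lag) % n] for t in range(n)) for lag in range(n)]


# Hypothetical 8-sample real frame, chosen only for illustration.
frame = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.2, -0.6]

# Magnitude spectrum of the frame (the phase is discarded here).
magnitude = [abs(c) for c in dft(frame)]

# Zero-phase realisation: same magnitude spectrum, all phases zero.
zero_phase = zero_phase_from_magnitude(magnitude)

# The DFT of the circular autocorrelation should equal |X|^2.
power_from_autocorr = dft(circular_autocorrelation(frame))
```

The zero-phase frame has the same magnitude spectrum as the input and is
symmetric about n = 0, but it is not the original frame: the phase
information is gone, which is why recovering the signal needs the extra
constraint of overlapping windows.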


This message came from the mail archive
http://www.auditory.org/postings/2005/
maintained by:
DAn Ellis <dpwe@ee.columbia.edu>
Electrical Engineering Dept., Columbia University