Let me describe my reasoning again in more detail. To simplify the
explanation, let's assume we have infinite-length signals and infinitely
narrow filters. Applying the filterbank to the signal leaves us with a
decomposition of the signal into its sinusoidal components. Since there is
only one sinusoid per channel, the spectrum at each channel consists of a
single pulse (possibly of zero magnitude) at the central frequency of the
channel. Computing autocorrelation at each channel corresponds to squaring
the magnitude of the spectrum of the signal (a single pulse) and
synthesizing a cosine at that frequency (by the Wiener–Khinchin theorem). The
summary autocorrelation just adds those cosines over channels.
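
The argument above can be checked numerically with a small sketch. This is only an illustration under idealized assumptions, not a real filterbank: each "channel" is synthesized directly as a single cosine, the signal length holds an integer number of cycles of every component (so the circular autocorrelation is exact), and the function names, frequencies, and amplitudes are made up for the example. It assumes NumPy is available.

```python
import numpy as np

def summary_autocorrelation(freqs_hz, amps, fs, n):
    """Sum over channels of each channel's circular autocorrelation.

    Idealized assumption: each channel holds exactly one sinusoid, so the
    filterbank is emulated by synthesizing the components separately.
    """
    t = np.arange(n) / fs
    sacf = np.zeros(n)
    for f, a in zip(freqs_hz, amps):
        channel = a * np.cos(2 * np.pi * f * t)
        # Squared magnitude of the channel spectrum (a single pair of pulses).
        power = np.abs(np.fft.fft(channel)) ** 2
        # Wiener-Khinchin: the inverse transform of the power spectrum is the
        # autocorrelation; dividing by n gives the average over samples.
        sacf += np.fft.ifft(power).real / n
    return sacf

def cosine_sum(freqs_hz, amps, fs, n):
    """Closed form predicted by the argument: one cosine per channel."""
    lags = np.arange(n) / fs
    return sum((a ** 2 / 2) * np.cos(2 * np.pi * f * lags)
               for f, a in zip(freqs_hz, amps))

# Hypothetical example values: 1 s of signal at 1 kHz, three components.
fs, n = 1000, 1000
freqs_hz = [100, 200, 300]
amps = [1.0, 0.5, 0.25]

sacf = summary_autocorrelation(freqs_hz, amps, fs, n)
expected = cosine_sum(freqs_hz, amps, fs, n)
print(np.allclose(sacf, expected))
```

In this sketch each channel contributes a cosine of amplitude a^2/2 at its own frequency, and the summary autocorrelation is the sum of those cosines. The phase of each component drops out, since only the squared magnitude of the spectrum is kept.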