Copyright © 2009 by Wayne Stegall

Updated December 2, 2015. See Document History at end for
details.

Having had a turntable as the center of my first hi-fi system, I was disappointed when I heard my first CD player. Like others, I thought the treble was coarse and grainy. Although I was immediately hooked on the absence of noise, the quality of the sound was not what I expected.

My first CD player had an 18-bit digital filter on the front end of its DAC. I suspect round-off error to be the primary reason for my initial disappointment with CD playback. That upsampling has recently been favored over oversampling suggests the importance of bit resolution. In my efforts to determine the difference between these two processes, I found synchronous upsampling and oversampling to be the same. The difference in sound may be due to bit resolution. Oversampling is usually done in the DAC, presumably at its resolution of 16, 18, 20, or 24 bits. Currently, the best external asynchronous resamplers (which are used to do synchronous upsampling) have an internal resolution of 28 bits. Jitter reduction may add to the improvement in the case of asynchronous upsampling, but that subject has been thoroughly covered by others. It is also worth noting that a CD player using a new 32-bit AKM DAC is now highly acclaimed. Some hold that the higher noise of analog, and the resulting limit on dynamic range, diminishes the need for higher bit resolutions. I believe the resolution of analog is greater than its dynamic range. This is the subjective opinion of many. That a signal can be extracted from a much louder noise background by correlation algorithms is objective support for the same conclusion.
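
The correlation argument can be illustrated with a short sketch. The Python below is only an illustration under assumed values, not a measurement of any real playback chain: it buries a 1kHz tone under Gaussian noise roughly 23dB louder (RMS against RMS) and then correlates the result against sine and cosine references at the known frequency. The function names, the tone and noise levels, and the one-second length are all assumptions chosen for the example.

```python
import math
import random

FS_HZ = 44100.0  # CD sampling rate


def make_noisy_tone(tone_hz=1000.0, tone_amp=0.1, noise_rms=1.0,
                    count=44100, seed=1):
    """Return a sine of peak amplitude tone_amp buried in Gaussian noise.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        tone_amp * math.sin(2.0 * math.pi * tone_hz * k / FS_HZ)
        + rng.gauss(0.0, noise_rms)
        for k in range(count)
    ]


def correlate_amplitude(signal, freq_hz, fs_hz=FS_HZ):
    """Estimate the peak amplitude of a sinusoid at freq_hz.

    The signal is correlated against sine and cosine references so that
    the result does not depend on the unknown phase of the tone.
    """
    sin_sum = 0.0
    cos_sum = 0.0
    for k, x in enumerate(signal):
        angle = 2.0 * math.pi * freq_hz * k / fs_hz
        sin_sum += x * math.sin(angle)
        cos_sum += x * math.cos(angle)
    # The factor 2/N turns the correlation sums into an amplitude estimate.
    return 2.0 * math.hypot(sin_sum, cos_sum) / len(signal)


if __name__ == "__main__":
    noisy = make_noisy_tone()
    print("estimate at 1kHz (tone present):", correlate_amplitude(noisy, 1000.0))
    print("estimate at 3kHz (noise only):  ", correlate_amplitude(noisy, 3000.0))
```

With one second of samples the estimate at 1kHz should land near the true 0.1 amplitude, while the estimate at a frequency with no tone stays much smaller, even though no individual sample reveals the tone. Longer correlation windows tighten the estimate further.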

It is evident that a waveform at half the sampling frequency (f_{s}/2 = 22.05kHz) is forced into phase with the digital clock, and its amplitude is diminished from its true value by an amount related to the phase compression. To what degree are lower frequencies subject to the same effect? Because of three-phase AC motors' reputation for smoothness, it occurred to me that with a signal at one third of the sampling rate (f_{s}/3 = 14.7kHz), the 120-degree separation of sampling vectors would certainly allow for proper phase and magnitude representation. From this frequency down, all would be well in this regard. Above this frequency, there is the potential for trouble. Mathematically, the digital filter was expected to eliminate all of these effects, because presumably the error in the signal is contained in the out-of-band digital images created by sampling. The phase and magnitude anomalies would be filtered out along with the unwanted images. This math depends on the signal being predictable and steady, however. Music is transient by nature. If the signal does not hold steady long enough for a reasonable representation of the beat frequencies created by the sampling process, there may be no certainty that the exact phase and magnitude can be extracted from frequencies above f_{s}/3. The importance of the f_{s}/3 frequency of 14.7kHz may explain why some have noticed the CD medium to be deficient and others have not. Many do not have hearing above this frequency due to the abuse of concerts and other loud music.
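
The f_{s}/2 case can be checked with a few lines of code. The sketch below, a minimal illustration rather than a model of any real converter, samples a unit-amplitude sine at exactly half the sampling rate for several starting phases and reports the largest raw sample value. It looks only at the samples themselves, before any reconstruction filter, and the function names and the list of phases are assumptions made for the example.

```python
import math

FS_HZ = 44100.0  # CD sampling rate


def sample_sine(freq_hz, phase_deg, count=64, fs_hz=FS_HZ):
    """Return `count` samples of a unit-amplitude sine with the given phase."""
    phase = math.radians(phase_deg)
    return [
        math.sin(2.0 * math.pi * freq_hz * k / fs_hz + phase)
        for k in range(count)
    ]


def largest_sample(freq_hz, phase_deg):
    """Largest absolute sample value: the apparent peak seen by the sampler."""
    return max(abs(x) for x in sample_sine(freq_hz, phase_deg))


if __name__ == "__main__":
    for phase_deg in (0, 30, 60, 90):
        peak = largest_sample(FS_HZ / 2.0, phase_deg)
        expected = abs(math.sin(math.radians(phase_deg)))
        print(f"phase {phase_deg:2d} deg: largest sample {peak:.3f}, "
              f"|sin(phase)| {expected:.3f}")
```

At f_{s}/2 the samples alternate between plus and minus sin(phase), so the apparent amplitude depends entirely on how the waveform happens to line up with the clock, and a waveform aligned with its zero crossings produces essentially no signal at all.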

Figure 1: f_{s}/2 Polar Plot - Black arrows show sampling vectors at f_{s}/2 (22.05kHz). Full phase compression, vectors cannot add to any other phase. Magnitude trigonometrically related to phase alignment.

Figure 2: f_{s}/3 Polar Plot - Black arrows show sampling vectors at f_{s}/3 (14.7kHz). Pink and red vectors show one example of how two sampling vectors can add to any desired magnitude and phase angle.
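
The vector addition shown in Figure 2 can be sketched numerically. Assuming each sampling instant contributes a vector along a fixed direction (0 and 120 degrees at f_{s}/3, 0 and 180 degrees at f_{s}/2), the Python below solves for the two scale factors whose weighted sum equals a chosen target phasor. The function name `decompose` and the target of magnitude 0.8 at 50 degrees are hypothetical, picked only for illustration.

```python
import cmath
import math


def decompose(target, angle_a_deg, angle_b_deg):
    """Find real weights (a, b) such that a*u_a + b*u_b equals target.

    u_a and u_b are unit vectors at the two given angles. Raises
    ValueError when the directions are collinear, because then only
    targets lying along that single line can be reached.
    """
    u_a = cmath.rect(1.0, math.radians(angle_a_deg))
    u_b = cmath.rect(1.0, math.radians(angle_b_deg))
    # Solve the 2x2 real system with Cramer's rule.
    det = u_a.real * u_b.imag - u_a.imag * u_b.real
    if abs(det) < 1e-9:
        raise ValueError("sampling vectors are collinear")
    a = (target.real * u_b.imag - target.imag * u_b.real) / det
    b = (u_a.real * target.imag - u_a.imag * target.real) / det
    return a, b


if __name__ == "__main__":
    target = cmath.rect(0.8, math.radians(50.0))  # hypothetical target phasor

    a, b = decompose(target, 0.0, 120.0)  # f_s/3: vectors 120 degrees apart
    print(f"f_s/3 weights: a = {a:.4f}, b = {b:.4f}")

    try:
        decompose(target, 0.0, 180.0)  # f_s/2: vectors 180 degrees apart
    except ValueError as err:
        print("f_s/2:", err)
```

At f_{s}/3 a pair of weights exists for any target. At f_{s}/2 the two directions are collinear, so no general solution exists, which is the phase compression of Figure 1.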

Mention of the transient nature of music affecting digital reproduction suggests another limitation. First consider the nature of the digital signal. The time-sampled signal is related through a discrete Fourier transform to a discrete frequency spectrum. Because the Fourier transform is linear, we can break up our music into separate Fourier transforms, each corresponding to a different transient segment of the entire recording. This is true for the continuous or analog Fourier transform as well. For clarity, let's define a musical impulse as the smallest transient of music separated from the whole that has mathematical significance. Now, having each musical impulse in a separate sequence of samples, we can analyze each separately.

It is apparent from the properties of the transform and the digital domain that the frequency resolution of each musical impulse or transient depends on its duration. This is because there are about half as many frequencies represented as time domain samples. Specifically, the lowest represented frequency, the separation of discrete frequencies, and the resulting frequency resolution are all the inverse of the transient duration. For example, a 1ms^{1} transient can only represent frequencies from 1kHz to 22kHz in 1kHz steps. A 5ms transient can only resolve frequencies from 200Hz up in 200Hz steps. A 1s musical impulse could resolve 1Hz frequency distinctions. This also implies that the frequencies of a transient are ambiguous at the first instant and become more defined as the transient proceeds. This can be observed when tuning a string on a musical instrument, whether by ear or with an electronic tuner. All of the musical impulses in a recording would have different resolutions depending on their duration.

Because the information contained in a transient is proportional to its length, this limitation is at first a natural one. These natural signal limitations are then approximated by the number of samples that the sampling rate allows during the transient. Upon sampling, it becomes very clear that the limited number of samples translates to a similar granularity of frequency resolution. A musical impulse having more frequency information than its duration allows would compress the extra frequency information, creating the sort of digital graininess that many actually claim to hear. A cymbal crash, resembling white noise of continuous frequency, is a prime example of a transient that has more frequency information than its length will allow after sampling. These results may seem odd, but they represent the breakdown of wide-ranging music into its most fundamental building blocks. That is, as a whole, the sum of all the impulses conveys the sound with all of its apparent frequencies plus added digital artifacts.
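
The relationship between transient duration and frequency resolution can be worked through for the three durations mentioned above. This is a minimal sketch assuming the 44.1kHz CD sampling rate and a transient that occupies a whole number of samples (fractional samples are simply dropped). The function name `transient_resolution` is hypothetical.

```python
import math

FS_HZ = 44100.0  # CD sampling rate, assumed for this example


def transient_resolution(duration_s, fs_hz=FS_HZ):
    """Return (samples, bin spacing in Hz, number of nonzero frequency bins).

    The bin spacing of a discrete Fourier transform over N samples is
    fs/N, which is approximately the inverse of the transient duration.
    """
    n_samples = math.floor(duration_s * fs_hz)
    spacing_hz = fs_hz / n_samples
    n_bins = n_samples // 2  # bins above DC, up to half the sampling rate
    return n_samples, spacing_hz, n_bins


if __name__ == "__main__":
    for duration_s in (0.001, 0.005, 1.0):
        n_samples, spacing_hz, n_bins = transient_resolution(duration_s)
        print(f"{duration_s * 1000:7.1f} ms: {n_samples:5d} samples, "
              f"{n_bins:5d} frequencies, {spacing_hz:8.2f} Hz apart")
```

The spacing comes out slightly above 1kHz and 200Hz for the two short transients only because 1ms and 5ms do not contain a whole number of samples at 44.1kHz; the 1s case gives exactly 1Hz.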

These thoughts suggest that greater bit resolution and higher sampling rates are indeed desirable after all.

^{1} s is the metric abbreviation for second; ms is short for millisecond = 1/1000 of a second.

Document History

May 23, 2009 Created.

May 23, 2009 Revised.

April 19, 2012 Changed seconds unit from S to s to avoid confusion with the Siemens unit.

December 2, 2015 Improved formatting.