

Hostess Twinkies and the Music Industry

If you love food or music, this is a great time to be alive.

Following customer demand, segments of the food industry have been emerging over the past 30 years with a focus on quality and variety, rather than just sheer quantity.  Even in a modestly urban place like Salt Lake City, the food choices are astonishing, and constantly improving.  From natural and specialty food stores, to innovative new restaurants, microbreweries, farmers’ markets, cooking classes, and affordable new cooking technology, the landscape is amazing and growing quickly.

Meanwhile, other segments of the food industry have remained stagnant.  The recently announced (second) bankruptcy of Hostess Brands, Inc. has been blamed on labor unions and corporate mismanagement, among other things.  However, the recent history of the company reflects a slow downward spiral, driven by an increasing mismatch between the food landscape, customer values, and the products the company chose to produce and market.  In the 1960s, Wonder Bread and Twinkies were great products with large and growing markets.  At the time, even “supermarkets” didn’t have anything near the variety of products available today, and our understanding of the relationship between food and health was not particularly well-developed.  Sliced puffy bread and tasty treats with essentially infinite shelf life were just the thing.  Success generates momentum, and the company moved forward like a freight train.  Unfortunately for the company’s 18,500 employees and their families, nobody seemed to notice when the fuel was running out.

Similarities with the music industry are hard to ignore.  Anyone interested in music, either as a listener or as a performer/producer, is met today with a practically endless variety of products, services, and technology.  The quality is there, too, although it may not always be easy to find.  The internet has made it possible for practically anyone to understand and experience music in far greater depth and detail than ever before.  Learning an instrument has never been easier, and producing a track (or an album) is a task easily within the grasp of a sufficiently motivated amateur.  Innovative companies are serving customer demand and developing new technologies, immersing themselves in this new environment, having fun, and making a good living.

And then there is the music recording industry.  Locked in an outmoded business model, lacking innovative ideas or technical expertise, and overstaffed with lawyers, their freight train is running out of fuel.  Their traditional role of identifying, nurturing, and developing talent, while providing a strong marketing platform, has degenerated into that of a marketing-driven producer of banal wonder bread.  Nickelback, anyone?


Random thoughts on noise

Generally we think of noise as an unwanted, irrelevant artifact.  Almost always in acoustics, electrical engineering, and applied physics, the goal is to reduce noise, or at least to extract useful information from a noisy signal.  In our daily lives, it seems that acoustic noise is a constant, if not conscious presence.

Human brains are amazingly adept at filtering noise from both auditory and visual sources.  No current computational algorithm can even come close to matching the brain’s performance on distinguishing noise from relevant information.

In applied mathematics, the term white noise has a very precise definition.  In fact, it has many equivalent definitions, perhaps the simplest being the derivative of the Wiener process (also known as Brownian motion).  One of its characteristics is a constant power spectral density.  White noise turns out to be an extremely important concept which appears in many different contexts, including finance, diffusion processes, quantum mechanics, electronics, and communications.  In all of these areas, an analytical understanding of noise and its properties leads to great insights into the behavior of real systems.

Audible filtered white noise can actually be quite pleasant.  Think of the sound of a soft rain, the rustling of dry leaves in the fall, wind in the trees, or waves on the beach–each sound has a lot in common with filtered white noise.

It may be that one way of distinguishing between annoying noise and pleasant noise is asking whether there is a possibility that some portion of the sound may contain something important.  Processing noise that may contain interesting information, like the randomly overlapping voices at a party, requires much more mental energy than simply enjoying the natural sound of a waterfall.  It is very hard to ignore voices, even when they are not speaking to us.  Presumably this is why such sounds are more distracting.

Taking this idea a little further, information theory relates information to changes in entropy.  Audible white noise (and pink noise, and brown noise) has greater entropy, and carries less meaningful information for a listener, than speech, rock concerts, and the sound of your neighbor’s lawnmower.  It would be interesting to study whether or not there is a substantial correlation between the information content of sounds and their perceived level of annoyance.  (If any readers are aware of such studies, we would love to hear about them.)

In certain very specific situations, noise can actually make signals easier to detect and interpret.  This is called stochastic resonance.  It happens when there is a nonlinear detection process (such as a simple threshold), and the addition of noise to a weak signal brings the sum, on average, into a more easily detectable range.
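The threshold case described above can be sketched in a few lines of Python.  This is only an illustration under assumed numbers (the threshold, signal amplitude, noise level, and the function name count_detections are all our own choices, not taken from any particular study): a weak sine wave that never reaches the threshold on its own is combined with Gaussian noise, and we count how often the detector fires during the positive and negative halves of the signal.

```python
import math
import random

THRESHOLD = 1.0       # detector fires only when its input exceeds this level
AMPLITUDE = 0.5       # weak signal: on its own it never reaches the threshold
PERIOD = 100          # samples per cycle of the signal
N_SAMPLES = 10_000    # 100 full cycles


def count_detections(noise_sigma, seed=0):
    """Return (hits while the signal is positive, hits while it is not)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits_pos = hits_neg = 0
    for n in range(N_SAMPLES):
        signal = AMPLITUDE * math.sin(2 * math.pi * n / PERIOD)
        noisy = signal + rng.gauss(0.0, noise_sigma)
        if noisy > THRESHOLD:
            if signal > 0:
                hits_pos += 1
            else:
                hits_neg += 1
    return hits_pos, hits_neg


quiet = count_detections(noise_sigma=0.0)   # signal alone
noisy = count_detections(noise_sigma=0.4)   # signal plus moderate noise
print("no noise:  ", quiet)
print("with noise:", noisy)
```

Without noise the detector never fires.  With moderate noise it fires mostly while the signal is positive, so the pattern of detections carries information about a signal that was previously invisible.  Too much noise would wash this effect out again.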

Clearly noise has different meanings in different contexts.  Furthermore, there are large variations among individuals as to what constitutes noise.  Of course, even the same person might define noise differently depending on the time of day or the circumstances under which a particular sound is heard.

No real conclusion here.  Okay, maybe just one: the single word “noise” is inadequate to convey all of the meanings and connotations we ascribe to it.


The history of sound

This article describes some interesting points in the early history of the science of sound and how it relates to our modern understanding.

People have been interested in the sense of hearing and the phenomenon of sound since ancient times.  Unlike many other fundamental aspects of our physical world, like light, electricity and magnetism, the ancients had a pretty good idea of what sound is and how it behaves.

Many people believe that the Greek philosopher and mathematician Pythagoras of Samos (famous for the Pythagorean Theorem) was the first to understand the relationship between the frequency of vibration of a string and the perceived pitch of the generated sound.  He also noticed that a larger amplitude in the vibration of the string created louder sounds.  This would have been in the 6th century BC, more than 2500 years ago.  Whether or not Pythagoras himself is responsible for the theory, it is clear that his contemporaries had a good understanding of musical scales and pitch, based on numbers and ratios.

Although it is not entirely clear who actually wrote the text, sometime around 350 BC, Aristotle or one of his contemporaries put forth the idea that sound travels through the air when the motion of an object causes the neighboring air to move, and that this disturbance makes its way to the listener’s ear, where it is perceived as a sound.  This is described in the discourses Sound and Hearing:

Sound takes place when bodies strike the air, not by the air having a form impressed upon it, as some think, but by it being moved in a corresponding manner; the air being contracted and expanded and overtaken, and again struck by the impulses of the breath and strings.  For when the air falls upon and strikes the air which is next to it, the air is carried forward with an impetus, and that which is contiguous to the first is carried onward; so that the same voice spreads every way as far as the motion of the air takes place.

A wave theory of sound, building on the ideas of Aristotle, and containing many of the basic elements of our modern understanding, seems to have taken shape by about the first century BC, or perhaps earlier.  The ideas were probably developed by the Greek Stoics.  Writings from that time correctly describe sound propagating outward from a source in spherical or circular waves, in analogy to water waves moving out from a disturbance on a pond.  Unfortunately, not much of the writings of the Stoics survived, with most of our information coming from secondary sources.  It is hard to know exactly how much they understood.

Some of the most interesting records from this period come from the Roman architect and engineer, Marcus Vitruvius Pollio (see reference [1]).  His ten-volume book De Architectura, probably written around 15 BC, contains some amazing insights on acoustics, and clearly demonstrates a fairly sophisticated understanding of sound.  Many of the same issues encountered in modern architectural acoustics were already known to Vitruvius more than 2000 years ago.  The complete text of a 1914 English translation of De Architectura, by M. H. Morgan of Harvard University, is in the public domain and available from Project Gutenberg.  The entire text is worth reading, but a few passages from Book V are particularly noteworthy.  First, from Chapter III:

6. Voice is a flowing breath of air, perceptible to the hearing by contact. It moves in an endless number of circular rounds, like the innumerably increasing circular waves which appear when a stone is thrown into smooth water, and which keep on spreading indefinitely from the centre unless interrupted by narrow limits, or by some obstruction which prevents such waves from reaching their end in due formation. When they are interrupted by obstructions, the first waves, flowing back, break up the formation of those which follow.

7. In the same manner the voice executes its movements in concentric circles; but while in the case of water the circles move horizontally on a plane surface, the voice not only proceeds horizontally, but also ascends vertically by regular stages. Therefore, as in the case of the waves formed in the water, so it is in the case of the voice: the first wave, when there is no obstruction to interrupt it, does not break up the second or the following waves, but they all reach the ears of the lowest and highest spectators without an echo.

Apparently Vitruvius understood not only the spherical expansion of wave fronts, but also the effects of reflections via interference with the incoming wave.  Even more interesting is this quote from Chapter VIII,  Acoustics of the Site of a Theater:

1. All this having been settled with the greatest pains and skill, we must see to it, with still greater care, that a site has been selected where the voice has a gentle fall, and is not driven back with a recoil so as to convey an indistinct meaning to the ear. There are some places which from their very nature interfere with the course of the voice, as for instance the dissonant, which are termed in Greek κατηχουντες; the circumsonant, which with them are named περιηχουντες; again the resonant, which are termed ἀντηχουντες; and the consonant, which they call συνηχουντες. The dissonant are those places in which the first sound uttered that is carried up high, strikes against solid bodies above, and, being driven back, checks as it sinks to the bottom the rise of the succeeding sound.

2. The circumsonant are those in which the voice spreads all round, and then is forced into the middle, where it dissolves, the case-endings are not heard, and it dies away there in sounds of indistinct meaning. The resonant are those in which it comes into contact with some solid substance and recoils, thus producing an echo, and making the terminations of cases sound double. The consonant are those in which it is supported from below, increases as it goes up, and reaches the ears in words which are distinct and clear in tone. Hence, if there has been careful attention in the selection of the site, the effect of the voice will, through this precaution, be perfectly suited to the purposes of a theatre.

While relying on ideas from the Greeks, Vitruvius knew both the causes and the effects of echo and reverberation, and seems to have understood the importance of passive sound reinforcement in large auditoriums.  The Greeks and the Romans relied mainly on tradition and intuition for guidance in theater design.  The popularity and availability of the books of Vitruvius made the Greek wave theory of sound available to western thinkers for many centuries.  Boethius, around 500 AD, promoted the wave theory, and also used the analogy of ripples on water, but never seemed to mention earlier sources for his ideas.  His book De Musica was used as a text in universities for hundreds of years [1].

Incidentally, Julius Caesar is said to have issued the first law regarding noise.  Irritated by chariots disturbing his sleep, he restricted traffic in the area surrounding his residence at night.

With the Renaissance in Europe, scientific progress accelerated on all fronts, including acoustics.  While in the late fifteenth century, Leonardo da Vinci was still describing sound waves much like Vitruvius did, by the seventeenth century, careful experiments in acoustics were being carried out following the modern scientific method.  Many of the great scientists of the time had interests in acoustics, and work on the subject was widespread all the way through the 1800s.  Certainly there was too much interesting work to summarize in a brief article such as this.  We will only mention a few highlights.

Marin Mersenne (1588-1648) made the first experimental estimation of the speed of sound, and published the influential book Harmonie Universelle, on sound and music.  Mersenne was a friend of Descartes, who also had interests in music.  Isaac Newton’s development of calculus in the late 1600s enabled the formation of a truly modern theory of sound, as scientists now had the mathematical language necessary to describe the physical phenomena they studied.  Newton himself made an estimate of the speed of sound, based entirely on theory.  Newton’s calculation was later corrected by the French mathematician Pierre-Simon Laplace in the early 1800s.  In 1863, the German physicist Hermann von Helmholtz published the book Die Lehre von den Tonempfindungen als physiologische Grundlage für die Theorie der Musik (On the Sensations of Tone as a Physiological Basis for the Theory of Music).  This work remains influential to this day, as does the 1877 two-volume work The Theory of Sound by Lord Rayleigh.

The history of sound is unusual among the sciences, given its amazing longevity and obvious interest to so many great scientific minds of the past.  The story is not over yet.  While the physics of sound propagation is very well understood, a long list of difficult practical problems remains, as we struggle to control sound and create better acoustical environments.  Current scientific research in acoustics does not enjoy the popularity (or funding) of subjects like photonics or particle physics, but acoustics will always be a direct and fundamental part of the human experience.


[1] Frederick G. Kilgour, “Vitruvius and the Early History of Wave Theory”, Technology and Culture, vol. 4, no. 3, Summer 1963, pp. 282-286.


Articulation index

The articulation index is a number used in architectural acoustics to measure the ease with which a listener can understand speech in a given environment.  The articulation index takes on values between 0.0 and 1.0, with larger numbers indicating better speech intelligibility.  This article explains what the articulation index means, how it is calculated or measured, and how it can be used in practice to design or improve listening spaces.

The classical experiment to determine speech intelligibility is called an articulation test.  It involves placing a human speaker and one or more human listeners (with normal hearing) in the environment to be measured.  The speaker reads a list of words or sentences, while the listener writes them down.  Lists of 100 single-syllable words, the so-called R lists, which were developed in the 1940s by Professors Egan and Stevens of the Harvard University Psycho-Acoustic Laboratory for evaluating electronic communication systems, have been widely used in articulation tests.  The percentage of correct responses by the listener provides a good indication of how easy or difficult it is to understand speech in the given environment.

Although articulation tests provide a direct measurement of speech intelligibility, they have some problems.  First, the results depend upon the human participants in the experiment and the particular word lists being used, and tend not to be perfectly repeatable.  Second, the tests do not provide any information about why one environment may score higher than another.  If the goal is to control the speech intelligibility of a space, it is important to have a clear connection between the physical characteristics of the sound field and the measure of intelligibility.

These concerns led to the development of the articulation index (AI).  The articulation index is a single number which can be interpreted very similarly to the results of an articulation test, but is based on physically measurable quantities, and is quite easy to compute.  The factors used to compute the AI also give insight into how to control speech intelligibility in a given environment.

The articulation index is calculated using signal-to-noise ratios (S-N ratios).  A signal-to-noise ratio is just what it says: the level of a signal (in this case, the voice we are trying to hear) divided by the level of noise (which constitutes all other sounds besides the voice).  Since we typically measure sound pressure levels in decibels (dB), a unit proportional to a logarithm of sound intensity ratios, the S-N ratio can be calculated as the difference between the dB levels of signal and noise, and is itself expressed in dB.  For example, if we have a signal at 63 dB with background noise at 42 dB, the S-N ratio is 63 – 42 = 21 dB.  The larger the S-N ratio, the easier it is to understand the speech.

The articulation index refines the S-N ratio by taking into account both the average spectral content of human speech, and the relative importance of each frequency band in contributing to speech intelligibility.  First, the background noise level is measured in five octave bands, centered at 250 Hz, 500 Hz, 1000 Hz, 2000 Hz, and 4000 Hz.  Second, the sound pressure level of the speech at the listener’s location is calculated or measured in each of the same frequency bands.  Average sound pressure levels for human speech within each frequency band are well-known, and can be used in the calculation.  Next, the S-N ratio for each frequency band is calculated.  If the S-N ratio is negative (meaning the noise level is higher than the signal), it is recorded as zero.  If the S-N ratio is greater than 30 dB, it is recorded as 30 dB, since a greater S-N ratio generally does not further improve intelligibility.  Finally, the five S-N ratios are multiplied by weighting factors, and the results summed.  The result is the articulation index.  The weighting factors are chosen to reflect the relative importance of each frequency band in contributing to speech intelligibility, and are as follows:

Center frequency (Hz)    Weighting factor
250                      0.0024
500                      0.0048
1000                     0.0074
2000                     0.0109
4000                     0.0078

A little arithmetic shows that if the weighting factors are multiplied by 30 and summed, the result is 0.999, or 1.0 after rounding.  So if the S-N ratio in each of the five frequency bands meets or exceeds 30 dB, the highest possible articulation index of 1.0 is attained.
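The calculation described above is short enough to sketch in Python.  The weighting factors are the ones listed in the table; the function name articulation_index and the speech and noise levels in the example are hypothetical, chosen only for illustration.

```python
# Octave-band weighting factors from the table above, keyed by center frequency in Hz.
WEIGHTS = {250: 0.0024, 500: 0.0048, 1000: 0.0074, 2000: 0.0109, 4000: 0.0078}


def articulation_index(speech_db, noise_db):
    """Compute the AI from per-band speech and noise levels (dicts keyed by Hz)."""
    ai = 0.0
    for freq, weight in WEIGHTS.items():
        sn = speech_db[freq] - noise_db[freq]   # S-N ratio in dB
        sn = min(max(sn, 0.0), 30.0)            # record negatives as 0, cap at 30 dB
        ai += weight * sn
    return ai


# Hypothetical band levels in dB, for illustration only.
speech = {250: 60, 500: 63, 1000: 58, 2000: 52, 4000: 46}
noise = {250: 50, 500: 42, 1000: 40, 2000: 35, 4000: 50}
print(round(articulation_index(speech, noise), 3))
```

Note that the 4000 Hz band in this example has more noise than speech, so it contributes nothing to the total.  This sketch does not include the reverberation or visual corrections that are sometimes applied to the AI.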

The value of the articulation index can be broken down into several important ranges.  An AI value greater than 0.7 reflects “very good” speech intelligibility.  AI values between 0.5 and 0.7 are “good”, between 0.3 and 0.5 “marginal”, and less than 0.3 are “poor”.  Designs for classrooms and lecture halls generally aim for articulation indexes of at least 0.7 throughout the room.

Although the articulation index is calculated using only S-N ratios, there are some additional considerations which are often taken into account to modify the AI value.  First, room reverberation plays a role.  While early sound reflections tend to reinforce sounds (see the Haas effect), later reflections are more like noise and decrease speech intelligibility.  Hence, in reverberant rooms, the articulation index is usually corrected by decreasing AI by about 0.1 for each second of reverberation time (see footnote 1).  Second, listeners generally have an easier time understanding speakers when they can see them clearly.  In these cases, poor and marginal AI values can be increased by about 0.1 (see footnote 1).

The articulation index is useful not only when the goal is improved speech intelligibility, but also when speech privacy is desired, as between offices.  In this case, the lower end of the AI scale is used, with low speech intelligibility implying good privacy.  AI scores of 0.0 to 0.05 are rated as providing “very good” speech privacy, 0.05 to 0.15 “good” privacy, 0.15 to 0.2 “marginal”, and greater than 0.2 are “poor”.  Speech privacy can be improved by decreasing the S-N ratios across the five frequency bands, either by increasing background noise (see masking noise), or by decreasing the level of the speech signal through improved sound isolation.
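The two rating scales can be written as a pair of small lookup functions.  This is a sketch only: the article gives the ranges but not which category a value exactly on a boundary belongs to, so the handling of boundary values below is an assumption, as are the function names.

```python
def intelligibility_rating(ai):
    """Rate speech intelligibility for an AI value between 0.0 and 1.0."""
    if ai > 0.7:
        return "very good"
    if ai >= 0.5:       # boundary handling is an assumption
        return "good"
    if ai >= 0.3:
        return "marginal"
    return "poor"


def privacy_rating(ai):
    """Rate speech privacy; a lower AI means better privacy."""
    if ai <= 0.05:      # boundary handling is an assumption
        return "very good"
    if ai <= 0.15:
        return "good"
    if ai <= 0.2:
        return "marginal"
    return "poor"


print(intelligibility_rating(0.45), "/", privacy_rating(0.45))
```

The same AI value reads very differently on the two scales: a room that is only marginal for listening to a lecture still offers poor privacy between offices.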

Like many of the single-number measures used in acoustics, the articulation index is not a perfect tool.  It is impossible to condense all of the subtleties of complex acoustical phenomena into a single number.  However, when used properly, the articulation index can provide useful information and practical design targets.


1 There are well-defined formulaic approaches for these corrections, which are beyond the scope of this introduction.  See American National Standards Institute, New York, Methods for the Calculation of the Articulation Index, ANSI S3.5-1969.


Sound transmission class

The sound transmission class (STC) is a single-number rating of how effective certain materials are at controlling transmission of sound.  This article explains how the sound transmission class is defined, what it means, and how to use it.  A few examples of the sound transmission class for various building materials are provided at the end of the article.

As described in our previous article on the noise reduction coefficient, sound energy hitting a planar slab of material is partly reflected, partly transmitted, and partly absorbed.  The sound transmission class (STC) is a single number that characterizes the loss of energy as sound is transmitted from one side of the material through to the other side.  The STC is used to describe how effective a piece of construction will be at isolating sounds.  This is fundamentally different from the noise reduction coefficient (NRC), which is used to characterize and predict sound reflection behavior, determining acoustical properties within a room.

The definition of sound transmission class is based on the measurement of transmission loss (TL).  Transmission loss is the difference in sound pressure level (SPL) in decibels (dB) as sound is transmitted from a source on one side of a panel to a receiver on the other.  For example, if a sound source in a room on one side of a wall is creating a 90 dB sound pressure level, and the SPL on the receiver side is 60 dB, the transmission loss of the wall is in principle (see footnote 1) 90 – 60 = 30 dB.  Although this sounds like a simple measurement, it requires a specially constructed and expensive setup with two rooms (reverberation chambers) which are acoustically isolated except at the common wall or panel.  Because transmission depends on the angle of sound incidence to the wall, reverberation chambers are used to “mix up” the sound incidence, so that the transmission loss measurement is actually a kind of average over many different incidence angles.
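The arithmetic in the example above, together with the kind of correction mentioned in footnote 1, can be sketched as follows.  The correction term used here, 10 log10(S/A) with S the panel area and A the total absorption in the receiving room, is the relation commonly quoted for laboratory measurements; it is not spelled out in this article, so treat it and the function names as assumptions.

```python
import math


def level_difference(source_db, receiver_db):
    """Simple difference in sound pressure level between the two rooms, in dB."""
    return source_db - receiver_db


def transmission_loss(source_db, receiver_db, panel_area_m2, absorption_m2):
    """Level difference corrected for panel area and receiving-room absorption.

    Assumed relation: TL = (L_source - L_receiver) + 10 * log10(S / A).
    """
    correction = 10 * math.log10(panel_area_m2 / absorption_m2)
    return level_difference(source_db, receiver_db) + correction


print(level_difference(90, 60))                        # the example from the text
print(round(transmission_loss(90, 60, 10.0, 20.0), 1)) # hypothetical room values
```

When the receiving room has more absorption than the panel has area, part of the measured level difference is due to the room rather than the panel, so the corrected transmission loss comes out lower than the simple difference.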

Transmission loss is strongly dependent upon the frequency of the sound.  To understand the transmission loss across a range of frequencies, one would have to consult a table of numbers, or a graph.  Such detailed information is necessary (or at least useful) for projects involving critical listening spaces, like recording studios or home theaters.  However, for general purpose construction projects, a simpler, single-number rating is more appropriate.  The sound transmission class (STC) fills that role.  Laboratories compute the STC by first measuring the transmission loss of a panel at sixteen standard frequencies between 125 Hz and 4000 Hz (a predominant frequency range for speech).  The TL versus frequency graph is then compared to a standard graph, called the STC contour, shown to the right.  The idea is to shift the STC contour up along the y-axis in a very specific way (see footnote 2) until it sits as high as possible against the TL graph for the sample being rated.  The STC rating of the sample is then defined to be the value at 500 Hz on the shifted STC contour.  Although the STC rating number itself is not in dB, it can be thought of roughly as a kind of “characteristic” transmission loss in dB, across the mid-frequency range.
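The contour-fitting procedure can be sketched in Python.  The two limits come from footnote 2 (an average of 2 dB over sixteen bands is a total of 32 dB).  The shape of the contour itself is not given in this article; the offsets below are the values we understand the standard contour to have relative to its 500 Hz point, and should be checked against the rating standard before any real use.  The function name stc_rating is our own.

```python
# One-third-octave band centers from 125 Hz to 4000 Hz (sixteen bands).
BANDS_HZ = [125, 160, 200, 250, 315, 400, 500, 630,
            800, 1000, 1250, 1600, 2000, 2500, 3150, 4000]

# Assumed contour shape, in dB relative to the contour's value at 500 Hz.
CONTOUR_OFFSETS = [-16, -13, -10, -7, -4, -1, 0, 1, 2, 3, 4, 4, 4, 4, 4, 4]


def stc_rating(tl_db, max_deficiency=8, max_total=32):
    """Return the highest integer contour position that satisfies both limits."""
    if len(tl_db) != len(BANDS_HZ):
        raise ValueError("expected sixteen transmission loss values")
    best = None
    for candidate in range(0, 151):
        # A deficiency is the amount by which the TL falls below the contour.
        deficiencies = [max(0.0, candidate + off - tl)
                        for off, tl in zip(CONTOUR_OFFSETS, tl_db)]
        if max(deficiencies) <= max_deficiency and sum(deficiencies) <= max_total:
            best = candidate
        else:
            break  # deficiencies only grow as the contour moves up
    return best


# Hypothetical panel whose TL curve has exactly the contour shape.
tl = [40 + off for off in CONTOUR_OFFSETS]
print(stc_rating(tl))
```

A single deep dip in the TL curve can limit the rating through the 8 dB rule even when every other band performs well, which is one reason the full TL curve is worth consulting for critical spaces.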

The most important material property affecting STC (and TL) is mass, or more precisely surface density (mass per unit area).  The heavier the material, the better it is at sound isolation.  By doubling the mass of a panel, one can generally expect an increase in STC of about 6.
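The figure of about 6 per doubling is what one gets from a logarithmic rule of the form 20 log10(mass ratio).  That formula is an idealized assumption on our part (often called the mass law), not something measured for any particular product, and real assemblies frequently gain less.

```python
import math


def stc_change_from_mass(mass_ratio):
    """Idealized change in STC when mass per unit area is scaled by mass_ratio."""
    return 20 * math.log10(mass_ratio)


print(round(stc_change_from_mass(2), 1))   # doubling the mass
print(round(stc_change_from_mass(4), 1))   # doubling it twice
```

As a rough check against the table below, 1/4 inch glass is rated 31 and 1/8 inch glass 26, a gain of 5 for a doubling of thickness.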

The table below shows some representative numbers for the sound transmission class of various materials.  These numbers can vary widely depending on the properties of the specific materials and construction, and who measures them.  Always consult manufacturers or reliable laboratory data before specifying materials for an actual project.

Material Assembly                                                                     STC
Plywood, 3/8 inch                                                                     22
Gypsum drywall, 1/2 inch thick                                                        28
Gypsum drywall, 1/2 inch thick, on both sides of 16 inch on center wood 2×4 studs     33
Same as above, but with fiberglass insulation between the gypsum panels               37
Gypsum drywall, 5/8 inch thick, on both sides of staggered 2×4 studs, with 2×6 plate  47
Solid concrete, 6 inches thick                                                        55
8 inch lightweight hollow concrete block                                              45
1 3/4 inch hollow wood interior door, no seals                                        19
1 3/4 inch solid wood exterior door, with seals                                       34
Glass, 1/8 inch thick                                                                 26
Glass, 1/4 inch thick                                                                 31
Insulating double glass window unit (consult manufacturer)                            28 – 42


1 It’s actually a bit more complicated than this. The noise level in the receiving room also depends on the area of the panel and the sound absorption in the room.  Both of these factors can be determined independently, and the true TL can be calculated from the difference in SPL, with a minor modification.
2 The scale on the y-axis of the STC contour is shifted upward as far as possible while two conditions remain satisfied: (a) the maximum difference (STC – TL) at any frequency is less than or equal to 8 dB, and (b) the average difference (STC – TL) across all sixteen measured frequencies is less than or equal to 2 dB.  Only frequencies where the TL falls below the contour count toward these differences.


Noise reduction coefficient

The noise reduction coefficient (NRC) is a measure of how effective certain building materials are at controlling sound within a room.  This article describes briefly how the NRC is defined, what it does and does not mean, and how to use it.  Some examples of noise reduction coefficients for typical building materials are also provided.

When a sound wave in air hits one side of a planar material element, a portion of the incoming sound energy is reflected, a portion is absorbed, and a portion is transmitted through to the other side.  In room acoustics, the primary concern is usually with how sound behaves within the room.  From this point of view, when sound hits a wall or ceiling, it doesn’t really matter whether it is absorbed or transmitted; either way sound energy leaves the room.  So, the sound absorption coefficient of a material is usually defined to be the proportion of energy absorbed by or transmitted through the element.  In other words, the absorption coefficient is 1 minus the proportion of reflected energy.  It is therefore always a number between zero and one.  The absorption coefficient depends both on the frequency of the sound and on the angle at which the sound hits the surface.  Since one cannot generally predict beforehand from which directions the sound in a room will strike the surface, the absorption coefficient is usually averaged over random directions.  This is accomplished by measuring the absorption coefficient in a reverberation chamber, which is a room specially constructed to disperse sound in many different directions.

The noise reduction coefficient is defined to be the average of the random incidence absorption coefficients at four specific frequencies (250 Hz, 500 Hz, 1000 Hz, and 2000 Hz), rounded to the nearest 0.05.  So for example, if the four-frequency average of the absorption coefficients is 0.27, the NRC would be 0.25.
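As a quick sketch, the definition translates directly into Python.  The function name nrc is our own, and the choice to round exact ties upward is an assumption, since the definition above does not say how ties are handled.

```python
import math


def nrc(alpha_250, alpha_500, alpha_1000, alpha_2000):
    """Average four absorption coefficients and round to the nearest 0.05."""
    average = (alpha_250 + alpha_500 + alpha_1000 + alpha_2000) / 4
    # There are 20 steps of 0.05 between 0 and 1; ties round up (an assumption).
    return round(math.floor(average * 20 + 0.5) / 20, 2)


# Hypothetical coefficients for a material that absorbs mostly at high frequencies.
print(nrc(0.10, 0.30, 0.60, 0.75))
```

Very different sets of coefficients can produce the same NRC, because the averaging discards all information about which frequencies are being absorbed.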

Here are some things to keep in mind when considering NRC values.  First, it is important to understand that the NRC is merely an average at four different frequencies.  The 250, 500, 1000, 2000 Hz frequencies were chosen for the rating because these cover the range where interior noise generated by human speech is mostly concentrated.  The rating was designed for specification of materials for offices, waiting areas, restaurants, and so on.  Because NRC is an average, it gives absolutely no information about how effective the material is at controlling sound at specific frequencies.  For example, a material with absorption coefficients of  0.00 at 250 Hz, 0.00 at 500 Hz, 0.25 at 1000 Hz and 0.75 at 2000 Hz would have an NRC value of 0.25, even though it apparently provides no absorption at all at 500 Hz and below.  Furthermore, the frequency range of 250 Hz to 2000 Hz is a small fraction of the range of human hearing (usually taken to be about 20 Hz to 20,000 Hz).  The NRC provides no information at all about high frequencies or low frequencies.  When considering materials for a musical listening area or home theater, NRC ratings simply do not provide enough information to make informed decisions.

Second, remember that NRC is defined in terms of the sound absorption coefficient, which in room acoustics unfortunately does not measure absorption alone, since it also counts sound that is transmitted through the material.  The classic example is an open window.  From a room acoustics point of view, the window has an absorption coefficient of 1.00 at all frequencies, because all of the sound incident upon the opening leaves the room.  The NRC rating for the open window would be 1.00.  But the window does nothing to reduce outside noise entering the room, and it also does not reduce the amount of noise escaping the room, perhaps to a room next door.  If a brick wall is built just outside the open window, suddenly the window’s NRC rating of 1.00 means nothing, since the brick will reflect most of the sound back into the room.  A common situation where this distinction is important is with dropped acoustical ceilings.  If the ceiling is suspended below a reflective surface, like a concrete slab, ceiling tiles with a large NRC rating would not be a good choice for sound control, unless some other form of sound insulation is introduced in the air space.

Finally, keep in mind that the NRC does not measure total absorption in a room, only absorption per unit area.  This is probably obvious, but a 1 square meter section of carpeted flooring will only be roughly half as effective at reducing sound as a 2 square meter section.
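To make the point about area concrete, total absorption can be estimated by multiplying each surface’s area by its coefficient and adding up the results.  This is a rough sketch: using a single NRC value in place of frequency-dependent coefficients is a simplification, and the function name and room dimensions are hypothetical.

```python
def total_absorption(surfaces):
    """Sum area times coefficient over (area_m2, coefficient) pairs."""
    return sum(area * coefficient for area, coefficient in surfaces)


CARPET = 0.30  # heavy carpet on concrete, from the table of approximate values

one_square_meter = total_absorption([(1.0, CARPET)])
two_square_meters = total_absorption([(2.0, CARPET)])
print(one_square_meter, two_square_meters)

# Hypothetical room: 20 m2 carpeted floor, 20 m2 suspended ceiling, 54 m2 of drywall.
room = [(20.0, 0.30), (20.0, 0.55), (54.0, 0.05)]
print(round(total_absorption(room), 1))
```

In the hypothetical room the ceiling provides more absorption than the floor and walls combined, even though the walls have by far the largest area.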

Here is a list of a few approximate NRC ratings for some common building materials.  These numbers can vary depending on the properties of the specific materials being used and who measures them.  Consult manufacturers or laboratory data before beginning an actual project.

Material                                                          NRC value
Gypsum drywall, 1/2 inch thick, nailed to 16 inch on center 2x4s  0.05
Plaster on lath                                                   0.05
Plywood, 3/8 inch                                                 0.15
Window glass                                                      0.15
Medium weight drapery, pleated                                    0.55
Concrete floor                                                    0.00
Marble or glazed tile floor                                       0.00
Linoleum over concrete floor                                      0.05
Wood floor                                                        0.10
Heavy carpet on concrete                                          0.30
Heavy carpet on foam padding                                      0.55
Fiberboard suspended ceiling, 3/4 inch, medium texture            0.55
Thin, porous suspended ceiling, 3/4 inch                          0.75

Harmonic distortion

In audio circles, the terms harmonic distortion, and total harmonic distortion, are often heard, especially in reference to electronic equipment.  In this article, we would like to explain, hopefully in a simple and intuitive way, exactly what these terms mean, and what importance should be ascribed to them.  The formal definition of total harmonic distortion is very precise and easy to state, but it requires some technical knowledge that the average consumer of sound equipment may not possess.  Our goal here is to discuss the main ideas, with some audio and visual examples, and without the technical overhead.

Distortion, in an acoustical context, refers to a change in some characteristic of a sound.  So harmonic distortion is a change in the harmonics of a sound.  To describe harmonic distortion, then, we first need to talk a little bit about harmonics.  To begin the discussion, we’ll start with a seemingly unrelated example, the mass-spring system, also known as the simple harmonic oscillator.  This is a model that occurs over and over again in countless applications from electronics, to mechanics, to atomic physics.  Luckily, it is very easy to understand.  Hang a mass from a spring.  Pull the mass down, just a little bit, and let it go.  It will move up and down in a regular motion.  In an idealized world, with no mechanical loss and no friction (which can never occur in practice, but hey, it’s a model), the motion would continue indefinitely.  This is called simple harmonic motion.  If we keep track of the position of the mass at each time and plot it on a piece of graph paper, we get a characteristic shape called a sine wave, shown below.

Actually, this is a cosine wave, which is just a sine wave shifted to the left by a quarter of a cycle.  For the purposes of this article, we will disregard shifts to the right or left, and call everything with this characteristic shape a sine wave.  A sine wave is sort of a mathematical miracle.  If it is used to represent the position of an object, as in this example, it turns out that the velocity of the object also looks like a sine wave.  So does the acceleration.  From the picture above, we see that the object goes through one complete cycle per unit of time.  If time is represented in seconds, we would say that the object is oscillating at the frequency of one cycle per second, also known as 1 Hertz (written 1 Hz).
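
For readers who want to see why the velocity and acceleration have the same shape, here is a short worked derivation.  The symbols are our own assumptions: A is the amplitude of the motion, f is the frequency in Hz, and x(t) is the position of the mass at time t.

```latex
\begin{aligned}
x(t) &= A\cos(2\pi f t) \\
v(t) &= \frac{dx}{dt} = -2\pi f A \sin(2\pi f t) \\
a(t) &= \frac{dv}{dt} = -(2\pi f)^2 A \cos(2\pi f t) = -(2\pi f)^2\, x(t)
\end{aligned}
```

Each derivative is again a sinusoid at the same frequency, just rescaled and shifted by a quarter of a cycle.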

What does a bouncing mass have to do with sound?  A lot.  For example, the picture above might also represent the voltage at the output of your amplifier, on its way to drive your speakers.  This time, let’s assume that time is represented in milliseconds, so that the signal makes 1000 cycles per second, or 1000 Hz.  Ideally (neglecting room effects), the speakers will produce sound waves whose pressure variations look pretty much like the same sine wave when they reach your ears.  When that pressure wave does reach your ears, you will hear a “pure” tone, something like this:

Listen to the same sine wave two octaves lower, at 250 Hz:

Music would be quite boring if the notes were composed of sine waves, don’t you think?  Well, actually, all music is composed of sine waves, along with all other sounds and signals.  It is a mathematical fact that every “finite energy” signal can be represented by a sum (or integral) of sine waves (and cosine waves).  Read about Fourier series and the Fourier transform if you are interested in the math.  In a very real and practical sense, every sound we hear is a combination of sine waves.

A signal is called periodic if it repeats itself indefinitely.  The sine wave in the picture above is periodic, since it looks the same every multiple of one time unit.  If a periodic signal repeats itself 100 times per second (at frequency 100 Hz), then a sample of the signal 1/100th of a second long can be repeated to re-create the entire signal.  Periodic signals are perceived by the ear as having a distinct pitch.  For example, you would hear a periodic signal at 440 Hz as an ‘A’ on the musical scale.  The frequency of the signal determines the pitch.  What about the shape of the signal?  This determines what is usually called “timbre”; it is the characteristic tone.  A flute at 440 Hz sounds different from a violin at 440 Hz, because the timbre is different.  If you were to record each instrument with a microphone and plot the voltage versus time, you would see two signals with the same frequency, but with very different wave shapes.  Two artificial examples follow.  First, consider a “triangle wave”.  Three cycles of the wave look like this:

Listen to this waveform played at 250 Hz:

Next, here are three cycles of a “square wave”, followed by audio at 250 Hz:

Compare the sounds of the sine wave, the triangle wave, and the square wave.  Although they all have the same pitch and amplitude, the sine wave sounds more mellow than the triangle wave, and the square wave sounds rough or fuzzy.  It is harmonics that determine the distinct tonal qualities of the triangle and square waves.  So what are harmonics?

As mentioned above, all sounds are composed of sums or integrals of sine waves.  For periodic signals, the sum (called a Fourier series) is particularly simple.  Any periodic signal with frequency x can be decomposed into a sum of sine waves of frequency x, 2x, 3x, 4x, and so on.  These sine waves are the harmonics of the signal.  Although there are in principle an infinite number of harmonics, humans are unable to hear sounds at frequencies larger than 20,000 Hz.  It turns out that the square wave only has odd harmonics x, 3x, 5x, and so on.  So the 250 Hz square wave above would have harmonics at 250 Hz, 750 Hz, 1250 Hz, 1750 Hz, etc, but those above 20,000 Hz are not audible (and in fact, audio equipment usually does not reproduce frequencies beyond the audible range).
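
The harmonic series of the square wave can be sketched in Python as follows.  The function names are hypothetical, and the sketch assumes an ideal square wave switching between -1 and +1, for which harmonic number n has amplitude 4 / (pi * n).

```python
import math


def audible_square_harmonics(fundamental_hz, limit_hz=20_000):
    """Odd multiples of the fundamental that do not exceed the hearing limit."""
    harmonics = []
    n = 1
    while n * fundamental_hz <= limit_hz:
        harmonics.append(n * fundamental_hz)
        n += 2  # a square wave contains only odd harmonics
    return harmonics


def square_wave_partial_sum(t, fundamental_hz, limit_hz=20_000):
    """Approximate the square wave at time t (seconds) from its audible harmonics."""
    total = 0.0
    for freq in audible_square_harmonics(fundamental_hz, limit_hz):
        n = freq // fundamental_hz  # harmonic number: 1, 3, 5, ...
        total += math.sin(2 * math.pi * freq * t) / n
    return 4 / math.pi * total


harmonics = audible_square_harmonics(250)
print(harmonics[:4])   # [250, 750, 1250, 1750]
print(len(harmonics))  # 40
# Middle of the positive half-cycle: the sum should land close to +1.
print(square_wave_partial_sum(0.001, 250))
```

Even with only the 40 audible harmonics, the sum is already a good approximation of the flat top of the square wave.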

Ideally, audio equipment reproduces sine waves (harmonics) perfectly.  A perfect audio amplifier is linear, meaning that the output is a uniformly rescaled version of the input.  For example, if the triangle wave above represents the input to a perfect linear amplifier, the output waveform would look exactly the same, except that the numbers on the vertical scale would be larger.  Unfortunately, amplifiers are not perfect.  All amplifiers exhibit some nonlinearity.  Nonlinearity in this context means that the input waveform is not reproduced perfectly.  In particular, a sine wave at the input might not look exactly like a sine wave at the output.  Since it is no longer a sine wave, it must contain additional harmonics.  This is harmonic distortion.

Total harmonic distortion is measured (in principle; there are trickier ways) by presenting a sine wave of a particular frequency at the input, and measuring the total magnitude (RMS) of the additional harmonics present at the output.  This number is divided by the magnitude of the original sine wave (the fundamental frequency) at the output, to give total harmonic distortion (THD) as a percentage.  Usually, manufacturers measure THD across a range of frequencies, at a particular gain, and report the maximum THD.  Sometimes, the figure THD + N (total harmonic distortion plus noise) is reported.  This figure, similar to THD, is the ratio of the output power with the fundamental frequency removed to the power of the fundamental frequency.  This is actually a preferable figure, since noise is often more of a problem than distortion in modern audio gear.
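
Here is a minimal Python sketch of the in-principle THD measurement.  Everything in it is an assumption made for illustration: toy_amplifier is a made-up nonlinearity (unity gain plus a small squared term), not a model of any real amplifier, and the harmonics are measured with a simple single-frequency Fourier sum over exactly one period of the test tone.

```python
import math


def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic in one full period of a sampled signal."""
    n = len(samples)
    cos_sum = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    sin_sum = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(cos_sum, sin_sum) / n


def thd_percent(samples, max_harmonic=10):
    """RMS sum of harmonics 2..max_harmonic divided by the fundamental, in percent."""
    fundamental = harmonic_amplitude(samples, 1)
    extra = math.sqrt(sum(harmonic_amplitude(samples, k) ** 2
                          for k in range(2, max_harmonic + 1)))
    return 100 * extra / fundamental


def toy_amplifier(x):
    """Hypothetical, slightly nonlinear amplifier: unity gain plus a squared term."""
    return x + 0.1 * x * x


N = 1024  # number of samples in exactly one period of the test tone
sine_in = [math.sin(2 * math.pi * i / N) for i in range(N)]
sine_out = [toy_amplifier(x) for x in sine_in]

print(round(thd_percent(sine_in), 3))   # 0.0
print(round(thd_percent(sine_out), 3))  # 5.0
```

The squared term turns part of the sine wave into a second harmonic with 5% of the amplitude of the fundamental, so the toy amplifier has a THD of 5%.  Noise is not modeled here, so this is THD rather than THD + N.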

So, how important is THD?  Well, it depends.  In the early days of audio, when amplification was accomplished with vacuum tubes, equipment tended to produce a lot of distortion.  Sometimes, distortion is actually desirable, as with guitar amplifiers.  The amplifier distortion is part of the distinctive electric guitar sound.  Even today, many people prefer the sound of tube-powered audio gear, distortion and all.  The reasons are debatable, but it seems that tube amps typically produce primarily even-order harmonics, which sound “more musical”, since the lowest of them (the second and fourth harmonics) fall on octaves of the fundamental.  When transistor amplifiers first came along, the distortion tended to be very harsh, due to signal “clipping”, which produces odd harmonics as in the square wave example above.  For these amplifiers, only a much lower level of THD was tolerable.  Modern, high-quality audio equipment generally has very respectable, if not amazing, THD figures.  In our experience, THD below about 1% for an amplifier is generally not audible in anything short of a clinical listening environment.  Generally, speakers and room acoustics will completely overwhelm the effects of harmonic distortion present in the amplifier output.  However, people are different, and so are amplifiers.  The best thing a consumer can do is to compare different amp possibilities side-by-side, with the speakers they intend to use, playing their preferred type of music.  Forget about the THD numbers and choose the amp that sounds best to you.

Finally, it’s worth mentioning that harmonic distortion can in some cases be caused by the listening environment itself.  Our favorite example is the ubiquitous teenager driving around in a 30-year-old compact car with 2000 watt amplifiers in the trunk.  In addition to whatever distortion is produced by the amplifiers and speakers, the audible rattle of the car itself at high volume is harmonic distortion.  We love it.


Backing tracks

Backing tracks are prerecorded, or pre-programmed, pieces of music sometimes used by bands to supplement or replace the sounds of instruments or vocals when playing live.  A wide range of practices, implemented with a variety of technologies, fall under the category of backing tracks.  Perhaps the most minimal backing tracks are simply pre-programmed sequences of notes played through a keyboard or synthesizer, usually an arpeggio or other complex or rapid run.  Many small-scale bands use “drum machines”, which are synthesizers programmed to play percussion parts, often replacing a live drummer.  Singers sometimes use prerecorded vocal harmonies when performing live.  This usually happens when the vocal harmonies are too complex to pull off perfectly live, or when other band members are unable to sing them.  This is technically different from using an autotuner or other voice processors (which can provide harmonies), since the input to these devices is provided in real time through the microphone, although improving technology continues to blur the line.  More extensive uses of backing tracks could include adding a recorded instrumental piece, replacing entire instrumental parts with recordings, or even replacing lead vocals with recordings.  At the most extreme, the band just pretends to play, while the singer lip-syncs to a CD.

Fans are often surprised to learn that most major touring bands use backing tracks in one form or another during their live shows.  The reasons given in favor of this practice are that (1) fans expect the live show to sound like the recorded version of the song, (2) a larger, more complex, and more complete sound can be obtained with fewer musicians on stage (thus reducing the cost of presenting the show), (3) offloading some of the musical performance duties onto backing tracks allows band members to focus more on their visual performance, (4) some major labels actually contractually require their bands to use backing tracks on tour, and (5) most fans never know the difference.  Along with the increasing acceptance of backing tracks in major bands, and as the technology continues to become better and cheaper, more and more small-scale bands (especially cover bands) are also using backing tracks.

Needless to say, backing tracks are not without controversy.  Lip-syncing has been going on at least since the 1950s.  Fans generally feel betrayed by a “faked” lead vocal performance, and singers caught lip-syncing can face criticism or humiliation.  Faked instrumental performances seem to draw less criticism.  Maybe the more attention focused on the performer, the less acceptable it is to not actually be playing.  A faked lead guitar solo would probably not go over well (if it was discovered), but adding an extra instrumental track is deemed okay.

A lot of musicians are offended by backing tracks.  First, they argue that a live performance should be a Live Performance.  In other words, the beauty and magic of a musical performance comes from sounds being created and enjoyed live and completely in the moment.  A backing track cannot respond to the emotions of the audience or to the chemistry of the band on a given night.  The sense of unlimited possibility in a live performance cannot come through when the band is forced to stick to a script.  Fans come to a live show to see the band actually play and perform, even if the performance is not flawless, and even if it doesn’t sound exactly like the CD.  Second, backing tracks are seen by some as an attack on the craft of being a skilled live musician and artist.  Why sweat through countless hours of practice and study, when it requires no performance skills to sit down at the computer and cook up a decent track to be played later at the show?  Finally, some musicians argue that backing tracks put musicians out of business.  One less person on stage is one less paycheck.

There seems to be some generational difference in feelings on this issue.  Younger people, having grown up with DJs, techno, house music, hip-hop, and rap, are much less concerned with whether or not a band or act has a real bass player.  The focus at live shows seems to be more on the crowd experience and the visual performance.  Older people, who grew up marveling at the technical skills of their favorite artists, tend to have more negative feelings about backing tracks.

We are interested to learn what you think.  What do you want to hear at a live performance?  Do backing tracks bother you, or not?  Where do you draw the line on acceptable uses of backing tracks?  At $100 per ticket, do you expect a live band, or karaoke?  What use of backing tracks would make you feel ripped off or deceived?  As live music continues to become integrated with computer technology, what is the future of musical performance?


Masking noise

For most of us, the world we live in is noisier than what we would imagine as ideal, so that we are generally seeking quieter spaces.  Suppose you lived in a world in which you had access to a “master volume” control, allowing you to control your surrounding noise level at all times (without affecting your ability to hear other sounds).  In this situation, what noise level do you think you would choose?  In other words, how quiet is too quiet?

There is a specific type of listening room, called an anechoic chamber, designed for making acoustic laboratory measurements.  The walls of an anechoic chamber are designed to absorb all sound waves originating from inside the room, and to block all outside sound from entering the room.  In short, it is very, very quiet inside an anechoic chamber.  Most people find it uncomfortable to spend a long time in such a room.  With background noise reduced far below the normal level, personal noises like breathing, clothes rustling, and even the heartbeat, may become noticeable, perhaps even annoying.  The absence of background noise can be eerie and unsettling.

So, it seems that some level of noise makes us more comfortable, in part by masking other sounds that we don’t need to pay attention to.  Acoustical engineers caught on to this phenomenon a long time ago, and subsequently took it to the next level: if a little bit of background noise is good, why not supply even more?  This can be used to improve privacy in an office setting, by making it harder to understand or notice conversations in neighboring offices or cubicles.  The main advantage of this approach is that it is much less expensive than direct sound isolation strategies.  Some buildings actually implement masking directly, by mounting loudspeakers in the ceiling, attached to an electronic noise generator.  Usually the building occupants have no idea that this system is in place.  The masking noise typically has somewhat different spectral content from that of white noise.  Here is an example of electronic masking noise:

Other buildings generate controlled noise indirectly through the HVAC system, or by providing specially designed “open” spaces, in which the sounds generated by a large group of people combine to provide a relatively uniform masking noise.

In most industrialized countries, the government regulates the permissible noise exposure level for workers.  In the US, these regulations were originally written with industrial environments in mind, where workers are regularly exposed to damaging noise from heavy equipment.  Generally, workers are only allowed to be exposed to high levels of noise for short periods of time.  The higher the noise level, the shorter the allowed exposure time.  The catch is that below a certain noise level (90 dBA), the exposure time is unlimited.  For reference, 90 dBA is really loud, comparable to somebody shouting at you from a couple of feet away, or freeway traffic from 30 feet.  Hearing damage has been documented from prolonged exposure at 85 dB, and the US EPA has identified 70 dB as the maximum allowed level to protect against damaging effects of noise.  Acoustical engineers generally strive to stay far away from these limits when designing noise masking strategies.  Nevertheless, the legal limits are seldom a concern, and there is always a financial incentive for builders or owners to choose more masking over other, more expensive noise control strategies.
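
The trade-off between level and allowed exposure time can be sketched as follows.  We are assuming here that the US regulation in question is the OSHA occupational noise standard, which allows 8 hours at 90 dBA and halves the allowed time for every additional 5 dB.  The function name is our own, and treating everything below 90 dBA as unlimited simply follows the description above.

```python
import math


def permissible_hours(level_dba):
    """Allowed daily exposure in hours: 8 hours at 90 dBA, halved every 5 dB.

    Assumption: OSHA-style formula T = 8 / 2 ** ((L - 90) / 5).
    Levels below 90 dBA are treated as unlimited, as described in the text.
    """
    if level_dba < 90:
        return math.inf
    return 8 / 2 ** ((level_dba - 90) / 5)


for level in (85, 90, 95, 100, 105):
    print(level, permissible_hours(level))
```

Under these assumptions the allowed time falls from 8 hours at 90 dBA to 4 hours at 95 dBA, 2 hours at 100 dBA, and 1 hour at 105 dBA.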

It turns out that using music as masking noise is not very effective for improving speech privacy.  Almost all music has brief quiet periods, and furthermore the spectral content is not well-matched to that of speech.  For best speech privacy, designers aim to produce steady noise, similar in spectral content to generic human speech.  Studies have shown that most people are unaware of artificially generated masking noise in an office setting, up to a level of about 35 dBA.  Masking noise above about 45 dBA is loud enough that people start raising their voices in order to be heard.  At a 50 dBA noise level, it becomes difficult to communicate with someone more than 6 feet (2 m) away.  While this may be (almost) tolerable in certain environments, it is certainly not ideal.  In any kind of work setting requiring periods of concentration, like a software development or research lab, workers will undoubtedly not be performing at their optimal level, and will likely suffer stress and fatigue in addition to having a hard time communicating.  Clearly, it is possible to have too much background noise.  In this situation, business owners would be well served to turn off the masking noise generator and invest more in noise isolation and control.

If too little noise is a problem, and too much noise is also a problem, what is the “best” background noise level?  As you might guess, the answer depends greatly on the intended purpose of the space, and also varies widely between individuals.  For a dedicated listening room or home theater, it is probably safe to say that in practice, less outside noise is always better.  Achieving an eerie level of silence is difficult and expensive.  In other home living spaces, many people routinely create background noise with the television or stereo, but like to have the option of turning it off without being disturbed by neighbors, traffic, and other outside sounds.  And in the office, if workers ever find themselves distracted or irritated by the background noise, it is obviously too loud.  We have yet to hear of an office worker complaining that their working environment is too quiet.
