Pro Engineer School Vol. 1

Contents

Microphone Technology

The Use of Microphones

Loudspeaker Drive Units

Loudspeaker Systems

Analog Recording

Digital Audio

Digital Audio Tape Recording

Appendix 1 – Sound System Parameters

Copyright Notice

This work is copyright © Record-Producer.com

You are licensed to make as many copies as you reasonably require for your own personal use.

Chapter 1: Microphone Technology

The microphone is the front end of almost all sound engineering activities and, as the interface between real acoustic sound travelling in air and the sound engineering medium of electronics, receives an immense amount of attention. Sometimes one could think that the status of the microphone has been raised to almost mythological proportions. It is useful therefore to put things in their proper perspective: there are a great many microphones available that are of professional quality. Almost any of them can be used in a wide variety of situations to record or broadcast sound to a professional standard. Of course different makes and types of microphones sound different to each other, but the differences don't make or break the end product, at least as far as the listener is concerned.

Now, if you want to talk about something that really will make or break the end product, that is how microphones are used. Two sound engineers using the same microphones will instinctively position and direct them differently and there can be a massive difference in sound quality. Give these two engineers other mics, whose characteristics they are familiar with, and the two sounds achieved will be identifiable according to engineer, and not so much according to microphone type.

There are two ways we can consider microphones: by construction and by directional properties. Let's look at the different ways a microphone can be made, to start off with.

Microphone Construction

There are basically three types of microphone in common use: piezoelectric, dynamic and capacitor. The piezoelectric mic, it has to be said, has evolved into a very specialized animal, but it is still commonly found under the bridge of an electro-acoustic guitar so it is worth knowing about.

Piezoelectric

The piezoelectric effect is where certain crystalline and ceramic materials have the property of generating a voltage when pressure or a bending force is applied. This makes them sensitive to acoustic vibrations and they can produce a voltage in response to sound. Piezo mics (or transducers as they may be called - a transducer is any device that converts one form of energy to another) are high impedance. This means that they can produce voltage but very little current. To compensate for this, a preamplifier has to be placed very close to the transducer. This will usually be inside the body of the electro-acoustic guitar. The preamp will run for ages on a 9 volt alkaline battery, but it is worth remembering that if an electro-acoustic guitar, or other instrument with a piezo transducer, sounds distorted, it is almost certainly the battery that needs replacing, perhaps after a year or more of service.

Dynamic

This is ‘dynamic’ as in ‘dynamo’. The dynamo is a device for converting rotational motion into an electric current and consists of a coil of wire that rotates inside the field of a magnet. Re-configure these components and you have a coil of wire attached to a thin, lightweight diaphragm that vibrates in response to sound. The coil in turn vibrates within the field of the magnet and a signal is generated in proportion to the acoustic vibration the mic receives. The dynamic mic is also sometimes known as the moving coil mic, since it is always the coil that moves, not the magnet - even though that would be possible.

The dynamic mic produces a signal that is healthy in both voltage and current. Remember that it is possible to exchange voltage for current, and vice versa, using a transformer. All professional dynamic mics incorporate a transformer that gives them an output impedance of somewhere around 200 ohms. This is a fairly low output impedance that can drive a cable of 100 meters or perhaps even more with little loss of high frequency signal (the resistance of a cable attenuates all frequencies equally, whereas the capacitance of a cable provides a path between signal conductor and earth conductor through which high frequencies can ‘leak’). It is not necessary therefore to have a preamplifier close to the microphone, neither does the mic need any power to operate. Examples of dynamic mics are the famous Shure SM58 and the Electro-Voice RE20. The characteristics of the dynamic mic are primarily determined by the weight of the coil slowing down the response of the diaphragm. The sound can be good, particularly on drums, but it is not as crisp and clear as it would have to be to capture delicate sounds with complete accuracy. Dynamic microphones have always been noted for providing good value for money, but other types are now starting to challenge them on these grounds.
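The point about output impedance and cable capacitance can be put into rough numbers by treating the two as a simple RC low-pass filter. The sketch below is only an approximation under stated assumptions: the figure of 100 pF per meter of cable is an assumed typical value that does not come from the text, and the function name and the 100 kilohm comparison source are purely illustrative.

```python
import math

def cable_cutoff_hz(source_impedance_ohms, cable_length_m, capacitance_pf_per_m=100.0):
    """Approximate -3 dB frequency of the RC low-pass filter formed by the
    source impedance and the total capacitance of the cable.

    capacitance_pf_per_m is an ASSUMED typical value, not taken from the text.
    """
    total_capacitance_f = cable_length_m * capacitance_pf_per_m * 1e-12
    return 1.0 / (2.0 * math.pi * source_impedance_ohms * total_capacitance_f)

# A 200 ohm dynamic mic driving 100 m of cable
low_z = cable_cutoff_hz(200, 100)
# A hypothetical 100 kilohm source driving the same cable, for comparison
high_z = cable_cutoff_hz(100_000, 100)

print(f"200 ohm source:  roll-off starts around {low_z / 1000:.1f} kHz")
print(f"100 kohm source: roll-off starts around {high_z:.0f} Hz")
```

Under these assumptions the 200 ohm source keeps its roll-off well above the audio band, while the high impedance source would lose most of the audible range over the same cable, which is why a high impedance transducer needs its preamplifier close by.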

Ribbon Mic

There is a variation of the dynamic mic known as the ribbon microphone. In place of the diaphragm and coil there is a thin corrugated metal ribbon. The ribbon is located in the field of a magnet. When the ribbon vibrates in response to sound it acts as a coil, albeit a coil with only one turn. Since the ribbon is very light, it has a much clearer sound than the conventional dynamic, and it is reasonable to say that many engineers could identify the sound of a ribbon mic without hesitation. If the ribbon has a problem, it is that the output of the single-turn ‘coil’ is very low. The ribbon does however also have a low impedance and provides a current which the integral transformer can step up so that the voltage output of a modern ribbon mic can be comparable with a conventional dynamic. Examples of ribbon mics are the Coles 4038 and Beyerdynamic M130.

Capacitor

The capacitor mic, formerly known as the ‘condenser mic’, works in a completely different way to the dynamic. Here, the diaphragm is paralleled by a ‘backplate’. Together they form the plates of a capacitor. A capacitor, of any type, works by storing electrical charge. Electrical charge can be thought of as a quantity of electrons (or the quantity of electrons that normally would be present, but aren't). The greater the disparity in number of electrons present – i.e. the amount of charge – the higher will be the voltage across the terminals of the capacitor. There is the equation:

Q = C x V

or:

charge = capacitance x voltage

Note that charge is abbreviated as ‘Q’, because ‘C’ is already taken by capacitance.

Putting this another way round:

V = Q/C

or:

voltage = charge / capacitance

Now the tricky part: capacitance varies according to the distance between the plates of the capacitor, falling as the plates move further apart. The charge, as long as it is either continuously topped up or not allowed to leak away, stays constant. Therefore as the distance between the plates is changed by the action of acoustic vibration, the capacitance will change and so must the voltage between the plates. Tap off this voltage and you have a signal that represents the sound hitting the diaphragm of the mic.
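The relationship V = Q/C can be tried out numerically. The following is a minimal sketch that assumes the capsule behaves as an ideal parallel-plate capacitor (capacitance = permittivity x area / gap); the plate area, resting gap and 48 V polarizing voltage are illustrative assumptions, not the figures of any real microphone.

```python
EPSILON_0 = 8.854e-12  # permittivity of free space in farads per meter

def capacitance(area_m2, gap_m):
    """Capacitance of an ideal parallel-plate capacitor (assumed model)."""
    return EPSILON_0 * area_m2 / gap_m

def plate_voltage(charge_c, area_m2, gap_m):
    """V = Q / C for a fixed charge on the plates."""
    return charge_c / capacitance(area_m2, gap_m)

# Illustrative (assumed) capsule dimensions and polarizing voltage
AREA = 1e-4          # 1 square centimeter of diaphragm
REST_GAP = 25e-6     # 25 micrometers between diaphragm and backplate
POLARIZING_V = 48.0

# The charge is fixed by the polarizing voltage at the resting gap
charge = capacitance(AREA, REST_GAP) * POLARIZING_V

# Move the diaphragm 1 micrometer either side of rest and read the voltage
for displacement in (-1e-6, 0.0, 1e-6):
    gap = REST_GAP + displacement
    volts = plate_voltage(charge, AREA, gap)
    print(f"gap {gap * 1e6:5.1f} um -> {volts:6.2f} V")
```

With the charge held constant, the voltage in this model follows the gap in direct proportion, so the movement of the diaphragm appears directly as a varying voltage.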

Sennheiser MKH 40

The great advantage of the capacitor mic is that the diaphragm is unburdened by a coil of any sort. It is light and very responsive to the most delicate sound. The capacitor mic is therefore much more accurate and faithful to the original sound than the dynamic. Of course there is a downside too. This is that the impedance of the capsule (the part of any mic that collects the sound) is very high. Not just high - very high. It also requires continually topping up with charge to replace that which naturally leaks away to the atmosphere. A capacitor mic therefore needs power for these two reasons: firstly to power an integral amplifier, and secondly to charge the diaphragm and backplate.

Old capacitor mics used to have bulky and inconvenient power supplies. These mics are still in widespread use so you would expect to come across them from time to time. Modern capacitor mics use phantom power. Phantom power, supplied from within the mixing console or remote preamplifier, places +48 V on both of the signal-carrying conductors of the microphone cable and 0 V on the earth conductor. So, simply by connecting a normal mic cable, phantom power is connected automatically. That's why it is called ‘phantom’ – because you don't see it! In practice this is no inconvenience at all. You have to remember to switch it on at the mixing console but that's pretty much all there is to it. Dynamic mics of professional quality are not bothered by the presence of phantom power in any way. One operational point that is important however is that the fader must be all the way down when a mic is connected to an input providing phantom power, or when phantom power is switched on. Otherwise a sharp crack of speaker-blowing proportions is produced.

A capacitor microphone often incorporates a switched -10 dB or -20 dB pad, which is an attenuator placed between the capsule and the amplifier to prevent clipping on loud signals.
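To see what the pad values mean in terms of signal voltage, the decibel figure can be converted to a voltage ratio using the standard relation ratio = 10^(dB/20). This is a small illustrative sketch; the function name is an assumption made for the example.

```python
def db_to_voltage_ratio(db):
    """Convert a level change in decibels to a voltage ratio."""
    return 10 ** (db / 20)

for pad_db in (-10, -20):
    ratio = db_to_voltage_ratio(pad_db)
    print(f"{pad_db} dB pad: output is {ratio:.3f} x the capsule voltage")
```

So a -20 dB pad passes one tenth of the capsule voltage to the amplifier, and a -10 dB pad a little under one third.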

Electret

The electret mic is a form of capacitor microphone. However the charge is permanently locked into the diaphragm and backplate, just as magnetic energy is locked into a magnet. Not all materials are suited to forming electrets, so it is usually considered that the compromises involved in manufacture degrade sound quality. However, it has to be said that there are some very good electret mics available, most of which are back-electrets, meaning that only the backplate of the capacitor is an electret, so the diaphragm can be made of any suitable material. Electret mics do still need power for the internal amplifier. However, this can take the form of a small internal battery, which is sometimes convenient. Electret mics that have the facility for battery power can also usually be phantom powered, in case the battery runs down or isn’t fitted.

Directional Characteristics

The directional characteristics of microphones can be described in terms of a family of polar patterns. The polar pattern is a graph showing the sensitivity in a full 360 degree circle around the mic. I say a family of polar patterns but it really is a spectrum with omnidirectional at one extreme and figure-of-eight at the other. Cardioid and hypercardioid are simply convenient way points.

To explain these patterns further, fairly obviously an omnidirectional mic is equally sensitive all round. A cardioid is slightly less obvious. The cardioid is most sensitive at the front, but is only 6 dB down in response at an angle of 90 degrees. In fact it is only insensitive right at the back. It is not at all correct to call this a unidirectional microphone, as commonly happens. The hypercardioid is a more tightly focussed pattern than the cardioid, at the expense of a slight rear sensitivity, known as a lobe in the response. The figure-of-eight is equally sensitive at front and back, the only difference being that the rear produces an inverted signal, 180 degrees out of phase with the signal from the front.
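The idea of a spectrum of patterns can be sketched with the usual textbook model of an idealized first-order microphone, in which the sensitivity at a given angle is a + (1 - a) x cos(angle), with a = 1 for omnidirectional and a = 0 for figure-of-eight. The model itself and the value a = 0.25 for the hypercardioid are assumptions brought in for illustration; real microphones only approximate these shapes.

```python
import math

# Fraction of the output that comes from the omnidirectional (pressure)
# component; the remainder comes from the figure-of-eight component.
# These are idealized textbook values, assumed for illustration.
PATTERNS = {
    "omnidirectional": 1.0,
    "cardioid": 0.5,
    "hypercardioid": 0.25,
    "figure-of-eight": 0.0,
}

def sensitivity(pressure_fraction, angle_deg):
    """Relative output (1.0 = on axis). A negative value means the signal
    is inverted in polarity relative to the front."""
    return pressure_fraction + (1.0 - pressure_fraction) * math.cos(math.radians(angle_deg))

def level_db(value):
    """Level in dB relative to on-axis; a null is reported as -inf."""
    if abs(value) < 1e-12:
        return -math.inf
    return 20 * math.log10(abs(value))

print(f"{'pattern':16s}{'0 deg':>8s}{'90 deg':>8s}{'180 deg':>8s}")
for name, a in PATTERNS.items():
    levels = [level_db(sensitivity(a, angle)) for angle in (0, 90, 180)]
    print(f"{name:16s}" + "".join(f"{lv:8.1f}" for lv in levels))
```

In this model the cardioid comes out about 6 dB down at 90 degrees with a null only at the rear, the hypercardioid shows its rear lobe, and the figure-of-eight gives a rear sensitivity of -1, that is, full level with inverted polarity.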

All of this is nice in theory, but is almost never borne out in practice. Take a nominally cardioid mic for example. It may be an almost perfect cardioid at mid frequencies, but at low frequencies the pattern will spread out into omni. At high frequencies the pattern will tighten into hypercardioid. The significant knock-on effect of this is that the frequency response off-axis – in other words any direction but head on – is never flat. In fact the off-axis response of most microphones is nothing short of terrible and the best you can hope for is a smooth roll-off of response from LF to HF. Often though it is very lumpy indeed. We will see how this affects the use of microphones at another time.

Omnidirectional

Looking at directional characteristics from a more academic standpoint, the omnidirectional microphone is sensitive to the pressure of the sound wave. The diaphragm is completely enclosed, apart from a tiny slow-acting air-pressure equalizing vent, and the mic effectively compares the changing pressure of the outside air under the influence of the sound signal with the constant pressure within. Pressure acts equally in all directions, therefore the mic is equally sensitive in all directions, in theory as we said. In practice, at higher frequencies where the size of the mic starts to become significant in comparison with the wavelength, the diaphragm will be shielded from sound approaching from the rear and rearward HF response will drop.

Figure-of-Eight

At the other end of the spectrum of polar patterns the figure-of-eight microphone is sensitive to the pressure gradient of the sound wave. The diaphragm is completely open to the air at both sides. Even though it is very light and thin, there is a difference in pressure at the front and rear of the diaphragm, and the microphone is sensitive to this difference. The pressure gradient is greatest for sound arriving directly from the front or rear, and lessens as the sound source moves round to the side. When the sound source is exactly at the side of the diaphragm it produces equal pressure at front and back, therefore there is no pressure gradient and the microphone produces no output. Therefore the figure-of-eight microphone is not sensitive at the sides. (You could also imagine that a sound wave would find it hard to push the diaphragm sideways – sometimes the intuitive explanation is as meaningful as the scientific one).

All directional microphones exhibit a phenomenon known as the proximity effect or bass tip-up. The explanation for this is sufficiently complicated to fall outside the required knowledge of the working sound engineer. The practical consequence is that close miking results in enhanced low frequencies. This produces a signal that is not accurate, but it is often thought of as being ‘warmer’ than the more objectively accurate sound of an omnidirectional microphone.

Cardioid and Hypercardioid

To produce the in-between polar patterns one could consider the omnidirectional microphone where the diaphragm is open on one side only, and the figure-of-eight microphone where the diaphragm is completely open on both sides. Allowing partial access only to one side of the diaphragm would therefore seem to be a viable means of producing the in-between patterns, and indeed it is. A cardioid or hypercardioid mic therefore provides access to the rear of the diaphragm through a carefully designed acoustic labyrinth. Unfortunately the effect of the acoustic labyrinth is difficult to equalize for all frequencies, therefore one would expect the polar response of cardioid and hypercardioid microphones to be inferior to that of omnidirectional and figure-of-eight mics.

Multipattern Microphones

There are many microphones available that can produce a selection of polar patterns. This is achieved by mounting two diaphragms back-to-back with a single central backplate. By varying the relative polarization of the diaphragms and backplate, any of the four main polar patterns can be created. It is often thought that the best and most accurate microphones are the true omnidirectional and the true figure-of-eight, and that mimicking these patterns with a multipattern mic is less than optimal. Nevertheless, in practice multipattern mics are so versatile that they are commonly the mic of first choice for many engineers.

AKG C414

Special Microphone Types

Stereo Microphone

Two capsules may be combined into a single housing so that one mic can capture both left and right sides of the sound field. This is much more convenient than setting two mics on a stereo bar, but obviously less flexible. Some stereo mics use the MS principle where one cardioid capsule (M) captures the full width of the sound stage while the other figure-of-eight capsule (S) captures the side-to-side differences. The MS output can be processed to give conventional left and right signals.
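The processing of MS into left and right is commonly done by sum and difference: left = M + S and right = M - S. The sketch below illustrates this on a few made-up sample values; the function name and the side_gain parameter (a stereo width control) are assumptions for the example, and practical decoders usually also scale the result to avoid overload.

```python
def ms_to_lr(mid, side, side_gain=1.0):
    """Sum-and-difference decoding of MS signals to left and right.

    mid and side are equal-length sequences of samples. side_gain is an
    illustrative width control: 0.0 gives mono, larger values give a
    wider image.
    """
    left = [m + side_gain * s for m, s in zip(mid, side)]
    right = [m - side_gain * s for m, s in zip(mid, side)]
    return left, right

# Made-up sample values purely for illustration
mid = [0.5, 0.2, -0.1]
side = [0.1, -0.2, 0.0]

left, right = ms_to_lr(mid, side)
print("left: ", [round(x, 3) for x in left])
print("right:", [round(x, 3) for x in right])
```

Because the S capsule is a figure-of-eight, sound from one side arrives with inverted polarity relative to the other, which is why adding it to M favours one side and subtracting it favours the other.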

Neumann stereo microphones

Interference Tube Microphone

This is usually known as a shotgun or rifle mic because of its similarity in appearance to a gun barrel. The slots in the barrel allow off-axis sound to cancel, giving a highly directional response. The longer the mic, the more directional it is. The sound quality of these microphones is inferior to normal mics so they are only used out of necessity.

Sennheiser interference tube microphone

A close relation of the interference tube microphone is the parabolic reflector mic. This looks like a satellite dish antenna and is used for recording wildlife noises, and at sports events to capture comments from the pitch.

Boundary Effect Microphone

The original boundary effect microphone was the Crown PZM (Pressure Zone Microphone) so the boundary effect microphone is often referred to generically as the PZM. In this mic, the capsule is mounted close to a flat metal plate, or inset into a wooden or metal plate. Instead of mounting it on a stand, it is taped to a flat surface. One of the main problems in the use of microphones is reflections from nearby flat surfaces entering the mic. By mounting the capsule within around 7 mm of the surface, these reflections add to the signal in phase rather than interfering with it. The characteristic sound of the boundary effect microphone is therefore very clear (as long as there are no other nearby reflecting surfaces). It can be used for many types of recording, and can also be seen in police interview rooms where obviously a clear sound has to be captured for the interview recording. The polar response is hemispherical.
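Why a small spacing helps can be estimated by asking at what frequency the reflection first arrives half a wavelength late and cancels the direct sound. The sketch below uses a deliberately simplified model that is an assumption for illustration only: sound arriving perpendicular to the surface (so the reflection travels an extra distance of twice the mic height), a perfectly reflecting surface, and a speed of sound of 343 m/s.

```python
SPEED_OF_SOUND = 343.0  # meters per second, assumed value for room temperature

def first_notch_hz(height_m):
    """Lowest frequency at which the surface reflection cancels the direct
    sound, for sound arriving perpendicular to the surface (simplified model).

    The reflection travels an extra 2 x height, and the first cancellation
    occurs when that extra path equals half a wavelength.
    """
    extra_path_m = 2.0 * height_m
    return SPEED_OF_SOUND / (2.0 * extra_path_m)

for height in (1.0, 0.1, 0.007):
    print(f"mic {height * 1000:6.0f} mm from surface -> first notch near {first_notch_hz(height):8.0f} Hz")
```

In this model a mic on a stand a meter above the floor has its first cancellation at a low frequency, with further notches repeating up through the spectrum, whereas at 7 mm the first notch is pushed to the top of the audio range. For sound arriving at a shallower angle the extra path is shorter, so the notch moves higher still.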

Crown PZM microphone

Miniature Microphone

This is sometimes known as a ‘tie-clip’ mic, although it is rarely clipped to the tie these days. This type of mic is usually of the electret design, which lends itself to very compact dimensions, and is almost always omnidirectional. Miniature microphones are used in television and in theater, where there is a requirement for microphones to be unobtrusive. Since the diaphragm is small and not in contact with many air molecules, the random vibration of the molecules does not cancel out as effectively as it does in a microphone with a larger diaphragm. Miniature microphones therefore have to be used close to the sound source; otherwise noise will be evident.

Beyerdynamic MCE5

Vocal Microphone

For popular music vocals it is common to use a large-diaphragm mic, often an old tube model. A large-diaphragm mic generally has a less accurate sound than a mic with a diaphragm 10-12 mm or so in diameter. The off-axis response will tend to be poor. Despite this, models such as the Neumann U87 are virtually standard in this application due to their enhanced subjective ‘warmth’ and ‘presence’.

Microphone Accessories

First in the catalogue of microphone accessories is the mic support. These range from table stands, short floor stands, normal boom stands and tall stands up to 4 meters for orchestral recording, to fishpoles as used by video and film sound recordists, and long booms with cable-operated mic positioning used in television studios. Attaching the mic to the stand is a mount that can range from a basic plastic clip to an elastic suspension or cradle that will isolate the microphone from floor noise.

The other major accessory is the windshield or pop-shield. A windshield may be made out of foam and slipped over the mic capsule, or it may look like a miniature airship covered with wind-energy dissipating material. For blizzard conditions windshield covers are available that look as though they are made out of yeti fur. The pop-shield, on the other hand, is a fine mesh material stretched over a metal or plastic hoop, used to filter out the blast of air caused by a voice artist's or singer's ‘P’ and ‘B’ sounds.

Check Questions

• What is the piezoelectric effect?

• Where would you find a piezo-electric transducer?

• What is attached to the diaphragm of a dynamic microphone?

• What passive circuit component is incorporated in the output stage of all professional microphones? (Note that some microphones use an active circuit to imitate the action of this component).

• Describe the sound of a dynamic microphone.

• How does a ribbon microphone differ from an ordinary dynamic microphone?

• What is the old term for 'capacitor microphone'?

• Why does the capacitor microphone have a more accurate sound than a dynamic microphone?

• Why does a capacitor microphone need to be powered (two reasons)?

• What precaution should you take when switching on phantom power?

• Can dynamic microphones of professional quality be used with phantom power switched on?

• What is a pad?

• Why does an electret microphone need to be powered?

• Describe the actual polar response of a typical nominally omnidirectional microphone.

• Describe the proximity effect.

• What is an 'acoustic labyrinth', as applied to microphones?

• Why does a boundary effect microphone give a clear sound?

• Why are large-diaphragm microphones used for popular music vocals?

• Describe the differences between wind shields and pop shields.

Chapter 2: The Use of Microphones

Use of Microphones for Speech

In sound engineering, as opposed to communications which will not be considered here, there are commonly considered to be three classes of sound: speech or dialogue, music and effects. Each has its own considerations and requirements regarding the use of microphones.

There are a number of scenarios where speech may be recorded, broadcast or amplified:

• Audio book
• Radio presentation, interview or discussion
• Television presentation, interview or discussion
• News reporting
• Sports commentary
• Film and television drama
• Theatre
• Conference

In some of these, the requirement is for speech that is as natural as possible. In an ideal world perhaps it should even sound as though a real person were in the same room. The audio book is in this category, as are many radio programs. There is a qualification however on the term ‘natural’. Sometimes what we regard as a natural sound is the sound that we expect to hear via a loudspeaker, not the real acoustic sound of the human voice. We have all been conditioned to expect a certain quality of sound from our stereos, hi-fis, radio and television receivers, and when we get it, it sounds natural, even if it isn’t in objective terms. In the recording and most types of broadcasting of speech there are some definite requirements:

• No pops on ‘P’ or ‘B’ sounds
• No breath noise or ‘blasting’
• Little room ambience or reverberation
• A pleasing tone of voice

Popping and blasting can be prevented in two ways. One is to position the microphone so that it points at the mouth, but is out of the direct line of fire of the breath. We see microphones used actually in the line of fire of the breath so often that it seems as though it is simply the ‘correct’ way to use a microphone. It can be for public address, but it isn’t for broadcasting or recording. The other way is to use a pop shield. Ideally this is an open mesh stocking-type material stretched over a metal or plastic hoop. This can be positioned between the mouth and the microphone and is surprisingly effective in absorbing potential pops and blasts. Sometimes a foam windshield of the type that slips over the end of the microphone is used for this purpose. A windshield is really what it says, and is not 100% effective for pops, although its visual unobtrusiveness has value, for example, for a radio discussion where hoop-type pop shields would mar face-to-face visual communication among the participants.

The requirement for little room ambience or reverberation is handled by placing the microphone quite close to the mouth – around 30 to 40 cm. If the studio is acoustically treated, this will work fine. Special acoustic tables are also available which absorb rather than reflect sound from their surface.

‘A pleasing tone of voice’? Well, first choose your voice talent. Second, it is a fact that some microphones flatter the voice. Some work particularly well for speech, and there are some classic models such as the Electro-Voice RE20 that are commonly seen in this application. Generally, one would be looking for a large-diaphragm capacitor microphone, or a quality dynamic microphone for natural or pleasing speech for audio books or radio broadcasting.

In television broadcasting, one essential requirement is that the microphone should be out of shot or unobtrusive. The usual combination for a news anchor, for example, is to have a miniature microphone attached to the clothing in the chest area, backed up by a conventional mic on a desk stand. Often the conventional mic is held on stand-by to be brought on quickly if the miniature mic fails, as they are prone to through constant handling. Oddly enough, the use of microphones on television varies according to geography. In France for example, it is quite common for a television presenter to hand hold a microphone very close to the mouth. Even a discussion can take place with three or four people each holding a microphone. The resultant sound quality is in accordance with French subjective requirements. Radio microphones are commonly used in television to give freedom of movement and also freedom from cables on the floor, leaving plenty of free space for the cameras to roll around smoothly.

News Reporting

For news reporting, a robust microphone – perhaps a short shotgun – can be used with a general-purpose foam windshield for both the reporter and interviewee, should there be one. Such a microphone is easily pointable (the reporter isn’t a sound engineer) and brings home good results without any trouble. The sound quality of a news report may not be all that could be imagined, but a little bit of harshness or degradation sometimes, oddly, makes the report more ‘authentic’.

Sports Commentary

Sports commentary is a very particular requirement. This often takes place in a noisy environment so the microphone must be adapted to cope with this. The result is a mic that has a heavily compromised sound quality, but this has come to be accepted as the sound of sports commentary so it is now a requirement. The Coles 4104 is an example of a 1950s design that is still widely used. It is a noise-cancelling microphone that almost completely suppresses background noise, and the positioning bar on the top of the mic ensures that the commentator always holds it in the correct position (as indeed it is always held, since sports commentators often like to move around in their commentary box as they work).

Film and Television Drama

For film and television drama, a fishpole (or boom as it is sometimes known) topped by a shotgun or rifle mic with a cylindrical windshield is the norm. The operator can position and angle the mic to get the best quality dialogue (while monitoring on headphones), while keeping the mic – and the shadow of the mic – out of shot. Miniature microphones are also used in this context, often with radio transmitters. Obviously they must not be visible at all. However, concealing the mic in the costume can affect sound quality so care must be taken.

Sometimes in the studio a microphone might be mounted on a large floor-mounted boom that can extend over several meters (we’re not in fishing country anymore). In this case the boom operator has winches to point and angle the microphone.

Theatre

In theatre the choice is between personal miniature microphones with radio transmitters, or area miking from the front and sides of the stage. Personal microphones allow a higher sound level before feedback since they are close to the actor’s mouth. For straight drama, it isn’t necessary to have a high sound level in the auditorium. In fact in most theatres it is perfectly acceptable for the sound of the actors’ voices to be completely unamplified. However if amplification, or reinforcement, is to be used then area miking is usually sufficient. Shotgun or rifle mics are positioned at the front of the stage (an area sometimes known for traditional reasons as ‘the floats’, therefore the mics are sometimes called ‘float mics’) to create sensitive spots on stage from which the actors can easily be heard. The drawback is that there will be positions on the stage from which the actors cannot be heard. The movements of the actors have to be planned to take account of this.

Conference

I use this term loosely to cover everything from company boardrooms to political party conferences. You will see that there can be a vast difference in scale. In the boardroom it has become common to use gooseneck microphones or boundary effect microphones that are specifically designed for that purpose. This lies beyond what we normally consider to be sound engineering and is categorized in the specialist field of sound installation. The party conference is another matter. To achieve reasonably high sound levels the microphone has to be close to the mouth, yet the candidate – for obvious reasons – does not want to look like a microphone-swallowing rock star. Therefore the microphone has to be unobtrusive so that it can be placed fairly close to the mouth without drawing undue attention to itself (the cluster of broadcasters’ microphones in front of the lectern is another matter, but they don’t have to be so close). The AKG C747 is very suitable for this application.

You will have noticed that in this context microphones are often used in pairs. There are two schools of thought on this issue. One is that the microphones should point inwards from the front corners of the lectern. This allows the speaker to turn his or her head and still receive adequate pickup. Unfortunately, as the head moves, both microphones can pick up the sound while the sound source – the mouth – is moving towards one mic and away from the other. The Doppler effect comes into play and two slightly pitch-shifted signals are momentarily mixed together. It sounds neither pleasant nor natural. The alternative approach is to mount both microphones centrally and use one as a backup. The speaker will learn, through not hearing their voice coming back through the PA system, that they can only turn so far before useful pickup is lost.
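The size of the pitch shift can be estimated with the standard Doppler formula for a moving source and a stationary receiver: received frequency = source frequency x c / (c - v), where v is the speed of the source towards the receiver. The sketch below is only illustrative; the head speed of 0.5 m/s, the 1 kHz test tone and the speed of sound of 343 m/s are all assumed values.

```python
SPEED_OF_SOUND = 343.0  # meters per second, assumed value

def doppler_shifted_hz(source_hz, speed_towards_mic):
    """Frequency received by a stationary mic from a moving source.

    speed_towards_mic is positive when the source approaches the mic and
    negative when it moves away.
    """
    return source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND - speed_towards_mic)

TONE_HZ = 1000.0   # illustrative component of the voice
HEAD_SPEED = 0.5   # assumed speed of the mouth during a head turn, m/s

towards = doppler_shifted_hz(TONE_HZ, HEAD_SPEED)    # mic being approached
away = doppler_shifted_hz(TONE_HZ, -HEAD_SPEED)      # mic being left behind

print(f"mic approached: {towards:.2f} Hz")
print(f"mic receding:   {away:.2f} Hz")
print(f"difference:     {towards - away:.2f} Hz")
```

Under these assumptions the two mics receive the same component a few hertz apart, so the mixed signal carries two slightly detuned copies of the voice for as long as the head is moving.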

It is worth saying that in this situation, the person speaking must be able to hear their amplified voice at the right level. If their voice seems too loud, to them, they will instinctively back away from the mic. If they can’t hear their amplified voice they will assume the system isn’t working. I once saw the chairman of a large and prestigious organisation stand away from his microphone because he thought it wasn’t working. It had been, and at the right level for the audience. But unfortunately, apart from the front few rows, they were unable to hear a single unamplified word he said.

Use of Microphones for Music

The way in which microphones are used for music varies much more according to the instrument than it possibly could for speech where the source of sound is of course always the human mouth. First, some scenarios:

• Recording
• Broadcast
• Public address
• Recording studio
• Location recording
• Concert hall
• Amplified music venue
• Theatre

The requirements of recording and broadcasting are very similar, exceptthat broadcasting often works to a more stringent timescale, and intelevision broadcasting microphones must be invisible or at leastunobtrusive. There are two golden rules:

Point the microphone at the sound source from the direction of the bestnatural listening position.

The microphone will always be closer than a natural comfortablelistening distance.

So, wherever you would normally choose to listen from is the rightposition for the microphone, except that the microphone has to be closerbecause it can’t discriminate direct sound from reflected sound in theway the human ear/brain can. It is always a good starting point to followthese two rules, but of course it may not always be possible, practical, ora natural sound may not be wanted for whatever reason. Broadcasters, bythe way, tend to place the microphone closer than recording engineers.They need to get a quick, reliable result, and a close mic position issimply safer for this purpose. Ultimate sound quality is not of suchimportance.

The recording studio is a very comfortable environment for microphones. The engineer is able to use any microphone he or she desires and has available. The mic may be old, large and ugly, cumbersome to use, perhaps with an external power supply (not phantom) and pattern selector, prone to faults etc., but if it gets the right sound, then it will be used. Location recording is not quite so comfortable and you need to be sure that the microphones are reliable and easy to use, preferably without external power supplies and with a simple stand mount rather than a complicated elastic suspension.

As far as comfort goes, the concert hall is a reasonably good place to record in as at least they are used to the requirements of music (the owners of many good recording venues often have higher priorities – religious worship being a prominent example). There are however restrictions on the placement of microphones during a concert. Usually it is against fire regulations to have microphones among the audience, unless the mics are positioned in such a way that they don’t impede egress and cables are very securely fixed. Generally therefore there will be a stereo pair of mics slung from the ceiling, supplemented by a number of mics on stage, which are closer than the engineer would probably prefer them to be under ideal circumstances.

For amplified music, the problem is always in getting sufficient level without feedback. This necessitates that microphones are very much closer than the natural listening position, to the point that natural direction has very little meaning. The ultimate example would be a microphone clipped to the bridge or sound hole of a violin. It wouldn’t even be possible to listen from this position. In rock music PA, microphones are used as close to the singer’s lips as possible, right against the grille cloth of a guitarist’s speaker cabinet and within millimetres of the heads of the drums. Primarily this is to achieve level without risk of feedback. However this has also come to be understood as the ‘rock music sound’ because it is what the audience expects. In this context, the most distant mics would be the drum overhead mics, which don’t need much gain anyway. For string and wind instruments there are a variety of clip-on mics available. There are also contact mics that pick up vibrations directly from the body of the instrument, although even these are not entirely immune to feedback.

In theatre musicals, the best option for the lead performers is to use miniature microphones with radio transmitters. The placement of the mic is significant. The original ‘lavalier’ placement, named for Mme Lavalier who reportedly wore a large ruby hanging from her neck, has long gone. The chest position is great for newsreaders but it suffers from the shadow of the chin and boominess caused by chest resonance. The best place for a miniature microphone is on a short boom extending from behind the ear. Mics and booms are available in a variety of flesh colours so they are not visible to the audience beyond the second or third row. If a boom is not considered acceptable, then the mic may protrude a short distance from above the ear, or descend from the hairline. This actually captures a very good vocal sound. It has to be tried to be believed. One of the biggest problems with miniature microphones in the theatre is that they become ‘sweated out’ after a number of performances and have to be replaced. Still, no-one said that it was easy going on stage. For the orchestra in a theatre musical, clip-on mics are good for string instruments. Wind instruments are generally loud enough for conventional stand mics, closely placed. So-called ‘booth singers’ can use conventional mics.

Stereo Microphone Techniques

Firstly, what is stereo? The word ‘stereophonic’ in its original meaning suggests a ‘solid’ sound image and does not specify how many microphones, channels or loudspeakers are to be used. However, it has come to mean two channels and two loudspeakers using as few or as many microphones as are necessary to get a good result. When it works, you should be able to sit in an equilateral triangle with the speakers, listen to a recording of an orchestra and pinpoint where every instrument is in the sound image. (By the way, some people complain that ‘stereophonic’, as a word, combines both Greek and Latin roots. Just as well perhaps, because if it had been exclusively Latin it would have been ‘crassophonic’!)

When recording a group of instruments or singers, it is possible to use just two or three microphones to pick up the entire ensemble in stereo, and the results can be very satisfying. There are a number of techniques:

• Coincident crossed pair
• Near-coincident crossed pair
• ORTF
• Mercury Living Presence
• Decca Tree
• Spaced omni
• MS
• Binaural

The coincident crossed pair technique traditionally uses two figure-of-eight microphones angled at 90 degrees pointing to the left and right of the sound stage (and, due to the rear pickup of the figure-of-eight mic, to the left and right of the area where the audience would be also). More practically, two cardioid microphones can be used. They would be angled at 120 degrees were it not for the drop off in high frequency response at this angle in most mics. A 110-degree angle of separation is a reasonable compromise. This system was originally proposed in the 1930s and mathematically inclined audio engineers will claim that this gives perfect reproduction of the original sound field from a standard pair of stereo loudspeakers. However perfect the mathematics look on paper, the results do not bear out the theory. The sound can be good, and you can with effort tell where the instruments are supposed to be in the sound image. The problem is that you just don’t feel like you are in the concert hall, or wherever the recording was made. The fact that human beings do not have coincident ears might have something to do with it.

Coincident crossed pair

Separating the mics by around 10 cm tears the theory into shreds, but it sounds a whole lot better.

Near-coincident crossed pair

The ORTF system, named for the Office de Radiodiffusion-Télévision Française, uses two cardioid microphones spaced at 17 cm angled outwards at 110 degrees, and is simply an extended near-coincident crossed pair.

The redeeming feature of the coincident crossed pair is that you can mix the left and right signals into mono and it still sounds fine. Mono, but fine. We call this mono compatibility and it is important in many situations – the majority of radio and television listeners still only have one speaker. The further apart the microphones are spaced, the worse the mono compatibility, although near-coincident and ORTF systems are still usable.

ORTF

Mercury Living Presence was one of the early stereo techniques of the 1950s, used for classical music recordings on the Mercury label. If you imagine trying to figure out how to make a stereo recording when there was no-one around to tell you how to do it, you might work out that one microphone pointing left, another pointing center and a third pointing right might be the way to do it. Record each to its own track on 35 mm magnetic film, as used in cinema audio, and there you have it! Nominally omnidirectional microphones were used, but of course the early omni mics did become directional at higher frequencies. Later recordings were made to two-track stereo. These recordings stand up remarkably well today. They may have a little noise and distortion, but the sound is wonderfully clear and alive.

The same can be said of the Decca tree, used by the Decca record company. This is not dissimilar from the Mercury Living Presence system but baffles were used between the microphones in some instances to create separation, and additional microphones might be used where necessary, positioned towards the sides of the orchestra.

Decca tree

Another obvious means of deploying microphones in the early days of stereo was to place three microphones spaced apart at the front of the orchestra, much more distant from each other than in the above systems. If only two microphones are used spaced apart by perhaps as much as two meters or more, what happens on playback is that the sound seems to cluster around the loudspeakers and there is a hole in the middle of the sound image. To prevent this, a centre microphone can be mixed in at a lower level so that the ‘hole’ is filled. There is no theory on earth to explain why this works – being so dissimilar to the human hearing system – but it can work very well. The main drawback is that a recording made in such a way sounds terrible when played in mono.

The MS system, as explained previously, uses a cardioid microphone to pick up an all-round mono signal, and a figure-of-eight mic to pick up the difference between left and right in the sound field. The M and S signals can be combined without too much difficulty to provide conventional left and right signals. This is of practical benefit when it is necessary to record a single performer in stereo. With a coincident crossed pair, one microphone would be pointing to the left of the performer, the other would be pointing to the right. It just seems wrong not to point a microphone directly at the performer, and with the MS system you do, getting the best possible sound quality from the mic. It is sometimes proposed as an advantage of MS that it is possible to control the width of the stereo image by adjusting the level of the S signal. This is exactly the same as adjusting the width by turning the mixing console’s panpots for the left and right signals closer to the centre. Therefore it is in reality no advantage at all.
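The combination of M and S into left and right is a simple sum and difference. The sketch below assumes the common convention L = M + S and R = M − S. Scaling conventions vary between consoles and plug-ins, so the unity gains here are an assumption, and the `width` parameter is a hypothetical name for the S level control mentioned above.

```python
def ms_to_lr(mid, side, width=1.0):
    """Decode one pair of M/S sample values to left/right.

    width scales the S (difference) signal: 0.0 collapses the image
    to mono, 1.0 leaves the recorded width unchanged.
    """
    left = mid + width * side
    right = mid - width * side
    return left, right


def decode_ms(mid_samples, side_samples, width=1.0):
    """Decode two equal-length sample sequences to (left, right) lists."""
    pairs = [ms_to_lr(m, s, width) for m, s in zip(mid_samples, side_samples)]
    left = [p[0] for p in pairs]
    right = [p[1] for p in pairs]
    return left, right


# A source slightly to the left: some S signal in phase with M.
print(ms_to_lr(1.0, 0.5))             # full recorded width
print(ms_to_lr(1.0, 0.5, width=0.0))  # S removed, both channels identical
```

With the S signal turned all the way down, both channels carry only the M signal, which is the narrowest possible image.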

Binaural stereo attempts to mimic the human hearing system with a dummy head (sometimes face, shoulders and chest too) with two omnidirectional microphones placed in artificial ears just like a real human head. It works well, but only on headphones. A binaural recording played on speakers doesn’t work because the two channels mix on their way to the listener, spoiling the effect. There have been a number of systems attempting to make binaural recordings work on loudspeakers but none has become popular.

In addition to the stereo miking system, it is common to mic up every section of an orchestra, whether it is a classical orchestra, film music, or the backing for a popular music track. Normally the stereo mic system, crossed pair or whatever, is considered the main source of signal, with the other microphones used to compensate for the distance to the rear of the orchestra, and to add just a little presence to instruments where appropriate. Sectional mics shouldn’t be used to compensate for poor balance due to the conductor or arranger. Sometimes however classical composers don’t get the balance quite right and it is not acceptable to change the orchestration. A little technical help is therefore called for.

Instruments

We come back to the two golden rules of microphone placement, as above. It is worth looking at some specific examples:

Saxophone

There are two fairly obvious ways a saxophone can be close miked. One is close to the mouthpiece, another is close to the bell. The difference in sound quality is tremendous. The same applies to all close miking. Small changes in microphone position can affect the sound quality enormously. There are many books and texts that claim to tell you how and where to position microphones for all manner of instruments, but the key is to experiment and find the best position for the instrument – and player – you have in front of you. Experience, not book learning, leads to success. Of the two saxophone close miking positions, neither will capture the natural sound of the instrument, if that’s what you want. Close mic positions almost never do. If you move the mic further away, up to around a meter, you will be able to capture the sound of the whole of the instrument, mouthpiece, bell, the metal of the instrument, and the holes that are covered and uncovered during the normal course of playing. Also as you move away you will capture more room ambience, and that is a compromise that has to be struck. Natural sound against room ambience. It’s subjective.

Piano

Specifically the grand piano – it is common to place the microphone (or microphones) pointing directly at the strings. Oddly enough no-one ever listens from this position and it doesn’t really capture a natural sound, but it might be the sound you want. The closer the microphones are to the higher strings, the brighter the sound will be. You can position the microphones all the way at the bass end of the instrument, spaced apart by maybe 30 cm, and a rich full sound will be captured. Move the microphones below the edge of the case and angle them so that they pick up reflected sound from the lid and a more natural sound will be discovered. You can even place a microphone under a grand piano to capture the vibration of the soundboard. It can even sound quite good, but listen out for noise from the foot pedals.

Drums

The conventional setup is one mic per drum, a mic for the hi-hat perhaps, and two overhead mics for the cymbals. Recording drums is an art form and experience is by far the best guide. There are some points to bear in mind:

You can’t get a good recording of a poor kit, particularly cymbals, or a kit that isn’t well set up. It is often necessary to damp the drums by taping material to the edge of the drum head to get a shorter, more controlled sound.

The mics have to be placed where the drummer won’t hit them, or the stands.

Dynamic mics generally sound better for drums, capacitor mics for cymbals.

The kick drum should have its front head removed, or there should be a large hole cut out so that a damping blanket can be placed inside. Otherwise it will sound more like a military bass drum than the dull thud that we are used to. The choice of beater – hard or soft – is important, as is the position of the kick drum mic either just outside, or some distance inside the drum.

The snares on the underside of the snare drum may rattle when other drums are being played. Careful adjustment of the tension of the snares is necessary, and perhaps even a little damping.

Microphones should be spaced as far apart from each other as possible and directed away from other drums. Every little bit helps as the combination of two mics picking up the same drum from different distances leads to cancellation of groups of frequencies. The brute force technique is to use a noise gate on every microphone channel, and this is commonly done. Noise gates will be covered later.
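The cancellation mentioned in the last point can be estimated with a little arithmetic. This is a minimal sketch that assumes the two mics pick up the drum at equal level and are summed in the same polarity, in which case the deepest cancellations fall where the extra path length is an odd number of half wavelengths. The speed of sound and the example distance are assumed values.

```python
def notch_frequencies(path_difference_m, count=3, c=343.0):
    """First `count` cancellation frequencies, in Hz, for two equal-level
    signals whose paths from the source differ by path_difference_m.

    c is the speed of sound in m/s (assumed 343 m/s).
    """
    delay_s = path_difference_m / c
    # Cancellation occurs where the delay equals an odd number of half periods.
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]


# Hypothetical case: the second mic is 0.343 m further from the drum,
# which corresponds to a delay of about one millisecond.
for f in notch_frequencies(0.343):
    print(f"{f:.0f} Hz")
```

In practice the spill in the more distant mic is lower in level, so the notches are shallower than this idealised case suggests, which is why spacing and directing the mics away from other drums helps.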

Perhaps this is a brief introduction to the use of microphones, but it’s a start. And to round off I’ll give away the secret of getting good sound from your microphones:

Listen!

Check Questions

• What problem is commonly found in live sports commentary?

• What does a fishpole operator concentrate on while working?

• In theater, what is 'area miking'?

• How is feedback avoided in live sound (the simplest technique)?

• Why must the speaker at a conference hear his or her own amplified voice at the right level?

• Write down, copy if you wish, the two golden rules for microphone positioning.

• Why do microphones have to be placed closer than a natural listening position?

• Where are personal mics worn in the theater?

• What is stereo?

• Describe the coincident crossed pair.

• What is the benefit of separating the microphones (relate this to the human hearing system)?

• What is the value of mono compatibility?

• Why is it desirable to mic up every section of an orchestra independently?

• Pick an instrument other than those mentioned in the text. Describe the effect of two alternative close miking positions.

• When you look at a grand piano, performed solo, on stage, does the pianist sit on the left or the right? Why?

• Why do drums often need to be damped?

Chapter 3: Loudspeaker Drive Units

Loudspeakers are without doubt the most inadequate component of the audio signal chain. Everything else, even the microphone, is as close to the capabilities of human hearing as makes hardly any difference at all. However, amplify the signal and convert it back into sound and you will know without any hesitation whatsoever that you are listening to a loudspeaker, not a natural sound source.

Loudspeakers can be categorized by method of operation and by function:

• Method of operation:
  • Moving coil
  • Electrostatic
  • Direct radiator
  • Horn
• Function:
  • Domestic
  • Hi-fi
  • Studio
  • PA

In this context we will use ‘PA’ to mean concert public address rather than announcement systems, which are beyond the scope of this text.

The moving coil loudspeaker, or I should say ‘drive unit’ as this is only one component of the complete system, is the original and still most widely used method of converting an electric signal to sound. The components consist of a magnet, a coil of wire (sometimes called the ‘voice coil’) positioned within the field of the magnet and a diaphragm that pushes against the air. When a signal is passed through the coil, it creates a magnetic field that interacts with the field of the permanent magnet causing motion in the coil and in turn the diaphragm. It is probably fair to say that 99.999% of the loudspeakers you will ever come across use moving coil drive units.

The electrostatic loudspeaker (and this time it is a loudspeaker rather than just a drive unit) uses electrostatic attraction rather than magnetism. The electrostatic loudspeaker has the most natural sound quality, but is not capable of high sound levels. Hence it is rarely used in professional audio outside of, occasionally, classical music recording.

A moving coil drive unit can be constructed as either a direct radiator or a horn. In a direct radiator drive unit, the diaphragm pushes directly against the air. This is not very efficient as the diaphragm and the air have differing acoustic impedance, which creates a barrier for the sound to cross. A horn makes the transition from vibration in the diaphragm to vibration in the open air more gradual, therefore it is more efficient, and for a given input power the horn will be louder.

Let's look at these in more detail:

Moving Coil Drive Unit

Perhaps the best place to start is a 200 mm drive unit intended for low and mid frequency reproduction. This isn't the biggest drive unit available, so why are larger drive units ever necessary? The answer is to achieve a higher sound level. A 200 mm drive unit only pushes against so much air. Increase the diameter to 300 mm or 375 mm and many more air molecules feel the impact. The next question would be, why are 300 mm or 375 mm drive units not used more often, when space is available? The answer to that is in the behavior of the diaphragm:

The diaphragm must not bend in operation otherwise it will produce distortion. It is sometimes said that the diaphragm should operate as a ‘rigid piston’.

The diaphragm could be flat and still produce sound. However, since the motor is at the center and vibrations are transmitted to the edges, the diaphragm needs to be stiff. The cone shape is the best compromise between stiffness and large diameter.

High frequencies will tend to bend the diaphragm more than low frequencies. It takes a certain time for movement of the coil to propagate to the edge of the diaphragm. Fairly obviously, at high frequencies there isn't so much time and at some frequency the diaphragm will start to deviate from the ideal rigid piston.

200 mm is a good compromise. It will produce enough level at low frequency for the average living room, and it will produce reasonably distortion-free sound up to around 4 kHz or so. When the diaphragm bends, it is called break up, due to the vibration ‘breaking up’ into a number of different modes. ‘Break up’, in this context, doesn't mean severe distortion or anything like that. In fact most low frequency drive units are operated well into the break up region. It is up to the designer to ensure that the distortion created doesn't sound too unpleasant. By the way, it is often thought that a larger drive unit will operate down to lower frequencies. This isn't quite the right way to look at it. Any size of drive unit will operate down to as low a frequency as you like, but you need a big drive unit to shift large quantities of air at low frequency. At high frequency, the drive unit vibrates backwards and forwards rapidly, moving air on each vibration. At low frequencies there are fewer opportunities to move air, therefore the area of the drive unit needs to be greater to achieve the desired level.

The material of the diaphragm has a significant effect on its stiffness. Early moving coil drive units used paper pulp diaphragms, which were not particularly stiff. Modern drive units use plastic diaphragms, or pulp diaphragms that have been doped to stiffen them adequately. Of course, the ultimate in stiffness would be a metal diaphragm. Unfortunately, it would be heavy and the drive unit would be less efficient. Carbon fiber diaphragms have also been used with some success. (It is worth noting that in drive units used for electric guitars, the diaphragm is designed to bend and distort. It is part of the sound of the instrument and a distortion-free sound would not meet a guitarist's requirements.)

Moving up the frequency range: as we have said, the diaphragm will bend and produce distortion. Even if it didn't, there would still be the problem that a large sound source will tend to focus sound over a narrow area, becoming narrower as the frequency increases. In fact, this is the characteristic of direct radiator loudspeakers: that their angle of coverage decreases as the frequency gets higher. This is significant in PA, where a single loudspeaker has to cover a large number of people. (It is perhaps counter-intuitive that a large sound source will focus the sound, but it is certainly so. A good acoustics text will supply the explanation.)

Because of these two factors, higher frequencies are handled by a smaller drive unit. A smaller diaphragm is more rigid at higher frequencies, and because it is smaller it spreads sound more widely. Often the diaphragm is dome shaped rather than conical. This is part of the designer's art and isn't of direct relevance to the sound engineer, as long as it sounds good.

It might be stating the obvious at this stage, but a low frequency drive unit is commonly known as a woofer, and a high frequency drive unit as a tweeter.

In loudspeakers where a low frequency drive unit greater than 200 mm is used, it will not be possible to use the woofer up to a sufficiently high frequency to hand over directly to the tweeter. Therefore a mid frequency drive unit has to be used (sometimes known as a squawker!). The function of dividing the frequency band among the various drive units is handled by a crossover, more on which later.

Damage

There are two ways in which a moving coil drive unit may be damaged. One is to drive it at too high a level for too long. The coil will get hotter and hotter and eventually will melt at one point, breaking the circuit (‘thermal damage’). The drive unit will entirely cease to function. The other is to ‘shock’ the drive unit with a loud impulse. This can happen if a microphone is dropped, or placed too close to a theatrical pyrotechnic effect. The impulse won't contain enough energy to melt the coil, but it may break apart the turns of the coil, or shift it from its central position with respect to the magnet (‘mechanical damage’). The drive unit will still function, but the coil will scrape against the magnet producing a very harsh distorted sound. Many drive units can be repaired, but of course damage is best avoided in the first place. The trick is to listen to the loudspeaker. It will tell you when it is under stress if you listen carefully enough.

One common question regarding damage to loudspeakers is this: What should the power of the amplifier be in relation to the rated power of the loudspeaker? In fact, although the power of an amplifier can be measured very accurately, the capacity of a loudspeaker to soak up this power is only an intelligent guess, at best. During the design process, the manufacturer will test drive units to destruction and arrive at a balance between a high rating (in watts) that will impress potential buyers, and a low number of complaints from people who have pushed their purchases too hard. The rating on the cabinet is therefore only a guide. To get the best performance from a loudspeaker, the amplifier should be rated higher in terms of watts. It wouldn't be unreasonable to connect a 200 W amplifier to a 100 W speaker, and it won't blow the drive units unless you push the level too high. It is up to the sound engineer to control the level. Suppose, on the other hand, that a 100 W amplifier was connected to a 200 W loudspeaker (two-way, with woofer and tweeter). The sound engineer might push the level so high that the amplifier started to clip. Clipping produces high levels of high frequency distortion. In a 200 W loudspeaker, the tweeter could be rated at as little as 20-30 W, as under normal circumstances that is all it would be expected to handle. But under clipping conditions the level supplied to the tweeter could be massively higher, and it will blow.
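To see why clipping sends extra energy to the tweeter, it helps to look at the harmonics of a clipped sine wave. The sketch below is only an illustration under simple assumptions: an idealised hard clipper (real amplifiers clip less tidily), a single cycle of a sine wave, and a plain correlation against each harmonic rather than a full spectrum analysis. The signal levels are arbitrary.

```python
import math


def harmonic_amplitude(samples, harmonic):
    """Amplitude of one harmonic of a signal spanning exactly one cycle."""
    n = len(samples)
    re = sum(x * math.cos(2 * math.pi * harmonic * i / n)
             for i, x in enumerate(samples))
    im = sum(x * math.sin(2 * math.pi * harmonic * i / n)
             for i, x in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def hard_clip(samples, limit):
    """Idealised clipping: nothing may exceed +/- limit."""
    return [max(-limit, min(limit, x)) for x in samples]


N = 1000
# The signal the engineer is asking for: a sine of peak level 2.0 ...
wanted = [2.0 * math.sin(2 * math.pi * i / N) for i in range(N)]
# ... from an amplifier that can only deliver a peak level of 1.0.
clipped = hard_clip(wanted, 1.0)

for k in (1, 3, 5, 7):
    print(f"harmonic {k}: wanted {harmonic_amplitude(wanted, k):.3f}, "
          f"clipped {harmonic_amplitude(clipped, k):.3f}")
```

The unclipped sine has energy only at the fundamental. The clipped version has new components at odd multiples of the fundamental, so a clipped bass note generates content well up into the range the crossover sends to the tweeter.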

Impedance

Drive units and complete loudspeaker systems are also rated in terms of their impedance. This is the load presented to the amplifier, where a low impedance means the amplifier will have to deliver more current, and hence ‘work harder’. A common nominal impedance is 8 ohms. ‘Nominal’ means that this is averaged over the frequency range of the drive unit or loudspeaker, and you will find that the actual impedance departs significantly from nominal according to frequency. Normally this isn't particularly significant, except in two situations:

At some frequency the impedance drops well below the nominal impedance. The power amplifier will be called upon to deliver perhaps more power than it is capable of, causing clipping, or perhaps the amplifier might even go into protection mode to avoid damage to itself.

The output impedance of a power amplifier is very low – just a small fraction of an ohm. You could think of the output impedance of the amplifier in series with the impedance of the loudspeaker as a potential divider. Work out the potential divider equation with R1 equal to zero and you will see that the output voltage is equal to the input voltage. However, give R1 some significant impedance, as would happen with a long run of loudspeaker cable, and you will see a voltage loss. Make R2 – the loudspeaker impedance – variable with frequency and you will now see a rather less than flat frequency response.
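The potential divider in the second point can be worked through numerically. This is a minimal sketch that treats both impedances as pure resistances, which is a simplification since real loudspeaker impedance is reactive. The cable resistance and the three loudspeaker impedances are hypothetical values chosen only to show the trend.

```python
import math


def divider_ratio(r1_ohms, r2_ohms):
    """Vout / Vin for a potential divider.

    r1_ohms: series resistance (amplifier output plus cable).
    r2_ohms: loudspeaker impedance, across which Vout is taken.
    """
    return r2_ohms / (r1_ohms + r2_ohms)


def loss_db(r1_ohms, r2_ohms):
    """The same ratio expressed as a level change in decibels."""
    return 20.0 * math.log10(divider_ratio(r1_ohms, r2_ohms))


print(divider_ratio(0.0, 8.0))  # R1 = 0: output voltage equals input voltage

cable_ohms = 0.5  # hypothetical long cable run
for speaker_ohms in (4.0, 8.0, 16.0):  # impedance at three different frequencies
    print(f"{speaker_ohms:4.0f} ohm: {loss_db(cable_ohms, speaker_ohms):.2f} dB")
```

The loss is greatest where the loudspeaker impedance is lowest, so a loudspeaker whose impedance swings with frequency ends up with a frequency response that follows its own impedance curve.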

To be honest, the above points are not always at the forefront of the working sound engineer's mind, but they are significant and worth knowing about.

Check Questions

• What is the difference between the terms 'loudspeaker' and 'drive unit'?

• How does a moving coil drive unit work?

• Comment on the two qualities of an electrostatic loudspeaker.

• What is a direct radiator drive unit?

• What is the function of a horn?

• Why are drive units larger than 200 mm sometimes used?

• What is meant by the phrase 'rigid piston'?

• Why is the diaphragm of a moving coil loudspeaker normally cone shaped?

• Why does the diaphragm bend more at higher frequencies?

• What is 'break up'?

• Does breakup occur in a woofer in normal operation?

• Why should a guitar drive unit distort intentionally?

• Comment on the 'beaming' effect of a large drive unit.

• When is a separate midrange drive unit necessary?

• Comment on the two damage modes of moving coil drive units.

• If a loudspeaker is rated at 100 W, what should be the power of the amplifier, according to the text?

Chapter 4: Loudspeaker Systems

Cabinet (Enclosure)

The moving coil drive unit is as open to the air at the rear as it is to the front, hence it emits sound forwards and backwards. The backward-radiated sound causes a problem. Sound diffracts readily, particularly at low frequencies, and much of the energy will 'bend' around to the front. Since the movement of the diaphragm to the rear is in the opposite direction to the movement to the front, this leaked sound is inverted (or we can say 180 degrees out of phase) and the combination of the two will tend to cancel each other out. This occurs at frequencies where the wavelength is larger than the diameter of the drive unit. For a 200 mm drive unit the frequency at which cancellation would start to become significant is 1700 Hz, the cancellation getting worse at lower frequencies.

The simple solution to this is to mount the drive unit on a baffle. A baffle is simply a flat sheet of wood with a hole cut out for the drive unit. Amazingly, it works. But to work well down to sufficiently low frequencies it has to be extremely large. The wavelength at 50 Hz, for example, is almost 7 meters. The baffle can be folded around the drive unit to create an open back cabinet, which you will still find in use for electric guitar loudspeakers. The drawback is that the partially enclosed space creates a resonance that colors the sound.
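The two figures quoted above, 1700 Hz for a 200 mm drive unit and almost 7 meters at 50 Hz, both come from the relationship between frequency and wavelength. The sketch below assumes a rounded speed of sound of 340 m/s, which is the value that reproduces the 1700 Hz figure; the text itself does not state which value was used.

```python
SPEED_OF_SOUND = 340.0  # m/s, rounded (assumed)


def wavelength_m(frequency_hz, c=SPEED_OF_SOUND):
    """Wavelength in meters of a sound of the given frequency."""
    return c / frequency_hz


def frequency_hz(wavelength, c=SPEED_OF_SOUND):
    """Frequency in Hz whose wavelength is the given length in meters."""
    return c / wavelength


# Frequency at which the wavelength equals the diameter of a 200 mm drive unit.
print(f"{frequency_hz(0.2):.0f} Hz")

# Wavelength at 50 Hz, which sets the scale of a baffle that works that low.
print(f"{wavelength_m(50.0):.1f} m")
```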

The logical extension of the baffle and open back cabinet is to enclose the rear of the drive unit completely, creating an infinite baffle. It would now seem that the rear radiation is completely controlled. However, there are problems:

The diaphragm now has to push against the air 'spring' that is trapped inside the cabinet. This presents significant opposition to the motion of the diaphragm.

Sound will leak through the cabinet walls anyway.

The cabinet will itself vibrate and is highly unlikely to operate anything like a rigid piston or have a flat frequency response. (Of course, this happens with the open back cabinet too.)

At this point it is worth saying that the bare drive unit is often used in theater sound systems where there is a need for extreme clarity in the human vocal range. Low frequencies can be bolstered with conventional cabinet loudspeakers.

Despite these problems, careful design of the drive unit to balance the springiness of the trapped air inside the cabinet against the springiness of the suspension can work wonders. The infinite baffle, properly designed, is widely regarded as the most natural sounding type of loudspeaker (electrostatics excepted). The only real problem is that the compromises that have to be made to make this design work result in poor low frequency response.

Points of order:

'Springiness' is more properly known as compliance.

Another term for 'infinite baffle' is acoustic suspension.

You would need a very deep understanding of loudspeakers (starting with the Thiele-Small parameters of drive units) to be able to design a loudspeaker that would work well for studio or PA use. Electric guitar loudspeakers are not so critical.
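To give a flavor of how the Thiele-Small parameters express the balance between the compliance of the trapped air and the compliance of the suspension, here is a minimal sketch of the standard sealed-box relations: both the system resonance and the system Q rise by the factor sqrt(1 + Vas/Vb). The driver values used below are hypothetical and are not taken from this text.

```python
import math

def closed_box(fs_hz, qts, vas_liters, box_liters):
    """Return (system resonance in Hz, system Q) for a sealed cabinet.

    fs_hz      - free-air resonance of the drive unit
    qts        - total Q of the drive unit in free air
    vas_liters - volume of air with the same compliance as the suspension
    box_liters - internal volume of the cabinet
    """
    stiffening = math.sqrt(1.0 + vas_liters / box_liters)
    return fs_hz * stiffening, qts * stiffening

# Hypothetical drive unit: fs = 30 Hz, Qts = 0.4, Vas = 60 liters, in a 20 liter box
fc, qtc = closed_box(30.0, 0.4, 60.0, 20.0)
print(fc, qtc)
```

The smaller the box, the stiffer the air spring, and the higher both figures become. This is one way of seeing why a compact infinite baffle gives up low frequency response.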

The next step in cabinet design is the bass reflex enclosure. You will occasionally hear of this as a ported or vented cabinet.

The bass reflex cabinet borrows the theory of the Helmholtz resonator. A Helmholtz resonator is nothing more than an enclosed volume of air connected to the outside world by a narrow tube, called the port. The port can stick out of the enclosure as in a beer bottle - a perfect example of the principle - or inwards. The small plug of air in the port bounces against the compliance of the larger volume of air inside and resonates readily. Try blowing across the top of the beer bottle (when empty) and you will see.

The Helmholtz resonator can be designed via a relatively simple formula to have any resonant frequency you choose. In the case of the bass reflex enclosure, the resonant frequency is set just at the point where an equivalently sized infinite baffle would be losing low end response. Thus, the resonance of the enclosure can assist the drive unit just at the point where its output is weakening, so extending the low frequency response usefully.
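The 'relatively simple formula' is not given in the text. The usual textbook form is f = (c / 2π) × sqrt(A / (V × L)), where A is the cross-sectional area of the port, V the enclosed volume and L the effective length of the port (the physical length plus a small end correction). A minimal sketch under that assumption; the bottle dimensions are hypothetical:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed room-temperature value)

def helmholtz_hz(port_area_m2, volume_m3, effective_length_m):
    """Resonant frequency of a Helmholtz resonator.

    effective_length_m should already include any end correction.
    """
    return (SPEED_OF_SOUND / (2.0 * math.pi)) * math.sqrt(
        port_area_m2 / (volume_m3 * effective_length_m)
    )

# Hypothetical bottle: 10 mm neck radius, 70 mm effective neck length, 0.75 liter body
neck_area = math.pi * 0.010 ** 2
print(round(helmholtz_hz(neck_area, 0.00075, 0.07), 1))
```

The formula shows which way each dimension pulls the tuning: a larger enclosure or a longer port lowers the resonant frequency, and a wider port raises it.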

There is of course a cost to this. Whereas an infinite baffle loudspeaker can be designed with a low-Q resonance, meaning essentially that when the input ceases the diaphragm returns straight away to its rest position, in a bass reflex loudspeaker the drive unit will overshoot the rest position and then return. Depending on the quality of the design, it may do this more than once, creating an audible resonance. This can result in so-called 'boomy' bass, which is generally undesirable. Additionally, a loudspeaker with boomy bass will tend to translate any low frequency energy into output at the resonant frequency. Thus a carefully tuned and recorded kick drum will come out as a boom at the loudspeaker's resonant frequency. The competent loudspeaker designer is in control of this and a degree of boominess will be balanced against a subjectively 'good' - if not accurate - bass response.
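The link between Q and overshoot can be illustrated with a simple second-order resonance. This is a deliberate simplification: a real bass reflex system is more complex than a single resonance. In such a model there is no overshoot at all when Q is 0.5 or less. Above that, each successive swing is smaller than the last by a fixed ratio that approaches 1 as Q rises. A minimal sketch, with an illustrative function name:

```python
import math

def ring_ratio_per_cycle(q):
    """Ratio of successive same-direction peaks after the input stops.

    Returns 0.0 when the resonance is too well damped to oscillate (Q <= 0.5).
    """
    if q <= 0.5:
        return 0.0
    return math.exp(-2.0 * math.pi / math.sqrt(4.0 * q * q - 1.0))

for q in (0.5, 0.7, 1.0, 2.0, 5.0):
    print(q, round(ring_ratio_per_cycle(q), 3))
```

A low-Q design dies away almost at once, whereas at higher Q each cycle retains a substantial fraction of the previous one, which is heard as the 'boom'.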

There are other cabinet designs, notably the transmission line, but these are not generally within the scope of the professional sound engineer, so they will be excluded from this text.

Horns

We have covered horns to some degree already. There is a whole theory to horns that deserves consideration, but here we will simply list some of the basics:

Whereas a direct radiator drive unit may be only 1% efficient (i.e. 100 W of electrical power converts to just 1 W of sound power), a horn drive unit may be up to 5% efficient.

The air in the throat of the horn becomes so compressed at high levels that significant distortion is produced. However, some people - including the writer of this text! - can on occasion find the distortion quite pleasant.

To make any significant difference to the efficiency of a loudspeaker at low frequencies, the length and area of the horn have to be very large. However, folded horn cabinets can be constructed that make enough of a difference to be worthwhile. These are sometimes known as 'bass bins'.
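The efficiency figures in the first point above can be put into decibels. A minimal sketch using the 1% and 5% figures from the text and the standard 10 × log10 rule for power ratios; the function names are illustrative only:

```python
import math

def acoustic_watts(electrical_watts, efficiency_percent):
    """Sound power produced for a given electrical input."""
    return electrical_watts * efficiency_percent / 100.0

def level_gain_db(efficiency_a_percent, efficiency_b_percent):
    """Level difference in dB between two drive units fed the same power."""
    return 10.0 * math.log10(efficiency_b_percent / efficiency_a_percent)

print(acoustic_watts(100.0, 1.0))          # direct radiator
print(acoustic_watts(100.0, 5.0))          # horn
print(round(level_gain_db(1.0, 5.0), 1))   # advantage of the horn in dB
```

A fivefold gain in efficiency is worth about 7 dB. A direct radiator would need five times the amplifier power to match it.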

The most important application of the horn is in high quality PA systems such as those used for theater musicals. The problem in theater musicals is that the sound has to be intelligible otherwise the story won't be understood by the audience (many of whom in a London West End theater would be European tourists who wouldn't have English as their first language). Also, the whole of the auditorium has to be covered with high quality sound.

If direct radiator loudspeakers were used in the theater, then people who were on-axis would receive good quality sound. Those members of the audience who were further from the 'straight ahead' position would receive lower levels at high frequency and therefore a duller sound. The solution is the constant directivity horn. (More information on directivity...). The shape of the curvature of the horn can be one of any number of mathematical functions, or even just an arbitrary shape. With careful calculation and design it is possible to produce a constant directivity horn which has an even frequency response over an angle of up to 60 degrees. This means that one loudspeaker can cover a sizable section of the audience, all with pretty much the same quality of sound. This leads to the concept of the center cluster loudspeaker system that is widely used wherever intelligibility is a prime requirement in a PA system. A number of constant directivity horn loudspeakers are arrayed so that where the coverage of one is just starting to fall off, the adjacent loudspeaker takes over. Next time you are in a theater, or large place of worship, with a quality sound system, take a look at the loudspeakers. Apart from any loudspeakers that are dedicated to bass, where directionality isn't significant, there should be one cabinet pointing almost directly at you, plus or minus 30 degrees or so, and there should be no other loudspeaker pointing at you from any other location in the building, other than for special theatrical effects. There will be more on this when we cover PA systems specifically.

Crossover

The function of the crossover is to separate low, mid and high frequencies according to the number of drive units in the loudspeaker. A crossover can be passive or active. A passive crossover is generally internal to the cabinet and consists of a network of capacitors, inductors and resistors. Having no active components, it doesn't need to be powered. An active crossover on the other hand does contain transistors or ICs and requires mains power. It sits between the output of the mixing console and a number of power amplifiers - one for each division of the frequency band. A system with a three-band active crossover would require three power amplifiers.

Crossovers have two principal parameter sets: the cutoff frequencies of the bands, and the slopes of the filters. It is impractical, and actually undesirable, to have a filter that allows frequencies up to, say, 4 kHz to pass and then cut off everything above that completely. So frequencies beyond the cutoff frequency (where the response has dropped by 3 dB from normal) are rolled off at a rate of 6, 12, 18 or 24 dB per octave. In other words, in the band of frequencies where the slope has kicked in, as the frequency doubles the response drops by that number of decibels. The slopes mentioned are actually the easy ones to design. A filter with a slope of, say, 9 dB per octave would be much more complex.
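The 'per octave' idea can be checked with a few lines of arithmetic. This minimal sketch uses the 4 kHz cutoff from the example above and treats the roll-off as a straight line starting at the cutoff frequency. That is an idealization: it ignores the 3 dB droop and the gentle knee of a real filter around cutoff.

```python
import math

def rolloff_db(frequency_hz, cutoff_hz, slope_db_per_octave):
    """Idealized attenuation of a low-pass filter above its cutoff frequency."""
    if frequency_hz <= cutoff_hz:
        return 0.0
    octaves_above_cutoff = math.log2(frequency_hz / cutoff_hz)
    return slope_db_per_octave * octaves_above_cutoff

CUTOFF_HZ = 4000.0
for slope in (6, 12, 18, 24):
    one_octave = rolloff_db(8000.0, CUTOFF_HZ, slope)
    two_octaves = rolloff_db(16000.0, CUTOFF_HZ, slope)
    print(slope, "dB/octave:", one_octave, "dB at 8 kHz,", two_octaves, "dB at 16 kHz")
```

At 6 dB per octave a signal two octaves beyond the cutoff is still only 12 dB down, whereas at 24 dB per octave it is 48 dB down.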

As it happens, a slope of 6 dB per octave is useless. High frequencies would be sent to the woofer at sufficient level that there would be audible distortion due to break up. Low frequencies would be sent to the tweeter at a level that could damage it. 12 dB/octave is workable, but most systems these days use 18 dB/octave or 24 dB/octave. There are issues with the phase response of crossover filters that vary according to slope, but this is an advanced topic that few working sound engineers would contemplate to any great extent.

Passive crossovers have a number of advantages:

• Inexpensive

• Convenient

• Usually matched by the loudspeaker manufacturer to the requirements of the drive units

And the disadvantages:

• Not practical to produce a 24 dB/octave slope

• Can waste power

• Not always accurate, and component values can change over time

Likewise, active crossovers have advantages:

• Accurate

• Cutoff frequency and slope can be varied

• Power amplifier connects directly to drive unit - no wastage of power and better control over diaphragm motion

• Limiters can be built into each band to help avoid blowing drive units

And the disadvantages:

• Expensive

• It is possible to connect the crossover incorrectly and send LF to the HF driver and vice versa.

• A third-party unit would not compensate for any deficiencies in the drive units.

Some loudspeaker systems come as a package with a dedicated loudspeaker control unit. The control unit consists of three components:

• Crossover

• Equalizer to correct the response of each drive unit

• Sensing of voltage (and sometimes current) to ensure that each drive unit is maximally protected

Use of Loudspeakers

As mentioned earlier, there are four main usage areas of loudspeakers: domestic, hi-fi, studio and PA. We will skip non-critical domestic usage and move directly on to hi-fi. The hi-fi market is significant in that this is where we will find the very best sounding loudspeakers. The living room environment is generally fairly small, and listening levels are generally well below what we call 'rock and roll'. This means that the loudspeaker can be optimized for sound quality, and the best examples can be very satisfying to listen to with few objectionable features, although it still has to be said that moving coil loudspeakers always sound like loudspeakers and never exactly like the original sound source.

Recording studio main monitors have to be capable of higher sound levels. For one thing, the producer, engineer and musicians might just like to monitor at high level, although for the sake of their hearing they should not do this too often. Another consideration is that the acoustically treated control room will absorb a lot of the loudspeaker's energy, so that any given loudspeaker would seem quieter than it would in a typical living room. It is generally true that a loudspeaker that is optimized for high levels won't be as accurate as one that has been optimized for sound quality. PA speakers are the ultimate example of this. There has been a trend over the last couple of decades for PA speakers to be smaller and hence more cost effective to set up. This has resulted in an intense design effort to make smaller loudspeakers louder. Obviously the quality suffers. If you put an expensive PA loudspeaker next to a decent hi-fi loudspeaker in a head-to-head comparison at a moderate listening level, the hi-fi loudspeaker will win easily.

The most fascinating use of loudspeakers is the near field monitor. Near field monitors are now almost universally used in the recording studio for general monitoring purposes and for mixing. This would seem odd because twenty-five years ago anyone in the recording industry would have said that studio monitors have to be as good as possible so that the engineer can hear the mix better than anyone else ever will. That way, all the detail in the sound can be assessed properly and any faults or deficiencies picked up. Mixes were also assessed on tiny Auratone loudspeakers just to make sure they would sound good on cheap domestic systems, radios or portables.

That was until the arrival of the Yamaha NS10 - a small domestic loudspeaker with a dreadful sound. It must have found its way into the studio as a cheap domestic reference. A slightly upmarket Auratone if you like. However, someone must have used it as a primary reference for a mix, and found that by some magical and indefinable means, the NS10 made it easier to get a great mix - and not only that but a mix that would 'travel well' and sound good on any system. The NS10 and later NS10M are now no longer in production, but every manufacturer has a nearfield monitor in their range. Some actually now sound very good, although their bass response is lacking due to their small size. The success of nearfield monitoring is something of a mystery. It shouldn't work, but the fact is that it does. And since so little is quantifiable, the best recommendation for a nearfield monitor is that it has been used by many engineers to mix lots of big-selling records. That would be the Yamaha NS10 then!

Check Questions

• What problem is caused by sound coming from the rear of the drive unit?

• What is a baffle?

• How large does a baffle have to be to work well at low frequencies?

• What is an 'open back' cabinet?

• What is an 'infinite baffle' cabinet?

• What problem in an infinite baffle cabinet is caused by the trapped air inside?

• What is 'compliance'?

• What is a 'bass reflex' enclosure?

• What is the advantage of a bass reflex loudspeaker compared to an infinite baffle?

• What is the disadvantage of a bass reflex loudspeaker compared to an infinite baffle?

• Briefly describe a horn drive unit in comparison with a direct radiator drive unit.

• What is the advantage of the horn regarding efficiency?

• What is the (greater) advantage of the constant directivity horn?

• What is a 'center cluster'?

• What is meant by the 'slope' of a crossover?

• Contrast some of the principal features of active and passive crossovers.

• Comment on the use of nearfield monitors.

Chapter 5: Analog Recording

Contrary to what you might read in home recording magazines, analog recording is not dead. Top professional studios still have analog recorders because they have a sound quality that digital just can't match. This isn't really to say that they sound better; in fact their faults are easily quantifiable, but their sound is often said to be 'warm', and it is often true to say that it is easier to mix a recording made on analog than it is to mix a digital multitrack recording. The other useful feature of analog recorders is that they are universal. You can take a tape anywhere and find a machine to play it on. As digital formats become increasingly diverse, individual studios become more and more isolated with audio being subject to an often complex export process to transfer it from one studio's system to another. With tape, you just mount the reel on the recorder and press play.

History

Magnetic tape recording was invented in the early years of the Twentieth Century and became useful as a device for recording speech, but simply for the information content, as in a dictation machine - the sound quality was too poor. In essence, a tape recorder converts an electrical signal to a magnetic record of that signal. Electricity is an easy medium to work in, compared to magnetism. It is straightforward to build an electrical device that responds linearly to an input. As we saw earlier, 'linear' means without distortion - like a flat mirror (linear) compared to a funfair mirror (non-linear). Magnetic material does not respond linearly to a magnetizing force. When a small magnetizing force is applied, the material hardly responds at all. When a greater magnetizing force is applied and the initial lack of enthusiasm to become magnetized has been overcome, then it does respond fairly linearly, right up to the point where it is magnetized as much as it can be, when we say that it is 'saturated'. Unfortunately, no-one has devised a way of applying negative feedback to analog recording, which in an electrical amplifier reduces distortion tremendously.

Early tape recorders (and wire recorders) had no means of compensating for the inherent non-linearity of magnetic material, and it was left up to scientists in Germany during World War II to come up with a solution. The tape recorder was apparently used to broadcast orchestral concerts at all hours of day and night, to the consternation of opposing countries who wondered how Germany could spare the resources to have orchestras playing in the middle of the night. (Obviously, recording onto disc was possible, but the characteristic crackle always gave the game away.) After hostilities had ceased, US forces brought some captured machines back home and development continued from that point. There is a lot of history to the analog recorder, which we don't need here, but it is certainly interesting as the development of the tape recorder coincides with the development of recording as we know it now.

The Sound of Analog

There are three characteristic ingredients of the analog sound:

• Distortion

• Noise

• Modulation noise

Distortion

The invention that transformed the analog tape recorder from a dictation machine to a music recording device, during the 1940s, was AC bias. Since the response of tape to a small magnetizing force is very small, and the linear region of the response only starts at higher magnetic force levels, a constant supporting magnetic force, or bias, is used to overcome this initial resistance. Prior to AC bias, DC bias was used courtesy of a simple permanent magnet. However, considerable distortion remained. AC bias uses a high frequency (~100 kHz) sine wave signal mixed in with the audio signal to 'help' the audio signal get into the linear region which is relatively distortion-free. This happens inside the recorder and no intervention is required on the part of the user. However the level of the bias signal has to be set correctly for optimum results. In traditional recording, this is the job of the recording engineer before the session starts. It has to be said that line-up is an exacting procedure and many modern recording engineers have so much else to think about (their digital transfers!) that line-up is better left to specialists.
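The way bias 'helps' a small signal can be illustrated with a toy model. Treat the tape as a memoryless curve with a dead zone, a linear region and saturation. Add one full cycle of a bias sine wave to the audio value and take the average result as what remains on the tape. This is only a sketch: real magnetization involves hysteresis, which is not modeled, and the dead zone, saturation and bias values are hypothetical numbers in arbitrary units.

```python
import math

DEAD_ZONE = 1.0    # hypothetical field below which the tape barely responds
SATURATION = 3.0   # hypothetical maximum magnetization

def tape_transfer(field):
    """Toy magnetization curve: dead zone, linear region, then saturation."""
    magnitude = min(max(abs(field) - DEAD_ZONE, 0.0), SATURATION)
    return math.copysign(magnitude, field)

def recorded_level(audio, bias_amplitude, steps=1000):
    """Average the transfer curve over one full cycle of the bias sine wave."""
    total = 0.0
    for k in range(steps):
        bias = bias_amplitude * math.sin(2.0 * math.pi * k / steps)
        total += tape_transfer(audio + bias)
    return total / steps

for audio in (0.1, 0.2, 0.4):
    unbiased = recorded_level(audio, 0.0)
    biased = recorded_level(audio, 2.0)
    print(audio, unbiased, round(biased, 3))
```

In this model a small signal without bias leaves nothing on the tape. With bias the recorded level is non-zero and, for small signals, very nearly proportional to the input.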

Despite AC bias, analog recording produces a significant amount of distortion. The higher the level you attempt to record on the tape, the more the distortion. It isn't like an amplifier or digital recorder where the signal is clean right up to 0 dBFS, then harsh clipping takes place. The distortion increases gradually from barely perceptible to downright unpleasant. Most analog recordings peak at a level that will produce around 1% distortion, which is very high compared to any other type of equipment. At 3%, most engineers will be thinking about backing off. More is unacceptable. It may not sound promising to use a medium that produces so much distortion, but the fact is that it actually sounds quite pleasant! It is also different in character from vacuum tube (valve) distortion so it is an additional tool in the recording engineer's toolkit.

Noise

As well as producing more distortion than any other type of audio equipment, the analog tape recorder produces more noise too - a signal to noise ratio of around 65 dB is about the best you can hope for and represents the state of the art since tape recorders matured around the early 1970s. It is debatable whether noise is a desirable component of analog recording, but it is certainly a feature. Noise isn't really the ogre it is made out to be. If levels are set correctly to maximize the use of the available dynamic range up to the 1% or 3% distortion point, then there is no reason why it should be troublesome in the final mix, although some 'noise management' will be necessary on the part of the mix engineer.
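It can help to see the noise and distortion figures on the same scale. This minimal sketch converts the 65 dB signal to noise ratio into a plain amplitude ratio, and the 1% and 3% distortion points into decibels relative to the signal, using the standard 20 × log10 rule for amplitudes. The function names are illustrative only.

```python
import math

def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a voltage (amplitude) ratio."""
    return 10.0 ** (db / 20.0)

def percent_to_db(percent):
    """Express a distortion percentage as dB relative to the signal."""
    return 20.0 * math.log10(percent / 100.0)

print(round(db_to_amplitude_ratio(65.0)))   # signal is this many times the noise
print(round(percent_to_db(1.0), 1))         # 1% distortion
print(round(percent_to_db(3.0), 1))         # 3% distortion
```

So at 65 dB the signal is roughly 1800 times the amplitude of the noise, while distortion products at the 1% and 3% points sit only about 40 dB and 30 dB below the signal.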

Modulation Noise

There have been digital 'analog simulators', but to my ears, unless this aspect of the character of analog recorders is simulated, they just don't sound the same. Modulation noise is noise that changes as the signal changes, and has two causes. One is Barkhausen noise which is produced by quantization of the magnetic domains (a gross over-simplification of a phenomenon that would take too much understanding for the working sound engineer to bother with). The other - more significant - cause of modulation noise is irregularities in the speed of tape travel. These irregularities are themselves caused by eccentricity and roughness in the bearings and other rotating parts, and by the tape scraping against the static parts. We sometimes hear the term 'scrape flutter', which creates modulation noise, and the 'flutter damper roller', which is a component used to minimize the problem.

If a 1 kHz sine wave tone is recorded onto analog tape, the output will consist of 1 kHz plus two ranges of other frequencies, some strong and consistent, others weaker and ever-changing due to random variations. These are known in radio as 'sidebands' and the concept has exactly the same meaning here.
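The 'strong and consistent' sidebands can be demonstrated numerically. This minimal sketch generates a 1 kHz tone whose phase wobbles at a single steady flutter rate, then measures the level at a few frequencies with a one-bin Fourier sum. The 50 Hz flutter rate, the depth and the sample rate are hypothetical values chosen for illustration, and the random, ever-changing component is not modeled.

```python
import math

SAMPLE_RATE = 8000     # Hz, arbitrary choice for the sketch
TONE_HZ = 1000.0       # the recorded test tone
FLUTTER_HZ = 50.0      # hypothetical steady flutter rate
FLUTTER_DEPTH = 0.2    # hypothetical peak phase deviation in radians

def fluttered_tone(n_samples):
    """A 1 kHz sine wave whose phase wobbles at the flutter rate."""
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        wobble = FLUTTER_DEPTH * math.sin(2.0 * math.pi * FLUTTER_HZ * t)
        samples.append(math.sin(2.0 * math.pi * TONE_HZ * t + wobble))
    return samples

def level_at(signal, frequency_hz):
    """Magnitude of one Fourier bin, scaled so a full-level sine reads 1.0."""
    re = im = 0.0
    for n, x in enumerate(signal):
        angle = 2.0 * math.pi * frequency_hz * n / SAMPLE_RATE
        re += x * math.cos(angle)
        im -= x * math.sin(angle)
    return 2.0 * math.hypot(re, im) / len(signal)

signal = fluttered_tone(SAMPLE_RATE)  # one second of audio
for f in (900, 950, 975, 1000, 1050, 1100):
    print(f, round(level_at(signal, f), 4))
```

Energy appears at 950 Hz and 1050 Hz (the tone plus and minus the flutter rate), with much weaker components at 900 Hz and 1100 Hz and essentially nothing in between.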

Modulation noise, subjectively, causes a 'thickening' of the signal which accounts for the fat sound of analog, compared to the more accurate, but thin sound of digital. It has even been known for engineers to artificially increase the amount of modulation noise by unbalancing one of the rollers, thus creating stronger sidebands containing a greater range of frequencies. Don't try it with your hard disk!

The Anatomy of the Analog Tape Recorder

The Studer A807 pictured here is typical of a workhorse stereo analog recorder, sold mainly into the broadcast market. Let's run through the major components starting from the ones you can't see:

• Three motors, one each for the supply reel, take-up reel and capstan. The take-up reel motor provides sufficient tension to collect the tape as it comes through. It does not itself pull the tape through. The supply reel motor is energized in the reverse direction to maintain the tension of the tape against the heads.

• The capstan provides the motive force that drives the tape at the correct speed.

• The pinch wheel holds the tape against the capstan.

• The tach (short for tachometer) roller contains a device to measure the speed of the tape in play and fast wind.

• The tension arm smooths out any irregularities in tape flow.

• The flutter damper roller reduces vibrations in the tape, lessening modulation noise.

• The erase head wipes the tape clean of any previous recording.

• The record head writes the magnetic signal to the tape. It can also function as a playback head, usually with reduced high frequency response.

• The playback head plays back the recording.

Magnetic Tape

Magnetic tape comprises a base film, upon which is coated a layer of iron oxide. Oxide of iron is sometimes, in other contexts, known as 'rust'. The oxide is bonded to the base film by a 'binder', which also lubricates the tape as it passes through the recorder. Other magnetic materials have been tried, but none suits analog audio recording better than iron, or more properly 'ferric' oxide. There are two major manufacturers of analog tape (there used to be several): Quantegy (formerly known as Ampex) and Emtec (formerly known as BASF).

Tape is manufactured in a variety of widths. (It is also manufactured in two thicknesses - so-called 'long play' tape can fit a longer duration of recording on the same spool, at the expense of certain compromises.) The widths in common use today are two-inch and half-inch. Oddly enough, metrication doesn't seem to have reached analog tape and we tend to avoid talking about 50 mm and 12.5 mm. Other widths are still available, but they are only used in conjunction with 'legacy' equipment which is being used until it wears out and is scrapped, and for replay or remix of archive material. Quarter-inch tape was in the past very widely used as the standard stereo medium, but there is now little point in using it as it has no advantages over other options that are available.

Two-inch tape is used on twenty-four track recorders. A twenty-four track recorder can record - obviously - twenty-four separate tracks across the width of the tape, thus keeping instruments separate until final mixdown to stereo. Half-inch tape is used on stereo recorders for the final master.

The speed at which the tape travels is significant. Higher speeds are better for capturing high frequencies as the recorded wavelength is physically longer on the tape. However, there are also irregularities (sometimes known as 'head bumps', or as 'woodles') in the bass end. The most common tape speed in professional use used to be 15 inches per second (38 cm/s), but these days it is more common to use 30 ips (76 cm/s), and not care about the massive cost in tape consumption! At 30 ips, a standard reel of tape costing up to $150 lasts about sixteen minutes.
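Both effects of tape speed come down to simple arithmetic: the recorded wavelength is the tape speed divided by the frequency, and the running time is the tape length divided by the speed. A minimal sketch; the 2,500 ft reel length is an assumption (the text gives only the running time), and the function names are illustrative.

```python
MM_PER_INCH = 25.4

def recorded_wavelength_um(speed_ips, frequency_hz):
    """Length on tape of one cycle of the signal, in micrometers."""
    return speed_ips * MM_PER_INCH * 1000.0 / frequency_hz

def running_time_min(tape_length_ft, speed_ips):
    """Playing time of a reel in minutes."""
    return tape_length_ft * 12.0 / speed_ips / 60.0

for speed in (15.0, 30.0):
    print(speed, "ips:",
          round(recorded_wavelength_um(speed, 20000.0), 2), "um at 20 kHz,",
          round(running_time_min(2500.0, speed), 1), "minutes per reel")
```

Doubling the speed doubles the length of tape given to each cycle of a 20 kHz signal, and halves the time a reel lasts.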

Analog Recorders in Common Use

Otari MTR90 Mk III

There have been many manufacturers of analog tape recorders, but the top three historically have been Ampex, Otari and Studer. In the US, you will commonly find the Ampex MM1200 and occasionally the Ampex ATR124, which is often regarded as the best analog multitrack ever made, but Ampex only made fifty of them. All over the world you will find the Otari MTR90 (illustrated with autolocator) which is considered to be a good quality workhorse machine, and is still available to buy. The Studer range is also well respected. The Studer A80 represents the coming of age of analog multitrack recording in the 1970s. It has a sound quality which is as good as the best within a very fine margin, but operational facilities are not totally up to modern standards. For example, it will not drop out of record mode without stopping the tape. The Studer A800 is still a prized machine and is fully capable, sonically and operationally, of work to the highest professional standard. The more recent A827 and A820 are also very good, but sadly no longer manufactured.

Multitrack Recording Techniques

How to set about a multitrack recording session is a topic in itself and will be explained later. However, there are certain points of relevance to the equipment itself. The first is the necessity to be able to listen to or monitor previously recorded tracks while performing an overdub. The problem here is that there is a gap between the record head and the playback head. If the singer, for example, sings in time with the output from the playback head, the signal will be recorded on the tape a couple of centimeters away, therefore causing a delay. To get around this problem, while overdubbing, the record head is used as a playback head. In this situation we talk about taking a 'sync output' from the record head. The sync output isn't of such good sound quality since the record head is optimized for recording, nevertheless it is certainly good enough for monitoring. The playback head is used for final mixdown.
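The size of the delay is simply the distance between the heads divided by the tape speed. A minimal sketch, taking 'a couple of centimeters' as exactly 2 cm purely for illustration (the actual spacing varies from machine to machine):

```python
def head_gap_delay_ms(head_spacing_cm, tape_speed_cm_per_s):
    """Time taken for a point on the tape to travel between the two heads."""
    return head_spacing_cm / tape_speed_cm_per_s * 1000.0

HEAD_SPACING_CM = 2.0  # assumed value standing in for 'a couple of centimeters'

print(round(head_gap_delay_ms(HEAD_SPACING_CM, 38.1), 1))  # at 15 ips
print(round(head_gap_delay_ms(HEAD_SPACING_CM, 76.2), 1))  # at 30 ips
```

Even at the higher speed this is a delay of tens of milliseconds, which is more than enough to put an overdub audibly out of time.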

Also, it is commonplace to 'bounce' several tracks, perhaps vocal harmonies, to one or two tracks (two tracks for stereo), thus freeing up tracks for further use. This has to be done using the sync output of the record head, otherwise the bounce won't be in time with the other tracks. The slight loss of quality has to be tolerated.

Another technique worth mentioning at this stage is editing. As soon as tape was invented, people were cutting it apart and sticking it back together again. In fact, with the old wire recorders, people used to weld the wire together, although the heat killed the magnetism at the join. The most basic form of tape editing is 'top and tailing'. This means cutting the tape to within 10 mm or so of the start of the audio, and splicing in a section of leader tape, usually white (about two meters). Likewise the tape is cut ten seconds or so after the end of each track and more leader inserted between tracks. At the end of the tape, red leader is joined on. No blank tape is left on the spool once top and tailing is complete.

Editing can also be used to improve a performance by cutting out the bad and splicing in the good. Even two inch tape can be edited, in fact it is normal to record three or four takes of the backing tracks of a song, and splice together the best sections. The tape is placed in a special precision-machined aluminum editing block, and cut with a single-sided razor blade, guided by an angled slot. Splicing tape is available with exactly the right degree of stickiness to join the tape back together. When the edit is done in the right place (usually just before a loud sound), it will be inaudible. It takes courage to cut through a twenty-four track two-inch tape though.

Compared to modern disk recorders, the main limitation of tape-based multitrack - analog and digital - is that once they are recorded, all the tracks have a fixed relationship in time. In a disk recorder, it is easy to move one track backwards or forwards in time, or copy it to a new location in the song. The equivalent technique in tape-based multitrack recording is the 'spin in'. In the original sense of the term, a good version of the chorus, or whatever audio was required to be repeated, would be copied onto another tape recorder. The multitrack would be wound to where the audio was to be copied. The two machines would be backed up a little way, then both set into play. At the right moment, the multitrack would be punched into record. Of course, the two machines had to be in sync, and this was the difficult part. If the two machines were identical mechanically, then a wax pencil mark could be made on corresponding rotating tape guides and the tapes backed up by the same number of revolutions. It sounds hit and miss, but it could be made to work amazingly quickly. When the digital sampler became available, it was used in place of the second recorder.

Maintenance

There are two differences between the maintenance of an analog recorder and a digital recorder. The first is that you can do a lot of first-line maintenance on an analog machine, whereas you can't do more than run a cleaning tape on a digital recorder. The second is that you have to do the maintenance, otherwise performance will suffer. These are the elements of maintenance:

Cleaning: the heads and all metallic parts that the tape contacts are cleaned gently with a cotton bud dipped in isopropyl alcohol. Isopropyl alcohol is only one of a number of alcohol variants, and it has good cleaning properties. It is not the same as drinking alcohol, so don't be tempted. Also, drinking alcohol - ethanol - attracts additional taxes in some countries, therefore it would not be cost-effective to use it.

The pinch wheel is made of a rubbery plastic. In theory it shouldn't be cleaned with isopropyl alcohol, but it often is. You can buy special rubber cleaner from pro audio dealers but in fact you can use a mild abrasive household liquid cleaner. Just one tiny drop is enough.

Demagnetizing the heads: After a while, the metal parts will collect a residual magnetism that will partially erase any tape that is played on the machine. A special demagnetizer is used for which proper training is necessary, otherwise the condition can be made even worse.

Line-up: Line-up, or alignment, has two functions - one is to get the best out of the machine and the tape; the other is to make sure that a tape played on one recorder will play properly on any other recorder. The following parameters are aligned to specified or optimum values:

Azimuth - the heads need to be absolutely vertical with respect to the tape otherwise there will be cancellation at HF. The other adjustments of the head - zenith, wrap and height - are not so critical and therefore do not need to be checked so often.

Bias level - optimizes distortion, maximum output level and noise.

Playback level - the 1 kHz tone on a special calibration tape is played and the output aligned to the studio's electrical standard level.

High frequency playback EQ - the 10 kHz tone on the calibration tape is played and the HF EQ adjusted.

Record level - a 1 kHz tone at the studio's standard electrical level is recorded onto a blank tape and the record level adjusted for unity gain.

HF record EQ - adjusted for flat HF response.

LF record EQ - adjusted for flat LF response.
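The reason azimuth matters most at HF can be seen from the usual textbook expression for azimuth loss, 20 × log10(sin(x) / x) with x = π × track width × tan(error angle) / recorded wavelength. This formula is not given in the text. A minimal sketch under that assumption; the 2 mm track width and the error angles are hypothetical values:

```python
import math

def azimuth_loss_db(track_width_mm, error_arcmin, speed_mm_per_s, frequency_hz):
    """Playback level change in dB caused by a tilted head gap (0 dB = no loss)."""
    wavelength_mm = speed_mm_per_s / frequency_hz
    error_rad = math.radians(error_arcmin / 60.0)
    x = math.pi * track_width_mm * math.tan(error_rad) / wavelength_mm
    if x == 0.0:
        return 0.0
    ratio = abs(math.sin(x) / x)
    if ratio == 0.0:
        return float("-inf")  # complete cancellation
    return 20.0 * math.log10(ratio)

TRACK_WIDTH_MM = 2.0   # hypothetical track width
SPEED_MM_PER_S = 381.0 # 15 ips

for freq in (1000.0, 10000.0, 20000.0):
    print(freq, "Hz:", round(azimuth_loss_db(TRACK_WIDTH_MM, 10.0, SPEED_MM_PER_S, freq), 2), "dB")
```

For the same small tilt, the loss is negligible at 1 kHz and grows rapidly towards the top of the audio band. The 10 kHz tone on the calibration tape is therefore a much more sensitive indicator of head alignment than the 1 kHz tone.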

The line-up procedure used to be considered part of the engineer's day-to-day routine, but is now often left to a specialist technician.

To conclude, this is certainly far from a complete treatise on analog tape recording, but it is enough for a starting point considering that analog recorders are now quite rare. Even so, analog recording has a long history and will almost certainly have a long future ahead. In fact the machines are so simple that they are infinitely maintainable - a fifteen year old Studer A800 will still be working for its living in fifteen years' time. You can't say that for digital recorders. Also, the sound of analog is very much the sound of recording, as we understand it. Does it make sense therefore to use digital emulation to achieve a pale shadow of the analog sound, or would it be better to use the real thing?

Check Questions

• Give two reasons why analog recorders are still in use in top professional studios.

• Comment on distortion in analog recording.

• Comment on noise in analog recording.

• Comment on modulation noise in analog recording.

• What is the function of AC bias?

• What is the distortion level of peaks in an analog recording?

• Why is the concept of clipping not relevant in analog recording?

• Why is the supply reel motor driven in the opposite direction to the actual rotation of the reel?

• What is the capstan?

• What is the pinch wheel?

• What is the tach roller?

• What two tape widths are in common top-level professional use?

• Name three twenty-four track analog tape recorders, make and model.

• What is 'bouncing'?

• Comment on cut and splice tape editing.

• What are the two functions of line-up?

Chapter 6: Digital Audio

Why digital? Why wasn't analog good enough? The answer starts with the analog tape recorder which plainly isn't good enough in respect of signal to noise ratio and distortion performance. Many recording engineers and producers like the sound of analog now, because it is a choice. In the days before digital, analog recording wasn't a choice - it was a necessity. You couldn't get away from the problems. Actually you could. With Dolby A and subsequently SR noise reduction, noise performance was vastly improved, to the point where it wasn't a problem at all. And if you don't have a problem with noise, you can lower the recording level to improve the distortion performance of analog tape. A recording well made with Dolby SR noise reduction can sound very good indeed. Some would say better than 16-bit digital audio, although this is from a subjective, not a scientific, point of view. Analog recording also had the problem that when a tape was copied, the quality would deteriorate significantly. And often there were several generations of copies between original master and final product. Digital audio can be copied identically as many times as necessary (although this doesn't always work as well as you might expect; more on this in another module).

In the domestic domain, before CD there was only the vinyl record. Well there was the compact cassette too, but that never sounded good, even with Dolby B noise reduction. (Some people say that they don't like Dolby B noise reduction. The problem is that they are usually comparing an encoded recording with decoding switched on and off. The extra brightness of the Dolby B encoded - but not decoded - sound compensates for dirty and worn heads and the decoded version sounds dull in comparison!) People with long memories will know that they used to yearn for a format that wasn't plagued with the clicks, pops and crackles of vinyl. The release of the CD format was eagerly anticipated, and of course the CD has become a great success.

Done properly, digital audio recorders can greatly outperform analog in both signal to noise ratio and distortion performance. That is why they are used in both the professional and domestic domains. When the question arises of why the other parts of the signal chain have mostly been changed over to digital, any possible improvement in sound quality is hardly relevant. Everything else performs as well as anyone could possibly want. Well almost anyone, the only exceptions being the microphone and the loudspeaker, but we are still some way off truly digital transducers becoming available. By the time digital recording and reproduction had become properly established, digital audio in general was showing that it could offer advantages over analog in terms of price and facilities offered. Digital effects were first, as it became possible to achieve, for instance, digital reverberation for a tiny fraction of the cost of an electromechanical system. Digital mixing consoles came rather later because they require an incredible processing power. Digital mixing consoles don't sound better than analog. They do however offer more facilities for the price, and have the advantage that settings can easily be stored and recalled. This is an important feature that we shall discuss more when we discuss mixing consoles.

Having established the reasons we have digital audio, let's see how it works...

Digital Theory

Firstly, what do we mean by analog? Analog comes from the word analogy. If I say that electrical voltage is a similar concept to the pressure of water behind a tap (excuse me, faucet), then I am making an analogy. If I convert an acoustic sound to an electrical signal where the rise and fall in sound pressure is imitated by a similar rise and fall in voltage, then the electrical signal is an analog of the original. An analog signal is continuous. It follows the changes of the original without any kind of subdivision. It might not be able to track the changes fast enough for complete accuracy, in which case the high frequency response will be worse than it could be. Its useful dynamic range lies between a maximum value which the analog signal cannot exceed (generally the positive and negative voltage limits of the power supply - the signal can never exceed these and will be clipped if it tries) and random variations at a very low level that we hear as noise.

Digital systems analyze the original in two ways: firstly by 'sampling' the signal a number of times every second. Any changes that happen completely between sampling periods are ignored, but if the sampling periods are close enough together, the ear won't notice. The other is by 'quantizing' the signal into a number of discrete - separately identifiable - levels. The smoothly changing analog signal is therefore turned into a stair-step approximation, since digital audio knows no 'in-between' states.

As you can see, the digital signal here is only a crude approximation of the original, but it can be made better by increasing the sampling frequency (sampling rate), and by increasing the number of quantization levels. Let's go deep...

To reproduce any given frequency, the sampling frequency, or sampling rate, has to be at least twice that frequency. So to convert the full range of human hearing to digital, a sampling frequency of at least 40 kHz (twice 20 kHz) is necessary. In practice, a 'safety margin' has to be added, so we get the standard compact disc sampling frequency of 44.1 kHz (exactly this to coincide with the requirements of early digital equipment), and 48 kHz which is used in broadcasting (since in the early days of digital it was easier to convert to the standard satellite sampling frequency of 32 kHz).

To reduce the quantization error between the digital signal and the original analog, more quantization levels must be used. Compact disc and DAT both use 65,536 levels. This, in digital terms, is a nice round number corresponding to 16 bits. Without going into binary arithmetic, each bit provides roughly 6 dB of signal to noise ratio. Therefore a digital audio system with 16-bit resolution has a signal to noise ratio (at least in theory) of 96 dB.
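The '6 dB per bit' rule can be checked with a few lines of arithmetic. The following is a minimal sketch rather than a formal derivation; the function names are my own, and it uses the simple 20 x log10 of the number of levels, which is where the rounded 6 dB figure comes from.

```python
import math

def quantization_levels(bits: int) -> int:
    """Number of discrete levels available with the given word length."""
    return 2 ** bits

def theoretical_snr_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(quantization_levels(bits))

for bits in (8, 16, 24):
    print(bits, "bits:", quantization_levels(bits), "levels,",
          round(theoretical_snr_db(bits), 1), "dB")
```

For 16 bits this gives 65,536 levels and a figure that rounds to 96 dB, matching the values quoted above.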

The question will arise, what happens if a digital system is presented with a frequency higher than half the sampling frequency? The answer is that a phenomenon known as aliasing will occur. What happens is that these higher frequencies are not properly encoded and are translated into spurious frequencies in the audio band. These are only distantly related to the input frequencies and absolutely unmusical (unlike harmonic distortion, which can be quite pleasant in moderation). The solution is not to allow frequencies higher than half the sampling rate (in fact less, to give a margin of safety) into the system. Therefore an 'anti-aliasing' filter is used just after the input. Filter design is complex, particularly filters with the steep slopes necessary to maximize frequency response, but not be too wasteful on storage or bandwidth by having a sampling rate that is unnecessarily high. The design of the filters is one of the distinguishing points that make different digital systems actually sound different.
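To see where an unfiltered high frequency ends up, the 'folding' of aliasing can be sketched as below. This assumes an ideal sampler with no anti-aliasing filter at all; the function name is hypothetical.

```python
def alias_frequency(f_in: float, fs: float) -> float:
    """Frequency (Hz) at which a tone of f_in Hz appears after sampling at fs Hz."""
    f = f_in % fs                      # sampling cannot tell f_in from f_in + n*fs
    return fs - f if f > fs / 2 else f  # anything above fs/2 folds back down

# A 30 kHz tone sampled at 44.1 kHz turns up inside the audio band:
print(alias_frequency(30_000, 44_100))  # 14100
# A 1 kHz tone is below half the sampling rate and is unaffected:
print(alias_frequency(1_000, 44_100))   # 1000
```

The 14.1 kHz result has no harmonic relationship to the 30 kHz input, which is why aliasing sounds so unmusical.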

Once the signal has been filtered, sampled and quantized, it must be coded. It might be possible to record the binary digits directly but that wouldn't offer the best advantage, and indeed might not work. In the compact disc system, the tiny pits in the aluminized audio layer themselves form the spiral that the laser follows from the start of the recording to the end. A binary '1' is coded by a transition from 'land' - the level surface - to a pit or vice-versa. A binary '0' is coded by no transition. But what if the signal was stuck on '0' for a period of time - the spiral would disappear! Hence a system of coding is used that rearranges the binary digits in such a way that they are forced to change every so often, simply to make a workable system. There are other such constraints that we need not go into here.
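The 'transition for a 1, no transition for a 0' rule can be sketched as follows. This illustrates only that rule and the problem with long runs of zeros; the actual channel code used on CD is more elaborate and is not reproduced here. The function names are my own.

```python
def transition_encode(bits, start_level=0):
    """Return the pit/land level (0 or 1) for each bit: a 1 flips the level, a 0 holds it."""
    level = start_level
    levels = []
    for b in bits:
        if b == 1:
            level ^= 1          # transition between land and pit
        levels.append(level)
    return levels

def longest_run(levels):
    """Length of the longest stretch with no transition."""
    best = run = 0
    prev = None
    for lv in levels:
        run = run + 1 if lv == prev else 1
        prev = lv
        best = max(best, run)
    return best

levels = transition_encode([1, 0, 1, 1, 0, 0, 0, 0])
print(levels)               # [1, 1, 0, 1, 1, 1, 1, 1]
print(longest_run(levels))  # 5
```

The four trailing zeros produce a long unbroken stretch, which is exactly the situation the coding scheme has to prevent.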

Additionally there is the need for error correction. In any storage medium there are physical defects that would damage the data if nothing were done to prevent such damage. So additional data is added to the raw digital signal, firstly to check on replay whether the data is valid or erroneous, secondly to add a backup data stream so that if a section of data is corrupted, it can be reconstituted from other data nearby. Adding error correction involves a compromise between preserving the integrity of the digital signal, and not adding any more extra data than necessary. It is fair to say that the error correction system on CD, and on DAT, is very good. But as in all things, more modern digital systems are cleverer, and better.

All of the above is known as analog to digital encoding, or A to D. The reverse process is known, fairly obviously, as decoding. To spare the details that only electronics experts need to know, the digital signal goes through a D to A convertor and out comes an analog signal. The only problem is that it now contains a strong component at the sampling frequency. Obviously this is above audibility, but it could cause severely audible distortion if allowed into any other equipment that couldn't properly handle it. To obviate this therefore, the output is filtered with what is known as a 'brickwall' filter, because of its steep slope. Once again the design of the filter does affect the sound quality, but digital tricks have now been developed to make the filter's job easier, therefore design is more straightforward.

Analog to Digital Conversion

Filtering: removing frequencies, in the analog domain, that are higher than half the sampling rate.

Sampling: measuring the signal level once per sampling period.

Quantization: deciding which of the 65,536 levels (in a 16-bit system) is closest to the input signal level, for each sampling period.

Coding: converting the result to a binary number according to a scheme that a) incorporates error detection, b) makes provision for error correction, c) is recordable or transmissible in the chosen medium.
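The sampling and quantization steps above can be sketched in a few lines. This is an illustration under stated assumptions: the input is an ideal sine wave already limited to below half the sampling rate, the filtering step is omitted, and the 'coding' shown is only plain two's complement binary with no error detection or correction. The function name is my own.

```python
import math

def sample_and_quantize(freq_hz, fs_hz, bits, n_samples):
    """Sample a full-scale sine wave and round each sample to the nearest level."""
    max_code = 2 ** (bits - 1) - 1      # 32767 for a 16-bit system
    codes = []
    for n in range(n_samples):
        x = math.sin(2 * math.pi * freq_hz * n / fs_hz)  # sampling: one value per period
        codes.append(round(x * max_code))                # quantization: nearest level
    return codes

codes = sample_and_quantize(1000, 48000, 16, 48)        # one cycle of 1 kHz at 48 kHz
words = [format(c & 0xFFFF, "016b") for c in codes]     # plain 16-bit binary words
print(codes[:4])
print(words[:4])
```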

The D to A decoder incorporates three levels of protection against damaged data:

Error correction; an error is detected in the data and completely corrected by using the additional error-correction data specifically put there for the purpose.

Error concealment; an error is detected but it is too severe to be corrected. Missing data is therefore 'interpolated' - just one of the many scientific words for 'guess' - from surrounding data and the result hopefully will be inaudible. However, if you ever get the chance to see a CD player that has correction and concealment indicator lights, you will notice that an awful lot of concealment goes on just to play an average disc. How well concealment is done is one of the factors that make different digital systems sound different.

Muting; in this case the error is so bad that the system shuts down momentarily rather than output what could be an exceedingly loud glitch.
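Concealment and muting can be illustrated with a deliberately simplified sketch. It assumes the decoder already knows which samples are bad, interpolates a single bad sample from its two good neighbours, and mutes anything worse. Real players conceal longer gaps and use better interpolation; the function name and the one-sample limit are my own assumptions.

```python
def conceal(samples, bad_positions):
    """Replace isolated bad samples by interpolation; mute bad samples with no good neighbours."""
    bad = set(bad_positions)
    out = list(samples)
    for i in sorted(bad):
        has_good_neighbours = (
            0 < i < len(out) - 1 and (i - 1) not in bad and (i + 1) not in bad
        )
        if has_good_neighbours:
            out[i] = (samples[i - 1] + samples[i + 1]) / 2  # the educated 'guess'
        else:
            out[i] = 0                                      # mute rather than glitch
    return out

print(conceal([0, 10, 999, 30, 40], {2}))      # [0, 10, 20.0, 30, 40]
print(conceal([0, 10, 999, 999, 40], {2, 3}))  # [0, 10, 0, 0, 40]
```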

Bandwidth

Bandwidth, in this context, is the rate of flow of data measured in kilobits per second. 1 kilobit is 1024 bits. Often, the term byte is used where 1 byte = 8 bits. The abbreviation for bit is 'b' and for byte is 'B', but these are often confused, as are the multiplier prefixes 'k' meaning x1000, and 'K' meaning x1024.

The bandwidth of a single channel of 16-bit 44.1 or 48 kHz digital audio is roughly 750 Kbps. Compare this with the bandwidth of a modem (56 Kbps), ISDN2 (128 Kbps) and common ADSL Internet connections (512 Kbps). None of these systems is capable of transmitting even a single channel of digital audio, hence the need for MP3 and similar data-reduction systems.
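The 'roughly 750 Kbps' figure follows directly from sampling rate multiplied by word length. A minimal sketch, counting raw audio bits only (no error correction or framing overhead) and using 1 Kbit = 1024 bits as above; the function name is my own.

```python
def pcm_bit_rate(fs_hz: int, bits: int, channels: int = 1) -> int:
    """Raw audio data rate in bits per second."""
    return fs_hz * bits * channels

for fs in (44_100, 48_000):
    rate = pcm_bit_rate(fs, 16)
    print(fs, "Hz:", rate, "bit/s =", round(rate / 1024), "Kbps")
```

At 48 kHz this works out to 768,000 bits per second, which is 750 Kbps; at 44.1 kHz it is 705,600 bits per second, or about 689 Kbps.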

24/96

The quest for ever better sound quality leads us to want to increase both the sampling rate and the resolution. 24-bit resolution will in theory give a signal to noise ratio of 144 dB. This will never happen in practice, but the real achievable signal to noise ratio is probably as good as anyone could reasonably ask for. Of course, some of the available dynamic range may be used as additional headroom, to play safe while recording, but even so the resulting recording will be remarkably quiet. Also, even though most of us cannot even hear up to 20 kHz, a frequency which is perfectly well catered for these days by a 44.1 or 48 kHz sampling rate, there is always a nagging doubt that this is only just good enough, and it would be worthwhile to have a really high sampling rate to put all doubt at an end.

This, of course, affects storage requirements. It is a reasonable rule of thumb that CD-quality stereo audio requires about 10 Megabytes per minute of storage. 24-bit, 96 kHz digital audio will therefore, by simple multiplication, require a little over 30 Megabytes per stereo minute. Of course, Megabytes are getting cheaper all the time. There is another problem however - data bandwidth. When recording onto a hard disk system, there is a certain data throughput rate beyond which the system will struggle and possibly fail to record or play back properly. A standard modern hard drive should be easily capable of achieving 24 tracks of playback under normal circumstances (the track count is affected, for one thing, by the 'edit density' - the more short segments you cut the audio into, and the more widely the data is physically separated on the disk, the harder it will be to play back). Try this at more than three times the data rate and the track count, or the reliability, is bound to suffer. However, disks are getting ever faster and most of the problems of this nature are in the past. Before long it will be possible to get virtually any number of tracks quite easily. It's worth a quick look at Digidesign's comments on hard disk specifications to maximize track count.
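The storage rule of thumb can be worked through from first principles. This sketch assumes raw stereo audio with no file overhead and takes 1 Megabyte as 1,000,000 bytes; the function name is my own.

```python
def megabytes_per_minute(fs_hz: int, bits: int, channels: int = 2) -> float:
    """Raw storage needed for one minute of audio, in Megabytes (10**6 bytes)."""
    bytes_per_second = fs_hz * (bits // 8) * channels
    return bytes_per_second * 60 / 1_000_000

cd = megabytes_per_minute(44_100, 16)
hi_res = megabytes_per_minute(96_000, 24)
print(round(cd, 1), round(hi_res, 1), round(hi_res / cd, 2))
```

CD quality comes to about 10.6 Megabytes per stereo minute and 24-bit/96 kHz to about 34.6, a ratio of roughly 3.3 to 1.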

Digital Interconnection

Digital interconnection comes in a number of standards, which are summarized here:

AES/EBU

• Also known as AES3 1985 (the year it was implemented)

• Standard for professional digital audio

• Supports up to 24-bit at any sampling rate

• Transmits 2 channels on a single cable

• Uses 110 ohm balanced twisted wire pair cables usually terminated with XLR connectors

• Can use cables of length up to 100 meters

• Electrical signal level 5 volts

• Standard audio cables can be used for short distances but are not recommended as their impedance may not be the standard 110 ohm and reflections may occur at the ends of the cable

• Data transmission at 48 kHz sampling rate is 3.072 Megabit/s (64x the sampling rate)

• Self clocking but master clocking is possible
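The '64x the sampling rate' figure for AES/EBU can be reproduced as below. The sketch assumes the usual description of the format as two 32-bit subframes (one per channel) sent in every sampling period, with the bits beyond the 24 audio bits carrying synchronization and status information; the function and parameter names are my own.

```python
def aes_ebu_bit_rate(fs_hz: int, bits_per_subframe: int = 32, channels: int = 2) -> int:
    """Interface data rate in bits per second: one subframe per channel per sample."""
    return fs_hz * bits_per_subframe * channels

print(aes_ebu_bit_rate(48_000))           # 3072000, i.e. 3.072 Megabit/s
print(aes_ebu_bit_rate(48_000) // 48_000)  # 64
```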

S/PDIF

• Two types:

• Electrical

• Uses 75 Ohm unbalanced coaxial cable with RCA phono connectors

• Cable lengths limited to 6 meters.

• Optical

• TOSLINK - Uses plastic fiber optic cable and same connectors as Lightpipe (below). TOSLINK is an optical data transmission technology developed by Toshiba. TOSLINK does not specify the protocol to be used

• ST-type - Glass fiber can be used for longer lengths (1 kilometer).

• Meant for consumer products but may be seen on professional equipment

• Supports up to 24-bit/48 kHz sampling rate

• Self-clocking

• It ought to be necessary to use a format converter when connecting with AES/EBU since the electrical level is different (0.5 V) and the format of the data is different also. However, some AES/EBU inputs can recognise an S/PDIF signal

• Some of the bits within the Channel Status blocks are used for SCMS (Serial Copy Management System), to prevent consumer machines from making digital copies of digital copies.

MADI

• An extension of the AES3 format (AES/EBU)

• Supports up to 24-bit/48 kHz sampling rate (higher rates are possible)

• Transmits 56 channels on a 75 Ohm video coaxial cable with BNC connectors

• Length limited to 50 meters. Fiber-optic cable can be used for longer lengths

• Data transmission rate is 100 Megabit/s

• Requires a master clock - a dedicated master synchronization signal must be applied to all transmitters and receivers.

ADAT Optical

• Sometimes known as 'Lightpipe'

• Implemented on the Alesis ADAT MDM and digital devices such as mixing consoles, synthesizers and effects units

• Supports up to 24-bit/48 kHz sampling rate

• Transmits 8 channels serially on fiber-optic cable

• Distance limited to 10 meters, or up to 30 meters with glass fiber cable

• Data transmission at 48 kHz is 12 Megabit/s

• Self clocking

• Channels can be reassigned (digital patchbay function)

TDIF (Tascam Digital Interface Format)

• Implemented on Tascam's family of DA-88 recorders and other digital devices such as mixing consoles

• Supports up to 24-bit at multiple sampling rates

• Transmits 8 channels on multicore, unbalanced cables with 25-pin D-sub connectors

• Bidirectional interface: a single cable carries data in both directions

• Cable length limited to 5 meters

• Data transmission at 48 kHz sampling rate is 3 Megabit/s (like AES/EBU)

• Intended for a master clock system, although self-clocking is possible

Check Questions

• To which type of sound engineering equipment was digital audio first applied?

• In relation to the question above, why was this the most pressing need?

• What types of equipment are currently not available in digital form?

• Describe 'sampling rate'.

• What is the minimum sampling rate for a digital system capable of reproduction up to 20 kHz (ignoring any 'safety margin')?

• What is 'aliasing'?

• What two sampling rates are most commonly used in digital audio?

• Describe quantization.

• What is the signal to noise ratio, in theory, of a digital system with 20-bit resolution?

• Why is coding necessary? Give two reasons.

• Why does a digital to analog convertor need a filter?

• What is error correction?

• What is error concealment?

• What happens (or at least should happen) if an error is neither corrected nor concealed?

• How many Megabytes of data, approximately, are occupied by one minute of CD-quality stereo digital audio?

• Why, in a hard disk recording system, is it likely that fewer tracks can be replayed simultaneously at the 24-bit/96 kHz standard, than at the CD-quality 16-bit/44.1 kHz standard?

Chapter 7: Digital Audio Tape Recording

The original purpose of DAT (Digital Audio Tape) was to be a replacement for the Compact Cassette (or simply 'cassette', as we now know it). Since DAT was intended to be a consumer product right from the start, the cassette housing is very small, 73 x 54 mm and just 10.5 mm thick. For professional users, this is rather too small, not just because it makes the cassette easier to lose, but because there will always be a feeling that DAT could have been a better system if there had been a bit more space for the data. This would allow for error concealment to be minimized, and tracking tolerances could be such that a tape recorded on one recorder could be absolutely guaranteed to play properly on any other. This is generally the case for professional machines, but not necessarily so for semi-pro 'domestic' recorders.

Sony professional DAT

Having said that DAT’s size is a disadvantage for professional users, it really is amazing how it achieves what it does working at microscopic dimensions. DAT’s full title, R-DAT, indicates that the system uses a rotary head like a video recorder. Unlike analog tape which records the signal along a track parallel to the edge of the tape, a rotary head recorder lays tracks diagonally across the width of the tape. So even though the tape speed is just 8.15 millimeters/second, the actual writing speed is a massive 3.133 meters/second. The width of each track is 13.591 millionths of a meter. Unlike an analog tape, the tracks are recorded without any guard band between them. In fact, the tracks are recorded by heads which are around 50% wider than the final track width and each new track partially overlaps the one before, erasing that section. Since the same heads are used for recording and playback, this may seem to present a problem because if the head is centred on the track it is meant to be reading, then it will also see part of the preceding track and part of the next track. Won't this result in utter confusion? Of course it doesn't, because a system originally developed for video recording is used, known as azimuth recording. The ‘azimuth’ of a tape head refers to the angle between the head gap, where recording takes place, and the tape track itself. In an analog recorder the azimuth is always adjusted to 90 degrees, so that the head gap is at right angles to the track. In DAT, which uses two heads, one head is set at -20 degrees and the other to +20 degrees, and they lay down tracks alternately. So on playback, each head receives a strong signal from the tracks that it recorded, and the adjacent tracks, which are misaligned by 40 degrees, give such a weak signal that it can be rejected totally.

Mechanically, there is a strong similarity between a DAT recorder and a video cassette recorder. Both use a rotary head drum on which are mounted the record/playback heads. But there are differences. A video recorder uses a large head drum with the tape wrapped nearly all the way around. This is necessary so that there can always be a head in contact with the tape during the time that each video frame is built up on the screen. With digital audio, data can be read off the tape at any rate that is convenient and stored up in a buffer before being read out at a constant speed and converted to a conventional audio signal. The head drum in a DAT machine is a mere 30 mm in diameter (and spins at 2000 revolutions per minute). The tape is wrapped only a quarter of the way around, which means that at times neither of the two heads is in contact with the tape, but as I said, this can be compensated for. This 90 degree wrap has its advantages:

• There is only a short length of tape in contact with the drum so high speed search can be performed with the tape still wrapped.

• Tape tension is low, giving long head and tape life

• If an extra pair of heads is mounted on the drum, simultaneous off-tape monitoring can be performed during recording just like a three-head analogue tape recorder.
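The drum figures quoted above (30 mm diameter, 2000 revolutions per minute, 90 degree wrap) can be turned into speeds and timings with simple geometry. This sketch ignores the slow movement of the tape itself and the angle of the track, so it gives the head's surface speed rather than the exact writing speed.

```python
import math

DRUM_DIAMETER_MM = 30
DRUM_RPM = 2000
WRAP_DEGREES = 90

circumference_mm = math.pi * DRUM_DIAMETER_MM
surface_speed_m_s = circumference_mm * DRUM_RPM / 60 / 1000   # head speed at drum surface
contact_length_mm = circumference_mm * WRAP_DEGREES / 360     # tape touching the drum
revolution_ms = 60_000 / DRUM_RPM                             # time for one revolution
contact_ms = revolution_ms * WRAP_DEGREES / 360               # each head's time on tape

print(round(surface_speed_m_s, 3), "m/s")
print(round(contact_length_mm, 1), "mm of tape on the drum")
print(contact_ms, "ms on tape per head, out of", revolution_ms, "ms per revolution")
```

The surface speed comes out at about 3.14 meters/second, close to the 3.133 meters/second writing speed quoted earlier. With two heads each reading for 7.5 ms of every 30 ms revolution, data comes off the tape only half the time, which is why the buffer is needed.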

The signal that is recorded on the tape is of course digital, and very dissimilar to either analogue audio or video signals. As you know, the standard DAT format uses 16-bit resolution at a sampling frequency of 48 kHz. This converts the original analog audio signal to a stream of binary numbers representing the changing level of the signal. But since the dimensions of the actual recording on the tape are so small, there is a lot of scope for errors to be made during the record/replay process, and if the wrong digit comes back from the tape it is likely to be very much more audible than a drop-out would be on analog tape. Fortunately DAT, like the Compact Disc, uses a technique called Double Reed-Solomon Encoding which duplicates much of the audio data, in fact 37.5%, in such a way that errors can be detected, then either corrected completely or concealed so that they are not obvious to the ear. If there is a really huge drop-out on the tape, then the DAT machine will simply mute the output rather than replay digital gibberish. As an extra precaution against dropouts, another technique called interleaving is employed which scatters the data so that if one section of data is lost, then there will be enough data beyond the site of the damage which can be used to reconstruct the signal.
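The idea behind interleaving can be shown with a toy example. This is a simple block interleaver of my own devising, not the actual DAT scheme: samples are written into rows and read out by columns, so that a burst of damage on the 'tape' is scattered into isolated single-sample errors after de-interleaving.

```python
def interleave(data, rows):
    """Write data into `rows` equal rows, then read it out column by column."""
    cols = len(data) // rows
    return [data[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(data, rows):
    """Reverse of interleave()."""
    cols = len(data) // rows
    out = [None] * len(data)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = data[i]
            i += 1
    return out

original = list(range(12))
on_tape = interleave(original, rows=3)
print(on_tape)                       # [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]

on_tape[3:6] = [None, None, None]    # a drop-out wipes three adjacent values
recovered = deinterleave(on_tape, rows=3)
missing = [i for i, v in enumerate(recovered) if v is None]
print(missing)                       # [1, 5, 9]
```

The three lost samples end up well apart, each with good neighbours on both sides, which gives correction or concealment a much easier job.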

The pulse code modulated audio data is recorded in the centre section of each diagonal track across the tape. There is other data too:

• 'ATF' signals allow for Automatic Track Finding which makes sure that the heads are always precisely positioned over the centre of the track, even if the tape is slightly distorted and the track curved.

• Sub Code areas allow extra data to be recorded alongside the audio information. Not all of the capacity of the Sub Code areas is in use as yet, allowing for extra expansion of the DAT system. Those at present in use include:

• A-time, which logs the time taken since the beginning of the tape

• P-time, which logs the time taken since the last Start ID.

• Start ID marks the beginning of each item;

• Skip ID tells the machine to go directly to the next Start ID, thus performing an ‘instant edit’.

• End ID marks the end of the recording on the tape.

• There is also provision for SMPTE/EBU timecode

DASH

DASH stands for Digital Audio Stationary Head. The DASH specifications include matters such as the size of the tape, the tape speed and the layout of the tracks on the tape; also the modulation method and error correction strategy, among other things. The format is based on two tape widths: 1/4” (6.3 mm) and 1/2” (12.55 mm). For each tape width there are two track geometries, Normal Density and Double Density, and there are also three tape speeds, nominally Slow, Medium and Fast (a further variation is caused by each of the three speeds being slightly different according to whether 44.1 kHz or 48 kHz sampling is used). According to the above, there must be twelve combinations all of which conform to the DASH format. This could make life confusing, but just because a particular combination of parameters is possible, it doesn't necessarily mean that a machine will be built to accommodate it.
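The count of twelve is simply every tape width paired with every track geometry and every nominal speed. A short sketch that enumerates them, using only the labels given above:

```python
from itertools import product

widths = ['1/4"', '1/2"']
densities = ["Normal Density", "Double Density"]
speeds = ["Slow", "Medium", "Fast"]

combinations = list(product(widths, densities, speeds))
for combo in combinations:
    print(combo)
print(len(combinations), "combinations")
```

Counting the 44.1 kHz and 48 kHz speed variants separately would double this to twenty-four.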

Sony PCM 3348

The original Sony 3324, and recent 24-track machines, use the normal density geometry on 1/2” tape which allows twenty-four digital audio tracks, two analog cue tracks, a control track and a timecode track. (The cue tracks are there so that audio can be made available in other than normal play speed +/- normal varispeed). The tape speed at 44.1 kHz is 70.01 cm/s. The 3324 is totally two-way compatible with the larger 3348 which can record forty-eight digital tracks on the same tape. To give an example, you may start a project on a 3324, of any vintage, and then the producer decides as the tracks fill up that he or she really needs more elbow room for overdubs. So you hire a 3348, put the twenty-four track tape on this and record another twenty-four tracks in the guard bands left by the other machine. Continuing my (hypothetical) example, when it is decided that the project is costing too much and going nowhere, the producer is sacked and another one brought in who decides that the extra twenty-four tracks are unnecessary embellishments and the original tracks, with a little touching up, are all that are required. Off goes the 3348 back to the hire company, the tape - now recorded with forty-eight tracks - is placed back on the 3324 and the original twenty-four tracks are successfully sweetened and mixed with not a murmur from the tracks that are now not wanted. We are now accustomed to new products and systems which offer new features yet are compatible with material produced on earlier versions. This must be audio history's only example of forward as well as reverse compatibility. It shows what thinking ahead can achieve.

DASH Operation

The first thing you are likely to want to do with your new DASH machine is of course to make a recording with it, but it would be advisable to read the manual before pressing record and play. Some of the differences between digital and analog recording stem from the fact that the heads are not in the same order. On an analog recorder we are used to having three heads: erase, record and play. DASH doesn't need an erase head because the tape is always recorded to a set level of magnetism which overwrites any previous recordings without further intervention. So the first head that the tape should come across should be the record head. Right?

Wrong. The first head is a playback head, which on a basic DASH machine is followed by a record head only. If this seems incorrect, you have to remember that while analog processes take place virtually instantaneously, digital operations take a little time. So if you imagine analog overdubbing where the sync playback signal comes from the record head itself, you can see why this won't work in the digital domain. There will be a slight delay while the playback signal is processed, and another delay while the record signal is processed and put onto tape. 105 milliseconds in fact, which corresponds to about 75 mm of tape. To perform synchronous overdubs there has to be a playback head upstream of the record head otherwise the multitrack recording process as we know it just won’t work. For most purposes two heads are enough, and a third head is available as an option if you need it, and you'll need it if you want to have confidence monitoring. (There are no combined record/playback heads, by the way, all are fixed function).
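The link between the 105 millisecond delay and the length of tape is just speed multiplied by time. A quick check, assuming the 70.01 cm/s tape speed quoted earlier for 44.1 kHz operation:

```python
TAPE_SPEED_CM_PER_S = 70.01   # DASH tape speed at 44.1 kHz, as quoted earlier
PROCESSING_DELAY_MS = 105     # combined playback and record processing delay

tape_travel_mm = TAPE_SPEED_CM_PER_S * 10 * PROCESSING_DELAY_MS / 1000
print(round(tape_travel_mm, 1), "mm")
```

This gives about 73.5 mm, which agrees with the 'about 75 mm' figure and indicates roughly how far upstream of the record head the playback head has to sit.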

On any digital recording medium the tape has to be formatted to be used. On DAT the formatting is carried out during recording, but on DASH it is often better to do it in advance. The machine can format while recording - in Advance Record mode - but this is best done in situations where you will be recording the whole of the tape without stopping. If you wish, you can ‘pre format’ a tape but this obviously takes time. You can take comfort from the fact that it can be done in one quarter of real time, and the machine will lay down timecode simultaneously.

Since there are different ways to format a tape and make recordings, the 3324S has three different recording modes: Advance, Insert and Assemble. Advance mode is as explained above. Insert is for when you have recorded or formatted the full duration of the material and you want to go back and re-record some sections. Assemble is when you want to put the tape on, record a bit, play it back, record a bit more etc, as would typically happen in classical sessions.

Converter Delay

The main text deals with some of the implications of delays caused by the process of recording digital signals onto tape and playing them back again. There is another problem caused by delays in the A/D conversion itself. The convertors used in the Sony 3324S, for example, while being very high quality, have an inherent delay of about 1.7 milliseconds.

Imagine the situation where you are punching into a track on an analog recording to correct a mistake. You will probably set up the monitoring so that you and the performer can hear both the output from the recorder and the signal to be recorded. The performer will play along with his part until the drop in, when the recorder will switch over to monitor the input signal. This will be returned to the console and you will hear the level go up by approximately 3 dB because you are now monitoring the same signal via two paths.

On the 3324S you can make a cross fade punch in of up to about 370 milliseconds. This is a good feature, but when you have made the punch in - using the monitoring arrangement described above - you will hear the input signal added to the same signal returned from the recorder but delayed by about 1.7 ms. This will cause phase cancellation and an odd sound. Fortunately, Sony have included an analog cross fade circuit which will imitate what is happening in the digital domain, but without the delay.
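The 'odd sound' is comb filtering: when a signal is mixed with a delayed copy of itself, cancellation occurs at every frequency where the delay equals an odd number of half cycles. A minimal sketch, assuming the direct and delayed signals are mixed at equal level; the function name is my own.

```python
def notch_frequencies(delay_s: float, count: int):
    """First `count` frequencies (Hz) cancelled when a signal is summed with a delayed copy."""
    return [(2 * k + 1) / (2 * delay_s) for k in range(count)]

notches = notch_frequencies(0.0017, 3)     # 1.7 ms convertor delay
print([round(f, 1) for f in notches])      # [294.1, 882.4, 1470.6]
```

With a 1.7 ms delay the first cancellation falls at about 294 Hz, right in the body of most instruments and voices, with further notches every 588 Hz or so above it.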

Editing

DASH was designed to be a cut-and-splice editing format. Briefly, this is possible but it was found in practice that edits were often unreliable. Editing of DASH tapes is now done by copying between two machines synchronized together with an offset. Two synchronized 24-track machines are obviously more versatile in this respect than one 48-track.

Maintenance

Although an analog recorder can be, and should be, cleaned by the recording engineer in the normal course of studio activities, a DASH machine should only be cleaned by an expert, or thousands of dollars' worth of damage can be caused. The heads can be cleaned with a special chamois-leather cleaning tool, wiping in a horizontal motion only. Cotton buds, as used for analog recorders, will clog a DASH head with their fibers. Likewise, an analog recorder can be aligned by a knowledgeable engineer, but alignment of a DASH machine is something that is done every six months or so by a suitably qualified engineer carrying a portable PC and a special test jig in his tool box. The PC runs special service software which can interrogate just about every aspect of the DASH machine, checking head hours, error rates, remote ports, sampler card and so on. With the aid of its human assistant it can even align the heads and tape tension.

Current significance

The current significance of DASH is as a machine that can record onto a relatively cheap archivable medium, with confidence that tapes will be replayable after many years. Also, when an analog project is recorded on twin 24-track recorders, it is often considered more convenient for editing to copy the tapes to a Sony 3348. The single 3348 is far faster and more responsive than synchronized analog machines, making the mixing process faster and smoother.


MDM

The original modular digital multitrack was the Alesis ADAT (below left). On its introduction it was considered a triumph of engineering to an affordable price point. The ADAT (Alesis Digital Audio Tape) was closely followed by the Tascam DTRS (Digital Tape Recording System) format (below right).

Alesis ADAT-XT

Tascam DA98-HR

There are certain similarities:

• Both formats are capable of 8 tracks.

• Multiple machines can be easily synchronized to give more tracks.

• Recordings are made on commonly available video tapes: ADAT takes S-VHS tapes, DTRS takes Hi-8.

• Tapes need to be formatted before use. Formatting can take place during recording, but this is only appropriate when a continuous recording is to be made for the entire duration of the tape.

• Very maintenance-intensive. For a 24-track system, four machines (4 x 8 = 32) are necessary to account for the one that will always be on the repair bench.

• High resolution versions available (ADAT 20-bit, DTRS 24-bit, 96 kHz, 192 kHz, with reduced track count).

The differences are these:

• Maximum record time: ADAT - 60 minutes, DTRS - 108 minutes

• ADAT popular in budget music recording studios

• DTRS popular in broadcast and film post-production

One further difference is that it is probably fair to say that the ADAT has reached the end of its product life-cycle, although there are undoubtedly still plenty of them around and in use. DTRS however is still useful as a tape-based system offering a standard format and cheap storage.


Check Questions

• Was DAT originally intended as a professional or a domestic recording medium?

• What is the sampling rate of standard DAT?

• What is the resolution of standard DAT?

• What is 'azimuth recording'?

• Describe the head wheel in a DAT recorder.

• What is SCMS?

• What is the distinguishing feature of a DAT machine capable of near-simultaneous off-tape monitoring?

• What is the sub-code area of the DAT tape used for?

• What is 'interleaving'?

• What is the width of the tape used for 24-track DASH?

• What is the width of the tape used for 48-track DASH?

• Describe how 24-track and 48-track DASH machines are compatible.

• How are DASH tapes edited?

• In DASH, why does a playback head come before the record head in the tape path?

• Comment on the cleaning requirements of DASH.

• How many tracks does a modular digital multitrack (MDM) have?

• How can more tracks be obtained?

• Comment on the types of usage of ADAT and DTRS machines.


Appendix 1: Sound System Parameters

Level

A large part of sound engineering involves adjusting signal level: finding the right level or finding the right blend of levels. The level of a real sound traveling in air can be measured in µN/m² (or µPa, micropascals, if you prefer), or more practically in dB SPL, where 0 dB SPL corresponds to 20 µN/m². The level of a signal in electrical form can be measured in volts, naturally, or it can be measured in dB. The problem is that decibels are always a comparison between two levels. For acoustic sounds, the dB SPL works by comparing a sound level with the reference level 20 µN/m² (the threshold of hearing). Therefore we need a reference level that works for voltage.

Going back in history, early telecommunication engineers were interested in the power that they could transmit over a telephone line. They decided upon a standard reference level for power, which was 1 mW (1 milliwatt, or one thousandth of a watt). This was subsequently called 0 dBm. The ‘m’ simply indicates that any measurement in dBm is referenced to 1 mW. Today in audio circuitry, we are not too concerned about power except at the final end product – the output of the power amplifier into the loudspeaker. For the rest of the time we can happily measure signal level in voltage. Going back into history again, standard telephone lines had a characteristic impedance of 600 ohms. (‘Characteristic impedance’ is a term hardly ever used in audio so explanation here will be omitted). The relationship between power, voltage and impedance is: P = V²/R. Working out the math we find that a power of 1 mW delivered via a 600 ohm line develops a voltage of 0.775 V. This became the standard reference level of electrical voltage, and it is still in use today.
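The 0.775 V figure can be checked by rearranging P = V²/R to V = √(P × R). The following is a minimal sketch of that arithmetic; the function name is only illustrative.

```python
import math


def voltage_for_power(power_w, impedance_ohm):
    """Rearranges P = V^2 / R to give V = sqrt(P * R)."""
    return math.sqrt(power_w * impedance_ohm)


v_ref = voltage_for_power(0.001, 600.0)  # 1 mW into 600 ohms
print(round(v_ref, 3))  # 0.775
```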

There is a slight problem here. Over the years it became customary to refer to a voltage of 0.775 V as 0 dBm. This is not wholly correct. It is only true when the impedance is 600 ohms, which is not necessarily the case in audio circuitry. Despite this, any reference you find to 0 dBm, in practice, means 0.775 V regardless of what the impedance is.


Technical sound engineers abhor inconsistencies like this, so a new unit was invented: dBu, where 0 dBu is 0.775 V, without any reference to impedance. Once again, the ‘u’ doesn't stand for anything. ‘dBu’ is sometimes written ‘dBv’ (note lower case ‘v’). Confusingly there is also another reference: dBV (note upper case ‘V’), where 0 dBV is 1 volt. In summary:

0 dBm = 1 mW

0 dBu = 0.775 V

0 dBv = 0.775 V

0 dBV = 1 V
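To move between volts and these decibel references, the usual voltage relationship dB = 20 × log10(V / Vref) applies. The sketch below uses the reference voltages listed above; the helper names are hypothetical.

```python
import math

DBU_REF_V = 0.775  # 0 dBu (and 0 dBv), as listed above
DBV_REF_V = 1.0    # 0 dBV


def volts_to_db(volts, ref_v):
    """Level in dB of a voltage relative to a reference voltage."""
    return 20.0 * math.log10(volts / ref_v)


def db_to_volts(db, ref_v):
    """Voltage corresponding to a level in dB above a reference voltage."""
    return ref_v * 10.0 ** (db / 20.0)


one_volt_in_dbu = volts_to_db(1.0, DBU_REF_V)
print(round(one_volt_in_dbu, 2))  # about 2.21, so 0 dBV is roughly +2.2 dBu
```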

There are more:

dBr is a measurement in decibels with an arbitrary quoted reference level

dBFS is a measurement in decibels where the reference level is the full level possible in a specific item of digital audio equipment. 0 dBFS is the maximum level and any measurement must necessarily be negative, for example –20 dBFS.

All of the above (with the exception of dBFS) refer to electrical levels. We also need levels for magnetic tape and other media. Analog recording on magnetic media is still commonplace in top level music recording, and outside of the developed countries of the world. Magnetic level is measured in nWb/m (nanowebers per meter). ‘Nano’ is the prefix meaning ‘one thousandth of a millionth’. The weber (Wb) is the unit of magnetic flux. Wb/m, as used for tape, expresses magnetic flux per meter of track width, and is often loosely called ‘flux density’. Wilhelm Weber the person (pronounced with a ‘v’ sound in Europe, with a ‘w’ sound in North America), by the way, is to magnetism what Alessandro Volta is to electricity.

There are a number of magnetic reference levels in common use. Ampex level, named for the company that developed the tape recorder from German prototypes after World War II, is 185 nWb/m. NAB (National Association of Broadcasters, in the USA) level is 200 nWb/m. DIN (Deutsche Industrie Normen, in Europe) level is 320 nWb/m. In summary:


Ampex level: 185 nWb/m

NAB level: 200 nWb/m

DIN level: 320 nWb/m

It’s worth noting that none of these reference levels is better than any other, but NAB and DIN are the most used in North America and Europe respectively.

Operating Level

An extension of the concept of level is operating level. This is the level around which you would expect your material to peak. Much of the time the actual level of your signal will be lower, sometimes higher. It’s just a figure to keep in mind as the roughly correct level for your signal. In electrical terms, the standard operating level of professional equipment is 0 dBu. There is also a semi-professional operating level of –10 dBV. This does cause some difficulty when fully professional and semi-professional equipment is combined within the same system. Either you have to keep a close eye on level and resign yourself to making corrections often, according to what combination of equipment you happen to be using, or you have to buy a converter unit that will bring semi-pro level up to pro level.
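The size of the mismatch between the two operating levels can be worked out by converting both to volts. This sketch compares –10 dBV with 0 dBu, and also with the +4 dBu alignment level mentioned elsewhere in this appendix. The function names are assumptions made for illustration.

```python
import math


def dbu_to_volts(dbu):
    """Voltage for a level in dBu (0 dBu = 0.775 V)."""
    return 0.775 * 10.0 ** (dbu / 20.0)


def dbv_to_volts(dbv):
    """Voltage for a level in dBV (0 dBV = 1 V)."""
    return 1.0 * 10.0 ** (dbv / 20.0)


def gap_db(pro_dbu, semi_pro_dbv):
    """Difference in dB between a pro level (dBu) and a semi-pro level (dBV)."""
    return 20.0 * math.log10(dbu_to_volts(pro_dbu) / dbv_to_volts(semi_pro_dbv))


gap_from_0_dbu = gap_db(0.0, -10.0)
gap_from_plus4_dbu = gap_db(4.0, -10.0)
print(round(gap_from_0_dbu, 1), round(gap_from_plus4_dbu, 1))  # 7.8 11.8
```

In other words, the gain that a converter unit has to make up is roughly 8 dB or 12 dB, depending on which professional figure is taken as the target.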

Magnetic tape also has a standard ‘operating level’ - several of them in fact. To simplify a little, since analog magnetic tape is now a minority medium, albeit an important minority: in a studio where VU meters are used, it is common to align the VU meters so that 0 VU equals +4 dBu. Tape recorders would be aligned so that a tone at 200 nWb/m gives a reading of 0 VU. In short:

200 nWb/m on tape normally equates to +4 dBu and 0 VU

Most brands of tape can give good clean sound up to 8 dB above 200 nWb/m and even beyond, although distortion increases considerably beyond that.

Digital equipment also has an ‘operating level’, of sorts. In some studios - mainly broadcast - digital recorders such as DAT are aligned so that –18 dBFS (18 dB below maximum level) is equivalent to +4 dBu and 0 VU. This certainly allows plenty of headroom (see later), but it doesn’t fully exploit the dynamic range of DAT. Most people who record digitally record right up to the highest level they think they can get away with without risk of red lights or ‘overs’.

Gain

Gain refers to an increase or decrease in level and is measured in dB. Since gain refers to both the signal level before gain was applied, and signal level after gain is applied, the function of the decibel as a comparison between two levels holds good. The signal level from a microphone could be around 1 mV, for instance. Apply a gain of 60 dB and it is multiplied by a thousand giving around 1 V – enough for the mixing console to munch on. Suppose the signal then needed to be made smaller, or attenuated, then a gain of –20 dB would bring it down to around 100 mV. Some engineers find it fun to play around with these numbers. Your degree of fluency in the numbers part of decibels depends on whether you want to be a technical expert, or just concentrate on the audio. There is work available for both types of engineer.
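The microphone example above can be reproduced with the voltage form of the decibel, where a gain of G dB multiplies the voltage by 10^(G/20). A minimal sketch, with illustrative names:

```python
def apply_gain(level_v, gain_db):
    """Voltage after applying a gain in dB (negative values attenuate)."""
    return level_v * 10.0 ** (gain_db / 20.0)


mic_v = 0.001                                 # around 1 mV from the microphone
after_preamp = apply_gain(mic_v, 60.0)        # 60 dB multiplies by 1000
after_pad = apply_gain(after_preamp, -20.0)   # -20 dB divides by 10
print(round(after_preamp, 3), round(after_pad, 3))  # 1.0 0.1
```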

The need to make a signal bigger or smaller is fairly easy to understand, but what about making it stay the same level? What kind of gain is this? The answer is ‘unity gain’ and it is a surprisingly useful concept. Unity gain implies a change in level of 0 dB. In the analog era it was important to align a recorder so that whatever level you put in on record, you got that same level out on replay. Then, apart from being spared changes in level between record and playback, you could do things like copy tapes, edit bits and pieces together and the level wouldn’t jump. If you hadn't aligned your machines to unity gain then the levels would be all over the place. With digital equipment, it is actually the norm for digital input and output to be of the same level, so unity gain – in the digital domain at least – tends to happen automatically.


RMS and Peak Levels

How do you measure the level of an AC (alternating current) waveform? Or to put it another way, how do you measure the level of an AC waveform meaningfully? A simple peak-to-peak measurement, or peak measurement, shows the height (or amplitude) of the waveform, but it doesn't necessarily tell you how much subjective loudness potential the waveform contains. A very ‘peaky’ waveform (or a waveform with a high crest factor, as we say) might have strong peaks, but it will not tend to sound very loud. A waveform with lower peaks, but greater area between the line and the x-axis of the graph will tend to sound louder on delivery to the listener. The most meaningful measurement of level is the root-mean-square technique. Cutting out all the math, the RMS measurement tells you the equivalent ‘heating’ capability of a signal. A waveform of level 100 Vrms would bring an electric fire element to the same temperature as a direct (DC) voltage of 100 V. A waveform of level 100 V peak-to-peak would be significantly less warm.
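For readers who do want the math, RMS is literally the square root of the mean of the squared sample values. The sketch below assumes a sine wave (the text does not specify a waveform) and shows how much lower the RMS value of a 100 V peak-to-peak signal is. The crest factor here is the ratio of peak to RMS.

```python
import math


def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def crest_factor(samples):
    """Ratio of the largest absolute sample value to the RMS value."""
    return max(abs(s) for s in samples) / rms(samples)


# One full cycle of a sine wave measuring 100 V peak-to-peak (50 V peak).
n = 1000
sine_100v_pp = [50.0 * math.sin(2 * math.pi * i / n) for i in range(n)]

sine_rms = rms(sine_100v_pp)
heating_vs_100v_dc = (sine_rms / 100.0) ** 2  # power is proportional to V squared
print(round(sine_rms, 2), round(heating_vs_100v_dc, 3))  # 35.36 0.125
```

Under that assumption the 100 V peak-to-peak sine delivers only about one eighth of the heating power of 100 V DC into the same element.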

Frequency Response

It is generally accepted that the range of human hearing, taking into account a selection of real live humans of various ages, is 20 Hz to 20 kHz, and sound equipment must be able to accommodate this. It is not however sufficient to quote a frequency range. It is necessary to quote a frequency response, which is rather different. In addition, we are not looking for any old frequency response, we are looking for a ‘flat frequency response’ which means that the equipment in question responds to all frequencies, within its limits, equally and any deviations from an equal response are defined. The correct way to describe the frequency response of a piece of equipment is this:

20 Hz to 20 kHz +0 dB/-2 dB

or this:

20 Hz to 20 kHz ±1 dB

Of course the actual numbers are just examples, but the concept of defining the allowable bounds of deviation from ruler-flatness is the key.


Q

Q is used in a variety of ways in electronics and audio but probably the most significant is as a measure of the ‘sharpness’ of a filter or equalizer. For example, an equalizer could be set to boost a range of frequencies around 1 kHz. A high Q would mean that only a narrow band of frequencies around the center frequency is affected. A low Q would mean that a wide range of frequencies is affected. Q is calculated thus:

Q = f0/(f2 - f1) where f0 is the center frequency of the band, and f2 and f1 are the upper and lower frequencies at which the response is 3 dB down with respect to f0.

It may be evident from this that Q is a ratio and has no units. Q doesn't stand for anything either, it’s just a letter. Whether you need to use a low Q setting or a high Q setting depends on the nature of the problem you want to solve. If there is a troublesome frequency, for example acoustic guitars sometimes have an irritating resonance somewhere around 150 Hz to 200 Hz, then a high Q setting of 4 or 5 will allow you to home in on the exact frequency and deal with it without affecting surrounding frequencies too much. If it is more a matter of shaping the spectrum of a sound to improve it or allow it to blend better with other signals, then a low Q of perhaps 0.3 would be more appropriate. The range of Q in common use in audio is from 0.1 up to around 10, although specialist devices such as feedback suppressors can vastly exceed this.
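To get a feel for the numbers, the Q formula can be turned around to give the bandwidth for a chosen center frequency and Q. The sketch below assumes a guitar resonance at 175 Hz, a figure picked from the middle of the range mentioned above purely for illustration.

```python
def q_factor(f_center, f_low, f_high):
    """Q = f0 / (f2 - f1), using the 3 dB-down frequencies f1 and f2."""
    return f_center / (f_high - f_low)


def bandwidth_hz(f_center, q):
    """Width of the affected band (f2 - f1) for a given center frequency and Q."""
    return f_center / q


narrow = bandwidth_hz(175.0, 5.0)  # high Q: homes in on the resonance
wide = bandwidth_hz(175.0, 0.3)    # low Q: broad shaping of the spectrum
print(narrow, round(wide, 1))      # 35.0 583.3
```

A Q of 5 affects a band only about 35 Hz wide, whereas a Q of 0.3 at the same center frequency spreads over several hundred hertz.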


Noise

Noise can be described as unwanted sound, or alternatively as a non-meaningful component of a sound. Noise occurs naturally in acoustics, even in the quietest settings. Air molecules are in constant motion at any temperature above absolute zero and since sound is nothing more than the motion of air molecules, the random intrinsic motion must produce sound - sound of a very low level, but sound none the less. We are not generally aware of this source of noise, but some microphones are. A microphone with a large diaphragm will have many molecules impinging on its surface, and the random motion of the molecules will tend to average out and be insignificant in comparison with the wanted signal. A microphone with a small diaphragm however (such as a clip-on mic) will only be in contact with comparatively few air molecules so the averaging effect will be less and the noise higher in level in comparison with the wanted signal.

When sound is converted to an electrical signal, the signal is carried by electrons. Once again, electrons are in constant random motion causing what is called Johnson noise. If the signal is carried by a large current (in a low impedance circuit), then Johnson noise can be insignificant. If the signal is carried by only a small current with relatively few electrons (in a high impedance circuit), then the noise level can be much higher. We can extend this concept to any medium that can carry or store a sound signal. Noise is caused by variations in the consistency of the medium. One more example would be a vinyl record groove. The signal is stored as undulations in the groove, but any irregularities such as dust or scratches translate into noise on playback.

Digital audio systems are not immune to noise. When a signal is converted to digital form, it is analyzed into a certain number of levels, 65,536 in the compact disc format for example. Of course, most of the time the original signal will fall between levels, therefore the analysis is only an approximation. The inaccuracies necessarily produced are termed quantization noise.

Signal to Noise Ratio

Signal to noise ratio is one measure of how noisy a piece of equipment is. We said earlier that a common operating level is +4 dBu. If all signal were removed and the noise level at the output of the console measured, we might obtain a reading somewhere around –80 dBu. This would mean that the signal to noise ratio is 84 dB. In analog equipment, a signal to noise ratio of 80 dB or more is considered good. The worst piece of equipment as far as noise is concerned is the analog tape recorder, which can only turn in a signal to noise ratio of around 65 dB. The noise is quite audible behind low-level signals. Outside of the professional domain, a compact cassette recorder without noise reduction can only manage around 45 dB. This is only adequate when used for information content only, for instance in a dictation machine, or for music which is loud all the time and therefore masks the noise.

As we said, digital equipment suffers from noise too. Quantization noise is more grainy in comparison to analog noise and therefore subjectively more annoying. Digital equipment requires a better signal to noise ratio. In basic terms, the signal to noise ratio of any digital system can be calculated by multiplying the number of bits by six. So the compact disc format with a resolution of 16 bits has a signal to noise ratio of 16 x 6 = 96 dB, if all other parts of the system are optimized. Currently the professional standard is moving to 24-bit resolution, therefore the theoretical signal to noise ratio would be 24 x 6 = 144 dB. This is actually greater than the useful dynamic range of the human ear, but in practice this idealized figure is never attained.
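The 'six times the number of bits' rule comes from the fact that each extra bit doubles the number of levels, and a doubling of voltage is about 6.02 dB. The sketch below compares the rule of thumb with 20 × log10 of the number of levels. It is a simplified view that ignores refinements such as dither and the exact shape of the test signal.

```python
import math


def rule_of_thumb_snr_db(bits):
    """The 'bits times six' approximation used in the text."""
    return 6 * bits


def level_ratio_snr_db(bits):
    """20*log10 of the number of quantization levels (2**bits)."""
    return 20.0 * math.log10(2 ** bits)


for bits in (16, 24):
    print(bits, 2 ** bits, rule_of_thumb_snr_db(bits),
          round(level_ratio_snr_db(bits), 1))
# 16 65536 96 96.3
# 24 16777216 144 144.5
```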


Another way of measuring the noise performance of equipment is EIN or Equivalent Input Noise, and this is mainly of relevance to microphone preamplifiers. An example spec might be 'EIN at 70 dB gain: -125 dBu (200 ohm source)'. This means that the gain control was set to 70 dB and the noise measured at the output of the mic preamp - in this case the measurement would be –55 dBu. When the set amount of gain is subtracted from this we get the amount of noise that would have to be present at the input of a noiseless mic amp to give the same result. The '200 ohm source' bit is necessary to make the measurement meaningful. If the EIN figure does not give the source impedance, then I am afraid the measurement is useless. Perhaps it is giving the game away to say that the reason a gain of 70 dB is quoted is because mic preamps normally give their optimum EIN figures at a fairly high gain. The lower the gain at which a manufacturer dare quote the EIN, the better the mic input circuit.
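The EIN arithmetic is a simple subtraction, sketched below. The second function is an addition not taken from the text: it estimates the Johnson noise of the source resistance itself using the standard formula √(4kTRB), with an assumed temperature of 290 K and an assumed 20 kHz measurement bandwidth, to suggest why the source impedance has to be quoted.

```python
import math


def equivalent_input_noise_dbu(output_noise_dbu, gain_db):
    """EIN: noise measured at the output with the preamp gain subtracted."""
    return output_noise_dbu - gain_db


def thermal_noise_dbu(resistance_ohm, bandwidth_hz=20000.0, temp_k=290.0):
    """Johnson noise voltage of a resistor, sqrt(4kTRB), expressed in dBu.
    The bandwidth and temperature defaults are assumptions."""
    k = 1.380649e-23  # Boltzmann constant, J/K
    v_noise = math.sqrt(4.0 * k * temp_k * resistance_ohm * bandwidth_hz)
    return 20.0 * math.log10(v_noise / 0.775)


ein = equivalent_input_noise_dbu(-55.0, 70.0)
source_floor = thermal_noise_dbu(200.0)
print(ein, round(source_floor, 1))  # -125.0 -129.7
```

Under these assumptions a 200 ohm source alone contributes roughly –130 dBu, so the –125 dBu example is within a few dB of what the source resistance allows. A different source impedance would shift that floor, which is why an EIN figure quoted without one cannot be compared.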

Modulation Noise

Noise as discussed above is a steady-state phenomenon. It is annoying, but the ear has a way of tuning out sounds that don’t change. However, there is another type of noise that constantly changes in level, and that is modulation noise. One source of modulation noise is that which occurs in analog tape recorders. The effect is that as the signal level changes, the noise level changes. This can be irritating when the signal is such that it doesn't adequately mask the noise. A low frequency signal with few higher harmonics is probably the worst case and will demonstrate modulation noise quite clearly. Noise reduction systems, as mainly used in analog recording, also have the effect of creating modulation noise. Noise reduction systems work by bringing up the level of low-level signals before they are recorded, and reducing the level again on playback – at the same time reducing the level of tape noise. Unfortunately, the noise level is now in a state of constant change, thereby drawing attention to itself. Some noise reduction systems have means of minimizing this effect. All of the various Dolby systems, for example, work well when properly aligned.

Quantization noise in digital systems is also a form of modulation noise. At very low signal levels it is sometimes possible to hear the noise level going up and down with the signal.


Where you are most likely to hear modulation noise is on a so-called Hi-Fi VHS video recorder. The discontinuous nature of the audio track causes a low frequency fluttering noise which requires noise reduction to minimize. On some machines, this noise reduction is not wholly effective and the modulation noise created can be very irritating.

It is worth saying that signal to noise ratio should be measured with any noise reduction switched out, otherwise the comparison between peak or operating level and the artificially lowered noise floor when signal is absent gives an unfairly advantageous figure unrepresentative of the subjective sound quality of the equipment in question.

Distortion

Unfortunately, any item of sound equipment 'bends' or distorts the sound waveform to a greater or lesser extent. This produces, from any given input frequency, additional unwanted frequencies. Usually, distortion is measured as a percentage. For a mixing console or an amplifier, anything less than 0.1% is normally considered quite adequate, although once again it's the analog tape recorder that lets us down with distortion figures of anything up to 1% and above.

Distortion normally comes in two varieties: harmonic distortion and intermodulation distortion. Looking at the harmonic kind first, suppose you input a 1 kHz tone into a system. From the output you will get not only that 1 kHz tone but also a measure of 2 kHz, 3 kHz, 4 kHz and so on. Harmonic distortion always comes in whole-number multiples of the incoming frequency - rather like musical harmonics in fact. This is why distortion is sometimes desirable as an effect - it enhances musical qualities, used with taste and control of course.


Sine wave - the simplest possible sound with no harmonics

The effect of even-order harmonic distortion on a sine wave


The effect of odd-order harmonic distortion on a sine wave

Intermodulation distortion is not so musical in its effect. This is where two frequencies combine together in such a way as to create extra frequencies that are not musically related. For instance, if you input two frequencies, 1000 Hz and 1100 Hz, then intermodulation will produce sum and difference frequencies – 2100 Hz and 100 Hz.
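The sum and difference products in this example, and the fact that neither is a harmonic of either input tone, can be checked with a few lines. This sketch covers only the simple sum and difference terms described above. The function names are illustrative.

```python
def sum_and_difference(f1_hz, f2_hz):
    """Sum and difference frequencies produced by two input tones."""
    return {"sum": f1_hz + f2_hz, "difference": abs(f1_hz - f2_hz)}


def is_harmonic_of(freq_hz, fundamental_hz):
    """True if freq_hz is a whole-number multiple of fundamental_hz."""
    ratio = freq_hz / fundamental_hz
    return ratio >= 1 and abs(ratio - round(ratio)) < 1e-9


products = sum_and_difference(1000, 1100)
print(products)  # {'sum': 2100, 'difference': 100}

for name, freq in products.items():
    related = any(is_harmonic_of(freq, f) for f in (1000, 1100))
    print(name, freq, "harmonic of an input tone:", related)
```

By contrast, `is_harmonic_of(3000, 1000)` is true, which is the harmonic distortion case described earlier.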

A third form of distortion is clipping. This is where a signal ‘attempts’ to exceed the level boundaries imposed by the voltage limits of a piece of equipment. In modern circuit designs the peaks of the waveform are flattened off causing a rather unpleasant sound. In vintage equipment the peaks can be rounded off, or strange things can happen such as the signal completely disappearing for a second or two.


Crosstalk

Crosstalk is defined as a leakage of signal from one signal path to another. For instance, if you have cymbals or hihat on one channel of your mixing console and you find they are leaking through to the adjacent channel, then you have a crosstalk problem. Crosstalk can consist of the full range of audio frequencies, in which case there is a resistive path causing the leakage. More often crosstalk is predominantly higher frequencies, which jump from one circuit track to another through capacitance. In analog tape recorders, an effect known as fringing allows low frequencies to leak into adjacent tracks on replay. The worst problem caused by crosstalk is when timecode leaks from its allocated track or channel into another signal path. Timecode – used to synchronize audio and video machines – is an audio signal which to the ear sounds like a very unpleasant screech. It only takes a little crosstalk to allow timecode to become audible.

Headroom

I have already mentioned the concept of operating level which is the 'round about' preferred level in a studio. This would typically be +4 dBu in a professional studio. But above operating level there needs to be a certain amount of headroom before the onset of clipping. This is most important in a mixing console where the level of each individual signal can vary considerably due to: 1) less than optimal setting of the gain control, 2) gain due to EQ, or perhaps 3) unexpected enthusiasm on the part of a musician. Also, when signals are mixed together, the resulting level isn't always predictable. Professional equipment can handle levels up to +20 dBu or +26 dBu, therefore there is always plenty of headroom to play with. Of course, the more headroom you allow, the worse the signal to noise ratio, so it is always something of a compromise.
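Headroom is simply the gap in dB between the operating level and the maximum level. The sketch below works that out for the figures above and also illustrates why mixed levels are hard to predict. The assumption here is that identical signals add as voltages while unrelated signals add as powers, which is standard decibel arithmetic rather than something stated in the text.

```python
import math


def headroom_db(max_level_dbu, operating_level_dbu):
    """Headroom between the operating level and the maximum level."""
    return max_level_dbu - operating_level_dbu


def summed_level_change_db(n_signals, correlated):
    """Level increase when n equal-level signals are mixed.
    Identical (correlated) signals add as voltages;
    unrelated (uncorrelated) signals add as powers."""
    if correlated:
        return 20.0 * math.log10(n_signals)
    return 10.0 * math.log10(n_signals)


print(headroom_db(20.0, 4.0), headroom_db(26.0, 4.0))  # 16.0 22.0
print(round(summed_level_change_db(2, True), 1),       # 6.0
      round(summed_level_change_db(2, False), 1))      # 3.0
```

Real program material sits somewhere between these two cases, which is one reason a mix bus needs generous headroom.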

In recording systems, it is common to reduce headroom to little or zero. The recording system is at the end of the signal chain and there are fewer variables. Nevertheless, it does depend on the nature of the signal source. If it is a stereo mix from a multitrack recording, then the levels are known and easily controllable, therefore hardly any headroom is required. If it is a recording of live musicians in a concert setting, then much more headroom must be allowed because of the more unpredictable level of the signal, and also because there isn't likely to be a second chance if clipping occurs.

Wow and Flutter

The era of wow and flutter is probably coming to an end, but it hasn't quite got there yet so we need some explanation. Wow and flutter are both caused by irregularities in mechanical components of analog equipment such as tape recorders and record players. Wow causes a long-term cyclic variation in pitch that is audible as such. Flutter is a faster cyclic variation in pitch that is too fast to be perceived as a rise and fall in pitch. Wow is just plain unpleasant. You will hear it most often, and at its worst, on old-style juke boxes that still use vinyl records. Flutter causes a ‘dirtying’ of the sound, which used to be thought of as wholly unwelcome. Now, when we can have flutter-free digital equipment any time we want it, old-style analog tape recorders that inevitably suffer from flutter to some extent have a characteristic sound quality that is often thought to be desirable. Wow and flutter are measured in percentage, where less than 0.1% is considered good.


Check Questions

• What is meant by '0 dBm'?

• What is meant by '0 dBu'?

• What operating level is commonly used by semi-professional equipment?

• What does the term 'dBFS' mean?

• What level is commonly used as the reference level for analog magnetic tape in North America?

• Which has the greater heating effect: 100 V RMS or 100 V DC?

• What is meant by 'unity gain'?

• Why is it not acceptable to quote the frequency response of a piece of equipment as '20 Hz to 20 kHz'?

• What is meant by 'signal to noise ratio'?

• What is meant by 'EIN'?

• What is modulation noise?

• What is harmonic distortion?

• What is intermodulation distortion?

• What is clipping?

• What is headroom?