Microchips 2ed

Microchips

a simple introduction

Second Edition

Sitaramarao S. Yechuri, Ph.D.

ISBN 0-9741037-1-3

Library of Congress Control Number: 2004093110

Printed June, 2004

Copyright © 2004 by Yechuri Software, Arlington, TX.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior written permission of the publisher. Printed in the United States of America.

Introduction

Integrated circuit chips made of Silicon have undeniably transformed our world and the biggest change happened in just 35 years. The first Germanium (Ge) transistor was built in Bell Laboratories in 1947 by Walter Brattain and John Bardeen. The first integrated circuit was built at Texas Instruments in 1958 by Jack Kilby. Nowadays nearly all micro-chips are made of Silicon (Si).

No one can predict the future and mankind has developed many technologies that fizzled out or just never became very popular. Some technologies developed slower than others. Two hugely important industries of our times, namely the automobile industry and the semiconductor industry, have displayed different behavior.

Automobile technology started in 1889 and is still evolving slowly. It is an industry with a huge inertia that limits how quickly it can evolve. It is capital intensive and labor intensive and nowadays the profit margin is not that high due to robust competition. And in some ways it has not changed that much.

The efficiency of today's cars is not more than double that of the cars of the 1930's and our cars today still use gasoline and still use an overhead cam-shaft to regulate the valves. Cars have become lighter, but people still drive a 2000 lb car to transport a single 150 lb person. Top speeds for the cars of the 1940's were easily 100 mph, and even today cars are built to run at no more than 100 mph and practical speeds on the roads do not exceed 70 mph.

Semiconductors on the other hand grew very rapidly in the 1990's and matured very quickly indeed. The metric used is the length of the gate of the transistors used. In 1965 Gordon Moore, who later co-founded Intel Corporation, predicted that the transistor density would basically double every year (a rate he later revised to a doubling every two years) and so far the prediction has been quite accurate. In fact it is almost a business prediction as much as a technology prediction in that consumers have grown to expect the new computers to become faster every few months and they still expect to pay only as much as they did before. In fact it is common for consumers to put off purchasing electronics until just before they need them because they believe that tomorrow everything will be a little cheaper and faster.

Another important feature of the micro-chip industry is that in a sense it did accelerate its own growth. What I mean is this. Up to 1970 most technology was developed with pen and paper. It was analytical. It is a very, very sad fact that analytical techniques are virtually unused today except by very few technical people. The pocket calculator was the first step in increasing the speed of chip design because it allowed chip designers to calculate transistor sizes and bias points to several decimal places of accuracy instantly. The early TI programmable calculators had a slot through which you passed a magnetic strip containing instructions you had coded previously; these were read in and the calculator was then ready to perform a sequence of calculations rather than just one.

Chip designers were initially circuit designers and did all their design work with pencil and paper and a calculator. At that time most chips were analog in function. But by the 1990's computers started to take on the weight of chip design and chip designers became little more than programmers. By then most chips were digital in nature. And this process started to feed on itself, i.e., the improvement in computer speed allowed better and faster chip design software which in turn allowed better and faster chips to be designed and so on.

In a sense the chip shrink process was on a glide-path because the minimum feature size of the micro-chips was dictated by the wavelength of light used to define them. The makers of chip manufacturing equipment just used shorter and shorter wavelengths to define the features and it seemed the juggernaut would never stop.

But the juggernaut is slowing down because the wavelength of the light needed to define the features has become so small that the energy of the photons (which is inversely proportional to the wavelength) is now that of an X-ray. At such a high energy there are few photon-sensitive materials which can respond to it.

Another factor which is causing a problem is the gate oxide thickness which has already been reduced to no more than five layers of atoms. Besides these two factors the FETs made at very small dimensions are not delivering the behavior needed to properly design integrated circuits.

At the time of this writing 0.09 µ is the cutting-edge of the semiconductor processes worldwide and it is this author's opinion that the 0.06 µ generation which we will attain by 2006 or a subsequent 0.05 µ generation may well be a stable point at which the industry starts to become commoditized and prices are driven to the minimum and when applications become the main focus. This happened with the automobile industry and it will probably happen with the chip industry.

Keep in mind that even well known industry experts don't agree on how much further silicon based chips can be shrunk and many of these experts have a vested interest in persuading the public that newer, faster and cheaper technologies are just around the corner and that you should invest your money in the leading semiconductor companies even at high P/E ratios. When you read about newer technologies, the key question you should ask is not whether they are feasible but whether they can be made cheaper than existing technology.

My belief is that chips with many layers of circuitry stacked one on top of the other offer the key to higher density. To make this a reality I believe that techniques that are additive like epitaxy or chemical vapor deposition need to become much more cost effective, which could happen if the volume of usage were increased. They can also be done at lower temperatures, which is necessary to keep the lowest circuit levels functional, and finally there needs to be a way to sandwich passive heat sinks between the layers to suck the heat out because otherwise the middle levels will burn up.


Contents

1 Passive circuits
    1.1 The three passive lumped elements
        1.1.1 Resistance
        1.1.2 Capacitance
        1.1.3 Inductance
        1.1.4 Impedance
    1.2 Basic circuit laws
        1.2.1 Ohm's law
        1.2.2 Kirchhoff's laws
        1.2.3 Y ∆ transformation
        1.2.4 Mesh equations and node equations
        1.2.5 Thevenin and Norton equivalents
        1.2.6 Maximum power transfer theorem
        1.2.7 Transient analysis using Laplace transforms

2 Active devices - historical
    2.1 Vacuum technology
    2.2 Diode
    2.3 Triode
    2.4 Klystron tube
    2.5 Read diode
    2.6 Gunn diode

3 Semiconductor theory
    3.1 Wave particle duality
    3.2 Schroedinger's time independent wave equation
    3.3 Quantum well
    3.4 Free electron theory
    3.5 Bloch theorem
    3.6 Kronig Penney model
    3.7 Effective mass
    3.8 Fermi-Dirac distribution
    3.9 Poisson's equation
    3.10 Drift and diffusion
    3.11 Haynes-Shockley experiment
    3.12 Continuity equations
    3.13 Band diagrams
    3.14 Impurities

4 Active devices
    4.1 P-N Junction diode
    4.2 Bipolar junction transistor
    4.3 Heterojunction Bipolar Transistor
    4.4 Field-Effect transistor
    4.5 FET small signal equivalent
    4.6 Other transistors
    4.7 Shrink problems at 0.06µ and below
        4.7.1 The premise of the shrink
        4.7.2 Vth (Threshold voltage)
        4.7.3 St (Sub-threshold swing)
        4.7.4 Tox (Gate oxide thickness)
        4.7.5 Ldiff (Sub-diffusion)
        4.7.6 Gate loading
        4.7.7 Xj (Junction depth)
        4.7.8 ND (Drain doping level)
        4.7.9 Tj (Junction temperature)
        4.7.10 Thermal budget
        4.7.11 Heat generation

5 Process characterization
    5.1 Overview
    5.2 Test equipment used
    5.3 Test circuit layout
    5.4 Measurements
        5.4.1 Drain characteristics
        5.4.2 Gate characteristics
        5.4.3 Back bias
        5.4.4 Collector characteristics
        5.4.5 Diode characteristics
        5.4.6 Reverse characteristics
        5.4.7 S parameter measurement
        5.4.8 C-V measurement
        5.4.9 Thermal behavior
    5.5 Production monitors
    5.6 Scanning Electron Microscopy
    5.7 Striped wafers
    5.8 Noise
    5.9 Process skew
    5.10 Burn in testing
    5.11 Ion implant to create connections
    5.12 Thermal imaging

6 Chip fabrication
    6.1 Wafer preparation
    6.2 Lithography
    6.3 Mask generation
    6.4 Oxide growth
    6.5 Doping
    6.6 Implantation
    6.7 Etching
        6.7.1 Wet etch
        6.7.2 Reactive ion etch
        6.7.3 Reactive ion beam etch
    6.8 Sputtering
    6.9 Polysilicon
    6.10 Sintering
    6.11 Thermal budget constraints

7 Logic circuits
    7.1 Boolean logic
    7.2 Flip-flops
    7.3 The pass gate
    7.4 Karnaugh maps
    7.5 Finite state machines
    7.6 Domino logic

8 Analog circuits
    8.1 Current mirror
    8.2 Current sources
    8.3 Active load
    8.4 Level shifting
    8.5 Common emitter/source amplifier
    8.6 DC gain
    8.7 Emitter/Source follower input
    8.8 Bootstrapping
    8.9 Miller's theorem
    8.10 Gain bandwidth product
    8.11 Voltage reference
    8.12 Differential circuits
    8.13 Transistor matching
    8.14 Bode plots
    8.15 Routh's stability criteria
    8.16 Nyquist path
    8.17 Sample and Hold circuit
    8.18 Analog to digital conversion
    8.19 Digital to analog conversion
    8.20 Low power circuits
    8.21 Laser trimming and other techniques

9 Microprocessors
    9.1 Binary number system
        9.1.1 Integers
        9.1.2 Floating point numbers
    9.2 µp block diagram
    9.3 Arithmetic logic unit
        9.3.1 Addition and subtraction
        9.3.2 Multiplication
        9.3.3 Division
    9.4 Shift register
    9.5 Instructions and operations
    9.6 CISC and RISC
    9.7 The critical path
    9.8 Pipelining
    9.9 Intentional clock skewing
    9.10 Clock trees

10 Phase-Locked Loops
    10.1 Ring oscillator
    10.2 Subsystems
        10.2.1 Voltage controlled oscillator
        10.2.2 Divider
        10.2.3 Phase comparator
        10.2.4 Loop filter
    10.3 Loop operation
    10.4 Delay-locked loop
    10.5 Tracking and re-sync PLLs

11 Digital Signal Processors
    11.1 Fourier transform
    11.2 Compression
    11.3 Digital Filters
    11.4 Pattern recognition
    11.5 Error correcting codes
        11.5.1 Reed-Solomon code
        11.5.2 Convolutional coding and Viterbi decoding
    11.6 Motor control

12 I/O circuits and pcb interactions
    12.1 Design consideration
        12.1.1 Capacitive loading
        12.1.2 Transit time
        12.1.3 Line impedance
        12.1.4 Electro-static discharge or ESD
        12.1.5 Line drivers
        12.1.6 Line terminations
        12.1.7 Impedance variation
        12.1.8 Cross coupling
        12.1.9 Antenna effect
        12.1.10 Ground bounce
        12.1.11 Ringing
    12.2 Spread spectrum technology
    12.3 Input/Output or IO circuits

13 Automatic Test Equipment
    13.1 DUT board
    13.2 Main computer
    13.3 Tester boards
    13.4 Pin driver boards
        13.4.1 Timing generation chip
        13.4.2 Pin driver chip

14 MMICs
    14.1 Lumped and distributed elements
    14.2 Maxwell's equations
    14.3 Transmission lines
    14.4 N-port circuits
        14.4.1 h-Parameters, y-Parameters & z-Parameters
        14.4.2 S-Parameters
    14.5 Balun
    14.6 Circulators
    14.7 Impedance transformers and filters

15 Transducers
    15.1 Direct gap
    15.2 Semiconductor lasers
        15.2.1 Edge emitting
        15.2.2 Surface emitting
        15.2.3 Bulk vs. distributed gain
    15.3 Junction detectors
    15.4 Accelerometers

16 Technology CAD
    16.1 Basic numerical techniques
        16.1.1 Differentiation
        16.1.2 Integration
        16.1.3 Interpolation
    16.2 Grid selection
    16.3 Device simulation
    16.4 Fabrication process simulation
    16.5 Monte Carlo analysis

17 Power electronics
    17.1 Alternating current
    17.2 Transformers
    17.3 Rectification
    17.4 DC to AC conversion
    17.5 DC to DC conversion
    17.6 Silicon Controlled Rectifier
    17.7 Power BJTs

Chapter 1

Passive circuits

Electricity is the flow of electric charge, i.e., the movement of electrons.

1.1 The three passive lumped elements

There are only three basic elements, namely resistance, capacitance and inductance. They are linear.

1.1.1 Resistance

Any material that conducts electricity exhibits a resistance. Resistance essentially inhibits the flow of electrons and the energy dissipated is converted into heat. Resistors obey Ohm's law

V = IR (1.1)

So if you apply a voltage with an arbitrary amplitude variation versus time, the current flow has the same relative variation versus time. Two resistances a and b in series result in a resistance of a + b. Two resistances a and b in parallel result in a resistance of $\frac{a \times b}{a + b}$. The unit of resistance is Ω.

1.1.2 Capacitance

Capacitance is a way to store energy in an electric field. Any two conductors of any shape which are not touching each other form a capacitor. For the case of two parallel plates of the same area A spaced d apart in a medium of permittivity ε the capacitance C is given by $C = \frac{\varepsilon \cdot A}{d}$. The energy is stored by electrons accumulating on one plate while an equal number of electrons are missing from the other plate.
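As a rough numerical illustration of the parallel plate formula, the sketch below evaluates C = ε·A/d. The function name, the plate dimensions and the choice of air as the medium (relative permittivity taken as 1) are assumptions made for this example and do not come from the text.

```python
EPSILON_0 = 8.854e-12  # permittivity of free space in F/m


def parallel_plate_capacitance(area_m2, gap_m, relative_permittivity=1.0):
    """Return C = epsilon * A / d in farads, with epsilon = epsilon_0 * epsilon_r."""
    if area_m2 <= 0 or gap_m <= 0:
        raise ValueError("area and gap must be positive")
    return EPSILON_0 * relative_permittivity * area_m2 / gap_m


# Hypothetical plates: 1 cm x 1 cm, spaced 1 mm apart in air.
c = parallel_plate_capacitance(area_m2=1e-4, gap_m=1e-3)
print(f"{c * 1e12:.3f} pF")
```

For these assumed dimensions the result is a little under 1 pF, which gives a feel for how small the capacitance of everyday conductor geometries is. Halving the gap doubles the capacitance.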

Two capacitances a and b in series result in a capacitance of $\frac{a \times b}{a + b}$. Two capacitances a and b in parallel result in a capacitance of a + b. The unit of capacitance is the Farad. If needed, Coulomb's law can be used to obtain the force that acts on the two plates of a capacitor. The electrical relationship we need to use is that the voltage across the plates of the capacitor is given by

$$V(t) = \frac{1}{C}\int I \, dt \tag{1.2}$$

Because the voltage across a capacitor is the integral of the current, a sinusoidal input current results in a sinusoidal voltage which lags the input current by a quarter cycle or 90°.
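To see where the quarter cycle comes from, take a sinusoidal current of amplitude $I_0$ (a symbol introduced here only for illustration) and apply equation 1.2, ignoring the constant of integration:

```latex
I(t) = I_0 \sin(\omega t)
\quad\Longrightarrow\quad
V(t) = \frac{1}{C}\int I_0 \sin(\omega t)\,dt
     = -\frac{I_0}{\omega C}\cos(\omega t)
     = \frac{I_0}{\omega C}\sin\!\left(\omega t - 90^\circ\right)
```

The voltage is again a sinusoid of the same frequency, but its peaks arrive a quarter cycle after the peaks of the current, and its amplitude falls as the frequency rises.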


1.1.3 Inductance

Inductance is a way to store energy in a magnetic flux. Any conductor is an inductor. The Biot-Savart law gives the magnetic field strength due to current flow through a conductor. Now if the conductor is surrounded by material of a high permeability a magnetic flux flows through that material and energy is stored.

Two inductances a and b in series result in an inductance of a + b. Two inductances a and b in parallel result in an inductance of $\frac{a \times b}{a + b}$. The unit of inductance is the Henry. The electrical relationship we need to use is that the current flowing through the inductor is given by

$$I(t) = \frac{1}{L}\int V \, dt \tag{1.3}$$

Because the current through an inductor is the integral of the voltage across it, a sinusoidal voltage results in a sinusoidal current which lags the applied voltage by a quarter cycle or 90°.

1.1.4 Impedance

By the use of the Fourier transform any section of a time domain waveform can be decomposed into its frequency domain constituents. Because inductors and capacitors perform an integral over time of voltage and current respectively, they will respond differently to sinusoidal waveforms of the same amplitude but different frequencies.

For this reason passive circuits containing inductors and capacitors are analyzed as a function of the angular frequency ω = 2 · π · f where f is the frequency in Hz of the constituents of the time domain waveforms.

Impedance is the complex measurement of any combination of the three basic elements. Impedance is denoted as Z(ω). The impedance of a resistance is simply the resistance and is independent of frequency. The impedance offered by a capacitor is −j/ωC. The impedance of an inductor is jωL. So if you have a series combination of resistance, capacitance and inductance the impedance is

$$Z(\omega) = R - \frac{j}{\omega C} + j\omega L \tag{1.4}$$

You can manipulate impedances similarly to resistances. So two impedances in series just result in a new impedance of

Zser(ω) = Z1(ω) + Z2(ω) (1.5)

Two impedances in parallel result in a new impedance of

$$Z_{par}(\omega) = \frac{Z_1(\omega) \times Z_2(\omega)}{Z_1(\omega) + Z_2(\omega)} \tag{1.6}$$
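Python's built-in complex numbers make equations 1.4 to 1.6 easy to try out. The following is a minimal sketch; the function names and the component values (100 Ω, 1 mH, 1 µF, 1 kHz) are assumptions chosen for illustration.

```python
import cmath
import math


def z_resistor(resistance):
    """Impedance of a resistance: real and independent of frequency."""
    return complex(resistance, 0.0)


def z_capacitor(capacitance, omega):
    """Impedance of a capacitor: -j / (omega * C)."""
    return complex(0.0, -1.0 / (omega * capacitance))


def z_inductor(inductance, omega):
    """Impedance of an inductor: j * omega * L."""
    return complex(0.0, omega * inductance)


def series(z1, z2):
    """Equation 1.5: impedances in series add."""
    return z1 + z2


def parallel(z1, z2):
    """Equation 1.6: product over sum."""
    return (z1 * z2) / (z1 + z2)


R, L, C = 100.0, 1e-3, 1e-6           # hypothetical component values
omega = 2 * math.pi * 1000.0          # 1 kHz expressed as an angular frequency

# Equation 1.4: series combination of R, C and L.
z_rlc = series(series(z_resistor(R), z_capacitor(C, omega)), z_inductor(L, omega))
print(f"|Z| = {abs(z_rlc):.1f} ohm, phase = {math.degrees(cmath.phase(z_rlc)):.1f} deg")

# At omega_0 = 1 / sqrt(L * C) the capacitive and inductive terms cancel,
# so the series combination looks like the resistance alone.
omega_0 = 1.0 / math.sqrt(L * C)
z_res = z_resistor(R) + z_capacitor(C, omega_0) + z_inductor(L, omega_0)
print(z_res)
```

At the assumed 1 kHz the capacitor term dominates, so the imaginary part of the series impedance is negative. The same `series` and `parallel` helpers work unchanged for plain resistances, since a real number is just a complex number with zero imaginary part.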

1.2 Basic circuit laws

1.2.1 Ohm’s law

Ohm's law applies to impedances just as it applies to resistances. Since the impedance is a function of frequency, so is the current flow and the voltage dropped across the load impedance. So as before

V (ω) = I(ω) × Z(ω) (1.7)


1.2.2 Kirchhoff's laws

Kirchhoff proposed a current law and a voltage law. Kirchhoff's current law (KCL) states that the sum of all currents into each node in a circuit must be zero, as shown in Figure 1.1. Kirchhoff's voltage law (KVL) states that the voltages summed around a loop must equal zero, as shown in Figure 1.2.

Figure 1.1: Kirchhoff's current law: I1 + I2 + I3 + I4 = 0.

Figure 1.2: Using Kirchhoff's voltage law: V1 + V2 + V3 + V4 = 0.

1.2.3 Y ∆ transformation

Figure 1.3: Y ∆ transformation (Y resistors R1, R2 and R3; ∆ resistors Ra, Rb and Rc).

In Figure 1.3 you can convert from ∆ to Y using equations 1.8, 1.9 and 1.10, and from Y to ∆ using equations 1.11, 1.12 and 1.13.

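$$R_1 = \frac{R_a R_b}{R_a + R_b + R_c} \tag{1.8}$$

$$R_2 = \frac{R_a R_c}{R_a + R_b + R_c} \tag{1.9}$$

$$R_3 = \frac{R_b R_c}{R_a + R_b + R_c} \tag{1.10}$$

$$R_a = \frac{R_1 R_2 + R_2 R_3 + R_1 R_3}{R_3} \tag{1.11}$$

$$R_b = \frac{R_1 R_2 + R_2 R_3 + R_1 R_3}{R_2} \tag{1.12}$$

$$R_c = \frac{R_1 R_2 + R_2 R_3 + R_1 R_3}{R_1} \tag{1.13}$$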
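Equations 1.8 to 1.13 translate directly into two small functions. This is a minimal sketch; the function names and the sample resistor values are assumptions made for illustration, and the argument order follows the symbols used in the equations.

```python
def delta_to_wye(ra, rb, rc):
    """Equations 1.8 to 1.10: return (R1, R2, R3) for delta resistors (Ra, Rb, Rc)."""
    total = ra + rb + rc
    return ra * rb / total, ra * rc / total, rb * rc / total


def wye_to_delta(r1, r2, r3):
    """Equations 1.11 to 1.13: return (Ra, Rb, Rc) for Y resistors (R1, R2, R3)."""
    pair_sum = r1 * r2 + r2 * r3 + r1 * r3
    return pair_sum / r3, pair_sum / r2, pair_sum / r1


# Hypothetical delta network made of three equal 30 ohm resistors.
y = delta_to_wye(30.0, 30.0, 30.0)
print(y)

# Converting back should recover the original delta values.
print(wye_to_delta(*y))
```

For three equal ∆ resistors each Y resistor comes out at one third of the ∆ value, which is a handy sanity check. Converting in one direction and then back again should always return the starting values.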

1.2.4 Mesh equations and node equations

Figure 1.4: Mesh circuit to analyze (resistors R1 to R7, voltage source Va, mesh currents I1 to I4 and nodes 1 to 4).

Suppose we want to solve the circuit in Figure 1.4. The mesh equations are obtained by applying KVL around each loop. So we obtain four equations for the four loops as shown below. There are four unknowns I1, I2, I3 and I4 and they can be obtained by solving the four equations using Cramer's rule.

I1R7 + (I1 − I3)R6 + (I1 − I2)R2 = 0 (1.14)

I3R1 + (I3 − I1)R6 + (I3 − I4)R3 = 0 (1.15)

I2R4 + (I2 − I1)R2 + (I2 − I4)R5 = 0 (1.16)

−Va + (I4 − I2)R5 + (I4 − I3)R3 = 0 (1.17)
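The four mesh equations can be collected into a matrix and solved with Cramer’s rule. The following Python sketch does this with a small recursive determinant; the resistor values and the source voltage are assumed for illustration only, and the helper names `det` and `cramer` are hypothetical.

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, value in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * value * det(minor)
    return total

def cramer(a, b):
    """Solve a*x = b by replacing one column at a time with b."""
    d = det(a)
    solution = []
    for col in range(len(b)):
        replaced = [row[:col] + [b[i]] + row[col + 1:] for i, row in enumerate(a)]
        solution.append(det(replaced) / d)
    return solution

# Assumed component values (ohms and volts), not taken from the text.
R1, R2, R3, R4, R5, R6, R7 = 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0
Va = 12.0

# Unknowns in the order I1, I2, I3, I4; one row per mesh equation.
A = [
    [R7 + R6 + R2, -R2, -R6, 0.0],   # equation 1.14
    [-R6, 0.0, R1 + R6 + R3, -R3],   # equation 1.15
    [-R2, R4 + R2 + R5, 0.0, -R5],   # equation 1.16
    [0.0, -R5, -R3, R5 + R3],        # equation 1.17, with Va moved to the right
]
b = [0.0, 0.0, 0.0, Va]

I1, I2, I3, I4 = cramer(A, b)
print("I1..I4 =", I1, I2, I3, I4)
```

Cramer’s rule is convenient for a hand-sized system like this one; for larger circuits Gaussian elimination would normally be preferred.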

The node equations on the other hand are obtained by using KCL and Ohm’s Law at the different nodes of the circuit. In the figure 1.4 the nodes are identified by small circles.

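$$\frac{V_1 - V_2}{R_6} + \frac{V_3 - V_2}{R_2} + \frac{V_4 - V_2}{R_5} - \frac{V_2}{R_3} = 0 \qquad (1.18)$$

$$\frac{V_3 - V_1}{R_7} + \frac{V_2 - V_1}{R_6} - \frac{V_1}{R_1} = 0 \qquad (1.19)$$

$$\frac{V_1 - V_3}{R_7} + \frac{V_4 - V_3}{R_4} + \frac{V_2 - V_3}{R_2} = 0 \qquad (1.20)$$

$$V_4 = V_a \qquad (1.21)$$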

1.2.5 Thevenin and Norton equivalents

Thevenin’s theorem and Norton’s theorem are explained in the same context. Thevenin’s theorem states that after selecting two nodes of a linear circuit the whole circuit can be simplified into a single voltage source and a series impedance. This is explained as shown in the figure 1.5.

Figure 1.5: The circuit to reduce.

In the circuit in the figure 1.5 the circuit behavior as seen by the element with the circles on either end can be simplified into either of the two circuits shown in the figure 1.6. The one on the left is called the Thevenin’s equivalent while the one on the right is called the Norton’s equivalent.

Figure 1.6: Thevenin’s and Norton’s equivalent circuits.

Vth is obtained by measuring the voltage between the two circles with the element removed. Zth is then obtained by further removing the voltage source and replacing it with a short, so that the impedance measured between the circles is the Zth. To obtain the Norton’s equivalent the circles are connected with a short and the current through the short is measured. This current is the value of the current source in the Norton’s equivalent. The shunt impedance in the Norton’s equivalent is the same as the Zth.
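The procedure can be tried on a circuit small enough to solve by inspection. The sketch below is not the circuit of the figure 1.5; it assumes a hypothetical source V1 in series with an impedance Z1 (a resistor and an inductor), with a capacitor Z2 connected across the two terminals. All values are illustrative.

```python
import math

# Assumed example: 10 V source at 1 kHz.
omega = 2.0 * math.pi * 1.0e3
V1 = complex(10.0, 0.0)
Z1 = complex(100.0, omega * 10.0e-3)        # 100 ohm in series with 10 mH
Z2 = complex(0.0, -1.0 / (omega * 1.0e-6))  # 1 uF across the terminals

# Open-circuit voltage between the terminals (voltage divider).
v_th = V1 * Z2 / (Z1 + Z2)

# Source replaced by a short: Z1 and Z2 appear in parallel.
z_th = Z1 * Z2 / (Z1 + Z2)

# Terminals shorted: Z2 is bypassed, so only Z1 limits the current.
i_n = V1 / Z1

print("Vth =", v_th)
print("Zth =", z_th)
print("In  =", i_n, " Vth/Zth =", v_th / z_th)
```

The short-circuit current agrees with Vth/Zth, which is why the Norton shunt impedance is the same as Zth.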

1.2.6 Maximum power transfer theorem

If you take any two nodes in a circuit and you designate one side of the circuit as the source and the other side as the load, then the maximum power you can transfer from the source to the load occurs when the impedance looking into the load is the complex conjugate of the impedance looking into the source. You can get the source impedance and the load impedance by the use of the Norton’s or Thevenin’s equivalent circuit.
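The conjugate matching condition can be checked numerically. The sketch below assumes a Thevenin source of 10 V (taken as an RMS value) with a source impedance of 50 + j20 Ω and compares the power delivered to the conjugate load against a handful of other trial loads. The numbers and the function name `load_power` are assumptions for illustration.

```python
def load_power(v_th, z_source, z_load):
    """Average power dissipated in z_load, with v_th taken as an RMS value."""
    current = v_th / (z_source + z_load)
    return abs(current) ** 2 * z_load.real

# Assumed Thevenin source.
v_th = 10.0
z_source = complex(50.0, 20.0)

# Matched load: complex conjugate of the source impedance.
z_matched = z_source.conjugate()
p_matched = load_power(v_th, z_source, z_matched)

# A few other loads for comparison.
trials = [complex(r, x) for r in (25.0, 50.0, 100.0) for x in (-40.0, -20.0, 0.0, 20.0)]
p_trials = [load_power(v_th, z_source, z) for z in trials]

print("matched load power:", p_matched, "W")
print("best trial power  :", max(p_trials), "W")
```

With the matched load the reactances cancel and the power is Vth²/(4R), which is 0.5 W for these values; none of the other trial loads does better.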


1.2.7 Transient analysis using Laplace transforms

The Laplace transform is somewhat similar to the Fourier transform and is given by

$$f(s) = \int_0^{\infty} e^{-st} f(t)\, dt \qquad (1.22)$$

The Laplace transform is really useful in transient analysis because of the way that it handles integrals and differentials.

$$f'(t) \rightarrow s f(s) - f(t)|_{t=0} \qquad (1.23)$$

$$\int_0^t f(t)\, dt \rightarrow \frac{f(s)}{s} \qquad (1.24)$$

Figure 1.7: Circuit to analyze.

To illustrate let us analyze the circuit in the figure 1.7. In this circuit both the inductor and the capacitor have initial states and then the circuit is allowed to settle. The equation that we need is

$$V_C(0) + \frac{1}{C}\int_0^t I\, dt + L\frac{dI}{dt} + IR = 0 \qquad (1.25)$$

Applying the Laplace transform with I(t) as the variable then gives

$$I(s) = \frac{L s\, I(t)|_{t=0} - V_C(t)|_{t=0}}{L s^2 + R s + \frac{1}{C}} \qquad (1.26)$$

Now you find the roots of the denominator and rewrite the expression in the following form

$$I(s) = \frac{s + c}{(s - a)(s - b)} \qquad (1.27)$$

The inverse Laplace transform then gives the actual current flow as a function of time

$$I(t) = \frac{(a + c) \cdot e^{at} - (b + c) \cdot e^{bt}}{a - b} \qquad (1.28)$$

Even though the solution seems to have only exponentials remember that due to Euler’s equation, if either a or b is complex, the current will have sinusoidal variations.
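As a worked sketch of equations 1.26 to 1.28, the Python code below finds the roots a and b of the denominator and evaluates the current. Two points are assumptions of this sketch rather than statements from the text: dividing equation 1.26 by L leaves an overall factor of I(0) in front of the form 1.27, with c = −VC(0)/(L·I(0)), and the component values and initial conditions are arbitrary. It also requires I(0) ≠ 0 and a ≠ b.

```python
import cmath

def rlc_current(t, r, l, c, i0, vc0):
    """Current of the series RLC circuit at time t, from equations 1.26 to 1.28.

    Assumes i0 != 0 and distinct roots a != b.
    """
    # Roots of the denominator L*s^2 + R*s + 1/C.
    disc = cmath.sqrt(r * r - 4.0 * l / c)
    a = (-r + disc) / (2.0 * l)
    b = (-r - disc) / (2.0 * l)
    # Numerator L*s*I(0) - VC(0) = L*I(0)*(s + cc).
    cc = -vc0 / (l * i0)
    value = i0 * ((a + cc) * cmath.exp(a * t) - (b + cc) * cmath.exp(b * t)) / (a - b)
    # For complex conjugate roots the imaginary parts cancel.
    return value.real

# Assumed values: 10 ohm, 1 mH, 1 uF, I(0) = 0.1 A, VC(0) = 1 V.
R, L, C, I0, VC0 = 10.0, 1.0e-3, 1.0e-6, 0.1, 1.0

for t in (0.0, 20.0e-6, 50.0e-6, 100.0e-6, 200.0e-6):
    print(t, rlc_current(t, R, L, C, I0, VC0))
```

For these values R² is smaller than 4L/C, so the roots are complex and the printed current rings while it decays, as the remark about Euler’s equation suggests.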


Chapter 2

Active devices - historical

2.1 Vacuum technology

In 1947 the first Germanium transistor was built in Bell Laboratories by Walter Brattain and John Bardeen. The first integrated circuit was built at Texas Instruments in 1958 by Jack Kilby. But for a lot more than the first half of the twentieth century all commercially available electronics was based on vacuum tubes. Rectification was achieved with the two terminal diode, and amplification was achieved with the three terminal triode. At very high microwave frequencies, the triode is not fast enough, so the Klystron tube was used for amplification. The Read diode was proposed in 1958 and it too is used at microwave frequencies and in 1964 the Gunn diode was proposed, also for microwave amplification. All these devices played an important role in the growth of electronics.

2.2 Diode

Figure 2.1: A vacuum diode.

A vacuum diode is an evacuated glass tube as shown in the figure 2.1. The heater at the bottom heats the cathode red hot or hotter. The cathode is made of a very good conductor with a low work function such as copper. Due to the heat electrons are boiled off the cathode. The anode is placed above the cathode and due to the electric field, the electrons from the cathode drift to the anode where they combine with the anode, transferring a charge of q from the cathode to the anode. The movement of the electrons in the field is governed by the Lorentz equation, where the magnetic field term is zero.

Rectification occurs because the anode is not heated, so is unable to emit electrons, so if the polarity of the diode is reversed, current will not flow. The speed of the device depends on the transit time from the cathode to the anode.

2.3 Triode

Figure 2.2: A vacuum triode.

Amplification is achieved when a small current modulation controls a much larger current perhaps at the same voltage so that the modulation of the larger current is identical to the modulating signal, perhaps with a constant time delay.

The vacuum diode is converted into an amplifying device by the introduction of the grid between the anode and the cathode as shown in the figure 2.2. To achieve the most control the grid must be placed much closer to the cathode than to the anode, because the electric field induced by the grid to cathode voltage is competing with the electric field due to the anode to cathode voltage.

The grid has to be designed so that it has a high porosity on the one hand and can exert a controlling field on the other. The high porosity is very important because you want to run the grid in a high impedance low current circuit and if the grid intercepts electrons, the current will dampen the modulating signal. Remember that the grid in order to modulate the current successfully must have a voltage more positive than the cathode, so current can flow from the grid to the cathode.

Another aspect of the grid is that it should have a very low capacitance to the anode and for this as well you want the grid to have a very small area while being able to control the field. The other side of it is that if it is too fine, then over time it will be damaged or warped causing distortion. Photomultiplier tubes [1] are also based on vacuum tubes.

2.4 Klystron tube

The klystron is a device where the amplification is not achieved as in the triode or any of the devices of today. The way the triode, the bipolar junction transistor or the Field Effect Transistor achieve amplification is that there is a high impedance control circuit that gates a much larger current. The klystron is completely different in that the amplification is not achieved at the time of modulation. The klystron configuration is shown in the figure 2.3.

Figure 2.3: A klystron tube.

There are five distinct sections of this setup. The electrons are boiled off a cathode on the left. Next they are accelerated in a strong electric field. As they pass through the next section, they are modulated in a microwave frequency field. The next section is a long passive section where the amplification actually happens.

When the electrons were modulated some electrons were accelerated and some were decelerated causing a modulation of velocity. In the figure 2.4 the electrons in the section A are decelerated, the electrons in section C are accelerated while the electrons in section B are left unmodulated. In the long passive section this velocity modulation is converted into position modulation because the velocity modulation causes the electrons to bunch as shown on the right in the figure 2.4.

Figure 2.4: Amplification due to bunching.

The next and last section is the section that extracts the microwave energy from the electron stream. As the electrons pass through the parallel plates, they cause displacement current to flow and since the bunched electrons are tightly packed together, they will cause a sudden bump in the displacement current and this is the amplified output.

So in order to have a lot of amplification, you need a large number of uniformly spaced electrons moving at a large velocity and a large distance over which to allow them to bunch. In addition you need to extract the microwave energy at exactly the right distance because after the bunching reaches a maximum, the electrons will overshoot and start to disperse at which point any energy extracted will be distorted.


2.5 Read diode

The Read diode [2] is based on an n+ − p − i − p+ structure as shown in the figure 2.5. It uses the impact ionization effect and is used to generate microwave oscillation output up to 50 GHz or so.

Figure 2.5: The Read diode.

There is no input, the diode just needs to be biased at the correct voltage and if the output frequency needs to be adjusted the time T in the figure 2.5 has to be varied. If the conditions are correct oscillations will build up and the author [2] expects an efficiency of 30%. More experiments [3] may have given better efficiency.

Figure 2.6: The electric field.

The figure 2.6 gives the electric field and shows the peak field occurring at the n+ − p junction. As the oscillations build up, the oscillation voltage causes the depletion region at this junction to expand and contract. The charge thus moving in a high field causes impact ionization to occur generating hole-electron pairs. The electrons simply move into the n+ region and then into the supply, whereas the holes move across the space charge region to the p+ region.

The biasing of the diode needs to be such that during the negative part of the oscillation signal, the sum of the DC bias and the oscillation is lower than the voltage required to cause impact ionization, but during the positive part of the oscillation the sum of the voltages causes impact ionization to occur at the n+ − p junction. So the current increases during the positive portion of the oscillation voltage and decreases during the negative portion of the oscillation voltage.

The time that the holes take to reach the p+ region is what determines the frequency of oscillation. If the time taken to traverse the space charge region is half a cycle, then the current is 180° out of phase with the voltage and the oscillation is self-sustaining. The output frequency is given by the equation 2.1 where W is the width of the space charge region and v is the velocity of the carriers in the space charge region. A nice computer simulation method to analyze the Read diode is given in [4].

$$\omega = \frac{\pi v}{W} \qquad (2.1)$$
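Equation 2.1 gives an angular frequency, so dividing by 2π gives the frequency in hertz, f = v/(2W). The sketch below evaluates it for an assumed space charge width of 1 µm and an assumed carrier velocity of 10^5 m/s (roughly the saturation velocity in silicon); both numbers are illustrative and are not taken from the text.

```python
import math

def read_diode_frequency_hz(width_m, velocity_m_per_s):
    """Oscillation frequency from equation 2.1, converted from rad/s to Hz."""
    omega = math.pi * velocity_m_per_s / width_m
    return omega / (2.0 * math.pi)

# Assumed values: W = 1 micron, v = 1e5 m/s.
f = read_diode_frequency_hz(1.0e-6, 1.0e5)
print("f =", f / 1.0e9, "GHz")
```

With these assumed values the result is about 50 GHz, and halving W doubles the frequency.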

2.6 Gunn diode

The Gunn diode [5] was based on an effect explained by [6] and [7]. The Gunn diode has a structure as shown in the figure 2.7.

Figure 2.7: The Gunn diode.

The oscillation of the Gunn diode is similar to the Read diode, but the mechanism is different. It is based on the fact that GaAs has a small direct gap and a large indirect gap as shown in the figure 2.8. The increase in energy from A to B is 0.34 eV. At B the mobility is lower than at A.

Figure 2.8: The e-k diagram for GaAs [8], [9].

So in the Gunn diode if the field is high enough, the acceleration of conduction band electrons in the field will give them enough energy to move from A to B. But at the same time, the 0.34 eV difference is larger than the thermal energy so in the absence of the field electrons are not usually at B. So if the Gunn diode is biased just at the field required to cause a transition, then during one half of the oscillation cycle there will be a large number of transitions from A to B, but during the other half there will be none.

The electrons at B will move in the field and cause a current flow, but they will move more slowly. So just as in the case of the Read diode, you will have an optimum frequency at which the electrons will arrive exactly half a cycle out of phase with the voltage and this will cause a self sustaining oscillation. Like the Read diode, the Gunn diode will also operate at 40-50 GHz or higher.


Chapter 3

Semiconductor theory

[10] and [11] are good references for quantum mechanics and [9] is a good book for semiconductor physics. The semiconductor properties of Silicon are due to its crystal structure. Like Carbon, Silicon has a valence of four. In pure crystalline silicon, each silicon atom makes a covalent bond with its four neighbors. The structure of pure silicon is as shown in the figure 3.1.

Figure 3.1: The silicon lattice.

In insulators the electrons are firmly bound to their atom or are part of a bond between atoms that requires a lot of energy to break. In metals, the atoms are arranged in a periodic structure and the valence electrons are free to move about, and although the metal as a whole is charge neutral the valence electrons are not tied to a specific atom. The electrons that are firmly bound to an atom or as part of a bond are said to be in the valence band. The electrons that are free to move about are said to be in the conduction band.

So the figure 3.2 shows the valence and conduction bands for insulators, semiconductors and metals. In the case of insulators the two bands are far apart in energy, in the case of semiconductors the energy gap between the two bands is in the same ballpark as thermal energy and in the case of metals the two bands overlap.

Figure 3.2: The valence and conduction bands (panels: insulator, semiconductor, metal).

3.1 Wave particle duality

Every particle has a wavelength associated with it given by the de Broglie wavelength which is:

$$\lambda = \frac{h}{mv} = \frac{h}{p} \qquad (3.1)$$

The Davisson-Germer experiment confirmed this wave property for electrons by impinging a collimated beam of electrons onto a crystal and observing the diffracted electrons using a counter. So lighter objects or slower objects have larger wavelengths. The h is Planck’s constant.
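To get a feel for the numbers in equation 3.1, the sketch below evaluates the de Broglie wavelength of an electron. The velocity of 10^6 m/s is an assumed example value, and the constants are the standard values for Planck’s constant and the electron rest mass.

```python
H_PLANCK = 6.62607015e-34   # Planck's constant, J s
M_ELECTRON = 9.1093837e-31  # electron rest mass, kg

def de_broglie_wavelength(mass_kg, velocity_m_per_s):
    """Equation 3.1: wavelength = h / (m v)."""
    return H_PLANCK / (mass_kg * velocity_m_per_s)

# Assumed example: an electron moving at 1e6 m/s.
wavelength = de_broglie_wavelength(M_ELECTRON, 1.0e6)
print("wavelength =", wavelength * 1.0e9, "nm")
```

The result is a fraction of a nanometer, comparable to the spacing between atoms in a crystal, which is why a crystal can diffract an electron beam.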

3.2 Schroedinger’s time independent wave equation

The Schroedinger wave function Ψ is given by the solution to:

$$\left(-\frac{\hbar^2}{2m}\nabla^2 + V\right)\Psi = E\Psi \qquad (3.2)$$

The first term on the LHS represents the kinetic energy and the second term represents the potential energy and E is the total energy. In semiconductors we most often use the wave equation to calculate the occupancy of energy levels and also to calculate the transition probability from one state to another. The quantity |Ψ|² represents the probability of finding the particle at a specific location and summed over all space it will integrate to 1. The sum of two solutions to the wave equation is also a valid solution of the wave equation.

If a particle is represented by a wave function Ψ then you can obtain its momentum from pΨ = −iħ∇Ψ = ħkΨ, where p is the momentum and k is the wave number. If you need an actual number you can integrate and average k over all space.

Figure 3.3: A wave packet.

A particle can be represented by a wave packet which may look something like the figure 3.3. It has a beginning and an ending and is the sum of many waves like a Fourier transform representation, and it has two velocities associated with it. The group velocity is the velocity of propagation of the packet itself which is the velocity of the particle. The phase velocity is the velocity at which a point on the sum of the waves would have to move in order that the phase at that point remains a constant.

3.3 Quantum well

The figure 3.4 shows a rectangular potential well on the left side with the walls at Vm and the well at zero. In one dimension the wave equation is equation 3.3.

Figure 3.4: Solutions in one dimension.

$$\frac{d^2\Psi}{dx^2} + \frac{2m[E - V(x)]\Psi}{\hbar^2} = 0 \qquad (3.3)$$

From the standard solution [12] you get the equation 3.4. Now if you set Ψ(0) = Ψ(a) = 0 you get the solutions on the right of the figure 3.4.

$$\Psi = c_1 e^{i \cdot d \cdot x} + c_2 e^{-i \cdot d \cdot x} \qquad (3.4)$$

$$d = \frac{\sqrt{2m[E - V(x)]}}{\hbar} \qquad (3.5)$$

On either side of the well, E − V (x) is negative and the Ψ(x) decays exponentially to zero.
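If the walls are taken to be very high, so that the condition Ψ(0) = Ψ(a) = 0 holds exactly, the sine wave must fit a whole number of half wavelengths into the well. That fixes the wave number at d = nπ/a, and with V = 0 inside the well equation 3.5 gives the allowed energies. The sketch below evaluates them for an assumed well width of 1 nm; the infinite wall limit and the width are assumptions of this sketch, and a well of finite height Vm would have slightly lower levels.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_ELECTRON = 9.1093837e-31  # electron rest mass, kg
Q = 1.602176634e-19         # electron charge, C (converts J to eV)

def well_energy_ev(n, width_m, mass_kg=M_ELECTRON):
    """Energy of level n in a well with infinitely high walls, in eV."""
    d = n * math.pi / width_m                 # wave number that fits the well
    energy_j = (HBAR * d) ** 2 / (2.0 * mass_kg)  # equation 3.5 with V = 0
    return energy_j / Q

# Assumed well width of 1 nm.
for n in (1, 2, 3, 4):
    print("E%d = %.3f eV" % (n, well_energy_ev(n, 1.0e-9)))
```

The levels scale as n², so the spacing between E1, E2, E3 and E4 grows with energy.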

3.4 Free electron theory

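The free electron theory of metals is obtained by solving the wave equation in a three dimensional potential box with infinite potential walls and no potential inside the box. So the boundary condition is that the wave function is zero at the walls and the equation becomes:

$$-\frac{\hbar^2}{2m}\nabla^2\Psi = E\Psi \qquad (3.6)$$

Using separation of variables you can make

$$\Psi(x, y, z) = \Psi_x(x)\Psi_y(y)\Psi_z(z) \qquad (3.7)$$

$$-\frac{\hbar^2}{2m}\left(\frac{1}{\Psi_x}\frac{\partial^2\Psi_x}{\partial x^2} + \frac{1}{\Psi_y}\frac{\partial^2\Psi_y}{\partial y^2} + \frac{1}{\Psi_z}\frac{\partial^2\Psi_z}{\partial z^2}\right) = E \qquad (3.8)$$

$$-\frac{\hbar^2}{2m}\frac{\partial^2\Psi_x}{\partial x^2} = E_x\Psi_x \qquad (3.9)$$

So you get the same solution as in equation 3.4, with

$$d = \frac{\sqrt{2mE_x}}{\hbar} \qquad (3.10)$$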

The first term in equation 3.4 has to be zero. From Euler’s equation and applying boundary conditions i.e. Ψ = 0 at the walls, the imaginary terms go to zero, and you get a sine wave solution. The wave number is $k = \sqrt{k_x^2 + k_y^2 + k_z^2}$. If the box of the potential walls is very large the possible solutions will be finely distributed in k and the plot of the energy E vs. k will appear as shown in the figure 3.5.

Figure 3.5: The E-k diagram.

Because of Pauli’s exclusion principle two electrons cannot occupy a given state, so the number of electrons is limited to the density of states function which is a count of the allowed states. For a semiconductor if you apply the Kronig-Penney model and count the allowed states you find that the density of states is proportional to √E so the density of states is a parabolic function of energy. Nowadays the doping level is often so high that the semiconductor is said to be degenerate, meaning that the Fermi level lies inside the conduction or valence band, so the simple Boltzmann approximation no longer holds and the full Fermi-Dirac distribution has to be used.

3.5 Bloch theorem

The Bloch theorem [13] for the wave function in a periodic potential such as shown in the figure 3.6 for a displacement dx gives

Ψ(k, x + dx) = Ψ(k, x) · eik·dx (3.11)

3.6 Kronig Penney model

Kronig and Penney [14] used a potential function as shown in the figure 3.6 to solve for the wave function of an electron in a crystal. The rectangular barriers are located between the lattice sites.

Figure 3.6: The Kronig-Penney potential.

The solutions in regions 1, 2 and 3 are the same as equation 3.4. Like the quantum well, region 1 has a solution which is sinusoidal, but in regions 2 and 3 the solution is an exponential decay. The Bloch theorem allows the solution in region 2 to be related to the solution in region 3 by a multiplier of $e^{ikc}$.

Now the boundary conditions are applied at the interface of region 1 and 2 and at the interface of region 1 and 3. Even so, to make the analysis possible, the regions 2 and 3 are shrunk and the barrier potential raised simultaneously so that the net decay stays the same.

This results in a solution as shown in the figure 3.7. As you increase the energy the wave number alternates between real and imaginary and where it is imaginary you have the forbidden gaps. Then you can jump to the next higher energy at the same k.

Figure 3.7: The Kronig-Penney E-k plot.

3.7 Effective mass

In a semiconductor an electron can appear to have many different effective masses depending upon what you are measuring. This mass is usually smaller than its rest mass m0. There are two important effective masses namely the density of states effective mass and the mobility effective mass.

The way that you use the different effective masses is when you calculate different quantities based on the Schroedinger’s wave equation. In the denominator is the mass m. So if you are using the wave equation to calculate the mobility, then you would use the effective mass for mobility for those conditions. Similarly if you are using the wave equation to calculate the density of states, then you need to use the density of states effective mass for those conditions.

Cyclotron resonance as described in [15] is a way to measure effective mass. Cyclotron resonance is used for lots of things, in fact electron cyclotron resonance based plasma etch equipment is sold by several vendors. The idea of cyclotron resonance is fairly straightforward. In a simple RF system with an RF voltage applied between two plates, the carriers move back and forth between the plates.

Figure 3.8: Cyclotron movement of charged particles.

But in cyclotron resonance the aim is to make the charged particles describe a circle or oval. So it isn’t just an RF field but also a perpendicular magnetic field as well. From the Lorentz equation, as the particle moves either up or down, it also moves sideways. So it describes an oval as in the figure 3.8.

When the charged particle moves between the plates it causes displacement current and that can be detected and similarly when it absorbs energy from the magnetic field, that too can be detected. Since silicon is anisotropic if you use specimens with different crystal orientations the ovals and the resonance frequencies will be different. From the resonance frequencies and the shape of the ovals you can obtain the effective masses along the different crystal orientations.

3.8 Fermi-Dirac distribution

The Fermi-Dirac distribution function of equation 3.12 gives the probability of occupancy of an electron state at energy E. If E = Ef, the probability is half. To get the actual number of electrons at that energy, you multiply it by the density of states function as shown in the figure 3.9.

$$f(E) = \frac{1}{1 + e^{[E - E_f]/kT}} \qquad (3.12)$$

Figure 3.9: The Fermi-Dirac distribution.

To use this equation in practical calculations, the approximation proposed by [16] is easiest. Just the first two terms are probably enough as in equations 3.13 and 3.14.

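$$\frac{E_f - E_c}{kT} = \ln\left(\frac{n}{N_c}\right) + \frac{n}{2\sqrt{2}\,N_c} \qquad (3.13)$$

$$\frac{E_v - E_f}{kT} = \ln\left(\frac{p}{N_v}\right) + \frac{p}{2\sqrt{2}\,N_v} \qquad (3.14)$$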
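Equations 3.12 and 3.13 are easy to evaluate directly. The sketch below is an illustration under assumed numbers: kT is taken as 0.0259 eV (about room temperature) and Nc as 2.8 × 10^19 cm^-3, a commonly quoted value for silicon. Neither value nor the function names come from the text.

```python
import math

def fermi_dirac(e_ev, ef_ev, kt_ev):
    """Equation 3.12: occupancy probability of a state at energy E."""
    return 1.0 / (1.0 + math.exp((e_ev - ef_ev) / kt_ev))

def fermi_level_offset_ev(carriers, effective_dos, kt_ev):
    """Equation 3.13 (or 3.14 for holes): two-term approximation.

    Returns Ef - Ec for electrons, or Ev - Ef for holes, in eV.
    """
    ratio = carriers / effective_dos
    return kt_ev * (math.log(ratio) + ratio / (2.0 * math.sqrt(2.0)))

KT = 0.0259   # assumed thermal energy at about 300 K, eV
NC = 2.8e19   # assumed effective density of states, cm^-3

print("f(Ef)        =", fermi_dirac(0.5, 0.5, KT))
print("f(Ef + 0.1)  =", fermi_dirac(0.6, 0.5, KT))
print("Ef - Ec at n = 1e17:", fermi_level_offset_ev(1.0e17, NC, KT), "eV")
print("Ef - Ec at n = Nc  :", fermi_level_offset_ev(NC, NC, KT), "eV")
```

For light doping the Fermi level sits below the conduction band edge (a negative offset), and the second term only matters once n approaches Nc.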

One important feature of the Fermi-Dirac statistics is that in a junction of any type, either a homojunction or a heterojunction, of p − n or p − i or n − i the Fermi level at equilibrium is flat.

3.9 Poisson’s equation

In the equation shown below ρ is the charge density and ε is the permittivity. If there is no charge, the right hand side becomes zero and this is called the Laplace equation.

$$\nabla \cdot (\varepsilon \nabla V) = -\rho \qquad (3.15)$$

3.10 Drift and diffusion

Drift is the movement of charge in an electric field and the current density J due to this movement is given by the equations 3.16 for electrons and 3.17 for holes. Keep in mind that the flow of electrons is opposite to the direction of J. Typically µp is only half that of µn but in the high field channel region under an FET gate it can be less than that.

Jn = q n µn E (3.16)

Jp = q p µp E (3.17)

Diffusion is independent of charge and applies to all particles. Due to thermal energy all particles move about randomly (first shown by Brownian motion). If you have a collection of particles in one location they will disperse with time. In probability theory this is called the random walk. In a container they cannot disperse beyond the walls of the container where they are reflected.

In a semiconductor the current density at any given point due to a variation in the density of either holes or electrons is given by the equations 3.18 for electrons and 3.19 for holes. Note that Jp has a negative sign because a positive gradient will give a negative current whereas for Jn the negative charge will reverse the sign again.

Jn = q Dn ∇n (3.18)

Jp = −q Dp ∇p (3.19)

The Einstein relationship relates the diffusion coefficient D to the mobility µ by the equation 3.20.

$$\frac{D}{\mu} = \frac{kT}{q} \qquad (3.20)$$

The mobility reduces as the temperature increases due to lattice vibration as shown in [17].
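The drift equation 3.16 and the Einstein relationship 3.20 can be tried with a few lines of Python. The numbers below are assumptions for illustration: an electron mobility of 1350 cm²/Vs (a commonly quoted value for lightly doped silicon), a carrier density of 10^16 cm^-3, a field of 100 V/cm and a temperature of 300 K.

```python
K_B = 1.380649e-23   # Boltzmann constant, J/K
Q = 1.602176634e-19  # electron charge, C

def thermal_voltage(temperature_k):
    """kT/q in volts."""
    return K_B * temperature_k / Q

def diffusion_coefficient(mobility, temperature_k):
    """Equation 3.20: D = mobility * kT/q (cm^2/s if mobility is in cm^2/Vs)."""
    return mobility * thermal_voltage(temperature_k)

def drift_current_density(carrier_density, mobility, field):
    """Equations 3.16 and 3.17: J = q * n * mobility * E (A/cm^2 in cm units)."""
    return Q * carrier_density * mobility * field

MU_N = 1350.0  # assumed electron mobility, cm^2/Vs

v_t = thermal_voltage(300.0)
d_n = diffusion_coefficient(MU_N, 300.0)
j_n = drift_current_density(1.0e16, MU_N, 100.0)

print("kT/q =", v_t, "V")
print("Dn   =", d_n, "cm^2/s")
print("Jn   =", j_n, "A/cm^2")
```

Because D and µ are tied together by kT/q, measuring either one at a known temperature gives the other.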

3.11 Haynes-Shockley experiment

The Haynes-Shockley experiment [18] can be used to measure the mobility µ indirectly. A slab of the semiconductor such as Silicon is biased as shown in the figure 3.10. A pulse of laser light of suitable frequency is applied to the semiconductor. The light causes hole electron pairs to be formed.

Figure 3.10: The Haynes-Shockley experimental setup.

If the slab is n type, the holes will drift to the right in the electric field. Because the hole concentration is a pulse it will spread due to diffusion. So the narrow pulse of holes will reduce in height and increase in width as it moves to the right. However, due to recombination the area under the pulse will reduce with time.

Now if you measure the current flowing out of the slab on the right you will see a pulse of current. The shape of the pulse is the most important. The Einstein relationship relates the diffusion coefficient D to the mobility µ. The recombination changes the area under the pulse and also the shape of the pulse because the holes diffusing to the left spend a longer time among the majority carriers. In any case from the shape of the pulse you can obtain D and µ.

3.12 Continuity equations

The continuity equations for holes and electrons are shown below. The left hand side shows the increase in number of electrons (or holes) in the control volume over time. The right hand side is the number of particles left behind in the control volume due to the differential of the current flow less the recombination rate R and plus the generation rate G.

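$$\frac{\partial n}{\partial t} = \frac{1}{q}\nabla \cdot J_n - R + G \qquad (3.21)$$

$$\frac{\partial p}{\partial t} = -\frac{1}{q}\nabla \cdot J_p - R + G \qquad (3.22)$$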

The continuity equations are actually common to many fields of science. For example in incompressible fluid flow, the conservation of mass leads to continuity equations very similar to the one above except the generation and recombination terms are zero. The Shockley-Read-Hall recombination [19], [20] is the biggest recombination term in indirect gap semiconductors such as silicon and is modeled by a lifetime as in equations 3.23 and 3.24.

$$R_{SRH} = \frac{n_{excess}}{\tau_0} \qquad (3.23)$$

$$\tau_0 = \frac{\tau_{n0}(p_0 + p') + \tau_{p0}(n_0 + n')}{p_0 + n_0} \qquad (3.24)$$

But in direct gap semiconductors you can also have optical recombination which is modeled as $R_{opt} = C(np - n_i^2)$ where ni is the intrinsic concentration. The value of C can be obtained from experimental studies such as [21]. If the number of excess carriers is very high then Auger recombination can occur which requires a three particle interaction and is modeled as in equation 3.25.

$$R_{Aug} = B_n(n^2 p - n n_i^2) + B_p(n p^2 - p n_i^2) \qquad (3.25)$$

3.13 Band diagrams

A band diagram is a spatial plot of the different energies in the semiconductor specifically the valence and conduction bands and the Fermi level. Drawing the band diagram starts with the Fermi level. At equilibrium with no applied voltage the Fermi level is flat. The figure 3.11 shows the band diagram for a p-n junction.

Figure 3.11: A p-n junction at equilibrium.

With a voltage applied to a homogeneous semiconductor the Fermi level is not flat and it represents the potential and its derivative is the negative of the electric field. Within a junction the Fermi level splits into two quasi-Fermi levels one for the p type side and the other for the n type side and the separation is equal to the applied voltage.

$$n = \int_{E_c}^{E_{vacuum}} D(E) f(E)\, dE \qquad (3.26)$$

$$p = \int_{-\infty}^{E_v} D(E) f(E)\, dE \qquad (3.27)$$

Having drawn the Fermi level you extract the conduction band and valence band from it. The known electron concentration is equal to the integral of the product of the density of states and the Fermi probability from the conduction band to the vacuum level. Similarly the known hole concentration is the integral of the product of the density of states function and the Fermi probability from the valence band to minus infinity.

Since you have the majority carrier concentration (either n or p) you can obtain the minority carrier concentration by the relationship $np = n_i^2$ where ni is given by the relationship

$$n_i = \sqrt{N_c N_v}\; e^{-E_g/2kT} \qquad (3.28)$$

Here Nc and Nv are constants and are the effective density of states for electrons and holes.
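As a numerical sketch of equation 3.28 and the relationship np = ni², the code below uses commonly quoted room temperature values for silicon: Nc = 2.8 × 10^19 cm^-3, Nv = 1.04 × 10^19 cm^-3, Eg = 1.12 eV and kT = 0.02585 eV. These values are assumptions for illustration and are not taken from the text, and the exact result is sensitive to them.

```python
import math

def intrinsic_concentration(nc, nv, eg_ev, kt_ev):
    """Equation 3.28: ni = sqrt(Nc * Nv) * exp(-Eg / 2kT)."""
    return math.sqrt(nc * nv) * math.exp(-eg_ev / (2.0 * kt_ev))

def minority_concentration(majority, ni):
    """From np = ni^2: the minority density given the majority density."""
    return ni * ni / majority

# Assumed silicon values at about 300 K (cm^-3 and eV).
NC, NV, EG, KT = 2.8e19, 1.04e19, 1.12, 0.02585

ni = intrinsic_concentration(NC, NV, EG, KT)
p_minority = minority_concentration(1.0e16, ni)  # n type, n = 1e16 cm^-3

print("ni =", ni, "cm^-3")
print("p  =", p_minority, "cm^-3 for n = 1e16 cm^-3")
```

The minority carrier density comes out many orders of magnitude below the majority density, which is why the majority carrier alone is used to place the Fermi level.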


3.14 Impurities

Impurities or dopants in Si require some energy to be ionized and occupy energy levels [22], [23] immediately below the conduction band for donors and immediately above the valence band for acceptors. The gap is small, 0.044 eV for Boron, 0.049 eV for Arsenic, 0.044 eV for Phosphorus.

At room temperature all of these dopants are ionized. P and As lose an electron to the conduction band and become positive ions, whereas B accepts an electron from the valence band and leaves behind a hole, and becomes a negative ion.

Copper has an acceptor level 0.49 eV from the valence band and Silver has an acceptor level 0.54 eV from the valence band. These are deep level traps and if the Si is contaminated with these impurities they can cause significant leakage because even if there are only a few of these states the probability of transition to or from these states from the valence or conduction band is exponentially increased because the gap is only half of the energy gap.

Excessive doping results in band-tailing [24] where the impurities cause either the conduction or valence band to extend into the energy gap. This effectively reduces the band gap and increases leakage current.


Chapter 4

Active devices

Figure 4.1: The semiconductor active devices.

4.1 P-N Junction diode

The junction diode is the first device in the figure 4.1. It is formed by first implanting one species either n or p, then implanting the other to form a junction. As you move toward the junction the depletion region begins gradually. In the bulk n type region the electron concentration is ≈ ND which is the donor concentration and in the bulk p type region the hole concentration is ≈ NA which is the acceptor concentration.

Figure 4.2: A diode junction in equilibrium.

The electrons cross the junction from the n type to the p type region and occupy the holes thereby leaving behind ionized donors on the n type side and causing the acceptors to become negatively charged ions. This is called depletion and the depletion region is characterized by a lack of carriers. In order to get the actual extent of depletion you have to solve the continuity equations and Poisson’s equation.

The depletion approximation is a simple way of solving the diode as shown in the figure 4.2. You assume that the depletion region begins abruptly as you approach the junction. Then the charge on the n side is simply n × xn if you assume unity cross-sectional area. Similarly the charge on the p side is simply p × xp. From charge neutrality, these two have to be equal to each other.

Figure 4.3: Obtaining the built-in voltage.

$$E(x) = \frac{1}{\varepsilon}\int_{-\infty}^{x} \rho(x)\, dx \qquad (4.1)$$

$$V(x) = -\int_{-\infty}^{x} E(x)\, dx \qquad (4.2)$$

Figure 4.4: Diode under bias.

Since you know the doping concentrations ND and NA and the band gap Eg and the density of states Nc and Nv, you can use equation 3.12 to obtain the Fermi levels on either side of the junction. Then obtain the electric field by equation 4.1 and then the potential by equation 4.2. Then you increase the length xn and xp until the Vbi is the difference as in the figure 4.3. The resulting band diagram is shown in the figure 3.11. As you can see the conduction and valence bands are the same shape as the potential V(x) except they are flipped and scaled by q.
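One step of this procedure can be sketched numerically. The code below assumes an abrupt junction with the p side on the left, fully ionized dopants inside the depletion region, and a trial value of xn; it picks xp from charge neutrality and then integrates equations 4.1 and 4.2 across the depletion region. The doping levels, the trial xn and the silicon permittivity are illustrative assumptions. In the full procedure xn would be adjusted until the potential step matches Vbi.

```python
Q = 1.602176634e-19               # electron charge, C
EPS_SI = 11.7 * 8.8541878128e-14  # assumed permittivity of silicon, F/cm

def integrate_junction(n_a, n_d, x_n, steps_per_side=500):
    """Integrate equations 4.1 and 4.2 across an abrupt junction.

    Returns (x_p, peak_field, final_field, potential_step) in cm, V/cm and V.
    """
    x_p = n_d * x_n / n_a  # charge neutrality: NA * xp = ND * xn
    field, potential, peak_field = 0.0, 0.0, 0.0
    # p side holds negative acceptor ions, n side holds positive donor ions.
    for rho, width in ((-Q * n_a, x_p), (Q * n_d, x_n)):
        dx = width / steps_per_side
        for _ in range(steps_per_side):
            new_field = field + rho * dx / EPS_SI        # equation 4.1
            potential -= 0.5 * (field + new_field) * dx  # equation 4.2
            field = new_field
            if abs(field) > abs(peak_field):
                peak_field = field
    return x_p, peak_field, field, potential

# Assumed example: NA = 1e17, ND = 1e16 cm^-3, trial xn = 0.3 micron.
N_A, N_D, X_N = 1.0e17, 1.0e16, 3.0e-5
x_p, peak_field, final_field, v_step = integrate_junction(N_A, N_D, X_N)

print("xp         =", x_p, "cm")
print("peak field =", peak_field, "V/cm")
print("potential  =", v_step, "V")
```

The field returns to zero at the far edge of the depletion region because the charges on the two sides balance, and the potential step is what gets compared against Vbi.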

Figure 4.5: Diode current (ln I plotted against Va).

As you apply a positive bias to a band diagram you will push it down relative to the fixed end as shown in the figure 4.4. One way to remember this is to remember that electrons are attracted to a positive potential and so the conduction band has to bend downward w.r.t. the fixed reference so that the electrons can slide down the conduction band toward the positive potential.

$$I = I_s\left[e^{\frac{q V_a}{kT}} - 1\right] \qquad (4.3)$$

The ideal diode equation is 4.3. For Va > 0, the diode current is exponential, so if you plot the natural log of it vs. voltage, as in the figure 4.5 the slope should be q/(nkT), where the ideality factor n usually lies between 1 and 2 (n = 1 for the ideal equation 4.3).
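The diode equation and the slope of the ln I plot can be checked with a short sketch. The saturation current of 10^-14 A and the temperature of 300 K are assumed values, and the ideality factor n is added here as an optional parameter; with n = 1 the function reduces to equation 4.3.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
Q = 1.602176634e-19  # electron charge, C

def diode_current(v_a, i_s=1.0e-14, n=1.0, temperature_k=300.0):
    """Equation 4.3, with an optional ideality factor n (n = 1 is the ideal case)."""
    v_t = K_B * temperature_k / Q  # thermal voltage kT/q
    return i_s * (math.exp(v_a / (n * v_t)) - 1.0)

# Slope of ln(I) against Va between two forward bias points.
v_low, v_high = 0.6, 0.7
slope = (math.log(diode_current(v_high)) - math.log(diode_current(v_low))) / (v_high - v_low)

print("I(0.7 V) =", diode_current(0.7), "A")
print("slope of ln I =", slope, "per volt")
print("q/kT          =", Q / (K_B * 300.0), "per volt")
```

Well into forward bias the −1 term is negligible, so the measured slope matches q/kT for n = 1; a real diode with n between 1 and 2 would show a proportionally smaller slope.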

4.2 Bipolar junction transistor

If you widen the diode and add a third implant to create either an n-p-n structure or a p-n-p structure you get the BJT as shown in the figure 4.1. The most important feature in a BJT is how thin the base region is. If the base is too wide then the BJT will not function at all.

Figure 4.6: Carrier flow in a BJT. (N-P-N structure with emitter e, base b and collector c; the three currents are numbered 1, 2 and 3.)

In the figure 4.6 there are three currents shown numbered 1, 2 and 3. The two currents 1 and 2 add up to form the base to emitter current. 1 is the hole current flowing from the p type base to the n type emitter. 2 is the electron current flowing from the n type emitter to the p type base and actually being captured by the contact to the p type base.

However as you can see in the figure 4.7 the collector is heavily reverse biased with respect to the base and so any electrons in the base see a long slide down the conduction band to the collector contact. Due to the narrowness of the base, as the electrons move across the base toward the base contact, a large number of them fall down the potential into the collector causing the current 3. If the current 3 is much larger than the sum of the currents 1 and 2, then you have a large gain.

The figure 4.8 illustrates the problem of punchthrough. As you know the depletion width depends on the doping concentration on either side of the junction. If the base and emitter are at the same potential, then the region marked as depletion 1 in the figure 4.8 is the equilibrium depletion width. Since the base collector junction is reverse biased to the supply potential the region marked depletion 2 will be much larger than depletion 1.

Figure 4.7: A BJT under bias. (Band diagram across the n type emitter, p type base and n type collector.)

When the sum of these two widths is equal to the physical width of the base, the base contact essentially loses control of the potential in the base and all that electrons in the emitter see is a long depletion region with a conduction band falling continuously toward the collector. So appreciable current will flow.

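Figure 4.8: Punchthrough in a BJT. (N-P-N structure with emitter e, base b and collector c; depletion 1 is at the emitter-base junction and depletion 2 at the base-collector junction.)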

So you want to design your device so that, even allowing for process variation and a supply voltage surge, all BJTs in the circuit avoid punchthrough. But you can’t make the base too thick, because the current gain is given by equation 4.4. If you make the base thicker, the electric field from the collector is reduced, and at the same time the space charge voltage drop looking into the base also drops because I2 and I1 are moving in a wider channel. The net result is that the gain reduces because I2 increases and I3 decreases.

$$G_I = \frac{I_3}{I_2 + I_1} \tag{4.4}$$

The collector current is given by the equations 4.5 and 4.6. So Is is similar to the reverse saturation current of the base diode except amplified by the maximum value of the gain β.

$$I_c = I_s \exp\left(\frac{qV_{be}}{kT}\right) \tag{4.5}$$

$$I_c = \beta\, I_b \tag{4.6}$$

Figure 4.9: The Early voltage. (Plot of I against Vce, with VA marked on the voltage axis.)

The Early voltage is shown in the figure 4.9. The equation 4.5 assumes that the collector current depends only on Vbe and not on Vce. But in fact as Vce rises, Ic rises as well. This is called the Early effect: an increase in Vcb widens the base-collector depletion region, which reduces the effective width of the base. The Early voltage is a way to factor this in, as in equation 4.7. The inherent assumption is that VA ≫ Vce.

$$I_c = I_s \exp\left(\frac{qV_{be}}{kT}\right) \left(1 + \frac{V_{ce}}{V_A}\right) \tag{4.7}$$
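A short sketch of equation 4.7 shows how the Early voltage is usually read off a set of curves: if you extend the straight line through two points on one Vbe curve back to zero current, it crosses the voltage axis at −VA. The values of `i_s`, `v_early` and the bias point are illustrative assumptions.

```python
import math

V_T = 0.02585   # kT/q near 300 K, in volts


def collector_current(v_be, v_ce, i_s=1e-15, v_early=50.0):
    """Equation 4.7; i_s and v_early are illustrative values."""
    return i_s * math.exp(v_be / V_T) * (1.0 + v_ce / v_early)


# Two points on the same Vbe curve.
v1, v2 = 1.0, 5.0
i1 = collector_current(0.65, v1)
i2 = collector_current(0.65, v2)

# Extend the straight line through the two points back to zero current.
slope = (i2 - i1) / (v2 - v1)
v_intercept = v1 - i1 / slope
print(f"Ic rises by {100 * (i2 / i1 - 1):.1f}% from Vce = {v1} V to {v2} V")
print(f"extrapolated zero-current voltage = {v_intercept:.2f} V")
```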

4.3 Heterojunction Bipolar Transistor

Figure 4.10: The band diagram for a HBT. (Band diagram across the n type emitter, p type base and n type collector.)

As we saw in the last section the current I1 reduces the current gain and you want to make it as small as possible. This is achieved in the HBT by using an emitter of a larger band gap than the base as shown in the figure 4.10. Here the dotted line is the homojunction. In this way I1 is reduced because the holes in the base see a larger potential barrier to entering the emitter.

4.4 Field-Effect transistor

The FET is controlled by the gate. The gate has a capacitance to the body of the FET and when this capacitance is charged a sheet of charge forms below the oxide and this is the charge sheet [25] model. The body of the NFETs is tied to ground and the body of the PFETs is tied to supply. The source and drain are of the opposite type to the body, and so in the absence of a gate voltage the source to body region is either at equilibrium or is reverse biased while the drain to body diode is definitely reverse biased, so current does not flow from the drain to the source.

Figure 4.11: Saturation of an FET. (Cross-section with source s, gate g and drain d.)

The gate voltages have to lie between supply and ground. For an NFET if the gate is at supply the charge sheet that forms under the gate is negative, i.e. it is made of electrons. The source is n type and is at ground potential. If the drain is somewhat lower than the gate, the charge sheet is connected to both the source and drain and the electrons from the source see an electric field from the drain to source in which they travel. If the drain is also at supply, just like the gate is, then the charge sheet cannot reach up to the drain because you cannot have a charge sheet unless there is a capacitor voltage to support it. This situation is called saturation and is shown in the figure 4.11 [26].

If you hold the gate at supply and slowly step the drain up from ground to supply, the steadily increasing drain to source electric field causes a steadily increasing drain to source current flow. As the drain approaches supply and the charge sheet at the drain starts to disappear, an increasingly larger portion of the drain to source voltage is dropped across the gap between the drain and the charge sheet due to its higher resistance.

Figure 4.12: Linear and saturation operation of the FET. (Plot of I against Vds, with the linear region on the left and the saturation region on the right.)

Even in the saturation region the current does increase with increasing drain voltage, however the rate of current increase with drain voltage starts to fall, and for low gate voltages the gap between the charge sheet and the drain becomes large enough that the drain current barely rises with drain voltage.

The figure 4.12 shows the two regions of operation of the FET other than the sub-threshold, namely the linear and the saturation. The figure 5.6 shows the sub-threshold behavior. The equation 4.8 represents the linear region, the equation 4.9 represents the saturation region and the equation 4.10 represents the sub-threshold region.

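$$I_d = \mu \frac{\varepsilon_{ox}}{T_{ox}} \frac{W}{L} \left[ (V_{gs} - V_{th})\, V_{ds} - \frac{1}{2} V_{ds}^2 \right] \tag{4.8}$$

$$I_d = \mu \frac{\varepsilon_{ox}}{T_{ox}} \frac{W}{L} (V_{gs} - V_{th})^2 \tag{4.9}$$

$$I_d = k_x \frac{W}{L}\, e^{qV_{gs}/nkT} \left( 1 - e^{-qV_{ds}/nkT} \right) \tag{4.10}$$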

The effect of the high electric field on mobility is given in [27], [28]. Several theories on computing the threshold voltage are given in [29], [30]. Other models for conduction are given in [31], [32]. The effect of moving from metal gates to polysilicon gates is given in [33]. The narrow width effect is described in [34], [35].

4.5 FET small signal equivalent

Figure 4.13: Parasitics of a FET. (Terminals: source s, gate g and drain d.)

There are two kinds of circuits that we use, namely digital and analog. When you simulate a digital circuit you are mostly interested in its temporal behavior. On the other hand when you design an analog circuit you are also interested in its harmonic behavior. So essentially when you design a digital circuit you only need to make sure that as you vary the signals in the time domain, the outputs of the circuits change rapidly enough and that they can drive enough current to charge and discharge the load capacitances.

When an analog circuit is designed there are really two parts to the process which you need to iterate through. The first part is to set the bias point. The bias point for a FET is the combination of the three voltages Vgs, Vbs and Vds. For a bipolar transistor it is the combination of Vbe and Vce.

Figure 4.14: Small signal equivalent of a FET. (Terminals: gate g, source s, drain d and body b.)

The figure 4.13 shows the parasitics of a FET. Using these parasitics we get the small signal equivalent of a FET as shown in the figure 4.14. The only other component is a current source which is given by I = gm Vgs. gm is called the transconductance and is given by gm = dIds/dVgs. The value of gm is dependent on the bias point. The resistance is the channel resistance and it too is dependent on the bias point.

The small signal equivalent is useful in order to obtain the behavior of a circuit when AC signals are applied to the gates. But keep in mind that this AC behavior is only valid so long as the signal is very small compared to the supply voltage. If the prediction is for large swings in the drain voltages, then the prediction is incorrect because under such conditions you are too far from the bias point at which the equivalent circuit was extracted and so the capacitances, the transconductance and the channel resistance will actually be different. So a small signal equivalent is used as a design tool but a time domain simulation using a sufficiently small time step will give you the actual response of the circuit.
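The limit of the small signal picture can be illustrated with the saturation equation 4.9. The sketch below extracts gm at one bias point by a numerical derivative and then compares the linear prediction gm × ΔVgs with the actual change in current for a small and a large gate swing. The lumped constant `k` (standing for µ εox/Tox × W/L), the threshold voltage and the bias point are assumed values chosen only for illustration.

```python
def drain_current_sat(v_gs, v_th=0.5, k=1e-3):
    """Equation 4.9 with k standing for mu * eps_ox / T_ox * W / L (A/V^2)."""
    v_ov = max(v_gs - v_th, 0.0)
    return k * v_ov ** 2


def transconductance(v_gs, dv=1e-6):
    """gm = dId/dVgs at the bias point, by central difference."""
    return (drain_current_sat(v_gs + dv) - drain_current_sat(v_gs - dv)) / (2 * dv)


def linear_error(v_bias, swing):
    """Relative error of the small signal prediction gm * swing."""
    actual = drain_current_sat(v_bias + swing) - drain_current_sat(v_bias)
    predicted = transconductance(v_bias) * swing
    return abs(predicted - actual) / actual


v_bias = 1.0
print(f"gm at Vgs = {v_bias} V: {transconductance(v_bias):.3e} S")
for swing in (0.01, 0.4):
    print(f"swing {swing} V -> error {100 * linear_error(v_bias, swing):.1f}%")
```

With a 10 mV swing the linear prediction is off by about one percent, while a 0.4 V swing is off by more than a quarter, which is why a time domain simulation is needed for large signals.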

4.6 Other transistors

The most visible place that MESFETs are used is in circuits used in routers. They are made using GaAs MESFETs because GaAs has a much higher mobility than Si and thus you can get a much higher speed out of the same size chip than using Si MOSFETs. The competition for GaAs MESFETs is provided by Si BJTs which are fast by virtue of being junction devices.

The MESFET does not use a gate oxide. Instead the metal forming the gate is deposited directly onto the GaAs between the source and the drain. When metal is in contact with the GaAs it forms a Schottky contact and there is a barrier which needs to be overcome for current to flow. This barrier is caused by the difference in work function between the metal and the GaAs.

So when a MESFET is turned on you actually have a diode to ground and so gate current flows continuously. This is different from the MOSFET where the gate current stops flowing once the gate is charged. Just like a MOSFET the current flow starts when the body to source is forward biased. The way this happens is similar to the MOSFET in that the gate voltage is initially shared between the Schottky diode and the source to body diode and the region under the gate does get an increased number of electrons which then see the drain to source electric field.

So even here you get a gain because for every electron which travels from the source into the gate, you have a large number of electrons traveling from the source to the drain, but the gate still controls this current flow by controlling the potential of the channel immediately adjacent to the source.

The High Electron Mobility Transistor received a lot of attention because it uses a channel which is a two dimensional electron gas, giving the carriers an extremely high mobility. One interesting thing about the development of the HEMT was that modeling engineers would predict a theoretical maximum mobility and then the experimental researchers would promptly obtain a higher mobility, and this happened several times.

Figure 4.15: The well of a HEMT.


The channel of the HEMT is formed by growing a low band gap layer of semiconductor over a layer of high band gap semiconductor of the same lattice periodicity. The easiest is AlxGa1−xAs. As you increase the fraction x of Aluminum the band gap increases but the lattice spacing remains the same. The best reference to start with for information about AlxGa1−xAs properties is [36]. For a simple example of usage you could look at [37].

For the case of an abrupt i − n+ heterojunction as shown in the figure 4.15, a large number of electrons on the high band gap side on the right will fall down into the low band gap side. This creates a charge dipole which will raise the bulk potential on the left side of the figure up until the Fermi levels align. Immediately to the left of the junction there is a deep valley in the conduction band toward the Fermi level to account for the excess electrons.

This potential valley now requires you to solve for allowed states using the wave equation, giving you the density of states function. Poisson’s equation needs to be satisfied as well. Due to the well the energy along the vertical axis is constrained so there is one less degree of freedom and hence the mobility is higher along the lateral axis.

Strained layers are another version of growing different types of semiconductor on top of each other, but here the lattice spacing of the two layers is different, so the layer with a smaller lattice spacing is pulled apart whereas the other is pushed together.

If the layers are too thick, the shear force will weaken the junction, but by alternating two or three mono-layers of these two materials you can get a new semiconductor with intermediate properties such as band gap [38]. So a strained layer in the channel of a FET using Ge will create a channel region with a lower band gap, allowing carriers to fall into it and a larger current to flow.

4.7 Shrink problems at 0.06µ and below

4.7.1 The premise of the shrink

From the 1 µ to the 0.18 µ generations, shrinks were very straightforward. They were based on the saturation equation for MOSFET current given by:

$$I = \mu \frac{\varepsilon_{ox}}{T_{ox}} \frac{W}{L} (V_{gs} - V_{th})^2 \tag{4.11}$$

Device parameter        Value after shrink
Gate oxide thickness    0.707
Length                  0.707
Width                   0.707
Supply voltage          0.707
Threshold voltage       0.707
Current                 0.707
Gate loading            0.707
Operating frequency     ≈ 1/0.707 = 1.41

Table 4.1: Effect of shrink.

If you shrink the MOSFET by 0.707 as shown in the Table 4.1 then the ratio of the new saturation current to the old saturation current and the ratio of the new gate capacitance to the old gate capacitance are given by:

$$\frac{I_{new}}{I_{old}} = \frac{W_{new}\, L_{old}\, T_{ox\_old}}{W_{old}\, L_{new}\, T_{ox\_new}} \cdot \frac{(V_{gs\_new} - V_{th\_new})^2}{(V_{gs\_old} - V_{th\_old})^2} = 0.707 \times \frac{1}{0.707} \times \frac{1}{0.707} \times (0.707)^2 = 0.707 \tag{4.12}$$

$$\frac{\text{New load}}{\text{Old load}} = \frac{W_{new}\, L_{new}\, T_{ox\_old}}{W_{old}\, L_{old}\, T_{ox\_new}} = 0.707 \times 0.707 \times \frac{1}{0.707} = 0.707 \tag{4.13}$$

So you have 0.707 of the current charging a load of 0.707 what it was previously to a voltage of 0.707 what it was previously. The charging time goes as the load times the voltage divided by the current, so it falls to 0.707 of its old value, and you can operate the chip at approximately 1.4 times the clock rate that it was operating at previously.

Today’s reality is a little different from the past. Let us compare an imaginary 0.13 µ Leff process to an imaginary 0.06 µ Leff process. This is shown in table 4.2. Because I am making up these numbers to show a point, they are not very real but at least they are good enough to understand the basic ideas.

Device parameter              0.13 µ Leff     0.06 µ Leff
Vdd (Supply voltage)          1.8 V           1.1 V
Vth (Threshold voltage)       0.5 V           0.4 V
St (Sub-threshold swing)      125 mV/dec      100 mV/dec
Tox (Gate oxide thickness)    25 Å            14 Å
Ldiff (Sub-diffusion)         0.022 µ         0.015 µ
Gate loading                  z               0.6 z
Xj (Junction depth)           0.15 µ          0.10 µ
ND (Drain doping level)       10^20 /cm^3     3 x 10^20 /cm^3
Tj (Junction temperature)     100 °C          70 °C
Thermal budget                x               0.6 x
Operating frequency           3 GHz           12 GHz
Heat generation               y /mm^2         higher

Table 4.2: Comparison of device parameters.

Let us look at each parameter of the Table 4.2, what is limiting it if anything, and what effect that has on the MOSFET’s operation and on the circuit behavior as well. Let us use the NFET to illustrate our comparison.

4.7.2 Vth (Threshold voltage)

As a general rule supply voltage should scale directly with gate length. So therefore the expected supply voltage for a 0.06 µ Leff should be 0.06/0.13 of the 0.13 µ Leff supply voltage, i.e. about 0.83 V, but that is not really feasible because of the Vth.

In order for the FET to be useful, we need to be able to turn it on and off just like a switch. From chapter 2 we know that when the Vgs is lower than the Vth of an NFET, the FET is meant to be off. But is it really off? Also we know that when the Vgs is larger than the Vth the FET is meant to be on. But is it really on? Just like everything in life, there is no black and white, there is a whole gray area when it is neither one nor the other.

In the Figure 4.16 you see the gate characteristic at Vds = Vsupply and it is divided into 2 portions: essentially the portion on the left, which is the sub-threshold region, and the portion on the right, which is the saturation region. Keep in mind that the y axis here is in the logarithm scale, i.e. 1 actually means 10, 2 means 100 etc. When the Vgs is zero, the current is as low as it will go and this current is called Ioff. The current at Vgs = Vth is Ith. You want the Ioff to be about 4 decades lower than Ith and you want Ion to be about 2 decades higher than Ith. For example if the Ith is 1 µA, then you want Ioff to be 100 pA and you want Ion to be about 100 µA.

Figure 4.16: The gate characteristic showing Vth. (Plot of Log I marking Ioff at 0, Ith at Vth, Ion, the 4 decades between Ioff and Ith, and the headroom above Vth.)

Now since we just reduced the supply voltage to 1.1 V, we have to also reduce Vth. But if we reduce Vth, the curve in the Figure 4.16 moves to the left and the Ioff will increase significantly. But if that happens then the chips will consume a lot of power all the time and they will get very hot and basically melt. So we will only reduce the Vth to 0.4 V. But even so we have to increase the sub-threshold slope so that the Ioff stays the same. Also when we reduced the supply voltage, our headroom also reduced from 1.3 V to 0.7 V. This is probably a problem, but the real extent of the problem won’t be clear until we set all the other parameters and test the new transistor. The problem is that:

$$I_{on} = \mu_n \frac{\varepsilon_{ox}}{T_{ox}} \frac{W}{L_{eff}} (V_{gs} - V_{th})^2 \tag{4.14}$$

In this equation W and Leff scale together, so we get no advantage there. Tox reduces so we get some advantage there, but (Vgs − Vth)² just reduced from 1.3² to 0.7². So the net result is that Ion is lower but that is still all right because the load capacitance also reduces as we will see later. So let’s hold off on judging the transistor until later.
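The net effect on Ion can be estimated from equation 4.14 using the made-up numbers of Table 4.2. This sketch assumes that the mobility and the ratio W/Leff are unchanged between the two processes and that the gate is driven to the full supply voltage, so only the oxide thickness and the headroom matter. The function name `ion_relative` is hypothetical.

```python
def ion_relative(t_ox_angstrom, v_dd, v_th, w_over_l=1.0):
    """Equation 4.14 up to the constant mu_n * eps_ox, with Vgs = Vdd."""
    return w_over_l / t_ox_angstrom * (v_dd - v_th) ** 2


old = ion_relative(25.0, 1.8, 0.5)   # 0.13 um column of Table 4.2
new = ion_relative(14.0, 1.1, 0.4)   # 0.06 um column of Table 4.2
ion_ratio = new / old
print(f"oxide term gains     x{25.0 / 14.0:.2f}")
print(f"headroom term loses  x{(0.7 / 1.3) ** 2:.2f}")
print(f"Ion(new) / Ion(old) = {ion_ratio:.2f}")
```

Under these assumptions the thinner oxide recovers only part of what the smaller headroom takes away, so Ion falls to roughly half.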

4.7.3 St (Sub-threshold swing)

The dependency of the FET’s current on Vgs when Vgs is less than Vth is basically similar to that of a diode, i.e. it is exponential in nature. But unlike a diode, in this case the Vgs is not really the voltage across the body-source diode because Vgs is applied across the gate-source contacts and it is related to the body-source diode voltage through the gate capacitor. Now we know that a reverse biased diode acts as a capacitor, so the sharing of Vgs between the gate-body capacitance and the body-source diode capacitance is proportional to the ratio of the depletion width of the body-source diode and the oxide thickness Tox. In other words if the ratio of the Tox to the body-source diode depletion width is smaller, you will get a steeper rise in the sub-threshold current vs. Vgs.

Figure 4.17: The gate characteristic showing swing. (Plot of Log I marking Vth and the voltage step dV per decades of current.)

The sub-threshold swing is the inverse of the slope of the sub-threshold drain current vs. Vgs and is expressed in mV per decade of current rise. At 0.13 µ Leff the St was 125 mV/dec, which put the 0.5 V threshold 4 decades above Ioff. Now we need a steeper slope so we use 100 mV/dec, so that works out to 0.4 V over 4 decades of current. So then, the off current density will stay the same.
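The arithmetic behind the swing can be written out as follows. This is a sketch that assumes the sub-threshold current is a straight line on the log plot all the way from Vgs = 0 to Vgs = Vth, and it uses the example threshold current of 1 µA from the earlier discussion. The function name `off_current` is hypothetical.

```python
def off_current(i_th, v_th, swing_mv_per_decade):
    """Current at Vgs = 0 given the current i_th at Vgs = v_th."""
    decades = v_th / (swing_mv_per_decade * 1e-3)
    return i_th * 10.0 ** (-decades)


I_TH = 1e-6   # example threshold current of 1 uA

ioff_old = off_current(I_TH, 0.5, 125.0)          # 0.13 um process
ioff_new = off_current(I_TH, 0.4, 100.0)          # 0.06 um process
ioff_unimproved = off_current(I_TH, 0.4, 125.0)   # lower Vth, old swing

print(f"old process:           {ioff_old:.2e} A")
print(f"new process:           {ioff_new:.2e} A")
print(f"new Vth with old swing: {ioff_unimproved:.2e} A")
```

Lowering Vth to 0.4 V without improving the swing would leave only 3.2 decades between Ith and Ioff, so the off current would be several times larger.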

4.7.4 Tox (Gate oxide thickness)

The gate oxide as well is normally scaled linearly with gate length. The issue here is that the gate oxide must not break down when you apply the maximum supply voltage on it. At least to the first order each mono-layer can withstand about 200 mV. The catch is that as the thickness gets lower the probability of electrons tunneling across the gate oxide from the gate to the channel, or equally probably from the channel to the gate, starts to get larger. Many physicists and experimentalists have come to the conclusion that 5 mono-layers is the physical limit. The other aspect to this is that as the gate lengths reduce there are larger concentrations of electrons in the channel and they are moving very fast, so due to the Boltzmann distribution of energies there will always be a highly energetic tail in the distribution where the particles have a lot of energy and want to try to tunnel through the gate oxide. 5 mono-layers is 14 Å and that is what we chose for the 0.06 µ Leff, which makes some of us believe that when we go to 0.03 µ Leff the gate is going to be very leaky indeed. There are other materials such as Silicon-Nitride that have a slightly higher ε than Silicon di-oxide but they have different problems of their own.

Another interesting problem with making a 14 Å gate oxide is the way that gate oxide is made. Gate oxide needs to be very pure so it is normally grown, not deposited, by heating the wafer at 800 °C for about a half-hour in pure oxygen and nitrogen without water vapor. But when you put the batch of wafers into the oven, it takes a few minutes for the temperatures to stabilize, and so the timing gets to be a problem as the thickness of the gate oxide is reduced. The oxide grows the fastest in the first few minutes after the wafer is placed in the oven, because at that time all you have is pure silicon reacting with the hot oxygen gas and forming silicon dioxide. As time goes on the reaction slows down because the silicon under the oxide is separated from the oxygen it wants to react with by the growing oxide layer. Because of this, even if you put the batch of wafers into the oven and take them out in two minutes you will have significant oxide on them and the variation is very large. This was not a problem when you were growing 60 Å of gate oxide because such a thick oxide growth buffers itself, but it is a very big problem when you only want to grow 14 Å worth of gate oxide. Due to the normal variation in the temperature of the oven, depending on how long you keep the oven door open and other such things that you normally don’t consider, the oven temperature during the first few minutes after you close the oven door may vary between 700 °C and 800 °C, and of course that means you may get 6 mono-layers instead of 5, which is already 20% too much.

So basically we need some researcher to come up with a way to inhibit the oxide growth until the oven temperature stabilizes and also slow down the oxide growth process so that we can keep the batch of wafers in the oven for a half-hour give or take a minute. I am sure that some bright young student at Berkeley or Stanford will do just that. Although keep in mind that a solution to this problem has been in existence for decades and it is called molecular beam epitaxy or MBE, where you can grow precisely as many mono-layers as you want very repeatably indeed. Because of its high cost MBE is normally used for building semiconductor lasers and other higher cost products.

When you reduce the gate oxide thickness the capacitance per unit area increases. But since the W and the Leff both reduce as well, the area decreases as well. Let the minimum W for the 0.06 µ Leff process be 0.5 µ and the minimum W for the 0.13 µ Leff process be 1.0 µ, then:

$$\frac{\text{New gate capacitance}}{\text{Old gate capacitance}} = \frac{0.06\,\mu \times 0.5\,\mu \times 25\,\text{Å}}{0.13\,\mu \times 1.0\,\mu \times 14\,\text{Å}} = 0.41 \tag{4.15}$$
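Equation 4.15 can be split into its two competing parts, the capacitance per unit area and the gate area. The sketch below treats the gate as a parallel-plate capacitor and drops the common constant εox. The function name `gate_capacitance_relative` is hypothetical.

```python
def gate_capacitance_relative(l_eff_um, w_um, t_ox_angstrom):
    """Parallel-plate gate capacitance up to the constant eps_ox: area / thickness."""
    return l_eff_um * w_um / t_ox_angstrom


old = gate_capacitance_relative(0.13, 1.0, 25.0)
new = gate_capacitance_relative(0.06, 0.5, 14.0)
cap_ratio = new / old

per_area_ratio = 25.0 / 14.0                       # thinner oxide raises C per area
area_ratio = (0.06 * 0.5) / (0.13 * 1.0)           # smaller gate lowers the area

print(f"capacitance per unit area: x{per_area_ratio:.2f}")
print(f"gate area:                 x{area_ratio:.2f}")
print(f"gate capacitance:          x{cap_ratio:.2f}")
```

The area shrinks faster than the capacitance per unit area grows, which is why the ratio comes out well below one.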

4.7.5 Ldiff (Sub-diffusion)

The source and drain are implanted on either side of the gate using a method called the self-aligned process. However subsequent heat cycles, for example for the purpose of annealing the source and drain regions or otherwise for growing an oxide etc., will cause these implants to diffuse away in all directions and one of those directions is under the gate. The length of the gate as it is drawn in the self-aligned process is called the drawn gate length Ldrawn and this is usually longer than the effective gate length Leff, and the sub-diffusion is the difference between the two, i.e.:

$$L_{diff} = L_{drawn} - L_{eff} \tag{4.16}$$

So when you shrink the MOSFET you have to reduce the sub-diffusion as well. One way to do this is to reduce the heat cycles following the implant process. Another way is to reduce the implant energy as a way to reduce the damage and thereby to do away with annealing. Another method is to implant acceptors at a large angle into the source and drain into the region under the gate as shown in the Figure 4.18 as a way to counteract the sub-diffusion and increase the effective gate length back toward the drawn gate length.

Figure 4.18: The sub-diffusion and the implants to reduce it. (Cross-section with source s, gate g and drain d.)

Some semiconductor companies adjust the implants so that Leff ≈ Ldrawn, and that may sound like a good thing, but consider this. If an effective gate length is shorter than it is drawn because Ldiff > 0, then the current becomes larger and the chip runs correctly but may consume too much power. But if an effective gate length is longer than it is drawn because Ldiff < 0, then the extra channel region is not directly under the gate and so the gate control of this region is not direct. This is a huge problem because the MOSFET may not even turn on properly, which means the current will be too low, which means it is non-functional. For example if a microprocessor is sold as a 2.5 GHz processor, people are not going to be happy if it can only run at 2.0 GHz or even less. Even worse, they may run it at 2.5 GHz not knowing that some circuits are not operating correctly, and then the computer will behave unpredictably.

4.7.6 Gate loading

For all practical purposes we consider the load seen by the logic gates to be a capacitance. That capacitance has three main components:

Gate capacitance

When you lower the gate oxide thickness you increase the capacitance per unit area as we discussed in the subsection on Tox, and if you reduce the Tox, Ldrawn and W by the same factor of say half you hope that the load due to the gate capacitance is reduced to half as well. But in reality the W does not reduce to half. Consider this: when we design a next generation microprocessor we are not satisfied if it can do the same as the previous generation did. No, we want it to run faster and faster. So as a result the W actually used in the chips is usually larger than half. Maybe a more realistic estimate of W is two-thirds. So effectively the gate loading due to the gate capacitance of the load reduces not to a half but perhaps closer to two-thirds.

Junction capacitance

The source and drain are junctions as in a diode. These junctions are reverse biased w.r.t. the body. A reverse biased diode has a fairly large capacitance as shown in the Figure 4.19.

Figure 4.19: The source junction capacitor.

The junction capacitance is really made of two components. The vertical component is related to the area of the source or drain. It is a capacitance between the bottom of the source implant and the body below it. The other component is the lateral component and it is related to the periphery of the source or drain implant and the depth Xj of the implant. The other factor is the doping concentration of the source and drain because the higher the concentration the lower the depletion width of the reverse biased diode. The Figure 4.20 shows the source and drain junction viewed from above.

Figure 4.20: The area and periphery of the junction capacitor. (Rectangle with sides L and W.)

So ultimately the capacitance that is seen is given by:

$$C = \left[ \frac{\varepsilon_{silicon}\, W\, L}{T_{depletion}} \right] + \left[ \frac{2\, \varepsilon_{silicon}\, (W + L)\, X_j}{T_{depletion}} \right] \tag{4.17}$$
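The two terms of equation 4.17 can be compared before and after a shrink. All of the dimensions below are hypothetical numbers chosen for illustration: W reduces to about two-thirds, L to half, Xj from 0.15 µ to 0.10 µ as in Table 4.2, and the depletion width is assumed to reduce from 0.05 µ to 0.035 µ because of the higher doping.

```python
EPS_SI = 11.7 * 8.854e-12   # permittivity of silicon, F/m


def junction_capacitance(w, l, x_j, t_dep):
    """Return the (area, periphery) terms of equation 4.17 in farads; lengths in metres."""
    area_term = EPS_SI * w * l / t_dep
    periphery_term = 2.0 * EPS_SI * (w + l) * x_j / t_dep
    return area_term, periphery_term


UM = 1e-6   # one micron in metres

# Hypothetical source region before and after the shrink.
old_area, old_per = junction_capacitance(1.0 * UM, 0.50 * UM, 0.15 * UM, 0.050 * UM)
new_area, new_per = junction_capacitance(0.667 * UM, 0.25 * UM, 0.10 * UM, 0.035 * UM)

area_ratio = new_area / old_area
periphery_ratio = new_per / old_per
total_ratio = (new_area + new_per) / (old_area + old_per)

print(f"area term:      x{area_ratio:.2f}")
print(f"periphery term: x{periphery_ratio:.2f}")
print(f"total:          x{total_ratio:.2f}")
```

With these assumed numbers the periphery term shrinks less than the area term, because it depends on the sum W + L and on Xj rather than on the product W L.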

The first component is not good because we have already decided that the W is only going to reduce to two-thirds and so the best that we can expect of this component is that it reduces in a manner similar to the gate capacitance. The second component is not good at all because it is not dependent on (W L) but rather (W + L) and it also depends on the depth Xj which definitely does not reduce to half. So to sum it up we want the junction capacitance to reduce to half but we have to settle for perhaps two-thirds or even more than two-thirds.

Line capacitance

Figure 4.21: The interconnect line capacitances. (Vertical and horizontal capacitances between neighboring lines.)

The lines interconnecting the FETs are made of aluminum and more recently of copper. The lines appear as shown in the Figure 4.21. They are stacked in as many as 5 layers and are separated horizontally and vertically by a dielectric with the consistency of glass. When you shrink the chip these lines get closer to each other. So the capacitance of these lines w.r.t. each other increases. Of course to compensate the lines do get shorter because the FETs get closer to each other. But in the end the line capacitance does not reduce enough as the chip shrinks.

4.7.7 Xj (Junction depth)

Junction depth Xj is the depth at which the source and drain diffusions change type back into the substrate doping type, so in this case of the NFET it is the junction between the n-type source or drain and the p-type body. Like everything else, you want the junction depth to reduce as the device shrinks. This is a problem because in order to decrease the implant depth, you have to reduce the implant energy. As you continue to reduce the energy a point is reached when the energy of the dopants that are trying to penetrate the silicon surface is so low that the statistical variation of the implant depth becomes large.

Part of the problem is that implants are usually done at a slight angle of perhaps 17° from the vertical. The reason for this is that if the implants are done vertically a phenomenon called channeling occurs, where the dopants apparently see a tunnel and they can travel quite a long distance within that tunnel and so the depth of the implant is very large. To combat this, the implant has to be done so that you cannot see a line of sight through the silicon crystal, so that the implant dopants collide with a lattice atom and thereby stop penetrating further into the crystal. The problem is that at low enough energies the implant dopants may simply get reflected off the surface of the crystal and it is hard to predict exactly how much of the implant will be reflected this way. So it is not that you will not get a shallow implant, it is just that sometimes you will and sometimes you won’t, and this type of variation will quite simply make the chip non-functional and that is the problem.

4.7.8 ND (Drain doping level)

Sometime by the mid-nineties doping densities for the source and drain implants reached degenerate levels. All that means is that the number of dopants exceeds the number of unique states allowed by quantum mechanics. In itself this is not an issue but what it does do is that this super high density of dopants changes the behavior of the silicon crystal. One important effect of extremely high doping density is the reduction in the mobility of holes and electrons [39].

This directly means you are reducing the current flow and that is not what we want. And yet we don’t have a choice in this: as the devices shrink the dopant densities have to go up to provide enough carriers. Another bad effect of extremely high doping densities is leakage currents. This is something that can only be explained with a certain amount of quantum mechanics so I’m not going to say any more.

4.7.9 Tj (Junction temperature)

Figure 4.22: The change in the gate characteristic with temperature. (Plot of Log I, with a second curve for lower temperature.)

The effect of junction temperature is more readily visible in the sub-threshold region than in the saturation region. In the Figure 4.22 the curve that is drawn as a dash-dot line is what happens when the temperature is lowered. There are already a few computer vendors that are offering water cooling to reduce the operating temperature of the computer’s CPU by as much as 40-50 °C. These coolers are very effective and I think you will see a lot more of that, even going to the extent of using refrigeration and good quality copper plates to suck the heat away from the CPU and graphics chips.

Cooling a chip made of MOSFETs is a win-win situation. The sub-threshold current is basically a diode current, so its response to an applied voltage is based on units of [kT/q] and this quantity is normally referred to as VT and at room temperature it is about 26 mV. So you can imagine that for every 26 mV change in applied voltage the current will change by a factor of e = 2.718. But if you raise the temperature to 100 °C the VT is 32 mV so the current changes by e every time the applied voltage changes by 32 mV. So by cooling the chip down so that it runs at 10 °C, you can reduce the Vth by as much as 60 mV and you can use this extra 60 mV to increase the saturation current.
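The thermal voltage figures quoted above can be reproduced directly from kT/q. The sketch also converts VT into a swing in mV per decade using the exponent of equation 4.10, where the factor `n` is the sub-threshold ideality factor; the value n = 1 used here is an idealized assumption and real devices have a larger value.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q = 1.602176634e-19     # electron charge, C


def thermal_voltage_mv(temp_c):
    """Return VT = kT/q in millivolts for a temperature in degrees Celsius."""
    return 1e3 * K_B * (temp_c + 273.15) / Q


def swing_mv_per_decade(temp_c, n=1.0):
    """Gate voltage needed for a tenfold change in current, from exp(q Vgs / n k T)."""
    return n * math.log(10.0) * thermal_voltage_mv(temp_c)


for temp_c in (100.0, 27.0, 10.0):
    print(f"{temp_c:5.0f} C: VT = {thermal_voltage_mv(temp_c):.1f} mV, "
          f"ideal swing = {swing_mv_per_decade(temp_c):.1f} mV/dec")
```

Because the swing is proportional to the absolute temperature, a cooler chip needs fewer millivolts of gate voltage for each decade of current, which is what lets the threshold voltage come down.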

At the same time you win in the saturation region as well. When you reduce the temperature the atoms in the lattice have less energy and vibrate more gently. The thermal energy of the electrons also reduces. As a result the mobility µ of the electrons in the channel increases. The current increases directly with mobility, so the Ion increases as shown in Figure 4.22. When the Ion increases, the charging and discharging of the load capacitance is faster, so the chip runs faster.

4.7.10 Thermal budget

Thermal budget is a process issue. It refers to the heat cycles that a wafer goes through as it is processed from a blank wafer with nothing on it to a completed wafer that is ready to be broken up and packaged into working chips. The thermal budget is important because it affects the diffusion of the dopants [40]. To understand the worst case scenario, note that if you leave a functional wafer in the oven at over 800 °C for several days, all the implanted dopants will diffuse around to the point where the sources and drains will merge and the wafer will be non-functional.

The two main reasons for the thermal cycles are oxide growth and the anneals required after implantation. So it would be nice if the critical implants that we don’t want to diffuse could be delayed and done after the oxide growths and after less critical implants. But to some extent the order of the different implants, anneals and oxide growths is immutable. For example the anneal has to follow the implant, although other implants may take place in between. Sometimes the oxides are used as masks during an implant, so they have to be grown prior to that particular implant. An oxide growth can double as an anneal; the problem with that is that the temperature required to anneal is different from the temperature required to grow an oxide. To create an oxide you need to heat to only 800 °C or so, because all you are doing is providing enough energy for the oxidation reaction to proceed rapidly. To anneal the semiconductor, however, you need to heat to perhaps 1100 °C to give the silicon and dopant atoms enough energy to move about and create new bonds with their neighbors in an orderly manner.

So when you do a thermal budget you have to somehow scale the different temperatures to a common temperature. The temperature I normally like to use for a thermal budget is 850 °C. The higher temperature cycles at 1100 °C and so on are scaled to the 850 °C temperature by multiplying by a pro-rating factor. Keep in mind that this factor will differ for different diffusing particles, but you may pick a factor that represents the diffusion of the most critical particles.
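One way to picture the pro-rating factor is sketched below in Python. It assumes that the diffusivity follows an Arrhenius law and that the factor is the ratio of the diffusivity at the cycle temperature to the diffusivity at 850 °C. Both the Arrhenius form and the activation energy of 3.5 eV are assumptions made for illustration, not values taken from the text, and the heat cycles listed are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def prorating_factor(temp_c, ref_temp_c=850.0, activation_energy_ev=3.5):
    """Ratio of diffusivity at temp_c to diffusivity at ref_temp_c (Arrhenius law).

    activation_energy_ev is an assumed, illustrative value; the real number
    depends on the diffusing species.
    """
    temp_k = temp_c + 273.15
    ref_k = ref_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / ref_k - 1.0 / temp_k)
    return math.exp(exponent)

def equivalent_minutes(steps, ref_temp_c=850.0, activation_energy_ev=3.5):
    """Sum (minutes, temperature in C) steps as equivalent minutes at ref_temp_c."""
    return sum(minutes * prorating_factor(temp_c, ref_temp_c, activation_energy_ev)
               for minutes, temp_c in steps)

# Hypothetical heat cycles: 30 min oxide growth at 850 C, 10 s anneal at 1100 C
steps = [(30.0, 850.0), (10.0 / 60.0, 1100.0)]
print(f"factor at 1100 C: {prorating_factor(1100.0):.0f}")
print(f"equivalent time at 850 C: {equivalent_minutes(steps):.0f} min")
```

With these assumed numbers the factor at 1100 °C is several hundred, which shows why even a very short high temperature anneal can dominate the budget.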

Another feature of the thermal budget is that when you assess the impact of the heat cycles, it is usually as an aggregate of the effects of the different portions of the heat cycle on the implants that precede them. So the total thermal budget is not meaningful except in that it gives you a mind-picture of what the process looks like. As a general rule, however, a lower thermal budget is usually a good thing because it means that the particles you implant stay where you put them.

4.7.11 Heat generation

The issue of the heat generated by a chip is also worrisome when considering the chips of the future. This is mostly important for large chips such as microprocessors. In the discussion below keep in mind that in whatever way power is consumed, it ultimately exits the chip in the form of heat, i.e. the heat produced by the chip is equal to the power consumed by the chip. In a CMOS chip the power is effectively given by:

\[ P_C = \frac{f\, C\, V_{Supply}^{2}}{2} \tag{4.18} \]

where C is the total capacitance switched, VSupply is the supply voltage and f is the frequency of operation of the chip. Keep in mind that C is not the total capacitance of the chip, because the vast majority of the circuitry in the chip maintains whatever state it is in across many cycles and only changes state occasionally. So C is the capacitance that is actually being switched in each cycle. This is why many CMOS manufacturers, especially microprocessor manufacturers, are designing their chips so that as few gates as possible change state at any given time.
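To get a feel for the switching power formula, here is a minimal Python sketch. The clock frequency, switched capacitance and supply voltage are hypothetical numbers picked only to give a plausible order of magnitude.

```python
def dynamic_power_watts(frequency_hz, switched_capacitance_f, supply_v):
    """Switching power: P_C = f * C * V_Supply^2 / 2."""
    return frequency_hz * switched_capacitance_f * supply_v ** 2 / 2.0

# Hypothetical chip: 2 GHz clock, 20 nF switched per cycle, 1.2 V supply
full_activity = dynamic_power_watts(2e9, 20e-9, 1.2)
# Same chip with only half of that capacitance switching each cycle
half_activity = dynamic_power_watts(2e9, 10e-9, 1.2)
# Same activity at a 10% lower supply voltage
lower_supply = dynamic_power_watts(2e9, 20e-9, 1.08)

print(f"full activity: {full_activity:.1f} W")
print(f"half activity: {half_activity:.1f} W")
print(f"lower supply:  {lower_supply:.1f} W")
```

Halving the switched capacitance halves the power, while a 10% drop in supply voltage saves about 19% because the voltage enters as a square.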

There is another source of power consumption, which is the leakage current. Ioff is never zero, so for each and every logic gate in the chip a certain amount of current is leaking from the supply to the ground through whichever FET is supposed to be off. The power lost in this way is given by

\[ P_L = n\, I_{off}^{2}\, R \tag{4.19} \]

where n is the number of circuits that are leaking, and R is the resistance of the path from the supply to the logic ground.

Both types of power consumption rise in every generation of microprocessors: the first because the FET and line capacitances do not reduce sufficiently as the gate length is reduced, and the second because in each generation we are willing to tolerate a slightly higher off current and because we widen the FETs until we get the speed we want out of the chip. The net result is that our microprocessor chips are generating as much heat as a room heater, and we need very good heat sinks and powerful fans to remove this heat.


Chapter 5

Process characterization

5.1 Overview

Semiconductor chip manufacturing has several sub-groups within it, each with its own distinct "philosophy". The most important is the production group because they are responsible for actually passing the Si wafers through the production line and growing or depositing or implanting the chip design onto them. The production group is the place where money is made or lost. The design group is responsible for designing the circuits that are incorporated onto the chip. They think primarily at the circuit level, meaning that to this group the active and passive devices are defined by the nominal characteristics and the variation therein, so the emphasis is more mathematical than physical.

There is a third group that is mostly invisible to the general public, and this is the process characterization group. When a chip equipment manufacturer releases a new machine, the decision of whether to buy it or not is a management decision, but once it is purchased its output needs to be characterized so you can identify what the machine is capable of, and this is done by the process characterization group. Under this umbrella is device characterization, which we will discuss here.

5.2 Test equipment used

Figure 5.1: A probe station.


The probe station that you use to test a wafer usually looks something like figure 5.1. In the center is the wafer chuck with a small heater under it. The stem of the chuck is connected to an X-Y-Z positioner which has vernier screws down to 0.1 µm or so. The wafer to be tested is placed on the chuck and vacuum suction is used to hold it firmly on the chuck. The chuck can be raised and lowered by an adjustable amount by the use of a lever.

Above the chuck and surrounding it is the probe platform that you clamp the probes onto. The probes are usually hard mounted onto a probe card for rigidity, but in the case of analog or RF testing you could also use single probes, which are long needle points with their own X-Y-Z positioners. Above the wafer is the microscope that allows you to look at the wafer you are probing.

Nowadays you also have the option of using a video camera mounted in place of the eyepiece so that you can simply see down the microscope by looking at a video monitor. Surrounding everything is the RF enclosure, which acts like a screen room up to microwave frequencies.

Figure 5.2: A probe card. (The needles are labeled.)

The wafer probe one normally uses is basically a rectangular printed circuit board with a round hole in the center of it. The back edge of the board is striped with the edge connector metal to allow the card to be inserted into an edge connector that is screwed onto the probe platform, as shown at the left of the figure 5.2.

Figure 5.3: Looking at the wafer through the microscope.

Surrounding the hole in the card are metal needles which may be mounted by through-hole soldered connections. These needles are usually made of a metal with low resistivity and as low a thermal expansion as possible. They are not all of the same length, as shown in the figure 5.2, and are angled downward by as much as half a cm. The needle tips are usually arranged to form a rectangular array which coincides with a matching array of contact pads on the wafer.

Looking down through the microscope one will usually see what appears to be a sea of pads as in the figure 5.3, and also the visible portions of the silicon devices and their interconnects and connections to the contact pads.

The parametric analyzer is the measurement unit that can take all the measurements required to characterize most semiconductor devices such as transistors, diodes and resistors. Capacitance measurements are made using the C-V meter, and if the measurements need to be made using sinusoidal inputs and outputs, the equipment used is the network analyzer. Hewlett Packard equipment is the most popular choice among device engineers.

The primary behavior of a parametric analyzer is just to generate tables. Suppose the nominal supply voltage of your circuit is Vs; in practice the actual supply voltage may be as much as 10% higher. Both FETs and BJTs have three terminals, and the FET also has the body contact. In real life usage of the device any terminal may be at any potential, so you basically need all possible combinations of terminal bias. But of course we don’t actually need all combinations; instead there is a method to it, as we discuss below.

5.3 Test circuit layout

The circuits used for direct current measurements are usually distinct from those used for alternating current measurements. If they were interchanged the measurements would not function correctly. In addition there are usually far more test circuits to characterize FETs than BJTs.

As you will see in the discussion of process skew, the variation in the effective gate length is a significant fraction of the minimum gate length. In addition, as you will see in the analog section, longer gate lengths may be used in circuits that are sensitive to the length variation or which need a higher driving point impedance.

Usage            Length     Width
Minimum          0.06 µm    1 µm
2nd order        0.09 µm    1 µm
1st order        0.20 µm    1 µm
Long channel     0.75 µm    1 µm
Short & narrow   0.06 µm    0.2 µm

Table 5.1: Five transistors needed.

For these reasons the DC section for each FET usually contains an absolute minimum of five transistors, often even a few more. For example if the minimum gate length that can be drawn is 0.06 µm and the minimum gate width is 0.2 µm, then the devices chosen may appear as in table 5.1.

The gate length dependence is usually second order in most FET models, so you need the first three devices to fit the 1st order and the 2nd order dependence. The fourth device is the long channel FET, which is used to calculate the threshold voltage and is also needed to do the skewing of the threshold voltage, i.e. the long channel FET is the one which is measured by the production line monitoring of the Vth because it is independent of the variation of Ldiff and will give the variation solely due to the threshold adjust implants. The last device is to fit the short and narrow effects [34], [35].


Often the gates are connected together and to a single contact pad which is used when testing any of the devices, and if you don’t need to get the reverse characteristics, you could connect all the sources to a common pad as well. Similarly several substrate connections can be placed near all the devices and connected to a common pad, so that finally only the drains usually have independent contact pads of their own.

There are two main components that make up the total gate capacitance. In the figure 5.4 there is the component of the gate capacitance between the gate and the body, and this depends upon the area W × L. The other component is the capacitance of the gate to the source and drain junctions. The second capacitance depends primarily on the width W.

Figure 5.4: Gate capacitances. (The dimensions W and L are labeled.)

In addition there is a capacitance that is significant for narrow devices, which is an effective "gate extension". This is the capacitance from the gate to the region of the body on either side of the gate along the W direction. However this is often ignored because the error is small for all devices except the narrow devices, which are rarely used anyway.

So we only have two basic components to separate, and we need two different structures to separate their effects. We write an equation for each structure as shown below and solve the two equations together to separate the contributions of the two components.

C(W1, L1) = [2 × W1 × Cw] + [W1 × L1 × Ca] (5.1)

C(W2, L2) = [2 × W2 × Cw] + [W2 × L2 × Ca] (5.2)

So after measuring the capacitances of the two structures, we use Cramer’s rule to extract Ca and Cw.
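The extraction from equations 5.1 and 5.2 can be sketched in a few lines of Python. The function name, the units and the example structure sizes are assumptions made for illustration; note that the two structures must have different gate lengths, otherwise the determinant is zero and the two components cannot be separated.

```python
def extract_gate_capacitances(w1, l1, c1, w2, l2, c2):
    """Solve equations 5.1 and 5.2 for (Cw, Ca) using Cramer's rule.

    The system is   [2*W1  W1*L1] [Cw]   [C1]
                    [2*W2  W2*L2] [Ca] = [C2]
    """
    det = 2.0 * w1 * (w2 * l2) - (w1 * l1) * 2.0 * w2
    if det == 0.0:
        raise ValueError("the two structures must have different gate lengths")
    cw = (c1 * (w2 * l2) - (w1 * l1) * c2) / det
    ca = (2.0 * w1 * c2 - 2.0 * w2 * c1) / det
    return cw, ca

# Hypothetical structures (W and L in um, C in fF):
# a short gate measuring 12 fF and a long gate measuring 81 fF, both 10 um wide
cw, ca = extract_gate_capacitances(10.0, 0.06, 12.0, 10.0, 0.75, 81.0)
print(f"Cw = {cw:.3f} fF/um, Ca = {ca:.3f} fF/um^2")
```

In this made-up example the edge component comes out at about 0.3 fF per µm of width and the area component at about 10 fF per µm².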

In general BJTs are characterized on a per structure basis, i.e. each size of BJT has its own model. For this reason, most of the devices to be used are placed in the test circuits.

5.4 Measurements

5.4.1 Drain characteristics

The typical drain characteristics appear as shown in the figure 5.5. As you watch the screen of the parametric analyzer you are looking for the square law behavior just to assure yourself that the measurements are OK. So for the drain characteristics, you step the gate voltage from subthreshold to supply and sweep the drain. Different engineers look for different gate voltages to use, but my philosophy has always been "the more the merrier" and in fact I like to have a few drain sweeps in the subthreshold even though they are essentially meaningless for a drain sweep.

Figure 5.5: FET drain characteristics. (Axes: I versus Vds; the curves for Vgs = 1, 2, 3 and 4 level off at currents of 1, 4, 9 and 16.)

5.4.2 Gate characteristics

The typical gate characteristics appear as shown in the figure 5.6. As you watch the screen of the parametric analyzer you want a straight line steeply rising until the Vg gets close to the threshold voltage, and then it starts to flatten out. The long steep rise is only important because when you fit the model to the measurement, if the model’s threshold calculation is incorrect, then the model predicted current will run parallel to the measured curve in this region. But of course most of the design work is done in the region above the threshold voltage, and the subthreshold is important primarily for leakage current and for low power circuits.

Figure 5.6: FET gate characteristics. (Log scale with ticks at −3, −6, −9 and −12; curves for Vd = 0.05 V and Vd = supply.)

If you look at the gate characteristics in a linear plot as opposed to a log plot, then essentially you will not see the subthreshold at all and the plot would appear as shown in the figure 5.7.

The threshold voltage is calculated from the gate characteristic at a drain voltage of 50 mV. The point of maximum slope is determined and the slope extrapolated to the x axis, and then the 50 mV is subtracted from the intercept. In order to get a good result you want the spacing of the gate voltage to be fine and you want to use a 5 point derivative.
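The extraction procedure just described can be sketched in Python as follows. This is only an illustration under stated assumptions: the 5 point derivative is taken to be the standard five-point central difference, the gate voltages are assumed to be evenly spaced, and the sweep is a synthetic one made up for a hypothetical device rather than a real measurement.

```python
def five_point_derivative(y, h, i):
    """Five-point central difference estimate of dy/dx at index i."""
    return (y[i - 2] - 8.0 * y[i - 1] + 8.0 * y[i + 1] - y[i + 2]) / (12.0 * h)

def extract_vth(vg, current, vd=0.05):
    """Extrapolate the steepest tangent to zero current, then subtract vd."""
    h = vg[1] - vg[0]  # assumes evenly spaced gate voltages
    slopes = {i: five_point_derivative(current, h, i) for i in range(2, len(vg) - 2)}
    i_max = max(slopes, key=slopes.get)  # index of the point of maximum slope
    intercept = vg[i_max] - current[i_max] / slopes[i_max]
    return intercept - vd

# Synthetic linear-region sweep (hypothetical device): Vth = 0.40 V, Vd = 50 mV,
# gate voltage stepped in 10 mV increments from 0 V to 1.2 V
vg = [0.01 * i for i in range(121)]
current = [1e-4 * max(0.0, v - 0.45) for v in vg]
vth = extract_vth(vg, current)
print(f"extracted Vth = {vth:.3f} V")
```

On this idealized curve the extracted value lands within about a millivolt of the 0.40 V that was built in. Real measurements are noisier, which is why a fine gate voltage step and a multi-point derivative matter.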

5.4.3 Back bias

The body of NFETs is usually connected to ground whereas the body of PFETs is connected to supply. But consider the case of the upper NFET in a two input NAND gate: its source is at the drain potential of the NFET below it and it could be considerably above ground. To allow proper simulation of all circuits, the gate and drain measurements are repeated with a back bias, which is the body to source voltage. It is negative for NFETs and positive for PFETs and can reach a maximum of supply voltage.

Figure 5.7: FET gate characteristics.

Typically it is enough to measure at two back bias voltages to fit what is essentially a quadratic dependency; I prefer one third and two thirds of the supply voltage. For the most part the back bias will shift the gate characteristics to the right, but there is a cross term dependency between drain and body voltages because they essentially compete at the same diodes. The curves you expect to see are as shown in the figure 5.8, where the dashed line shows the case without back bias.

Figure 5.8: FET gate back bias characteristics. (Log scale with ticks at −3, −6, −9 and −12.)

5.4.4 Collector characteristics

The collector characteristics are obtained by stepping the base voltage and sweeping the collector voltage as shown in the figure 5.9. If the collector voltage is lower than the base voltage the BJT is in saturation, and if it is higher than the base voltage the BJT is in active mode.

Due to the Early effect the collector current rises with an increase in the collector voltage, and the intercept of this slope on the x axis is the Early voltage as shown in the figure 4.9.

Figure 5.9: The collector characteristics. (Axes: Ic versus Vce; the saturated and active regions are marked.)

5.4.5 Diode characteristics

The figure 5.10 shows the diode characteristic. The dotted line shows the exponential rise you would expect to see using the ideal diode equation 5.3

\[ I = I_s \left( e^{qV/nkT} - 1 \right) \tag{5.3} \]

As the current rises, the bulk resistances in the diode share some of the applied voltage and the current rises less than exponentially. The ideal diode equation uses n = 1 but in real life n ≈ 1.5.
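The effect of the ideality factor n in the diode equation can be seen with a minimal Python sketch. The saturation current of 1e-14 A and the temperature of 300 K are hypothetical values chosen for illustration, and the bulk series resistance mentioned above is deliberately left out.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23
ELECTRON_CHARGE_C = 1.602176634e-19

def diode_current(voltage, saturation_current, ideality=1.0, temp_k=300.0):
    """Diode equation: I = Is * (exp(qV / (n k T)) - 1), without series resistance."""
    vt = BOLTZMANN_J_PER_K * temp_k / ELECTRON_CHARGE_C
    # expm1 keeps the result accurate when the exponent is close to zero
    return saturation_current * math.expm1(voltage / (ideality * vt))

IS = 1e-14  # hypothetical saturation current in amperes
for v in (0.3, 0.5, 0.7):
    ideal = diode_current(v, IS, ideality=1.0)
    real = diode_current(v, IS, ideality=1.5)
    print(f"V = {v:.1f} V   n = 1: {ideal:.3e} A   n = 1.5: {real:.3e} A")
```

For the same forward voltage the n = 1.5 diode conducts far less current than the ideal one, and for a large reverse voltage the current settles at −Is.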

Figure 5.10: The diode I-V characteristic. (Axes: I versus V; the saturation current Is is marked.)

5.4.6 Reverse characteristics

In some fabrication processes the source and drain implants are different from each other, while in others they are identical. If the drain is different from the source, then the FET will have different characteristics when the source is used as the drain and the drain as the source. It is a rare circuit which allows the source and drain to be reversed, so usually one does not characterize the FET in reverse mode.


Similarly if you have a situation where the emitter of a BJT may be interchanged with the collector, you definitely need to use a different model for that case because the emitter and collector are completely different, as shown in the figure 4.1. For BJTs one usually does make models for both forward and reverse cases.

5.4.7 S parameter measurement

Figure 5.11: S parameter measurement.

BJTs are often used in circuits operating at several GHz. So they are also characterized using measurements made at frequencies ranging from a few hundred MHz up to 10 GHz or so. The equipment that makes these measurements is called a network analyzer. Two 50 Ω cables connect the network analyzer ports to the BJT as shown in the figure 5.11.

The probe that is used to make these measurements is a 50 Ω coplanar waveguide mounted on a ceramic substrate. The tip that touches down has two contacts, with the lower one (in this case) being the common ground and the upper the signal. The signal contains both DC and AC components.

The DC component on the left sets the Vbe while the DC component on the right sets the Vce of the bias point. The AC component on the left is the input sinusoidal signal, and the collector on the right drives the amplified AC component onto the probe on the right and into the network analyzer.

The network analyzer provides the input sinusoidal signal by superimposing it onto the DC input bias, and it isolates and measures the output sinusoidal signal driven by the collector. It tabulates the input amplitude and phase and the output amplitude and phase. From this information the BJT is characterized.

5.4.8 C-V measurement

FET gate capacitance

The figure 5.12 shows a typical C-V curve obtained when measuring the capacitance of a FET gate. Initially all you see is the capacitance of the gate to the body and from the gate to the source and drain regions. As the gate voltage is increased past the onset of inversion, a thin layer of charge connected to the source starts to form under the gate. Since the charge is immediately under the gate oxide instead of in the bulk of the device, the capacitance starts to rise. Initially the region closer to the drain is not yet inverted, but as the gate voltage increases the inverted layer stretches from the source to the drain, and the inversion is also more complete, so the charge sheet is right under the gate oxide and thus the capacitance reaches its maximum.

After this maximum is reached, any increase in gate voltage has to result in an increase in charge, but the gate only has so much charge density per unit volume, so the charge on the gate starts to extend upward to higher layers of the gate. This separates the plates of the capacitor and reduces the capacitance, which is why you see the capacitance drop as the gate voltage tends toward the supply voltage.

Figure 5.12: Inversion capacitance of the gate. (Vth is marked on the voltage axis.)

Diode junction capacitance

Diode junction capacitance is a little difficult to measure because of the direct current that flows as the bias of the junction is made more positive. In the figure 5.13 you see that as the diode is progressively more forward biased, the width of the depletion region drops and the measured capacitance increases.

Figure 5.13: Diode C-V measurement. (Axes: C versus V.)

The way that the measurement is done is that a DC bias is applied across the diode and a sinusoidal voltage, usually at about 160 kHz, is superimposed over the DC bias. The current flow due to the sinusoidal signal is separated from the direct current by the use of an isolating capacitor. The circuit whose capacitance you are trying to measure is on the right of figure 5.13. As the diode turns on, its series resistance drops a lot faster than the capacitance rises, so the RC time constant drops very quickly. In addition, the direct current has noise in it, so as the direct current rises noise becomes a problem.

The only way to reduce the direct current is to reduce the area of the diode, and this would also reduce the capacitance you are trying to measure. So the bottom line is that measuring the capacitance of a diode is difficult, and so the C-V measurement of a diode usually only extends to just before the diode turns on.

5.4.9 Thermal behavior

The most stringent temperature requirements for chips are military requirements, and those are usually that the chip operate over a range of −50 °C to +125 °C. Normally the low temperature measurements are done at room temperature. The only other measurements are taken at about +100 °C.


5.5 Production monitors

A production line is kind of like those old cars with carburettors: you have to keep tuning it periodically. In addition you cannot duplicate production volume anywhere else, so there are some tests which can only be done in production and nowhere else.

The wafers that go through the production line usually have several microns of space between the chips because this is where the chips are split apart. But until the wafer is broken up, this is perfectly good space to put test circuits, and so production monitors are usually placed here because it does not cost anything. These are called scribe line monitors.

Basically any test which can be automated and can be performed in a very short time can be a production monitor. A few tests are almost invariably chosen. The gate oxide thickness is very important to monitor because it determines the reliability of the gate oxide and also determines the loading of the gates. The drain current at maximum gate and drain voltage is another measure that gives you a good indication of whether the chip is running toward the fast or slow side. The threshold voltages of the FETs determine the off state currents and the standby current of the chip.

5.6 Scanning Electron Microscopy

If you impinge highly energetic electrons on a semiconductor surface and vary the angle, they will penetrate to several microns and get diffracted out, and by measuring the spatial and angular distribution of the diffracted electrons you can tell what the semiconductor is made of, so you can get impurity concentration profiles, oxide profiles etc.

The reason SEM is important is that it is not just another electrical test but a physical test, so it is used as an independent confirmation of a profile that you obtained from a simulation or that you estimated based on electrical measurements and so forth.

5.7 Striped wafers

The mask pattern is exposed onto the wafer one chip at a time by a stepper [41]. Suppose the chip you are manufacturing is 1.67" on the side including the scribe lines. On a 12" wafer you can fit 6 columns for a total of 32 chips as shown in the figure 5.14.

Figure 5.14: Chips on a wafer. (The chips are numbered 1 to 32.)

In the case of the wafer of figure 5.14, the mask reticle is stepped sequentially across the wafer 32 times so that all 32 of the chips are defined. Striping is a way to take advantage of this fact to obtain more information from a test wafer during process development.

During the step when you perform the channel implant, if you only perform the resist exposure on one column, say chips 1, 6, 12, 18, 24 and 29, and do an implant, and then repeat the process using a different column and a different implant, then you can have 6 columns with 6 different Vth voltages, but with all other processing the same.

The mask misalignment between one chip and another, the exposure time etc. are variable, but factors which affect the current, such as source/drain implants and the sub-diffusion, will be common across the wafer. So, for the most part the difference you see between chips in one column and the next will be due to the different channel implant that you used.

Vth is only one example; there are many process parameters that can be varied between columns. So although it is more time consuming and hence expensive, striping is a way to obtain valuable information during process development, and more importantly it is a way to make a decision as to the process recipe you will use for the first pass of your design in a timely manner, without many iterations. The time saved is the biggest issue.

5.8 Noise

There are three kinds of noise normally characterized in semiconductor devices, namely thermal (Johnson) noise, shot noise and 1/f noise. Thermal noise occurs in resistive material, so the bulk regions of a device will generate thermal noise. The Nyquist equation for the noise spectrum is given in the equation 5.4, where R is the resistance, T is the temperature, k is the Boltzmann constant and ∆ν is the frequency bandwidth over which the noise is measured.

\[ \langle V^{2} \rangle = 4kTR\,\Delta\nu \tag{5.4} \]

Shot noise occurs in a p-n junction. The equation used for the shot noise in diodes and bipolar junction transistors is given by the equation 5.5 from [42], where q is the electron charge, ID is the average value of the current flow and ∆f is the frequency bandwidth over which the measurement is made.

\[ i^{2} = 2 q I_D\,\Delta f \tag{5.5} \]
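To give a sense of scale for equations 5.4 and 5.5, the Python sketch below evaluates both for a hypothetical 1 kΩ resistor at 300 K and a hypothetical junction carrying 1 mA, each over a 1 Hz bandwidth. The component values and function names are assumptions made for illustration.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23
ELECTRON_CHARGE_C = 1.602176634e-19

def thermal_noise_vrms(resistance_ohm, temp_k, bandwidth_hz):
    """RMS thermal noise voltage from equation 5.4: <V^2> = 4 k T R (bandwidth)."""
    return math.sqrt(4.0 * BOLTZMANN_J_PER_K * temp_k * resistance_ohm * bandwidth_hz)

def shot_noise_irms(current_a, bandwidth_hz):
    """RMS shot noise current from equation 5.5: i^2 = 2 q I_D (bandwidth)."""
    return math.sqrt(2.0 * ELECTRON_CHARGE_C * current_a * bandwidth_hz)

# Hypothetical example: 1 kOhm resistor at 300 K and a 1 mA junction, 1 Hz bandwidth
v_n = thermal_noise_vrms(1e3, 300.0, 1.0)
i_n = shot_noise_irms(1e-3, 1.0)
print(f"thermal noise: {v_n * 1e9:.2f} nV")
print(f"shot noise:    {i_n * 1e12:.1f} pA")
```

The results are about 4 nV and about 18 pA. Both grow as the square root of the measurement bandwidth.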

Figure 5.15: The 1/f noise measurement setup of [43].

1/f noise occurs in FETs. Some papers discussing the measurement of noise in FETs are [44], [45], [46], [47] and [43]. The causes of 1/f noise are not fully understood but the noise power has been measured. The setup used by [43] to measure the 1/f noise is shown in the figure 5.15. At a given bias point the noise voltage from the FET is directly amplified and analyzed in a spectrum analyzer, where the noise power falls off inversely with frequency and so appears as a straight falling line on a log-log plot.

5.9 Process skew

In the figure 5.16 you see what happens to an FET when it is processed. The design length that you draw on the mask is the Ldrawn, but when you define the gate using lithography and etch it, the length you get will be shorter than the drawn length and is shown in the figure 5.16 as Lgate. After the gate is defined, the source and drain are implanted and annealed, but due to the thermal cycles the wafer goes through after the implants, the implants spread in all directions.

Figure 5.16: Drawn and effective gate length. (Labels: Ldrawn, Lgate, Leff and the sub-diffusion.)

Toward the top they are constrained by the interface, but they are not constrained laterally and so they spread under the gate. The extent to which they spread under the gate is the sub-diffusion length, which brings the source and drain closer together, hence reducing the effective gate length. Variation in gate length is the most important cause of process skew.

The threshold voltage is lowered by the sub-diffusion and so an additional implant is used to adjust the threshold voltage. Remember that the source and drain are of the opposite type to the substrate that you implant them in, so when the source dopant migrates under the gate, it pulls the substrate closer to intrinsic, which means the threshold voltage reduces for either PFET or NFET.

In general, the more quantities that you add together, which compensate for each other, the more variation you will have in the end result. The threshold voltage is no exception, and it does vary. After the Leff variation, the threshold voltage variation is the next most critical.

Gate oxide thickness usually varies by at least 10%, probably 15%. For example a gate oxide with a nominal thickness of 20 Å would vary between 17 Å and 23 Å in steps of 2.8 Å. If the oxide is on the thin side the electric field across the oxide increases. If the field exceeds the dielectric strength the oxide can break down.

At 17 Å there is another problem other than dielectric strength, which is tunneling. According to Schroedinger’s wave equation, the wave function of the electrons in the gate and the channel extends across the gate oxide, which means there is a reasonable probability they can tunnel through the gate oxide.

If the oxide is on the thick side the inversion of the channel may not be complete. So the threshold voltage rises and the maximum current the FET can conduct at supply voltage reduces.


Due to the variation in implantation, the source and drain junctions vary in depth about the nominal. A deeper source and drain than nominal would increase the probability of punchthrough, where the source and drain depletions come into contact, causing high leakage current and making the FETs inoperable.

But even if this does not occur, the variation in source and drain junction depth would modify the field in the channel region and thus the current flow. If the source and drain junctions are too shallow, the FETs would have trouble turning on and the current would be low. In any case, since most sources and drains are made using multiple implants, the relative positions of the implants would vary, causing a variation in current.

The junction area depends on the junction depth, and when that varies so too do the junction capacitances. This varies the load seen by logic gates, since the drains typically need to vary between supply and ground as the logic is evaluated. So any extra capacitance linearly increases current consumption and increases the charge times almost linearly as well.

The region between the contact and the edge of the channel in a source or drain, as shown in the figure 5.17, is resistive, as all semiconductor is. This resistance can vary due to the variation in the sheet resistance of the source and drain regions, but it can also vary due to variation in the distance from the contact to the edge of the channel.

Figure 5.17: Source resistance.

The source resistance is defined on a per unit width basis. If the minimum gate length device with a width of 1 µm conducts 1 mA, then the source resistance needs to be much less than 100 Ω.

Where the metal connection is in contact with the Si of the source or drain it creates a Schottky contact, which is a diode and thus has rectification properties and a potential barrier, both of which are undesirable in a source or drain contact.

The ohmic contact is a way of connecting a metal to either p or n type Si without creating a diode. The silicided contact we discussed previously is one method of achieving this goal. No matter how you do it, this contact region has a significant resistance. You can reduce this resistance by increasing the number of contacts.

Assuming two contacts per µm of gate width, and using the same logic as in the case of the source resistance, each contact needs to be much less than 50 Ω.

The interconnect lines vary in width due to the lithography process. This line width variation causes a variation in the series resistance as well as the line to line capacitance. The resistance increase is the bigger issue, and so interconnect width variation is monitored.

If parameters that contribute to a total skew have some relationship to each other then they are not independent, and this could be either good news or bad news.

The best example is the relationship between the load due to gate capacitance and the current driven by the FET. If the Tox is lower than it should be, the load capacitance due to the FETs’ gates increases, but at the same time the FETs’ channels invert more strongly and therefore they drive more current, so these two effects partially compensate each other, which is good news.

Similarly if the sub-diffusion becomes larger the effective gate length decreases, but the source resistance increases due to the increase in distance from the contact to the edge of the channel. So here again there is some compensation, which is also good.

Process space is usually depicted as shown in the figure 5.18. The quantity on the axis is speed. The fact that the major axis of the ellipse lies along the diagonal can be interpreted to mean that there tends to be some correlation between P and N. For example if the gate oxide is running on the thin side, it would probably cause both the P and the N FETs to have thinner gate oxide than normal etc.

Figure 5.18: Process space. (Axes: P and N speed; corners labeled s-s, s-f, f-s and f-f.)

The normal distribution is also known as the Gaussian distribution and is given by the relationship

$$y = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x - x_m)^2 / (2\sigma^2)} \qquad (5.6)$$

Most natural phenomena tend to have a Gaussian distribution. The two parameters that define a Gaussian are the mean x_m and the standard deviation σ, whose square is the variance. The integral of the distribution from −3σ to +3σ is larger than 99%.
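The claim about the integral from −3σ to +3σ can be checked numerically. The following is a minimal sketch that uses only the Python standard library; the function names are chosen for this illustration and σ is taken to be the standard deviation.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Gaussian density with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fraction_within(k):
    """Fraction of a Gaussian population within +/- k sigma of the mean."""
    return math.erf(k / math.sqrt(2.0))

within_3_sigma = fraction_within(3.0)
print(f"fraction within +/-3 sigma: {within_3_sigma:.4f}")
```

The fraction does not depend on the mean or on σ itself, only on how many σ you move away from the mean.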

Metrics that are normally monitored for an FET are modeled as normal distributions. For example the threshold voltage variation is modeled with a mean and a variance. Tox cannot be Gaussian because the oxide grows in mono-layers 2.8 Å thick. The sub-diffusion is Gaussian.

3σ analysis is a very conservative method of analyzing the yield of a circuit. In general it is incorrect because here you model the slow 3σ corner with the +3σ Vth, the +3σ Tox, the +3σ gate length etc., though the probability of all of them occurring at the same time is remote. Given that the yield for most large circuits rarely exceeds 80%, this slow corner will definitely not make the cut.
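To get a feel for how remote the combined 3σ corner is, here is a rough sketch. It assumes the parameters are Gaussian and completely independent of each other, which is a simplification since some of them are correlated as discussed earlier. The choice of three parameters (Vth, Tox and gate length) is only an illustration.

```python
import math

def upper_tail(k):
    """Probability that one Gaussian parameter lies at or beyond +k sigma."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

def all_at_corner(k, n_params):
    """Probability that n independent parameters are all beyond +k sigma."""
    return upper_tail(k) ** n_params

p_one = upper_tail(3.0)           # one parameter at its slow 3 sigma value
p_three = all_at_corner(3.0, 3)   # Vth, Tox and gate length together

print(f"one parameter beyond +3 sigma:    {p_one:.2e}")
print(f"three parameters beyond +3 sigma: {p_three:.2e}")
```

Under these assumptions a single parameter is beyond +3σ roughly once in a thousand chips, while all three together are many orders of magnitude rarer.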

If your circuit has only PFETs and NFETs the corners that are analyzed may be slow-slow, slow-fast, fast-slow and fast-fast to cover all the possible problems that could occur. The slow-slow corner will simply expose the regions of the circuit that do not make the cut for speed. The slow-fast and the fast-slow will expose the problems when the cross-over voltage is low and high respectively.

The fast-fast corner will expose problems such as race conditions, when the output in a portion of the circuit is incorrect because one of the signals used in the evaluation arrived too early. This corner is also where the leakage is high and the power consumption is high, and hence the heating is excessive, causing the chip to essentially burn up.

5.10 Burn in testing

Burn in is a kind of stress testing. It is a substitute for lifetime testing. It is a proven fact that the mean time to failure measured when operating the chip at an elevated temperature has a strong correlation to the mean time to failure when operating under normal conditions. So if the chip is tested at an ambient of 125 °C for two days it is equivalent to testing the chip continuously for perhaps a month.

The key is not to test the lifetime of a chip but rather to flush out possible defects. Since chip testers are expensive you would not try heating the DUT to high temperatures, but rather you would create a small regulated oven within which perhaps ten chips are placed on a PCB with lines for supply, ground, system clock and perhaps even a few signal lines to apply simple tests. After running the chips in this fashion for a few days the chips can be removed from the oven and tested normally on a chip tester.

5.11 Ion implant to create connections

Sometimes during the first pass of a design there may be errors in the interconnects, for example a via may be missing. So during the testing of the prototype the chip may not function correctly and this error may be discovered. But that does not mean that this is the only error in the interconnections, or that the circuit would function correctly if this via were in place; for instance the transistor sizing may also be incorrect.

So the prototype needs to be fixed, the testing needs to be continued and the other errors identified. Ion implantation is one way this can be achieved. This is especially easy if all that is missing is a via. Since highly doped silicon is very conductive, the area which should have had a via is implanted at exactly the correct energy, causing a large concentration of donors to be present between the metalization of the interconnects that were to be connected by the via. Although the resistance may be different from that of the via, the connection is established and testing can continue. If a short section of interconnect is missing this may be more difficult to fix because then the implant would have to be laid as a path, but sometimes even that can be done.

5.12 Thermal imaging

Occasionally it does happen that portions of the chip are malfunctioning and you cannot figure out why. If overheating is suspected as a cause you can run simulations to generate a map of heat generation vs. position in the chip, but it would be a little difficult because the extracted netlist does not have the position of the FETs and also circuit simulators do not solve the heat equation.

Just as liquid crystal displays change opacity with applied voltage, there are organic compounds which change color with heat. So sometimes it does happen that the top of a packaged chip is opened up and such a liquid or gel is introduced on top of the chip surface and the chip is tested for functionality while being observed under a microscope. The colors that you observe are real, not simulated, so you can obtain experimental evidence of the heat patterns.

Chapter 6

Chip fabrication

6.1 Wafer preparation

The very first step in building a microchip is having a clean wafer to build it on. Si is a crystal and crystals are grown. Sand is mostly Si, bound up as silicon dioxide, with some impurities thrown in, so the starting point is to melt sand, extract the Si and remove all the impurities from it.

Figure 6.1: Liquid Czochralski pull.

The Liquid Czochralski pull is a way to create Si ingots which can be sliced up into wafers much like salami. The wafers used nowadays are 12 inches in diameter but previous generations were 8 inches and 6 inches respectively. One ingot of 12 inch diameter may have a retail value of a 100k or so. The cleaned out and molten Si is called the Si slurry.

In the figure 6.1 the vat contains the slurry which is maintained in a molten state. A rod with a piece of pure silicon attached to its tip is lowered until the silicon seed is just in contact with the slurry and then it is lifted and rotated at a special rate.

The seed causes the slurry that adheres to it to align to the crystal symmetry, and as it is lifted the new material added to the seed cools and solidifies into a crystal layer. The width of the seed keeps increasing until it reaches a maximum width which is determined by the rotation and lift rate. You could guess that rotating the seed rapidly would reduce the diameter of the pulled ingot due to the shear forces at the edges.

After the cutting process the wafers are not of even thickness and they have a rough uneven surface unsuitable for chip growth. So they are polished in a diamond paste slurry. They are placed face down on a round table about 6 feet across which has a wet slurry of fine diamond chips and silicon.

The table is made to rotate back and forth and as it does so the wafers slide on the slurry and the surface of the wafers is ground to a fine sheen. Both sides are ground, although only one side is the actual surface to be used, so that side may be polished more finely. Then they are removed, bathed and dried. To remove damage caused during the polishing, the surface is etched [48].

The FETs on the wafer are created by implanting impurities into the surface of the wafer, so the surface has to be absolutely flat and absolutely crystalline, and so an epitaxial layer is grown on the raw wafer [49], [50] before any further processing is done. One bonus is that you can dope this layer to have exactly the right amount of dopant that you want in the chip substrate.

There are many types of epitaxy such as liquid phase epitaxy (LPE), vapor phase epitaxy (VPE) or molecular beam epitaxy (MBE). MBE is the slowest, requires the most work to maintain the machine and gives the best results. It is too expensive to be used to grow the substrate, whereas VPE is more popular.

6.2 Lithography

Resist is a gel like fluid that is used in lithography [51]. Usually a blob of resist is placed at the center of the wafer and it is spun rapidly, causing the gel to spread outward and cover the entire wafer. The surface tension causes it to stop at the edge and even out. Then it is heated in an oven to harden it.

Then a mask is placed over it and aligned to the wafer and the wafer is exposed to deep ultraviolet light. The light is absorbed by the exposed surface and causes a chemical reaction to occur. There are two types of resists, negative and positive. A negative resist hardens in the presence of the UV light and becomes impervious to specific etchants. A positive resist becomes susceptible to certain etchants where it has been exposed.

Figure 6.2: The lithography process. (Labels: coherent light source, mask, resist, wafer.)

The process of creating etch masks using resist layers and UV light is called lithography, as shown in the figure 6.2. It used to be that the minimum feature that could be defined by the UV light used in the exposure was half a wavelength, however with the advent of phase shift masks that has been reduced to a quarter of a wavelength.

The energy that the photoresist needs to absorb rises as the inverse of the wavelength of UV light, and it has been getting progressively more difficult to develop resist that can absorb the photons and change chemical properties without breaking down.
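The inverse relationship between photon energy and wavelength is E = hc/λ. Here is a small sketch that puts numbers on it, along with the half wavelength and quarter wavelength feature limits mentioned above. The three wavelengths are illustrative choices and are not taken from the text.

```python
PLANCK_J_S = 6.62607015e-34        # Planck constant, J*s
LIGHT_SPEED_M_S = 2.99792458e8     # speed of light, m/s
JOULES_PER_EV = 1.602176634e-19    # one electron-volt in joules

def photon_energy_ev(wavelength_nm):
    """Energy of one photon in eV; it rises as 1/wavelength."""
    joules = PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)
    return joules / JOULES_PER_EV

def min_feature_nm(wavelength_nm, phase_shift=False):
    """Half a wavelength normally, a quarter with a phase shift mask."""
    return wavelength_nm / (4.0 if phase_shift else 2.0)

for wavelength in (365.0, 248.0, 193.0):   # illustrative UV wavelengths in nm
    print(f"{wavelength:5.0f} nm: {photon_energy_ev(wavelength):.2f} eV per photon, "
          f"feature {min_feature_nm(wavelength):.1f} nm, "
          f"with phase shift {min_feature_nm(wavelength, True):.2f} nm")
```

Halving the wavelength halves the feature size but doubles the energy that each absorbed photon dumps into the resist.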

The stepper [41] is used during lithography to expose each chip using the mask, so it has to step across the wafer as shown in the figure 6.3. Each time it moves a step it aligns the cross hairs on the mask to the corresponding marks on the wafer, so during the lithography process there is room for misalignment and there is variation between different chips even on the same wafer.

Figure 6.3: Stepping across the wafer.

6.3 Mask generation

For a typical process you may make no more than 40 distinct masks but you may have as many as 150 processing steps. So many masks are used more than once and some of them are mixed and matched to create the final structures.

A single transistor uses many design layers in its physical structure. The source and drain implants would be a design layer. The contacts for the source and drain are another design layer. The gate is another design layer. Notice that sometimes design layers are responsible for more than one material, for example the gate design layer includes both the gate oxide and the gate poly. So a design layer is a layer as you would see it when drawing the physical layout of the chip.

The mask layers are extracted from the design layers. The extraction process achieves two goals. Firstly it converts the design layers into the actual masking needed by the processing steps. Secondly it seeks to reduce the number of masks needed by using masks for more than a single step.

The only function of a mask is to filter light and light travels in straight lines. So it is possible to combine two masks when exposing resist. For example let us suppose that you have a mask A that outlines all the source and drain regions. In addition you have masks B and C that outline the areas containing the PFETs and the areas containing the NFETs respectively.

So if you combine the masks A and B you only get the PFET sources and drains, whereas if you combine the masks A and C you only get the NFET sources and drains. There are many instances when masks are combined as a way to reduce the total number of masks.
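Since stacked masks pass light only where every mask is open, combining them behaves like a set intersection. A toy sketch follows, using a hypothetical 8 by 4 grid layout invented for this illustration, with each mask stored as the set of grid cells where it is open.

```python
def rect(x0, y0, x1, y1):
    """Set of grid cells inside the open rectangle [x0, x1) x [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}

# Hypothetical layout: PFETs on the left half, NFETs on the right half.
mask_a = rect(1, 1, 3, 3) | rect(5, 1, 7, 3)   # all source and drain regions
mask_b = rect(0, 0, 4, 4)                      # area containing the PFETs
mask_c = rect(4, 0, 8, 4)                      # area containing the NFETs

# Light reaches the resist only where both stacked masks are open.
pfet_source_drain = mask_a & mask_b
nfet_source_drain = mask_a & mask_c

print(sorted(pfet_source_drain))
print(sorted(nfet_source_drain))
```

Three masks here do the work of four, since separate PFET and NFET source and drain masks never have to be made.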

The best example of a very fine mask is the mask that defines the gate oxide of the FETs, because the feature size is a quarter of the wavelength used in the exposure. Such a mask is obviously going to be very expensive because it has to be perfect and the tolerance has to be very small.

But for some masks you don’t need such a small tolerance and also the features in the mask itself are very large. For example a mask to define a p well surrounding many NFETs will have features several microns in size and so it is a lot less critical. Some masks have features even larger than that. These masks have less of a tolerance problem and can be made more cheaply and they are called coarse masks.

6.4 Oxide growth

Oxide can be either grown or deposited. You can only grow oxide on a Si surface, and Si is consumed in the process. But you can deposit oxide even on the areas that don’t have exposed silicon. If you don’t need pure oxide but just a thick layer of silica you can spin on a silica gel and bake it into a silica layer.

Dry oxidation [52] is done by heating the wafer in the presence of oxygen and nitrogen. The nitrogen does not participate in the reaction. Typically dry oxide is grown at about 850 °C or higher. The reaction is given by the equation 6.1.

Si + O2 → SiO2 (6.1)

Dry oxide is the purest oxide you can grow and is used to make the gate oxide of field-effect transistors. Its purity gives it the maximum dielectric strength. The growth rate is the lowest because once the uppermost layers of silicon are oxidized, the oxygen has to diffuse through the oxide to get to the layers of silicon to react with. As the growth progresses the thickness of the oxide increases, so the growth rate slows down.

Wet oxide [53], [54] grows much faster than dry oxide but it is not as pure. The method used is to heat the wafer in the presence of steam. The reaction is given by the equation 6.2.

Si + 2H2O → SiO2 + 2H2 (6.2)

Oxide can be deposited using Chemical Vapor Deposition [55]. You could use silicon tetrachloride with hydrogen and carbon dioxide as in the equation 6.3. Or you can use silane [56] as in the equation 6.4.

SiCl4 + 2CO2 + 2H2 → SiO2 + 2CO + 4HCl (6.3)

SiH4 + 2O2 → SiO2 + 2H2O (6.4)

6.5 Doping

Si has 4 valence electrons and has covalent bonds with four neighboring atoms. The process of doping Si is to substitute a Si atom with another atom that has either 3 valence electrons or 5 valence electrons.

Boron has 3 valence electrons. When it is substituted for a silicon atom it cannot make bonds with all four of its neighbors. The important result of this is that if an electron was looking for a place to jump to, then this boron atom could accept it. Of course it won’t be charge neutral any more, however the crystal as a whole would be charge neutral.

This available state to which an electron can jump is called a hole. In an energy diagram holes float, meaning that if you uniformly dope a Si crystal with Boron the electrons in the valence band will settle to the lowest levels of the Fermi-Dirac distribution, which means the highest energy levels in the valence band are the vacant ones. So Si doped with Boron is p type.

Arsenic and Phosphorus both have 5 valence electrons. So if either is substituted for a Si atom one electron is left over. Four of its electrons plus the four electrons in the covalent bonds with its four Si neighbors create an octet, completing the outermost shell. The fifth electron will have a higher energy and so it is susceptible to moving around. For this reason Si doped with Arsenic or Phosphorus is n type.

6.6 Implantation

The figure 6.4 shows the structure of the implantation process. On the left is a chamber to heat and ionize the dopant; this could involve an RF voltage to cause the ionization. Then you have two plates to accelerate and collimate the dopant ion beam. There are other structures that are used to weed out different velocities because you need a beam of ions all of which have the same velocity and direction.

Figure 6.4: The implantation process.

The figure 6.5 shows what happens to the implant as you increase the energy. The peak gets deeper, but the distribution also spreads a little in depth.

Figure 6.5: Implant energy. (Profiles shown for low, medium and high energy.)

The energy determines the depth of the peak and the depth distribution of dopants but it does not determine the actual amount of dopants that is implanted. The dose determines that. The dopant actually implanted is linearly dependent on the dose.
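One common way to picture this is to approximate the implanted profile as a Gaussian in depth, with the energy setting the peak depth and the spread, and the dose setting the area under the curve. The sketch below makes that assumption; the depth, spread and dose numbers are hypothetical and chosen only to show the trend.

```python
import math

def implant_concentration(depth_nm, dose_per_cm2, peak_depth_nm, spread_nm):
    """Dopant concentration (per cm^3) at a given depth, Gaussian profile assumed."""
    spread_cm = spread_nm * 1e-7                      # 1 nm = 1e-7 cm
    peak = dose_per_cm2 / (math.sqrt(2.0 * math.pi) * spread_cm)
    z = (depth_nm - peak_depth_nm) / spread_nm
    return peak * math.exp(-0.5 * z * z)

# Hypothetical profiles for a low and a high implant energy.
low_energy = dict(peak_depth_nm=50.0, spread_nm=20.0)
high_energy = dict(peak_depth_nm=150.0, spread_nm=50.0)
dose = 1e15                                           # ions per cm^2, illustrative

peak_low = implant_concentration(50.0, dose, **low_energy)
peak_high = implant_concentration(150.0, dose, **high_energy)
peak_low_double_dose = implant_concentration(50.0, 2.0 * dose, **low_energy)

print(f"low energy peak:  {peak_low:.2e} per cm^3")
print(f"high energy peak: {peak_high:.2e} per cm^3")
print(f"double dose peak: {peak_low_double_dose:.2e} per cm^3")
```

Doubling the dose doubles the concentration at every depth without moving the peak, while raising the energy moves the peak deeper and, for the same dose, lowers it because the dopant is spread over a wider range of depths.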

Implants are usually done at an angle of 17° off the vertical [57]. This is to avoid an effect called channeling. When the ions enter the crystal they see tunnels. If the ions are angled so that they hit the walls, the implanted ions stop in the interstices of the crystal and the penetration depth is predictable. If the ions are aligned to the tunnels they can penetrate to a very large distance. So by experimental work the angle of 17° was found to give the best results.

Annealing [58] is a process step that follows an implantation step to repair the damage caused by implantation. The way this is done is to heat the region that was implanted to a temperature high enough that the crystalline bonds become looser and the atoms can realign themselves due to thermal agitation, and in the process the semiconductor recrystallizes.

This step is required because otherwise there will be a lot of dangling bonds or other recombination traps and so the leakage current rises. There are many ways to perform annealing. The simplest is to heat the entire wafer. There are more localized annealing methods which use a laser to supply the energy and a mask to select which areas to heat. As a general rule engineers have found that using a higher temperature for a shorter duration causes less diffusion of dopants than a low temperature for a longer duration.

6.7 Etching

Acids based on Fluorine and Chlorine will react with Si and SiO2. So if you dip a Si wafer in an acid bath the exposed portions of the wafer will be eaten away. An isotropic etch is an etch which does not favor one direction over another, i.e. it etches in all directions at the same rate. An anisotropic etch is more directional, so if the anisotropy is 2 then it etches two units into the wafer for every 1 unit parallel to the surface.
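The definition of anisotropy translates directly into how far an etch creeps sideways under the edge of the resist. A minimal sketch follows; the function names and the dimensions are made up for the illustration, and any consistent length unit will do.

```python
def lateral_undercut(depth, anisotropy):
    """Sideways etch distance for a given vertical etch depth."""
    return depth / anisotropy

def opening_width_at_top(mask_opening, depth, anisotropy):
    """Width of the etched opening just under the resist, undercut on both sides."""
    return mask_opening + 2.0 * lateral_undercut(depth, anisotropy)

# Etching 0.5 units deep through a 1.0 unit wide opening in the resist.
for anisotropy in (1.0, 2.0, 10.0):
    width = opening_width_at_top(1.0, 0.5, anisotropy)
    print(f"anisotropy {anisotropy:4.1f}: opening widens to {width:.2f}")
```

With an anisotropy of 1 (isotropic) the opening doubles in width for this depth, which is why fine features need a more directional etch.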

6.7.1 Wet etch

A wet etch is an acid bath [59]. It is isotropic. In the figure 6.6 you see what happens when you dip a wafer in a wet etch. The top figure is before the etch starts, the second figure is the outcome that you want. The third figure is what happens when you stop the etch when you first etch down to the Si surface. There is considerable SiO2 still remaining that needs to be removed. On the other hand, if you continue the etch until you remove the SiO2 you want to remove, it appears as in the fourth figure, where the etch has undercut under the resist and has also created a furrow in the Si.

Figure 6.6: An isotropic etch. (Four panels, each showing resist over SiO2 over Si.)

The etchants used for wet etching of Si [60] are hydrofluoric and nitric acid as in the equation 6.5. The wet etch of SiO2 by hydrofluoric acid [61] is given by the equation 6.6. Aluminum can be etched with bases or acids [62] as in the equations 6.7 and 6.8.

18HF + 4HNO3 + 3Si → 3H2SiF6 + 4NO + 8H2O (6.5)

SiO2 + 6HF → H2SiF6 + 2H2O (6.6)

2Al + 6NaOH → 2Na3AlO3 + 3H2 (6.7)

2Al + 6HCl → 2AlCl3 + 3H2 (6.8)

6.7.2 Reactive ion etch

A reactive ion etch system is outlined in the figure 6.7. RIE is not the most anisotropic of etches but it is sufficiently anisotropic and also sufficiently cheap to be the most popular choice. Ionized and partially ionized CF4 gas in the RIE chamber acts as the etchant, and the electric field comes from a high AC voltage applied between a plate at the back of the wafers and a plate placed above them [63].

The reason for using a high frequency AC voltage is that the ions describe circles in the field and collide with and ionize other neutral molecules. The larger the voltage or the lower the frequency, the larger the radius described by the ions.

Figure 6.7: Reactive Ion Etch equipment. (Labels: resist, SiO2, Si.)

RIE equipment operates at 13.56 MHz and, believe it or not, this frequency is regulated by the FCC because communication equipment also operates at nearby frequencies, and so the RIE equipment must stay within its allocated band.

6.7.3 Reactive ion beam etch

Figure 6.8: Reactive Ion Beam Etch equipment. (Labels: resist, SiO2, Si.)

Even RIE etching is only slightly anisotropic. To get better anisotropy reactive ion beam etching is used [64]. Here the ions are accelerated in a field perpendicular to the wafer surface and essentially filtered so that the ions are moving in a direction perpendicular to the wafer when they come into contact with it. Even the collision force contributes to the anisotropy.

6.8 Sputtering

Sputtering is the process used to deposit the metal used in creating the interconnects. The earliest gates of the FETs used to be deposited this way. Sputtering [65] is done as shown in the figure 6.9.

Figure 6.9: Sputtering a metal. (Labels: target, plasma, wafer chuck; polarity marks − and +.)

The ions in the plasma are accelerated in a high electric field to impinge on the target, which is made of the metal to be sputtered. The collision knocks off metal atoms from the target which are then deposited on the wafers on the wafer chuck.

6.9 Polysilicon

One does not normally deposit silicon; after all, Si is a crystal and it has to be grown. Polycrystalline silicon, or polysilicon, or just poly for short, is deposited Si and it is not really crystalline [66]. It contains domains within which the Si is crystalline, but the orientations of the crystal in the different domains are not aligned with each other.

But polysilicon has an advantage when used as gate material because it does not need to be deposited at high temperatures like aluminum, so it does not damage the gate oxide. The deposition is done as in the equation 6.9.

SiH4 → Si + 2H2 (6.9)

6.10 Sintering

Sintering is a way of creating a layer of material which is neither semiconductor nor metal but is a kind of amalgam [67], [68], [69]. Such a material has advantages in creating a junction between metal and semiconductor because it reduces the Schottky barrier. Sintering is done by sputtering metal onto the silicon surface then heating it to perhaps 500 °C or so to cause the metal to fuse with the silicon.

The source and drain junctions are a big part of the series resistance of a FET when it is fully turned on. One way to reduce the source drain resistance is to sinter them with metal and this process is called siliciding [70]. The siliciding does not include the whole source and drain region because you still want the source to body junction and the drain to body junction to be semiconductor junctions.

The gate polysilicon has a resistance problem too, and if the mask used during the siliciding includes the source and drain regions as well as the gate poly, it is called saliciding [71]. A salicided gate is more effective in inverting the channel evenly, especially if you have a wide gate at minimum gate length operating at very high clock speeds.

6.11 Thermal budget constraints

There are many implant steps used in a typical chip process. These steps are evenly interspersed between other steps out of a total of 150 steps or so. Some of these other steps could involve heating the wafer to temperatures of 750 °C or more, for example oxide growth. These steps will cause the implanted regions to diffuse [40]. Diffused sources and drains create a problem because they could lead to punch through. In addition sources and drains are often created using multiple carefully chosen implants. Diffusion could cause them to smear, modifying the device characteristics.

Chapter 7

Logic circuits

7.1 Boolean logic

Digital circuits use boolean logic. In boolean logic a signal only has one of two states. It can be either high or low. The two states are numerically defined as 1 and 0. There are only three basic operations used in boolean logic from which all other operations can be derived. These three operations are and, or and invert.

Figure 7.1: The logic symbols. (In order: invert, and, or.)

Figure 7.2: A CMOS inverter. (Node labels: 0, 1, 2 and 3.)

The invert operation is the simplest operation and it is a unary operator, meaning that it operates on a single operand. For example $\overline{a}$ is the invert operation applied to a. The line over the operand indicates the operation. If a was 1, then $\overline{a}$ is 0 and vice versa. If you have an expression under the line, then evaluate the expression and then invert the result. The symbol for the inverter is the first symbol in the figure 7.1. The FET circuit for a CMOS inverter is shown in the figure 7.2.

We have already discussed the behavior of an FET, and the electrical behavior of the inverter is shown in the figure 7.3. The straight line is the input and in the figure the supply voltage is 2V. As the input voltage rises the output voltage falls, but not linearly.

Figure 7.3: The voltage relationship of a CMOS inverter. (Plot of the voltage of nodes 1 & 2 between 0V and 2V, with the Vth of the NFET and the Vth of the PFET marked.)

The current flowing from the supply to ground is shown in the figure 7.4. As you can see it rises from zero to a maximum and then falls back to zero.

Figure 7.4: The iv relationship of a CMOS inverter. (Current I plotted against input voltage starting at 0V, with Vthn and Vthp marked.)

The reason it does this is shown in the figure 7.5. This shows the regions of operation through which the circuit goes as the input voltage rises from 0 to the supply voltage. Initially when the input voltage is 0V, only the PFET is turned on. No current is flowing because the NFET is turned off. Since the PFET is on, the output voltage is clamped to supply and is 2V. As the input voltage rises, the NFET moves into the sub-threshold region and a small amount of current flows. Now the output voltage is that of a voltage divider. Since the resistance offered by the NFET is much larger than that of the PFET, the output voltage is still close to the supply voltage.

As the input voltage rises still further, it increases beyond the threshold voltage of the NFET and so it turns on. So now both the PFET and the NFET are on, so the current reaches a maximum. As the input voltage increases still further the difference between the input voltage and the supply voltage reduces to below the threshold voltage of the PFET. So now the PFET starts to turn off and as the input voltage reaches the supply voltage the PFET is fully turned off.
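The three regions can be written down as a simple classification of the input voltage. This is only a sketch: the 2V supply comes from the figure, but the 0.5V thresholds are assumed values for illustration, the PFET threshold is treated as a magnitude, and sub-threshold conduction is ignored.

```python
def inverter_region(v_in, v_dd=2.0, v_thn=0.5, v_thp=0.5):
    """Name the region of operation of a CMOS inverter for an input voltage."""
    if v_in < v_thn:
        return "PFET is on"            # NFET off, output held at supply
    if v_in > v_dd - v_thp:
        return "NFET is on"            # PFET off, output held at ground
    return "PFET & NFET are on"        # both conduct, supply current flows

for v_in in (0.0, 0.4, 1.0, 1.6, 2.0):
    print(f"Vin = {v_in:.1f} V: {inverter_region(v_in)}")
```

Current flows from supply to ground only in the middle region, which is why the inverter consumes power mainly while its input is switching.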

Figure 7.5: The regions of operation of a CMOS inverter. (Regions along the input voltage axis starting at 0V: PFET is on; PFET & NFET are on; NFET is on. The Vth of the NFET and the Vth of the PFET mark the boundaries.)

The and operation is a binary operation, meaning it operates on two operands. The and operation is denoted by the · sign, and so the expression (a · b) is the and operation on a and b. If both a and b are 1, then the result is 1, otherwise the result is 0. If either of the operands is an expression, then evaluate the expression and substitute the result for that operand. The symbol for the and operation is the second symbol in the figure 7.1.

Figure 7.6: More logic symbols. (In order: nand, nor, xor.)

The or operation is also a binary operation. It is denoted by the + sign, and so the expression (a + b) is the or operation on a and b. If both a and b are 0, then the result is 0, otherwise the result is 1. The symbol for the or operation is the third symbol in the figure 7.1.

Figure 7.7: The CMOS nand gate. (Inputs a and b, output o.)

The nand, nor and xor operations shown in the figure 7.6 are derived operations. The nand operation is the and operation followed by an invert operation. In boolean logic we would say that the nand operation on a and b is $\overline{a \cdot b}$, so first do the and operation and obtain the result, then invert that result. However, when using FETs to create logic circuits it takes fewer transistors to make a nand gate than it does to make an and gate, so the more realistic way of looking at it is to think of an and gate as a nand gate followed by an inverter.

The FET circuit diagram for a nand gate is shown in the figure 7.7. The lower half of the circuit is the series combination of NFETs and the upper part of the circuit is the two PFETs in parallel. If both the inputs are 1, then both the NFETs are turned on whereas both the PFETs are turned off. So in this case the output is 0. For any other combination of inputs the output is 1.
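The series and parallel arrangement can be mimicked with ideal switches. This is a switch level sketch only, not an electrical simulation, and the function name is made up for the illustration: an NFET is treated as closed when its gate is 1 and a PFET as closed when its gate is 0.

```python
def cmos_nand(a, b):
    """Switch level model of the two input CMOS nand gate."""
    pull_down = (a == 1) and (b == 1)   # two NFETs in series to ground
    pull_up = (a == 0) or (b == 0)      # two PFETs in parallel to supply
    # Exactly one network conducts for any input, so the output is never
    # floating and never shorted between supply and ground.
    if pull_up == pull_down:
        raise ValueError("pull-up and pull-down networks conflict")
    return 1 if pull_up else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, cmos_nand(a, b))
```

The pull-up and pull-down networks are complements of each other, which is the general recipe for building a static CMOS gate.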

Figure 7.8: Three input CMOS nand gate. (Inputs a, b and c, output o; internal node voltages V1, V2 and V3 between Supply and 0.)

The case when you have three inputs is shown in the figure 7.8. For most processes three is probably a reasonable limit to the number of inputs. In the figure the voltages V1, V2 and V3 lie between supply voltage and ground. Because there are four FETs in series between the supply and ground, they have to share the supply voltage. In the case of the nand gate the PFETs are not affected, however the three NFETs are affected by having to share the supply voltage in this manner. Of the three NFETs the lowermost one is the least affected. All that happens to it is that its drain voltage is not as high as it could be.

Figure 7.9: The CMOS nor gate. (Inputs a and b, output o; the current flow is marked.)

For the NFET in the middle, the body of the FET is still at ground, but its source is at V1 which is higher than ground. Effectively it sees a back bias. This back bias will prevent it from turning on as easily as the lowermost NFET. The topmost NFET sees the largest back bias. For this reason three input gates don’t use the same sizing for all three NFETs. Instead the topmost NFET is the widest and the lowermost NFET is about normal size. This in turn means that the signal that has to charge the gate of the topmost NFET sees a larger gate area. The net result is that if you use a three input gate, you need to drive the topmost NFET with the strongest signal of the three inputs.

The nor operation is the or operation followed by an invert operation. In boolean logic we would say that the nor operation on a and b is $\overline{a + b}$, so first do the or operation and obtain the result, then invert that result.

The FET circuit diagram for a nor gate is shown in the figure 7.9. In this case you see that only if both the inputs are 0, the output is 1. For any other combination of inputs the output is 0.

The exclusive or operation, or the xor operation for short, is different from the or operation in only one instance. If both a and b are 1, the or operation yields a 1, however the xor operation yields a 0. The symbol for the xor is shown in the figure 7.6. The xor operation on a and b is given by $(a \cdot \overline{b} + \overline{a} \cdot b)$. So it seems that the xor operation requires two inverters, two and gates and one or gate, which makes it a fairly large circuit for a single gate.

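$$A + \overline{A} = 1 \qquad (7.1)$$

$$A + 1 = 1 \qquad (7.2)$$

$$A \cdot 0 = 0 \qquad (7.3)$$

Commutative property:

$$A \cdot B = B \cdot A \qquad (7.4)$$

Distributive property:

$$A \cdot B + A \cdot C = A \cdot (B + C) \qquad (7.5)$$

De Morgan’s theorem:

$$\overline{A + B} = \overline{A} \cdot \overline{B} \qquad (7.6)$$

$$\overline{A \cdot B} = \overline{A} + \overline{B} \qquad (7.7)$$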
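Because each signal can only be 0 or 1, identities like these can be verified by simply trying every combination of inputs. Here is a small sketch of such a check, with invert written as 1 − a, and written with · and + replaced by Python's & and | operators. The function names are made up for the illustration.

```python
from itertools import product

def inv(a):
    """Invert a single bit."""
    return 1 - a

def identities_hold():
    """Check equations 7.1 to 7.7 for every combination of A, B and C."""
    for a, b, c in product((0, 1), repeat=3):
        checks = [
            (a | inv(a)) == 1,                        # 7.1
            (a | 1) == 1,                             # 7.2
            (a & 0) == 0,                             # 7.3
            (a & b) == (b & a),                       # 7.4
            ((a & b) | (a & c)) == (a & (b | c)),     # 7.5
            inv(a | b) == (inv(a) & inv(b)),          # 7.6
            inv(a & b) == (inv(a) | inv(b)),          # 7.7
        ]
        if not all(checks):
            return False
    return True

print(identities_hold())
```

With three signals there are only eight combinations to try, so exhaustive checking is a perfectly good proof here.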

A truth table is a way of defining the behavior of an arbitrary logic circuit. It is just a table that has a column for each signal. Some of the signals may be inputs and some of them may be outputs. So if you have inputs a, b and c and you have outputs d and e, then you would have five columns in the table. For each combination of a, b and c you would put down the value of the outputs d and e. Keep in mind that the table need not be complete, i.e. you don’t have to have a row for each combination of a, b and c, just enough rows to define the states you need. A consolidated truth table showing the output of all the binary operators is shown in the table 7.1.

Input1  Input2  and  or  nand  nor  xor
  0       0      0    0    1    1    0
  0       1      0    1    1    0    1
  1       1      1    1    0    0    0
  1       0      0    1    1    0    1

Table 7.1: Truth table for the binary gates.
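The table can be regenerated from the three basic operations alone, which is a handy way to convince yourself that nand, nor and xor really are derived operations. This is a minimal sketch; the function names are chosen for the illustration and the rows are produced in the same order as in the table.

```python
def invert(a):
    return 1 - a

def and_(a, b):
    return a & b

def or_(a, b):
    return a | b

# Derived operations, built only from invert, and, or.
def nand(a, b):
    return invert(and_(a, b))

def nor(a, b):
    return invert(or_(a, b))

def xor(a, b):
    return or_(and_(a, invert(b)), and_(invert(a), b))

def truth_table():
    """Rows of (Input1, Input2, and, or, nand, nor, xor)."""
    rows = []
    for a, b in [(0, 0), (0, 1), (1, 1), (1, 0)]:
        rows.append((a, b, and_(a, b), or_(a, b), nand(a, b), nor(a, b), xor(a, b)))
    return rows

for row in truth_table():
    print(*row)
```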

If you add a clock to combinational logic you get sequential logic. In sequential logic the output can be a function of the output during the previous clock cycle. You can store the values in flip-flops and use them as needed.

69

7.2 Flip-flops

The simplest flip-flop (and the most popular) is the D flip-flop. The logic diagram for a D flip-flop is shown in the figure 7.10. You can buy a discrete logic chip containing 5 D flip-flops in a 20 pin package for probably 50 ¢.

Figure 7.10: Logic diagram for a D flip-flop. (Inputs D and Clk, gates #1 to #4, internal nodes a and b, outputs Q and Q’.)

Figure 7.11: A D flip-flop. (Timing diagram of the input D, the clock and the outputs Q and Q’; the clock pulse is annotated with "start reading D", "start writing Q’" and "Q and Q’ are ready".)

The way a D flip-flop functions is shown in figure 7.11. On the rising edge of the clock pulse the value at D is read in, and at the falling edge of the clock pulse it is written to Q and its inverse is written to Q’. When the clock is low, a = 1 and b = 1. The two nand gates #1 and #2 form a feedback loop and the value at Q is held constant.

When a clock pulse arrives a → 0 within 1 gate delay, and within the same delay b → D. Now, regardless of what value Q had, it now goes to 1 within an additional gate delay. Now, since b = D, Q → D. At this time Q is still 1. However when the Clk goes low a → 1 and b → 1, and the value of Q is set.
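The read-on-rising-edge, write-on-falling-edge behavior described for figure 7.11 can be sketched as a small behavioral model. This is not a gate-level model of figure 7.10; the class name DFlipFlop and its step method are hypothetical names chosen for this illustration.

```python
class DFlipFlop:
    """Behavioral sketch: D is read on the rising clock edge, Q is written on the falling edge."""

    def __init__(self, q=0):
        self.q = q            # value currently held at the output Q
        self._clk = 0         # previous clock level, used to detect edges
        self._sampled = q     # value of D captured at the last rising edge

    def step(self, clk, d):
        """Apply one clock level and D value; return the pair (Q, Q')."""
        if clk == 1 and self._clk == 0:
            # Rising edge: start reading D.
            self._sampled = d
        elif clk == 0 and self._clk == 1:
            # Falling edge: write the captured value to Q.
            self.q = self._sampled
        self._clk = clk
        return self.q, 1 - self.q


ff = DFlipFlop()
for clk, d in [(1, 1), (0, 0), (1, 0), (0, 1)]:
    print(clk, d, ff.step(clk, d))
```

Note that a change on D while the clock is low has no effect on Q, which is what makes the stored value usable during the rest of the cycle.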

7.3 The pass gate

The pass-gate shown in the figure 7.13 is a switch. The pass-gate uses a back to back NFET and PFET pair to conduct a signal. An NFET conducts a good 0 but can conduct a 1 only up to (Vdd − Vt), as shown in the figure 7.12, where Vdd is the supply voltage and Vt is the threshold voltage of the NFET. So it conducts a 1 only poorly, i.e. when the supply voltage is 2.5 volts and the threshold voltage is 0.5 volts, the 1 that is conducted is only 2.0 volts and not 2.5 volts as required. This is because when the source of the NFET is raised to (Vdd − Vt), the NFET turns off.


Similarly the PFET conducts a good 1 but a poor 0 because the PFET will turn off when the source voltage is reduced to the threshold voltage of the PFET. So for the case when the supply voltage is 2.5 volts and the threshold voltage is 0.5 volts, the 0 that is conducted is 0.5 volts and not 0 volts as required. So, by using a back to back NFET and PFET pair, with a 0 on the PFET gate and 1 on the NFET gate, any signal, either 0 or 1, can be conducted properly.
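The voltage limits quoted above can be put into a rough numerical sketch. It assumes an idealized switch model in which an NFET passes any level up to Vdd − Vt unchanged, a PFET passes any level down to Vt unchanged, and both FETs have the same threshold magnitude; the function names are made up for this example.

```python
def nfet_pass(v_in, vdd, vt):
    """Level passed by an NFET with its gate at vdd: it cannot rise above vdd - vt."""
    return min(v_in, vdd - vt)


def pfet_pass(v_in, vt):
    """Level passed by a PFET with its gate at 0: it cannot fall below vt."""
    return max(v_in, vt)


def pass_gate(v_in, vdd, vt):
    """Level passed by the NFET and PFET pair (assumes vt <= vdd - vt)."""
    n_level = nfet_pass(v_in, vdd, vt)
    # If the NFET passes the level unchanged use it, otherwise the PFET takes over.
    return n_level if n_level == v_in else pfet_pass(v_in, vt)


print(nfet_pass(2.5, 2.5, 0.5))   # poor 1 through the NFET alone
print(pfet_pass(0.0, 0.5))        # poor 0 through the PFET alone
print(pass_gate(2.5, 2.5, 0.5), pass_gate(0.0, 2.5, 0.5))
```

With the 2.5 volt supply and 0.5 volt threshold of the text, the single FETs give the degraded 2.0 volt and 0.5 volt levels, while the pair passes both rails.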

Figure 7.12: An NFET conducting a 1. (Labels: Vdd on two terminals and Vdd − Vth on the third.)

Figure 7.13: A pass gate. (Control signals C and Cb: C = 1, Cb = 0 gives high impedance; C = 0, Cb = 1 gives conduction.)

7.4 Karnaugh maps

Figure 7.14: A Karnaugh map. (Rows are labeled with ab and columns with cd, each using the combinations 00, 01, 11 and 10; the cells hold 1 and x entries.)

The purpose of a Karnaugh map is to help you extract the minimum logic that satisfies the truth table you have defined for a circuit. A sample Karnaugh map for a logic circuit containing four inputs a, b, c and d and one output is shown in the figure 7.14. The four combinations of signals a and b are listed on the left and the four combinations of signals c and d are listed across the top. Notice that in these combinations only one bit changes at a time as you go from top to bottom and from left to right. Within the map, you place a 1 for each row and column combination that should output a 1. If you leave it blank it means a 0. If you specifically don’t care what the output is for that row column combination you can place an x, which can be used as either a 0 or a 1.

Now you start grouping the 1’s together. The grouping has to be rectangular in shape and you can wrap around the edge of the table, i.e. the right edge continues on the left edge and the top edge continues on the bottom edge. Any element can appear in multiple groups and the goal is to create the largest groupings possible. Each group that you define will yield a term in the final boolean expression. The larger the grouping the smaller the term. If you use an x within a grouping you are assigning a 1 as the output for that row column combination. If an x is not used in any grouping the output for that row column combination will be 0. For the Karnaugh map we have defined the expression yielded is given by:

a · c + b · c · d (7.8)
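The row and column labels of a Karnaugh map follow the ordering in which only one bit changes between neighbors, including across the wrap-around edge. A minimal sketch of how such an ordering can be generated is given below; the function names are hypothetical and the n ^ (n >> 1) construction is the standard reflected Gray code, which is an assumption about how the labels of figure 7.14 were chosen.

```python
def gray_code(bits):
    """Return the Gray code sequence for the given number of bits as strings."""
    return [format(n ^ (n >> 1), "0{}b".format(bits)) for n in range(2 ** bits)]


def one_bit_changes(labels):
    """True if neighboring labels, including the wrap-around pair, differ in exactly one bit."""
    for current, following in zip(labels, labels[1:] + labels[:1]):
        differences = sum(x != y for x, y in zip(current, following))
        if differences != 1:
            return False
    return True


labels = gray_code(2)
print(labels, one_bit_changes(labels))
```

The wrap-around check matters because a grouping in the map is allowed to continue from the right edge to the left edge and from the top edge to the bottom edge.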

7.5 Finite state machines

Figure 7.15: A finite state machine. (States 000, 001, 011, 111 and 110; each arrow is labeled input/output, and a single arrow carries the label 1/1.)

Designing finite state machines is fundamental to digital circuit design. The key feature of a finite state machine is defining a set of states which a system could be in. Without thinking in terms of finite state machines, any sequential logic would become difficult to manage as you add more gates.

Let us design an FSM to read the pattern 01011. So a stream of bits is coming in and if the FSM sees this pattern, it outputs a 1, otherwise the output is 0. The state machine will look as shown in the figure 7.15. It has five states and one output. In this diagram the arrows are labeled as input/output. Notice that when picking the sequence designating the five states we picked them so that only one bit changes at a time; this way we have a good chance of getting a logic that is minimal.

The implementation of the FSM is shown in the figure 7.16. The state of the FSM is abc. To analyze this diagram use the input 00110101011 as shown in the table 7.2. The state abc is after the input I is received. The truth table for the FSM of figure 7.17 is shown in table 7.3.

I  abc  Output
0  001  0
0  001  0
1  011  0
1  000  0
0  001  0
1  011  0
0  111  0
1  110  0
0  111  0
1  110  0
1  000  1

Table 7.2: FSM states for 00110101011 input.

a_{n-1}  b_{n-1}  c_{n-1}  I  O  a_n  b_n  c_n
0        0        0        0  0  0    0    1
0        0        0        1  0  0    0    0
0        0        1        0  0  0    0    1
0        0        1        1  0  0    1    1
0        1        0        0  0  0    0    0
0        1        0        1  0  0    0    0
0        1        1        0  0  1    1    1
0        1        1        1  0  0    0    0
1        0        0        0  0  0    0    0
1        0        0        1  0  0    0    0
1        0        1        0  0  0    0    0
1        0        1        1  0  0    0    0
1        1        0        0  0  1    1    1
1        1        0        1  1  0    0    0
1        1        1        0  0  0    0    1
1        1        1        1  0  1    1    0

Table 7.3: Truth table for FSM of figure 7.17.

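$a_n = a_{n-1} b_{n-1} (\overline{c_{n-1}}\,\overline{I} + c_{n-1} I) + \overline{a_{n-1}}\, b_{n-1} c_{n-1} \overline{I}$ (7.9)

$b_n = a_{n-1} b_{n-1} (\overline{c_{n-1}}\,\overline{I} + c_{n-1} I) + \overline{a_{n-1}}\, c_{n-1} (\overline{b_{n-1}}\, I + b_{n-1} \overline{I})$ (7.10)

$c_n = \overline{a_{n-1}}\,\overline{b_{n-1}} (c_{n-1} + \overline{I}) + b_{n-1} \overline{I} (a_{n-1} + c_{n-1})$ (7.11)

$O = a_{n-1} b_{n-1} \overline{c_{n-1}}\, I$ (7.12)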
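The next-state equations 7.9 to 7.12 can be exercised directly in software. The sketch below evaluates them with the inverted terms implied by table 7.3, starts from state 000 (an assumption taken from the first row of table 7.2), and replays the input 00110101011. The function names next_state and run are made up for this illustration.

```python
def next_state(a, b, c, i):
    """Evaluate equations 7.9 to 7.12; returns ((a_n, b_n, c_n), output)."""
    na, nb, nc, ni = 1 - a, 1 - b, 1 - c, 1 - i   # inverted signals
    a_n = (a & b & ((nc & ni) | (c & i))) | (na & b & c & ni)
    b_n = (a & b & ((nc & ni) | (c & i))) | (na & c & ((nb & i) | (b & ni)))
    c_n = (na & nb & (c | ni)) | (b & ni & (a | c))
    out = a & b & nc & i
    return (a_n, b_n, c_n), out


def run(bits, state=(0, 0, 0)):
    """Feed a string of input bits to the FSM and record (state after input, output)."""
    trace = []
    for ch in bits:
        state, out = next_state(*state, int(ch))
        trace.append(("".join(str(bit) for bit in state), out))
    return trace


for i, (state, out) in zip("00110101011", run("00110101011")):
    print(i, state, out)
```

The printed trace can be compared row by row with table 7.2; the output goes to 1 only on the last bit, when the pattern 01011 has just been completed.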

Figure 7.16: An FSM implementation. (Input I, state bits a, b and c, output O, and blocks labeled L0 to L3.)

When you power on a circuit, it may come up in any state. Hang states are states which have no way of changing to a state that you designed for. For example in the figure 7.15 there are three states that are unaccounted for, namely 010, 100 and 101. Since these are three out of eight possible states you could guess that there is a 37.5 % probability that the FSM will be in one of these states when it powers up. If it did, it might not respond to any input, so the FSM would never change states, and the system is then said to be in a hang state. To avoid hang states we simply add a path from these states to a known state as shown in the figure 7.17. The x indicates a don’t care.

7.6 Domino logic

Domino logic is the most common form of dynamic logic. Dynamic logic is used in microprocessors for two reasons, namely that the size of the circuit is smaller and that for a given supply voltage you can implement logic that would not be feasible in CMOS. Dynamic logic is based on the transient movement of stored charge so it is particularly layout sensitive. A generic stage is as shown in the figure 7.18. The dotted box contains the logic to be realized.

The operation is divided into two parts, precharge and evaluate. In this figure the pmos transistor is the pull-up transistor. It is turned on when the clock is low. It charges up the source and drains of the nmos logic in the dotted box. Since the nmos pull down transistor is off, current cannot flow, but the logic inputs decide what charge is stored on the source and drain capacitors in the logic box and on the input capacitance of the load.

When the clock goes high the logic is evaluated. The pfet pull up transistor is turned off and the nfet pull down is turned on. At this time if the logic in the dotted box evaluates to 1, the source and drain capacitors are discharged. However if the logic in the dotted box evaluates to 0, then the charge stored on the source and drain contacts is visible to the output.

Figure 7.17: Avoiding hang states. (The state machine of figure 7.15 with the unused states 010, 100 and 101 added, each with an arrow labeled x/0.)

Figure 7.18: A Domino logic stage. (Clock input shown.)

Since the charge stored was originally stored based on the pull up pfet being turned on, there will be some charge redistribution when the pull down nfet is on instead. The sizing of the transistors needs to be such that this charge redistribution does not result in a change in the output value. This is only a concern for the case where the logic in the dotted box evaluates to 0. A method of analyzing dynamic logic is outlined in [72].

Since all capacitances are leaky, you cannot depend on the evaluated logic to hold for much after the end of the cycle, so it is timing critical that you take the output as soon as possible and do something with it, i.e. this is dynamic logic and it is only valid as long as the data keeps flowing. Often CMOS logic is interspersed with the dynamic logic to actively pull many signals to either a solid 1 or 0.

Chapter 8

Analog circuits

8.1 Current mirror

Consider the circuit in the figure 8.1. The PFET controls the currents I1, I2, I3 and I4. If the areas of the BJTs are A1, A2, A3 and A4 respectively, then the ratio of the currents is given by the equation 8.1.

$\frac{I_1}{A_1} = \frac{I_2}{A_2} = \frac{I_3}{A_3} = \frac{I_4}{A_4}$ (8.1)

Figure 8.1: A current mirror. (Labels: areas A1 to A4, currents I1 to I4 and the node Vc.)

8.2 Current sources

The ideal current source can drive a fixed current no matter how large the resistance it is driving into. In real life there is no such thing as an ideal current source. In real life current sources are essentially variable resistors which have the property that the resistance drops as the voltage across the source drops and rises as the voltage across the source rises. In this way they can attempt to maintain a constant current.

The driving point impedance is a concept which is only valid over a range of current output by a current source. It is based on a Thevenin equivalent circuit. Compare the two circuits shown in the figure 8.2.

Figure 8.2: Different driving point impedance. (Two Thevenin sources, each driving a load Rl; the component values shown are 9 and 10 for one circuit and 99 and 100 for the other.)

If the load resistance Rl in the figure 8.2 is 1 Ω then either of the two circuits will drive 1 A of current through the load. But if the load resistance is 20 Ω, then the circuit on the left will only drive 0.345 A through the load, whereas the circuit on the right will drive 0.840 A through the load.

Now, since your goal was to drive 1 A through the load, then if your load varies between 1 Ω and 20 Ω, it is clear that the circuit on the right does a better job of meeting the requirement. So the Thevenin impedance of your current source is called its driving point impedance and you would like this to be much higher than the maximum value that the load resistance can become.
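The numbers quoted for the two circuits follow from Ohm’s law applied to a Thevenin source. The sketch below assumes that the left circuit is a 10 V source in series with 9 Ω and the right circuit is a 100 V source in series with 99 Ω; these values are inferred from the labels of figure 8.2 and reproduce the currents in the text. The function name load_current is made up for this example.

```python
def load_current(v_source, r_source, r_load):
    """Current driven through r_load by a Thevenin source (v_source in series with r_source)."""
    return v_source / (r_source + r_load)


for r_load in (1.0, 20.0):
    left = load_current(10.0, 9.0, r_load)      # assumed left circuit: 10 V, 9 ohm
    right = load_current(100.0, 99.0, r_load)   # assumed right circuit: 100 V, 99 ohm
    print(r_load, round(left, 3), round(right, 3))
```

The source with the higher series resistance holds the current much closer to 1 A as the load grows, which is the reason for wanting a high driving point impedance.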

Figure 8.3: An n type current source.

Current sources normally use only a single FET or BJT to control the current flow as shown in the figures 8.3 and 8.4. If you want to sink current you use nmos or n-p-n as in the figure 8.3, whereas if you want to source current you use pmos or p-n-p as in the figure 8.4.

Figure 8.4: A p type current source.

One way to improve the output resistance of an FET current source is to simply use longer channel FETs, which are less susceptible to the DIBL (drain induced barrier lowering) effect and hence have drain characteristics with a smaller slope. This also has the advantage that the variation in Ldiff is a much smaller fraction of the channel length, so you will get better immunity from process variation. The flip side is that the longer the channel the more your exposure to back bias, so now you will have to worry about fluctuations in substrate potential.

One disadvantage with controlling a current source by a gate voltage or base voltage is that with the normal variation in process, the same voltage at the gate or base at the slow end of process would result in a current very different from that obtained at the fast end of process. This is especially true for the BJT because the base diode current is exponentially dependent on the base voltage.

Figure 8.5: Resistor controls the current.

The way this is avoided is to use a current mirror as shown in the figure 8.5. The current of the current source is controlled by the resistance. So at the slow process corner the control voltage is higher and at the fast process corner the control voltage is lower, so there is some compensation here.

But even so, given that the threshold voltage of the typical FET is a significant portion of the supply voltage and the gate voltage has to be somewhat higher than the threshold voltage, the current at the fast process corner is going to be higher than that at the slow process corner.

A length of diffusion implant with a contact at either end is an effective resistor as shown in the figure 8.6. The resistance is dependent on the effective length L, the width W (not shown) and the depth D.

Figure 8.6: A diffusion resistor. (L is the effective length and D is the depth.)

Since the diffusion is implanted in a semiconductor of the opposite type as the implant species, the depth D is actually potential dependent and hence the usual practice is to characterize the diffusion resistor as a JFET. As the potential of the resistor increases, the depletion width reduces the height of the resistor and increases its resistance.

When you need a constant value independent of process corner, it is common to use perhaps four independent resistances with two of them in parallel in series with the other two in parallel. By orienting the resistors in perpendicular directions and using specific combinations of W and L one can construct a resistor which is relatively independent of process corner.

The Widlar current source [73] is a way to generate very small currents in a BJT current source without the use of large value resistors. A resistor is inserted in series with the emitter as shown in the figure 8.7. Since both base and collector current flow through this resistor it drops a voltage of (1 + β)Ib R2, so a small resistor is sufficient.

This voltage is the difference in the Vbe of the two n-p-n BJTs and so you can get a very large ratio between the currents I1 and I2. After solving equation 8.2 to obtain VBE1, you can substitute it in equation 8.3 for an assumed value of I2. Then you can adjust either A2 or R2 until the Vout is what you require.

Figure 8.7: Widlar current source. (Transistors Q1 and Q2, resistors R1 and R2, currents I1 and I2, and the output Vout.)

$V_{BE1} + R_1 I_s \exp\left[\frac{V_{BE1}}{V_T}\right]\left[1 + \frac{V_{BE1}}{V_A}\right] = V_{CC}$ (8.2)

$V_{BE2} = V_{BE1} - \frac{kT}{q} \ln\frac{I_1 A_2}{I_2 A_1}$ (8.3)
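Equation 8.2 has no closed form solution for VBE1, but its left side rises steadily with VBE1, so a simple bisection search is enough. The sketch below is one possible way to carry out the procedure described above, under stated assumptions: the component values (5 V supply, 4.3 kΩ, a saturation current of 1e-15 A, an Early voltage of 50 V, a thermal voltage of 25.85 mV) are illustrative and not from the text, the function names are made up, and the final estimate of R2 neglects the base current.

```python
import math


def solve_vbe1(vcc, r1, i_s, v_t=0.02585, v_a=50.0):
    """Solve equation 8.2 for VBE1 by bisection between 0 and vcc."""
    def residual(v):
        return v + r1 * i_s * math.exp(v / v_t) * (1.0 + v / v_a) - vcc

    low, high = 0.0, vcc
    for _ in range(200):
        mid = 0.5 * (low + high)
        if residual(mid) > 0.0:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)


def vbe2(vbe1, i1, i2, a1=1.0, a2=1.0, v_t=0.02585):
    """Equation 8.3: VBE of the second transistor for a chosen output current i2."""
    return vbe1 - v_t * math.log((i1 * a2) / (i2 * a1))


# Illustrative values only (assumptions, not from the text).
VCC, R1, IS, VT, VA = 5.0, 4300.0, 1e-15, 0.02585, 50.0
v1 = solve_vbe1(VCC, R1, IS, VT, VA)
i1 = IS * math.exp(v1 / VT) * (1.0 + v1 / VA)   # reference current through R1
i2 = 10e-6                                      # assumed target output current
v2 = vbe2(v1, i1, i2, v_t=VT)
r2 = (v1 - v2) / i2                             # emitter resistor, base current neglected
print(v1, i1, v2, r2)
```

With these assumed values a reference current of about 1 mA is scaled down to 10 µA with an emitter resistor of only about 12 kΩ, which illustrates why the Widlar arrangement avoids large value resistors.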

A cascoded current source is similar to a simple current source except you stack the mirror sections as shown in the figure 8.8. The improved performance is due to the way the potential at point a responds to changes in the potential at b.

Figure 8.8: A cascoded current source. (The nodes a and b are marked on each stack.)

When the load resistance becomes large, the cascoded current source behaves just like the simple current source. The improvement is when the load resistance is reduced. In the simple current source of the figure 8.5, the current of the source will increase when the load resistance drops.

Figure 8.9: Bias point a in a cascoded current source. (Id, Ic plotted against Vds, Vce for the curves Vgs1, Vbe1 through Vgs5, Vbe5.)

This is because for either an FET or a BJT, for the same gate voltage or base voltage, the current through the FET or the BJT will increase as the drain voltage or collector voltage is increased, by as much as 50% or more. This is due to the slope in the drain or collector characteristics.

But in the case of the cascoded current source, as the voltage at b rises due to a drop in the load resistance, the voltage at a rises as indicated by the diamonds in the figure 8.9. So as the load resistance falls, the bias point of the upper transistor moves from the left-most diamond to the right-most diamond. So as the Vds or Vce rises, the Vgs or Vbe falls so as to keep the drain or collector current the same.

8.3 Active load

An active load is actually a current mirror used in a differential circuit as shown in the figure 8.10, but it is better than a pair of resistors because it effectively causes a gain in the output swing.

Figure 8.10: A current mirror as an active load. (Inputs A and $\overline{A}$, output B, transistors Q1 to Q4.)

Let us look at what happens when A drives low and $\overline{A}$ drives high. Q3 starts to turn off, so then Q1 starts to turn off, forcing Q2 to also start to turn off, which means it exhibits a higher resistance, therefore the voltage at B will drive hard low.

Contrast this with the case where Q2 is replaced with a resistor. In this case it will still provide a pull-up current which Q4 has to fight to pull down the load attached to B. So as you can see, using an active load improves the output swing.

8.4 Level shifting

In analog circuits one sometimes wishes to shift an output voltage either higher or lower by a fixed amount. In the first case you would use a p type shifter and in the second you would use a shifter based on nfet or n-p-n as shown in the figure 8.11. The actual amount of the shift will depend on the bias point of the upper transistor.

Figure 8.11: A level shifter. (The input A is shifted to A − Vgs in the FET version and to A − Vbe in the BJT version.)

8.5 Common emitter/source amplifier

You can construct an amplifier with a single transistor as shown in the figure 8.12. If you vary the input voltage by a small amount about the bias point the transistor’s bias point varies between the points 1 and 2 as shown in the figure 8.13.

Figure 8.12: Single transistor amplifier. (Labels: Vg, Ii, I2 and Ipu.)

The single stage shown in the figure 8.12 is inverting. When the input voltage rises, the BJT or FET current rises, the voltage dropped across the load resistor increases and the output voltage drops. When the input voltage is at its lowest the transistor is at point 1 and when the input voltage reaches its maximum the transistor is at point 2.

Figure 8.13: Transistor bias point. (Id, Ic plotted against Vds, Vce, with the bias points 1 and 2 marked.)

8.6 DC gain

The small signal gain of the amplifier of figure 8.12 depends on the bias point and the value of the resistance and also the slope of the drain or collector characteristics.

The output voltage is given by the equation 8.4 for the BJT stage and the equation 8.5 for the FET stage. In equation 8.4 Is is just the reverse saturation current of the base emitter diode alone, and equation 8.5 uses just the ideal saturation equation for the FET current and does not consider the drain slope.

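$V_{out} = V_{supply} - \beta \cdot R \cdot I_s \cdot e^{q (V_{be} + \Delta V_{be})/(nkT)}$ (8.4)

$V_{out} = V_{supply} - R \cdot \mu \frac{\varepsilon_{ox}}{T_{ox}} \frac{W}{2L} (V_{gs} + \Delta V_{gs} - V_{th})^2$ (8.5)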

Now take the derivative of Vout vs. Vin and you get the voltage gain. Similarly you can obtain the current gain. In order for you to have amplification the product of the current gain and the voltage gain has to be significantly higher than one.
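Taking the derivative of equations 8.4 and 8.5 with respect to the input voltage gives closed form expressions for the voltage gain, and these can be checked against a numerical derivative. The sketch below lumps µ εox W / (Tox L) into a single parameter k; that parameter, the function names and the numeric values used in the example are assumptions made for this illustration.

```python
import math

K_OVER_Q = 8.617e-5   # Boltzmann constant divided by the electron charge, in V/K


def vout_bjt(vbe, v_supply, beta, r, i_s, n=1.0, temp=300.0):
    """Equation 8.4 with the input taken as the total base-emitter voltage."""
    return v_supply - beta * r * i_s * math.exp(vbe / (n * K_OVER_Q * temp))


def gain_bjt(vbe, beta, r, i_s, n=1.0, temp=300.0):
    """Derivative of equation 8.4 with respect to the base-emitter voltage."""
    n_vt = n * K_OVER_Q * temp
    return -beta * r * i_s * math.exp(vbe / n_vt) / n_vt


def vout_fet(vgs, v_supply, r, k, vth):
    """Equation 8.5 with k = mu * eps_ox * W / (T_ox * L)."""
    return v_supply - r * 0.5 * k * (vgs - vth) ** 2


def gain_fet(vgs, r, k, vth):
    """Derivative of equation 8.5 with respect to the gate-source voltage."""
    return -r * k * (vgs - vth)


# Illustrative bias points (assumed values).
print(gain_fet(1.0, 10e3, 200e-6, 0.5))
print(gain_bjt(0.6, 100.0, 1e3, 1e-15))
```

Both gains are negative, which agrees with the stage being inverting, and both grow with the bias current, which is why the gain depends on the bias point.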

8.7 Emitter/Source follower input

The common collector/drain stage, which is also called the emitter/source follower and is shown in the figure 8.14, is used as an input stage due to its high input impedance and low output impedance. The reason this is important in an input stage is to avoid loading the output of the previous stage.

Figure 8.14: An emitter follower input. (The input A drives a load Rl at A − Vbe in the BJT version and at A − Vgs in the FET version.)

The load impedance RL is multiplied by (1 + β) in the case of the BJT and by gm in the case of the FET and is visible to the input, i.e. the input voltage dropped across the base-emitter or the gate-source is reduced by (1 + β)Ib RL or gm Vgs RL.

So the input resistance is very high so long as you have a high β or gm. In the case of the FET, the back bias does have an effect, but if the gate length is very short, the back bias is much less important.

8.8 Bootstrapping

Figure 8.15: A bootstrap capacitor. (Input A, output A − Vbe across Rl, and the bias resistors R1 and R2.)

If you have a unity voltage gain amplifier, and that amplifier has an input biasing arrangement, then you can increase the effective input impedance of the amplifier by the use of a bootstrap capacitor. This is just a capacitor large enough that it has a low impedance at the lowest frequency to be amplified, connected between the output and the input as shown in the figure 8.15.

The bias voltage of the base is R1 Vsupply/(R1 + R2). If the signal driving the input wants to raise the base voltage by ∆V, the current through R1 has to reduce by ∆V/R1 while the current through R2 has to increase by ∆V/R2, which means that the feedback capacitor has to supply the current

$\Delta I = \Delta V \left[\frac{1}{R_2} - \frac{1}{R_1}\right]$ (8.6)

Keep in mind that this only works if the input signal is ac. The capacitor voltage does not change because the voltage gain is unity, so there is no damping effect due to the feedback.

It is called a bootstrap circuit because the input just seems to pull itself up by its bootstraps. The best way to think of a bootstrap circuit is to compare it to the counterweight used in an elevator shaft.

8.9 Miller’s theorem

Miller’s theorem is particularly useful in analyzing amplifier circuits which have an impedance such as a capacitance connecting the output to the input, but it applies in general to any circuit. In the figure 8.16 you need to be able to express the voltage at 2 as a function of the voltage at 1 so that V2 = G V1.

Figure 8.16: The Miller effect. (Left: an impedance Z connected between the nodes 1 and 2. Right: the equivalent circuit with Z′ at node 1 and Z″ at node 2.)

If that is true then you can analyze the circuit by the equivalent circuit on the right, where:

$Z' = \frac{Z}{1 - G}$ (8.7)

$Z'' = \frac{Z G}{G - 1}$ (8.8)
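A short numerical sketch shows why equations 8.7 and 8.8 matter for a capacitor bridging an inverting amplifier. The values used (1 pF, a gain G of −99 and a frequency of 1 MHz) are assumptions chosen for this illustration, and the function name miller_split is made up.

```python
def miller_split(z, g):
    """Return (Z', Z'') from equations 8.7 and 8.8 for an impedance z and gain g."""
    if g == 1:
        raise ValueError("no current flows through Z when G = 1")
    z_in = z / (1 - g)
    z_out = z * g / (g - 1)
    return z_in, z_out


omega = 6.283185307e6            # 2 * pi * 1 MHz
cap = 1e-12                      # 1 pF from output back to input (assumed)
z = 1 / (1j * omega * cap)       # impedance of the capacitor
z_in, z_out = miller_split(z, -99)

# Convert the input side impedance back into an equivalent capacitance.
c_in = (1 / (1j * omega * z_in)).real
print(c_in)
```

With G = −99 the 1 pF capacitor looks like roughly 100 pF at the input, which is the familiar Miller multiplication, while the output side sees almost the original impedance.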

8.10 Gain bandwidth product

Any integrated circuit contains parasitic capacitances. In addition, the internal capacitances of the devices change with the bias point. In order for an analog circuit to function, these capacitances need to be charged and discharged. The current drive of the FETs or BJTs used in an analog circuit is determined by the bias point.

So once you set the bias point in an amplifier circuit, for a small signal applied to the input, both the capacitances and the drive currents are known. At this time the frequency content at the input is amplified differently. For the low frequency content, there is more time to charge and discharge the capacitances than at higher frequencies.

For this reason a quantity is defined called the gain-bandwidth product at a given bias point. By decomposing the input using the Fourier transform and calculating the amplification for each component and then merging them back together you can obtain the output.

So if you wish to obtain as little frequency distortion as possible you would like to design the bias point with a high value for the gain-bandwidth product.

8.11 Voltage reference

The figure 8.17 is the Widlar band-gap reference [74]. In a previous paper [73], the voltage dropped across R3 is given by equation 8.9 if Q1 and Q2 are identical. The equation relating Vbe of a BJT to the collector current was given in [75] as equation 8.10, where Vg0 is the band gap energy extrapolated to 0 K and VBE0 is the VBE at T0.

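$\Delta V_{BE} = \frac{kT}{q} \log_e \frac{I_{c1}}{I_{c2}}$ (8.9)

$V_{BE} = V_{g0}\left(1 - \frac{T}{T_0}\right) + V_{BE0}\left(\frac{T}{T_0}\right) + \frac{nkT}{q} \log_e\left(\frac{T_0}{T}\right) + \frac{kT}{q} \log_e \frac{I_c}{I_{c0}}$ (8.10)

Figure 8.17: Widlar band-gap reference. (Transistors Q1 to Q3, resistors R1 to R3, the current I, the nodes v1 and v2, and the output Vref.)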

In [74] the requirement to make the output voltage Vref stay constant over a temperature range is given by equation 8.11. The resulting variation was reported to be as little as 0.3% over a range of temperature from −55 °C to 125 °C. There are also all-CMOS voltage references such as [76].

$V_{g0} = V_{BE0} + \frac{kT_0}{q} \log_e \frac{I_{c1}}{I_{c2}}$ (8.11)

8.12 Differential circuits

Differential circuits are different from normal circuits in that all signals travel in pairs. For every signal there is a corresponding signal that does the opposite. This other signal is often denoted by a bar over the signal name to denote that it is the opposite, just as in a logic circuit. The way these signals are used is shown in the figure 8.18.

In the figure 8.18 suppose R = 20 kΩ, I = 20 µA, n = 1.5 and Is = 50 pA. If A and $\overline{A}$ are at the same voltage, then 10 µA flows through the left side of the circuit and 10 µA flows through the right side of the circuit, and both B and $\overline{B}$ are at 200 mV lower than the supply voltage.

Now, if the difference voltage $A - \overline{A} = 200$ mV, then the ratio of the currents is given by

$\frac{I_1}{I_2} = \frac{I_s e^{q V_{be1}/nkT}}{I_s e^{q (V_{be1} - 200\,\mathrm{mV})/nkT}}$ (8.12)

$I_1 + I_2 = 20\ \mu A$ (8.13)

$\Rightarrow I_1 = 19.88\ \mu A,\ I_2 = 0.12\ \mu A$ (8.14)

$\Rightarrow V_{be1} = 0.5\ V$ (8.15)
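The values in equations 8.14 and 8.15 can be reproduced with a few lines of arithmetic. The sketch below assumes a thermal voltage kT/q of 26 mV, which is not stated in the text but matches the quoted results; the function name diff_pair is made up for this example.

```python
import math


def diff_pair(i_total, dv, n=1.5, v_t=0.026, i_s=50e-12):
    """Split the tail current of a BJT differential pair for a difference voltage dv.

    Returns (I1, I2, Vbe1) following equations 8.12 and 8.13.
    """
    ratio = math.exp(dv / (n * v_t))        # I1 / I2 from equation 8.12
    i2 = i_total / (1.0 + ratio)            # from I1 + I2 = i_total
    i1 = i_total - i2
    vbe1 = n * v_t * math.log(i1 / i_s)     # invert I1 = Is * exp(Vbe1 / (n * v_t))
    return i1, i2, vbe1


i1, i2, vbe1 = diff_pair(20e-6, 0.2)
print(round(i1 * 1e6, 2), round(i2 * 1e6, 2), round(vbe1, 2))
```

A difference of only 200 mV steers practically all of the 20 µA to one side, so the outputs swing by nearly the full 400 mV that the two 20 kΩ resistors allow.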

Differential circuits are used anywhere that you require a high rejection of supply noise and other noise in the circuit. For example there are always regions of the circuit where the power consumption density is higher than normal. In such cases if the supply lines are not wide enough, there will be a localized drop in supply voltage.

Figure 8.18: A sample differential circuit. (Inputs A and $\overline{A}$ on the transistors Q1 and Q2, outputs B and $\overline{B}$, load resistors R, branch currents I1 and I2, tail current I at node P.)

So if a signal is output by a circuit in this region, it will be lower than it should be. If this were a single ended signal and it was lower than it should be it would be incorrect at the receiving end. But if the signal was differential in nature then both the signal and its inverse would be lower than they should be, and since only the difference voltage between the two is important, there is no error at the receiving end.

The common mode rejection ratio is defined as in equation 8.16 where Ad is the gain of the difference signal and Ac is the gain of the common mode signal.

$CMRR = \left| \frac{A_d}{A_c} \right|$ (8.16)

8.13 Transistor matching

Figure 8.19: Common centroid layout. (Two structures built from the drain regions D1 and D2 and shared source regions S.)

In analog design there is often a need to have a good matching between two transistors of the same size. One case for matching transistors is a differential circuit but there are many other cases where there is such a need. The simulation of analog circuits will not show the effect of transistor mismatch unless you do something specific to model that effect [77].

The most often used layout technique is the "common centroid" approach as shown in the figure 8.19. For the case when this structure is used as a common source pair the matching can be improved by using the structure on the right. Some of the studies of matching in circuit design such as [78], [79] help decide what W/L ratios would give the best results and how to model the mismatch in a circuit simulation.

In the two structures in the figure 8.19 there are two main effects that help matching. The first is close physical proximity of the two FETs. If the FETs are placed side by side, they are more likely to receive the same level of source, drain and threshold adjust implants, they will be exposed to the same level of etching and oxide growth and go through the same thermal cycles.

The other effect is due to angular mismatch. Masks are aligned optically, so there will be some angular mismatch between one masking step and the next. This is the reason for splitting the FETs into two parts and placing them diagonally to each other. Suppose the upper FETs were a little narrower and the lower FETs were a little wider; it would be compensated for. The same holds for any mismatch effect that has an angular component to it.

8.14 Bode plots

Bode plots come in pairs. One plot is the log of the magnitude versus the log of the frequency and the other is the phase angle versus the log of the frequency. The unit for the magnitude plot is the decibel or dB, which is 20 log10 |G(jω)|. The magnitude plot can be easily drawn once the transfer function is factorized into poles and zeros. For example the transfer function in the equation 8.17 has zeros at b and c and poles at d, e and f.

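$G(j\omega) = \frac{a (b + j\omega)(c + j\omega)}{(d + j\omega)(e + j\omega)(f + j\omega)}$ (8.17)

Figure 8.20: Bode plot for equation 8.17 if b < c < d < e < f. (Straight line magnitude plot with the break points b, c, d, e and f marked.)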

First the poles and zeros are ordered in increasing frequency. The plot is started at the magnitude of G(jω) at zero frequency, but if b, c, d, e or f is zero, you can evaluate it at a higher frequency and later on extrapolate back to zero. Then the frequency is incremented. Each time a zero is reached the slope of the magnitude is incremented by 20 dB/decade, and each time a pole is reached the slope is decremented by 20 dB/decade. Having plotted the straight line graph as shown in the figure 8.20, the actual frequency response is obtained by correcting the response around the poles and zeros because the straight lines just connect the asymptotes.

Figure 8.21: Bode phase plot for equation 8.17 if b < c < d < e < f. (Phase angle versus frequency, with b, c, d, e and f marked on the frequency axis and 0 marked on the phase axis.)

For the phase plot, each term in the equation 8.17 contributes a phase angle; the angles add for the terms in the numerator and subtract for the terms in the denominator, i.e. the b + jω term gives an angle of $\tan^{-1}(\omega/b)$ and the d + jω term gives an angle of $-\tan^{-1}(\omega/d)$, as shown in the figure 8.21.
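The exact magnitude and phase of equation 8.17 at any frequency can be computed directly, which is useful for checking a hand drawn straight line plot. The sketch below assumes a positive constant a and positive break frequencies, so that the phase is simply the sum of the arctangent terms described above; the function name bode_point is made up for this example.

```python
import math


def bode_point(omega, a, zeros, poles):
    """Return (magnitude in dB, phase in degrees) of equation 8.17 at angular frequency omega.

    zeros and poles are lists of break frequencies such as [b, c] and [d, e, f].
    """
    g = complex(a, 0.0)
    for z in zeros:
        g *= complex(z, omega)
    for p in poles:
        g /= complex(p, omega)
    magnitude_db = 20.0 * math.log10(abs(g))
    # Phase as a sum of arctangent terms: valid for a > 0 and positive break frequencies.
    phase = sum(math.atan(omega / z) for z in zeros) - sum(math.atan(omega / p) for p in poles)
    return magnitude_db, math.degrees(phase)


# Single pole example: 3 dB down and -45 degrees at the break frequency.
print(bode_point(1.0, 1.0, [], [1.0]))
```

At the break frequency of a single pole the exact response is about 3 dB below the straight line plot, which is the kind of correction around the poles and zeros mentioned earlier.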

8.15 Routh’s stability criteria

Amplifiers are often used in feedback applications as shown in the figure 8.22. Using Laplace transforms the gain of the amplifier would be G(s) and the gain of the feedback loop is H(s). The transfer function of the block is given by the equation 8.18.

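Figure 8.22: A feedback loop. (The input I(s) enters a summing junction marked + and −, the block G(s) produces the output O(s), and the block H(s) is in the feedback path.)

$\frac{O(s)}{I(s)} = \frac{G(s)}{1 + G(s)H(s)}$ (8.18)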

If a circuit is unstable it means that it will oscillate. All circuits have noise in them, due both to thermal noise, shot noise and 1/f noise and to the noise on the power lines caused by the switching of the transistors themselves. Some of this noise is bound to fall in the frequency region where the circuit has a 180° phase relationship, and that noise will cause an oscillation at the input which is fed back with the opposite phase and with a larger amplitude, and this will build up until the circuit is non functional.

The Routh stability criterion is a way of analyzing the stability of a feedback loop without having to solve for the poles and zeros. In order to use Routh’s criterion the transfer function needs to be written in the polynomial form of the equation 8.19 where $a_0 \neq 0$.

$\frac{O(s)}{I(s)} = \frac{b_0 + b_1 s + b_2 s^2 + \ldots}{a_0 + a_1 s + a_2 s^2 + \ldots}$ (8.19)

Routh’s stability criteria require that:

1. the coefficients $a_n \neq 0$, i.e. if $a_5 \neq 0$ then it is required that $a_4 \neq 0$.

2. all the $a_n$ must be of the same sign.

3. If the first two conditions are met, then a table is created as shown in the table 8.1 where the b’s, c’s etc. are given by the equations 8.20, 8.21, 8.22 and 8.23.

s^n      a_n      a_{n-2}  a_{n-4}  ...
s^{n-1}  a_{n-1}  a_{n-3}  a_{n-5}  ...
s^{n-2}  b_{n-1}  b_{n-3}  b_{n-5}  ...
s^{n-3}  c_{n-1}  c_{n-3}  c_{n-5}  ...

Table 8.1: Table for Routh stability analysis.

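$b_{n-1} = \frac{a_{n-1} a_{n-2} - a_n a_{n-3}}{a_{n-1}}$ (8.20)

$b_{n-3} = \frac{a_{n-1} a_{n-4} - a_n a_{n-5}}{a_{n-1}}$ (8.21)

$c_{n-1} = \frac{b_{n-1} a_{n-3} - a_{n-1} b_{n-3}}{b_{n-1}}$ (8.22)

$c_{n-3} = \frac{b_{n-1} a_{n-5} - a_{n-1} b_{n-5}}{b_{n-1}}$ (8.23)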

In the table 8.1 the requirement is that there be no sign changes in the first column of coefficients. If all the entries in this column are non zero and have the same sign, then the circuit is stable. If there are sign reversals, the number of sign reversals equals the number of unstable poles.
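Building the table 8.1 by hand is error prone, so a small routine that applies the pattern of equations 8.20 to 8.23 row by row can be helpful. The sketch below is a minimal version under stated assumptions: it takes the denominator coefficients from a_n down to a_0, does not handle the special case of a zero appearing in the first column, and uses made-up function names.

```python
def routh_first_column(coeffs):
    """Return the first column of the Routh table for coeffs = [a_n, a_(n-1), ..., a_0]."""
    n = len(coeffs) - 1
    width = n // 2 + 1
    rows = [list(coeffs[0::2]), list(coeffs[1::2])]
    for row in rows:
        row.extend([0.0] * (width - len(row)))   # pad the rows to equal length
    for _ in range(n - 1):
        upper, lower = rows[-2], rows[-1]
        if lower[0] == 0:
            raise ValueError("zero in the first column: needs special handling")
        # Same pattern as equations 8.20 to 8.23, applied to the two rows above.
        new_row = [
            (lower[0] * upper[j + 1] - upper[0] * lower[j + 1]) / lower[0]
            for j in range(width - 1)
        ]
        new_row.append(0.0)
        rows.append(new_row)
    return [row[0] for row in rows]


def sign_changes(column):
    """Count the sign reversals between consecutive entries of the first column."""
    return sum(1 for x, y in zip(column, column[1:]) if x * y < 0)


stable = routh_first_column([1, 2, 3, 1])     # s^3 + 2 s^2 + 3 s + 1
unstable = routh_first_column([1, 1, 2, 8])   # s^3 + s^2 + 2 s + 8
print(stable, sign_changes(stable))
print(unstable, sign_changes(unstable))
```

The second polynomial factors as (s + 2)(s^2 − s + 4), which has two roots in the right half plane, and the first column shows the corresponding two sign reversals even though all of its coefficients are positive.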

8.16 Nyquist path

The closed loop transfer function is given by the equation 8.18. According to the Nyquist stability criterion, if the roots of the equation 8.24 lie in the left half of the s plane the system is stable.

1 + G(s)H(s) = 0 (8.24)


To get the Nyquist path you plot the value of G(s)H(s) in the complex plane with the real part on the x axis and the imaginary part on the y axis. The point −1 + j0 would be the origin if you were plotting 1 + G(s)H(s). The graph is symmetric about the x axis. The Nyquist stability criteria are as follows:

1. The Nyquist path cannot pass through poles or zeros of G(s)H(s).

2. If there are no poles or zeros on the jω axis, the requirement is that:

Z = N + P (8.25)

where
Z = number of zeros of 1 + G(s)H(s) in the right half of the s plane
N = number of times the locus circles the −1 + j0 point in the same direction as the locus
P = number of poles of G(s)H(s) in the right half of the s plane

3. If there are poles or zeros on the jω axis, then to meet the first requirement the contour is modified so that the locus does not pass through these points but goes around them at an infinitesimally small distance ε → 0.

For example the plot shown in the figure 8.23 is stable if the contour encircled one pole and no zeros in the right half of the s plane, because it encircles the −1 + j0 point once in the clockwise direction.

Figure 8.23: A sample Nyquist path. (Real part on the horizontal axis and imaginary part on the vertical axis, with the point −1 marked on the real axis.)

8.17 Sample and Hold circuit

Figure 8.24 shows the simplest sample and hold circuit. It contains only three components: an input switch to gate the input, a capacitor to ground to hold the measured voltage, and an operational amplifier configured as a voltage follower to isolate the capacitor from the loading due to the input of the circuit evaluating the voltage stored in the capacitor.

Figure 8.24: A generic sample and hold circuit.

8.18 Analog to digital conversion

In order to convert an analog signal into a digital signal you have to first decide two things, namely how often to sample and how many bits of accuracy each sample has to be converted to. For increased accuracy both need to be high together.

If you have a high sampling rate you can extract a higher maximum frequency from the digital data. In addition, for a given time duration, if you have a higher sampling rate you will obtain a larger total number of samples, which means that the frequencies obtained after a Fourier transform will be more closely spaced.

And finally, if you have a large number of closely spaced frequencies all the way up to a very high frequency, then you need a large number of bits of accuracy for each sample, because otherwise you will see a smearing between frequencies, i.e. a peak at a frequency will be spread onto the frequencies surrounding it.

Figure 8.25: A simple A/D converter.

The serial A/D is counter based and is the slowest implementation. Its conversion time is on the order of 2^n clock cycles for n bits. It just uses a DAC whose input comes from a counter. The DAC output is compared to the analog input voltage by a comparator, and the count at which the comparator changes sign is the digital value, as shown in figure 8.25.
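A behavioural sketch of the counter based converter is given here. The DAC is assumed to be ideal and linear, and the names ideal_dac and counter_adc are hypothetical helpers introduced only for this illustration.

```python
def ideal_dac(code, n_bits, v_ref):
    """Hypothetical ideal DAC: codes 0 .. 2**n_bits - 1 map linearly onto 0 .. v_ref."""
    return v_ref * code / (1 << n_bits)


def counter_adc(v_in, n_bits, v_ref=1.0):
    """Step the counter until the comparator flips.

    Returns (count, clock_cycles). The worst case needs 2**n_bits cycles.
    """
    cycles = 0
    for count in range(1 << n_bits):
        cycles += 1
        if ideal_dac(count, n_bits, v_ref) > v_in:
            # The comparator changes sign at this count.
            return count, cycles
    # Input at or above full scale: the counter ran to the end.
    return (1 << n_bits) - 1, cycles


code, cycles = counter_adc(0.30, n_bits=4)
```

With a 4 bit converter and a 1 V reference, an input of 0.30 V flips the comparator at count 5 (0.3125 V) after 6 clock cycles.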

The successive approximation A/D is faster. In this approach each bit is tested sequentially, so its conversion time is on the order of n clock cycles, as shown in figure 8.26. It is based on the fact that in a binary number each bit has a value equal to the sum of all the less significant bits + 1. So starting at the MSB and working down, each bit is turned on in turn. If the DAC output is higher than the analog input, then that bit should be 0, whereas if it is not higher then that bit should be 1. Once a bit is set you leave it in that state, and when the LSB has been obtained in this way, the final value at the DAC input is the digital value you need.
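The bit by bit test can be sketched as follows, again assuming an ideal linear DAC. The function name sar_adc is illustrative only.

```python
def sar_adc(v_in, n_bits, v_ref=1.0):
    """Successive approximation: one comparison per bit, MSB first."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)                   # turn the bit on
        dac_out = v_ref * trial / (1 << n_bits)     # assumed ideal DAC
        if dac_out <= v_in:                         # not higher: the bit stays 1
            code = trial
        # otherwise the bit stays 0
    return code


code = sar_adc(0.30, n_bits=4)
```

For 0.30 V and 4 bits the trial values are 8, 4, 6 and 5, of which only 4 (0.25 V) is kept, so the result is 0100 after just 4 comparisons.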

Figure 8.26: A successive approximation A/D converter.

The parallel or flash A/D is the fastest but it requires a prohibitively large area. It is based on a Kelvin voltage divider and it requires 2^n resistors and 2^n − 1 comparators, so it is only feasible for n = 2, 3 or 4. The block diagram is shown in figure 8.27. The highest comparator that yields a high indicates the level, so if the highest level reached is the 5th comparator then the bits 101 are output. The speed of the ADC is therefore just the speed of the comparator plus the combinational logic.

Figure 8.27: A parallel or flash ADC.
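A small model of the flash converter makes the encoding step concrete. It assumes equal resistors, so that tap k of the divider sits at k · v_ref / 2^n. The name flash_adc is hypothetical.

```python
def flash_adc(v_in, n_bits, v_ref=1.0):
    """Return (code, comparator_outputs) for an idealized flash ADC."""
    levels = 1 << n_bits
    # 2**n - 1 comparators, one per tap of the resistor divider.
    comparators = [v_in > k * v_ref / levels for k in range(1, levels)]
    # Combinational logic: the highest comparator that is high gives the code.
    code = 0
    for k, is_high in enumerate(comparators, start=1):
        if is_high:
            code = k
    return code, comparators


code, outputs = flash_adc(0.70, n_bits=3)
bits = format(code, "03b")
```

With 3 bits and a 1 V reference, 0.70 V lies above the 5th tap (0.625 V) and below the 6th (0.75 V), so the output bits are 101.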

Figure 8.28: A first order sigma-delta ADC.

The most often used is the Sigma-Delta analog to digital converter [80], [81], [82]. There are two types, one using switched capacitor circuits and the other which is called continuous time [83], [84]. The Sigma-Delta A/D converter is useful to digitize a low frequency signal at a high resolution. A simple first order Sigma-Delta A/D converter is shown in figure 8.28.

The cheapest implementation of a Sigma-Delta converter uses a 1-bit DAC, which is a circuit which has one input bit and outputs either a positive reference level or a negative reference level of equal magnitude. The quantizer is also 1-bit, so it is essentially a comparator which outputs either a positive pulse or zero, and it is clocked at φ. If the resolution is k bits then the analog input signal has to be sampled at φ = 2^k f_N where f_N is the Nyquist frequency of the analog signal.

From the balance condition when the integrator is at steady state, the time integral of the pulses output by the 1-bit DAC must be equal to the integral of the sample and holds x[n] during the same cycle of 1/f_N. This means that the output of the quantizer is a digitized equivalent of the analog input. But in order to obtain the k bits of resolution the bit stream at 2^k f_N has to be converted into a k bit wide stream at f_N, and this is done by the decimator. That k bit wide stream is the digital output.

The reason that Sigma-Delta A/D converters are popular is that they don’t require a k bit DAC, which could occupy a large area, but rather just use a 1-bit DAC operating at a high frequency. The other advantage with this approach is that linearity is not a problem and there is no matching requirement as in the case of the k bit DAC.
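The balance condition can be checked with a short discrete time simulation. This is a sketch under simplifying assumptions: the input sample is held constant for the whole period of 2^k modulator clocks, the integrator is ideal, and the decimator is reduced to simply counting the ones. The function names are illustrative.

```python
def sigma_delta_bits(x, k, v_ref=1.0):
    """Bit stream for one held sample x (with |x| < v_ref) over 2**k clocks."""
    integrator = 0.0
    bits = []
    for _ in range(1 << k):
        bit = 1 if integrator >= 0.0 else 0        # 1-bit quantizer (comparator)
        feedback = v_ref if bit else -v_ref        # 1-bit DAC: +v_ref or -v_ref
        integrator += x - feedback                 # integrate the difference
        bits.append(bit)
    return bits


def decimate(bits, v_ref=1.0):
    """Crude decimator: count the ones and rescale to a voltage."""
    ones = sum(bits)
    return v_ref * (2.0 * ones / len(bits) - 1.0)


stream = sigma_delta_bits(0.3, k=8)
estimate = decimate(stream)
```

Because the integrator stays bounded, the average of the DAC pulses can differ from the input by no more than about 2 · v_ref / 2^k, which is how the k bits of resolution arise in this simple model.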

8.19 Digital to analog conversion

The simplest DAC is based on the Kelvin voltage divider as shown in figure 8.29. It is similar to the parallel ADC in reverse. All that the bits do is to select which input is to be output. The linearity of the DAC is just the accuracy of the resistances. You can even make a special nonlinear DAC by varying the resistances. The disadvantage is that you need 2^n resistors.

The most common DAC is a binary DAC as shown in figure 8.30. In the R-2R resistive ladder everything to the right of the current insertion point is a resistance of R. The most significant bit (MSB) controls the leftmost switch and the LSB controls the rightmost switch. In fact the current is not actually turned on and off but rather switched into the R-2R network or switched into a dummy load by the use of a differential gate. This is done in order to keep the current drives stable and to avoid transient fluctuations.

Figure 8.29: The simplest DAC.

Figure 8.30: A binary DAC.

The equivalent circuit when each of the bits is on is shown in figure 8.31. An exercise for the reader is to obtain the equivalent circuits when more than one bit is turned on and to show that the voltages add linearly. There are many implementations which use matching FETs instead of resistors, but the issue is that the source and drain voltages will be different for the different FETs in the ladder, and so the on state resistance they exhibit will be different depending on what bits are on and what bits are off.

Figure 8.31: Equivalent circuits.

The Sigma-Delta modulator is also used to make DACs as shown in figure 8.32. In this case the input is a k bit wide digital input at a clock speed of f_N and the first logic block uses these k bits to output a bit stream at 2^k f_N.

Figure 8.32: A first order sigma-delta DAC.

This bit stream is input to a 1-bit DAC which in turn outputs pulses of a reference voltage. When these pulses are passed into the integrator, they are integrated and the output is the analog equivalent of the digital stream and has the value of p · V_dd / 2^k where p is the number of pulses, which is the value of the digital input.

As in the case of the Sigma-Delta A/D converter, the Sigma-Delta DAC became popular because it does not require all those resistors, it does not require the matching that a conventional DAC requires, and linearity is not a problem.

8.20 Low power circuits

The total power consumed by a CMOS circuit [85] is given by equation 8.26 where p_t is the probability of a transition, C_L is the average load capacitance, I_sc is the short circuit current which flows when both the nfet and the pfet are turned on at the same time, dt is the duration for which this happens, and I_leakage is the leakage current that flows when neither transistor is turned on.

P_{total} = f_{clk} ( p_t ( C_L \cdot V_{dd}^2 + I_{sc} \cdot V_{dd} \cdot dt ) ) + (1 - p_t) I_{leakage} \cdot V_{dd} (8.26)
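Equation 8.26 is easy to evaluate numerically. In the sketch below the function name total_power and all of the component values are made-up illustrations, not data for any real process.

```python
def total_power(f_clk, p_t, c_load, v_dd, i_sc, dt, i_leakage):
    """Evaluate equation 8.26; SI units throughout, result in watts."""
    switching = c_load * v_dd ** 2          # energy to charge the load
    short_circuit = i_sc * v_dd * dt        # energy lost while both FETs conduct
    dynamic = f_clk * p_t * (switching + short_circuit)
    static = (1.0 - p_t) * i_leakage * v_dd
    return dynamic + static


# Hypothetical gate: 100 MHz clock, 10% activity, 10 fF load, 1.2 V supply,
# 50 uA short circuit current for 20 ps, 1 nA leakage.
p_nominal = total_power(100e6, 0.1, 10e-15, 1.2, 50e-6, 20e-12, 1e-9)
# Same gate with the supply lowered to 1.0 V, leakage assumed unchanged.
p_low_vdd = total_power(100e6, 0.1, 10e-15, 1.0, 50e-6, 20e-12, 1e-9)
```

For these numbers the total is about 0.157 µW, dominated by the C_L · V_dd^2 term, which is why lowering V_dd is so effective.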


There are many techniques to lower this power [85].

• Lower the clock frequency in any portion of the chip where the speed is not required.

• Match the pull-up time to the pull-down time of the gates. This helps reduce the time dt and hence the I_sc term.

• Lower the V_dd by lowering the V_th of the FETs. Although this will increase the I_leakage term, the drop in V_dd^2 will more than offset it up to a point.

• Use dynamic logic instead of static CMOS logic wherever possible. There will be fewer transistors and the I_sc term goes away.

• Reduce spurious transitions, where a transition occurs within a clock cycle only to be reversed again within the same clock cycle due to different signals arriving at different times, i.e. race conditions.

• Define blocks of circuitry which can be powered down when not in use.

• If there is a choice between different implementations for subcircuits then pick the one which uses less power. For example some structures are more parallel and are designed for higher speed but may not be suitable for a low power application.

Bipolar transistors have been used to make micropower operational amplifiers such as [86]. In the subthreshold region the MOSFET also behaves like a bipolar transistor [87]: it has a high transconductance and the current is an exponential function of V_gs. This fact is used in some low power circuits.

8.21 Laser trimming and other techniques

Many manufacturers of precision analog chips have used laser trimmable components to adjust chips during the testing phase. Laser trimming is based on laser ablation, where the energy supplied by the laser causes the material to heat up and vaporize. Typically resistors are the trimmable components.

Laser ablation works better on materials with a high thermal resistance because it works better when all the energy that is supplied is used to heat a small local area. The trimmed resistor can be modeled by a matrix of resistances [88] where some of the resistors are the ones affected by the laser trimming process. It is also possible to make structures that can be trimmed electrically [89] so that a laser is not required.

Figure 8.33: A fusible connection, before and after blowing.

Blowing fuse like connections is another way that chips are modified after fabrication. A fuse is just a wide interconnect structure with a narrow neck region as shown in figure 8.33. There may be a small array of these fuses and a logic circuit that can select a fuse from the array based on a control word. There may be a dedicated pin on the chip through which the instructions to the logic circuit are clocked in serially. There is also a high current driver which supplies the current that is actually required to blow the fuse.

So based on the instructions clocked into the serial pin, the logic selects which fuse needs to be blown and turns on the driver to pass current through that fuse. When the current is passed, the narrow neck region overheats and melts and forms beads on either side. At this point the fuse is blown. During the normal operation of the chip this logic circuit is not active or is disabled, so that it is isolated from the normal functionality of the chip and is only used during testing and calibration. This type of fuse is especially popular with redundant circuits, where there is a backup component on chip to take the place of a design component that is not functional. By blowing the suitable fuses, the backup can be included in place of the original component.


Chapter 9

Microprocessors

9.1 Binary number system

9.1.1 Integers

Figure 9.1: Integers stored on a computer. (Bit weights run from 2^0 = 1 and 2^1 = 2 up to 2^31 = 2,147,483,648, the position used as the sign bit.)

Almost all computer processors nowadays use 32 bit integers and many provide a 64 bit integer for operations that require it. The way an integer is stored in a computer is shown in figure 9.1. If the integer is a signed integer then one of the bits is used as the sign bit. As a result the largest positive number is 2^30 + 2^29 + 2^28 + . . . + 2^2 + 2^1 + 2^0 = 2,147,483,647.

If the number is 64 bits long the largest signed number is 2^63 − 1 = 9,223,372,036,854,775,807. Integers are used in many applications, for example most graphics is done in integers. Similarly banks use integers to count the money, because the numbers need to be accounted for down to a cent. The sign bit is 0 for positive numbers and 1 for negative numbers.

Negative numbers are stored and used in a form called two’s complement. To get the two’s complement you first invert each bit and then add 1. So for example the two’s complement of 0010110101000101 is 1101001010111011.
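The invert and add 1 rule, and the largest signed values quoted above, can be checked with a few lines of Python. The helper name twos_complement is illustrative.

```python
def twos_complement(bits):
    """Invert every bit of a binary string, then add 1, keeping the same width."""
    width = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    value = (int(inverted, 2) + 1) % (1 << width)
    return format(value, "0{}b".format(width))


example = twos_complement("0010110101000101")

# Largest signed values: all bits except the sign bit set to 1.
max_int32 = sum(2 ** i for i in range(31))    # same as 2**31 - 1
max_int64 = 2 ** 63 - 1
```

Applying the function twice returns the original bit pattern, which is a quick way to convince yourself that negation is its own inverse.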

9.1.2 Floating point numbers

Floating point numbers represented by a 32 bit float can go as high as 10^38 and as low as 10^−38, while those represented by a 64 bit float can go as high as 10^308 and as low as 10^−308. The float is divided into two parts, the mantissa and the exponent. So the mantissa for the number 1.23495632 × 10^−36 is 1.23495632 and the exponent is −36. The mantissa of a 32 bit float carries about 7 significant decimal digits. The mantissa of a 64 bit float carries about 16 significant decimal digits.

This reveals a limitation of floating point numbers. For example, in a 32 bit float 1.23495632 × 10^10 − 1 = 1.23495632 × 10^10, since the 1 falls below the last significant digit that can be stored. The reason you want to use them of course is that the largest 64 bit float can hold a number a factor of 10^289 larger than a 64 bit integer, and on the other end integers by definition cannot hold fractions.
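This limitation is easy to reproduce. Python's own float is a 64 bit float, so the sketch below uses the standard struct module to round a value to the nearest 32 bit float. The helper name to_float32 is an assumption made for this example.

```python
import struct
import sys


def to_float32(x):
    """Round a Python float (64 bit) to the nearest 32 bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


big = to_float32(1.23495632e10)
after_subtract = to_float32(big - 1.0)   # the 1 is lost in a 32 bit float
lost_in_32_bit = (after_subtract == big)
kept_in_64_bit = (big - 1.0 != big)      # a 64 bit float still resolves it

# Ratio of the largest 64 bit float to the largest 64 bit signed integer.
ratio = sys.float_info.max / (2 ** 63 - 1)
```

Near 1.2 × 10^10 adjacent 32 bit floats are 1024 apart, so subtracting 1 rounds straight back to the starting value.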

9.2 µp block diagram

Figure 9.2: The components of a computer. (Blocks shown: hard drive, DMA, extended memory (DRAM), microprocessor cache (SRAM), microprocessor stack (SRAM) as main memory, ALU and interpreter.)

A microprocessor’s primary function is to run programs. The operating system is a program. All programs are processor dependent. They are a sequence of instructions interpreted by that processor. All the logic is embedded in those instructions.

When a program is run all the instructions in the program are part of the stack. In addition the program may store runtime data in the heap. The stack and the heap together are the operating space of the processor, i.e. all the instructions and data are stored in this space.

The typical program may be 10 megabytes long. The additional heap requirements may be an additional 10 megabytes for a total of 20 megabytes. The processor itself does not have this much memory in it. The stack space is a few kilobytes, which means that the entire 20 megabytes needs to be rotated through the stack.

The way that the program is run is that it is loaded from the hard drive onto the RAM external to the processor. Between the RAM and the stack is a memory space called the cache, of about a megabyte, located in the microprocessor. As the processor starts to execute the program the cache memory pre-fetches the next set of instructions from the RAM and thereby helps speed up the flow.

As the instructions and data flow through the cache, some instructions and some data are requested more often by the processor, and so the cache keeps those most often requested rather than flushing them after use. The reason for the cache is the speed of access. The stack is closest to the ALU, right on the processor core, and runs at processor speed.

The cache may be split into a portion on the processor core and another which may be a chip fabricated separately but just packed in the same package as the processor and connected via bond wires. The cache is similar to the stack and is static RAM, perhaps built using bipolar technology for speed.

The slowest is the off-chip RAM. It only operates at less than a gigahertz. On the other hand it is dynamic RAM built using CMOS technology and therefore even a large amount of such RAM is cost efficient.

9.3 Arithmetic logic unit

The ALU performs addition, subtraction, multiplication and division.

9.3.1 Addition and subtraction

The truth table of the one bit full-adder is Table 9.1. If your integers are 32 bit signed integers then you will need 32 one bit full-adders. The CO of each lower order bit is connected to the CI of the next higher order bit. The CI of the lowest order bit is 0. The CO of the highest order bit is thrown away.

So in this way there is no difference between positive and negative numbers, and to assure yourself of this try adding a 32 bit positive number to its two’s complement and you will get all 0s. So, when subtraction is done the number being subtracted is first converted to two’s complement and then added to the first number to get the result.

CI  A  B  Output  CO
0   0  0  0       0
0   1  0  1       0
0   0  1  1       0
0   1  1  0       1
1   0  0  1       0
1   1  0  0       1
1   0  1  0       1
1   1  1  1       1

Table 9.1: Truth table for a 1 bit adder with carry-in.

So the logic for the full-adder is:

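Output = (\overline{CI} \cdot A \cdot \overline{B}) + (\overline{CI} \cdot \overline{A} \cdot B) + (CI \cdot \overline{A} \cdot \overline{B}) + (CI \cdot A \cdot B) (9.1)

CO = (A \cdot B) + (CI \cdot B) + (CI \cdot A) (9.2)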
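The full-adder logic and the chaining of 32 of them can be tried out in Python. This is a behavioural sketch only. The names full_adder, ripple_add and negate are illustrative, and the Output expression is written as the sum of the four rows of table 9.1 whose Output is 1.

```python
def full_adder(ci, a, b):
    """One bit full-adder; every argument is 0 or 1. Returns (output, co)."""
    nci, na, nb = 1 - ci, 1 - a, 1 - b
    output = (nci & a & nb) | (nci & na & b) | (ci & na & nb) | (ci & a & b)
    co = (a & b) | (ci & b) | (ci & a)
    return output, co


def ripple_add(x, y, width=32):
    """Chain `width` full-adders; the CO of the highest bit is thrown away."""
    carry = 0                       # CI of the lowest order bit is 0
    result = 0
    for i in range(width):
        s, carry = full_adder(carry, (x >> i) & 1, (y >> i) & 1)
        result |= s << i
    return result


def negate(x, width=32):
    """Two's complement: invert each bit, then add 1."""
    mask = (1 << width) - 1
    return ripple_add(~x & mask, 1, width)


zero = ripple_add(12345, negate(12345))      # a number plus its two's complement
difference = ripple_add(100, negate(58))     # 100 - 58 done as an addition
```

Adding a number to its two's complement gives all 0s, and 100 − 58 comes out as 42 without any separate subtraction hardware.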

Carry lookahead

One disadvantage with the addition of 32 bit integers by the full-adder of table 9.1 is that you need to know what the CI is going to be. So if you have a 32 bit integer, then in order to evaluate the highest order bit, you would have to evaluate the lower 31 bits to know the CI for the highest order bit. If you assume that it takes a clock cycle for a full adder to evaluate, this would mean that a 32 bit integer addition would take 32 clock cycles to complete.

Hence you need the carry lookahead logic. It is a combinational logic which predicts the CI of a higher order bit without having to evaluate the output of the full adder at each of the lower order bits. In order to obtain the logic for the carry lookahead, you can simply substitute equation 9.2 in place of the CI of the next higher bit and then simplify the expression. As you can see the expression is going to use A · B and A + B at each bit.

CO_n = A_n B_n + CO_{n-1} (A_n + B_n) (9.3)

CO_n = A_n B_n + [A_{n-1} B_{n-1} + CO_{n-2} (A_{n-1} + B_{n-1})] (A_n + B_n) (9.4)

Since you are using combinational logic to predict the CI bits which the full adder would have calculated anyway, you are adding a substantial number of gates to improve the speed. So you have to make a tradeoff. Designers decide up front how many levels deep they can go with equation 9.4. Then they repeat the unit to make it sequential, that is you go back perhaps 4 bits and at that level you use the CI obtained from the previous 4 bits and so on. So you are alternating parallel and sequential to obtain as much speed as you can for the area and power consumption.
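The alternation of parallel and sequential logic can be sketched as follows. Within each 4 bit block every carry is computed directly from the inputs by fully expanding equation 9.3, and only the carry between blocks ripples. The block size of 4 and the function names are assumptions for illustration.

```python
def lookahead_carries(a_bits, b_bits, c_in):
    """Carry out of every bit of a block, each as a flat function of the inputs.

    Expanding equation 9.3 gives
    CO_n = g_n + p_n g_(n-1) + ... + p_n ... p_0 c_in
    with g = A . B and p = A + B.
    """
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a | b for a, b in zip(a_bits, b_bits)]
    carries = []
    for n in range(len(g)):
        co, prod = 0, 1
        for k in range(n, -1, -1):
            co |= prod & g[k]
            prod &= p[k]
        co |= prod & c_in
        carries.append(co)
    return carries


def cla_add(x, y, width=32, block=4):
    """Add using lookahead inside each block and a ripple between blocks.

    width must be a multiple of block.
    """
    result, carry = 0, 0
    for base in range(0, width, block):
        a = [(x >> (base + i)) & 1 for i in range(block)]
        b = [(y >> (base + i)) & 1 for i in range(block)]
        couts = lookahead_carries(a, b, carry)
        cins = [carry] + couts[:-1]
        for i in range(block):
            result |= (a[i] ^ b[i] ^ cins[i]) << (base + i)
        carry = couts[-1]            # handed on to the next block
    return result


total = cla_add(123456789, 987654321)
```

With 4 bit blocks a 32 bit addition has only 8 sequential steps instead of 32, at the cost of the extra gates in each block.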

9.3.2 Multiplication

In [90], [91] the method used to do multiplication is to create an array of modified adder cells as shown in figure 9.3. The lower order bits on the left are input at the top whereas the lower order bits on the right are input at the bottom.

Figure 9.3: The Guild multiplication array [91]. (A 3 × 3 array of cells numbered 1 to 9, with inputs a0 to a2 and b0 to b2 and the output bits at the bottom.)

Each bit of one number interacts with each bit of the other number, so there are 9 cells for the example in figure 9.3, and each bit is input to 3 cells. The output bits are output at the bottom. Temporary quantities z and u travel from right to left and from top to bottom respectively. Within each cell for the implementation of [91], the logic used is given by equations 9.5 and 9.6.

u_n = u_{n-1} \oplus ab \oplus z_{n-1} (9.5)

z_n = z_{n-1} ab + ab u_{n-1} + z_{n-1} u_{n-1} (9.6)

For a sample multiplication of 7 × 6, tables 9.2 and 9.3 show the value of u and z at the end of each cycle. The output is z_9 u_9 u_8 u_7 u_6 u_5 = 101010. The references [92], [93] show methods of pipelining multipliers for positive and negative numbers so that even though it takes many clock cycles to compute a multiplication, a new set of operands can be input in each cycle and a new result is output at every cycle. The main method is to use latches to delay different sections of the flow so that you don’t have to keep the operands a and b constant for the duration of the multiplication process. The general concepts to make such arrays are described in [94].

Cycle  u1  u2  u3  u4  u5  u6  u7  u8  u9
1      1   1   0   1   0   0   1   1   1
2      1   1   1   1   0   1   1   0   1
3      1   1   1   1   0   1   0   0   0
4      1   1   1   1   0   1   0   1   0
5      1   1   1   1   0   1   0   1   0

Table 9.2: u values for each cell for 7 × 6.

Cycle  z1  z2  z3  z4  z5  z6  z7  z8  z9
1      0   0   0   0   0   0   0   0   0
2      0   0   0   0   0   0   0   1   0
3      0   0   0   0   0   0   1   1   1
4      0   0   0   0   0   0   1   1   1
5      0   0   0   0   0   0   1   1   1

Table 9.3: z values for each cell for 7 × 6.
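The cell logic of equations 9.5 and 9.6 is a full adder whose inputs are the incoming u, the incoming z and the product a · b. The sketch below wires 9 such cells into a simple row by row array. Note that this ordering of the cells is a simplification chosen for clarity and is not claimed to reproduce the exact data flow or cycle by cycle values of the array in [91].

```python
def cell(a, b, u_in, z_in):
    """One array cell following equations 9.5 and 9.6. Returns (u_out, z_out)."""
    ab = a & b
    u_out = u_in ^ ab ^ z_in
    z_out = (z_in & ab) | (ab & u_in) | (z_in & u_in)
    return u_out, z_out


def array_multiply(x, y, width=3):
    """Multiply two `width` bit numbers with width * width cells."""
    acc = [0] * (2 * width)           # acc[j] is the running sum bit of weight 2**j
    for i in range(width):            # one row of cells per bit of y
        b = (y >> i) & 1
        z = 0
        for j in range(width):        # one cell per bit of x
            a = (x >> j) & 1
            acc[i + j], z = cell(a, b, acc[i + j], z)
        acc[i + width] = z            # carry out of the row
    return sum(bit << k for k, bit in enumerate(acc))


product_bits = format(array_multiply(7, 6), "06b")
```

For 7 × 6 this gives 101010, the same output bits as the worked example.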

9.3.3 Division

Division is done by using multiplication and addition. The most popular method of division is the SRT algorithm [95], [96], named after Sweeney, Robertson and Tocher who each implemented the algorithm separately. The algorithm is defined by the following equation.

A = Q · B + R

The algorithm is used to divide A by B, so you iteratively guess the value of Q until the remainder R is less than B. At that point Q is the value of A divided by B. Of course to get the complete answer, the remainder R has to be multiplied by powers of 10 and divided by B exactly as we do when we do division manually on a piece of paper.
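The relation A = Q · B + R can be illustrated with plain restoring division, which builds Q one bit at a time. To be clear, this is not the SRT algorithm itself, which chooses quotient digits in a more elaborate way. It is a simpler stand-in that satisfies the same equation. The function names are illustrative.

```python
def restoring_divide(a, b, width=32):
    """Return (q, r) with a == q * b + r and 0 <= r < b, for 0 <= a < 2**width."""
    if b == 0:
        raise ZeroDivisionError("division by zero")
    q, r = 0, 0
    for i in range(width - 1, -1, -1):
        r = (r << 1) | ((a >> i) & 1)    # bring down the next bit of a
        if r >= b:                       # guess a 1 for this quotient bit
            r -= b
            q |= 1 << i
    return q, r


def fraction_digits(r, b, count):
    """Decimal digits after the point, as in long division on paper."""
    digits = []
    for _ in range(count):
        r *= 10
        digits.append(r // b)
        r %= b
    return digits


q, r = restoring_divide(100, 7)
digits = fraction_digits(r, 7, 3)
```

Dividing 100 by 7 gives Q = 14 and R = 2, and carrying the remainder on gives the digits 2, 8, 5, i.e. 14.285.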

9.4 Shift register

The basic storage in the stack of a microprocessor is a shift register. You can make a shift register using the D flip-flops of figure 7.10 as shown in figure 9.4. However assume that it also has a Clr control which can clear the D flip-flop and set Q to 0.

A simple serial in, parallel (or serial) out shift register is shown in figure 9.4. If you apply a Clk signal, whatever is at the Input is clocked into the first D flip-flop, the first flip-flop’s Q is clocked into the second D flip-flop and so on. In the absence of the clock the bits are available in parallel from the outputs O3, O2, O1, O0. You can also clear the entire register by the use of the asynchronous Clr input.
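A behavioural model of the simple shift register takes only a few lines. The class name and the choice to list the flip-flops in order starting from the one fed by the Input are assumptions of this sketch. It does not try to say which flip-flop drives which of the outputs O3 to O0.

```python
class ShiftRegister:
    """Serial in, parallel out shift register built from D flip-flops."""

    def __init__(self, width=4):
        # q[0] is the first flip-flop, the one fed by the Input.
        self.q = [0] * width

    def clear(self):
        """Asynchronous Clr: every Q goes to 0."""
        self.q = [0] * len(self.q)

    def clock(self, serial_in):
        """One Clk edge: every flip-flop takes the Q of the one before it."""
        self.q = [serial_in] + self.q[:-1]

    def parallel_out(self):
        return list(self.q)


reg = ShiftRegister()
for bit in (1, 0, 1, 1):
    reg.clock(bit)
state = reg.parallel_out()
```

After four clocks the first bit that was shifted in sits in the last flip-flop, so the state is [1, 1, 0, 1].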

The most popular shift register you can buy in discrete logic is the "universal" shift registerwhich can shift left, shift right, accept parallel or serial input and output parallel or serial output.

Figure 9.4: A simple shift register.

Figure 9.5: A "universal" shift register.

The way it does this is to modify the input to each D flip-flop as shown in figure 9.5. The signals that decide what operation is performed are S0 and S1.

If S0 = 1 and S1 = 0, it performs the right shift, which means that whatever is at the serial D input is clocked in. If S0 = 0 and S1 = 1, it performs a left shift so Q1 moves to Q0 and Q2 moves to Q1 and so on. If both S0 and S1 are 0, then it performs the parallel in operation, i.e. the parallel D inputs are clocked in to the corresponding D flip-flops. As before an asynchronous Clr will clear the entire register.

9.5 Instructions and operations

Add        01010010  00100110  →  01111000
Subtract   01010010  00100110  →  00101100
Increment  01010010            →  01010011
Decrement  01010010            →  01010001

Figure 9.6: Math instructions.

When a program is compiled an executable is created which contains the operations to be performed in a language understood by the hardware called the instruction set. The instructions used by a microprocessor contain opcodes, register numbers or addresses and sometimes the data to use. So an instruction may be as much as 8 or 10 bytes long or even more. If they use only one operand they are unary operators whereas if they use two operands they are binary operators.

And          01010010  00100110  →  00000010
Or           01010010  00100110  →  01110110
XOR          01010010  00100110  →  01110100
Invert       01010010            →  10101101
Left shift   01010010            →  10100100
Right shift  01010010            →  00101001

Figure 9.7: Binary instructions.

There are many basic instructions that are usually available in any microprocessor. Some like move, push and pop are used to load the different registers. Input and output are used to talk to devices and ports. The basic math instructions are add, subtract, increment, decrement and compare as shown in figure 9.6. Then there are the basic binary operations of and, or, invert, exclusive or, left shift and right shift as shown in figure 9.7. Then there are the two main programming instructions, the different jumps and the different loops.
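The effect of the math and binary instructions on 8 bit values can be reproduced with ordinary integer operators. The operands below are the ones used in figures 9.6 and 9.7. Masking with 0xFF is an assumption that models an 8 bit register.

```python
MASK = 0xFF                         # keep results to 8 bits
a, b = 0b01010010, 0b00100110       # 82 and 38

results = {
    "add":         (a + b) & MASK,
    "subtract":    (a - b) & MASK,
    "increment":   (a + 1) & MASK,
    "decrement":   (a - 1) & MASK,
    "and":         a & b,
    "or":          a | b,
    "xor":         a ^ b,
    "invert":      ~a & MASK,
    "left shift":  (a << 1) & MASK,
    "right shift": a >> 1,
}

for name, value in results.items():
    print("{:12s} {:08b}".format(name, value))
```

Each result is printed as an 8 bit pattern so it can be compared directly with the bit patterns in the figures.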

9.6 CISC and RISC

CISC stands for Complex instruction set computing and RISC stands for Reduced instruction set computing. The difference is shown in table 9.4. Most personal computers are CISC whereas workstations often use RISC chips.

CISC                                  | RISC
Instructions are maybe 16 bytes long  | Instructions are maybe 4 bytes long
Maybe 200 instructions                | Maybe 50 instructions
Few registers                         | Many registers
Large interpreter                     | Small interpreter
Few hard-coded instructions           | Many hard-coded instructions

Table 9.4: CISC - RISC comparison.

The reason that these differences are important is that as a result of this, RISC processors are much smaller than CISC processors, and in a smaller chip the routes are shorter and the chip is usually capable of running faster. Smaller chips tend to yield better, meaning that a larger fraction of the chips manufactured function correctly. One reason is that the defect density is a fixed number per unit area, so bigger chips are more likely to have defects. Often defects are fatal in that the whole chip simply won’t function correctly if even a single transistor malfunctions.

A large fraction (as much as 30%) of the processor is the interpreter that takes an instruction and decides what paths are turned on, what data is transferred from the stack to what internal buffers, what paths are turned on in the ALU and what computation is done. By using a RISC processor you may be able to reduce the chip area by 15%.

Figure 9.8: The hard wired RISC style.

Figure 9.9: The comparable CISC style.

Another difference is in the hard coding of certain instructions. Basically in a RISC processor the implementation of certain instructions is done in hardware as shown in figures 9.8 and 9.9. In figure 9.8 you can see that you need to place the operands in specific registers if you wish to do an add operation, and the output register used for the add operation is always the same one. The same is true of the XOR operation, the multiply operation and so on. This is the RISC style and you will use as many registers as you need. The CISC style is shown in figure 9.9, and here any of the registers may be used by the multiplier or the adder or the XOR, so at least from the human point of view this is more obvious.

Assume that after doing an addition you are going to use the output of the addition as one of the operands for the multiply. In the RISC case you would have to do a move operation from the output register used by the adder to one of the input registers used by the multiplier. In the CISC case you can simply use the register that the output of the addition was placed in as one of the input registers for the multiply operation, so the move operation is not required. There are many other such examples. This is one reason why the programs used by RISC processors have to be longer to do the same thing as a comparable program for the CISC processor.

In reality RISC processor makers have increased the number of instructions so much that they start to look like CISC processors, and in the meantime CISC processor makers have started to use RISC concepts to keep the number of instructions to a minimum.

9.7 The critical path

Critical path design is a concept used by logic designers to define chip requirements and highlight the most timing critical components so that they can concentrate on them. In figure 9.10 the critical path is the path going from Input through Logic3, Logic2, Logic4 and the nand gate to the Output. This path is (2 + 2 + 6) = 10ns. All the other paths from the input to the output are shorter. So basically the longest path is the critical path. For example, if you design Logic5 to be a lot slower so that it takes 8ns instead of 4ns, it still would not affect this circuit because it still would not be in the critical path: the path from the input through Logic1, Logic5 and then to the output has a total time of (1 + 8) = 9ns, which is less than the critical path which is 10ns long.

Figure 9.10: The critical path analysis. (Block delays: Logic1 1ns, Logic5 4ns, Logic3 2ns, Logic2 2ns, Logic4 6ns.)

However you can have more than one critical path if you have two paths of equal length which is the maximum length. In doing critical path analysis you have to use the worst case time between the input and output along a given path and not the average time. So if a logic block usually takes 11ns but occasionally requires 12ns then the number you use is 12ns for that block from the given input to the given output. Even within a logic block you may have different times, as shown in figure 9.11. Here D → Q′ is the longest at 1.1ns.
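Finding the critical path amounts to finding the longest path through the logic. The sketch below uses the worst case block delays quoted in the text. The connectivity (Input feeding Logic1 and Logic3, and so on) is an assumption reconstructed from the two paths the text describes, and the nand gate is given no delay, as in the text's sum.

```python
from functools import lru_cache

# Worst case delay of each block in ns.
DELAY_NS = {"Logic1": 1, "Logic2": 2, "Logic3": 2, "Logic4": 6, "Logic5": 4}

# Assumed connectivity: which blocks each node drives.
FANOUT = {
    "Input": ["Logic1", "Logic3"],
    "Logic1": ["Logic5"],
    "Logic3": ["Logic2"],
    "Logic2": ["Logic4"],
    "Logic4": ["Output"],
    "Logic5": ["Output"],
    "Output": [],
}


def critical_path(delays, fanout, start="Input", end="Output"):
    """Return (total_delay, path) for the slowest route from start to end."""

    @lru_cache(maxsize=None)
    def longest(node):
        own = delays.get(node, 0)
        if node == end:
            return own, (node,)
        best_time, best_path = max(longest(nxt) for nxt in fanout[node])
        return own + best_time, (node,) + best_path

    return longest(start)


total, path = critical_path(DELAY_NS, FANOUT)
slow5_total, slow5_path = critical_path(dict(DELAY_NS, Logic5=8), FANOUT)
```

With the nominal delays the answer is 10ns through Logic3, Logic2 and Logic4, and slowing Logic5 to 8ns leaves both the total and the path unchanged.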

Figure 9.11: Different delays for different transitions. (D → Q = 1ns, D → Q′ = 1.1ns, C → Q = 600ps, C → Q′ = 700ps.)

9.8 Pipelining

Pipelining is used in almost all timing critical logic circuits nowadays. The basic purpose of pipelining is to increase the usage of critical components. In figure 9.12 you see a critical but slow logic block. After you apply the inputs it takes 3 clock cycles to compute the output. Until it finishes computing the output you cannot apply the next set of inputs.

Figure 9.12: A critical but slow logic block.

So the way you use it is shown in figure 9.13. Here I#1 is when the first set of inputs is applied, I#2 is when the second set of inputs is applied and so on. So here you can apply the inputs every third clock cycle. Assume it is in the critical path. Because it is slow you want to speed it up. Perhaps you would like it to run thrice as fast.

Figure 9.13: The usage of the version in figure 9.12.

You can achieve this as shown in figure 9.14. In order to speed it up the first step is to break the logic block into 3 pieces each of which takes about a clock cycle to execute. After breaking it up you can make adjustments so that each block takes less than a clock cycle to execute. Now you place latches (L in the figure) after each of these 3 pieces. For our discussion let us assume that the latches are so fast that we don’t have to count how much time they take.

Figure 9.14: The block of figure 9.12 after pipelining.

The way this new circuit is used is shown in figure 9.15. Because each piece of the pipelined circuit only takes one clock cycle to execute (including the time taken by the latch), the output of that piece is ready by the end of each clock, and that piece is ready to accept another set of inputs. So now you can apply the inputs every clock cycle instead of every third clock cycle as for the original version of the circuit. This method of breaking the circuit up and using latches to hold the intermediate results is called pipelining.
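The gain in throughput can be put into numbers with a small calculation. The function name total_cycles is illustrative, and the latches are assumed to add no delay, as in the discussion above.

```python
def total_cycles(n_inputs, latency=3, pipelined=False):
    """Clock cycles needed until the last of n_inputs results is available."""
    if n_inputs == 0:
        return 0
    if pipelined:
        # First result after `latency` cycles, then one more result every cycle.
        return latency + (n_inputs - 1)
    # Each set of inputs must wait for the previous one to finish.
    return latency * n_inputs


unpipelined = total_cycles(7)
pipelined = total_cycles(7, pipelined=True)
```

For 7 sets of inputs the original block needs 21 clock cycles, whereas the pipelined version needs 9. Note that the latency of a single result is still 3 cycles, only the rate at which inputs are accepted has tripled.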

Figure 9.15: The usage of the version in figure 9.14.

9.9 Intentional clock skewing

In the pipelining example, what if you cannot easily break the circuit into pieces that are one clock cycle long? Since the latches are used to hold intermediate values you need as many latches as there are intermediate values. So you want to break the circuit up in exactly the right places so that you can latch something that is clearly an intermediate result, and also to minimize the number of intermediate results and hence latches used. But this may not be easy, and let us say you get 3 pieces of 1 clock cycle, 1.2 clock cycles and 0.8 of a clock cycle respectively.

Figure 9.16: A pipelined circuit with a skewed clock.

Now obviously the second piece is much more than a clock cycle. And of course you don’twant to use two clock cycles for that piece because then it starts to get messy. Is all lost then ? No,there is another way out. The answer is that many designers intentionally skew the clocks used ineach section to adjust for such a problem. This is shown in the figure 9.16.

In this figure the delay1 may be just a couple of inverters in series. Whatever is used is designed to delay the clock by a fifth of a clock cycle. So there are two clocks in this circuit, C and C2, and the operation of this circuit is shown in the figure 9.17. The input of the second piece of the figure 9.16 is clocked in by the clock C, however its output is clocked in to the third piece by the delayed clock C2.

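Figure 9.17: The second piece gets 1.2 clock cycles. [Timing diagram: the clocks C and C2, with an interval of 1.2 clock cycles marked from a rising edge of C to the following rising edge of C2.]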

Since the duration from the rise of the clock C to the following rise of the delayed clock C2 is 1.2 clock cycles, the second piece actually gets 1.2 clock cycles to perform its calculation. Keep in mind you have to make sure that the output of the second piece does not vanish within the delay1 time from the start of C because otherwise you will get the wrong result at the input of the third piece when it is clocked in by C2. It is important to note that the clock C2 is not visible to any other circuit i.e. it is strictly an internal clock and is used exclusively by the second piece of this circuit.
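The time available to each piece can be checked with a little arithmetic. The sketch below is only an illustration: the function name `stage_budgets` and the convention of describing each latch clock by its delay as a fraction of a clock cycle are assumptions made here, not something taken from the figures.

```python
from fractions import Fraction

def stage_budgets(latch_offsets, input_offset=Fraction(0)):
    """Return the time (in clock cycles) available to each pipeline piece.

    latch_offsets[i] is the delay, as a fraction of a cycle, of the clock
    that drives the latch after piece i. A piece whose input is launched at
    offset a and whose output is captured one cycle later at offset b gets
    1 + b - a cycles.
    """
    budgets = []
    launch = input_offset
    for capture in latch_offsets:
        budgets.append(1 + capture - launch)
        launch = capture
    return budgets

# Latch 1 uses C, latch 2 uses C2 (C delayed by a fifth of a cycle), latch 3 uses C.
budgets = stage_budgets([Fraction(0), Fraction(1, 5), Fraction(0)])
print([float(b) for b in budgets])
```

Under these assumptions the extra fifth of a cycle given to the second piece is taken away from the third piece, which matches the split of 1, 1.2 and 0.8 clock cycles used in the example.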

9.10 Clock trees

The clock is global to the chip and is used to time everything in the chip. It is also used to synchronize everything in the chip. But what if the clock itself is wrong? Maybe the clocks in different parts of the chip have the correct frequency but the wrong phase. This can happen quite simply by the effect of the capacitive loading of the clock signal by the interconnect line capacitance and the gate input capacitance. So it is very possible that three clocks in the chip may have a clock relationship as shown in the figure 9.18.

In this figure let us call the region using the clock C1 region C1, the region using the clock C2 region C2 and so on. What do you think will happen when the output of a logic circuit in region C1 is used as an input in region C2 or region C3? As you guessed, it would be utter chaos because they would be out of sync with each other.

Figure 9.18: Three clocks that are skewed.

Figure 9.19: Clock source in the center.

The chip industry uses a standard approach [97], [98] to avoid clock skew. Basically you start with a master clock. Then you divide the distribution into several levels so that at the lowest level a chip the size of a microprocessor contains perhaps 40 clocks as shown in the figure 9.20.

Having done this, the level 2 is synchronized with the level 1 clock, the level 3 is synchronized with the level 2 and so on. In order to do this there are many phase detectors. Once a phase detector determines whether a clock is ahead of or behind the higher level clock, the deskewing is performed iteratively.

The deskewing of the clocks is done using digital bits that control programmable delays. In each iteration the bits are incremented or decremented based on the phase detector’s result and the comparison is repeated. This is done continually because the clock skews can have so many causes.
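The iteration described above can be sketched as a simple loop. This is a minimal behavioral model, not the circuit of any particular chip: the function name `deskew`, the 6-bit control word, the step size and the phase values in picoseconds are all assumed for illustration.

```python
def deskew(reference_phase, raw_phase, step, n_bits=6, iterations=100):
    """Iteratively adjust a programmable delay until the local clock lines up.

    All phases are in picoseconds. The added delay is code * step, where code
    is an n_bits-wide control word. The phase detector only reports whether
    the local clock is ahead of or behind the higher level clock.
    """
    code = 0
    max_code = 2 ** n_bits - 1
    for _ in range(iterations):
        local_phase = raw_phase + code * step
        if local_phase < reference_phase:      # local clock is ahead: add delay
            code = min(code + 1, max_code)
        elif local_phase > reference_phase:    # local clock is behind: remove delay
            code = max(code - 1, 0)
        else:
            break                              # edges coincide
    return code

code = deskew(reference_phase=120, raw_phase=35, step=5)
print(code, 35 + code * 5)
```

When the required delay is not an exact multiple of the step, a loop of this kind keeps toggling between the two nearest codes, which is one reason the step of a programmable delay has to be small compared with the skew you can tolerate.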

All this flexibility has the additional advantage that the designers can hard code intentional clock skewing as described in the figure 9.16 to get optimum performance out of the existing logic.

The clocks that have a particularly high frequency may be doubly controlled wherein both the rising edge and the falling edge of the clock are controlled. So here you would use two phase detectors to generate two sets of control bits, one for the rising and the other for the falling edge. Then you have two programmable delays where the first deskews the rising edge and the second controls the width of the pulse.

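Figure 9.20: A standard clock tree. [Diagram: the level 1 master clock 1 feeds the level 2 clocks 1.1 to 1.4, and each level 2 clock feeds level 3 clocks such as 1.2.1 to 1.2.4.]

Figure 9.21: Controlling the skew. [Block diagram: a phase detector drives logic that generates n control bits for a programmable delay, whose output is the clock out.]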

Chapter 10

Phase-Locked Loops

Phase-locked loops originally became popular in making radio receivers. One of the problems that all radio receivers have is that the signal that they are receiving shifts slowly back and forth in frequency because it is modulated by the medium that it passes through when going from the transmitter to the receiver.

A good reference for phase-locked loops is [99]. PLLs are based on a feedback loop so it is a good idea to also read a more general control systems book such as [100].

10.1 Ring oscillator

A PLL is a very complicated circuit to understand, so let us start slowly and discuss some basics before we get to the PLL circuit itself. The circuit in figure 10.1 shows four cascaded inverters with the output of the fourth inverter fed back into the input of the first inverter. So we know that A = E, because they are shorted together. Let us suppose that A is logic 0, then B is logic 1, C is 0, D is 1 and E is 0 which is the same as A. Once such a state is reached it will not change as long as the power is left turned on, because it is a stable state. Similarly let us suppose that A is logic 1, then B is logic 0, C is 1, D is 0 and E is 1 which is the same as A. Once this state is reached it too will not change as long as the power is left turned on, because it is also a stable state.

Figure 10.1: Four inverters looped back. [Circuit: four inverters in series with nodes A, B, C, D and E, where E is connected back to A.]

Now let us modify the circuit by adding an inverter as shown in figure 10.2. The behavior of this circuit is very different. Let us suppose that A is logic 0, then B is logic 1, C is 0, D is 1, E is 0 and F is 1. But we have a problem now because F is shorted to A and we started with A as 0, and A cannot be both 0 and 1 simultaneously.

As long as we treat the inverters as pure logic elements, the circuit in figure 10.2 appears to be impossible, because A must be equal to F and that is not the answer we are getting when we analyze the circuit. But let us make the situation more realistic by adding a propagation delay to each inverter. The propagation delay is the time from when the input crosses a logic threshold until the output crosses the same logic threshold.

Figure 10.2: Five inverters looped back. [Circuit: five inverters in series with nodes A, B, C, D, E and F, where F is connected back to A.]

Now at time 0s let A be 0. Let the propagation delay of the inverter be 10s. So then at time 10s B becomes 1, at time 20s C becomes 0, at 30s D becomes 1, at 40s E becomes 0 and at 50s F becomes 1. So therefore, at 50s A becomes the same as F i.e., A becomes 1. But notice that now we don’t have a problem because if A was 0 at time 0s that still allows it to become 1 at time 50s. After that A becomes 0 again at 100s, and then 1 again at 150s and so on forever as long as the power is left turned on. In reality the logic level at the point A will be as shown in figure 10.3. From 40s to 50s the voltage at A slowly rises from logic 0 to logic 1, and then from 90s to 100s it falls from logic 1 to logic 0, then again from 140s to 150s it rises from logic 0 to logic 1 and so on.

Figure 10.3: Logic at A of figure 10.2. [Plot: the logic level at A between 0 and 1 against time, with transitions marked at 40 to 50, 90 to 100 and 150.]
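The sequence of events at node A can be reproduced with a very small model. This is a simplified sketch that treats every edge as instantaneous and ignores the slow rise and fall shown in the figure; the function name `simulate_ring` is an assumption made for this illustration.

```python
def simulate_ring(n_inverters, t_pd, t_end):
    """Return (time, new value) pairs for node A of an ideal inverter ring.

    Node A starts at logic 0 at time 0. A change at A needs n_inverters
    propagation delays to travel round the ring and arrive back at A.
    """
    if n_inverters % 2 == 0:
        return []          # even ring: the loop is consistent, nothing changes
    loop_delay = n_inverters * t_pd
    events = []
    t, a = loop_delay, 1   # after one trip round the ring A is inverted
    while t <= t_end:
        events.append((t, a))
        t += loop_delay
        a = 1 - a
    return events

print(simulate_ring(5, 10, 150))   # [(50, 1), (100, 0), (150, 1)]
print(simulate_ring(4, 10, 150))   # []
```

With five inverters and a delay of 10 per inverter, A changes at 50, 100 and 150 as in the discussion above, while the four inverter ring of figure 10.1 never changes.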

The circuit in figure 10.2 belongs to a family of such circuits called ring oscillators. The ring part is because the output of the last inverter is fed back to the input of the first inverter. The oscillator part is because the voltage keeps going up and down i.e., it oscillates and so it is an oscillator. The members of this family all have an odd number of inverters in the ring. We already discussed what happens when you have an even number of inverters in the ring i.e., it does not oscillate. The period of oscillation is equal to twice the number of inverters multiplied by the propagation delay of each inverter. So, if you want a ring oscillator which oscillates very rapidly, you need to reduce the number of inverters in the ring. Believe it or not, a working oscillator has been built with just a single inverter, however the inverter was not built using FETs but rather with a different type of transistor called a bipolar junction transistor or BJT. In order for a ring oscillator to oscillate properly with very few inverters in the ring, each inverter must have an amplification much higher than unity.

So based on what we know, if we build an oscillator with some odd number of elements, say 7, and if the propagation delay of each inverter is 100 picoseconds, then the period of the oscillator is 1.4 nanoseconds and its frequency is the inverse of that i.e. 714 MHz. But what if we really want 700 MHz? Or perhaps 702 MHz? Obviously we want a way to modify the propagation delay so that we can generate different frequencies. Such a circuit is called a voltage controlled oscillator or VCO. In a VCO, the inverter elements have additional control inputs which are used to vary the propagation delay. Keep in mind that there are many types of oscillators which are not based on inverters, but for now we will only talk of ring oscillators because they are the easiest to understand.
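The relation between the number of inverters, the propagation delay and the frequency is easy to put into a few lines of code. The function names below are assumptions made for this sketch; the formula is the one given above, period = 2 × number of inverters × propagation delay.

```python
def ring_period(n_inverters, t_pd):
    """Period of an ideal ring oscillator: 2 * N * t_pd (N must be odd)."""
    if n_inverters % 2 == 0:
        raise ValueError("a ring with an even number of inverters does not oscillate")
    return 2 * n_inverters * t_pd

def delay_for_frequency(n_inverters, f_target):
    """Propagation delay per inverter needed to reach the frequency f_target."""
    return 1.0 / (2 * n_inverters * f_target)

period = ring_period(7, 100e-12)
print(period, 1 / period / 1e6)                # about 1.4 ns and about 714 MHz
print(delay_for_frequency(7, 700e6) * 1e12)    # delay in ps needed for 700 MHz
```

So to move the 7 inverter ring from about 714 MHz down to 700 MHz the delay of each inverter has to grow from 100 ps to roughly 102 ps, which is the kind of small adjustment the control input of a VCO provides.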

10.2 Subsystems

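Figure 10.4: A schematic of a phase-locked loop or PLL. [Block diagram: the reference signal (Ir) and the divider output (Id) enter the phase comparator, whose output passes through the low-pass filter to the control voltage input CV of the VCO; the VCO output is the PLL output and also feeds the divider.]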

As you can see in figure 10.4 there are only four components in the PLL, namely the VCO, the divider, the phase comparator and the loop filter.

10.2.1 Voltage controlled oscillator

An inverter whose propagation delay can be controlled is shown in figure 10.5. This is called a current starved inverter. Another type is the loaded inverter where you vary the load capacitance attached to the output of the inverter.

Figure 10.5: A circuit for an inverter with a variable delay. [Circuit: an inverter with input I and output O, with series transistors controlled by Cp and Cn.]

In figure 10.5, the two controls are Cn and Cp. These two are a pair i.e. for each value of Cn there is a unique value of Cp. So the way you generate Cn and Cp is to make a circuit that takes a single input and generates both Cn and Cp. The requirement is that for any pair of Cn and Cp, the maximum current that can be sourced by the upper half of the circuit must equal the maximum current that the lower half of the circuit can sink. Otherwise the circuit will not be symmetric and the logic levels will be unusable.

This works by slowing down the inverter by a controlled amount. Adding the series transistors above and below is a way to reduce the current flow, and because the current flow is needed to charge the gate capacitances of the next inverter in the ring, reducing the current is a way to increase the propagation delay. So the highest frequency that can be obtained from the oscillator is obtained when Cn is the supply voltage and Cp is ground.

Another way of making a fast VCO is to use an Astable multi-vibrator. Multi-vibrators are the granddaddies of digital circuits and date back to when digital circuits were just being born. They are divided into Bi-stable and Astable families. The first of the two is stable in either of two states while the latter is stable in neither state and constantly switches states, giving rise to a square wave output.

Figure 10.6: A fast VCO based on an Astable multi-vibrator. [Circuit with supply Vdd, resistors R1 and R2, two capacitors C, transistors Q1 and Q2, nodes A, B and M, and output O.]

At startup both capacitors are discharged. Let us assume that at startup Q1 turns on; then the capacitor on the left charges up. By design R2 is much larger than R1. The voltage at the collector of Q1 drops substantially and so the voltage at A, which is the voltage at the base of Q2, is below one diode drop above ground. But as the capacitor on the left charges, the voltage at A rises above a diode drop and Q2 starts to conduct. This immediately forces the voltage at O down below a diode drop above ground. So Q1 turns off and the voltage at A rises because the voltage at M rises toward Vdd. So while the capacitor on the left discharges, and until the capacitor on the right charges up enough that the voltage at B rises above a diode drop above ground, the transistor Q2 remains on and the transistor Q1 remains off. Then Q1 turns on and Q2 turns off and so on indefinitely. The period of the output depends on the R1C product, so by adjusting the resistance R1 you can control the output frequency.

10.2.2 Divider

The divider is a purely digital circuit. Its function is to take a digital input square wave and increase the period n times. For example n may be 8. Usually for high performance applications one does not use a very large value for n. Of course people do assemble PLLs from individually packaged components on a printed circuit board using values of n as large as 2000 but these are not critical applications.

When you need a very high accuracy output waveform you need a small n of less than 20 to tightly couple the output to the reference signal. The signal output by the divider is compared to the reference signal by the phase comparator. If the PLL is locked and is working properly, then the signal output by the divider should be almost identical to the reference signal. Dividers are just counters. You can make a pretty decent counter with a D flip-flop and nand gates as shown in the figures 10.7 and 10.8.

Figure 10.7: A divider based on a D flip-flop. [Circuit: a D flip-flop with the clock input, the D input and the outputs Q and Q’; the divided output is taken from the flip-flop.]

Figure 10.8: The output from circuit in the figure 10.7.
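The behavior of a divider can be modeled as a counter of rising edges. The sketch below is a behavioral model on a sampled waveform rather than a gate level description of figure 10.7; the function name `divide_clock` and the restriction to even n are assumptions made to keep it short.

```python
def divide_clock(clock, n):
    """Divide a sampled square wave by n (n even) by counting rising edges.

    clock is a list of 0/1 samples. The output toggles every n/2 rising
    edges of the input, so its period is n times the input period.
    """
    if n % 2:
        raise ValueError("this simple sketch only handles even n")
    out, level, count, previous = [], 0, 0, 0
    for sample in clock:
        if previous == 0 and sample == 1:   # rising edge of the input
            count += 1
            if count == n // 2:
                level = 1 - level
                count = 0
        out.append(level)
        previous = sample
    return out

clock = [0, 1] * 8                  # eight input periods
print(divide_clock(clock, 2))       # output period is two input periods
```

With n = 2 the output toggles on every rising edge of the input, which is what a single flip-flop that toggles once per clock does.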

10.2.3 Phase comparator

The phase comparator could be either digital or analog in nature. The purpose of the phase comparator is to compare the reference signal to the divided signal and try to make them identical. Actually the loop is what tries to make them identical, but what the comparator does is to point out whether the divided signal is leading or lagging the reference signal and by how much.

The simplest phase comparator is based on the Set-Reset flip-flop as shown in the figure 10.9. It is just an SR flip-flop with nothing else. The reference signal is sent into the set input while the divided signal is sent into the reset input. They are both active on the falling edge. So the output of this phase detector is a pulse whose width is the length of time from the falling edge of the reference signal to the falling edge of the divided signal. The approximate balance point depends on the VCO’s control voltage function but basically you set it at about π phase difference between the reference and the divided signal as shown in the figure 10.10.

Figure 10.9: A detector using the SR flip-flop. [Circuit: the reference signal drives the S input and the divided signal drives the R input of the flip-flop; the output is taken from Q.]

Figure 10.10: The balance point for figure 10.9. [Timing diagram: the reference R, the divided signal D, the output O and the control voltage.]

The most popular phase detector is called the phase-frequency detector and a simple version is shown in the figure 10.11. It is called a phase-frequency detector because it locks both phase and frequency at the same time i.e. when both the reference signal and the divided signal have the same frequency and are in phase. Here the rising edge of the reference is used to set one flip-flop, the rising edge of the divided signal is used to set another flip-flop and a nand gate is used to reset them both if the outputs of both flip-flops are high. The upper output is used to control a pull-up NFET while the lower output is used to control a pull-down NFET.

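Figure 10.11: The phase-frequency detector. [Circuit: two D flip-flops with their D inputs tied to 1, one clocked by the reference R and the other by the divided signal D, with a common reset; their outputs switch a pull-up to Vdd and a pull-down that drive the loop filter.]

Figure 10.12: The case when the VCO is 33% too fast. [Timing diagram: the waveforms R, D, PU and PD over one difference period.]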

In the figure 10.12 you see the output from the phase-frequency detector for the case where the VCO is running 33% faster than it should be. In this figure you can see that because D is 33% faster than the reference signal R, 3 periods of R fit in the same time period that 4 periods of D do. In this book I will call this the difference period. In general this period is simply given by:

$$T_{\mathrm{diff}} = \frac{1}{f_1 - f_2} \qquad (10.1)$$
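As a quick check of equation 10.1, the sketch below computes the difference period for the case of figure 10.12. The frequencies 3 and 4 are arbitrary units chosen so that D is 33% faster than R, and taking the absolute value of the frequency difference is an assumption added here so that the order of the two arguments does not matter.

```python
from fractions import Fraction

def difference_period(f1, f2):
    """Equation 10.1: the time after which the two signals realign."""
    if f1 == f2:
        raise ValueError("equal frequencies never drift apart")
    return 1 / abs(Fraction(f1) - Fraction(f2))

f_ref = Fraction(3)          # reference R, arbitrary units
f_div = Fraction(4)          # divided signal D, 33% faster than R
t_diff = difference_period(f_div, f_ref)
print(t_diff * f_ref, t_diff * f_div)   # periods of R and of D in one difference period
```

The two numbers printed are the 3 periods of R and the 4 periods of D that fit into one difference period.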


In the figure 10.12, for a single difference period it can be seen that the total duration of the pull-up signal PU is less than the total duration of the pull-down signal PD. For the durations of the pull-up signal the filter capacitor is being charged, and for the durations of the pull-down signal the filter capacitor is being discharged. So if the total duration of the pull-up signal PU is less than the total duration of the pull-down signal PD over the difference period, then at the end of the difference period the filter capacitor has a lower voltage, which means that the VCO frequency is reduced, which is what is needed.

10.2.4 Loop filter

The loop filter is a low pass filter. In the figure 10.4 the loop filter shown is a simple RC filter. It is known as a first-order filter. In general filters are classified by their order as obtained from their transfer function as shown in the figure 10.13. The transfer function is used as:

$$O(s) = f(s)\, I(s) \qquad (10.2)$$

$$f(s) = \frac{(s - s_1)(s - s_2)}{(s - s_3)(s - s_4)} \qquad (10.3)$$

In the equation above s is jω = j 2 π f and f(s) is the transfer function. In the transfer function s1 and s2 are called zeros because that is where the transfer function goes to zero, and s3 and s4 are called the poles because the magnitude of the transfer function grows without bound there. The location of the poles is what filter designers have to worry about to keep the circuit from oscillating. You can make more complicated filters using more capacitors or perhaps even operational amplifiers. But in reality you don’t want to have too high an order of filter because it really does increase the probability that the PLL will become unstable and then it is basically useless. Any designer should be aware that process variations can cause the components in your circuit to vary by as much as 30% or so and that your circuits should be stable even with all this random (but usually concerted) variation. The PLL itself is a feedback loop and its order is (1 + filter order) where the 1 comes from the VCO, because the phase of the VCO is a perfect integral of the control voltage applied to it and integration counts as a pole. In terms of s, integration corresponds to multiplication by $\frac{1}{s}$.

Figure 10.13: The transfer function of a filter. [Block diagram: the input I(s) enters the block f(s) and the output O(s) leaves it.]
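To make the idea of a transfer function concrete, the sketch below evaluates the transfer function of a first-order RC low pass filter at a few frequencies. The standard form f(s) = 1 / (1 + sRC), which has no zeros and a single pole at s = −1/(RC), is used; the component values of 10 kΩ and 1 nF and the function name `rc_lowpass` are assumptions chosen only for illustration.

```python
import math

def rc_lowpass(f, r, c):
    """Transfer function of a first-order RC low pass filter at frequency f.

    f(s) = 1 / (1 + s R C) with s = j 2 pi f.
    """
    s = 1j * 2 * math.pi * f
    return 1 / (1 + s * r * c)

R, C = 10e3, 1e-9                       # assumed values: 10 kOhm and 1 nF
f_corner = 1 / (2 * math.pi * R * C)    # about 15.9 kHz
for f in (f_corner / 10, f_corner, f_corner * 10):
    h = rc_lowpass(f, R, C)
    phase = math.degrees(math.atan2(h.imag, h.real))
    print(f"{f:10.0f} Hz  gain {abs(h):.3f}  phase {phase:6.1f} deg")
```

Well below the corner frequency the signal passes almost unchanged, at the corner frequency the gain has dropped to about 0.707 with a phase lag of 45 degrees, and well above it the gain falls in proportion to the frequency.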

10.3 Loop operation

So the way that the loop works is that you release the PLL so that its starting VCO control voltage is lower than it should be. So the starting frequency is lower than it should be. So the divided frequency is lower than the reference frequency. So the control voltage needs to be increased until the divided signal matches the reference signal in frequency and phase. We already discussed the case when the VCO was too fast as shown in the figure 10.12, and in the figure 10.14 you see the opposite case when the VCO is too slow.

Figure 10.14: The case when the VCO is 25% too slow. [Timing diagram: the waveforms R, D, PU and PD over one difference period.]

As you can see in this figure the total duration of the PU signals is larger than the total duration of the PD signals over the course of one difference period. So at the end of each difference period the voltage of the filter capacitor is slightly higher, thus speeding up the VCO until such time that the PLL is locked.

When the VCO is very far from where it should be, you want a quick pull-in to the desired frequency. The way to do this is to simply reduce the RC time constant. But once the VCO gets close to lock, a small RC time constant causes a larger ripple in the VCO frequency which is undesirable. You can see this ripple in the VCO control voltage shown in the figure 10.10.

For this reason many PLLs have two modes for the loop filter. One mode provides lots of correction and is used to lock the PLL in the first place, and the second mode provides a much larger RC time constant and therefore a slower correction to damp the oscillations in the VCO frequency. Some people simply use a large RC time constant, such as by using a large capacitor, but this can backfire because the PLL may simply not lock at all: the pull-in will be so slow that it is drowned out by other causes of frequency variation such as power supply fluctuation or the coupling in of other signals, perhaps by capacitive coupling between interconnect lines.

10.4 Delay-locked loop

A delay-locked loop or DLL is very different from a PLL even though they are often used in the same applications. The difference is in the way the signal is generated. In the PLL the VCO was free-running and the only control that the loop had on the phase and the frequency of the VCO output was by raising or lowering the control voltage of the VCO. A DLL does not use a VCO. Instead a DLL uses variable delay sections. If the n you wish to use is 5 then you will have 5 delay sections. The way a DLL works is shown in the figure 10.15.

In the figure 10.15 the reference signal is used in two places. It enters the first delay section at the bottom left and proceeds toward the right going through each of the 5 delay sections. The reference signal R and each of the outputs P1 thru P5 are shown in the figure 10.16. Each of the 5 phases P1 thru P5 is delayed by a fifth of a cycle with respect to the previous phase. Note that the fifth phase P5 is identical to the reference signal R except that it is delayed by a whole cycle. This is used to lock the DLL to the reference frequency.

Figure 10.15: A delay locked loop. [Block diagram: the reference R passes through five delay sections with outputs P1 thru P5, each with a control input CV; the edge comparator compares R (Ir) with P5 (Id) and its output passes through the low-pass filter to CV.]

The Edge comparator in the figure 10.15 is used to compare the rising edge of the reference signal R to the rising edge of the fifth phase P5. If P5 rises after R then the delays are too long and they need to be reduced by speeding up the delay sections by increasing the control voltage CV. On the other hand if P5 rises prior to R then the delays are too short and they need to be increased by slowing down the delay sections by decreasing the control voltage CV.

Figure 10.16: The phases of the DLL. [Timing diagram: the reference R and the phases P1, P2, P3, P4 and P5.]
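The action of the edge comparator can be illustrated with a crude model in which the comparator only says whether P5 is late or early. Everything numerical here is an assumption made for the sketch: the linear relation between the control voltage and the delay of a section, the gain, the step size and the function name `lock_dll`.

```python
def lock_dll(t_ref, n_sections=5, delay_at_zero=3.0, gain=0.5, step=0.01, max_iter=10_000):
    """Bang-bang model of a DLL.

    Each delay section is assumed to have delay = delay_at_zero - gain * cv,
    so raising the control voltage cv speeds the section up. The edge
    comparator only reports whether P5 rises after or before the edge of R
    that is one period later than the one that entered the delay line.
    """
    cv = 0.0
    section_delay = delay_at_zero
    for _ in range(max_iter):
        section_delay = delay_at_zero - gain * cv
        total = n_sections * section_delay       # delay from R to P5
        if abs(total - t_ref) < n_sections * gain * step:
            break                                 # within one correction step
        if total > t_ref:
            cv += step                            # P5 is late: speed up
        else:
            cv -= step                            # P5 is early: slow down
    return cv, section_delay

cv, section_delay = lock_dll(t_ref=10.0)
print(round(cv, 2), round(section_delay, 3))
```

Once the loop has settled, the total delay of the five sections is close to one reference period of 10 time units, so each section delays the signal by close to a fifth of a cycle.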

Keep in mind that the DLL output is not the same as the PLL output. The PLL’s VCO outputs a nice clean waveform of frequency n R, however in the case of the DLL you need to construct this output waveform using the rising edges of P1 thru P5. Since whatever digital logic you use to construct the output waveform of frequency n R has delays of its own, the output waveform is not a clean square wave as in the case of the PLL’s output. For this reason DLLs are often used when the frequency required is not super high or alternately when you can use the outputs P1 thru P5 directly to control your circuitry. If you do this remember to consider the effect of the loading of P1 thru P5 and make sure the loads are identical to each other, otherwise the phases will be skewed and will not represent exactly a fifth of a cycle phase difference with respect to each other.

But in fact the increased transparency of the DLL behavior and the increased reliability make a lot of designers use the DLL in place of the PLL. Keep in mind that there is no concept of the difference period here because the rising edges of P1 thru P5 are based on the rising edge of the reference signal.

10.5 Tracking and re-sync PLLs

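Figure 10.17: Phase-locked loop used in radio reception. [Block diagram: the antenna feeds the detector and the multiplier; the detector output passes through a low pass filter (LPF) to the VCO, whose output is the pure carrier signal; the VCO output, shifted by 90°, is multiplied with the antenna signal and passed through a second LPF to give the demodulated signal.]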

Often all you are really looking for is a way to track an incoming signal and maintain a lock on it. The earliest use of this was in FM radio receivers. In the figure 10.17 the signal from the antenna is fed into the detector and the multiplier. The other input to the detector comes from the VCO. The output of the detector passes through a low pass filter and becomes the control voltage for the VCO. The reason you don’t have a divider is that you are trying to extract the carrier frequency.

So the output of the VCO is both phase and frequency synchronized with the carrier. So if it is phase shifted by 90° and multiplied with the antenna signal and passed through a low pass filter you will get the demodulated signal. The use of PLLs in radio receivers was really the application that drove the development of PLLs for a long time.


Chapter 11

Digital Signal Processors

Digital Signal Processors are similar to microprocessors except they perform a more specific purpose. They are used to process digital signals. If you take a time domain waveform and digitize it at 1 GHz using a 16-bit analog to digital converter, you will get a sequence of samples spaced 1 ns apart with each sample containing 16 bits.

Since a DSP is embedded in the device and performs only one function, it is expected to do it in a known time, usually real time. Because it is dedicated to the function it performs, you would select different DSPs for different purposes. In addition DSPs have specialized circuits to perform the tasks that affect performance in hardware instead of software.

11.1 Fourier transform

Signal processing is understood in the frequency domain and the Fourier transform is the way to convert a time domain waveform into a set of frequencies, amplitudes and phases. If you take the Fourier series representation of a series of rectangular pulses you will get equation 11.1. If you use the first 1, 3, 9 and 30 terms the pulse will look as shown in the plots 11.2.

$$\frac{1}{\pi} + \frac{2}{\pi}\left(\sin 1 \cos x + \frac{\sin 2 \cos 2x}{2} + \frac{\sin 3 \cos 3x}{3} + \ldots\right) \qquad (11.1)$$

[Plots (11.2): the pulse reconstructed from the first 1, 3, 9 and 30 terms of equation 11.1.]

The Fourier transform and the inverse Fourier transform are defined by the equations 11.3 and 11.4. A good book on the Fourier transform is [101]. The Fourier transform is different from the Fourier series in that the output terms are complex.

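$$F(\omega) = \int_{-\infty}^{\infty} f(x)\, e^{-ix\omega}\, dx \qquad (11.3)$$

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{ix\omega}\, d\omega \qquad (11.4)$$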

It is generally accepted that a time domain waveform that is measured will always have a Fourier Transform. The time domain waveform is a scalar i.e. it is not complex. The Fourier Transform is complex, but keep in mind that if you want the inverse Fourier Transform to be a scalar then although you can attenuate the phasors representing the Fourier Transform terms, you cannot change their angle because otherwise the inverse Fourier Transform will not be a scalar. The reason the inverse Fourier Transform is scalar is that

$$e^{-ix\omega} \times e^{ix\omega} = 1 \qquad (11.5)$$

In order to use either time domain or frequency domain information you have to discretize it. Basically you have to cut continuous functions into slices and treat each slice as a unit. So the Discrete Fourier Transform or DFT is used and it is defined as:

$$F(\nu) = \frac{1}{N} \sum_{\tau=0}^{N-1} f(\tau)\, e^{-i 2\pi (\nu/N) \tau} \qquad (11.6)$$

$$f(\tau) = \sum_{\nu=0}^{N-1} F(\nu)\, e^{i 2\pi (\nu/N) \tau} \qquad (11.7)$$

If you start with N time slices then you will get complex amplitudes at N frequencies. The lowest frequency is the inverse of the total time for which you sampled. The zeroth frequency of course is the constant term. The method used to compute the DFT is the Fast Fourier Transform or FFT [102]. This method requires a computation time proportional to $2N\log_2 N$ and gives a substantial reduction in computation time for large N. This method also extends to the two and three dimensional cases.

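The method used in this algorithm is based on [103]. The DFT can be split into two DFTs of half the size made up of the even and odd sequences as in the equations 11.8 and 11.9. In these equations ν indexes the time samples $f_\nu$ and τ is the index of the output frequency.

$$\sum_{\nu=0}^{N-1} e^{-2\pi i \tau \nu / N} \cdot f_{\nu} = \sum_{\nu=0}^{N/2-1} e^{-2\pi i \tau (2\nu) / N} \cdot f_{2\nu} + \sum_{\nu=0}^{N/2-1} e^{-2\pi i \tau (2\nu+1) / N} \cdot f_{2\nu+1} \qquad (11.8)$$

$$= \sum_{\nu=0}^{N/2-1} e^{-2\pi i \tau \nu / (N/2)} \cdot f_{2\nu} + e^{-2\pi i \tau / N} \cdot \sum_{\nu=0}^{N/2-1} e^{-2\pi i \tau \nu / (N/2)} \cdot f_{2\nu+1} \qquad (11.9)$$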

The FFT speeds up the DFT by recursively dividing it in half as shown in the figure 11.1. Then you perform the DFTs of each subsection and then multiply the odds by a constant before adding them to the evens. In each case the definition of odd is based on the previous level. For example 3 is an odd when you divide the sequence the first time, but it is an even when you divide it the second time, then it becomes an odd for the third division etc. Because of this division of the sequence, the time domain sequence must have a length that is a power of 2 i.e. 2, 4, 8, 16 etc. In cases where this condition is not met, the sequence is increased in length by adding trailing zeros until the condition is met. However this does have some side effects.
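The splitting into even and odd sequences can be written as a short recursive function. The sketch below is a textbook radix-2 version written for clarity rather than speed, and it is not necessarily the exact procedure of [102] or [103]. It computes the unnormalized sum on the left of equation 11.8 (no 1/N factor), uses zero-based indices, and relies on the fact that the factor in front of the odd sum changes sign when τ is increased by N/2.

```python
import cmath

def dft(f):
    """Direct DFT: F[tau] = sum over nu of exp(-2 pi i tau nu / N) * f[nu]."""
    n = len(f)
    return [sum(cmath.exp(-2j * cmath.pi * tau * nu / n) * f[nu] for nu in range(n))
            for tau in range(n)]

def fft(f):
    """Recursive radix-2 FFT; len(f) must be a power of 2."""
    n = len(f)
    if n == 1:
        return list(f)
    if n % 2:
        raise ValueError("length must be a power of 2")
    even = fft(f[0::2])          # DFT of f0, f2, f4, ...
    odd = fft(f[1::2])           # DFT of f1, f3, f5, ...
    out = [0j] * n
    for tau in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * tau / n) * odd[tau]
        out[tau] = even[tau] + twiddle
        out[tau + n // 2] = even[tau] - twiddle
    return out

samples = [1, 2, 3, 4, 0, 0, 0, 0]   # four data points padded with trailing zeros
assert all(abs(a - b) < 1e-9 for a, b in zip(fft(samples), dft(samples)))
```

The direct `dft` needs N multiplications for each of the N outputs, while `fft` does about N operations at each of the log2 N levels of the recursion, which is where the saving for large N comes from.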

Figure 11.1: Successive reduction of the sequence. [Diagram: the sequence 1 to 16 is split into the evens (2, 4, ..., 16) and the odds (1, 3, ..., 15), then into groups of four (4 8 12 16; 2 6 10 14; 3 7 11 15; 1 5 9 13), and finally into pairs (1 9; 5 13; 3 11; 7 15; 2 10; 6 14; 4 12; 8 16).]

The Nyquist theorem states that if you have a time-domain sequence with samples taken T time units apart then the maximum frequency you can represent using this stream is given by:

$$f_{max} = f_N = \frac{1}{2T} \qquad (11.10)$$

So this means that if you wish to transmit a signal containing frequencies up to 10 kHz then you need to sample at 20k samples/second or more.

Figure 11.2: Aliasing due to low sampling rate. [Plot: the actual frequency, the sample points and the alias at lower frequency.]

Aliasing occurs when the sampling rate is too low. In other words the physical waveform you are sampling contains frequency components higher than $f_N$. But these frequencies can give the illusion of being a lower frequency as shown in the figure 11.2.

As you can see in this figure the actual frequency is higher than the Nyquist criterion allows. But when you sample it at a low frequency you still get the up and down variation of the signal at a much lower frequency. In this case the alias appears at about a third of the original frequency and it has quite a significant amplitude, in fact it has the full amplitude of the original frequency.

Figure 11.3: The use of an anti-aliasing filter. [Block diagram: a low pass filter (LPF), followed by the A to D converter, followed by the FFT.]

This is a problem because when you take the FFT of this waveform after sampling it too slowly, you will believe that the alias is real i.e. you won’t know that it is an alias. For this reason some systems use an anti-aliasing low pass filter with a cut-off frequency of $f_N$ as shown in the figure 11.3. This removes the problem and in this situation you will not see the alias anymore.
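The aliasing effect is easy to reproduce numerically. In the sketch below the sampling rate of 12 samples per second and the signal frequencies of 9 Hz and 3 Hz are assumed values, chosen so that the alias lands at a third of the actual frequency as in the figure 11.2.

```python
import math

def sample_cosine(freq, sample_rate, n_samples):
    """Sample cos(2 pi freq t) at the given sampling rate."""
    return [math.cos(2 * math.pi * freq * k / sample_rate) for k in range(n_samples)]

sample_rate = 12.0                              # samples per second, so f_N = 6 Hz
actual = sample_cosine(9.0, sample_rate, 24)    # 9 Hz is above f_N
alias = sample_cosine(3.0, sample_rate, 24)     # 12 - 9 = 3 Hz

# The two sets of samples agree to within rounding error.
print(max(abs(a - b) for a, b in zip(actual, alias)))
```

Since the samples of the 9 Hz wave and of the 3 Hz wave are the same, nothing you do after sampling can tell them apart, which is why the filtering has to happen before the A to D converter.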

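Figure 11.4: Frequency response of anti-aliasing filters. [Plot: amplitude (ideal value 1) against frequency f up to $f_N$, showing the ideal filter, a smooth lower curve and an upper curve with amplitude ripples.]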

Over-sampling is a technique used to remove problems such as aliasing among other things. It is a better way to fix the problem of aliasing than the anti-aliasing filter. The problem with the anti-aliasing filter is shown in the figure 11.4. If you use a simple filter you get the smooth curve (the lower one). The problem with this curve is that there is significant loss of amplitude (it is supposed to be 1) near the Nyquist frequency $f_N$. If you increase the order of the filter the amplitude improves but you get ripples as shown in the upper curve. So whether you use a low order or a high order filter, the amplitude of the signal is wrong near the cutoff frequency $f_N$.

Over-sampling is a solution to this problem. Instead of setting the cutoff frequency of the low pass filter at $f_N$ you set it at 2 $f_N$ or higher, perhaps even 8 $f_N$. By doing this the amplitude is even all the way up to and beyond $f_N$. Now you sample this filtered signal at 2x or 8x the original sampling rate. So these would be 2x and 8x over-sampling respectively. Then you use digital techniques to throw away the frequencies higher than $f_N$. Because your sampling rate is several times higher than it needs to be there is no aliasing below $f_N$, and although the amplitudes above $f_N$ are wrong you don’t care because you will throw them away anyway.

Window functions are used to remove the effect of the edges of a time domain sequence. Since the Fourier transform assumes that the time sequence repeats, you will get the best results if you sample an integral number of cycles of the time waveform. Since this is easier said than done it is not normally the case. So window functions are used to mitigate the effects of the data points near the beginning and the end of the time sequence.

The most popular window functions are the Hamming, the Hanning and the Kaiser. You can also use a Gaussian. The general shape of the window function is as shown in the figure 11.5. It will be multiplied by the time sequence before the FFT is performed. Even this windowing will cause distortion but you make a choice between two evils. As a general rule you are better off with a time sequence of several cycles instead of just one.

Figure 11.5: The basic shape of a window function.
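As an illustration of how a window is applied, the sketch below builds a Hamming window from its usual definition, w[k] = 0.54 − 0.46 cos(2πk/(N − 1)), and multiplies a time sequence by it. The function names are assumptions made for this example.

```python
import math

def hamming(n_points):
    """Hamming window: w[k] = 0.54 - 0.46 cos(2 pi k / (N - 1))."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n_points - 1))
            for k in range(n_points)]

def apply_window(samples):
    """Multiply a time sequence by the window before taking the FFT."""
    window = hamming(len(samples))
    return [s * w for s, w in zip(samples, window)]

print([round(w, 2) for w in hamming(5)])   # [0.08, 0.54, 1.0, 0.54, 0.08]
```

The window leaves the middle of the sequence almost untouched and scales the data points near the two ends down to a small fraction, so the mismatch between the end of the sequence and its beginning matters much less.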

11.2 Compression

Compression is used anywhere that digital information needs to be transferred from one location to another and the speed of the link is the limiting factor. A cellphone is one place where compression plays an important role as shown in the figure 11.6.

Figure 11.6: A simplified cell phone. (Blocks: receiver, A to D converter, DSP, D to A converter, transmitter.)

The antenna signal comes in from the antenna at the carrier frequency and the receiver extracts the baseband signal from it. The analog to digital converter converts the baseband to a digital signal. The DSP processes that and sends it to the loudspeaker. It also takes the signal from the microphone, converts it to analog using the digital to analog converter and transmits it after upshifting the baseband to the carrier frequency.

In figure 11.7 you see what the DSP does in the transmit path. The bit stream coming in from the left is coming into the DSP twice as fast as it is going out of the DSP to the right, so the compression ratio is 2. In any time period, say 1 second, if 8k bits enter from the left only 4k bits go out on the right. Now unlike a picture image this cannot be lossy, i.e. when the bit stream is decompressed after receiving it, you should have a stream exactly identical to the one before compression. If it was just voice it might not matter, but of course nowadays cellphones are used to connect computers to the internet so data has to be transmitted accurately.

Figure 11.7: The DSP compresses the bit stream. (2n bits/sec enter the DSP; n bits/sec leave it toward the D to A converter and the transmitter.)

Figure 11.8: Huffman compression of data. (In: five 8-bit chunks. Out: codes of 4, 4, 5, 5 and 4 bits.)

Huffman coding [104] is a very popular compression scheme. The way this is accomplished is shown in figure 11.8. In this figure the data bit stream is entering on the left and is divided into chunks of 8 bits. If you look at each 8 bit chunk, it can only have one of 256 values. Now if the data has a pattern to it, for example if it is a text file, then some of the 256 values will occur more often than others. The alphanumeric text is 62 characters, which means that most of the remaining 256 − 62 = 194 characters are rarely used.

In addition the letter "a" will occur very often whereas the letter "q" should occur much less often. So in Huffman coding you assign a 3-bit number for the letter "a" and you assign a 6-bit number for the letter "q". Since you only send 3 bits when you send the letter "a" and 6 bits when you send "q", and you may send a 10-bit number to represent "@", the compression you get is very significant. So Huffman compression works very well on text files.

It also works quite well on voice signals because voice signals are not random. The human ear can hear sounds in the range from 20 Hz to 20 kHz. But the human voice box does not use the entire spectrum uniformly during speech. For example a woman’s voice uses a higher frequency range than does a man’s voice. Also different languages use phonetic sounds differently. For this reason some bit sequences repeat more often than others. So Huffman codes work quite well in a cellphone.

Since you use different bit lengths to represent different 8-bit words, how does the decompression process work? How do you know when you have a valid sequence of bits to convert back into the 8-bit word? The answer is to make a tree. A binary tree is a way to do sorting and in this case to look up the meaning of a code, as shown in figure 11.9. In this figure you start at the top. If the first bit is a 0 you go left, if it is a 1 you go right. The diamonds are decision blocks representing this choice. The parallelograms are actual codes so they are dead ends. When you reach a parallelogram it means you have deciphered a letter and you can go back to the top and start again. This particular case is not really a sorting operation, but in general the speed of a binary tree is very high and is of the order $\log_2 N$ where N is the number of objects to search.

Figure 11.9: A Huffman decoding scheme. (A binary decision tree starting at "Start", with branches labeled 0 and 1 and leaves such as "a", "i" and "s".)
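The coding and decoding steps described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names `build_codes`, `encode` and `decode` and the sample sentence are made up for this sketch. The decoder accumulates bits until they match a complete code, which is equivalent to walking the tree from the top until a leaf is reached.

```python
import heapq
from collections import Counter

def build_codes(text):
    """Return {symbol: bit string}; frequent symbols get shorter codes."""
    counts = Counter(text)
    # heap entries are (weight, unique id, tree); a tree is a symbol or a (left, right) pair
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # the two least frequent subtrees
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):       # decision block: 0 goes left, 1 goes right
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                             # leaf: an actual code
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

def encode(text, codes):
    return "".join(codes[ch] for ch in text)

def decode(bits, codes):
    lookup = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:             # reached a leaf, go back to the top
            out.append(lookup[current])
            current = ""
    return "".join(out)

text = "a sad sailor said it is as it is"
codes = build_codes(text)
packed = encode(text, codes)
print(len(packed), "bits instead of", 8 * len(text))
print(decode(packed, codes))
```

Because no code is the beginning of another code, the decoder never has to guess where one symbol ends and the next begins.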

11.3 Digital Filters

If you wish to filter a time domain signal with a filter you could take the FFT of the signal, multiply it by the filter and then take an inverse FFT. But doing FFTs in a DSP is computationally expensive. Convolution is a way to do this in the time domain. Convolution in the time domain is multiplication in the frequency domain. Convolution is done by equation 11.11.

$$C(t) = \int_{-\infty}^{\infty} A(u)\, B(t - u)\, du \qquad (11.11)$$

This equation gives you the value of the convolved output C(t) at each value of t. Now let us limit B(t) in order to get rid of the ∞ signs. If B(t) is non-zero between, say, −5 and +5 and is zero everywhere else, then B(t − u) is non-zero only for u between t − 5 and t + 5, so you only need to integrate over that range. The equation becomes 11.12, and if you discretize it, it becomes equation 11.13, where the step du becomes the sample spacing $\Delta u$.

$$C(t) = \int_{t-5}^{t+5} A(u)\, B(t - u)\, du \qquad (11.12)$$

$$C_n = \sum_{k=-5}^{5} A_{n-k}\, B_k\, \Delta u \qquad (11.13)$$
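Equation 11.13 can be written out directly as a loop. The sketch below assumes NumPy; the function name `convolve_11_13`, the random test data and the choice to treat samples of A outside the record as zero are assumptions made for illustration. It compares the loop against NumPy's own `np.convolve`.

```python
import numpy as np

def convolve_11_13(a, b, half_width=5, du=1.0):
    """C_n = sum over k = -half_width..half_width of A_(n-k) * B_k * du.

    b holds B_(-half_width) .. B_(+half_width); samples of A outside
    the record are treated as zero (an assumption of this sketch).
    """
    c = np.zeros(len(a))
    for n in range(len(a)):
        for k in range(-half_width, half_width + 1):
            if 0 <= n - k < len(a):
                c[n] += a[n - k] * b[k + half_width] * du
    return c

rng = np.random.default_rng(1)
a = rng.random(32)      # the signal A
b = rng.random(11)      # B, non-zero only for k = -5 .. +5

direct = convolve_11_13(a, b)
library = np.convolve(a, b, mode="same")
print(np.allclose(direct, library))
```

With `mode="same"` the library result is centered on the signal, so it lines up sample for sample with the symmetric sum from k = −5 to k = +5.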

Filtering is the most time honored task normally performed by DSPs because it used to be done on input signals well before DSPs became possible. Signals used to be filtered using specialized electronic circuits. Names like Butterworth or Chebyshev filters are well known to analog, radio frequency or microwave circuit designers. These filters were designed by order of the filter, meaning that a higher-order filter had higher order polynomial terms, which allowed both better blocking of unwanted frequencies and better pass through of wanted frequencies.

Figure 11.10 shows a Finite Impulse Response (FIR) filter implementation. There are many references [105], [106] from which to obtain the coefficients bn. They are based on the convolution integral of equation 11.11. You start by defining the frequency response H(ω) of the filter in the frequency domain. Then you take its inverse FFT. Now you will get many terms and you select the set you want to use. This set is the set of numbers bn. The more terms you use the better the filtering.

Figure 11.10: An FIR filter. (The input x(n) passes through a chain of delays; the taps are multiplied by b0 to b5 and summed to give y(n).)

Because the summation of products as in equation 11.13 is a very common use of the DSP, it normally has both accumulators and a Multiply and Accumulate (MAC) unit, in addition to more than one ALU. An accumulator just adds whatever you give it to a running sum, whereas a MAC multiplies two numbers and gives the result to an accumulator.
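As a rough sketch of the whole procedure, the code below defines an ideal low pass response H, takes its inverse FFT, keeps 21 terms as the coefficients, and applies them with an explicit multiply and accumulate loop. NumPy is assumed; the 64-point response, the cutoff, the number of taps and the names `fir_mac` and `gain` are all illustrative choices, not values from the text.

```python
import numpy as np

N = 64
H = np.zeros(N)
H[:9] = 1.0            # pass bins 0..8
H[-8:] = 1.0           # and their mirror images (negative frequencies)
h = np.real(np.fft.ifft(H))

taps = 21
# keep the 21 terms centered on h[0]; b[k] corresponds to h[k - 10]
b = np.roll(h, taps // 2)[:taps]

def fir_mac(x, b):
    """y(n) = sum of b[k] * x(n - k), written as a multiply and accumulate loop."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = 0.0                          # the accumulator
        for k in range(len(b)):
            if n - k >= 0:
                acc += b[k] * x[n - k]     # one MAC operation
        y[n] = acc
    return y

def gain(cycles_per_sample):
    """Steady-state output amplitude for a unit-amplitude cosine input."""
    n = np.arange(256)
    y = fir_mac(np.cos(2 * np.pi * cycles_per_sample * n), b)
    return np.max(np.abs(y[taps:]))        # skip the start-up transient

low_gain = gain(2 / 64)     # well inside the pass band
high_gain = gain(24 / 64)   # well inside the stop band
print(low_gain, high_gain)
```

Keeping more terms brings the pass band gain closer to 1 and the stop band gain closer to 0, which is the sense in which more terms give better filtering.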

11.4 Pattern recognition

Figure 11.11: Digitized intensity pattern.

The cross-correlation between two processes f(t) and g(t) is given by

$$CC_{fg}(t_1, t_2) = E\left[ f(t_1)\, g^{*}(t_2) \right] \qquad (11.14)$$

[107] suggested an easy way of comparing images of equal size and number of samples. First take the FFT of both images. Take the complex conjugate of one of the FFTs and multiply it by the other FFT. Then take the inverse FFT of the product. This is the cross-correlation of the two images. If you find a peak in this cross-correlation, that gives you the displacement at which the correlation is highest.

For example, suppose you wish to recognize the letter B as shown in figure 11.11. You would first place the grid on the left over the letter and average the black part over the area of the squares. Since you are going to use the FFT, the dimensions need to be a power of 2. Then you would use the technique above to detect a "B" pattern in another image of the same size. Because intensities vary you would have to normalize them first.
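The FFT recipe above can be tried on a small grid. This sketch assumes NumPy; the 8 by 8 random "pattern", the (3, 5) shift and the helper names `normalize` and `displacement` are invented for illustration, and the shift is circular because the FFT treats the image as repeating.

```python
import numpy as np

rng = np.random.default_rng(0)
reference = rng.random((8, 8))                             # digitized intensity pattern
shifted = np.roll(reference, shift=(3, 5), axis=(0, 1))    # same pattern, displaced

def normalize(img):
    """Remove the mean and scale to unit variance so intensities are comparable."""
    return (img - img.mean()) / img.std()

def displacement(ref, img):
    f_ref = np.fft.fft2(normalize(ref))
    f_img = np.fft.fft2(normalize(img))
    # conjugate of one FFT times the other, then the inverse FFT
    corr = np.real(np.fft.ifft2(np.conj(f_ref) * f_img))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    return tuple(int(v) for v in peak)

print(displacement(reference, shifted))
```

The position of the peak is the (row, column) displacement at which the two images line up best.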


11.5 Error correcting codes

Error correcting codes are not compression schemes. Here the number of bytes after coding increases and the excess bits are called "redundant". The purpose of error correcting codes is to transmit information reliably even in the presence of noise that corrupts a few bits in the received data. The most popular ECCs are Reed-Solomon coding and decoding, and convolutional coding with Viterbi decoding. There are other ways of decoding convolutional codes but Viterbi decoding is the most efficient.

11.5.1 Reed-Solomon code

Reed-Solomon codes [108] convert a sequence of $mn$ bits into a sequence of $n 2^n$ bits. So if n = 3 and m = 3 then a block of 9 bits will become a block of 24 bits, with each group of m symbols of n bits having a corresponding $2^n$ symbols. At least $(2^n - m - 1)/2$ symbols can be corrected, so if n = 3 and m = 3 then any 0, 1 or 2 symbols within each block can be incorrect.

The theory of Reed-Solomon codes is rather complex so the following is just an example, slightly different from that in the original paper [108]. We start by picking n = 3 and m = 3. Then we define a polynomial function $f(x) = x^3 + x^2 + 1$. We use this to generate a recurring sequence [109] using the difference equation $a_n = a_{n-1} + a_{n-3}$, which is used for n = 3, 4, 5, ..., with modulo-2 addition, in which 1 + 1 = 0.

Now we choose $a_0 = 1$, $a_1 = 1$ and $a_2 = 0$. The set of $a_n$ is (1, 1, 0, 1, 0, 0, 1) and after this it repeats so that $a_7 = a_0$, $a_8 = a_1$ and so on. Now you define a table as

$$\begin{aligned}
0 &= (0, 0, 0) \\
\alpha &= (a_0, a_1, a_2) = (1, 1, 0) \\
\alpha^2 &= (a_1, a_2, a_3) = (1, 0, 1) \\
\alpha^3 &= (a_2, a_3, a_4) = (0, 1, 0) \\
\alpha^4 &= (a_3, a_4, a_5) = (1, 0, 0) \\
\alpha^5 &= (a_4, a_5, a_6) = (0, 0, 1) \\
\alpha^6 &= (a_5, a_6, a_7) = (0, 1, 1) \\
\alpha^7 &= (1, 1, 1) \\
\alpha^8 &= \alpha^7 \times \alpha = \alpha
\end{aligned} \qquad (11.15)$$

Now you define another polynomial of degree m − 1 as $P(x) = b_0 + b_1 x + b_2 x^2$. The code is then the correspondence of equation 11.16, where (0, 1, 1) + (1, 0, 1) = (1, 1, 0) and so on.

$$(b_0, b_1, b_2) \rightarrow \left( P(0), P(\alpha), P(\alpha^2), P(\alpha^3), P(\alpha^4), P(\alpha^5), P(\alpha^6), P(1) \right) \qquad (11.16)$$

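Now if you wish to send the sequence (101001100), that is $(\alpha^2, \alpha^5, \alpha^4)$, then $P(x) = \alpha^2 + \alpha^5 x + \alpha^4 x^2$, so you will get the sequence $(\alpha^2)$, $(\alpha^2 + \alpha^6 + \alpha^6)$, $(\alpha^2 + \alpha^7 + \alpha^8)$, $(\alpha^2 + \alpha^8 + \alpha^{10})$, $(\alpha^2 + \alpha^9 + \alpha^{12})$, $(\alpha^2 + \alpha^{10} + \alpha^{14})$, $(\alpha^2 + \alpha^{11} + \alpha^{16})$, $(\alpha^2 + \alpha^5 + \alpha^4)$, i.e. the output sequence is

(101001100) → (101, 101, 100, 001, 001, 000, 100, 000)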
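The worked example can be checked with a short script. This is only a sketch of the particular example above, not a general Reed-Solomon encoder; the names `build_table`, `add`, `mul` and `encode` are made up here. Addition is bit-by-bit modulo 2, and multiplication adds exponents and wraps around using $\alpha^8 = \alpha$.

```python
ZERO = (0, 0, 0)

def build_table():
    """alpha^i = (a[i-1], a[i], a[i+1]) with a_n = a_(n-1) + a_(n-3) modulo 2."""
    a = [1, 1, 0]
    while len(a) < 9:
        a.append(a[-1] ^ a[-3])
    return {i: (a[i - 1], a[i], a[i + 1]) for i in range(1, 8)}

POWER = build_table()                           # exponent -> 3-bit vector
LOG = {vec: i for i, vec in POWER.items()}      # 3-bit vector -> exponent

def add(u, v):
    return tuple(p ^ q for p, q in zip(u, v))   # modulo-2 addition of each bit

def mul(u, v):
    if u == ZERO or v == ZERO:
        return ZERO
    e = (LOG[u] + LOG[v]) % 7                   # exponents repeat with period 7
    return POWER[e or 7]                        # alpha^7 plays the role of 1

def encode(b0, b1, b2):
    """Evaluate P(x) = b0 + b1 x + b2 x^2 at 0, alpha, ..., alpha^6, 1."""
    points = [ZERO] + [POWER[i] for i in range(1, 8)]
    return [add(b0, add(mul(b1, x), mul(b2, mul(x, x)))) for x in points]

message = "101001100"
symbols = [tuple(int(c) for c in message[i:i + 3]) for i in range(0, 9, 3)]
codeword = ["".join(map(str, v)) for v in encode(*symbols)]
print(codeword)
```

The printed list reproduces the output sequence given above, one 3-bit symbol per evaluation point.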

In order to decode the message the $a_n$ need to be determined for the equation $P(x) = a_0 + a_1 x + \ldots + a_{m-1} x^{m-1}$. So any m = 3 equations need to be solved from the $2^n = 8$ equations 11.17. The P’s are the incoming data. The polynomial f(x) is known. By solving different sets of m equations you should get the same $a_n$. The $a_n$ you get most often is the correct set. So if for a set of m equations you don’t get the same $a_n$, then one of the P’s is incorrect, and by using the $a_n$ obtained from another set of m equations you can obtain the correct P value.

$$\begin{aligned}
P(0) &= a_0 \\
P(\beta) &= a_0 + a_1 \beta + a_2 \beta^2 + \ldots + a_{m-1} \beta^{m-1} \\
P(\beta^2) &= a_0 + a_1 \beta^2 + a_2 \beta^4 + \ldots + a_{m-1} \beta^{2m-2} \\
&\;\;\vdots \\
P(1) &= a_0 + a_1 + a_2 + \ldots + a_{m-1}
\end{aligned} \qquad (11.17)$$

11.5.2 Convolutional coding and Viterbi decoding

An example of convolutional coding [110] is shown in figure 11.12. There are two output bits, a and b, for each input bit X. Because of the delays, the last four input bits are used to compute a and b. The adders shown are modulo-2. The encoder can be defined by equation 11.18.

Figure 11.12: An example convolutional encoder. (The input X feeds three delays D; modulo-2 adders form the outputs a and b.)

$$\begin{aligned}
a &= x_{n-3} + x_{n-2} + x_n \\
b &= x_{n-3} + x_{n-1}
\end{aligned} \qquad (11.18)$$

At the end of the sequence three trailing zeros are added until all the output bits are computed. Usually the tree for the encoding system is written as shown in figure 11.13.

Figure 11.13: Tree code diagram. (Each branch is labeled with a data bit, 0 or 1, and the corresponding coded bits ab.)

In figure 11.13 the data bit is shown below each branch while the corresponding coded bits are shown above the branch. If the number of delays used is K then there are K + 1 branches. Each branch will have one data bit and one set of coded bits, except the last branch which has K data bits and K sets of coded bits to account for the K − 1 trailing zeros. So all possible states of the encoder are covered.

$x_{n-1}x_{n-2}x_{n-3}$   Output   $x_n x_{n-1} x_{n-2}$
000   00   000
001   11   000
010   10   001
011   01   001
100   01   010
101   10   010
110   11   011
111   00   011
000   10   100
001   01   100
010   00   101
011   11   101
100   11   110
101   00   110
110   01   111
111   10   111

Table 11.1: Data for the trellis diagram.
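Table 11.1 follows directly from equation 11.18, which the following sketch illustrates. The function name `encode_step` and the tuple layout of the state are choices made here for illustration.

```python
def encode_step(state, x_n):
    """One step of the encoder of equation 11.18.

    state is (x_(n-1), x_(n-2), x_(n-3)); returns the output bits "ab"
    and the next state (x_n, x_(n-1), x_(n-2)).
    """
    x1, x2, x3 = state
    a = x3 ^ x2 ^ x_n       # a = x_(n-3) + x_(n-2) + x_n, modulo 2
    b = x3 ^ x1             # b = x_(n-3) + x_(n-1), modulo 2
    return f"{a}{b}", (x_n, x1, x2)

rows = []
for x_n in (0, 1):                       # first the rows for input 0, then for input 1
    for s in range(8):
        state = ((s >> 2) & 1, (s >> 1) & 1, s & 1)
        out, nxt = encode_step(state, x_n)
        rows.append(("".join(map(str, state)), out, "".join(map(str, nxt))))

for row in rows:
    print(*row)
```

The sixteen printed rows are in the same order as the table: eight starting states with an incoming 0, then the same eight states with an incoming 1.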

Viterbi decoding [111] is used to decode messages created by convolutional encoding. The encoder is first depicted by a trellis diagram which looks as shown in figure 11.14. For table 11.1 there are 8 states on the left and right and they are connected by lines with the output bits in the center. Now if you received the code 01 then it could only be one of the four transitions 011 → 001, 100 → 010, 001 → 100 or 110 → 111.

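Figure 11.14: A trellis diagram. (States $x_{n-1}x_{n-2}x_{n-3}$ on the left, states $x_n x_{n-1} x_{n-2}$ on the right, connected by lines labeled with the output bits ab.)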

So in Viterbi decoding the previous states $x_{n-1}x_{n-2}x_{n-3}$ which were already decoded are used to decide what the next bit should be. In the presence of noise the coded bits that you receive may be wrong. So suppose that 01 was sent but 00 was received. Then it has to be one of the four transitions 000 → 000, 111 → 011, 010 → 101 or 101 → 110. But suppose the bits already decoded show that $x_{n-1}x_{n-2}x_{n-3}$ is 011. Since this does not fit any of the known transitions, the next incoming code is checked as well and the most probable value that should have been received is used.
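One common way to implement this idea is to keep, for every state, the path with the fewest disagreements with the received bits. The sketch below is a simple hard-decision version for the encoder of equation 11.18; it is an illustration under stated assumptions rather than the exact procedure of [111], and the names `encode_step`, `encode` and `viterbi` and the test message are invented here.

```python
def encode_step(state, x_n):
    x1, x2, x3 = state                     # (x_(n-1), x_(n-2), x_(n-3))
    return (x3 ^ x2 ^ x_n, x3 ^ x1), (x_n, x1, x2)

def encode(bits):
    state, out = (0, 0, 0), []
    for x in list(bits) + [0, 0, 0]:       # three trailing zeros flush the delays
        ab, state = encode_step(state, x)
        out.append(ab)
    return out

def viterbi(received):
    # for each reachable state keep (number of bit errors so far, decoded bits)
    paths = {(0, 0, 0): (0, [])}
    for r in received:
        new_paths = {}
        for state, (errors, bits) in paths.items():
            for x in (0, 1):
                ab, nxt = encode_step(state, x)
                cost = errors + (ab[0] != r[0]) + (ab[1] != r[1])
                if nxt not in new_paths or cost < new_paths[nxt][0]:
                    new_paths[nxt] = (cost, bits + [x])
        paths = new_paths
    errors, bits = paths[(0, 0, 0)]        # the trailing zeros end in state 000
    return bits[:-3]                       # drop the trailing zeros

message = [1, 0, 1, 1, 0, 0, 1]
coded = encode(message)
noisy = list(coded)
noisy[2] = (noisy[2][0] ^ 1, noisy[2][1])  # corrupt one received bit
print(viterbi(noisy) == message)
```

Each step looks at all the transitions of table 11.1 and discards the less likely of the paths arriving at the same state, so the work per received pair stays constant no matter how long the message is.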


11.6 Motor control

Figure 11.15: Motor control with a DSP. (Blocks: DSP, motor, and a transducer in the feedback path.)

There are many reasons why motors need to be controlled. During startup a motor may need more torque to begin the rotation. If there are sudden load changes the current may have to be increased or decreased to compensate. Frictional forces may vary with temperature. So DSPs are often used to control motors as shown in figure 11.15.

Figure 11.16: DC motor control. (AC input, a bridge of switches S1, S2, S3 and S4, and the + and − terminals of the motor.)

Both DC and AC motors can be controlled by a DSP. With a DC motor only the current needs to be controlled. This is done using a bridge circuit as shown in figure 11.16. For the case when the motor needs to be driven in the forward direction, S1 and S4 are high while S2 and S3 are low.

Figure 11.17: Three phase motor control. (AC input on the left, outputs to the motor on the right.)

But if you wish to control the speed then you need to apply a pulse width modulated signal to S4. When S4 is off, current will still need to flow because of the inductance of the motor coils, and for this case the diode across S3 turns on. So the diodes across the transistors are called "flywheel" diodes, because the motor keeps spinning due to its inductance much as a flywheel does due to its moment of inertia.

With an AC motor the phase and phase factor have to be controlled as well, as shown in figure 11.17. The pulse width modulation signals are used to drive the bases of the control transistors. In this case the phase relationship between the PWM signals makes a lot of difference, because the three phases of the motor have to be 120° apart and the induction needs to be balanced so that the pull of the motor is smooth. When the current is not balanced it is somewhat like an unbalanced load and the motor will start to vibrate.

So the DSP is used in a standard feedback circuit to control the motor as in figure 11.15. As the motor coils rotate in the magnetic field produced by the fixed poles each has its own back emf, so in cost sensitive applications there is no position transducer and instead the voltage and current patterns through the coils are analyzed to determine how fast the motor is rotating and what its position is.

The speed is usually what is controlled and the degree of control can make a lot of difference. For example imagine a DVD player where the laser beam reflections are being read by a detector digital circuit. The appearance of the ovals under the laser beam has to synchronize with the clock of the digital circuit. The more precise the speed control, the smaller the ovals could be.

So in fact the actual configuration of the DSP loop depends a lot on the specific application. But the general idea is always the same: the PWM signal output from the DSP controls the motor, and a feedback of some kind is fed into the DSP and processed to generate the PWM signals. Just as in any control system, you can write transfer functions, analyze the stability and perform a suitable design.
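The feedback idea can be sketched with a very simple simulation. Everything numeric here is an assumption made for illustration: the motor is modeled as a first-order system whose speed settles at 200 rad/s at full duty with a 0.1 s time constant, and the DSP runs a proportional plus integral loop that turns the speed error into a PWM duty cycle. A real drive would need a model of the actual motor.

```python
def simulate(target=120.0, seconds=2.0, dt=0.001):
    """Proportional plus integral speed loop driving a PWM duty cycle."""
    full_speed, tau = 200.0, 0.1     # assumed motor model, not from the text
    kp, ki = 0.01, 0.1               # assumed controller gains
    speed, integral, duty = 0.0, 0.0, 0.0
    for _ in range(int(seconds / dt)):
        error = target - speed                    # feedback from the transducer
        integral += error * dt
        duty = kp * error + ki * integral
        duty = min(1.0, max(0.0, duty))           # a duty cycle is limited to 0..1
        speed += dt * (full_speed * duty - speed) / tau
    return speed, duty

final_speed, final_duty = simulate()
print(final_speed, final_duty)
```

With these assumed values the loop settles at the target speed with a duty cycle of about 0.6, because 0.6 of the full speed of 200 rad/s is the 120 rad/s target.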


Chapter 12

I/O circuits and pcb interactions

Printed Circuit Boards or PCBs arrived on the technological scene well before micro-chips did. The original radio receivers were built using vacuum tube triodes and a lot of discrete resistors, capacitors and inductors. PCBs were invented to make these types of circuits more reliable and easier to assemble. For the person who is assembling the circuit, it is definitely easier because all you need to know is where on the PCB to put each component, and then you solder all the connections and you are done.

Nowadays PCBs contain as many as a dozen or more layers. The reason is that when you use more layers to route with, you can reduce the area of the board, so even if you pay more for extra routing layers you are paying less for a smaller area board. In addition, by increasing the number of routing layers you can reduce the length of the interconnections, and that will improve the performance, and the improvement in performance may be worth the extra cost. So the board designers trade off the cost of the board against the performance they want and the space available for the board to fit in.

Most PCBs nowadays use surface mounted components so there is no need to make holes to place the components. The component pins are flat against the board and are held by solder of less than a square mm. But you still make holes to create connections between the interconnects on different layers.

12.1 Design consideration

There are basically only a few characteristics that are important to making and using high quality boards, and they are:

1. Capacitive loading

2. Transit time

3. Line impedance

4. Electro-static discharge or ESD

5. Line drivers

6. Line terminations

7. Impedance variation

8. Spread spectrum technology

9. Cross coupling

10. Antenna effect

11. Ground bounce

12. Ringing

12.1.1 Capacitive loading

The capacitive loading of the line is the most important characteristic because it eats up so much of the IO drive. Board level interactions with long PCB lines are usually done at low frequencies of under 100 MHz. Even a 100 MHz signal translates to a 10 ns period. A PCB line is usually about 150 ps per inch, so a 10" trace is probably about 1.5 ns long, i.e. well under the period. So a signal driven down a line as a high is often expected to be held for the entire 10 ns cycle. Even assuming that the input at the end of the line is not sinking significant current, the capacitance of the entire line needs to be charged. Of course, some outputs may drive many inputs so you have to add up the capacitance of all these lines. You can actually reduce this if you don’t care about the third item on this list, which is the line impedance, because if you don’t have a ground or power plane close to this line, then your capacitance reduces. Of course in this case you will still have the sum of the parasitic capacitances to all lines running nearby or crossing underneath, which are at varying voltages that are time-dependent. For low speed signals you don’t care about ground or power planes or about line impedance, you just use a driver that can drive the load.

In figure 12.1 you can see three traces A, B and C. There is a large capacitance between A and B because they overlap each other all the way, but because C is perpendicular to both A and B, their capacitance to C is much smaller. As we discussed in the first chapter, the capacitance is proportional to the area of overlap and inversely proportional to the distance between the conductors. So the parasitic capacitances have to be factored into your circuit simulation as shown in figure 12.2. In this figure, although it is not obvious, assume that A is above B and C is below B, so we can draw one parasitic capacitor between A and B and another between B and C.

Figure 12.1: Overlapping PCB traces. (Traces A, B and C.)

12.1.2 Transit time

The transit time becomes important for high speed lines running at 100 MHz or faster. In its simplest form, transit time effects could effectively reduce the period of each cycle and apply stress on some portions of the circuit. Imagine a situation as shown in figure 12.3. There are two chips on the board, one is sending and the other is receiving. They both use the same clock C. In figure 12.4, the setup time that is available is the first half of the clock cycle. During this time the sender has to drive the line capacitance and the load at the input of the receiver and pull the line high. Now, when you order a part you get a range from fast to slow, i.e. you may get a sender that drives a lot of current, and so it is fast; on the other hand you may get a sender that drives the minimum spec current and requires almost the entire setup time to pull the line high. In fact even the input of the receiver will have some variation, and you may get a receiver that has a small input capacitance and a small diode current or you may get a receiver with a large input capacitance and a large diode current. So both the sender and the receiver decide how long a setup time is needed.

Now in figure 12.4, the signal F is the situation when the driver and receiver are fast. Here the setup time used is small and even after delaying the signal by 150 ps to get DF, the line has already been pulled high well before the evaluate begins. On the other hand, for the case where the sender and receiver are slow, the signal S is pulled high just before the evaluate, however the 150 ps delay added by the line transit time causes the setup time to exceed a half cycle and so the signal DS is not ready to be evaluated when it should be. So what this means is that the designer needs to expect the setup time to be reduced by the transit time.

Figure 12.2: Parasitic capacitances of the PCB traces. (Traces A, B and C with parasitic capacitors between them.)

Figure 12.3: Logic circuit with long data line. (A Send chip and a Receive chip sharing the clock C, connected by a line with a 150 ps transit time.)

12.1.3 Line impedance

If the signals you wish to transmit are high speed and if timing is critical then designers always choose to use transmission line style routing. What this means is that there is a ground or power plane above or below your trace. This type of transmission line is called a microstrip. In fact the PCB traces are not genuine microstrips. The reason is that a microstrip has one side open and a ground plane on the other side, but in a real life PCB, although you may have a ground plane below (or above) the trace, above it on the next level you may have other traces running parallel to your trace and these other lines will affect the impedance. Also, although the effect is not that high, you will get some coupling from the traces on either side of your trace. There are many good books written about transmission lines so we will just talk about the things we are interested in.

Figure 12.4: The effect of transit time delay. (Waveforms for the clock C, the slow signal S, the fast signal F, their delayed versions DF and DS, and the Evaluate point.)

A transmission line transmits a signal. In figure 12.5 the upper circuit shows what a slice of transmission line looks like. If you take a slice half that length, just halve the inductance, capacitance and resistance. Normally we ignore the resistance because if it is large enough to consider then we open ourselves to many other problems, so just assume that the thickness and width of the copper trace is sufficient to give a very low resistance. So we are left with the inductance and the capacitance. The inductance is what gives the transmission line directivity. It is not that inductance is special compared to capacitance, it is just that the capacitor in the equivalent circuit is in a parallel or shunt configuration so there is no difference between left and right, but the inductance is in a series configuration so it does know whether the current is flowing in from the left or the right. We have already discussed inductors and what we know is that the current cannot change abruptly and that the current that flows is proportional to the integral of the voltage applied and is in the direction from positive to negative voltage.

Figure 12.5: The equivalent circuits of a transmission line. (Either end can serve as the input or the output.)

So assume that the left end of the transmission line is pulled high as shown in figure 12.6. The voltage V1 is initially 0, i.e. it is at ground potential. This is because the voltage across a capacitor (in this case C1) cannot change instantaneously. The entire supply voltage is applied across L1. As time passes the current through L1 increases from 0 as the integral of the voltage across it. The voltage V1 also increases from 0 toward the supply voltage as the current through the inductor L1 charges the capacitor C1. As V1 rises above 0, the voltage across L2 increases from 0 and therefore the current through L2 increases from 0 and therefore the voltage V2 rises from 0 as the current through L2 charges the capacitor C2. This sounds like V1 gradually increases from 0 to supply but in fact that is not true. Remember that for the perfect transmission line the equivalent circuit of figure 12.5 really consists of an infinite number of LC sections that are infinitely small, so in fact we need to think of L and C as reducing to smaller and smaller numbers, so the charging of C1 is essentially instantaneous and therefore the charging of C2 is essentially instantaneous as well and so on.

So to sum it up, any voltage you apply to one end of the line propagates to the other end of the line. Now keep in mind another fact, namely that just as you cannot start the current flow through L1 instantaneously, you cannot stop the current flow through L1 instantaneously, and this property of inductance is what causes the signal to propagate from one end of the line toward the other, in this case from left to right. This also means that if you turn off the PFET that was pulling up the left end of the line, the wave that you initially started continues to propagate toward the right. So if the line in figure 12.6 is 150 ps long and you turn on the FET for 25 ps, then a 25 ps pulse travels from left to right. In other words the voltage waveform has a start and a finish. The propagation delay per unit length is proportional to $\sqrt{LC}$ (so the speed of the pulse is proportional to $1/\sqrt{LC}$) and the impedance offered is proportional to $\sqrt{L/C}$. The most common impedance that is used for PCB traces is the 50 Ω standard. However occasionally designers use 75 Ω as their design point as a way to increase the speed by reducing the current required.

Figure 12.6: Propagation along a transmission line. (Series inductors L1 and L2, shunt capacitors C1 and C2, node voltages V1 and V2, with propagation from left to right.)
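The two square-root relations can be tried with numbers. The per-meter inductance and capacitance below are assumed values chosen to give a 50 Ω line; they are not taken from the text, and `line_parameters` is a name invented for this sketch.

```python
import math

def line_parameters(l_per_m, c_per_m):
    """Characteristic impedance (ohms) and delay per meter (seconds) of a lossless line."""
    z0 = math.sqrt(l_per_m / c_per_m)
    delay_per_m = math.sqrt(l_per_m * c_per_m)
    return z0, delay_per_m

# assumed example values: 350 nH per meter and 140 pF per meter
z0, delay_per_m = line_parameters(350e-9, 140e-12)
delay_per_inch = delay_per_m * 0.0254

print(f"Z0 = {z0:.1f} ohm")
print(f"delay = {delay_per_m * 1e9:.2f} ns per meter, {delay_per_inch * 1e12:.0f} ps per inch")
```

Raising L and lowering C by the same factor raises the impedance without changing the delay, which is why the two quantities can be designed somewhat independently.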

What happens when the signal reaches the other end of the line? Well, it has to transfer all its energy to the load. We have already discussed the maximum power transfer theorem and we know that the load has to match the impedance of the transmission line in order for the transmission line to simply transfer its energy to the load. If the load does not match the impedance of the transmission line, then the excess energy has to be reflected back as another wave. If the load impedance is higher than that of the transmission line, then the reflected wave has the same polarity as the incident wave, i.e. a positive pulse is reflected as a smaller but still positive pulse and a negative pulse is reflected as a smaller but still negative pulse. However if the load impedance is smaller than that of the transmission line, then the reflected wave has the opposite polarity, i.e. a positive pulse is reflected as a smaller negative pulse and a negative pulse is reflected as a smaller positive pulse.
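The size and polarity of the reflection are usually expressed with the reflection coefficient, the standard textbook ratio (Z_load − Z_line)/(Z_load + Z_line). This formula is not given in the text above and is added here only as an illustration; the function name is made up.

```python
def reflection_coefficient(z_load, z_line=50.0):
    """Fraction of the incident voltage that is reflected back by the load."""
    return (z_load - z_line) / (z_load + z_line)

for z_load in (25.0, 50.0, 100.0):
    gamma = reflection_coefficient(z_load)
    print(f"load {z_load:5.1f} ohm -> reflected fraction {gamma:+.3f}")
```

A matched load reflects nothing, a higher load impedance gives a positive (same polarity) reflection, and a lower load impedance gives a negative (opposite polarity) reflection, in agreement with the description above.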


12.1.4 Electro-static discharge or ESD

Due to chemical reactions that take place in our body and also due to static electricity we pick up as we move, all our bodies have built up electrical imbalances with our surroundings. When we touch any object, there is a transfer of charge. The energy is stored in capacitances. When we touch the pins of a micro-chip, charge flows between us and the circuits connected to those pins. The time that this is a problem is when the pin is connected to a gate inside the chip, because the gate oxide is designed for use with very low voltages of about 2.5 V and the transfer of charge could easily create a voltage far in excess of 2.5 V. At these high voltages the charge just crosses the gate oxide and redistributes itself inside the chip, and in doing this will destroy that gate oxide.

To protect against this happening, almost all pins in micro-chips that we buy are protected by ESD circuits. The primary circuit is as shown in figure 12.7. It is comprised of two back to back diodes, reverse biased from the inputs to the supply and ground connections, as shown in the dotted box. When the pin voltage drops more than a diode drop lower than ground potential or rises more than one diode drop above the supply voltage Vdd, the reverse biased diode will turn on and conduct the incoming charge safely to the supply or ground. Of course for a chip that is not connected to anything the ground or supply potential will be a floating potential, but even so the protection will work because everything is referenced off that voltage whatever it is.

Figure 12.7: Pin showing ESD protection. (The pin is connected through diodes to Vdd and to ground, ahead of the CMOS logic.)

12.1.5 Line drivers

The simplest output circuit or line driver is basically a set of approximately 4 or 6 inverters cascaded, with an ESD circuit slapped on at the output. The ratio of the widths of successive drivers is kept as close to e as possible, where e is about 2.718. So:

$$\frac{W_4}{W_3} = \frac{W_3}{W_2} = \frac{W_2}{W_1} \approx e$$
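As a small worked example of this ratio rule, the sketch below lists the widths of a four-stage chain. The starting width of 1 unit and the function name `chain_widths` are assumptions for illustration.

```python
import math

def chain_widths(first_width, stages, ratio=math.e):
    """Widths of cascaded inverters, each wider than the previous one by the same ratio."""
    return [first_width * ratio ** i for i in range(stages)]

widths = chain_widths(1.0, 4)          # W1, W2, W3, W4 in units of W1
for i, w in enumerate(widths, start=1):
    print(f"W{i} = {w:.2f}")
```

With four stages the final driver is roughly 20 times wider than the first, so a small internal gate can end up driving a large off-chip load.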

Another characteristic is that the ohmic resistance of the PFET and the NFET of the final stage, when fully turned on, should be approximately equal to the impedance of the line that the line driver is driving. This is so that any reflected signals see a matching termination when they arrive back at the line driver.

Figure 12.8: Output driver circuit. (Cascaded inverters of widths W1, W2, W3 and W4 between Vdd and ground, driving the pin.)

12.1.6 Line terminations

Obviously we don’t want any reflected waves. The goal of modifying the input load is simple. We already have some input impedance and we want to add either a series resistor or inductor, or a shunt resistor, to make the input appear to be 50 Ω. The way to decide if you need a series or shunt resistor is to look at the impedance at the pin looking into the chip. If that impedance is greater than 50 Ω then you need a shunt resistor. If it is less than 50 Ω then you need a series resistor. When calculating the impedance looking into the chip input, simply short the power supplies because they are assumed to have zero internal resistance.

In figure 12.9 and figure 12.10 the capacitances C1 and C2 inside the dotted line are the reverse biased diode capacitances, which are actually active capacitances, not really passive, because as either diode moves toward turn-on its capacitance increases with almost a square law dependency. The capacitances C3 and C4 are the gate capacitances of the PFET and the NFET to source. Again C3 and C4 are active capacitances because they depend on the level of inversion. The inductance L1 is the self-inductance of the fine bond wire that is used to connect the pin to the bond pad on the chip itself. In reality, as the chips are getting smaller, mutual inductance between the bond wires also becomes important. To combat this, chips that are used in certain types of applications use a special tape that is a signal ground transmission line pair at 50 Ω in place of the bond wire.

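Figure 12.9: Series terminated chip input. (The transmission line drives the pin through the series resistor Rt; L1 is the bond wire inductance, C1 and C2 are the diode capacitances to Vdd and ground, and C3 and C4 are the gate capacitances.)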

In figure 12.9 the series termination resistor is equal to the difference of the transmission line impedance and the input impedance at the chip input, i.e.

$$R_t = 50\,\Omega - \left| j\omega L_1 - \frac{j}{\omega (C_1 + C_2 + C_3 + C_4)} \right|$$

Figure 12.10: Shunt terminated chip input. (The same input circuit as in figure 12.9, with the resistor Rt connected in shunt where the transmission line meets the pin.)

In figure 12.10 the shunt termination resistor is obtained such that the parallel combination of the shunt resistor and the input impedance at the chip input is equal to the impedance of the transmission line. So, writing $|Z| = \left| j\omega L_1 - \frac{j}{\omega (C_1 + C_2 + C_3 + C_4)} \right|$ for the magnitude of the input impedance:

$$R_t = \frac{50\,\Omega \cdot |Z|}{|Z| - 50\,\Omega}$$

Impedance, unlike resistance, is a function of frequency, and in the two equations for Rt we substituted ω which is 2πf, so we need to decide what f to use. Now, the Fourier transform of a square wave contains appreciable energy at the fundamental frequency f, at 3f and at 5f, in that order. Board designers use a quantity called the knee frequency, which is the 3f. So if the signal is a 10 MHz signal, the knee frequency is 30 MHz. So we just use the 3f frequency to calculate the Rt required and use it in the correct configuration. What this should do is match the impedance looking into the chip with the impedance of the transmission line, so there should be much less reflection of energy.
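Putting the pieces together, the sketch below evaluates the input impedance magnitude at the knee frequency and picks the series or the shunt formula. The bond wire inductance and the capacitance values are assumed numbers for illustration only, and the function names are invented for this sketch.

```python
import math

def input_impedance_magnitude(f_signal, l_bond, c_total):
    """|j w L - j / (w C)| evaluated at the knee frequency 3f."""
    w = 2 * math.pi * 3 * f_signal
    return abs(w * l_bond - 1.0 / (w * c_total))

def termination(f_signal, l_bond, c_total, z_line=50.0):
    """Return ('series' or 'shunt', Rt in ohms) following the two formulas for Rt."""
    z = input_impedance_magnitude(f_signal, l_bond, c_total)
    if z < z_line:
        return "series", z_line - z
    if z > z_line:
        return "shunt", z_line * z / (z - z_line)
    return "none", 0.0

# assumed example: 10 MHz signal, 5 nH bond wire, C1 + C2 + C3 + C4 = 10 pF
kind, rt = termination(10e6, 5e-9, 10e-12)
z_in = input_impedance_magnitude(10e6, 5e-9, 10e-12)
print(kind, round(rt, 1), "ohm for an input impedance of", round(z_in, 1), "ohm")

# the same input with a much larger assumed capacitance of 1 nF
kind_big_c, rt_big_c = termination(10e6, 5e-9, 1e-9)
z_big_c = input_impedance_magnitude(10e6, 5e-9, 1e-9)
print(kind_big_c, round(rt_big_c, 1), "ohm for an input impedance of", round(z_big_c, 1), "ohm")
```

In the first case the input looks like a high impedance, so a shunt resistor a little above 50 Ω brings the parallel combination down to 50 Ω; in the second the input impedance is only a few ohms, so a series resistor makes up the difference.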

12.1.7 Impedance variation

One fact that all designers, whether chip designers or PCB designers, have to live with is that every process has built-in variation. So, even if you design the perfect PCB, you will always have a significant percentage of boards where there is a difference between the line impedance and the impedance of the driver circuit (including its termination if any) and between the line impedance and the impedance of the receiver input (including its termination if any). So what is the effect of this mismatch on the signals that travel along the transmission line connecting these mismatched chip to transmission line pairs? Normally this effect is easy to determine: just apply a high impedance probe to either end and you should see a pattern as shown in figure 12.11. It is normally called a staircase pattern. The probe you use could affect the mismatch, but the thing to look for is the spacing in time between the steps in figure 12.11. This spacing is twice the transit time between the two points of reflection, so you can tell where the other mismatch is. Note however that even when you determine the cause of the mismatch it is very bad policy to try to fix it on that particular board. Such decisions should only be made to the design as a whole, i.e. to effect a change to all boards of that design.

Figure 12.11: Staircase patterns on PCB lines due to impedance mismatches. [Horizontal axis: Time.]

12.1.8 Cross coupling

In figure 12.12 we have a chip on the left sending a set of signals to the chip on the right. For example maybe the chip on the left is a DRAM and it is sending a byte of information. So assume that line 1 and line 2 are transmitting 0, line 3 is transmitting 1, and line 4 and line 5 are transmitting 0. Normally even if there is a certain amount of cross coupling between the lines, there is no problem. The problem is statistical in nature. Companies that build boards build them in large volumes and they use parts from different parts bins stocked with chips from different manufacturers. So for every few hundred boards you build, a situation will arise where the receiver chip, i.e. the one on the right in figure 12.12, contains PFETs on the fast corner and NFETs on the slow corner.

What this means is that this receiving chip will struggle to recognize 0s. And of every few hundred boards built, some will also have senders with PFETs on the fast corner and NFETs on the slow corner. Such boards may fail stringent testing but may pass a light standard of testing where only a small number of tests are run. In such boards, when line 3 pulls high, it will cause cross coupling effects that try to pull lines 2 and 4 up a little. Again we should note that normally this would not be a problem, but in the specific cases we talked about, with weak NFETs and strong PFETs, this cross coupling will fight the sender’s pull-down current, will act to confuse the receiver, and will succeed occasionally. This again is really a design issue: if, instead of using only the chip specs of the sender and receiver chips, the designer also adds a little margin in the calculation to account for the effect of cross-coupling, then the problem goes away. Keep in mind that with profit margins as slim as they are, trying to enforce quality control through testing and binning only will invariably fail because no manufacturer can afford really stringent testing.

Figure 12.12: Cross coupling between PCB traces. [Trace labels: 1, 2, 3, 4, 5.]

12.1.9 Antenna effect

The worst thing you can have in a PCB design is a trace that is connected to something on one end but is not connected to anything on the other end. If this kind of trace is long enough, it can pick up electromagnetic energy just like an antenna. This means that the voltage on the traces connected to this stub will fluctuate, and this fluctuation can cause a high to appear as a low or a low to appear as a high. Usually this kind of trace occurs when a board designer changes his/her mind about where to place some component and forgets to remove the old trace.

12.1.10 Ground bounce

Ground bounce is a phenomenon that has killed many a good design and left the designers and testers alike scratching their heads wondering what the heck is going on. Usually in situations like this the designers are frantically looking up the specs for the chips they used, trying to see if the current drive was sufficient to drive the length of the transmission line used. But the answer with ground bounce has nothing at all to do with the chip specs and has everything to do with the PCB, as shown in figure 12.13. It looks normal: you think the supply and ground lines are a little long, but the trace resistances are low, so no big deal. The picture becomes clearer when you look at figure 12.14, which is what the circuit sees when the output of the chip turns on and is trying to drive a large current.

Figure 12.13: Circuit that misbehaves due to ground bounce. [Schematic labels: Long ground line, Chip, Small output current.]

Figure 12.14: The effective circuit seen by the chip. [Schematic labels: Supply voltage, V1, Lg, Current, Chip, Small output current.]

The long ground line is actually an inductor due to its self-inductance, but normally you won’t see this inductance Lg because the chip is drawing a steady current, and as long as the current stays approximately the same there is no voltage drop across Lg. The instant the output driver is turned on, however, the current required by the chip from the power supply increases quite a bit. The current through Lg cannot change instantaneously, so a voltage develops across Lg and the actual voltage V1 seen by the chip is less than the supply voltage by the amount of voltage that is dropped across Lg. Since the effective supply voltage V1 is lower than normal, the output drivers cannot drive a sufficient current. This phenomenon of effective reduction of the supply voltage V1 is called ground bounce because the ground appears to bounce whenever the output is turned on. The solution to ground bounce is quite simple and is shown in figure 12.15: all that needs to be done is to buffer the supply with a small capacitor. Remember that the voltage across a capacitor cannot change instantaneously, so when the output drivers turn on, the surge in supply current that is required by the chip is supplied by the buffer capacitor, and once the drivers turn off this energy is replenished from the supply line. So effectively the buffer capacitor isolates the chip from the ground inductance and the circuit works fine.
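To get a feel for the size of the effect, the voltage across Lg follows from V = Lg · dI/dt. The sketch below is a rough illustration only: it assumes the current rises linearly over the switching time, and the capacitor estimate C = ΔI · Δt / ΔV is a common first-order rule that is not taken from this text. All numbers are hypothetical.

```python
def ground_bounce_volts(l_ground_henry, delta_i_amp, delta_t_sec):
    """Voltage developed across the ground inductance, V = Lg * dI/dt."""
    return l_ground_henry * delta_i_amp / delta_t_sec

def buffer_capacitance(delta_i_amp, delta_t_sec, allowed_droop_volts):
    """First-order size of a buffer capacitor that supplies the current
    step for delta_t while drooping by no more than allowed_droop_volts."""
    return delta_i_amp * delta_t_sec / allowed_droop_volts

# Hypothetical numbers: 10 nH of ground trace, a 100 mA step in 1 ns.
v_drop = ground_bounce_volts(10e-9, 0.1, 1e-9)
c_buf = buffer_capacitance(0.1, 1e-9, 0.05)
print(f"{v_drop:.2f} V across Lg, about {c_buf * 1e9:.1f} nF of buffering")
```

Even a modest trace inductance produces a drop that is a large fraction of a typical supply voltage when the edge is fast, which is why the effect is easy to miss when only the trace resistance is considered.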

Figure 12.15: The solution to ground bounce. [Schematic labels: Supply voltage, Lg, Current, Chip, Normal output current.]

12.1.11 Ringing

Ringing is not really a technical term, rather just a colloquial term to describe an effect that is very commonly encountered when transferring a signal from a source to a destination via a transmission line. Its effect can cause a perfectly normal signal to be read incorrectly. In figure 12.16 you see a driving circuit, i.e. the one sending the signal, and the receiver circuit, which has a load capacitance. If the resistance of the transmission line is not sufficient to damp the driving current, it will cause overshoots and undershoots at the load capacitance at the input of the receiver, as shown in figure 12.17. This type of circuit is called under-damped.

Figure 12.16: Driving point inductance and load capacitance. [Schematic labels: Vin, Vout.]

What happens is that as the current charges the capacitor, it is storing energy in the inductor, and even when the capacitor is fully charged the inductance forces the current to continue flowing as the energy in the inductor dissipates. This extra current flow causes the voltage at the capacitor to go beyond what it otherwise would. The problem, as seen in figure 12.17, is that as energy is swapped back and forth between the inductor and the capacitor, the voltage at Vout crosses the Hi and the Lo signal levels more than once. This confuses the receiver, and whether the correct logic level of 1 or 0 is read really depends on when the signal at Vout is sampled by the receiving circuit. These ripples are called ringing. Proper termination will remove this problem.

Figure 12.17: Overshoot (left) and Undershoot (right). [Level labels: Hi, Lo.]
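Whether a given line rings can be estimated by treating the driving point inductance, the line resistance and the load capacitance as a simple series RLC circuit. That lumped model is an assumption made here for illustration, and the component values below are hypothetical. The damping ratio is ζ = (R/2)·√(C/L); the circuit is under-damped when ζ < 1.

```python
import math

def damping_ratio(r_ohm, l_henry, c_farad):
    """Damping ratio of a series RLC circuit: zeta = (R / 2) * sqrt(C / L)."""
    return 0.5 * r_ohm * math.sqrt(c_farad / l_henry)

def overshoot_fraction(zeta):
    """Peak overshoot of the step response as a fraction of the step size.
    Zero when the circuit is critically damped or over-damped."""
    if zeta >= 1.0:
        return 0.0
    return math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta * zeta))

# Hypothetical values: 10 ohm of resistance, 10 nH, 10 pF load.
L_DRV, C_LOAD = 10e-9, 10e-12
zeta = damping_ratio(10.0, L_DRV, C_LOAD)
r_critical = 2.0 * math.sqrt(L_DRV / C_LOAD)  # resistance giving zeta = 1
print(f"zeta = {zeta:.3f}, overshoot = {overshoot_fraction(zeta):.0%}, "
      f"critical R = {r_critical:.1f} ohm")
```

In this simplified picture, adding series resistance until R reaches the critical value removes the overshoot, which is one way to read the statement that proper termination removes the problem.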

12.2 Spread spectrum technology

Believe it or not, PCBs generate a certain amount of radio energy that is regulated by the FCC. The reason is that clock speeds and therefore signal speeds are getting so high that the overtones approach the RF frequencies used. For the most part these high frequencies are generated due to the dV/dt and the dI/dt at the signal edges. As shown in figure 12.18, if the sudden surge of current at the onset of the signal going high or going low is reduced, the amount of high frequency energy generated is reduced. But spread spectrum technology (SST) is more than just that. In SST the shape of the signal edge is varied even between different pins of the same chip, especially if there are many signals going in and out. The reason is that even if you slow down these current surges, the high frequency energy generated from the signals of different pins may all be at the same frequencies and so start to add up. But if you vary the rate of current change to be different for different signals from the chip, then their high frequency energies don’t add up and are spread over the high frequency spectrum, hence the name spread spectrum technology. So the name of the game in SST is to vary the output drivers’ current so that the rate of current change is more varied across the outputs of each chip.

Figure 12.18: Reducing the rate of voltage and current change. [Panel labels: Less high frequency energy; Lots of high frequency energy.]

12.3 Input/Output or IO circuits

This chapter would not be complete without a discussion of an IO circuit. Figure 12.19 shows a simple Input/Output or IO circuit. It is called an IO circuit because it can either read in from the pin or it can output to the pin. In fact it is usually a tri-state device because it can be set to a third state wherein it neither reads in nor writes out but instead has a high impedance looking into the chip. Table 12.1 shows the values of C1 and C2 required to control the state of the IO circuit.

C1   C1b  C2   C2b  State
0    1    1    0    Output
1    0    0    1    Input
1    0    1    0    High impedance

Table 12.1: States of the IO circuit.

Figure 12.19: A simple IO circuit. [Schematic labels: Pin, Vdd, Ground; pass-gates PG1, PG1A, PG2, PG2A; control signals C1, C1b, C2, C2b; nodes S1 to S6; W1 to W4.]

In figure 12.19, the boxes marked PG1, PG1A, PG2 and PG2A are called pass-gates and are discussed below. For the IO circuit itself, when the signal pair C1 and C1b turns on PG1 and PG1A and at the same time the signal pair C2 and C2b turns off PG2 and PG2A, the effect is to allow the signal inside the chip to propagate out to the pin through the output driver circuit. But when the signal pair C1 and C1b turns off PG1 and PG1A and at the same time the signal pair C2 and C2b turns on PG2 and PG2A, the effect is to cause the signal at the pin to propagate in through the input buffer and proceed into the chip, i.e. the signal is an input. If you turn off PG1, PG1A, PG2 and PG2A all at the same time, then the pin is isolated from the signal inside the chip, signals cannot propagate either in or out, and the isolation is high impedance. Table 12.2 shows the purpose of the four pass-gates used in figure 12.19.
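The control logic of Table 12.1 can be written out as a small lookup. This is a sketch for illustration, not part of the circuit: the function name `io_state` is made up, and the complements C1b and C2b are derived from C1 and C2 rather than supplied separately.

```python
def io_state(c1, c2):
    """Map control bits C1 and C2 to the IO circuit state of Table 12.1."""
    states = {
        (0, 1): "Output",          # PG1/PG1A on, PG2/PG2A off
        (1, 0): "Input",           # PG1/PG1A off, PG2/PG2A on
        (1, 1): "High impedance",  # all four pass-gates off
    }
    if (c1, c2) not in states:
        # C1 = C2 = 0 would turn on both pairs at once; the table omits it.
        raise ValueError("combination not listed in Table 12.1")
    return {
        "C1": c1, "C1b": 1 - c1,
        "C2": c2, "C2b": 1 - c2,
        "state": states[(c1, c2)],
    }

for bits in [(0, 1), (1, 0), (1, 1)]:
    print(io_state(*bits))
```

Rejecting the unlisted combination rather than guessing a state mirrors the table, which only defines three of the four possible settings.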

Mode    Pass-gate  Bad effects if permanently on or removed
Input   PG1        The driver inverter chain will amplify the signal on S1 and so S4 will move up and down causing noise on S6. A lot of power is wasted.
Input   PG1A       Since the output driver chain is so strong, the signal it is driving at S4 will completely overcome S6, so inputs will never make it into the chip.
Output  PG2        If this pass-gate is removed it will create some noise because the buffer will follow S6.
Output  PG2A       Similar to removing PG1A, removing PG2A will cause the signal to become clamped. Here signal S1 gets clamped to the value of S3, i.e. the signal driven by the input buffer.

Table 12.2: The purpose of each pass-gate.

Chapter 13

Automatic Test Equipment

Only a small fraction of all chips are tested. Typically a lot contains 25 wafers. Each wafer contains 10 - 20 chips. So if one chip per lot is tested, that would be normal. But a high end chip such as a microprocessor may have 400 pins, so the number of possible input combinations is incredibly large. So again it is not possible to test all of the chip’s functionality.

Figure 13.1: An ATE setup. [Block labels: DUT board, Pin driver boards, Tester boards, Computer.]

A typical ATE setup is shown in figure 13.1. The DUT is mounted on the DUT board which is mounted over the pin driver boards, and this whole unit is called the test head. The test head is connected to a cabinet full of tester boards by 50 Ω or 75 Ω cables. There is a lot of data being transmitted from the test head to the tester boards and vice versa.

Suppose there are 100 pin driver boards in a test head. Assume each board is collecting information for four pins of the DUT, with each cycle divided into 5 slices. Assume that the DUT is being tested at 4 GHz. This means the test head is collecting 100 × 4 × 5 × 4 × 10^9 = 8 × 10^12 pieces of information per second.
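The same arithmetic, written so the assumptions can be changed easily. The variable names are illustrative only.

```python
# Figures taken from the example above.
boards = 100                   # pin driver boards in the test head
pins_per_board = 4             # DUT pins handled by each board
slices_per_cycle = 5           # time-slices per test cycle
test_rate_hz = 4_000_000_000   # DUT tested at 4 GHz

samples_per_second = boards * pins_per_board * slices_per_cycle * test_rate_hz
print(f"{samples_per_second:.1e} pieces of information per second")
```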

Because such a large amount of information needs to be processed, you cannot connect the test head directly to the computer and you need a cabinet full of tester boards to interact with the test head. As a general rule the chips used tend to get cheaper as you move away from the DUT toward the computer.

13.1 DUT board

The Device Under Test is mounted on a board which is usually circular, as shown in figure 13.2. The DUT is placed in a chip socket in the center of the board and the connections to the pins radiate outward to the edge of the board. The socket used is usually specially created for each chip and is designed for good contact without the need for solder. Often there may be a chip cover which exerts force on the chip, pressing it against the connections in the socket.

Figure 13.2: The DUT board of a tester.

The DUT board needs to be connected to as many as a hundred pin driver boards. These pin driver boards may need to be changed occasionally. So at the edge of the DUT board are coaxial connectors as shown in figure 13.3. When the DUT board is placed over the 400 connectors from the pin driver boards and pressed down, the two sets of connectors mate and a connection is made.

Figure 13.3: The pin driver boards connecting to the DUT board. [Labels: DUT board connection, Tester board connection.]

From the connector on the pin driver board to the inside of the chip, the equivalent circuit may appear as shown in figure 13.4. Ideally you want the PCB trace on the DUT board to be as short as possible.

Figure 13.4: Trace from pin driver board. [Schematic labels: 50 ohms, 50 ohms, Parasitics from connectors, 10 ps.]

13.2 Main computer

The tester boards, the pin driver boards and the DUT board perform specific logic and they do not use microprocessors. So a computer is the starting point for converting the test requirements into the instructions sent to the testing boards. It does not actually participate in the testing but it directs it.

From the point of view of the tester, the behavior of the DUT’s pin is usually looked at from an event driven standpoint. The reason is that when a pin does malfunction its behavior is really complex, so the tests that are run usually pinpoint specific questions to be answered. So the testing is split up on two fronts: one is time, i.e. when a test is run, and the other is IO behavior, i.e. what test is run.

Figure 13.5: How a cycle is divided. [Time markers: T1, T2, T3, T4, T5, T1.]

In the example in figure 13.5 the output pin is shown to go high for some period of time and then go back to low. The cycle is divided into 5 time-slices as shown in figure 13.5, with the cycle starting at T1 and ending at the next T1. The interval from T1 to T2 is the time during which you expect that the output will not change, i.e. T2 is the earliest that the chip spec specifies that the output can change. Between T2 and T3 you do expect the output to go high. Between T3 and T4 you expect the output to stay high. Between T4 and T5 you expect the output to pull low, and from T5 to the next T1 you again expect no change to the output. This is shown in table 13.1.

time-slice  start-time  end-time  action at DUT pin
1           T1          T2        high impedance
2           T2          T3        0 → 1 transition
3           T3          T4        hold at 1
4           T4          T5        1 → 0 transition
5           T5          T1        high impedance

Table 13.1: Setting up a test.

13.3 Tester boards

The main function of the tester boards is to isolate the controlling computer, which can only handle a small amount of data, from the test head, which is generating large amounts of data.

The computer used is usually a fast 64-bit workstation. In this computer there may be placed one or more cards with a dedicated mode of communication with the cabinet full of tester boards. So the side of the tester board that interacts with the computer may have chips that do this communication. Between this side of the tester board and the other side, which communicates with the pin driver boards, all the logical processing needs to be done.

The type of information that needs to be sent to the pin driver boards must include the timing information for each edge, the voltage that needs to be applied at each timing edge, whether the pin is expected to be an input or an output, the drive current if it is an input, the load impedance if it is an output, the result, i.e. whether the pin passed or failed, and a failure code if applicable.

13.4 Pin driver boards

Figure 13.6: Tester circuit leading to DUT pin. [Block labels: Instructions, Logic chip, n, Main Sync, timing edges T1 to T5, Pin Driver (Analog), 15 ps @ 50, DUT pin, Failure data.]

The logic chips shown in figure 13.6 perform the interaction between the test head and the tester boards, and they output the binary bits that are used by the timing chips. They also output the control bits for the pin driver chip, i.e. its operating mode and everything except the timing edges.

13.4.1 Timing generation chip

The timing chip is the one in the center of figure 13.6. The timing generation is divided into two cascaded sections, the coarse delay and the fine delay. Suppose you wish the tester to be able to test circuits at frequencies of 1 GHz and above. Then each of the timing edges may be anywhere from 0 to 1 ns from the start of the cycle.

Suppose you need the timing accuracy to be 1 ps. This will require a 10 bit timing instruction. Then the coarse delay will be designed to output edges at 0 ps, 128 ps, 256 ps, 384 ps, 512 ps, 640 ps, 768 ps and 896 ps. The fine delay will be capable of outputting delays of 0 - 127 ps in units of 1 ps.

Figure 13.7: The fine delay. [Schematic labels: Vdd, T, R, DAC0 − 7, + and − inputs.]

The coarse delay is generated by a Delay Locked Loop as shown in figure 10.15. Essentially it divides a 976.56 MHz clock cycle into 8 parts. The selection of which edge is to be used is made by the top three bits, i.e. bits 8, 9 and 10.
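The split of the 10-bit timing instruction into its coarse and fine parts can be sketched as follows. The function name and the return format are illustrative assumptions; the bit assignment (top three bits for the coarse edge, lower seven bits for the fine delay) is the one described in this section.

```python
def split_delay(code):
    """Split a 10-bit delay instruction (1 ps per count) into the coarse
    edge in ps and the fine delay in ps."""
    if not 0 <= code < 1024:
        raise ValueError("delay instruction must fit in 10 bits")
    coarse_index = code >> 7   # top three bits select one of 8 coarse edges
    fine_ps = code & 0x7F      # lower seven bits give 0..127 ps
    return coarse_index * 128, fine_ps

coarse_ps, fine_ps = split_delay(300)
print(coarse_ps, fine_ps, coarse_ps + fine_ps)
```

A requested delay of 300 ps, for example, is served by the 256 ps coarse edge followed by 44 ps of fine delay.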

Figure 13.8: The actual ramp and the cross-over point. [Labels: T, 100 ps, DAC voltage.]

The fine part of the delay on the other hand is generated differently. It is purely analog in nature. It is usually based on what is known as a ramp circuit. Ramp circuits are derived from a larger family of circuits known as multi-vibrators. Multi-vibrators date back to when digital circuits were just being born. They are divided into Bi-stable and Astable families. The first of the two is stable in either of two states while the latter is stable in neither state and constantly switches states, giving rise to a square wave output. It is the Bi-stable family that we will look at, as shown in figure 13.7.

So the way that the fine delay works is that you first apply the reset signal R. This drives the output low and the circuit is ready to be triggered. Meanwhile you apply the lower 7 bits of the delay instruction to the input of the DAC, so the output of the DAC is the reference voltage at which the delay is set. The output signal from the coarse circuit is connected to the trigger input T of the fine circuit, so when the coarse signal arrives it triggers the fine delay and the output of the ramp starts to rise as shown in figure 13.8. When the ramp voltage becomes higher than the voltage from the output of the DAC, the operational amplifier which is comparing the two voltages changes state and its output goes from low to high.

So as you can see, by working together the coarse circuit and the fine circuit are able to convert a 10-bit instruction into a delay of 0 to 1023 ps, and they do this perhaps 4 billion times per second. The two metrics that are usually used to determine the quality of the timing chips are INL and DNL, meaning integral non-linearity and differential non-linearity. We talked about how the timing chip generates a delay based on the digital input you supply it. So let us suppose that we supply a sequence from zero to the maximum digital value allowed and plot the delay that we get, as shown in figure 13.9.

Figure 13.9: Integral and differential non-linearity. [Vertical axis: Delay obtained. Horizontal axis: Increasing digital input. Curve labels: obtained, expected, INL, DNL.]

In this figure assume that each step is supposed to be 4 ps. At zero, you were supposed to get a zero delay but instead you got 1 ps. So the DNL here is 1 ps. The INL is also 1 ps. At the next step you were supposed to get 4 ps, but instead you got 6 ps. The DNL is again 1 ps, but the INL is now 2 ps. So as you can see, the INL is the integral of the timing error up to that point, whereas the DNL is the error for each step. Notice that the INL in this figure increases up to 3 ps and then reduces back to zero. As a general rule people are more interested in the INL because it actually represents the error you are going to get in your testing. Ideally both INL and DNL should be zero but this rarely happens.
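The definitions above can be turned into a short calculation. In this sketch the first two measured delays (1 ps and 6 ps) are the ones from the text; the remaining three are hypothetical values chosen so that the INL rises to 3 ps and then returns to zero, as the figure is described. The function name is made up.

```python
def inl_dnl(measured_ps, step_ps):
    """Return (INL, DNL) lists for delays measured at codes 0, 1, 2, ...

    INL[k] is the accumulated error at code k (measured minus ideal).
    DNL[k] is the error contributed by step k alone, i.e. the change in INL.
    """
    inl = [m - k * step_ps for k, m in enumerate(measured_ps)]
    dnl = [inl[0]] + [inl[k] - inl[k - 1] for k in range(1, len(inl))]
    return inl, dnl

# Ideal delays would be 0, 4, 8, 12, 16 ps.
measured = [1, 6, 11, 13, 16]
inl, dnl = inl_dnl(measured, 4)
print("INL:", inl)
print("DNL:", dnl)
```

Summing the DNL values up to any code gives back the INL at that code, which is the sense in which the INL is the integral of the per-step error.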

13.4.2 Pin driver chip

The pin driver chip is more like a linear amplifier than an IO circuit. Its block diagram looks as shown in figure 13.10. The timing signals from the timing chips come in as differential signals. The bits setting the drive current and the output voltage come in as digital signals from the logic chip at the beginning of the pin driver boards.

The output driver is usually a push-pull stage. The pin driver chip will require a separate Vcc and Vee to allow the fast slew rate all the way down to 0 V and up to the supply voltage. For timing accuracy as well as drive current the pin driver chips are usually bipolar.

The bits that control the output voltage are used to control the upper and lower supply voltages that the push-pull stage actually sees, and the bits that control the slew rate simply set the drive current of the current sources driving the push-pull stage. The differential timing signals then switch the push-pull stage from one set of control bits to the next. A pin driver used in the industry is described in [112].

Figure 13.10: A pin driver block diagram. [Block labels: Push-Pull, Drive current 1, Output voltage 1, timing inputs T1, T2, T3.]

Chapter 14

MMICs

Monolithic microwave integrated circuits are usually expensive to build because on the one hand the devices need to be extremely fast, meaning the feature size has to be as small as possible, but on the other hand waveguides are fairly large and occupy a large area. For this reason most RF circuits are built using discrete elements placed on a high quality printed circuit board which also holds the waveguides. But as frequencies rise, and with the increase in the digital portions of the circuit, more and more functionality can be placed on a single chip, and the reduced complexity of the assembly offsets the increased cost of the chip, so MMICs are becoming larger.

14.1 Lumped and distributed elements

By physical reality all elements are distributed elements. But at low frequencies many devices can be modeled as ideal elements such as resistance, inductance and capacitance, or combinations of them, with almost no loss in accuracy. Then you can call them lumped elements.

At high frequencies however most devices show their distributed properties. Usually it is just a case of parasitics displaying themselves; for example a 1 µF capacitor at 1 kHz may appear as an RC network at 1 GHz. Bond wires connecting a chip pad to the chip pin exhibit their self inductance as the frequency is raised.

Sometimes parasitics have even more impact; for example two bond wires which are shorts at 1 kHz exhibit their mutual inductance at 1 GHz, and I/O energy may be coupled from one pin into another pin to which it is not even physically connected.

But in this chapter we are not looking at the effect of parasitics. We are looking at structures which are designed to be operated at microwave frequencies, structures whose physical dimensions are only several wavelengths at the operating frequency. When analyzing lumped circuits we were interested in current flow as a function of time, but here we are interested in the propagation of electromagnetic waves in microwave structures.

14.2 Maxwell’s equations

The four Maxwell’s equations are:

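$$\nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} \tag{14.1}$$

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{14.2}$$

$$\nabla \cdot \mathbf{D} = \rho \tag{14.3}$$

$$\nabla \cdot \mathbf{B} = 0 \tag{14.4}$$

Equation 14.1 is Ampere’s law, equation 14.2 is Faraday’s law and equation 14.3 is Gauss’s law.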

14.3 Transmission lines

The equation for the impedance looking into a transmission line of characteristic impedance Z0 and length l, terminated by a load ZL, is given by

$$Z = Z_0 \, \frac{Z_L + Z_0 \tanh\big((\alpha + j\beta)\, l\big)}{Z_0 + Z_L \tanh\big((\alpha + j\beta)\, l\big)} \tag{14.5}$$

α is the loss coefficient, so that e^{αl} gives the fraction transmitted, and β is the phase constant, so that e^{jβl} gives the phase change over a distance l. α is usually negative, but in the field of fiber optics there are special sections of optical cable which amplify the signal as it goes through them, which is a better alternative to using repeaters.
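Equation 14.5 is easy to evaluate numerically with complex arithmetic. The sketch below is illustrative: the function name and the 0.2 m wavelength are made up, and only the lossless case (α = 0) is exercised, so it does not depend on the sign convention chosen for α.

```python
import cmath
import math

def line_input_impedance(z0, zl, length, beta, alpha=0.0):
    """Input impedance of a line of the given length per equation 14.5.

    alpha and beta are per unit length; alpha = 0 models a lossless line.
    """
    t = cmath.tanh((alpha + 1j * beta) * length)
    return z0 * (zl + z0 * t) / (z0 + zl * t)

# Hypothetical example: lossless 50 ohm line, one eighth of a wavelength long.
wavelength = 0.2                       # metres, made-up value
beta = 2 * math.pi / wavelength        # phase constant in rad/m
z_short = line_input_impedance(50.0, 0.0, wavelength / 8, beta)
z_matched = line_input_impedance(50.0, 50.0, wavelength / 8, beta)
print("shorted line:", z_short)
print("matched line:", z_matched)
```

A matched load returns Z0 regardless of the line length, while the shorted eighth-wavelength line looks like a pure inductive reactance of magnitude Z0.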

Figure 14.1 shows a junction between two transmission lines of different characteristic impedances Z1 and Z2. The reflection coefficient for a wave on Z1 incident on the junction is given by equation 14.6. The transmission coefficient on Z2 is given by equation 14.7. T and Γ are for voltage, so from power conservation you get equation 14.8.

Figure 14.1: Reflection and transmission at a junction. [Labels: Z1, Z2, Incident, Reflected, Transmitted.]

$$\Gamma = \frac{Z_2 - Z_1}{Z_2 + Z_1} \tag{14.6}$$

$$T = 1 + \Gamma = \frac{2 Z_2}{Z_2 + Z_1} \tag{14.7}$$

$$|\Gamma|^2 + \left| T \sqrt{\frac{Z_1}{Z_2}} \right|^2 = 1 \tag{14.8}$$
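A quick numerical check of equations 14.6 to 14.8 for real impedances is sketched below. The 50 Ω and 75 Ω values are just an example, and the function name is illustrative.

```python
import math

def junction_coefficients(z1, z2):
    """Voltage reflection and transmission coefficients for a wave on a
    line of impedance z1 meeting a line of impedance z2."""
    gamma = (z2 - z1) / (z2 + z1)
    t = 1 + gamma
    return gamma, t

z1, z2 = 50.0, 75.0
gamma, t = junction_coefficients(z1, z2)
# Power conservation, equation 14.8.
power_sum = abs(gamma) ** 2 + abs(t * math.sqrt(z1 / z2)) ** 2
print(gamma, t, power_sum)
```

The factor √(Z1/Z2) is what converts the voltage transmission coefficient into a power fraction, since the same voltage carries less power on a higher impedance line.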

Nowadays the design of integrated waveguides is usually done using a partial differential equation (pde) solver. The reason is partly that computers have become so much more accessible, but also that the analytical solutions often require certain simplifying assumptions. A pde solver can solve any structure with equal accuracy, and the accuracy is often higher than an analytical approach can obtain. In a numerical solution you can also apply local variation in parameters, for example using a dielectric whose permittivity varies. But equations such as those in [113] were derived at a time when computers were not easily available and so they are very important.

Figure 14.2 shows a microstrip line. It is comprised of a signal line of width W and thickness t separated from a ground plane by a dielectric layer of height h. Most of the electric field goes through the dielectric under the signal line but some field lines go through the air, so the permittivity is an effective value between that of air and that of the dielectric material. The microstrip line characteristic impedance is given by equations 14.9, 14.10 and 14.11 from [114].

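$$\frac{W}{h} = \frac{8 \sqrt{\dfrac{X}{11}\left(7 + \dfrac{4}{\varepsilon_r}\right) + \dfrac{1}{0.81}\left(1 + \dfrac{1}{\varepsilon_r}\right)}}{X} \tag{14.9}$$

$$X = \exp\left(\frac{Z_0 \sqrt{\varepsilon_r + 1}}{42.4}\right) - 1 \tag{14.10}$$

$$Z_0 = \frac{42.4}{\sqrt{\varepsilon_r + 1}} \ln\left\{ 1 + \left(\frac{4h}{W}\right) \left[ \left(\frac{14 + 8/\varepsilon_r}{11}\right)\left(\frac{4h}{W}\right) + \sqrt{\left(\frac{14 + 8/\varepsilon_r}{11}\right)^2 \left(\frac{4h}{W}\right)^2 + \frac{1 + 1/\varepsilon_r}{2}\,\pi^2} \right] \right\} \tag{14.11}$$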
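Equations 14.9 to 14.11 form an analysis and synthesis pair: 14.11 gives Z0 from the geometry, while 14.9 and 14.10 give the W/h ratio needed for a target Z0. The sketch below codes both and checks that they agree with each other. The function names are made up, and the relative permittivity of 4.4 is a hypothetical value roughly typical of common board materials.

```python
import math

def microstrip_z0(w_over_h, eps_r):
    """Characteristic impedance in ohms from equation 14.11."""
    u = 4.0 / w_over_h                       # the 4h/W term
    a = (14.0 + 8.0 / eps_r) / 11.0
    b = (1.0 + 1.0 / eps_r) / 2.0 * math.pi ** 2
    inner = a * u + math.sqrt(a * a * u * u + b)
    return 42.4 / math.sqrt(eps_r + 1.0) * math.log(1.0 + u * inner)

def microstrip_w_over_h(z0, eps_r):
    """Width to height ratio for a target impedance, equations 14.9 and 14.10."""
    x = math.exp(z0 * math.sqrt(eps_r + 1.0) / 42.4) - 1.0
    root = math.sqrt(x * (7.0 + 4.0 / eps_r) / 11.0
                     + (1.0 + 1.0 / eps_r) / 0.81)
    return 8.0 * root / x

ratio = microstrip_w_over_h(50.0, 4.4)
z_back = microstrip_z0(ratio, 4.4)
print(f"W/h = {ratio:.3f}, which gives Z0 = {z_back:.2f} ohm")
```

The round trip is not exact to the last digit because equation 14.9 uses the rounded constant 0.81 where equation 14.11 has π²/8 in effect, but the difference is far below any practical manufacturing tolerance.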

Figure 14.2: A microstrip line. [Labels: Signal, Ground, W, h, t.]

Figure 14.3 shows a stripline. It is different from a microstrip because it has two ground planes instead of one and the field is contained in the dielectric. The characteristic impedance is given by equation 14.12 from [113], [115].

$$Z_0 = \frac{30}{\sqrt{\varepsilon_r}} \ln\left\{ 1 + \left(\frac{8h}{\pi W}\right) \left[ \left(\frac{16h}{\pi W}\right) + \sqrt{\left(\frac{16h}{\pi W}\right)^2 + 6.27} \right] \right\} \tag{14.12}$$

Figure 14.3: A stripline.

Figure 14.4 shows a slotline. It is comprised of a slot of width W in a single metalization layer over a dielectric of height h, and there is no ground plane on the other side of the dielectric. The electric field between the metal on either side of the slot has three paths to follow: through the dielectric, through the air of the slot, and through the air above the metalization. The characteristic impedance is given in [116] and [117]; however, the closed form solution is obtained in a piecewise manner, so there is a different equation for different ranges of W/h and εr.

Figure 14.4: A slotline. [Labels: W, h, t.]

Figure 14.5 shows a coplanar waveguide. It has a center strip which carries the signal and ground planes on either side, all of which are placed above a dielectric of height h.

Figure 14.6 shows a coplanar strip. It has two strips which carry the signal and ground and are placed on a dielectric of height h.

14.4 N-port circuits

The h, y and z parameters are based on figure 14.7. The y parameters are based on admittance because they all have the units of I/V, the z parameters have the units of impedance, and the h parameters are hybrid parameters with mixed units.

Figure 14.5: A coplanar waveguide. [Labels: W, h, t.]

Figure 14.6: A coplanar strip. [Labels: W, h, t.]

Figure 14.7: Two port network. [Labels: I1, I2, V1 (+/−), V2 (+/−).]

14.4.1 h-Parameters, y-Parameters & z-Parameters

$$V_1 = h_{11} I_1 + h_{12} V_2 \tag{14.13}$$

$$I_2 = h_{21} I_1 + h_{22} V_2 \tag{14.14}$$

So if you wish to obtain h11 you can short port 2, so that V2 = 0 and h11 = V1/I1; the same measurement gives h21 = I2/I1. Similarly, to obtain h12 measure the voltage V1 with the input open-circuited, so that I1 = 0 and h12 = V1/V2; the same measurement gives h22 = I2/V2. However, you can extract the h parameters even without short circuiting or open circuiting either port by simply fitting to multiple data points. The y-parameters are given by equations 14.15 and 14.16 and the z-parameters are given by equations 14.17 and 14.18.

$$I_1 = y_{11} V_1 + y_{12} V_2 \tag{14.15}$$

$$I_2 = y_{21} V_1 + y_{22} V_2 \tag{14.16}$$

$$V_1 = z_{11} I_1 + z_{12} I_2 \tag{14.17}$$

$$V_2 = z_{21} I_1 + z_{22} I_2 \tag{14.18}$$

14.4.2 S-Parameters

Figure 14.8: S-parameter measurement. [Wave labels: a1, b1, a2, b2, a3, b3.]

Unlike the h, y and z parameters, the S parameters do not require the use of short circuits or open circuits. Short circuits may not be possible to implement for certain circuits because they may cause the circuit to become unstable.

The S parameters are thought of more in terms of power transmission and reflection, as in figure 14.8. At high frequencies you can use devices such as circulators to separate input from output, and so it is more convenient to apply a sinusoidal radio frequency input and measure the amplitude and phase of the signal going in and coming out at either port. The signal coming out could be either just reflected back or transmitted through from the other side.

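$$\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = \begin{pmatrix} S_{11} & S_{12} & S_{13} \\ S_{21} & S_{22} & S_{23} \\ S_{31} & S_{32} & S_{33} \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \tag{14.19}$$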

14.5 Balun

Figure 14.9: A rat race coupler. [Labels: ports 1, 2, 3, 4; ring sections L/4, L/4, L/4 and 3L/4.]

A rat race coupler and a balun both use quarter wavelength sections. But a balun [118] has only one input and one output and is used to isolate a balanced impedance from an unbalanced impedance, which is where the word balun comes from. One example is isolating an antenna from the receiver connected to it.

A rat race coupler is shown in figure 14.9. If the impedance of the line providing the input is Z0 then the impedance of the coupler should be Z0√2. There are 3 quarter wavelength sections between ports 1 and 2 and a quarter wavelength between 1 and 4, between 2 and 3 and between 3 and 4. The input at 1 is output at 2 and 4 but not at 3. The outputs at 2 and 4 are 180° out of phase.

A balun is shown in the figure 14.10. It is a modification of the rat race coupler. Its advantage is that it has a larger bandwidth than the rat race coupler. The starting values used for t1 and t2 are λ/4 and λ/2 but they can be adjusted to improve the bandwidth. The port 3 of the rat race is open circuited. The input is at 1 and the outputs are 2 and 4.

14.6 Circulators

A circulator [119] uses a ferromagnetic material to give a different impedance for a signal traveling in one direction from a signal traveling in the opposite direction. The structure of a circulator is shown in the figure 14.11.

It is a T junction except that above and below the T are placed magnetized ferrite slabs of a suitable magnetic orientation. The magnetic field due to the ferrite slabs is directed into the face of the junction. For the input mode where the magnetic field vector lies in the plane of the ferrite slabs, a resonance is set up [119] such that the standing wave pattern within the center disk is rotated by 30° at the design frequency so that a signal entering at 1 is output at 2 but is null at 3. Similarly a signal entering at 2 is output at 3 and is null at 1.

Figure 14.10: A balun (ports 1, 2 and 4; sections 3L/4, L/4, t1, t1 and t2).

Figure 14.11: The structure of a circulator (ferrite disks, conductor, field H, ports 1, 2 and 3).

Circulators are very useful in isolating signals going into a port from the signals coming out of the port. So if port 2 is connected to the input of a following section, port 1 can be connected to a signal source and port 3 can be connected to a detector to measure the signal reflected back from the input of the following section.

14.7 Impedance transformers and filters

$$\Gamma_a = \frac{Z_2 - Z_1}{Z_1 + Z_2}$$ (14.20)

An impedance transformer allows a signal on a line of impedance Z1 to be transferred onto a line of impedance Z2 with minimum reflection. Without an impedance transformer the reflection coefficient is Γa. The transformer is asymmetric: looking into its left side the input appears to be of impedance Z1, whereas looking into its right side the input appears to be of impedance Z2. Of course the catch is that it only does this at a given frequency, and the reflection grows as you move further away from the design frequency in either direction.

Figure 14.12: A tapered line (length L, from Z1 to Z2).

Often a tapered transmission line can be used to connect two transmission lines of impedance Z1 and Z2. There are many options such as a linear change, an exponential change etc. The most popular is the Klopfenstein taper [120], [121] shown in the figure 14.12. The design equations are 14.21, 14.22 and 14.23.

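$$A = \cosh^{-1}\left(\frac{\Gamma_a}{\Gamma_{req}}\right)$$ (14.21)

$$\phi(x, A) = \int_0^x \frac{I_1\left(A\sqrt{1 - y^2}\right)}{A\sqrt{1 - y^2}}\,dy$$ (14.22)

$$\ln\left(\frac{Z(x)}{Z_1}\right) = \frac{1}{2}\ln\left(\frac{Z_2}{Z_1}\right) + \Gamma_{req} A^2\,\phi\left(\frac{2x}{L}, A\right)$$ (14.23)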

The simplest and most used impedance transformer is the quarter wavelength transformer as shown in the figure 14.13 [122]. The design is simple because the impedances of the sections either increase or decrease monotonically and they are selected such that if you have n sections the Γ is approximately the same at all the n + 1 steps. But if you are designing a filter using quarter wavelength sections it will appear more as in the figure 14.14.


Figure 14.13: A quarter wavelength impedance transformer.

Figure 14.14: A sample line and stub filter.

The approach of using quarter wavelength transmission line sections like lumped elements in designing a filter was proposed by [123] using the complex frequency variable S such that a high pass lumped filter response in Ω in the frequency range −∞ ≤ Ω ≤ ∞ transforms to the range −ω0 ≤ ω ≤ ω0 repeating every 2ω0.

$$S = j\Omega = j\tan\left(\frac{\pi\omega}{2\omega_0}\right)$$ (14.24)

A quarter wavelength section with an open circuit for load is a capacitance whereas if the load is a short circuit it would be an inductance. As in the figure 14.14, by using shorted and open sections in series and in parallel, you can construct a filter with your desired frequency response [124], [125].

Initially the polynomial representation of the filter function needs to be obtained and then the elements need to be extracted. Each quarter wavelength section is represented by the matrix in the equation 14.25. There are three types of elements used, namely the shorted and open quarter wavelength sections, the unit elements (u.e.) which are quarter wavelength sections connected at both ends, and the redundant elements which are used to physically separate the shorted and open sections.

$$\begin{bmatrix} b_1 \\ a_1 \end{bmatrix} = \frac{1}{s_{21}} \begin{bmatrix} -\Delta_s & s_{11} \\ -s_{22} & 1 \end{bmatrix} \begin{bmatrix} a_2 \\ b_2 \end{bmatrix}$$ (14.25)

If you use n u.e.’s and m distributed L’s and C’s in the filter, then you can use a standard representation such as a Butterworth or Chebyshev representation. For example the ratio of reflectance to transmittance for a Butterworth high pass would be represented as in the equation 14.26, or a Chebyshev low pass would be represented as in the equation 14.27, where Sc is the cutoff frequency and Tm(x) = cos(m acos x) and Um(x) = sin(m acos x).

$$\frac{|\rho|^2}{|t|^2} = \left(\frac{S_c}{S}\right)^{2m} \left(\frac{S\sqrt{1 - S_c^2}}{S_c\sqrt{1 - S^2}}\right)^{2n}$$ (14.26)

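$$\frac{|\rho|^2}{|t|^2} = \varepsilon^2 \left[ T_m\left(\frac{S}{S_c}\right) T_n\left(\frac{S\sqrt{1 - S_c^2}}{S_c\sqrt{1 - S^2}}\right) - U_m\left(\frac{S}{S_c}\right) U_n\left(\frac{S\sqrt{1 - S_c^2}}{S_c\sqrt{1 - S^2}}\right) \right]^2$$ (14.27)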

|ρ|2 + |t|2 = 1 (14.28)


The elements are extracted from the Zin using Richards’ theorem [123]. For example removing a u.e. gives the remaining impedance as in the equation 14.29, and this process needs to be continued until all elements are extracted. You may have to use the Kuroda identities which are listed in [125].

$$Z'_{in}(S) = Z_{in}(1)\,\frac{S Z_{in}(1) - Z_{in}(S)}{S Z_{in}(S) - Z_{in}(1)}$$ (14.29)

Figure 14.15: A coupled line filter (section length L, load Zl).

An alternative way to make a filter is to use coupled lines [126] as shown in the figure 14.15. Here too the coupled sections are a quarter wavelength long at ω0. In either the microstrip or the stripline there is a center signal line and the ground plane(s). If a second signal line is placed in close proximity to the first signal line, as much as 100% of the power can be transferred from the first to the second and back.

Figure 14.16: Each section of the coupled line filter (length L).

Chapter 15

Transducers

Einstein received the Nobel prize for his work on the photoelectric effect and his equation E = hν, where h is Planck’s constant, E is the energy of the photon and ν is the frequency of the light emitted.

15.1 Direct gap

Some materials absorb and emit light more easily than others because they are of a class called direct gap semiconductors. If you draw the E-k plot for Si, you will find that the lowest level in the conduction band does not lie at the same wave number as the highest level in the valence band, as shown in the figure 15.1.

Figure 15.1: The forbidden gap of silicon (panels: Silicon, with points A, B and C, and GaAs, with points A and B).

The lower band is the valence band and the upper band is the conduction band. The free electrons occupy the lowest levels available in the conduction band and one says electrons sink, whereas holes occupy the highest available levels in the valence band because holes float (because the electrons in the valence band sink).

In the figure 15.1 the direct gap of Si is from A to B and is larger than the indirect gap from A to C. However in GaAs the smallest gap is the direct gap. In GaAs if a photon were to supply enough energy, it could be absorbed [21] and a valence band electron could jump from A to B. Also, if you pump electrons into the conduction band and ensure that there are holes in the valence band, in GaAs the conduction band electron could jump to the valence band and release a photon of that energy gap. This is unlikely to happen in Si because if you have electrons at B they will simply slide down the conduction band to C. For this reason it is difficult to get Si to emit light. On the other hand if you supply photons of energy equal to the gap from A to B, then Si can absorb light, and for this reason there are plenty of manufacturers making light detectors in Si.

15.2 Semiconductor lasers

The acronym LASER stands for Light Amplification by Stimulated Emission of Radiation. There are basically two types of semiconductor lasers, edge emitting and surface emitting. Edge emitting came first because it uses a much simpler structure.

15.2.1 Edge emitting

The structure of an edge emitting laser is that of a P-i-N diode as shown in the figure 15.2.

Figure 15.2: Edge emitting laser (layers P, i and N between the + and − terminals).

The P layer on top is connected to the positive of the battery and the N layer on the bottom is connected to the negative of the battery. Holes enter from the top, electrons enter from the bottom and in the intrinsic layer there are no dopant ions, so the state of lowest energy would be for all the holes to combine with all the electrons (due to current continuity they are the same number) and emit the bandgap energy as photons. The two edges are created by simply cleaving the semiconductor: you scribe it and break it and you get a clean edge. The index of refraction of GaAs is about 3.6 and that of air is 1 so the power reflection from the cleaved surface is

$$\Gamma^2 = \left[\frac{3.6 - 1}{3.6 + 1}\right]^2 = 0.32$$ (15.1)

32% reflection is not very large but it is sufficient to induce lasing if the diode is pumped hard enough. All you need is that the gain in photons going from one cleaved surface to the other cleaved surface multiplied by the reflectance is greater than 1, which will ensure a positive feedback loop and therefore lasing. So in this case the photon gain has to be more than 1/0.32 ≈ 3.1, so the reflected photon needs to induce the stimulated emission of about 2.1 photons so that when they reach the other cleaved surface about 2.1 photons are transmitted out through the mirror and one photon gets reflected back. So you have the dependencies

$$E = h\nu$$ (15.2)

$$\nu\lambda = \frac{c}{n}$$ (15.3)

$$L = m \cdot \lambda$$ (15.4)

$$G(\nu) \propto e^{gL}$$ (15.5)

$$G \cdot \Gamma^2 > 1$$ (15.6)

$$E = E_g + \Delta E$$ (15.7)

The first equation gives you the frequency, the second gives you the wavelength, the third gives you the different lengths L you can choose from, the fourth gives you the photon gain G you can expect from that length at that frequency, the fifth just requires the feedback be greater than 1, and the sixth is just the ∆E that you need to factor in because when you pump the diode hard, the average electron is higher than the bottom of the conduction band and the average hole is lower than the top of the valence band, so your frequency will be slightly higher than that given by the band gap energy.

In order to output only a single frequency, you need to make sure that the loop gain is greater than 1 only at a single frequency, and this will get more difficult as the length L is increased because m gets larger and so you may have appreciable gain at a longer wavelength such that (m − 1) · λ = L or at a shorter wavelength such that (m + 1) · λ = L. On the other hand if you make L too short, you will have to pump the diode very hard indeed, which creates other problems.

So to summarize, in order to make the laser function as you wish it to, you have to properly estimate how hard you need to pump it, how narrow the intrinsic layer needs to be, how long the laser has to be, and the operating temperature of the laser (because it causes expansion and has other effects) and so on. In addition the way lasers are often used is to abut the edge of the laser to the end of an optical fiber and use an index matching glue to connect them, and since glass has a refractive index of 1.5, this will almost certainly affect the lasing, because it drops the reflectance of the cleaved edge of the laser below 32%.
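To put a number on how much the fiber lowers the mirror reflectance, the form of equation 15.1 can be evaluated for both interfaces. The sketch below is only an illustration: the function names are made up here, and it assumes normal incidence and the two indices quoted in the text (3.6 for GaAs, 1.5 for glass).

```python
def power_reflectance(n_inside, n_outside):
    """Normal-incidence power reflectance at an interface (form of equation 15.1)."""
    return ((n_inside - n_outside) / (n_inside + n_outside)) ** 2


def threshold_gain(reflectance):
    """Photon gain G at which G * reflectance reaches 1 (equation 15.6)."""
    return 1.0 / reflectance


n_gaas = 3.6  # refractive index of GaAs quoted in the text
for label, n_out in (("air", 1.0), ("glass fiber", 1.5)):
    r = power_reflectance(n_gaas, n_out)
    print(f"{label}: reflectance = {r:.2f}, gain needed > {threshold_gain(r):.1f}")
```

Under these assumptions the reflectance drops from about 0.32 against air to about 0.17 against glass, so the photon gain needed for lasing nearly doubles.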

15.2.2 Surface emitting

Nowadays surface emitting lasers are much more popular than edge emitting lasers, because you can make an array of them on a single chip. Unlike edge emitting lasers, Vertical Cavity Surface Emitting Lasers or VCSELs do not use a cleaved edge as a mirror. Instead they use epitaxially grown multi layer mirrors called Bragg reflectors as shown in the figure 15.3.

Figure 15.3: Surface emitting laser (labels: light out, P mirror, N mirror, + and − terminals).

We are accustomed to light reflection by a glass mirror, but there is another kind of reflection that is more distributed and based on interference of the wave, given by Bragg’s law:

$$\frac{\lambda}{n} = 2 \cdot d \cdot \sin(\theta)$$ (15.8)

Here n is the effective refractive index of the Bragg structure and d is the spacing of the structure i.e. the minimum repeatable unit of high and low refractive index. The figure 15.4 shows Bragg reflection of light.

Figure 15.4: Bragg reflection.

The effective refractive index n is the weighted arithmetic mean over the distance traveled. So if the Bragg structure is 40% at index 1.5 and 60% at index 2.25, then the effective index is 0.4 × 1.5 + 0.6 × 2.25 = 1.95. In addition if the Bragg condition is met, there is reflection, but the amount of light reflected depends on the abruptness and index difference between the layers, and so different Bragg structures with the same periodicity and the same effective index will reflect different amounts of light. Ultimately if the number of repeatable units in the Bragg structure is increased indefinitely, all the light will be reflected.
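The effective index and the Bragg spacing can be worked out together in a few lines. This is a sketch under stated assumptions: the function names are illustrative, the 850 nm wavelength is just an example value, and θ is taken as 90° so that the light travels perpendicular to the layers as in a VCSEL.

```python
import math


def effective_index(layers):
    """Thickness-weighted mean index; layers is a list of (fraction, index) pairs."""
    total = sum(fraction for fraction, _ in layers)
    return sum(fraction * index for fraction, index in layers) / total


def bragg_period(wavelength, n_eff, theta_deg=90.0):
    """Spacing d from equation 15.8: wavelength / n_eff = 2 * d * sin(theta)."""
    return wavelength / (2.0 * n_eff * math.sin(math.radians(theta_deg)))


n_eff = effective_index([(0.4, 1.5), (0.6, 2.25)])  # the example from the text
d = bragg_period(850e-9, n_eff)                     # 850 nm is an assumed wavelength
print(f"n_eff = {n_eff:.2f}, d = {d * 1e9:.0f} nm")
```

With these inputs the repeat unit comes out a little under 220 nm, which shows why the mirror layers have to be grown epitaxially rather than patterned.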

In this type of laser the gain is obtained in the intrinsic layer sandwiched between the upper p-type region and the lower n-type region in exactly the same way as in the edge emitting laser diode. The difference is that the light reflection is vertical not horizontal. A VCSEL has better frequency selectivity than an edge emitting laser because here even the mirror helps in the frequency selection via the Bragg condition. Unlike the case of the cleaved edge acting as a mirror, in this case you can obtain the reflectance you wish for by increasing the number of layers in the Bragg stack. In addition, you can have more layers at the bottom so that you don’t waste lasing by reabsorption in the substrate.

In the early days people did everything they could to avoid passing the current through the mirrors. But in this case it was difficult to flood the intrinsic region between the Bragg mirrors with holes and electrons which are needed to cause photon gain. When they did pass current through the mirrors they would heat up [127] and expand causing frequency distortion and excessive power loss. Experimental studies [128], [129] showed that modulation doping could alleviate the problem of thermionic barriers. My paper [37] showed how to reduce the thermionic barriers in p-type and n-type Bragg structures to a theoretical minimum.

15.2.3 Bulk vs. distributed gain

The light emitted by a laser is said to be coherent, meaning that the plane waves representing the photons are in phase. Stimulated emission is in phase with the wave that stimulated it. At the mirrors the phase has to be zero. It is sufficient to provide hole electron pairs at the points of maximum amplitude of the waves because this is where most of the stimulated emission occurs. In fact in an edge emitting laser as you move across the laser [130] you will find regions where the available hole electron pairs are depleted by the stimulated emission and the lasing is limited by the diffusion of holes and electrons from surrounding areas into these regions. This is called "hole burning".

If you are building a VCSEL, you can easily place the intrinsic slices where the wave has maximum amplitude and thereby increase the efficiency of the lasing. This type of structure is called a distributed gain structure.

15.3 Junction detectors

The opposite of emitting light is to detect light. The simplest detector structure is a reverse biased diode [131] as shown in the figure 15.5.

Figure 15.5: Diode detector (labels: P, N, light input).

When you reverse bias a diode, the holes are pulled away from the junction toward the top whereas the electrons are pulled away from the junction toward the bottom and you have a depletion region where the junction used to be, and there is a high field in this depletion region because all the voltage is dropped across it.

Now if you shine light at this region, at an energy higher than the band gap, it will be absorbed and will result in a hole electron pair which will immediately separate in the field and register as a current flow which is proportional to the amount of light that is absorbed. Due to the Franz-Keldysh effect [132], [133] the energy gap in the presence of a high electric field is reduced and so even photons of energy slightly lower than the Eg may be absorbed as shown in the figure 15.6.

Figure 15.6: The Franz-Keldysh effect (photon energy hν compared with the gap Eg).

When you reverse bias a diode the depletion width increases rapidly at first but the rate of increase of the depletion width falls off rapidly and if you reverse bias the junction too much you will simply cause a reverse breakdown of the diode. For this reason the junctions in high quality detectors are often designed with very thin interlaced fingers as shown in the figure 15.7.

Figure 15.7: Interlaced junction increases area.

Avalanche photodiodes [134] are different from a normal photodiode due to the high reverse field that they have to tolerate. Unless the diode is carefully constructed, discontinuities in the semiconductor or surface states can cause a breakdown before the avalanche electric field is reached. But if the electric field is higher than that required to cause avalanche ionization, then the carriers generated due to photon absorption accelerate in the field and cause the generation of more carriers, so there can be a gain of as much as a few thousand.

Phototransistors [135] are different because the field is the normal field but due to the transistor structure the carriers generated in the base emitter junction will cause β× as much collector current to flow.

15.4 Accelerometers

Accelerometers are based on the piezoresistive effect [136]. A typical structure used is shown in the figure 15.8. There is usually a cavity created by etching and in that cavity is a mass of any shape attached only on one end and free to flex within the cavity. It can be just a cantilever and to avoid orientation problems it can be a combination of two or three cantilevers oriented orthogonally to each other.

Figure 15.8: A typical accelerometer.

From Newton’s second law F = ma and so the torsion applied on the cantilevered section is a linear function of the acceleration. Due to the piezoresistive effect, the resistance of any loop which has paths going through the cantilever will change with the stress i.e. with the acceleration and thus can be detected.

There are many other options, such as the change in capacitance between a surface section of the mass and the wall of the cavity, or perhaps optical effects such as interference. But the piezoresistive effect is easiest to work with. For example a Wheatstone bridge configuration could be used to detect imbalance due to the change in resistance.


Chapter 16

Technology CAD

Until about 1985 TCAD was not even possible. So devices were analyzed analytically and the most elegant approximations were based on series expansions such as the Taylor and Maclaurin series. After numerical analysis became possible, it started with one-dimensional (1D) analysis, then 2D analysis and 1D transient, then 3D etc. For most of the history of numerical analysis computing power was a limited resource so parsimony in computing was a virtue.

To start learning numerical techniques, [137] is easy to read. For probability theory [138] is a good place to start, for a good collection of formulae see [12], for finite elements [139], and if you are interested in C implementation then [140]. Semiconductor equations use exponentials a lot and are said to be "stiff", meaning that small changes in one variable cause large changes in another. For this reason finite differences are often more popular than finite elements.

Nowadays most engineers do not write partial differential equation solvers because there are so many commercial packages available which can be incorporated into your simulation flow, so that you only provide the data and the parameters of solution and the solvers do the job better than you could implement yourself. Device simulation often yields sparse matrices [141] and there are commercial packages that can solve them. The same is true of least squares fitting, for which the Levenberg-Marquardt algorithm [142], [143] is often used. But it is still worthwhile to understand the basics of numerical solution of equations for the same reason we learn to do mathematics manually even after the advent of pocket calculators.

16.1 Basic numerical techniques

Figure 16.1: Discrete data (sample points n−3 to n+3).

A lot of the difference equations are based on the Taylor’s series given by the equation 16.1.

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)(x - a)^2}{2!} + \ldots$$ (16.1)

16.1.1 Differentiation

If you use only the first two terms of equation 16.1 then you will get the first derivative as

$$f'(a) = \frac{f(x) - f(a)}{x - a}$$ (16.2)

If the spacing between the samples is ∆x, then you can write two equations 16.3, 16.4 based on the Taylor’s series and when you add them you get equation 16.5 which is the second derivative in central difference form i.e. the second derivative at n based on points evenly spread before and after n.

$$f_{n+1} = f_n + f'_n\,\Delta x + f''_n\,\frac{\Delta x^2}{2}$$ (16.3)

$$f_{n-1} = f_n - f'_n\,\Delta x + f''_n\,\frac{\Delta x^2}{2}$$ (16.4)

$$f''_n = \frac{f_{n-1} - 2f_n + f_{n+1}}{\Delta x^2}$$ (16.5)

So in this manner you can obtain finite difference representations for any level of derivative either centered about n or using different combinations of points ahead or behind n. This can be done even if the points are not evenly spaced as shown in the figure 16.2. So equations 16.3, 16.4 become equations 16.6, 16.7.

Figure 16.2: Unevenly spaced points (spacing a between n−1 and n, spacing b between n and n+1).

$$f_{n+1} = f_n + f'_n\,b + f''_n\,\frac{b^2}{2}$$ (16.6)

$$f_{n-1} = f_n - f'_n\,a + f''_n\,\frac{a^2}{2}$$ (16.7)

Multiplying equation 16.6 by a/b and then adding it to equation 16.7, the first derivative terms cancel and you will get equation 16.8 which is the second derivative at n. The equations for the different finite differences are in many textbooks such as [137].

$$f''_n = \frac{f_{n-1} - (1 + (a/b))f_n + (a/b)f_{n+1}}{(ab/2) + (a^2/2)}$$ (16.8)
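A quick numerical check of the unevenly spaced second derivative is easy to write. The sketch below is illustrative only: the function name and test values are made up, and the denominator is ab/2 + a²/2, which is what you get by carrying out the multiplication and addition of equations 16.6 and 16.7 described above. With a = b it reduces to the central difference of equation 16.5.

```python
def second_derivative(f_prev, f_mid, f_next, a, b):
    """Second derivative at n; a is the gap back to n-1, b the gap ahead to n+1."""
    ratio = a / b
    numerator = f_prev - (1.0 + ratio) * f_mid + ratio * f_next
    return numerator / (a * b / 2.0 + a * a / 2.0)


def cube(x):
    """Test function with exact second derivative 6x."""
    return x ** 3


x, a, b = 2.0, 0.01, 0.02
estimate = second_derivative(cube(x - a), cube(x), cube(x + b), a, b)
print(estimate)  # close to the exact value 12
```

The small residual error comes from the third derivative term that was dropped from the Taylor series, and it grows with the mismatch between a and b.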


Figure 16.3: Simple integration.

16.1.2 Integration

The simplest integration is to use the value at n from midway to n−1 to midway to n+1 as shown in the figure 16.3. The Simpson’s rule is a better method given by equation 16.9 and the 2 point, 4 point and 5 point rules are in equations 16.10, 16.11, 16.12 [137]. There are many references which explain why in general integration generates less error than differentiation.

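$$\int_{x_0}^{x_2} f(x)\,dx = \frac{\Delta x}{3}\,(f_0 + 4f_1 + f_2)$$ (16.9)

$$\int_{x_0}^{x_1} f(x)\,dx = \frac{\Delta x}{2}\,(f_0 + f_1)$$ (16.10)

$$\int_{x_0}^{x_3} f(x)\,dx = \frac{3\Delta x}{8}\,(f_0 + 3f_1 + 3f_2 + f_3)$$ (16.11)

$$\int_{x_0}^{x_4} f(x)\,dx = \frac{2\Delta x}{45}\,(7f_0 + 32f_1 + 12f_2 + 32f_3 + 7f_4)$$ (16.12)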
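To see how the error falls as more points are used, the four rules can be applied to the same integral. The following is a minimal sketch, with function names chosen here for illustration and the integral of e^x from 0 to 1 picked as an arbitrary test case.

```python
import math


def trapezoid(f0, f1, dx):
    """2 point rule, equation 16.10."""
    return dx / 2.0 * (f0 + f1)


def simpson(f0, f1, f2, dx):
    """3 point Simpson's rule, equation 16.9."""
    return dx / 3.0 * (f0 + 4.0 * f1 + f2)


def four_point(f0, f1, f2, f3, dx):
    """4 point rule, equation 16.11."""
    return 3.0 * dx / 8.0 * (f0 + 3.0 * f1 + 3.0 * f2 + f3)


def five_point(f0, f1, f2, f3, f4, dx):
    """5 point rule, equation 16.12."""
    return 2.0 * dx / 45.0 * (7.0 * f0 + 32.0 * f1 + 12.0 * f2 + 32.0 * f3 + 7.0 * f4)


exact = math.e - 1.0  # integral of exp(x) from 0 to 1
for rule, points in ((trapezoid, 2), (simpson, 3), (four_point, 4), (five_point, 5)):
    dx = 1.0 / (points - 1)
    samples = [math.exp(i * dx) for i in range(points)]
    error = abs(rule(*samples, dx) - exact)
    print(f"{rule.__name__}: error = {error:.1e}")
```

Each rule is exact for polynomials up to a certain degree, so a smooth integrand such as this one benefits quickly from the higher order rules.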

16.1.3 Interpolation

Suppose you need to know the value of a function f at x for the one dimensional case or x, y for the two dimensional case as shown in the figure 16.4. The equation you use has to be based on the equation that describes f(x) and f(x, y). The reason is shown in the table 16.1.

Figure 16.4: Interpolation to find f(x) and f(x, y) (f(x) lies between f(x1) and f(x2); f(x, y) lies among f(x1,y1), f(x2,y1), f(x1,y2) and f(x2,y2)).

In the table 16.1 there are three functions shown namely the sine wave, the exponential and a cubic. In all three cases the value of x is midway between x1 and x2, but the value of f(x) is not midway between f(x1) and f(x2). Normally if you need to interpolate you don’t know what the shape of the function is.

Equation   x1    x     x2    f(x1)   f(x)    f(x2)
sin(θ)     30°   45°   60°   0.5     0.707   0.866
e^x        1     2     3     2.718   7.389   20.085
x^3        1     2     3     1       8       27

Table 16.1: Interpolation.

The Taylor’s series works regardless of what relationship creates the data. It is based on the idea that you can fit a polynomial dependence to the data and then the prediction is based on that polynomial. So if you keep increasing the order of the interpolation scheme i.e. if you utilize the higher order derivatives then you will get progressively better results.

In order to use the higher order derivatives you need to use more points and then you are making the assumption that the relationship creating the data points is the same across those points. The first order or linear interpolation is given in equation 16.13 for one dimension. You can find tables of coefficients in [137] for many different types of interpolation such as Lagrange, Stirling, Bessel, Everett, Steffensen and Newton.

$$f(x) = \frac{(x_2 - x)f(x_1) - (x_1 - x)f(x_2)}{x_2 - x_1}$$ (16.13)
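Applying equation 16.13 to the three functions of table 16.1 shows how far a straight line can miss when the underlying function is curved. This is a small sketch; the function name and the way the cases are listed are choices made here for illustration.

```python
import math


def linear_interpolate(x, x1, f1, x2, f2):
    """Equation 16.13: straight-line estimate of f(x) between (x1, f1) and (x2, f2)."""
    return ((x2 - x) * f1 - (x1 - x) * f2) / (x2 - x1)


cases = (
    ("sin", lambda deg: math.sin(math.radians(deg)), 30.0, 45.0, 60.0),
    ("exp", math.exp, 1.0, 2.0, 3.0),
    ("cube", lambda v: v ** 3, 1.0, 2.0, 3.0),
)
for name, f, x1, x, x2 in cases:
    estimate = linear_interpolate(x, x1, f(x1), x2, f(x2))
    print(f"{name}: interpolated {estimate:.3f}, actual {f(x):.3f}")
```

The sine is nearly straight over this range so the estimate is close, whereas for the exponential and the cubic the linear estimate is far too high.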

For the two dimensional case you use the two dimensional Taylor’s series as in equation 16.14. Then substitute for the partial derivatives using the points (x2, y1) and (x1, y2). Then repeat using (x2, y2) as the starting point and average. So then you have used all four points to estimate the value at (x, y).

$$f(x, y) = f(x_1, y_1) + (x - x_1)\left.\frac{\partial f}{\partial x}\right|_{x_1, y_1} + (y - y_1)\left.\frac{\partial f}{\partial y}\right|_{x_1, y_1} + \ldots$$ (16.14)

16.2 Grid selection

The figure 16.5 shows a uniform grid. Here all the points are evenly spaced along the x and y axis. If there is only one ∆x and one ∆y it has to be as small as the finest requirement along x and y axis. But if it does not have to be uniform you can make the grid as in the figure 16.6.

Figure 16.5: A uniform rectangular grid (axes x and y).

Figure 16.6: A rectangular grid.

In the figure 16.7 is a different approach which is the adaptive grid [144]. You start by setting up a coarse grid containing triangles, and then you refine it by dividing those triangles into sub triangles [145] for example by connecting the mid points of the sides in areas where a high accuracy is needed as shown in the figure 16.8. When refining the grid you don’t want to make the grid too fine because otherwise you might get terms in the matrix which are almost zero.

Figure 16.7: A 2 dimensional adaptive grid (axes x and y).

Figure 16.8: Refining a grid.

The use of an adaptive grid can reduce the matrix size by a large factor. Suppose you start with a 100 × 100 uniform rectangular grid. Now if you need a 10× finer grid for 6 grid spacings along the x axis and 4 grid spacings along the y axis, then each refined spacing gains 9 extra points and the size of the matrix increases to 154 × 136 = 20,944 as opposed to 10,000. But in the case of a triangular adaptive grid you can get an increased accuracy in a local area without affecting the entire column and entire row as in the case of a rectangular grid. In the case of the 3D solution the difference between a rectangular grid and an adaptive triangular grid is much larger than the 2D case.

16.3 Device simulation

2D device simulation [144], [146] was first done in the early 1980s using the computing facilities available then. Computers are a lot faster nowadays so finer grids and even 3D simulations [147] are feasible. To get the general principles, a 1D simulation is enough to understand, and the best example is a compound semiconductor such as AlxGa1−xAs [148].

The main equations to solve are the Poisson’s equation and the continuity equations for holes and electrons. These are written as equations 16.15 and 16.17 for electrons and 16.16 and 16.18 for holes, where χe is the electron affinity.

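$$\frac{d}{dx}\left(\varepsilon\,\frac{d}{dx}\left(\frac{E_c + \chi_e}{q}\right)\right) = q\left(N_d^+ - n + p - N_a^-\right)$$ (16.15)

$$\frac{d}{dx}\left(\varepsilon\,\frac{d}{dx}\left(\frac{E_v + E_g + \chi_e}{q}\right)\right) = q\left(N_d^+ - n + p - N_a^-\right)$$ (16.16)

$$\frac{dn}{dt} = \frac{1}{q}\frac{dJ_n}{dx} - R + G$$ (16.17)

$$\frac{dp}{dt} = -\frac{1}{q}\frac{dJ_p}{dx} - R + G$$ (16.18)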

As the fraction x varies so too will the material properties [36], [130] so for example

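$$\varepsilon(x) = 13.18 - 3.12x$$ (16.19)

$$E_g(x) = \begin{cases} 1.424 + 1.247x & \text{if } x < 0.45 \\ 1.9 + 0.125x + 0.143x^2 & \text{if } x > 0.45 \end{cases}$$ (16.20)

$$\chi_e(x) = \begin{cases} 4.07 - 1.1x & \text{if } x < 0.45 \\ 3.64 - 0.14x & \text{if } x > 0.45 \end{cases}$$ (16.21)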
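In a simulator these composition dependent properties are usually wrapped in small functions that are called when each grid point is filled in. The sketch below is illustrative: the function names are made up, the energies are assumed to be in eV, the third expression is treated as the electron affinity χe that appears in equations 16.15 and 16.16, and the point x = 0.45 itself (which the inequalities leave open) is assigned to the upper branch.

```python
def permittivity(x):
    """Relative permittivity of AlxGa1-xAs from equation 16.19."""
    return 13.18 - 3.12 * x


def band_gap(x):
    """Band gap from equation 16.20, assumed to be in eV."""
    if x < 0.45:
        return 1.424 + 1.247 * x
    return 1.9 + 0.125 * x + 0.143 * x * x


def electron_affinity(x):
    """Electron affinity from equation 16.21, assumed to be in eV."""
    if x < 0.45:
        return 4.07 - 1.1 * x
    return 3.64 - 0.14 * x


for x in (0.0, 0.3, 0.6):
    print(f"x = {x}: eps = {permittivity(x):.2f}, "
          f"Eg = {band_gap(x):.3f}, chi = {electron_affinity(x):.3f}")
```

Note that the two branches of the band gap meet almost exactly at x = 0.45, so the choice of branch at that point makes very little difference.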

For the Fermi-Dirac statistics you can use the Joyce-Dixon approximation [16] of equations 16.22 and 16.23. You would need the Nc and Nv which are also available in [130]. The net electric field is simply (−1/q)dEf/dx.

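$$\frac{E_f - E_c}{kT} = \ln\left(\frac{n}{N_c}\right) + \frac{n}{2\sqrt{2}\,N_c}$$ (16.22)

$$\frac{E_v - E_f}{kT} = \ln\left(\frac{p}{N_v}\right) + \frac{p}{2\sqrt{2}\,N_v}$$ (16.23)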
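The second term in equation 16.22 only matters once the carrier density approaches the effective density of states, which a short calculation makes clear. In this sketch the function names are illustrative and the value used for Nc is an assumed round number for GaAs at room temperature, not a value taken from [130].

```python
import math


def joyce_dixon(n, n_c):
    """(Ef - Ec)/kT for electron density n, from equation 16.22."""
    ratio = n / n_c
    return math.log(ratio) + ratio / (2.0 * math.sqrt(2.0))


def log_term_only(n, n_c):
    """The same quantity keeping only the logarithmic term."""
    return math.log(n / n_c)


n_c = 4.7e17  # assumed effective density of states, cm^-3
for n in (1e16, 1e17, 1e18):
    print(f"n = {n:.0e}: with correction {joyce_dixon(n, n_c):+.3f}, "
          f"log term only {log_term_only(n, n_c):+.3f}")
```

The same function serves for holes through equation 16.23 by passing p and Nv instead.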

For the recombination terms you can use models fitted to measured data [149], [150]. The Shockley-Read-Hall recombination is modeled as in equations 3.23, 3.24. The Auger recombination is modeled as in equation 3.25.

Once the grid has been set up, each point is filled in with its material properties and the simulation can begin. Initially the holes and electrons are located where they are generated i.e. they are located at the ionized dopants and all areas are charge neutral. In order to obtain the steady state a pseudo time step is used to compute the carrier movement using the continuity equations. It is a pseudo time step because you are using it to obtain the steady state rather than performing a transient analysis.

There are two choices of iterations namely the Gummel [151] and the Newton. The Gummel is easier to implement but you will need a large number of iterations to achieve convergence. The Newton is more difficult to implement and each iteration requires more computational resources but fewer iterations are required.

In the Gummel iteration you will first solve the Poisson’s equation to obtain the potential for the given values of n and p at the different nodes. Then you will solve the continuity equations to get the new values of n and p at the different nodes. So in each iteration each equation is solved separately and you iterate until you reach convergence.

The Newton iteration is the one that is normally used in device simulation. In each iteration a system of equations is solved to obtain the new values of the potential, the n and the p. The Newton-Raphson method [137] to iteratively solve for the root of f(x) = 0 is equation 16.24 where the derivative f′(z) at the guess value z is used with the value of f(z) in the kth iteration to obtain the new value of z. So in order to create the system of equations you first need to calculate the partial derivative of each quantity with respect to the others as in the equation 16.25 where the FV, Fn and Fp are the functions for V, n and p.

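$$z_{k+1} = z_k - \frac{f(z_k)}{f'(z_k)}$$ (16.24)

$$\begin{bmatrix} \Delta V \\ \Delta n \\ \Delta p \end{bmatrix} = -\begin{bmatrix} \dfrac{\partial F_V}{\partial V} & \dfrac{\partial F_V}{\partial n} & \dfrac{\partial F_V}{\partial p} \\ \dfrac{\partial F_n}{\partial V} & \dfrac{\partial F_n}{\partial n} & \dfrac{\partial F_n}{\partial p} \\ \dfrac{\partial F_p}{\partial V} & \dfrac{\partial F_p}{\partial n} & \dfrac{\partial F_p}{\partial p} \end{bmatrix}^{-1} \begin{bmatrix} F_V \\ F_n \\ F_p \end{bmatrix}$$ (16.25)

$$\begin{aligned} V_{k+1} &= V_k + \Delta V \\ n_{k+1} &= n_k + \Delta n \\ p_{k+1} &= p_k + \Delta p \end{aligned}$$ (16.26)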
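The scalar form of the Newton-Raphson update is short enough to try directly. The following is a minimal sketch of equation 16.24 for a single unknown; the function name, the tolerance and the test equation (chosen because it contains an exponential, like the semiconductor equations) are all assumptions made for illustration. A device simulator applies the same idea to the full system of equation 16.25.

```python
import math


def newton_raphson(f, f_prime, z0, tol=1e-12, max_iter=50):
    """Iterate equation 16.24 until the update is smaller than tol."""
    z = z0
    for iteration in range(1, max_iter + 1):
        step = f(z) / f_prime(z)
        z -= step
        if abs(step) < tol:
            return z, iteration
    raise RuntimeError("Newton iteration did not converge")


def residual(v):
    """Hypothetical test equation: exp(v) + v - 5 = 0."""
    return math.exp(v) + v - 5.0


def residual_slope(v):
    """Derivative of the test equation with respect to v."""
    return math.exp(v) + 1.0


root, iterations = newton_raphson(residual, residual_slope, z0=1.0)
print(f"root = {root:.6f} after {iterations} iterations")
```

Once the guess is close to the root the number of correct digits roughly doubles each iteration, which is why the Newton method needs so few iterations compared with the Gummel method.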

16.4 Fabrication process simulation

Process simulation starts with defining the process. A typical process may contain about 140 steps or so. Although all of the wafer is affected by all of the steps, different regions of the wafer will be masked during different steps. Although there may be 140 processing steps, there may be only 10 different types of steps. For example the implantation step may be repeated more than twenty times, similarly for the oxidation steps etc.

So the process definition starts by listing the steps in chronological order and assigning their detailed parameters, for example an implantation step would require you to specify the dopant, the energy, dose and angle. Similarly for all the other steps. Then for each device or structure that you plan to simulate, you need to tell the simulator what the masking will look like as the device proceeds through the fabrication simulation.

The equation used to model an implantation step is given by equation 16.27 where Rp is the projected range for that energy, φ is the dose and σ is due to the straggle.

$$N(z) = \left(\frac{\phi}{\sigma\sqrt{2\pi}}\right)\exp\left(-\frac{(z - R_p)^2}{2\sigma^2}\right)$$ (16.27)
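The implant profile is a Gaussian, so its peak and its area are easy to check numerically. In the sketch below the function name and all of the numbers are assumptions chosen for illustration, not values from an implant table; lengths are in cm so that the concentration comes out per cm³.

```python
import math


def implant_profile(z, dose, projected_range, straggle):
    """Dopant concentration N(z) from equation 16.27."""
    prefactor = dose / (straggle * math.sqrt(2.0 * math.pi))
    return prefactor * math.exp(-((z - projected_range) ** 2) / (2.0 * straggle ** 2))


# Assumed example values
dose = 1e13      # ions per cm^2
r_p = 0.10e-4    # projected range: 0.10 um expressed in cm
sigma = 0.03e-4  # straggle: 0.03 um expressed in cm

dz = sigma / 50.0
depths = [i * dz for i in range(int(2 * r_p / dz) + 1)]
profile = [implant_profile(z, dose, r_p, sigma) for z in depths]
integrated = sum(profile) * dz  # rectangle-rule integral over depth
print(f"peak = {max(profile):.2e} cm^-3, integrated dose = {integrated:.2e} cm^-2")
```

Integrating the profile over depth recovers the dose to within the small part of the Gaussian tail that falls outside the sampled range, which is a useful sanity check on the units.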

The equation used to model diffusion is given by equation 16.28 where D is the diffusion coefficient. There are explicit diffusion steps where you apply a gel containing the impurity on an exposed silicon surface and then heat it as a way to dope the silicon. But even dopants that were implanted in a previous step will diffuse anytime heat is applied, for example both oxidation and annealing are high temperature steps. The effect is particularly noticeable for the finely tailored drain engineering implants.

$$\frac{\partial n}{\partial t} = D\,\nabla^2 n$$ (16.28)
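Combining equation 16.28 with the central difference of equation 16.5 gives a very simple diffusion solver in one dimension. This is a sketch only: the function name is made up, the units are arbitrary, both ends are treated as reflecting (an assumption), and the explicit time stepping used here is stable only when D∆t/∆x² is at most 0.5. A production simulator would normally use an implicit scheme.

```python
def diffuse(profile, diffusivity, dx, dt, steps):
    """Explicit finite-difference solution of equation 16.28 in one dimension.

    Both ends are treated as reflecting, so no dopant leaves the slab.
    """
    alpha = diffusivity * dt / dx ** 2
    if alpha > 0.5:
        raise ValueError("time step too large for the explicit scheme")
    n = list(profile)
    for _ in range(steps):
        padded = [n[0]] + n + [n[-1]]  # mirror the end values
        n = [padded[i] + alpha * (padded[i - 1] - 2.0 * padded[i] + padded[i + 1])
             for i in range(1, len(padded) - 1)]
    return n


start = [0.0] * 21
start[10] = 1.0  # all of the dopant starts in the middle cell
result = diffuse(start, diffusivity=1.0, dx=1.0, dt=0.25, steps=40)
print(f"peak {max(result):.3f}, total {sum(result):.3f}")
```

The peak drops and the profile spreads while the total amount of dopant stays the same, which is the behavior you expect from an anneal.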

Apart from deciding what steps are performed in what sequence and what the masking is for the different steps, the simulation of the process is no different from the device simulation. It is probably easier because the equations are a lot less complex. The most difficult part of process simulation is deciding what coefficients to use and how they change when the different steps are mixed with each other.

For example if you implant into a clean silicon surface you may get a different profile than ifyou implant into silicon that you just implanted into with a different dose and a different implant.Similarly the diffusion coefficients of impurities when they are the only dopant in clean siliconmay be different than when other impurities are also present.

So just as in the case of the device simulator the purpose of process simulation is mainly tounderstand the relative effect of a change in a process recipe rather than to determine the outcomeof a given recipe and so a mistake that happens all too often when the simulator is calibrated is totreat the calibration parameters as fudge factors.

16.5 Monte Carlo analysis

This is the most effective method of analyzing the yield of a circuit. Monte Carlo analysis is usually done when the number of independent variables is large and their interdependence is characterized by intractable equations. Because the equations used in semiconductor devices often involve exponentials, substituting variables with distribution functions can lead to equations with nested exponentials, which are not easy to solve analytically.

Monte Carlo analysis has some advantages, such as that it is easy to fully utilize the computing capability of a massively parallel system. It has some disadvantages in that each subsequent significant digit takes progressively longer to obtain, so that after the initial convergence any further improvement is minimal and unreliable.

The way you would do MC analysis is to first identify the portions of the circuit that cause a loss in yield. Then you need to set up a simulation with a pass or fail condition. For example, for a digital circuit you may set up the test simulation so that the inputs arrive at time zero and the output has to change within a specified time, such as within 100 ps.

Figure 16.9: Randomly selecting a value. [Axes: n1 versus x]

Now select the variables which you are going to vary. They have to be independent of each other, otherwise the analysis would be incorrect. For example you could vary Tox, Vth, Ldiff and the contact resistance Rc.

Now you select a set of (Tox, Vth, Ldiff, Rc) by the use of 4 random numbers n1, n2, n3, n4 lying between 0 and 1, obtained from 4 different pseudo-random number sequences using 4 different seed values. The values x are obtained from the random numbers by satisfying the equations 16.29 and 16.30, as shown in the figure 16.9.

$$n_1 = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-x^2/2}\, dx \qquad (16.29)$$

$$V_{th} = V_{th,\mathrm{mean}} + (x \cdot \sigma_{V_{th}}) \qquad (16.30)$$

You now repeat the simulations many thousands of times and count the number of successes and failures. The yield is the fraction of successes out of the total number of tries. It is critical to use a pseudo-random number sequence which has a "good spectral response", meaning that there is no discernible pattern to the sequence.
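The whole procedure can be sketched in a few lines. Everything specific here is an assumption for illustration: the means and sigmas, the seeds, and above all toy_delay_ps, which is a made-up linear formula standing in for a real circuit simulation. Solving equation 16.29 for x is done with the inverse of the standard normal distribution from the Python standard library.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, sigma 1

# Assumed (mean, sigma) pairs, for illustration only.
PARAMS = {
    "tox_nm": (3.0, 0.1),
    "vth_v": (0.40, 0.02),
    "ldiff_um": (0.02, 0.002),
    "rc_ohm": (10.0, 1.0),
}

def draw(rng, mean, sigma):
    """Map a uniform random number n in (0, 1) to a value via eqs. 16.29 and 16.30."""
    n = rng.random()
    while n == 0.0:               # the inverse is undefined at exactly 0
        n = rng.random()
    x = STD_NORMAL.inv_cdf(n)     # solve equation 16.29 for x
    return mean + x * sigma       # equation 16.30

def toy_delay_ps(p):
    """Hypothetical stand-in for a circuit simulation; returns a delay in ps."""
    return (96.0
            + 100.0 * (p["vth_v"] - 0.40)
            + 10.0 * (p["tox_nm"] - 3.0)
            + 200.0 * (p["ldiff_um"] - 0.02)
            + 0.5 * (p["rc_ohm"] - 10.0))

def estimate_yield(trials, limit_ps=100.0, seeds=(11, 22, 33, 44)):
    # One pseudo-random sequence, with its own seed, per variable.
    rngs = {name: random.Random(seed) for name, seed in zip(PARAMS, seeds)}
    passes = 0
    for _ in range(trials):
        sample = {name: draw(rngs[name], *PARAMS[name]) for name in PARAMS}
        if toy_delay_ps(sample) <= limit_ps:   # the pass or fail condition
            passes += 1
    return passes / trials

print(f"estimated yield: {estimate_yield(20000):.3f}")
```

The statistical error of such an estimate shrinks only as one over the square root of the number of trials, which is why each further significant digit costs roughly a hundred times more simulations.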

Chapter 17

Power electronics

High power devices are primarily devices with a very large area. By distributing the current over a large area, the velocity of the carriers is kept to a reasonable number similar to that in a device operating at lower currents, so the current density is about the same as in any device. The other issue is the heat generated due to the flow of current, so the casing may be metal and the chip is in good thermal contact with the casing.

High voltage is a different issue from high power and is usually achieved using longer space charge regions. Here the goal is to keep the electric field at the same level as it is in a normal device; for example, a 0.13 µm FET operating at 1.8 V will have an electric field in the channel of about 140 kV/cm, so high voltage devices would also use similar field strengths. The doping density will also be lower so that at junctions the depletion width can be large, but even so the electric field will be the highest at the junction. The long space charge region has the effect of making the device slower due to a longer transit time for the carriers.
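The field quoted above is the supply voltage divided by the channel length, and the same arithmetic gives a feel for how long a space charge region has to be at a higher voltage. The sketch below assumes a uniform (average) field and a hypothetical 100 V device; both are simplifications for illustration, and the function names are made up.

```python
def field_kv_per_cm(voltage_v, length_um):
    """Average electric field when a voltage is dropped across a given length."""
    length_cm = length_um * 1e-4
    return voltage_v / length_cm / 1e3

def length_um_for_field(voltage_v, field_kv_cm):
    """Length needed to hold a voltage at a chosen average field."""
    return voltage_v / (field_kv_cm * 1e3) * 1e4

# The 0.13 um FET at 1.8 V from the text.
low_voltage_field = field_kv_per_cm(1.8, 0.13)

# Assumed example: a 100 V device held to the same average field.
drift_length = length_um_for_field(100.0, low_voltage_field)

print(f"field: {low_voltage_field:.0f} kV/cm, length at 100 V: {drift_length:.1f} um")
```

At the same field, the length simply scales with the voltage, which is why the carriers take proportionally longer to cross it.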

17.1 Alternating current

Nikola Tesla pioneered practical alternating current systems and made it possible to transmit electricity over very large distances over the power lines. The power lines that you see usually transmit electricity at about 66,000 volts. The reason is shown in the figure 17.1.

Figure 17.1: Using direct current. [Circuit labels: Vsup, Rt, RL]

Let us suppose that the supply voltage in the figure 17.1 is 120 volts DC. If you have an appliance that requires 1 A of current, its resistance is 120 Ω. Let us now suppose that 1 mile of power line (both ways) is ≈ 4 Ω. So if you are 100 miles from the power station, Rt = 400 Ω. So if the voltage at the supply station is 120 V, what the load actually sees is 120/(400 + 120) ≈ 0.23 A. Even if the load resistance is reduced almost to zero, the maximum current supplied by the station is 120/400 = 0.3 A. Now consider the situation for alternating current as shown in the figure 17.2.
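The numbers in this example follow from Ohm's law applied to the line and the load in series. A short check, using only the values given in the text (the function name is made up for the example):

```python
def dc_load_current(v_supply, r_line, r_load):
    """Current through a load that is in series with the line resistance."""
    return v_supply / (r_line + r_load)

R_LINE = 100 * 4.0      # 100 miles at about 4 ohm per mile (both ways)
R_LOAD = 120.0 / 1.0    # an appliance that draws 1 A at 120 V

i_load = dc_load_current(120.0, R_LINE, R_LOAD)   # what the load actually gets
i_short = dc_load_current(120.0, R_LINE, 0.0)     # load resistance reduced to zero
v_at_load = i_load * R_LOAD                       # voltage left across the load

print(f"load current {i_load:.2f} A, limit {i_short:.2f} A, "
      f"voltage at load {v_at_load:.1f} V")
```

Most of the 120 V is dropped across the line, so the appliance sees only a fraction of the supply voltage.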

Figure 17.2: Using alternating current. [Circuit labels: Vsup, Rt, RL]

In this figure the load is on the other side of a step down transformer from the power station. On the primary side we use a supply voltage of 66,000 volts AC. The current in the primary depends on the load on the secondary. So if you have the same requirement of 1 A on the secondary, that works out to 1 × √2 = 1.414 A peak, and that only requires about 0.0026 A peak at 66,000 V on the primary side, which means that the voltage dropped across the transmission line is about 0.00257 × 400 = 1.028 V, which is a very very small part of 66,000 V indeed!
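The same check for the alternating current case, assuming an ideal (lossless) transformer and a 120 V secondary as in the direct current example; both of these are assumptions made for the sketch, as is the function name.

```python
import math

def primary_current(i_secondary, v_secondary, v_primary):
    """Ideal transformer: power in equals power out, so I1 = I2 * V2 / V1."""
    return i_secondary * v_secondary / v_primary

i_peak_secondary = 1.0 * math.sqrt(2.0)    # 1 A corresponds to a peak of 1.414 A
i_peak_primary = primary_current(i_peak_secondary, 120.0, 66000.0)

line_drop = i_peak_primary * 400.0         # drop across the 400 ohm line
fraction = line_drop / 66000.0             # share of the supply voltage lost

print(f"primary current {i_peak_primary:.5f} A, line drop {line_drop:.3f} V, "
      f"fraction lost {fraction:.1e}")
```

Raising the transmission voltage by a factor of 550 reduces the line current, and therefore the line drop, by the same factor.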

17.2 Transformers

When you pass current through a coil in any medium, the magnetic flux caused by the elements of that coil is seen by any conductor in the vicinity and is given by the Biot-Savart law. If there is only one coil then that coil has a self-inductance. However, if there are two coils then the changing magnetic flux caused by one coil induces an emf in the other coil, by Faraday’s law, and this drives a current when that coil is part of a closed circuit. Such a device is a transformer.

Figure 17.3: A transformer.

If the coil is wrapped around a closed loop of ferromagnetic material such as iron, as shown in the figure 17.3, then the magnetic flux is almost completely contained in the iron loop. So the coil on one side is the input and causes the flux to flow, and the coil on the other side is the output and develops an electro motive force or emf, which when connected to the output circuit will drive a current.

If the number of complete turns on the input side is n1 and the number of complete turns on the output side is n2, then the ratio of the output voltage to the input voltage is given by V2/V1 = n2/n1.
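As a small worked example of the turns ratio, the sketch below finds the ratio needed to step 66,000 V down to 120 V. The 120 V output and the function names are assumptions for illustration, and the transformer is taken to be ideal.

```python
from fractions import Fraction

def secondary_voltage(v_primary, n_primary, n_secondary):
    """Ideal transformer: V2 = V1 * n2 / n1."""
    return v_primary * n_secondary / n_primary

def turns_ratio(v_primary, v_secondary):
    """Exact n2/n1 needed to turn v_primary into v_secondary."""
    return Fraction(v_secondary, v_primary)

# n2/n1 for stepping 66,000 V down to an assumed 120 V.
ratio = turns_ratio(66000, 120)

# Use the reduced fraction as the smallest whole-number turn counts.
v_out = secondary_voltage(66000.0, ratio.denominator, ratio.numerator)

print(f"n2/n1 = {ratio}, output = {v_out:.1f} V")
```

So the input coil needs 550 turns for every turn on the output coil.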

17.3 Rectification

The biggest single use of diodes is in rectification, as shown in the figure 17.4. When the upper pin on the left is driven positive, so too is the upper pin on the right. But when the upper pin on the left is driven negative, the upper pin on the right is still driven positive. Since the upper pin on the right is always positive, rectification is said to have occurred.

Figure 17.4: Rectification.

The rectification is followed by a low pass filter to convert the rippled output into a constant value. It does not have to be an RC section as shown; it can also be an active regulation circuit that uses timed switching to keep the output voltage and current constant.
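A rough numerical sketch of rectification followed by an RC low pass filter is given below. It assumes ideal diodes with no forward drop, so that full wave rectification is just the absolute value of the input, and it uses made-up component values; the function name is also an assumption for the example.

```python
import math

def rectify_and_filter(v_peak=10.0, f_hz=60.0, r_ohm=1000.0, c_farad=50e-6,
                       cycles=60, steps_per_cycle=2000):
    """Full wave rectify a sine and smooth it with a first-order RC low pass.

    Ideal diodes are assumed. Returns the list of filtered output samples.
    """
    dt = 1.0 / (f_hz * steps_per_cycle)
    alpha = dt / (r_ohm * c_farad + dt)    # backward-Euler update factor
    v_out, samples = 0.0, []
    for k in range(cycles * steps_per_cycle):
        v_rect = abs(v_peak * math.sin(2.0 * math.pi * f_hz * k * dt))
        v_out += alpha * (v_rect - v_out)  # the RC section follows slowly
        samples.append(v_out)
    return samples

samples = rectify_and_filter()
last_cycle = samples[-2000:]               # one full input cycle, after settling
mean = sum(last_cycle) / len(last_cycle)
ripple = max(last_cycle) - min(last_cycle)
print(f"mean {mean:.2f} V, ripple {ripple:.2f} V peak to peak")
```

The filtered output settles near the average of the rectified sine, 2/π times the peak, with a small ripple at twice the input frequency. A larger RC product reduces the ripple but makes the output slower to respond.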

17.4 DC to AC conversion

A fail safe power supply stores about a half hour of energy in a battery, and if the power goes off it outputs AC without a glitch. In order to do this it has to convert a DC voltage into 60 Hz AC. The way this is done is by switching the DC on and off to produce a series of positive and negative pulses as shown in the figure 17.5.

Figure 17.5: DC to AC conversion.

If this is followed by a low pass filter, the pulses shown as solid lines will smear into the dotted line, which looks like a sine wave. Many reactive oscillatory circuits can then be used to further convert the dotted curve into a real sine wave.

17.5 DC to DC conversion

Figure 17.6: Increasing the voltage.

DC to DC converters are most often used in circuits where you wish to allow the customer to use a single battery cell of 1.5 V but the circuit needs a 3 V supply to function. The circuit used is of a type as shown in the figure 17.6.

The switch shown in the figure 17.6 is turned on and off at perhaps 50 kHz. When the switch is turned on the inductor current rises. When the switch is opened this current is forced into the capacitor until it starts to reverse direction, at which point the diode stops the current flow. If the LC oscillation frequency is 50 kHz as well, then the voltage across the capacitor will have a peak voltage of twice the input voltage. By using a different LC combination or a different switching frequency, any voltage between the supply voltage and twice the supply voltage can be generated. Similarly, any voltage between 0 and the supply voltage can be generated by using the configuration in the figure 17.7.

Figure 17.7: Decreasing the voltage.

The angular oscillation frequency of the LC pair is given by 1/√(LC), that is f = 1/(2π√(LC)), so by using a high frequency of 50 kHz or so, the inductance and capacitance used can be quite small. This type of circuit can only be integrated onto a chip for small power supplies because otherwise you would require a large capacitance. The inductor and the capacitor used may be off chip.
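To get a feel for the component sizes, the sketch below picks a capacitor and solves for the inductor that resonates with it at 50 kHz. It takes the angular resonant frequency as 1/√(LC), so that f = 1/(2π√(LC)); the 1 µF capacitor and the function names are assumptions for illustration.

```python
import math

def resonant_frequency_hz(l_henry, c_farad):
    """Resonant frequency of an ideal LC pair, f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

def inductance_for(f_hz, c_farad):
    """Inductance that resonates with a given capacitance at f_hz."""
    return 1.0 / ((2.0 * math.pi * f_hz) ** 2 * c_farad)

C = 1e-6                        # assumed capacitor: 1 uF
L = inductance_for(50e3, C)     # inductor needed for 50 kHz

print(f"L = {L * 1e6:.2f} uH, check: {resonant_frequency_hz(L, C) / 1e3:.1f} kHz")
```

Because the product LC falls with the square of the frequency, doubling the switching frequency allows the inductor or the capacitor to be four times smaller.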

17.6 Silicon Controlled Rectifier

Figure 17.8: The structure of the SCR. [Layer labels: n p n p]

SCRs are also called thyristors because they are similar to transistors but behave like thyratrons. Thyratrons have a structure similar to a triode except that the tube is not evacuated but filled with hydrogen at 1 mtorr. When the triode current flows the hydrogen is ionized and a high current can flow. Thyratrons and thyristors are used in power circuits.

An SCR has four contacts and it is formed by adding a fourth implant to the BJT as shown in the figure 17.8. It is called a rectifier because once it is turned on it will conduct current in a single direction, much like a diode. Unlike a diode it needs a turn on signal.

The way an SCR is understood is usually by the use of the equivalent circuit shown in the figure 17.9. You have a p-n-p transistor back to back with an n-p-n transistor. I1 is the base current of the n-p-n transistor, causing a collector current I2 to flow, and this is the base current for the upper p-n-p transistor.

Figure 17.9: An SCR is two back to back BJTs. [Labels: n, p, p, n, I1, I2]

So once the current flow is started by turning on the n-p-n transistor, the gating signal is no longer needed, because the collector current of the one transistor forms the base current of the other and the feedback loop keeps the current flowing.
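The latching behavior can be illustrated with a very crude toy model of the feedback loop. This is not a device simulation: the step-by-step update, the current gains, the gate current and the clamp that stands in for the external circuit limiting the current are all assumptions made for the sketch.

```python
def scr_loop(gate_pulse_steps, beta_npn=50.0, beta_pnp=2.0,
             i_gate=1e-4, i_limit=1.0, steps=12):
    """Toy model of the two-transistor loop; returns I2 at each step.

    i_limit is an assumed clamp representing the external circuit.
    """
    i_feedback = 0.0            # p-n-p collector current fed to the n-p-n base
    history = []
    for k in range(steps):
        gate = i_gate if k < gate_pulse_steps else 0.0
        i1 = gate + i_feedback                  # n-p-n base current I1
        i2 = min(beta_npn * i1, i_limit)        # n-p-n collector current I2
        i_feedback = min(beta_pnp * i2, i_limit)
        history.append(i2)
    return history

latched = scr_loop(gate_pulse_steps=2)              # short gate pulse
never_fired = scr_loop(gate_pulse_steps=0)          # no gate pulse at all
weak_loop = scr_loop(2, beta_npn=0.5, beta_pnp=0.5) # loop gain below one

print(latched[-1], never_fired[-1], weak_loop[-1])
```

With a loop gain above one the current stays at the limit long after the gate pulse has ended, while with a loop gain below one it dies away as soon as the gate signal is removed.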

17.7 Power BJTs

Figure 17.10: A power BJT. [Labels: base, emitter, ballast resistor, base region, collector region (back plane)]

Power BJTs are usually laid out in an interdigitated manner [152] as shown in the figures 17.10 and 17.11. The lowest layer is the collector, which forms the back plane of the device. The dotted layer is the base region, which is contacted by contacts from the interdigitated base fingers on the right.

The emitter fingers come in from the left. A few fingers are connected together and to the emitter contact through a ballast resistor [153]. The reason that the resistor has to be integrated into the transistor structure is that otherwise it could happen that most of the current flows through a small portion of the transistor, leading to a burn out. The current through a BJT increases with temperature, so if any region gets more current it will heat up, and then the current increases again, and so forth. So the ballast resistors cause the current to flow evenly through the transistor and avoid localized heating.

Figure 17.11: A power BJT. [Labels: base, emitter, collector]

Bibliography

[1] R. C. Miller et al. IEEE Journal of Quantum Electronics, 1965.

[2] W. T. Read. A Proposed High-Frequency, Negative-Resistance Diode. Bell System Technical Journal, 1958.

[3] B. C. Loach et al. Avalanche transit-time microwave oscillators and amplifiers. IEEE Transactions on Electron Devices, 1966.

[4] D. Scharfetter and H. K. Gummel. Large-signal analysis of a silicon Read diode oscillator. IEEE Transactions on Electron Devices, 1969.

[5] J. B. Gunn. Microwave Oscillations of Current in III-V Semiconductors. Solid State Communications, 1963.

[6] B. K. Ridley and T. B. Watkins. The Possibility of Negative Resistance Effects in Semiconductors. Proceedings of the Physical Society, 1961.

[7] C. Hilsum. Transferred Electron Amplifiers and Oscillators. Proceedings of the Institute of Radio Engineers, 1962.

[8] J. S. Blakemore. Major properties of GaAs. Journal of Applied Physics, 1982.

[9] Shyh Wang. Fundamentals of semiconductor theory and device physics. Prentice Hall, 1989.

[10] Herbert Kroemer. Quantum Mechanics. Prentice Hall, 1994.

[11] Ramamurti Shankar. Principles of Quantum Mechanics. Plenum, 1994.

[12] Murray R. Spiegel. Mathematical handbook. McGraw-Hill, 1968.

[13] F. Bloch. Zeitschrift fur Physik, 1928.

[14] R. de L. Kronig and W. G. Penney. Proc. R. Soc. London, 1930.

[15] G. Dresselhaus et al. Physical Review, 1955.

[16] W. B. Joyce and R. W. Dixon. Analytic approximations for the Fermi energy of an ideal Fermi gas. Applied Physics Letters, 1977.

[17] W. Shockley et al. Physical Review, 1950.

[18] J. R. Haynes and W. Shockley. Physical Review, 1951.

[19] W. Shockley and W. T. Read. Physical Review, 1952.

[20] R. N. Hall. Physical Review, 1952.

[21] H. C. Casey et al. Journal of Applied Physics, 1976.

[22] E. M. Conwell. Properties of silicon and germanium. Proceedings of the Institute of Radio Engineers, 1958.

[23] S. M. Sze et al. Resistivity, Mobility, and impurity levels in GaAs at 300 K. Solid-State Electronics, 1968.

[24] B. I. Halperin et al. Physical Review, 1966.

[25] J. R. Brews. A Charge-Sheet Model of the MOSFET. Solid-State Electronics, 1978.

[26] V. G. Reddi et al. Source to drain resistance beyond pinchoff in metal-oxide-semiconductor transistor. IEEE Transactions on Electron Devices, 1969.

[27] D. M. Caughey et al. Carrier mobilities in silicon empirically related to doping and field. Proceedings of IEEE, 1967.

[28] J. A. Cooper et al. Measurement of high field drift velocity of electrons in the inversion layer in silicon. IEEE Electron Device Letters, 1983.

[29] L. D. Yau. A simple theory to predict the threshold voltage in short-channel IGFETs. Solid-State Electronics, 1974.

[30] F. C. Hsu et al. IEEE Transactions on Electron Devices, 1983.

[31] H. C. Pao et al. Effects of Diffusion Current on Characteristics of Metal-Oxide Insulator-Semiconductor Transistors. Solid-State Electronics, 1966.

[32] R. F. Pierret et al. Simplified long-channel MOSFET theory. Solid-State Electronics, 1983.

[33] W. M. Werner. The Work Function Difference of the MOS-System with Aluminum Field Plates and Polysilicon Field Plates. Solid-State Electronics, 1974.

[34] L. A. Akers et al. A model of a narrow-width MOSFET including tapered oxide and doping encroachment. IEEE Transactions on Electron Devices, 1981.

[35] L. A. Akers. The inverse narrow-width effect. IEEE Electron Device Letters, 1986.

[36] S. Adachi. GaAs, AlAs, and AlxGa1−xAs: Material parameters for use in research and device applications. Journal of Applied Physics, 1985.

[37] Sitaramarao S. Yechuri et al. Design of Flat-Band AlGaAs Heterojunction Bragg Reflectors. IEEE Transactions on Electron Devices, 1996.

[38] R. Bashir et al. Atomic force microscopy studies of self assembled Si1−xGex islands produced by controlled relaxation of strained films. Journal of Vacuum Science Technology, 2001.

[39] P. P. Debye et al. Physical Review, 1954.

[40] A. S. Grove. Redistribution of acceptor and donor impurities during thermal oxidation of Si. Journal of Applied Physics, 1964.

[41] P. Burggraaf. Wafer steppers and lens options. Semiconductor International, 1986.

[42] P. R. Gray and R. G. Meyer. Analysis and design of analog integrated circuits. John Wiley, 1993.

[43] Nobuhiko Mutoh et al. New empirical relation for MOSFET 1/f noise unified over linear and saturation regions. Solid-State Electronics, 1988.

[44] Fernando Colombani et al. Extraction of microwave noise parameters of FET devices. IEEE MTT-S Digest, 1990.

[45] Alfy Riddle. Extraction of FET model noise-parameters from measurement. IEEE MTT-S Digest, 1991.

[46] Alain Cappy et al. High-frequency FET noise performance: A new approach. IEEE Transactions on Electron Devices, 1989.

[47] Sam Pritchett et al. Improved FET noise model extraction method for statistical model development. IEEE MTT-S Digest, 1995.

[48] G. A. Lang et al. Chemical polishing of silicon with anhydrous hydrogen chloride. RCA Review, 1963.

[49] R. Nuttall. The dependence on deposition conditions of the dopant concentration of epitaxial layers. Journal of the Electrochemical Society, 1964.

[50] W. H. Shepherd. Doping of epitaxial silicon. Journal of the Electrochemical Society, 1968.

[51] D. J. Sykes. Recent advances in negative and positive photoresist technology. Solid State Technology, 1973.

[52] L. N. Lie et al. High pressure oxidation of silicon in dry oxygen. Journal of the Electrochemical Society, 1982.

[53] E. A. Irene et al. Silicon oxidation studies: The role of H2O. Journal of the Electrochemical Society, 1977.

[54] R. R. Razouk et al. Kinetics of high pressure oxidation of silicon in pyrogenic steam. Journal of the Electrochemical Society, 1981.

[55] J. Klerer. On the mechanism of the deposition of Silica by Pyrolytic decomposition of Silanes. Journal of the Electrochemical Society, 1965.

[56] N. Goldsmith et al. The deposition of vitreous silicon dioxide films from silane. RCA Review, 1967.

[57] M. Miyake et al. Incidence angle dependence of planar channeling in boron ion implantation in silicon. Journal of the Electrochemical Society, 1983.

[58] J. Narayan et al. Characteristics of ion-implantation damage and annealing phenomena in semiconductors. Journal of the Electrochemical Society, 1984.

[59] C. Murray. Wet etching update. Semiconductor International, 1986.

[60] D. R. Turner. On the mechanism of chemically etching germanium and silicon. Journal of the Electrochemical Society, 1960.

[61] J. Kleinberg et al. Inorganic Chemistry. Heath and Co., 1960.

[62] W. R. Runyan and K. E. Bean. Semiconductor integrated circuit processing technology. Addison-Wesley, 1990.

[63] D. L. Flamm et al. Basic chemistry and mechanisms of plasma etching. Journal of Vacuum Science Technology, 1983.

[64] D. F. Downey et al. Introduction to reactive ion beam etching. Solid State Technology, 1981.

[65] V. Hoffman. High rate magnetron sputtering for metallizing semiconductor devices. Solid State Technology, 1976.

[66] G. Harbeke et al. Growth and physical properties of LPCVD polycrystalline silicon films. Journal of the Electrochemical Society, 1984.

[67] T. Chung. Study of aluminum fusion into silicon. Journal of the Electrochemical Society, 1962.

[68] G. L. Schnable et al. Aluminum metallization - advantages and limitations for integrated circuit applications. Proceedings of IEEE, 1969.

[69] A. J. Learn. Evolution and current status of aluminum metallization. Journal of the Electrochemical Society, 1976.

[70] P. Burggraaf. Silicide technology spotlight. Semiconductor International, 1985.

[71] A. E. Morgan et al. Characterization of a self-aligned cobalt silicide process. Journal of the Electrochemical Society, 1987.

[72] K. Venkat et al. Timing verification of dynamic circuits. IEEE Journal of Solid-State Circuits, 1996.

[73] R. J. Widlar. Some circuit design techniques for linear integrated circuits. IEEE Transactions on Circuit Theory, 1965.

[74] R. J. Widlar. New Developments in IC Voltage Regulators. IEEE Journal of Solid-State Circuits, 1971.

[75] J. S. Brugler. Silicon transistor biasing for linear collector current temperature dependence. IEEE Journal of Solid-State Circuits, 1967.

[76] Y. P. Tsividis et al. A CMOS voltage reference. IEEE Journal of Solid-State Circuits, 1978.

[77] K. R. Lakshmikumar et al. Characterization and modeling of mismatch in MOS transistors for precision analog design. IEEE Journal of Solid-State Circuits, 1986.

[78] M. J. M. Pelgrom et al. Matching properties of MOS transistors for precision analog design. IEEE Journal of Solid-State Circuits, 1989.

[79] S. J. Lovett et al. Optimizing MOS Transistor Mismatch. IEEE Journal of Solid-State Circuits, 1998.

[80] B. E. Boser. The design of sigma-delta modulation analog-to-digital converters. IEEE Journal of Solid-State Circuits, 1988.

[81] G. F. Landsburg. A charge balancing monolithic A/D converter. IEEE Journal of Solid-State Circuits, 1977.

[82] S. R. Norsworthy et al. Delta-Sigma Data Converters: Theory, Design and Simulation. John Wiley, 1996.

[83] R. Schreier et al. Delta-sigma modulators employing continuous time circuitry. IEEE Transactions on Circuits and Systems, 1996.

[84] J. A. Cherry et al. Continuous-time Delta-Sigma Modulators for High-speed A/D conversion: Theory, Practice and Fundamental Performance limits. Kluwer Academic, 1999.

[85] A. P. Chandrakasan et al. Low-power CMOS digital design. IEEE Journal of Solid-State Circuits, 1992.

[86] K. Fukahori. A high precision micropower operational amplifier. IEEE Journal of Solid-State Circuits, 1979.

[87] G. W. Taylor. Subthreshold conduction in MOSFETs. IEEE Transactions on Electron Devices, 1978.

[88] J. Ramirez-Agulo et al. Characterization, evaluation and comparison of laser trimmed film resistors. IEEE Journal of Solid-State Circuits, 1987.

[89] J. A. Babcock et al. Precision electrical trimming of very low TCR Poly-SiGe resistors. IEEE Electron Device Letters, 2000.

[90] J. Deverell. Pipeline iterative arithmetic array. IEEE Transactions, 1975.

[91] H. H. Guild. Fully iterative fast array for binary multiplication and addition. Electronics Letters, 1969.

[92] J. V. McCanny et al. Completely iterative, pipelined multiplier array suitable for VLSI. Proceedings of IEE, 1982.

[93] R. F. Lyon. Two’s complement pipeline multipliers. IEEE Transactions, 1976.

[94] H. T. Kung. Why systolic architectures? Computer Magazine, 1982.

[95] A. Robertson. A new class of digital division methods. IRE Transactions Electronic Computers, 1958.

[96] Tocher. Techniques of multiplication and division for automatic binary computers. Quarterly J. Mech. and Applied Math, 1958.

[97] N. Kurd et al. Multi-GHz Clocking Scheme for Intel Pentium 4 Microprocessor. IEEE International Solid-State Circuits Conference, 2001.

[98] T. Xanthopoulos et al. The Design and Analysis of the Clock Distribution Network for a 1.2 GHz Alpha Microprocessor. IEEE International Solid-State Circuits Conference, 2001.

[99] Floyd M. Gardner. Phaselock techniques. John Wiley & Sons, 1979.

[100] Katsuhiko Ogata. Modern control engineering. Prentice Hall, 1997.

[101] Ronald N. Bracewell. The Fourier transform and its applications. McGraw-Hill, 1986.

[102] James W. Cooley and John W. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 1965.

[103] G. C. Danielson and C. Lanczos. Some improvements in practical Fourier analysis and their application to x-ray scattering from liquids. J. Franklin Inst., 1942.

[104] D. A. Huffman. A method for construction of minimum redundancy codes. Proceedings of the Institute of Radio Engineers, 1952.

[105] J. A. Miller. Maximally flat nonrecursive filters. Electronics Letters, 1972.

[106] B. C. Jinaga et al. Coefficients of maximally flat nonrecursive digital filters. Signal Processing, 1984.

[107] C. E. Willert and M. Gharib. Digital particle image velocimetry. Experiments in Fluids, 1991.

[108] I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of the society for industrial and applied mathematics, 1960.

[109] N. Zierler. Linear recurring sequences. Journal of the society for industrial and applied mathematics, 1959.

[110] P. Elias. Coding for noisy channels. IRE Convention Record, 1955.

[111] A. J. Viterbi. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 1967.

[112] T. Jung et al. A 500 MHz ATE pin driver. Bipolar/BiCMOS Circuits and Technology Meeting, 1992.

[113] S. B. Cohn. Characteristic impedance of the shielded-strip transmission line. IRE Transactions on MTT, 1954.

[114] H. A. Wheeler. Transmission-line properties of a strip on a dielectric sheet on a plane. IEEE Transactions on Microwave Theory and Techniques, 1977.

[115] H. A. Wheeler. Transmission line properties of a stripline between parallel planes. IEEE Transactions on Microwave Theory and Techniques, 1978.

[116] S. B. Cohn. Slotline on a Dielectric Substrate. IEEE Transactions, 1969.

[117] K. C. Gupta et al. Microstrip lines and slotlines. Artech House, 1996.

[118] H. Bex. New broadband balun. Electronics Letters, 1975.

[119] C. E. Fay et al. Operation of the Ferrite Junction Circulator. IEEE Transactions on Microwave Theory and Techniques, 1965.

[120] R. W. Klopfenstein. A transmission line taper of improved design. Proceedings of the Institute of Radio Engineers, 1965.

[121] R. E. Collin. The optimum tapered transmission line matching section. Proceedings of the Institute of Radio Engineers, 1965.

[122] S. B. Cohn. Optimum design of stepped transmission line transformers. IEEE Transactions on Microwave Theory and Techniques, 1955.

[123] P. I. Richard. Resistor-transmission-line circuits. Proceedings of the Institute of Radio Engineers, 1948.

[124] M. C. Horton et al. General Theory and Design of Optimum Quarter-Wave TEM Filters. IEEE Transactions on Microwave Theory and Techniques, 1965.

[125] L. Young. Microwave Filters –1965. IEEE Transactions on Microwave Theory and Techniques, 1965.

[126] B. J. Minnis. Printed circuit coupled-line filters for bandwidths up to and greater than an octave. IEEE Transactions on Microwave Theory and Techniques, 1981.

[127] R. F. Kopf et al. N- and p-type dopant profiles in distributed Bragg reflector structures and their effect on resistance. Applied Physics Letters, 1992.

[128] E. F. Schubert et al. Elimination of heterojunction band discontinuities by modulation doping. Applied Physics Letters, 1992.

[129] F. Capasso et al. AlGaAs/GaAs staircase avalanche photodiodes with high and extremely uniform avalanche gain. IEEE IEDM, 1988.

[130] H. C. Casey et al. Heterostructure Lasers. Academic, 1978.

[131] O. K. Kim et al. A low dark-current, planar InGaAs p-i-n photodiode. IEEE Journal of Quantum Electronics, 1985.

[132] W. Franz. Z. Naturforsch., 1958.

[133] L. V. Keldysh. Sov. Phys. JETP, 1958.

[134] N. Susa et al. IEEE Journal of Quantum Electronics, 1981.

[135] J. C. Campbell et al. Journal of Applied Physics, 1982.

[136] R. W. Keyes. The effects of elastic deformation on the electrical conductivity of semiconductors. Solid State Physics, 1960.

[137] F. B. Hildebrand. Introduction to numerical analysis. Dover, 1974.

[138] Athanasios Papoulis. Probability, random variables, and stochastic processes. McGraw-Hill, 1991.

[139] R. K. Livesley. Finite elements: an introduction for engineers. Cambridge University Press, 1983.

[140] W. H. Press et al. Numerical recipes in C: The art of scientific computing. Cambridge University Press, 1992.

[141] I. A. Duff. A survey of sparse matrix research. Proceedings of IEEE, 1977.

[142] K. Levenberg. Quarterly Applied Math, 1944.

[143] D. W. Marquardt. Journal of the society for industrial and applied mathematics, 1963.

[144] C. S. Rafferty et al. Iterative methods in semiconductor device simulation. IEEE Transactions on Electron Devices, 1985.

[145] R. E. Bank et al. An adaptive, multi-level method for elliptic boundary value problems. Computing, 1981.

[146] S. Selberherr et al. MINIMOS - A Two-Dimensional MOS analyzer. IEEE Transactions on Electron Devices, 1980.

[147] A. Yoshii et al. A three-dimensional analysis of semiconductor devices. IEEE Transactions on Electron Devices, 1982.

[148] M. K. Lundstrom et al. Numerical analysis of heterostructure semiconductor devices. IEEE Transactions on Electron Devices, 1983.

[149] G. B. Lush. A study of minority carrier lifetime versus doping concentration in n-type GaAs grown by metalorganic chemical vapor deposition. Journal of Applied Physics, 1992.

[150] G. Bemski. Recombination in semiconductors. Proceedings of IEEE, 1958.

[151] H. K. Gummel. A self-consistent iterative scheme for one-dimensional steady state transistor calculations. IEEE Transactions on Electron Devices, 1964.

[152] R. Allison. Silicon bipolar microwave power transistor. IEEE Transactions on Microwave Theory and Techniques, 1979.

[153] R. P. Arnold et al. A quantitative study of emitter ballasting. IEEE Transactions on Electron Devices, 1974.
