HIGH SPEED ROM FOR DIRECT DIGITAL SYNTHESIZER
APPLICATIONS IN INDIUM PHOSPHIDE
DHBT TECHNOLOGY
By
Sanjeev Manandhar
B.S. University of Maine, 2004
A THESIS
Submitted in Partial Fulfillment of the
Requirements for the Degree of
Master of Science
(in Electrical Engineering)
The Graduate School
The University of Maine
August, 2006
Advisory Committee:
David E. Kotecki, Associate Professor of Electrical and Computer Engineering,
Advisor
Donald M. Hummels, Professor of Electrical and Computer Engineering
Richard O. Eason, Associate Professor of Electrical and Computer Engineering
LIBRARY RIGHTS STATEMENT
In presenting this thesis in partial fulfillment of the requirements for an advanced
degree at The University of Maine, I agree that the Library shall make it freely available
for inspection. I further agree that permission for “fair use” copying of this thesis for
scholarly purposes may be granted by the Librarian. It is understood that any copying
or publication of this thesis for financial gain shall not be allowed without my written
permission.
Signature:
Date:
HIGH SPEED ROM FOR DIRECT DIGITAL SYNTHESIZER
APPLICATIONS IN INDIUM PHOSPHIDE
DHBT TECHNOLOGY
By Sanjeev Manandhar
Thesis Advisor: Dr. David E. Kotecki
An Abstract of the Thesis Presented in Partial Fulfillment of the Requirements for the
Degree of Master of Science (in Electrical Engineering)
August, 2006
An increasing demand for multi-GHz direct digital synthesizers (DDS) has prompted research
into high-speed phase to amplitude converters. A high-speed read-only memory (ROM)
look-up table operating at clock frequencies in the 20-40 GHz range can be used as a
high-speed phase to amplitude converter. A 16x6-bit ROM, employing an architecture suitable
for use as a phase to amplitude converter for a DDS, has been implemented in InP double
heterojunction bipolar transistor (DHBT) technology. The ROM uses a -3.8 V power
supply and dissipates 1.13 W of power. The ROM is implemented in a test circuit that
includes an 8-bit accumulator and a 6-bit digital to analog converter (DAC) to facilitate
demonstration of high-speed operation. The maximum operating clock frequency is
measured to be 36 GHz. To increase the bit size of the ROM, two 32x6-bit ROMs were
designed. The first 32x6-bit ROM is designed for low power consumption; the second
32x6-bit ROM is designed to operate at maximum clock frequency. Schematic
simulation results showed that these ROMs operated at clock frequencies of 20 GHz and
46 GHz, respectively. The low-power design consumes 1.95 W and the high-speed design
consumes 7.07 W of power.
ACKNOWLEDGMENTS
This work has been supported by the U.S. Army Research Lab and Defense
Advanced Research Projects Agency (DARPA) under contract DAAD17-02-C-0115. I
would like to thank Mr. Frank Stroili and Mr. Richard Elder at BAE Systems, Dr. Steve
Pappert at DARPA MTO, and Dr. Alfred Hung at U.S. Army Research Laboratory for
their guidance and support.
I would like to thank my mom and dad for their support and guidance throughout
my academic endeavor.
Thank you, Professor Kotecki, for your guidance, suggestions and advice during
this research work. Thank you, Professor Hummels and Professor Eason, for your
valuable advice. Finally, thank you, Alma, Steve, Ryan, Zhineng and Bingxin, for your help
and happy hours during these past two years.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
Chapter
1. Introduction
    1.1. Background
    1.2. Purpose of the Research
    1.3. Thesis Organization
2. ROM LUT and Bipolar ROM
    2.1. ROM Look-Up Table
    2.2. Bipolar ROM Designs
        2.2.1. The Emitter-Coupled Logic Gate
        2.2.2. ROM memory cell
            2.2.2.1. Schottky cell ROM model
            2.2.2.2. Transistor cell ROM model
        2.2.3. Row Decoder using Bipolar Transistors
            2.2.3.1. Schottky Barrier diode decoder combined with a pull-up circuit
            2.2.3.2. ECL-based Decoder
        2.2.4. Bipolar Sense Amplifier
    2.3. Bipolar ROM in DDS
    2.4. ROM Compression Techniques
3. 16x6-Bit High Speed Bipolar ROM in InP
    3.1. Circuit Design
        3.1.1. Row Decoder
        3.1.2. ROM Memory Cell Array
        3.1.3. Sense Amplifier
    3.2. Test Circuit
    3.3. Simulation Results
    3.4. Measurement Results
4. 32x6-bit ROM designs
    4.1. ROM Designs
        4.1.1. ROM with 32x6-bit Memory core
        4.1.2. 32x6-bit ROM with 2:1 MUX
    4.2. Simulation Results
5. Conclusion
    5.1. Summary of Accomplishment
    5.2. Recommendation for Future Work
REFERENCES
BIOGRAPHY OF THE AUTHOR
LIST OF TABLES
Table 2.1. The comparison of different ROM compression techniques for ROM LUT in DDS.
Table 3.1. The bit pattern in the ROM.
Table 4.1. The bit pattern in Bit 4 of accumulator, Bit 5 of ROM A and Bit 5 of ROM B.
Table 4.2. The power dissipated by ROMs with different memory core sizes.
Table 4.3. Power and speed comparison of three different ROMs.
LIST OF FIGURES
Figure 1.1. Block diagram of a Sine-output DDS.
Figure 2.1. Basic ECL gate consisting of a differential pair and emitter follower.
Figure 2.2. Schottky diode memory cell array.
Figure 2.3. Transistor memory cell array.
Figure 2.4. Schottky Barrier Diode Decoder with Pull-up Circuit.
Figure 2.5. Predecoder design in ECL-based Decoder.
Figure 2.6. RAM Cell Sense Amplifier.
Figure 3.1. Block diagram of 16x6-bit ROM.
Figure 3.2. The schematic of a single row of the row decoder circuit.
Figure 3.3. The simulation showing only one row of decoder output is logic high while others stay low.
Figure 3.4. The schematic of the ROM memory cell array with sense amplifier.
Figure 3.5. The schematic of the sense amplifier.
Figure 3.6. Functional schematic of the ROM test circuit.
Figure 3.7. Simulated differential output voltage from the ROM under four different simulation settings: (1) schematic of the row decoder, the ROM memory cell array, and the sense amplifier without layout parasitics: (2) the row decoder with layout parasitics, schematic of the ROM memory cell array and sense amplifier without layout parasitics: (3) schematic of the decoder without layout parasitics, the ROM memory cell array and the row decoder with layout parasitics: and (4) the row decoder, the ROM memory cell array, and the sense amplifier with layout parasitics.
Figure 3.8. The skew in the row 16 of MSB when the ROM test circuit is clocked at 36 GHz and FCW set to 16.
Figure 3.9. Microphotograph of ROM test circuit.
Figure 3.10. The DAC output data plot of the ROM test circuit clocked at 36 GHz and FCW set to 1.
Figure 3.11. The DAC output data plot of the ROM test circuit clocked at 36 GHz and FCW set to 16.
Figure 3.12. The output of MSB of ROM programmed with bit pattern 1111110111111110 when the accumulator is clocked at 36 GHz and the accumulator FCW is set to 1.
Figure 3.13. The output of MSB of ROM programmed with bit pattern 1111110111111110 when the accumulator is clocked at 36 GHz and the accumulator FCW is set to 16.
Figure 4.1. Block diagram of 32x6-bit ROM.
Figure 4.2. Functional schematic diagram of 32x6-bit ROM with 2:1 MUX.
Figure 4.3. The schematic diagram of 2:1 multiplexer.
Figure 4.4. The MUX output with glitch that occurs when MUX is switching ROM inputs.
Figure 4.5. The schematic diagram of D-latch.
Figure 4.6. Buffered glitch-free output from D-latch.
Figure 4.7. The buffered and latched MUX output of Bit 5 from simulation of 32x6-bit ROM with 2:1 MUX.
Figure 4.8. Simulated differential output voltage of ROMs: (a) schematic of 32x6-bit ROM with 2:1 MUX and D-latch: (b) schematic of ROM with 32x6-bit memory core: and (c) schematic of 16x6-bit ROM.
LIST OF ABBREVIATIONS
BVceo    transistor break-down voltage
CML      current mode logic
CMOS     complementary metal-oxide-semiconductor
CORDIC   co-ordinate rotation digital computer
DAC      digital to analog converter
DDS      direct digital synthesizer
DHBT     double heterojunction bipolar transistor
ECL      emitter coupled logic
fclk     clock frequency
fout     output frequency
fmax     maximum oscillation frequency
ft       transit frequency
FCW      frequency control word
HBT      heterojunction bipolar transistor
InP      Indium Phosphide
LSB      least significant bit
LUT      look up table
MSB      most significant bit
MSFOM    mixed-signal figure of merit
MUX      multiplexers
PLL      phase-locked loop
RAM      random access memory
ROM      read-only memory
SBD      Schottky barrier diode
SFDR     spurious-free dynamic range
CHAPTER 1
Introduction
1.1 Background
Frequency synthesizers capable of operating in the microwave range are in
growing demand in communications. Direct digital synthesizer (DDS) and
phase-locked loop (PLL) circuits can be used for generating many different frequencies
from a single frequency source. DDSs are widely used to generate frequencies
from DC to multi-GHz and offer advantages over other methods. Unlike conventional
PLL frequency synthesizers, which suffer from an inherent inability to simultaneously
provide both fast frequency switching and high purity [1], the DDS can simultaneously
achieve both fast switching and high spectral purity.
A DDS can output arbitrary waveforms, but this research focuses on a DDS producing
a sine output. A simple block diagram of a sine-output DDS is shown in Figure 1.1.
The DDS system is composed of a phase accumulator, a phase to amplitude
converter, a digital to analog converter (DAC) and a low pass filter. The accumulator is
a phase counter whose phase increment is controlled by the "k" input. The phase increment
is added to the previous output value of the accumulator on every clock cycle. The
phase output of the accumulator is used as an input to the phase to amplitude converter.
The phase to amplitude converter acts as a look-up table (LUT) and is usually implemented
with a read only memory (ROM). It generates the digital output representing the
desired output voltage of the waveform. This digital output is converted into an analog
sine wave using a DAC. The low pass filter eliminates the high frequency components of
the discrete sine wave and provides a clean sine waveform.
Figure 1.1: Block diagram of a Sine-output DDS. (Blocks: phase accumulator with input k,
phase to sine converter, DAC, low pass filter.)
The phase to amplitude converter is one of the critical parts of the DDS. Some
of the important characteristics of DDS systems, such as speed, spurious free dynamic
range (SFDR), power and chip area, depend on the phase to amplitude converter.
Vankka [2] has compared various methods and architectures that can be employed for
phase to amplitude converters. A ROM LUT is one of the most widely used methods to
implement the phase to amplitude converter because it has better SFDR and a simpler
circuit design than other architectures. The ROM is often the limiting factor for the
speed of the DDS [3, 4]. When the ROM is made larger to attain good SFDR, it slows
down, occupies a larger area and consumes a significant percentage of the DDS power. A
high-speed ROM LUT that is optimized for this tradeoff can give both good SFDR and a high
operating frequency.
CMOS technologies become faster with each generation, but CMOS devices
are an order of magnitude slower than state of the art bipolar transistor technology.
The increasing demand for higher speed DDS circuits and the frequency limitations of
CMOS technologies have necessitated the development of ROMs implemented in double
heterojunction bipolar transistor (DHBT) technology. ROMs designed in CMOS
technologies typically utilize NAND or NOR architectures [5] and provide operation
in the 1-2 GHz range because of limitations in the CMOS technologies. The fastest
recently reported CMOS ROM, by Takahashi et al., operates at 1.1 GHz [6]. In comparison,
recently reported InP DDS circuits have been shown to operate at clock frequencies
from 12 GHz to 28 GHz [7]-[9], and accumulators have been reported operating at over
40 GHz [10].
The Vitesse VIP-2 InP DHBT process [11] provides transistors with ft and fmax
both over 300 GHz and a BVceo over 4 V. This is a self-aligned process with good yield.
In terms of speed, the new technology performs as well as, if not better than, other
technologies such as SiGe and InP HBT. The new technology provides a current mode logic
(CML) ring-oscillator gate delay of 1.95 ps [11] and an emitter coupled logic (ECL)
static-divider frequency of 152 GHz [11], compared to ring-oscillator gate delays of
3.6 ps [12] and 3.21 ps [13] and static-divider frequencies of 96 GHz [14] and 100 GHz [15]
from SiGe and InP HBT technology, respectively. The high-frequency, high break-down voltage
transistors of this process are advantageous when designing high-speed circuits. The mixed
signal figure of merit (MSFOM) is one measure of the performance
of HBT mixed-signal circuits. MSFOM has been reported to be directly related to BVceo
by Zolper [16]; MSFOM improves with higher BVceo. Higher BVceo in InP DHBT
technology also provides a higher dynamic range and linearity. Despite these advantages,
ROM designs operating at 30-40 GHz remain a challenge. A high-speed ROM design
implemented in an InP DHBT process is investigated in this thesis.
1.2 Purpose of the Research
The purpose of this research is to design a ROM, suitable for use as a LUT, operating at
a clock frequency in the 30-40 GHz range in InP DHBT technology. This research investigates
whether InP DHBT technology can provide a high-speed ROM suitable for DDS
applications. There has been very little reported prior research work on HBT ROMs in
general. Likewise, InP DHBT processes are costly and not easily accessible, so very little
research has been reported for ROMs in InP DHBT technology. This thesis provides
the first reported research on ROMs designed and fabricated in an InP DHBT process.
The InP DHBT process is different from CMOS and other bipolar technologies.
ROM designs from other technologies cannot be directly adapted to InP DHBT
technology. In this research, the prior work done on bipolar ROM designs has been
used as a starting point for the new design. A 16x6-bit ROM has been designed in InP
DHBT technology. The simulation and the test results from the ROM were analyzed.
Further, two 32x6-bit ROMs have been designed to increase the bit size of the ROM.
The simulation results from these two ROMs have been analyzed.
1.3 Thesis Organization
This thesis is structured to provide some background on ROM designs, followed
by the circuit designs, simulation and hardware results of the 16x6-bit ROM. Then
it describes two design approaches taken to increase the bit size of the ROM to 32x6 bits,
followed by simulation results.
Chapter 2 gives an overview of prior work on bipolar ROM cell, row decoder and
sense amplifier designs. The chapter also explains optimization of ROM space by using
compression techniques.
Chapter 3 describes the design blocks of a 16x6-bit ROM design implemented
in InP DHBT technology. The chapter compares simulation and test results of ROM test
circuits.
Chapter 4 describes two design approaches taken to increase the ROM bit size. The
simulation results from two 32x6-bit ROM designs are compared in terms of speed and
power.
Chapter 5 gives the summary of the thesis and recommendations for future work.
CHAPTER 2
ROM LUT and Bipolar ROM
This chapter describes the ROM LUT and its advantages over other phase to
amplitude conversion techniques. The chapter gives an overview of the bipolar ROM
memory cell, row decoder and sense amplifier designs done in the past. The later part
of the chapter describes ROM compression techniques to optimize ROM memory, and
summarizes prior work in these areas.
2.1 ROM Look-Up Table
In a DDS, the ROM is used as a LUT to convert the digital phase input from the
accumulator to an output amplitude. The accumulator output represents the phase of the
wave and also serves as the address of a word in the LUT; that word is the amplitude
corresponding to the phase. This amplitude word from the ROM LUT drives the DAC to
provide an analog output.
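The addressing scheme described above can be illustrated with a minimal behavioral sketch. It assumes a 4-bit address (16 words), 6-bit words and an 8-bit accumulator, matching the sizes quoted for the 16x6-bit ROM test circuit. The offset-binary sine coding, the use of the upper accumulator bits as the ROM address, and the names `build_sine_rom` and `dds_samples` are assumptions made for illustration only; the bit pattern actually stored in the ROM is given in Table 3.1 and is not reproduced here.

```python
import math

ADDRESS_BITS = 4   # 16 words
WORD_BITS = 6      # 6-bit amplitude words
ACC_BITS = 8       # accumulator word length N


def build_sine_rom(address_bits=ADDRESS_BITS, word_bits=WORD_BITS):
    """Return one sine period quantized to unsigned words (assumed coding)."""
    depth = 1 << address_bits
    full_scale = (1 << word_bits) - 1
    rom = []
    for address in range(depth):
        phase = 2.0 * math.pi * address / depth
        # Map sin(phase) from [-1, 1] onto [0, full_scale].
        rom.append(round((math.sin(phase) + 1.0) / 2.0 * full_scale))
    return rom


def dds_samples(fcw, count, rom, acc_bits=ACC_BITS, address_bits=ADDRESS_BITS):
    """Step the phase accumulator and look up one ROM word per clock cycle."""
    acc = 0
    samples = []
    for _ in range(count):
        # Assumption: the upper address_bits of the accumulator address the ROM.
        address = acc >> (acc_bits - address_bits)
        samples.append(rom[address])
        # The FCW is added every clock; overflow wraps the phase.
        acc = (acc + fcw) % (1 << acc_bits)
    return samples
```

With these assumed sizes, an FCW of 16 advances the ROM address by one word per clock, while an FCW of 1 holds each word for 16 clock cycles.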
The output frequency of a DDS is defined by the frequency control word (FCW)
input to the accumulator and the clock frequency. The FCW is a phase increment that is
added in the accumulator every clock cycle. Each period of the sine wave corresponds to
one overflow of the phase in the accumulator. The frequency resolution \Delta f of the DDS
depends on the word length of the phase accumulator and is given by

$$\Delta f = \frac{f_{clk}}{2^{N}}, \tag{2.1}$$

where f_{clk} is the clock frequency and N is the accumulator word length. The DDS output
frequency is given by

$$f_{out} = \frac{f_{clk} \times FCW}{2^{N}}. \tag{2.2}$$
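As a worked example of Equations 2.1 and 2.2, the short sketch below evaluates both expressions. The 36 GHz clock and the 8-bit accumulator are the values quoted in the abstract for the test circuit; the function names are hypothetical and chosen only for this illustration.

```python
def frequency_resolution(f_clk_hz, acc_bits):
    """Equation 2.1: smallest output frequency step of the DDS."""
    return f_clk_hz / (1 << acc_bits)


def output_frequency(f_clk_hz, fcw, acc_bits):
    """Equation 2.2: output frequency for a given frequency control word."""
    return f_clk_hz * fcw / (1 << acc_bits)


if __name__ == "__main__":
    f_clk = 36e9  # 36 GHz clock
    n_bits = 8    # 8-bit accumulator
    print(frequency_resolution(f_clk, n_bits))   # resolution in Hz
    print(output_frequency(f_clk, 16, n_bits))   # output frequency in Hz for FCW = 16
```

For these values the resolution is 140.625 MHz, and an FCW of 16 gives an output frequency of 2.25 GHz.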
Using a ROM LUT as a phase to amplitude converter has some advantages over
ROMless architectures. The simplicity of the ROM circuit makes the ROM LUT
easy to implement. ROMless architectures, such as the co-ordinate rotation digital
computer (CORDIC), depend on iterations of computation to obtain the amplitude of a phase.
These computations are done using circuit cells like adders, shifters, multiplexers and
registers. In HBT technology, these cells are known for consuming a lot of power and
space. A 2-bit adder in InP DHBT reported by Turner et al. [10] consumes 0.9 W. In a
CORDIC algorithm, each iteration increases the accuracy of the result by approximately
one bit [2]. In order to reduce the truncation error and obtain higher accuracy, the number
of iterations is required to be high. This contributes to higher power dissipation and
larger area. The advantages of ROMless architectures are seen only when higher bit
accuracy is desired. At that point the ROM becomes very large, consumes high power
and becomes slow compared to a ROMless architecture. Even in CMOS technology, when
the desired amplitude resolution is less than 14 bits, a ROM LUT with compression is better
than the CORDIC algorithm technique in terms of both speed and space [2]. The ROM
LUT stores the values of the amplitudes, while ROMless architectures compute the
amplitudes. Inherently, a ROM LUT provides better SFDR than any ROMless architecture
for the same bit width. In an HBT technology, a ROM LUT can provide better SFDR,
lower power, and smaller area than a ROMless architecture.
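To make the iteration count argument concrete, the following is a minimal floating-point sketch of a rotation-mode CORDIC sine computation. It is not taken from the thesis or from [2]; a hardware implementation would use fixed-point shift-and-add stages, and the function name `cordic_sin` is hypothetical. The sketch shows only that each additional iteration shrinks the residual angle, which is why high accuracy requires many adder and shifter stages.

```python
import math


def cordic_sin(theta, iterations):
    """Approximate sin(theta) for |theta| <= pi/2 with rotation-mode CORDIC."""
    # Pre-compute the gain correction for the chosen number of iterations.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        # Rotate toward zero residual angle by +/- atan(2^-i).
        d = 1.0 if z >= 0.0 else -1.0
        step = 2.0 ** -i
        x, y = x - d * y * step, y + d * x * step
        z -= d * math.atan(step)
    return y


if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        error = abs(cordic_sin(0.5, n) - math.sin(0.5))
        print(n, error)
```

In a ROM LUT the same amplitude is obtained with a single read, at the cost of storing every word.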
The InP DHBT process [11] used in this thesis has transistors with ft and fmax
both over 300 GHz simultaneously and a high breakdown voltage BVceo of over 4 V.
A ROM implemented in InP DHBT technology will inherently provide high-speed operation.
Prior work on bipolar ROM designs and BiCMOS random access memory (RAM)
was researched to come up with a design that can get as much leverage as possible from
the high-speed devices in the InP DHBT process. The next section describes the emitter-coupled
logic (ECL) gate, which is the building block of high-speed digital logic, along with
bipolar ROM components such as the memory cell, row decoder and sense amplifier.
2.2 Bipolar ROM Designs
2.2.1 The Emitter-Coupled Logic Gate
The ECL gate is designed for high speed operation. The ECL gate consists of
a differential pair and emitter followers, as shown in Figure 2.1. The differential pair
serves as a fast current steering switch. The transistors Q1 and Q2 are identical and are
biased symmetrically. When inputs Vin1 and Vin2 are equal, current ID/2 flows through
both legs of the differential pair. The current source ID and resistor RS are chosen
to provide a voltage drop of about 300 mV across RS. The low voltage swing of the
differential pair is one of the reasons why ECL gates have high performance. When
there is a difference in voltage between Vin1 and Vin2, one leg has more current steered
through it than the other. The ratio of the collector currents flowing through transistors
Q1 and Q2 is exponentially related to the voltage difference between Vin1 and Vin2.
A difference of 120 mV between Vin1 and Vin2 will practically turn off one transistor,
for example Q2, allowing most of the current ID to flow through Q1. This results in two
output states, logic low and logic high, at the collectors of Q1 and Q2, which are given by
Equations 2.3 and 2.4.

$$V_{c1} = V_{CC} - I_D \times R_S \tag{2.3}$$

$$V_{c2} = V_{CC} \tag{2.4}$$
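The current steering described above can be checked with a small numerical sketch. It assumes an ideal bipolar differential pair (matched transistors, negligible base current, thermal voltage of about 25.85 mV near room temperature), for which the collector current ratio is exp((Vin1 - Vin2)/VT). The function names and the 1 mA and 300 ohm values are hypothetical and were chosen only to give the 300 mV swing mentioned in the text.

```python
import math

V_T = 0.02585  # assumed thermal voltage in volts (about 300 K)


def collector_currents(v_diff, i_tail, v_t=V_T):
    """Split the tail current ID between Q1 and Q2 for v_diff = Vin1 - Vin2."""
    i_c1 = i_tail / (1.0 + math.exp(-v_diff / v_t))
    i_c2 = i_tail - i_c1
    return i_c1, i_c2


def collector_voltages(v_diff, i_tail, r_s, v_cc=0.0):
    """Collector voltages Vc1 and Vc2 for load resistors RS tied to VCC."""
    i_c1, i_c2 = collector_currents(v_diff, i_tail)
    return v_cc - i_c1 * r_s, v_cc - i_c2 * r_s


if __name__ == "__main__":
    # Hypothetical values: 1 mA tail current and 300 ohm loads (300 mV swing).
    i1, i2 = collector_currents(0.120, 1e-3)
    print(i1 / i2)                                  # current ratio at 120 mV
    print(collector_voltages(0.120, 1e-3, 300.0))   # approaches Equations 2.3 and 2.4
```

Under these assumptions a 120 mV input difference steers about 99 percent of ID through Q1, so Equations 2.3 and 2.4 are the limiting values of the two collector voltages.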
The differential pair output of an ECL gate cannot drive multiple other gates
directly because the collector current used in driving the load will cause a reduction
in the voltage swing. Thus an emitter follower is connected to each of the differential
outputs to drive other gates. This causes current to be drawn from the emitter follower
instead of the collector of the differential pair. The outputs Vout1 and Vout2 are a diode drop
lower than Vc1 and Vc2, so the emitter follower also acts as a level-shifter. The ECL gate
provides differential outputs, so it serves as both an inverter and a buffer simultaneously.
The differential pair with an emitter follower is a building block for ECL multiplexers,
XORs, latches and other logic gates. The ECL gate is the most versatile circuit in
the high-speed bipolar digital circuit family.
Figure 2.1: Basic ECL gate consisting of a differential pair and emitter follower.
2.2.2 ROM memory cell
The memory cell is the most crucial building block of ROM memory. Each cell
represents a logic high or low and an array of memory cells forming a word represents
the digital amplitude of a sine wave in the DDS. The memory cell should be designed
such that it is easy to decode and read. In this section the two memory cell architectures
most commonly used in bipolar ROM design are described.
2.2.2.1 Schottky cell ROM model
A ROM cell designed with a Schottky barrier diode (SBD) as the memory cell
was described by Gunn et al. [17] in 1977. An SBD serves as a good choice for a memory
cell because it is small and can be fabricated free of emitter-collector leakage using
conventional bipolar IC technology. Figure 2.2 shows an array of SBD memory cells.
A bit is addressed by the coincidence of sinking current from a bit line and forcing a
voltage on a word line. The bit line is held high if an SBD is present at an intersection;
otherwise the bit line is pulled low. The bipolar ROM fabricated by Gunn et al. [17]
used 2 metal layers and had an access time of 150 ns. The ROM design by Ludwig [18],
using an SBD cell similar to [17], was reported to have a bit access time of 36 ns. This type
of cell can be implemented in a ROM design only in technologies that provide SBDs, such
as SiGe HBT, but the design has to be adapted to the low break-down voltage of the SiGe
HBT process. This memory cell design is not applicable to technologies which do not
provide SBDs, such as Vitesse VIP-2 InP DHBT.
Figure 2.2: Schottky diode memory cell array.
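The read behavior of such a diode array can be sketched behaviorally as follows. This is a logic-level illustration only, not a circuit model of [17] or [18]: the function name `read_bit_lines` and the example diode map are hypothetical, and a bit line is taken to be high when a diode connects it to a driven word line.

```python
def read_bit_lines(diode_map, word_lines):
    """Return the bit-line levels for the driven word lines.

    diode_map[row][col] is truthy where an SBD is present at the intersection
    of word line `row` and bit line `col`; word_lines[row] is truthy when that
    word line is driven.
    """
    columns = len(diode_map[0])
    bits = []
    for col in range(columns):
        # A bit line is held high if any driven word line has a diode on it.
        held_high = any(
            driven and row_cells[col]
            for driven, row_cells in zip(word_lines, diode_map)
        )
        bits.append(1 if held_high else 0)
    return bits


if __name__ == "__main__":
    example_map = [
        [1, 0, 1],  # word 0
        [0, 0, 1],  # word 1
    ]
    print(read_bit_lines(example_map, [1, 0]))  # read word 0
    print(read_bit_lines(example_map, [0, 1]))  # read word 1
```

Programming the ROM then amounts to choosing which intersections receive a diode.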
2.2.2.2 Transistor cell ROM model
A ROM designed with a transistor as the memory cell is described by Barret et
al. [19]. Figure 2.3 shows the memory array using bipolar transistors for the memory
cells. The row line is a horizontal base-diffused stripe. This stripe has n+ emitter blocks
diffused at regular intervals, which serve as memory cells. If a metal connection to an emitter
is present, the bit is high; otherwise it is low. The ROM includes a buffer-inverter
circuit with high fan-out that is capable of driving a high capacitance. The address decoder
and bit detector are also integrated with the memory array, but they are not explained by
the authors. The ROM access time was reported to be in the range of 15 to 40 ns [19].
Figure 2.3: Transistor memory cell array.
Comparison of the two memory cells in terms of speed can be difficult as technologies
progress, but the Schottky cell is easier to fabricate and takes less space than the
transistor cell. In general, the SBD has less capacitance, making it easier to charge and
discharge quickly. This contributes to high-speed performance. The transistor cell has
the benefit of requiring little drive current because it is driven from a base input. In the
case of the Schottky cell, the word driver must be able to drive more current, as the SBDs
are used for pulling the bits high in the word. A word driver design that can sink up to
25 mA of current has been reported by Ludwig [18].
Neither of these ROM memory cell designs is implemented in the ROM design
for this thesis. Rather, a differential ROM memory cell based on the work of Padoan et
al. [20], which provides a higher differential signal at the sense amplifier input, is implemented.
This ROM memory cell design is described in Chapter 3, Section 3.1.2.
2.2.3 Row Decoder using Bipolar Transistors
The decoder circuit takes an input address of the word to be decoded and it
selects a word line from the ROM memory core. The access time of ROM is dependent
on the decoder delay time. The ROM access time can be reduced by having a high
speed decoder. BiCMOS and Bipolar RAM designs have taken advantage of high-speed
bipolar decoder designs to get fast access time. In this section some high-speed bipolar
decoders used in BiCMOS and Bipolar RAM are described.
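The selection function that any of these decoders performs can be sketched at the logic level. The code below is a behavioral illustration only and does not model a particular circuit from the cited work; the function names are hypothetical. It shows the same one-hot selection written two ways, as an AND of address literals and, by De Morgan's law, as a NOR of the complemented literals.

```python
def decode_row_and(address, address_bits):
    """One-hot row select: each row is the AND of true or complemented address bits."""
    word_lines = []
    for row in range(1 << address_bits):
        literals = []
        for bit in range(address_bits):
            a = (address >> bit) & 1
            # Use the true input where the row index has a 1, the complement where it has a 0.
            literals.append(a if (row >> bit) & 1 else 1 - a)
        word_lines.append(int(all(literals)))
    return word_lines


def decode_row_nor(address, address_bits):
    """Same selection expressed as a NOR of the opposite literals."""
    word_lines = []
    for row in range(1 << address_bits):
        inverted = []
        for bit in range(address_bits):
            a = (address >> bit) & 1
            inverted.append(1 - a if (row >> bit) & 1 else a)
        # The row is selected only when every NOR input is low.
        word_lines.append(int(not any(inverted)))
    return word_lines


if __name__ == "__main__":
    print(decode_row_and(5, 4))  # 4-bit address selecting one of 16 rows
```

For a 4-bit address, exactly one of the 16 word lines is high for every input, which is the property a ROM row decoder must guarantee.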
2.2.3.1 Schottky Barrier diode decoder combined with a pull-up circuit
Homma et al. [21] designed SBD decoder with pull-up circuit. This decoder
provides fast discharge and pull-up of decoder lines. The decoder uses an active pull-
up circuit instead of the pull-up resistor as shown in the Figure 2.4. The pull-up is
performed by means of emitter follower therefore sufficiently high pull-up current can
flow transiently when the voltage of the decoder line switches from low to high. This
provides very fast rise time of the decoder lines. The SBDs have low capacitance due to
their small area which provides short delay time. The decoder design is not applicable
for the technologies that do not provide SBD.
2.2.3.2 ECL-based Decoder
The ECL-based decoder uses the ECL gate configuration to decode memory
addresses. The ECL-based decoder by Douseki et al. [22], used for a high-speed BiCMOS
SRAM, is composed of a predecoder, a level shifter, and a main decoder. The
predecoder selects the block of memory and does most of the decoding, while the main
decoder does the decoding within that block and buffers the signal to the MOSFET level.
Figure 2.4: Schottky Barrier Diode Decoder with Pull-up Circuit.
Figure 2.5: Predecoder design in ECL-based Decoder
The predecoder consists of ECL NOR gates, which take the address input, and emitter
followers, as shown in Figure 2.5. The main decoder is designed to drive MOSFET
gates, which is not required for a bipolar ROM. This decoder design has multiple stages,
which contribute to higher power consumption and greater delay.
The ECL-based decoder reported by Essl et al. [23] is very similar in design to
the predecoder stage in [22]. The design has two stages: the first stage is a level shifter
that converts the input to ECL levels, and the second stage is a NOR gate configuration
like that shown in Figure 2.5. The output of the NOR gate drives the word line in the ROM.
Both the SBD decoder and the ECL-based decoder consume more power than the
multi-emitter transistor AND gate decoder reported by Kawarada et al. [24, 25]. The
SBD decoder uses three more current sources per bit than the multi-emitter transistor
AND gate decoder. The SBD decoder is considered to be faster than the multi-emitter
transistor AND gate decoder because of its pull-up circuit and low-capacitance SBDs.
However, the SBD decoder is not applicable in this InP DHBT technology [11], which
does not provide SBD devices. The ECL-based decoder has one more current source per
row than the multi-emitter transistor AND gate decoder. As the number of rows increases,
the power dissipated in these current sources becomes significant. This thesis
work uses the multi-emitter transistor AND gate decoder, which is described in detail in
Chapter 3, Section 3.1.1.
2.2.4 Bipolar Sense Amplifier
The sense amplifier detects the value of a bit stored in the memory cell and
outputs its value. A fast sense amplifier design is essential for fast ROM access time. A
sense amplifier design reported by Miyanaga et al. [26] is shown in Figure 2.6. The
bases of the differential gate take inputs from the bit lines. In the next stage, the
current signal is converted to a voltage signal. The sense amplifier provides a
single-ended data output line. Since the sense amplifier has few stages, it is fast and
consumes little power. In this thesis, this sense amplifier design has been implemented
with modifications to provide fewer stages and a differential output to drive the DAC.
The details are described in Chapter 3, Section 3.1.3.
2.3 Bipolar ROM in DDS
There have been few reported designs of bipolar ROMs in DDS circuits.
Kwok et al. [3] reported a sine ROM for DDS in AlGaAs/GaAs HBT technology with
300 ps bit access time. The design of the bipolar ROM is described only briefly.
The ROM is based on current mode logic, modified as appropriate in different circuit
configurations to achieve high speed and low power. The memory cell consists of one
transistor per cell with the collector grounded. Bits are assigned by connecting the
emitter metal to the output bit line through a via. The description of the memory cell
in [3] resembles the ROM cell by Barrett et al. [19].
Figure 2.6: RAM Cell Sense Amplifier
2.4 ROM Compression Techniques
If a full sine wave is mapped in the ROM, the ROM size grows, which increases
area and power. For DDS designs operating at low clock frequencies, memory size is
not a big problem. However, for DDSs operating at clock frequencies above a couple of
GHz, memory size becomes crucial because a bigger ROM is slower. When using a bipolar
ROM, power and chip area also become crucial. ROM space can be optimized by exploiting
the symmetry of the sine wave and by using different compression techniques.
Tierney et al. [27] recognized the need for ROM compression in 1971. For
example, the quadrature sine-wave symmetry technique uses the symmetry of the
sine wave across its four quadrants. The phase-to-amplitude mapping of only one
quadrant of the sine wave is stored in the ROM LUT. The values of this single quadrant
are reused for the other three quadrants by changing the address sign and the amplitude
sign to obtain the full sine output. The memory size is reduced to one fourth using just
the quadrature sine-wave symmetry technique. The benefit of the reduced ROM size comes
without any loss in the SFDR.
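The quadrant reuse described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the circuit used in this work: the 12-bit phase and 8-bit amplitude widths are borrowed from the uncompressed entry of Table 2.1, the half-sample phase offset is an assumed convention that makes the table mirror cleanly, and the names QUARTER_ROM and sine_from_quarter_rom are hypothetical.

```python
import math

PHASE_BITS = 12                       # assumed phase width (2^12 entries uncompressed)
AMP_BITS = 8                          # assumed amplitude width
QUARTER = 1 << (PHASE_BITS - 2)       # entries stored for a single quadrant
SCALE = (1 << (AMP_BITS - 1)) - 1     # peak amplitude of a signed sample

# Only the first quadrant is stored; the half-sample offset is an assumption
# that lets the same table be read backwards for the mirrored quadrants.
QUARTER_ROM = [
    round(SCALE * math.sin(2 * math.pi * (i + 0.5) / (1 << PHASE_BITS)))
    for i in range(QUARTER)
]

def sine_from_quarter_rom(phase):
    """Rebuild a full-wave sample from the quarter-wave table."""
    quadrant = (phase >> (PHASE_BITS - 2)) & 0b11
    index = phase & (QUARTER - 1)
    if quadrant & 0b01:               # 2nd and 4th quadrants: complement the address
        index = QUARTER - 1 - index
    sample = QUARTER_ROM[index]
    return -sample if quadrant & 0b10 else sample   # 3rd and 4th: negate amplitude
```

The two quadrant bits play the role of the address-sign and amplitude-sign controls, so the stored table holds one fourth of the entries of the full-wave table.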
There are other ROM compression techniques which are used along with
sine-wave quadrature symmetry. ROM compression techniques generally require additional
circuit blocks such as adders, subtractors, and multipliers. A comparison of different
ROM compression techniques is shown in Table 2.1. The table lists the ROM size and
additional logic circuits required for different compression techniques to achieve
nearly equal SFDR. Inherently, smaller ROMs operate at higher frequencies than
bigger ROMs. This is another benefit of reducing the ROM size by using
compression techniques.
Table 2.1 shows that Hutchison's technique and Sunderland's, Nicholas' and
Bellaouar's architectures provide different compression ratios at no loss of SFDR. All
four of these techniques require two ROMs: a coarse ROM provides low-resolution samples,
and a fine ROM gives additional resolution by interpolating between the low-resolution
samples [2]. The ROM compression techniques trade off ROM size, compression ratio, and
the number of additional circuit blocks required. The linear addressing technique
reported by Ghosh et al. [30] requires a single ROM and incurs little loss in SFDR.
The nonlinear addressing technique reported by Ghosh et al. [30] requires two smaller
ROMs and provides a high compression ratio with little loss of SFDR. A modified
Bellaouar's architecture with improved compression and a considerable reduction in ROM
size for little loss in SFDR has been reported by El Said et al. [31].
The ROM compression techniques compared in Table 2.1 provide SFDR between
71 dB and 72.245 dB. Bellaouar's architecture [29], the nonlinear addressing technique
[30], and the modified Bellaouar's architecture [31] can be implemented in DHBT
technology to achieve a smaller ROM size with little loss of SFDR. It can be seen from
Table 2.1 that the ROMs required for these three compression techniques range in size
from 2^5 x 9 bits to 2^4 x 8 bits, which can be implemented in InP DHBT technology [32].
Method                               ROM Size                       Total Compression   SFDR     Comments
                                                                    Ratio               (dB)
Uncompressed Memory                  2^12 x 8 bits                  1:1                 72.245
1/4 Quadrature Sine-Wave Symmetry    2^10 x 8 bits                  4:1                 72.245   2 adders required
Hutchison's Technique [2]            2^7 x 6 bits, 2^10 x 2 bits    11.64:1             72.245   (3,4,3) phase segmentation, 2 adders required
Sunderland's Architecture [4]        2^7 x 8 bits, 2^6 x 3 bits     27:1                72.245   (3,4,3) phase segmentation, 2 adders required
Nicholas Architecture [28]           2^6 x 7 bits, 2^6 x 3 bits     51.2:1              72.245   (3,3,4) phase segmentation, 2 adders required
Bellaouar's Architecture [29]        2^5 x 8 bits, 2^5 x 5 bits     78.8:1              72.245   (5,5) phase segmentation, 2 adders required
Linear Addressing Technique [30]     2^6 x 9 bits                   56.89:1             71       5x7 multiplier, 2 adders required
Nonlinear Addressing Technique [30]  2^5 x 9 bits, 2^4 x 8 bits     78.8:1              71       6x7 multiplier, 2 adders required
El Said's Technique [31]             2^4 x 8 bits, 2^4 x 8 bits     128.8:1             72.1     (4,6) phase segmentation, 8x7 multiplier, 2 adders required
Table 2.1: The comparison of different ROM compression techniques for ROM LUT in DDS.
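The compression ratios in Table 2.1 follow from dividing the uncompressed ROM size by the total number of bits stored by each technique. The short Python check below is a sketch of that arithmetic for the entries whose ROM sizes are listed in the table; the dictionary keys and the function name compression_ratio are illustrative only.

```python
UNCOMPRESSED_BITS = (2 ** 12) * 8   # reference ROM size from Table 2.1

# (number of words, bits per word) for each ROM, as listed in Table 2.1
ROM_SIZES = {
    "Quadrature symmetry": [(2 ** 10, 8)],
    "Hutchison": [(2 ** 7, 6), (2 ** 10, 2)],
    "Sunderland": [(2 ** 7, 8), (2 ** 6, 3)],
    "Nicholas": [(2 ** 6, 7), (2 ** 6, 3)],
    "Bellaouar": [(2 ** 5, 8), (2 ** 5, 5)],
    "Linear addressing": [(2 ** 6, 9)],
    "Nonlinear addressing": [(2 ** 5, 9), (2 ** 4, 8)],
}

def compression_ratio(roms):
    """Uncompressed bits divided by the total bits of all ROMs in the scheme."""
    total_bits = sum(words * width for words, width in roms)
    return UNCOMPRESSED_BITS / total_bits

for name, roms in ROM_SIZES.items():
    print(f"{name}: {compression_ratio(roms):.2f}:1")
```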
Applying ROM compression techniques is not the first-phase goal of this research,
but the comparison gives an idea of the ROM size required if ROM compression
techniques were to be applied. There has been very little work on ROM compression
techniques implemented in bipolar ROMs compared to low-frequency DDSs in CMOS
technology. Kwok et al. [3] implemented the Nicholas architecture [33] compression
technique in their HBT sine ROM to achieve a compression ratio of 1:70. Some of the
reported bipolar ROM LUT work is described in Section 2.3, Bipolar ROM in DDS.
CHAPTER 3
16x6-Bit High Speed Bipolar ROM in InP
Chapter 2 gave an overview of previously reported bipolar ROM memory cells,
decoders and sense amplifier designs. This chapter describes a 16x6-bit ROM imple-
mented in InP DHBT technology. The ROM implementation draws from prior work
on bipolar ROMs to achieve a good tradeoff between power consumption and speed in
InP DHBT technology. Simulation and measured results from the ROM test circuit are
discussed.
3.1 Circuit Design
The ROM consists of a 16-row decoder, a 16x6-bit memory cell array, and an
array of sense amplifiers, as illustrated in Figure 3.1. The row decoder receives its input
from the 4 most significant bits (MSBs) of an 8-bit differential accumulator, based on
the accumulator reported by Turner et al. [9]. The bit width of the accumulator feeding
the decoder determines the word length and hence the resolution of the system. During
each clock cycle, the row decoder selects one word line and the bit values stored in the
ROM cells of that word are detected by the sense amplifiers. The sense amplifier outputs
drive a DAC, which converts several high-speed digital outputs from the ROM into a single
analog output. This facilitates testing by providing a single high-speed analog output
instead of several high-speed digital outputs.
Figure 3.1: Block diagram of 16x6-bit ROM.
3.1.1 Row Decoder
Minimizing the row decoder delay time is key to high-speed bit access. A single
row of the row decoder is shown in Figure 3.2. It contains a multi-emitter transistor
AND gate, with the base biased by feedback from the collector circuit, and a word drive
transistor (QW), based on [24, 25]. The multi-emitter transistor AND gates in the decoder
take single-ended inputs. The differential output from the accumulator provides both the
single-ended signal and its complement, eliminating the need for a separate inverter
circuit between the accumulator and the multi-emitter transistor AND gate. When all of
the bits connected to the multi-emitter transistor AND gate are high, the multi-emitter
transistor (QM) is near the cutoff region. With a minimal amount of current flowing
through the collector, the voltage drop across resistor R2 is small. If one or more bits
are low at the multi-emitter transistor AND gate, QM is turned on. The collector current
produces a potential drop of about 250 mV across resistor R2. This potential drop is
also seen at the word drive transistor QW. Resistors R1 and R2, which connect the base
and collector of QM, bias QM in the saturation region when it is on and close to the
cutoff region when it is off. When the output of a particular QW is low, the
corresponding row is deselected. A row is selected when all the input bits to its AND
gate are high, making the output of QW high. A simulated output of the 16-row decoder is
shown in Figure 3.3.
Figure 3.2: The schematic of a single row of the row decoder circuit.
The output of only one row is logic high at any time. This row decoder design balances
speed and power [24, 25], as it has few stages. Simulation showed the shortest
delay through the decoder to be about 7 ps. This design approach performs better than
the base-input-type ECL decoder by Essl et al. [23], which uses extra stages that
increase delay and power consumption.
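The logical behavior of the decoder, ignoring all voltage levels and delays, can be sketched as below. Each row's AND gate is modeled as receiving either the true or the complement of every address bit, and a row goes high only when all of its inputs are high. The function name decode_row and the mapping of address 0 to the first row are assumptions made for illustration.

```python
def decode_row(address, n_bits=4):
    """Return the word-line states (1 = selected) for an n-bit address."""
    rows = []
    for row in range(1 << n_bits):
        inputs = []
        for bit in range(n_bits):
            addr_bit = (address >> bit) & 1
            # Wire the true signal where the row number has a 1,
            # and the complement where it has a 0.
            inputs.append(addr_bit if (row >> bit) & 1 else 1 - addr_bit)
        # The multi-emitter AND gate output is high only if every input is high.
        rows.append(int(all(inputs)))
    return rows

# Every address should raise exactly one of the 16 word lines.
assert all(sum(decode_row(a)) == 1 for a in range(16))
```

Because both polarities of each accumulator bit are already available, the model needs no separate inversion step, which mirrors the reason given above for omitting an inverter stage.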
3.1.2 ROM Memory Cell Array
The ROM memory cell array with pull-up transistors and sense amplifier circuits
is illustrated in Figure 3.4. Single-ended inputs from the row decoder drive the row
selection lines of the ROM. Each column of the ROM has differential signal lines that
feed into the sense amplifiers. When a row line is not selected, both of these lines are
at the same potential and have the same amount of current flowing through them. If a
transistor is connected to the high line, it is a high bit (QH). If it is connected to the
low line, it is a low bit (QL). A diode (QD) is connected in series with the transistor QL
or QH to maintain the transistor bias in the safe operating regime.
Figure 3.3: The simulation showing only one row of decoder output is logic high while
others stay low.
A word line is selected when its potential is raised by the row decoder. The row
decoder selects only one row at a time. When the word line is selected, the transistors
for each bit of that word are turned on producing a voltage difference between the high
line and low line. This small voltage difference is detected by the sense amplifier which
outputs either a logic high or a logic low.
The ROM cell uses a transistor to assign both a high and low bit value to each
cell. This provides a larger signal at the sense amplifier input than obtained from a
ROM utilizing transistors only for logical high [19]. In this differential ROM memory
cell array design, there are on average twice as many transistors, since both high and low
bits have transistors. However, the overall layout area is not substantially increased [20],
because the transistors added for low bits are located in the space that would otherwise
have been left empty between the differential signal lines.
Figure 3.4: The schematic of the ROM memory cell array with sense amplifier.
3.1.3 Sense Amplifier
The sense amplifier used in the ROM circuit is a modification of the sense amplifier
reported by Miyanaga et al. [26]. The sense amplifier in [26] uses a current mode
logic (CML) gate to detect the current in the bit line. An emitter-input stage converts
the current signal to a voltage signal and is followed by an emitter follower. The sense
amplifier used in this ROM is a simple differential amplifier, as shown in Figure 3.5. The
high and low signal lines from each column of the ROM memory cell array are the inputs to
the sense amplifier. When a row is selected, the transistor connected to the high or low
line pulls that line down by about 250 mV. The currents flowing through the collectors of
the differential pair are exponentially related to the voltage difference between the high
and low lines. The voltage difference created by row selection is amplified as a
differential output. The 0.5 mA current difference between the high and low lines of the
ROM is generated by the difference in their voltages. The sense amplifier output is
buffered by an inverter and an emitter follower to provide fully differential outputs to
the DAC.
Figure 3.5: The schematic of the sense amplifier.
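The exponential relationship mentioned above can be illustrated with the textbook model of an ideal bipolar differential pair, in which the ratio of the two collector currents is exp(dV/V_T). The sketch below is not extracted from the thesis simulations: the thermal voltage assumes operation near 300 K, the tail current is normalized to 1, and device non-idealities are ignored.

```python
import math

V_T = 0.02585   # thermal voltage in volts near 300 K (assumed temperature)

def collector_currents(delta_v, tail_current=1.0):
    """Ideal differential-pair current split for an input difference delta_v in volts."""
    i_plus = tail_current / (1.0 + math.exp(-delta_v / V_T))
    return i_plus, tail_current - i_plus

# With the roughly 250 mV line difference quoted in the text, the ideal pair
# steers essentially all of the tail current to one side.
i_hi, i_lo = collector_currents(0.250)
print(f"fraction on the high side: {i_hi:.5f}")
```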
3.2 Test Circuit
A test circuit was designed and sent out for fabrication. A block diagram of the
ROM test circuit is shown in Figure 3.6. It consists of an 8-bit accumulator, a 16x6-bit
ROM, and a 6-bit DAC. The accumulator generates a predictable high speed input to the
ROM which can be controlled easily. The DAC is used to convert several high speed
digital outputs from the ROM into single analog output. This facilitates testing, as it is
not practical to measure several ROM outputs separately. The accumulator and ROM
use a -3.8 V power supply, and the DAC uses a -4.5 V power supply.
In the test circuit, the four MSBs from the accumulator are input to the ROM.
The bit pattern in the ROM is shown in Table 3.1. The five least significant bits (LSBs)
from the ROM are input to the DAC. The four LSBs of the ROM are programmed as a
decreasing counter, and bit 4 is programmed as 0s. The DAC uses this input to generate
a falling ramp. To further verify proper functioning of the ROM, the MSB of
the ROM is programmed with the bit pattern (1111110111111110). This bit pattern is
intended to act as a worst-case switching pattern: a long string of logic highs, then a
single logic low, followed by another long string of logic highs. The
high line of the MSB column has more bit transistors connected to it than the low line,
so it has a greater parasitic capacitance than the low line due to the additional
base-emitter capacitance from the high-bit transistors. This causes the MSB column to
retain a logic high value for a longer time than a logic low value. Also, at high clock
frequencies, the parasitic capacitance will not have enough time to discharge during the
short duration of the changed bit, causing the ROM to fail.
Figure 3.6: Functional schematic of the ROM test circuit.
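A purely behavioral model of the test circuit helps to see what the DAC and the MSB output should show for a given FCW. The sketch below rebuilds the ROM contents from the description above (worst-case pattern in bit 5, zeros in bit 4, a decreasing counter in bits 3:0) and steps an 8-bit accumulator. The function name run_test_circuit, the zero starting value of the accumulator, and the mapping of address 0 to row 1 are assumptions.

```python
ROM_BIT5 = "1111110111111110"   # worst-case MSB pattern, rows 1 to 16

# Bit 5 carries the worst-case pattern, bit 4 is 0, bits 3:0 count down from 15.
ROM_WORDS = [(int(b5) << 5) | (15 - row) for row, b5 in enumerate(ROM_BIT5)]

def run_test_circuit(fcw, cycles, acc_bits=8):
    """Return the ROM word addressed by the four accumulator MSBs on each clock."""
    acc = 0                                  # assumed reset state
    words = []
    for _ in range(cycles):
        row = acc >> (acc_bits - 4)          # four MSBs select one of 16 rows
        words.append(ROM_WORDS[row])
        acc = (acc + fcw) % (1 << acc_bits)  # accumulator wraps at 2^acc_bits
    return words

ramp = [word & 0xF for word in run_test_circuit(fcw=16, cycles=16)]
print(ramp)   # the four LSBs seen by the DAC, one value per clock
```

With FCW set to 16 the address advances every clock, and with FCW set to 1 each row is held for 16 clocks, which is the behavior the measurements in Section 3.4 are checked against.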
3.3 Simulation Results
Figure 3.7 shows the simulated differential output voltage of the MSB from the
ROM circuit as a function of the input clock frequency under four different
simulation conditions:
1. schematic of the row decoder, the ROM memory cell array, and the sense amplifier,
all without layout parasitics
2. the row decoder with layout parasitics, schematic of the ROM memory cell array
and sense amplifier without layout parasitics
3. schematic of the decoder without layout parasitics, the ROM memory cell array
and the sense amplifier with layout parasitics
4. the row decoder, the ROM memory cell array, and the sense amplifier with layout
parasitics
Row #   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
1       1       0       1       1       1       1
2       1       0       1       1       1       0
3       1       0       1       1       0       1
4       1       0       1       1       0       0
5       1       0       1       0       1       1
6       1       0       1       0       1       0
7       0       0       1       0       0       1
8       1       0       1       0       0       0
9       1       0       0       1       1       1
10      1       0       0       1       1       0
11      1       0       0       1       0       1
12      1       0       0       1       0       0
13      1       0       0       0       1       1
14      1       0       0       0       1       0
15      1       0       0       0       0       1
16      0       0       0       0       0       0
Table 3.1: The bit pattern in the ROM.
Figure 3.7: Simulated differential output voltage from the ROM under four different
simulation settings: (1) schematic of the row decoder, the ROM memory cell array, and
the sense amplifier without layout parasitics; (2) the row decoder with layout parasitics,
schematic of the ROM memory cell array and sense amplifier without layout parasitics;
(3) schematic of the decoder without layout parasitics, the ROM memory cell array and
the sense amplifier with layout parasitics; and (4) the row decoder, the ROM memory cell
array, and the sense amplifier with layout parasitics.
The results indicate that the maximum operating clock frequency of the ROM is
limited by the ROM memory cell array and not by the row decoder or the sense amplifier.
The simulation of the row decoder, the ROM memory cell array, and the sense amplifier,
all with layout parasitics, shows a differential output from the ROM of about 160 mV at
a clock frequency of 36 GHz. A differential signal of 150 mV from the ROM is sufficient
to drive the DAC after being buffered. The performance of the ROM depends on the
decoding time, which is not the same for every row. The worst-case delay occurs when 3
or 4 of the input bits to the row decoder switch simultaneously. Row 1 and row 16 have
the longest
delay since all of the address bits at the row decoder inputs must switch when going from
row 16 to row 1, or vice versa. The skew in row 16 of the MSB is simulated to be
10 ps relative to the clock signal when the ROM test circuit is clocked at 36 GHz with
the frequency control word (FCW) set to 16. This is due to the high parasitic capacitance
on the high line and the leakage current from the high-bit transistors of unselected rows
in the MSB column. The high line in the MSB column has more leakage current than
the low line. When the sense amplifier senses a high bit, the differential voltage at
its input is higher than when sensing a low bit. This voltage offset causes fast
high-bit sensing and slow low-bit sensing. As illustrated in Figure 3.8, when decoding
row 16 the high line is not discharged completely to the logic low potential.
When the row decoder switches and selects row 1, the high line returns to the logic high
potential quickly, generating the shortest delay. The skew in row 16 and the MSB is
also worsened by parasitics on the long input interconnections from the accumulator
and by the fact that the MSB column is the farthest column from the row decoder output.
The parasitic-extracted simulation results predicted a bit access time of 15.43 ps for the
ROM. The ROM test circuit simulation showed that the accumulator and the ROM together
dissipate 6.76 W, of which 1.49 W is dissipated by the ROM circuit alone.
3.4 Measurement Results
The microphotograph of the ROM test circuit is shown in the Figure 3.9. The
ROM test circuit is 2700 µm by 1450 µm and contains 2242 transistors. The circuits
were tested on wafer using a probe station and high-frequency probes. The output wave-
forms were measured with a high-frequency sampling oscilloscope.
Figure 3.8: The skew in row 16 of the MSB when the ROM test circuit is clocked at
36 GHz and FCW set to 16.
Figure 3.9: Microphotograph of ROM test circuit. The chip is 2700 µm by 1450 µm
and has 2242 transistors.
Figure 3.10: The DAC output data plot of the ROM test circuit clocked at 36 GHz and
FCW set to 1. With this FCW the output changes every 16 clock cycles.
Figure 3.10 shows the measured output from the DAC when the accumulator is
clocked at 36 GHz and the accumulator FCW is set to 1. When the FCW is set to 1,
the output of the DAC changes every 16 clock cycles because the four MSBs of the 8-bit
accumulator are used for the row decoder. The measured result shows a falling ramp as
expected, with each step equal to 443.2 ps. Figure 3.11 shows the measured result for
FCW set to 16 with a 36 GHz clock. In this state, the ROM word lines change every
clock cycle, hence the output of the DAC changes every clock cycle. A glitch can be
seen in the 8th step where the ROM data switches from 1000 to 0111. This is due to
the DAC not functioning properly when all four inputs switch at high frequency. This
problem in the DAC has been addressed by Tierney et al. [27] as DAC noise. It can
be seen that the DAC also has a problem when the inputs switch from 0000 to 1111.
Smaller, more subtle glitches can be seen when the inputs switch from 0100 to 0011 and
from 1100 to 1011. However, the DAC shows the proper output after the glitch, so it
is seeing the correct inputs from the ROM at a 36 GHz clock frequency.
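The expected step length follows directly from the clock period and the number of clocks for which the decoder address stays constant. The helper below is a small sketch of that arithmetic; the name dac_step_duration is hypothetical, and the formula assumes the FCW is a power of two no larger than 16, so that every row is held for a whole number of clocks.

```python
def dac_step_duration(f_clk_hz, fcw, acc_bits=8, addr_bits=4):
    """Time in seconds for which the decoder address, and so the DAC input, is held."""
    # The low-order accumulator bits must overflow before the address changes.
    cycles_per_row = (1 << (acc_bits - addr_bits)) / fcw
    return cycles_per_row / f_clk_hz

print(f"FCW = 1:  {dac_step_duration(36e9, 1) * 1e12:.1f} ps per step")
print(f"FCW = 16: {dac_step_duration(36e9, 16) * 1e12:.1f} ps per step")
```

For a 36 GHz clock and FCW set to 1, this gives a step of about 444 ps, close to the measured 443.2 ps.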
Figure 3.12 shows the measured single-ended output of the MSB of the ROM
when the circuit is clocked at 36 GHz and the accumulator FCW is set to 1. Figure 3.13
shows the measured single-ended output of the MSB from the ROM when FCW is set to
16. The output bits match the programmed bits from the Bit 5 of the ROM. This shows
that the worst case bit pattern in the ROM works at 36 GHz.
Figure 3.11: The DAC output data plot of the ROM test circuit clocked at 36 GHz and
FCW set to 16. With this FCW the output changes every clock cycle. The glitch at the
center of the waveform is caused by the DAC switching from 1000 to 0111.
Figure 3.12: The output of MSB of ROM programmed with bit pattern
1111110111111110 when the accumulator is clocked at 36 GHz and the accumulator
FCW is set to 1. With this FCW the output is low for 16 clock cycles at a time.
Figure 3.13: The output of MSB of ROM programmed with bit pattern
1111110111111110 when the accumulator is clocked at 36 GHz and the accumulator
FCW is set to 16. With this FCW the output is low for a single clock cycle at a time.
The entire test circuit dissipates 7.22 W of power, with 2.12 W from the DAC
and 5.1 W from the combination of the accumulator and the ROM. The ROM alone
consumes 1.13 W of power. The measured power dissipation of the ROM is 0.36 W, or
24.2%, less than the simulated power dissipation of 1.49 W.
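The power figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the values stated in this chapter (the simulated ROM power of 1.49 W comes from Section 3.3); the variable names are illustrative.

```python
MEASURED_W = {"dac": 2.12, "accumulator_plus_rom": 5.10, "rom_alone": 1.13}
SIMULATED_ROM_W = 1.49   # simulated ROM power from Section 3.3

total_w = MEASURED_W["dac"] + MEASURED_W["accumulator_plus_rom"]
rom_saving_w = SIMULATED_ROM_W - MEASURED_W["rom_alone"]
rom_saving_pct = 100.0 * rom_saving_w / SIMULATED_ROM_W

print(f"total: {total_w:.2f} W")
print(f"ROM measured below simulated by {rom_saving_w:.2f} W ({rom_saving_pct:.1f}%)")
```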
CHAPTER 4
32x6-bit ROM designs
4.1 ROM Designs
For a high-speed DDS, the ROM needs to be fast, large in terms of memory
size, and low in power consumption. These three goals conflict and must be weighed
against each other when making design decisions. The ROM design described in Chapter 3
has a high operating frequency but is relatively small for DDS applications. This chapter
describes two design approaches for increasing the size of the ROM. In the first design
approach, the bit size is increased by adding rows to the ROM memory core. Low power
consumption is the goal of this design. In the second design approach,
the bit size is increased by integrating two high-speed 16x6-bit ROMs using a high-speed
2:1 multiplexer (MUX) and a D-latch. High-speed operation is the primary goal of the
second design approach.
4.1.1 ROM with 32x6-bit Memory core
The memory size of the ROM can be increased by increasing the size of the
memory core, that is, by increasing the number of rows or columns in the memory.
Figure 4.1 illustrates a 32x6-bit ROM design where the
bit size of the ROM is increased by doubling the number of rows in the memory core
of the 16x6-bit ROM. A new 32-row decoder is designed which takes 5 input bits from
the accumulator. A single row of the row decoder consists of a 5-input multi-emitter AND
gate and a word drive transistor. The inputs from the accumulator have to drive twice as
many gates as in the 16-row decoder, so twice as many inverter-buffers are used for
driving 32 rows. The sense amplifier design remains unchanged, as the number of columns
is the same as in the 16x6-bit ROM.
Figure 4.1: Block diagram of 32x6-bit ROM.
All the design aspects of this ROM are the same as those of the 16x6-bit ROM. It
has a larger memory core and row decoder than the design in Chapter 3. No new
circuit is added for the proper functioning of the ROM. The larger memory
core and decoder increase the area of the ROM by about a factor of two. The 5-input
multi-emitter transistor AND gate is bigger than the 4-input multi-emitter transistor
AND gate and hence has higher capacitance. The 32-row decoder is twice as big as the
16-row decoder, which means more parasitic capacitance. The accumulator has to drive
twice as much capacitance in the 32-row decoder, which makes it slower than the 16-row
decoder. The increase in memory core size adds parasitic capacitance to both the low and
high lines of the ROM, slowing down the sense amplifier. The 32x6-bit ROM is slower than
the 16x6-bit ROM not just because of the added parasitics in both the decoder and the
memory core but also because of leakage current through unselected bit transistors. The
bit transistors in the unselected rows of the memory core are not in cutoff, and a small
current of about 20 µA is always flowing through them. In the 32x6-bit ROM there are more
unselected rows, adding up to more leakage current than in the 16x6-bit ROM. In a
column where the number of logic highs is much greater than the number of logic lows, or
vice versa, the currents in the high line and low line are uneven. This uneven
current in the lines at the sense amplifier slows down the ROM.
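The imbalance described above can be estimated by counting the unselected transistors attached to each line of a column. The sketch below assumes that the quoted 20 µA applies to each unselected bit transistor, which is an interpretation of the text rather than a stated figure, and it uses the worst-case MSB pattern of Chapter 3 as example data. The 32-row column formed by repeating that pattern is hypothetical.

```python
LEAKAGE_A = 20e-6   # assumed leakage per unselected bit transistor, in amperes

def line_leakage(column_bits, selected_row):
    """Return (high_line, low_line) leakage in amperes for one ROM column."""
    unselected = [bit for row, bit in enumerate(column_bits) if row != selected_row]
    highs = sum(unselected)                 # transistors tied to the high line
    lows = len(unselected) - highs          # transistors tied to the low line
    return highs * LEAKAGE_A, lows * LEAKAGE_A

msb_column = [int(c) for c in "1111110111111110"]
print(line_leakage(msb_column, selected_row=15))        # 16-row column
print(line_leakage(msb_column * 2, selected_row=31))    # hypothetical 32-row column
```

Doubling the number of rows doubles the leakage on the heavily loaded line in this example, which illustrates why the larger memory core senses more slowly.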
4.1.2 32x6-bit ROM with 2:1 MUX
The functional schematic in Figure 4.2 illustrates the second design approach
where two high-speed 16x6-bit ROMs, ROM A and ROM B, are integrated together
using 2:1 MUX and D-latch to form a 32x6-bit ROM. MUX designs in InP DHBT
technology have been reported to operate at 80 Gbit/s [34]. This ROM leverages the fast
ECL MUX shown in Figure 4.3. There are no design changes within the 16x6-bit ROM.
A 2:1 MUX takes input from Bit4 of the accumulator for selecting bits from ROM A or
ROM B for output.
Figure 4.2: Functional schematic diagram of 32x6-bit ROM with 2:1 MUX. The ROM
is designed by integrating a 2:1 MUX, D-latch and 2 16x6-bit ROMs.
The ECL 2:1 MUX takes two sets of differential data inputs, A and B, with 6 inputs
each from the two 16x6-bit ROMs, and a differential select input, C, from a single bit of
the accumulator. The bit from the accumulator is delayed by a series of ECL
inverter-buffers to match the access time of the 16x6-bit ROM. The input from the
accumulator determines which ROM, A or B, is seen at the output by controlling the
current through transistors QC1 and QC2. An emitter follower is connected to the
differential pair output so that current is not drawn from the collectors of the
differential pair to drive the following gates.
If the values from the 16x6-bit ROMs are switching, or differ from each other,
at the time when the MUX switches its output, a glitch occurs. Figure 4.4
illustrates the glitch that occurs when the MUX switches between inputs from ROMs whose
logic levels are changing as well. A D-latch, shown in Figure 4.5, is used to remove
the glitch from the output of the MUX. The D-latch takes the outputs VP and VN from the
2:1 MUX as inputs D and D̄, along with a clock signal. The output values of the latch,
Q and Q̄, are held until the next rising edge of the clock. The clock signal to the
D-latch is delayed by the access time of the 16x6-bit ROM and then by another half clock
cycle using an inverter. This makes the D-latch trigger on the second half of the clock
cycle, that is, on the falling edge of the real clock. Delayed D-latch triggering
eliminates the glitch in the MUX output that occurs in the first half of the clock cycle.
Figure 4.6 shows the glitch-free buffered output from the D-latch when the clock signals
are appropriately delayed. The output of the D-latch is delayed by a half clock cycle
from its input.
Figure 4.3: The schematic diagram of 2:1 multiplexer.
This design approach requires additional digital logic gates such as 2:1 MUX and
D-latch. These additional circuits contribute to higher power consumption and greater
layout area. The results from both ROMs are compared in the Simulation Results section
below.
4.2 Simulation Results
Firstly, simulation results from the operation of the ROM with 2:1 MUX is dis-
cussed. Table 4.1 shows Bit 4 of the accumulator, the programmed Bit 5 of ROM A,
Bit 5 of ROM B and the latched Bit 5 of the 32x6-bit ROM with 2:1 MUX. ROM A
and ROM B are the two 16x6-bit ROMs in 32x6 bit ROM with 2:1 MUX as shown in
Figure 4.2. When accumulator Bit 4 is logic high MUX selects output from ROM A and
from ROM B when it is logic low, which is illustrated in Table 4.1. Figure 4.7 shows
Figure 4.4: The MUX output with glitch that occurs when MUX is switching ROM inputs. (Axes: output voltage in mV versus time in ns.)
Figure 4.5: The schematic diagram of D-latch.
Figure 4.6: Buffered glitch-free output from D-latch. (Axes: output voltage in mV versus time in ns.)
Bit 4 from the accumulator, Bit 5 from ROM A and ROM B, and the output from the
ROM with 2:1 MUX simulated at 40 GHz. Figure 4.7 shows that
the ROM output bit matches the expected output bit in Table 4.1. It also
shows that the 32x6-bit ROM output is delayed by about one clock cycle
relative to the outputs of ROM A and ROM B. This delay is due to the half-cycle-delayed
clock input to the D-latch, which removes the glitch caused by the MUX, and to the buffer
after the D-latch output.
Both design approaches taken to increase the bit size of the ROM have their
own advantages in terms of power and speed. Figure 4.8 compares the
simulated differential output voltage of the MSB, based on schematic simulation, for the
32x6-bit ROM with 2:1 MUX, the ROM with 32x6-bit memory core and the
16x6-bit ROM, plotted against clock frequency. The outputs of the ROMs are not
loaded. The comparison shows that the ROM with 2:1 MUX operates at a
higher clock frequency and has a higher differential output than the ROM with the
32x6-bit memory core. It also shows that the speed of the ROM with the 2:1
MUX is comparable to that of the 16x6-bit ROM. The output of the ROM with 2:1
Row #   Accumulator Bit 4   ROM A Bit 5   ROM B Bit 5   Output
1       0                   1             0             0
2       1                   0             0             0
3       0                   0             0             0
4       1                   0             0             0
5       0                   0             0             0
6       1                   0             0             0
7       0                   1             1             1
8       1                   1             0             1
9       0                   0             0             0
10      1                   1             0             1
11      0                   0             0             0
12      1                   1             0             1
13      0                   0             0             0
14      1                   1             0             1
15      0                   0             0             0
16      1                   1             1             1
Table 4.1: The bit pattern in Bit 4 of accumulator, Bit 5 of ROM A and Bit 5 of ROM B. Bit 4 from the accumulator switches the outputs in the 2:1 MUX.
MUX has a higher output voltage than the output of the 16x6-bit ROM because the 2:1
MUX takes the buffered outputs of the 16x6-bit ROMs, and the output of the 2:1 MUX
is latched and buffered again. The 32x6-bit ROM with 2:1 MUX has a higher operating
clock frequency, but it suffers from a longer access time: the outputs of the two
16x6-bit ROMs are delayed by the series of 2:1 MUX, D-latch
and buffers. The access time depends on the ROM layout and its parasitics, so
it is not extracted from the schematic-based simulation. The ROM with 2:1 MUX
does not function correctly once the differential output of the 16x6-bit ROM falls
below 50 mV. The ROM with 32x6-bit memory core is slower because of the larger
parasitic capacitance in its decoder and memory core. The access time of the ROM with
2:1 MUX is nevertheless longer than that of the ROM with 32x6-bit memory core, despite its
higher operating frequency, because of the propagation delay through the buffers, 2:1 MUX
and D-latch. Taking a 300 mV differential output as the minimum required output level
after the buffer, the ROM with 32x6-bit memory core operates at clock frequencies up to
20 GHz and the ROM with 2:1 MUX operates at clock frequencies up to 46 GHz. The ROM
with 2:1 MUX does not give the correct output bit at frequencies higher than 46 GHz.
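Returning to the selection rule illustrated in Table 4.1, the expected output column can be checked with a small behavioral model. The sketch below is illustrative only: the rows are copied from Table 4.1, while the function name mux_2to1 and the list name TABLE_4_1 are hypothetical. It models only the logical selection (ROM A when accumulator Bit 4 is high, ROM B when it is low), not the timing or the latch.

```python
# Rows of Table 4.1:
# (accumulator Bit 4, ROM A Bit 5, ROM B Bit 5, latched output)
TABLE_4_1 = [
    (0, 1, 0, 0), (1, 0, 0, 0), (0, 0, 0, 0), (1, 0, 0, 0),
    (0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 1, 1), (1, 1, 0, 1),
    (0, 0, 0, 0), (1, 1, 0, 1), (0, 0, 0, 0), (1, 1, 0, 1),
    (0, 0, 0, 0), (1, 1, 0, 1), (0, 0, 0, 0), (1, 1, 1, 1),
]


def mux_2to1(select: int, rom_a: int, rom_b: int) -> int:
    """Return ROM A's bit when select is high, otherwise ROM B's bit."""
    return rom_a if select == 1 else rom_b


# Collect the (1-based) row numbers where the model disagrees with the table.
mismatches = [
    row_number
    for row_number, (acc4, rom_a5, rom_b5, output) in enumerate(TABLE_4_1, start=1)
    if mux_2to1(acc4, rom_a5, rom_b5) != output
]
print("Rows that disagree with the selection rule:", mismatches)  # prints []
```

An empty list means every row of the table is consistent with the stated selection rule.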
Both ROM designs use a power supply of -3.8 V. The schematic simulation
showed that the ROM with 32x6-bit memory core draws 0.515 A of current and
consumes 1.96 W. Results in Chapter 3 showed that the simulated power of the 16x6-bit
ROM overestimates the measured power by 24.2%. The 1.96 W
consumed by the ROM with 32x6-bit memory core is comparable to the 1.49 W consumed by
the 16x6-bit ROM. The increase in power is primarily due to the increase in size of the row
decoder and buffers. To further analyze how the power dissipation changes with the
size of the memory core, a ROM with a 16x8-bit memory core and a ROM with a 32x9-bit
memory core were designed and simulated. The 16x8-bit ROM shows the effect
on power consumption of an increase in the number of memory core columns, and the
32x9-bit ROM shows the effect of an increase in the number
of both memory core rows and columns. Table 4.2 shows the current drawn by the row
decoder and memory core, and the total power consumed by the ROMs with different
memory core sizes. The table shows that the memory core currents
of the 16x6-bit and 32x6-bit ROMs are equal, but the row decoder current of the 32x6-bit
ROM is almost twice as large. Table 4.2 also shows that when the number of memory core
columns is increased with the number of rows held constant, the memory core current
increases while the row decoder current remains nearly the same. This indicates that
increasing the number of rows in the memory core does not increase the power consumed in
the memory core, which depends on the number of bit columns in the memory
core. The row decoder power consumption increases when the number of rows in the row
decoder is increased.
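The power figures quoted above can be related to the supply voltage and the currents in Table 4.2 with a short calculation. This is a sketch under stated assumptions, not the procedure used in the thesis: power is modeled simply as supply voltage magnitude times total current, the currents are the rounded values from Table 4.2, and the 1.13 W measured power of the 16x6-bit ROM is the value reported elsewhere in this thesis. The names below are hypothetical. Note that the 24.2% figure is reproduced when the difference is taken relative to the simulated value.

```python
SUPPLY_V = 3.8  # magnitude of the -3.8 V supply

# Table 4.2: core size -> (row decoder current in A,
#                          memory core current in A,
#                          reported total power in W)
TABLE_4_2 = {
    "16x6": (0.17, 0.22, 1.49),
    "16x8": (0.15, 0.29, 1.65),
    "32x6": (0.29, 0.22, 1.96),
    "32x9": (0.29, 0.32, 2.33),
}


def supply_power_w(total_current_a: float, supply_v: float = SUPPLY_V) -> float:
    """Power drawn from the supply, modeled as V * I."""
    return supply_v * total_current_a


for size, (decoder_a, core_a, reported_w) in TABLE_4_2.items():
    estimate_w = supply_power_w(decoder_a + core_a)
    print(f"{size}: {estimate_w:.2f} W from rounded currents, "
          f"{reported_w:.2f} W reported")

# Simulated versus measured power of the 16x6-bit ROM,
# with the difference expressed relative to the simulated value.
simulated_w, measured_w = 1.49, 1.13
overestimate = (simulated_w - measured_w) / simulated_w
print(f"Overestimate relative to simulation: {overestimate:.1%}")  # 24.2%
```

The small differences between the estimated and reported powers are consistent with the table currents being rounded to two decimal places.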
The 32x6-bit ROM with 2:1 MUX draws 1.86 A of current and
consumes 7.068 W of power. The power consumed by the 32x6-bit ROM with 2:1
MUX is more than 3 times that of the ROM with 32x6-bit memory core. The high power
Figure 4.7: The buffered and latched MUX output of Bit 5 from simulation of 32x6-bit ROM with 2:1 MUX. (Panels: ACC<4>, ROMA<5>, ROMB<5> and Output, each plotted in V against time in ns.)
Figure 4.8: Simulated differential output voltage (mV) of ROMs versus clock frequency (GHz): (a) schematic of 32x6-bit ROM with 2:1 MUX and D-latch; (b) schematic of ROM with 32x6-bit memory core; and (c) schematic of 16x6-bit ROM. The outputs from the ROMs are not driving anything. The 32x6-bit ROM with 2:1 MUX has a higher output voltage than the output from the 16x6-bit ROM because the 2:1 MUX takes the buffered output from 16x6-bit ROMs and the output from the 2:1 MUX is latched and buffered again.

ROM Memory Core   Row Decoder   Memory Core   Total         Total Power
Size (bit)        Current (A)   Current (A)   Current (A)   (W)
16x6              0.17          0.22          0.39          1.49
16x8              0.15          0.29          0.44          1.65
32x6              0.29          0.22          0.51          1.96
32x9              0.29          0.32          0.61          2.33
Table 4.2: The power dissipated by ROMs with different memory core sizes.

ROM Design                    Power (W)   Operation Clock Frequency (GHz)
16x6-bit ROM                  1.49        36
ROM with 32x6-bit core        1.957       20
32x6-bit ROM with 2:1 MUX     7.068       46
Table 4.3: Power and speed comparison of three different ROMs.
dissipation in the ROM with the 2:1 MUX is due to having two fully functional 16x6-bit
ROMs along with additional circuits such as 2:1 MUXs, D-latches and buffers. The circuit
block of six 2:1 MUXs, six D-latches and delay buffers for the 6 bits dissipates most of the
power. Low-power MUX and D-latch techniques have been reported by Razavi et al. [35]. These
designs modify the stacked differential pairs of conventional ECL to operate from a
low supply voltage. A low supply voltage reduces the power consumption, but this comes at
the cost of requiring multiple power supplies, because the ROM and decoder need a higher
voltage rail. The MUX and latch designs reported by Razavi et al. [35] operate from supply
voltages as low as 1.5 V.
Increasing the ROM bit size by increasing the number of rows in the memory
core slows down the ROM, but this approach is better if low power and short access
time are the prime goals rather than very high frequency operation. If very high
frequency operation is the design goal, and not low power, then the ROM with 2:1 MUX is
the better approach. Table 4.3 shows the speed and power of the 16x6-bit ROM, the
ROM with 32x6-bit memory core and the 32x6-bit ROM with 2:1 MUX.
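The trade-off summarized in Table 4.3 can be made concrete with a few lines of arithmetic. The sketch below is illustrative: the numbers are taken from Table 4.3, but the overhead estimate assumes that each of the two 16x6-bit ROMs inside the MUX-based design draws the same 1.49 W as the standalone 16x6-bit ROM, which the thesis does not state explicitly. All names are hypothetical.

```python
# Table 4.3: design -> (power in W, operating clock frequency in GHz)
TABLE_4_3 = {
    "16x6-bit ROM": (1.49, 36),
    "ROM with 32x6-bit core": (1.957, 20),
    "32x6-bit ROM with 2:1 MUX": (7.068, 46),
}

mux_power_w, _ = TABLE_4_3["32x6-bit ROM with 2:1 MUX"]
core_power_w, _ = TABLE_4_3["ROM with 32x6-bit core"]
small_rom_power_w, _ = TABLE_4_3["16x6-bit ROM"]

# Power of the MUX-based design relative to the single-core design.
power_ratio = mux_power_w / core_power_w
print(f"Power ratio (MUX design / 32x6-bit core): {power_ratio:.2f}")

# Assumption: both internal 16x6-bit ROMs draw the standalone 1.49 W each,
# so the remainder is attributed to the MUXs, D-latches and buffers.
overhead_w = mux_power_w - 2 * small_rom_power_w
overhead_fraction = overhead_w / mux_power_w
print(f"Estimated MUX/latch/buffer overhead: {overhead_w:.3f} W "
      f"({overhead_fraction:.0%} of total)")

# Pick a design by goal: lowest power or highest clock frequency
# among the two 32x6-bit options.
candidates = {k: v for k, v in TABLE_4_3.items() if k != "16x6-bit ROM"}
lowest_power = min(candidates, key=lambda name: candidates[name][0])
highest_speed = max(candidates, key=lambda name: candidates[name][1])
print("Lowest power:", lowest_power)
print("Highest clock frequency:", highest_speed)
```

Under that assumption, the estimated overhead is more than half of the total, which agrees with the statement that the MUX, D-latch and buffer block dissipates most of the power.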
CHAPTER 5
Conclusion
The goal of this research was to explore designs of bipolar ROMs in InP DHBT
technology suitable for use as a ROM LUT and capable of operating at 30-40 GHz. A
ROM suitable for a DDS LUT was designed, fabricated and tested at clock frequencies up to
36 GHz. The simulation results were compared with the test results. Further research was
done to increase the ROM bit size without sacrificing speed or power. The SFDR of a DDS
system is associated with the size of the ROM LUT. Based on the need for low power and high
speed, two design approaches were explored: one for low power and the other
for high frequency clock operation. The simulation results from both of these
designs were discussed. Realizing these two designs in hardware and
verifying the simulation results against test results remains future work.
5.1 Summary of Accomplishment
A 16x6-bit ROM test circuit in InP DHBT technology operating at a maximum
clock frequency of 36 GHz was demonstrated. A ROM bit access time of 15.43 ps was
estimated from simulation. The entire test circuit consumes 7.22 W,
while the ROM circuit alone dissipates 1.13 W. The ROM test circuit has 2242 transistors
in an area of 2700 µm by 1450 µm. Furthermore, two different 32x6-bit ROMs were
designed to increase the bit size. The first design increases the bit size by increasing
the number of rows in the memory core. Schematic simulation showed that this ROM operates
at 20 GHz and consumes 1.96 W of power. The second design combines two
smaller high-speed ROMs using 2:1 MUXs. Schematic simulation
showed that this ROM operates at up to 46 GHz and consumes 7.07 W of power.
5.2 Recommendation for Future Work
Power dissipation is typically higher in bipolar circuits than in their CMOS counterparts.
This makes bipolar circuits less attractive and limits their application. Some low-voltage
techniques for high-speed digital bipolar circuits have been proposed by Razavi
et al. [35]. These techniques modify the stacked differential pairs of conventional
ECL to operate from a low supply voltage and so reduce the power. Digital logic blocks such
as the MUX and D-latch can be designed using these techniques to save power, but this comes
at the cost of multiple supply voltage levels and slower speed. Additional research
is required to improve the ROM design with regard to both speed and power.
The first recommendation for future work is to duplicate and extend this work
in SiGe technology. Despite its lower BVCEO, SiGe technology has some advantages that
this project can benefit from. The first advantage is that the SiGe process provides
Schottky barrier diodes (SBDs), so the memory cell and decoder in the ROM design could
benefit from smaller and faster SBDs. The SBD memory cell and decoder design described in
Chapter 2, Section 2.2 can be implemented in a SiGe process to achieve better performance.
Another advantage of the SiGe process is that it provides PMOS and NMOS devices, because it
is built on a CMOS process. The ROM design can benefit from high-impedance NMOS current
sources to reduce the power consumption. The SiGe process also has thin copper interconnect,
which provides lower capacitance compared to the thick aluminium interconnect in InP. Lower
parasitic capacitance in the interconnect can help improve the performance of the ROM. In
addition, the SiGe process is more easily manufacturable than the InP process, with better
yield and lower cost.
The next step, once a high-speed ROM has been designed, is to implement a ROM compression
technique in the LUT of the DDS system. ROM compression techniques
such as Bellaouar’s architecture [29], the nonlinear addressing technique [30] and the
modified Bellaouar’s architecture [31] can be implemented in the ROM designed in this
thesis. The additional circuit blocks required for ROM compression, such as multipliers and
adders, need to be explored. Finally, future work comparing the SFDR,
speed, power and size of a DDS with a ROM LUT against a ROM-less DDS architecture in InP
DHBT technology is recommended.
REFERENCES
[1] A. Yamagishi, M. Ishikawa, T. Tsukahara, and S. Date, “A 2-V, 2-GHz low-power direct digital frequency synthesizer chip-set for wireless communication,” IEEE Journal of Solid-State Circuits, vol. 33, no. 2, pp. 210–217, February 1998.
[2] J. Vankka, “Methods of mapping from phase to sine amplitude in direct digital synthesis,” IEEE Trans. Ultrasonics, Ferroelectrics, and Frequency Control, vol. 44, pp. 526–534, March 1997.
[3] C. Kwok, N. Sheng, P. Asbeck, G. Kent, and S. Chen, “Ultra-high speed HBT sine ROM for direct digital synthesis application,” Aerospace and Electronics Conference, Proceedings of the IEEE 1994 National, pp. 461–467, May 1994.
[4] D. A. Sunderland, R. A. Strauch, S. S. Wharfield, H. T. Peterson, and C. R. Cole, “CMOS/SOS frequency synthesizer LSI circuits for spread spectrum communications,” IEEE Journal of Solid-State Circuits, vol. 14, pp. 497–506, August 1984.
[5] C. Chang, J. Wang, and C. Yang, “Low-Power and High-Speed ROM modules for ASIC applications,” IEEE Journal of Solid-State Circuits, vol. 36, no. 10, pp. 1516–1523, October 2001.
[6] O. Takahashi, N. Aoki, J. Silberman, and S. Dhong, “A 1-GHz logic circuit family with sense amplifiers,” IEEE Journal of Solid-State Circuits, vol. 34, no. 5, pp. 616–622, May 1999.
[7] K. R. Elliot, “Direct digital synthesis for enabling next generation RF systems,” CSIC Digest, pp. 125–128, November 2005.
[8] S. E. Turner and D. E. Kotecki, “Direct digital synthesizer with ROM-less architecture at 13 GHz clock frequency in InP DHBT technology,” IEEE Microwave and Wireless Components Letters, vol. 16, no. 5, May 2006.
[9] S. E. Turner and D. E. Kotecki, “Direct digital synthesizer with sine-weighted DAC at 32 GHz clock frequency in InP DHBT technology,” IEEE Journal of Solid-State Circuits, accepted for publication.
[10] S. E. Turner, R. B. Elder, D. S. Jansen, and D. E. Kotecki, “4-Bit adder-accumulator at 41-GHz clock frequency in InP DHBT technology,” IEEE Microwave and Wireless Components Letters, vol. 15, no. 3, pp. 144–146, March 2005.
[11] G. He, J. Howard, M. Le, P. Partyka, B. Li, G. Kim, R. Hess, R. Bryie, R. Lee, S. Rustomji, J. Pepper, M. Kail, M. Helix, R. B. Elder, D. S. Jansen, N. E. Harff, J. F. Prairie, E. S. Daniel, and B. K. Gilbert, “Self-aligned InP DHBT with ft and fmax over 300 GHz in a new manufacturable technology,” IEEE Electron Device Letters, vol. 25, no. 8, pp. 520–522, 2004.
[12] H. Rucker, B. Heinemann, R. Barth, D. Bolze, J. Drews, U. Haak, W. Hoppner, D. Knoll, K. Kopke, S. Marschmeyer, H. H. Richter, P. Schley, D. Schmidt, R. Scholz, B. Tillack, W. Winkler, H. E. Wulf, and Y. Yamamoto, “SiGe:C BiCMOS technology with 3.6 ps gate delay,” IEDM Tech. Dig., pp. 531–534, 2003.
[13] K. Ishii, H. Nosaka, M. Ida, K. Kurishima, and T. Shibata, “3.21 ps ECL gate using InP/InGaAs DHBT technology,” Electronics Letters, vol. 39, pp. 1434–1436, 2003.
[14] A. Rylyakov and T. Zwick, “96 GHz static frequency divider in SiGe bipolar technology,” GaAs IC Symp. Tech. Dig., pp. 288–290, 2003.
[15] M. Mokhtari, C. Fields, and R. D. Rajavel, “100+ GHz static divider-by-2 circuit in InP-DHBT technology,” GaAs IC Symp. Tech. Dig., pp. 291–293, 2002.
[16] J. C. Zolper, “Challenges and opportunity for InP HBT mixed signal circuit technology,” Int. Conf. on Indium Phosphide and Related Materials, pp. 8–11, May 2003.
[17] J. F. Gunn and R. L. Pritchett, “A bipolar 16K ROM utilizing schottky diode cells,” IEEE International Solid-State Circuits Conference, pp. 118–119, Feb 1977.
[18] J. A. Ludwig, “A 50K bit schottky cell bipolar Read-Only memory,” IEEE Journal of Solid-State Circuits, vol. 15, no. 5, pp. 816–820, October 1980.
[19] J. C. Barrett, A. Bergh, T. Hornak, and J. E. Price, “Design considerations for a high-speed bipolar READ-ONLY memory,” IEEE Journal of Solid-State Circuits, vol. 5, no. 5, pp. 196–202, October 1970.
[20] S. Padoan and A. Boni, “High speed, low voltage ROMs,” 2nd IEEE-CAS Region 8 Workshop on Analog and Mixed IC Design, pp. 18–21, September 1997.
[21] N. Homma, K. Yamaguchi, H. Nanbu, K. Kanetani, Y. Nishioka, A. Uchida, and K. Ogiue, “A 3.5-ns, 2-W, 20-mm2, 16-kbit ECL bipolar RAM,” IEEE Journal of Solid-State Circuits, vol. 21, no. 5, pp. 675–679, October 1986.
[22] T. Douseki and Y. Ohmori, “BiCMOS circuit technology for a high-speed SRAM,” IEEE Journal of Solid-State Circuits, vol. 23, pp. 68–73, February 1988.
[23] D. V. Essl, R. W. Mitterer, B. F. Rehn, and J. R. Domitrowich, “Automated design optimization of integrated switching circuits,” IEEE Journal of Solid-State Circuits, vol. 9, pp. 14–20, February 1974.
[24] K. Kawarada, M. Suzuki, H. Mukai, K. Toyoda, and Y. Kondo, “A fast 7.5 ns access 1K-Bit RAM for cache memory systems,” IEEE Journal of Solid-State Circuits, vol. 13, no. 5, pp. 656–663, October 1978.
[25] C. Chuang, D. D. Tang, G. P. Li, E. Hackbarth, and R. R. Boedeker, “A 1.0-ns 5K-Bit ECL RAM,” IEEE Journal of Solid-State Circuits, vol. 21, no. 5, pp. 670–674, October 1986.
[26] H. Miyanaga, S. Konaka, Y. Kobayashi, Y. Yamamoto, and T. Sakai, “A 0.85-ns 1K-Bit ECL RAM,” IEEE Journal of Solid-State Circuits, vol. 21, no. 4, pp. 501–504, August 1986.
[27] J. Tierney, C. M. Rader, and D. Gold, “A digital frequency synthesizer,” IEEE Trans. on Audio and Electroacoustics, vol. 19, pp. 48–57, March 1971.
[28] H. T. Nicholas, H. Samueli, and B. Kim, “The optimization of direct digital frequency synthesizer performance in the presence of finite word length effects,” Proc. 42nd Ann. Freq. Contr. Symp, pp. 357–363, 1988.
[29] A. Bellaouar, M. S. O’brecht, A. M. Fahim, and M. I. Elmasry, “Low-power direct digital frequency synthesis for wireless communications,” IEEE Journal of Solid-State Circuits, vol. 35, no. 3, pp. 386–390, March 2000.
[30] M. Ghosh, L. S. J. Chimakurthy, F. F. Dai, and R. C. Jaeger, “A novel DDS architecture using nonlinear ROM addressing with improved compression ratio and quantisation noise,” Proceedings of IEEE International Symp. on Circuits and Systems, pp. 705–708, May 2004.
[31] M. M. El Said and M. I. Elmasry, “An improved ROM compression technique for direct digital frequency synthesizer,” IEEE International Symposium on Circuits and Systems, pp. 26–29, May 2002.
[32] S. Manandhar, S. E. Turner, and D. E. Kotecki, “A 20 GHz and 46 GHz, 32x6-bit ROM for DDS application in InP DHBT technology,” International Conference on Electronics, Circuits and Systems, submitted for publication.
[33] H. T. Nicholas and H. Samueli, “A 150-MHz direct digital frequency synthesizer in 1.25um CMOS with -90 dBC spurious performance,” IEEE Journal of Solid-State Circuits, vol. 26, no. 12, pp. 1959–1969, December 1991.
[34] R. E. Makon, M. Lang, R. Driad, K. Schneider, M. Ludwig, R. Aidam, R. Quay, M. Schlechtweb, and G. Weimann, “Over 80 Gbit/s 2:1 multiplexer and low power selector ICs using InP/InGaAs DHBTs,” Electronics Letters, vol. 41, no. 11, pp. 644–646, May 2005.
[35] B. Razavi, Y. Ota, and R. G. Swartz, “Design techniques for low voltage high speed digital bipolar circuits,” IEEE Journal of Solid-State Circuits, vol. 29, no. 3, pp. 332–339, March 1994.
BIOGRAPHY OF THE AUTHOR
Sanjeev Manandhar was born in Kathmandu, Nepal on November 13, 1979. He
received his high school education from St. Xavier’s Campus in Kathmandu.
He entered The University of Maine in 2000 and obtained his Bachelor of Sci-
ence degree in Computer Engineering in 2004.
In September 2004, he was enrolled for graduate study in Electrical Engineering
at The University of Maine and served as a Graduate Research Assistant. His current re-
search interests include digital and mixed-signal circuit design. He is a member of IEEE,
and his interests include playing soccer, photography and hiking. Sanjeev is a candidate
for the Master of Science degree in Electrical Engineering from The University of Maine
in August 2006.