
A SEMINAR REPORT ON ASYNCHRONOUS CHIP

SUBMITTED IN PARTIAL FULFILLMENT FOR THE AWARD OF THE DEGREE OF BACHELOR OF TECHNOLOGY TO RAJASTHAN TECHNICAL UNIVERSITY, KOTA

Submitted By: SHEKHAR SHARMA
Submitted To: Mr. VIKAS TIWARI, Asst. Prof. (E.C.E. Dept.)

ELECTRONICS & COMMUNICATION ENGINEERING
SIDDHI VINAYAK COLLEGE OF SCIENCE & HIGHER EDUCATION, ALWAR
SESSION: 2012

An ISO 9001:2000 certified College
Siddhi Vinayak College of Science & Hr. Education
Approved by A.I.C.T.E., Govt. of Rajasthan & Affiliated to University of Rajasthan
College campus: E-1, B-1, M.I.A. (Ext.), Alwar 301030. Fax: 0144-2881777
Ph: 0144-2881888, 288144, 2882888, 3295489, 9351010431
Visit us at: www.gmcolleges.com  E-mail: siddhivinayak@gmcolleges.com

CERTIFICATE

This is to certify that Mr. Shekhar Kumar Sharma of the Electronics & Communication Branch of Siddhi Vinayak College of Science & Hr. Education, Alwar, has completed his seminar work entitled ASYNCHRONOUS CHIP under my supervision in partial fulfillment of his Degree of Bachelor of Technology of Rajasthan Technical University, Kota. I am fully satisfied with the work carried out by him, which has been reported in this seminar, and all the work done is bonafide to the above-named student. I strongly recommend him for the award of the Degree.

SEMINAR GUIDE: Mr. Vikas Tiwari, Asst. Prof. (E.C.E. Dept.)

ACKNOWLEDGEMENT

First and foremost, I would like to thank my respected parents, who always encouraged me and taught me to think and work innovatively, whatever the field of my life. My sincere thanks go to Mr. Vikas Tiwari (H.O.D., ECE Dept.) for his prodigious guidance, persuasion, and reformative and prudent suggestions throughout my seminar work. It is because of his guidance that I was able to understand every aspect of the seminar presented here.

Finally, it is indeed a great pleasure and privilege to express my thanks to my colleagues, friends and family members for all their help and suggestions.

Date :- ----------- Shekhar Kumar Sharma

ABSTRACT

Breaking the bounds of the clock on a processor may seem a daunting task to those brought up through a typical engineering program. Without the clock, how do you organize the chip and know when you have the correct data or instruction? We may have to take this task on very soon.

Clock speeds are now in the gigahertz range, and there is not much room for speedup before physical realities start to complicate things. With a gigahertz clock powering a chip, signals barely have enough time to make it across the chip before the next clock tick. At this point, speeding up the clock frequency could become disastrous. This is when a chip that is not constricted by clock speed could become very valuable.

Interestingly, the idea of designing a computer processor without a central controlling clock is not a new one. In fact, this idea was suggested as early as 1946, but engineers felt that such an asynchronous design would be too difficult to build with their current, and by today's standards clumsy, technology.

Today, we have the advanced manufacturing devices to make chips extremely accurate. Because of this, it is possible to create prototype processors without a clock. But will these chips catch on? A major hindrance to the development of clockless chips is the competitiveness of the computer industry. Presently, it is nearly impossible for companies to develop and manufacture a clockless chip while keeping the cost reasonable. Until this is possible, clockless chips will not be a major player in the market.

CONTENTS

1. INTRODUCTION
   1.1 DEFINITION
   1.2 CLOCK CONCEPT
2. CLOCKLESS APPROACH
   2.1 CLOCK LIMITATIONS
   2.2 ASYNCHRONOUS VIEW
3. PROBLEMS WITH SYNCHRONOUS CIRCUITS
   3.1 LOW PERFORMANCE
   3.2 LOW SPEED
   3.3 HIGH POWER DISSIPATION
   3.4 HIGH ELECTROMAGNETIC NOISE
4. ASYNCHRONOUS CIRCUITS
   4.1 CLOCKLESS CHIPS IMPLEMENTATION
   4.2 THROWING AWAY THE GLOBAL CLOCK
   4.3 STANDARDISATION OF COMPONENTS
   4.4 HOW CLOCKLESS CHIPS WORK
5. COMPUTERS WITHOUT CLOCKS
   5.1 HOW FAST IS A COMPUTER?
   5.2 BEAT THE CLOCK
   5.3 OVERVIEW / CLOCKLESS SYSTEM
   5.4 LOCAL OPERATION
   5.5 RENDEZVOUS CIRCUITS
   5.6 ARBITER CIRCUIT
   5.7 THE NEED FOR SPEED
6. SIMPLICITY IN DESIGN
   6.1 ASYNCHRONOUS FOR HIGHER PERFORMANCE
   6.2 ASYNCHRONOUS FOR LOW POWER
   6.3 ASYNCHRONOUS FOR LOW NOISE
7. ADVANTAGES OF ASYNCHRONOUS CHIPS
8. APPLICATIONS OF ASYNCHRONOUS CHIPS
   8.1 ASYNCHRONOUS FOR HIGH PERFORMANCE
   8.2 ASYNCHRONOUS FOR LOW POWER
   8.3 ASYNCHRONOUS FOR LOW NOISE AND LOW EMISSION
   8.4 HETEROGENEOUS TIMING
9. A CHALLENGING TIME
10. FUTURE SCOPE
CONCLUSION
REFERENCES

Chapter-1 INTRODUCTION

1.1 DEFINITION

Every action of the computer takes place in tiny steps, each a billionth of a second long. A simple transfer of data may take only one step; complex calculations may take many steps. All operations, however, must begin and end according to the clock's timing signals. The use of a central clock also creates problems. As speeds have increased, distributing the timing signals has become more and more difficult. Present-day transistors can process data so quickly that they can accomplish several steps in the time that it takes a wire to carry a signal from one side of the chip to the other. Keeping the rhythm identical in all parts of a large chip requires careful design and a great deal of electrical power. Wouldn't it be nice to have an alternative?

The clockless approach, which uses a technique known as asynchronous logic, differs from conventional computer circuit design in that the switching on and off of digital circuits is controlled individually by specific pieces of data rather than by a tyrannical clock that forces all of the millions of circuits on a chip to march in unison. It overcomes the disadvantages of a clocked circuit, such as slow speed, high power consumption and high electromagnetic noise. For these reasons, clockless technology is considered the technology that is going to drive the majority of electronic chips in the coming years.

1.2 CLOCK CONCEPT

The clock is a tiny crystal oscillator that resides in the heart of every microprocessor chip. The clock is what sets the basic rhythm used throughout the machine. It orchestrates the synchronous dance of electrons that course through the hundreds of millions of wires and transistors of a modern computer. Such crystals, which tick up to 2 billion times each second in the fastest of today's desktop personal computers, dictate the timing of every circuit in every one of the chips that add, subtract, divide, multiply and move the ones and zeros that are the basic stuff of the information age.

Conventional (synchronous) chips operate under the control of a central clock, which samples data in the registers at precisely timed intervals. Computer chips of today are synchronous: they contain a main clock which controls the timing of the entire chip. One advantage of a clock is that it signals to the devices of the chip when to input or output. Because the global clock samples every register at the same instant, data can flow through the circuit in lockstep, and the designer need not worry about the order in which signals arrive within a clock cycle. The diagram above shows the global clock governing all components in the system that need timing signals. All components operate exactly once per clock tick, and their outputs need to be ready by the next clock tick.

Fig 1.1: Power Dissipation in Synchronous (Left) & Asynchronous (Right)

Chapter-2 CLOCKLESS APPROACH

2.1 CLOCK LIMITATIONS

There are problems that go along with the clock, however. Clock speeds are now in the gigahertz range, and there is not much room for speedup before physical realities start to complicate things. With a gigahertz clock powering a chip, signals barely have enough time to make it across the chip before the next clock tick. At this point, speeding up the clock frequency could become disastrous. This is when a chip that is not constricted by clock speed could become very valuable.

A clock can also be made faster than the logic it drives. The logic circuits are supposed to respond to every tick of the clock, but if they cannot complete their work quickly enough to match that speed, their inputs and outputs can become incorrect. This creates hardware problems, since the chip must then be assembled to keep pace with the clock, and a much more complicated situation arises.

2.2 ASYNCHRONOUS VIEW

By throwing out the clock, chip makers will be able to escape from huge power dissipation. Clockless chips draw power only when there is useful work to do, enabling a huge savings in battery-driven devices. Like a team of horses that can only run as fast as its slowest member, a clocked chip can run no faster than its most slothful piece of logic; the answer isn't guaranteed until every part completes its work. By contrast, the transistors on an asynchronous chip can swap information independently, without needing to wait for everything else. The result? Instead of the entire chip running at the speed of its slowest components, it can run at the average speed of all components. At both Intel and Sun, this approach has led to prototype chips that run two to three times faster than comparable products using conventional circuitry.

Another advantage of clockless chips is that they give off very low levels of electromagnetic noise. The faster the clock, the more difficult it is to prevent a device from interfering with other devices; dispensing with the clock all but eliminates this problem. The combination of low noise and low power consumption makes asynchronous chips a natural choice for mobile devices.

Computer chips of today are synchronous. They contain a main clock, which controls the timing of the entire chip. There are problems, however, involved with these clocked designs that are common today.

One problem is speed. A chip can only work as fast as its slowest component. Therefore, if one part of the chip is especially slow, the other parts of the chip are forced to sit idle. This wasted compute time is obviously detrimental to the speed of the chip.

New problems with speeding up a clocked chip are just around the corner. Clock frequencies are getting so fast that signals can barely cross the chip in one clock cycle. When we get to the point where the clock cannot drive the entire chip, we'll be forced to come up with a solution. One possible solution is a second clock, but this would incur overhead and power consumption, so it is a poor solution. It is also important to note that doubling the frequency of the clock does not double the chip speed; therefore, blindly trying to increase chip speed by increasing frequency without considering other options is foolish.

The other major problem with clocked design is power consumption. The clock consumes more power than any other component of the chip. The most disturbing thing about this is that the clock serves no direct computational use. A clock does not perform operations on information; it simply orchestrates the computational parts of the computer.

New problems with power consumption are arising. As the number of transistors on a chip increases, so does the power used by the clock. Therefore, as we design more complicated chips, power consumption becomes an even more crucial topic. Mobile electronics are the target for many chips.

These chips need to be even more conservative with power consumption in order to have a reasonable battery lifetime. The natural solution to the above problems, as you may have guessed, is to eliminate the source of these headaches: the clock.

Fig 2.1: ASYNCHRONOUS VIEW

Chapter-3 PROBLEMS WITH SYNCHRONOUS CIRCUITS

Synchronous circuits are digital circuits in which the parts are synchronized by clock signals. In an ideal synchronous circuit, every change in the logical levels of its storage components is simultaneous. These transitions follow the level change of a special signal called the clock signal. Ideally, the input to each storage element has reached its final value before the next clock occurs, so the behavior of the whole circuit can be predicted exactly. Practically, some delay is required for each logical operation, resulting in a maximum speed at which each synchronous system can run. However, there are several problems associated with synchronous circuits:

3.1 LOW PERFORMANCE

In a synchronous system, all the components are tied together and the system works at its worst-case execution speed. The speed of execution will not be faster than that of the slowest circuit in the system, and this determines the final working performance of the system. Although there are faster circuits with sophisticated performance, since they depend on other, slower components for the input and output of data, they can no longer run faster than the slowest components. Hence the performance of a synchronous system is limited to its worst-case performance.

3.2 LOW SPEED

A traditional CPU cannot "go faster" than the expected worst-case performance of its slowest stage/instruction/component. When an asynchronous CPU completes an operation more quickly than anticipated, the next stage can immediately begin processing the results, rather than waiting for synchronization with a central clock. An operation might finish faster than normal because of attributes of the data being processed (e.g., multiplication can be very fast when multiplying by 0 or 1, even when running code produced by a brain-dead compiler), or because of the presence of a higher voltage or bus speed setting, or a lower ambient temperature, than 'normal' or expected.

3.3 HIGH POWER DISSIPATION

As we know, the clock is a tiny crystal oscillator that keeps vibrating the whole time the system is powered on. This leads to high power dissipation in synchronous circuits, since they use a central clock for their timing. The clock itself consumes about 30 percent of the total power supplied to the circuit, and this figure can sometimes reach values as high as 70 percent. Even if the synchronous system is not active at the moment, its clock will still be oscillating and consuming power that is dissipated as heat. This makes a synchronous system a heavy power consumer and hence unsuitable for use in the design of mobile and battery-driven devices.
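The data-dependent completion time described in Section 3.2 can be sketched with a toy model. The function name and the step counts below are illustrative assumptions, not a real hardware timing model; they only show why an asynchronous unit can often finish well before its own worst case.

```python
# Toy model of data-dependent completion time (Section 3.2).
# Step counts are assumptions for illustration, not real hardware timings.

def multiply_steps(a, b):
    """Return (product, steps), where steps models how long the unit is busy."""
    if a in (0, 1) or b in (0, 1):
        return a * b, 1                          # trivial operand: early out
    steps = max(a.bit_length(), b.bit_length())  # shift-and-add worst case
    return a * b, steps

# An asynchronous multiplier can hand its result onward after `steps` units;
# a clocked one must always budget for the worst case.
```

Under this model, multiplying by 0 or 1 completes in a single step, while a clocked design would still charge the full worst-case period for it.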

3.4 HIGH ELECTROMAGNETIC NOISE

Since the clock itself is a crystal oscillator, it is associated with electromagnetic waves. These waves produce electromagnetic noise due to the oscillations, and the noise is accompanied by emission spectra. The higher the speed of the clock, the higher the number of oscillations per second, and this leaks a higher level of electromagnetic noise and spectral emission. This is not a good sign for the design of mobile devices either. Apart from the problems above, the clock of a synchronous circuit is globally distributed over components that inevitably run at different speeds, so data can be received and transmitted regardless of the sequential order in which they arrive at the first stage of execution. The design of the clock frequency must also be very careful, since the frequency of the clock is fixed, and a poor match in the design can cause problems in the reusability of resources and in interfacing with mixed-timing devices.

Chapter-4 ASYNCHRONOUS CIRCUITS

Asynchronous circuits are electronic digital circuits whose timing is not governed by a central clock; instead they are standardized in their installation and use handshake signals to communicate with each other. In this case the circuits are not tied together and forced to follow global clock timing signals; each component is loosely coupled, and together they run at their average speed. Asynchrony can be achieved by implementing three vital techniques:

4.1 CLOCKLESS CHIPS IMPLEMENTATION

To achieve asynchrony as the final goal, one must implement the electronic circuits without using a central clock, freeing the system from components tied to the clock. One technique is to use clockless chips in the circuit design. Since these chips do not work with a central clock, they guarantee that the different components are freed from being tied together. The components can then run at their own individual performance and speed, and asynchrony is established.

4.2 THROWING AWAY THE GLOBAL CLOCK

One cannot succeed in implementing asynchrony in circuits while a global clock manages the timing signals of the whole system. Since the clock is installed only to enable the synchronization of components, throwing away the global clock makes it possible for components to be completely unsynchronized, with communication between them handled only by a handshaking mechanism.

4.3 STANDARDISATION OF COMPONENTS

In a synchronous system all the components are bound together so as to be managed by the central clock. This synchrony can be broken if the components are no longer bound together, and standardizing the components is one way to achieve this. Here all the components are made standard within a given range of working performance and speed. There is an average speed at which the system is designed to operate, and worst-case execution is avoided.

4.4 HOW CLOCKLESS CHIPS WORK

Beyond a new generation of design-and-testing equipment, successful development of clockless chips requires an understanding of asynchronous design. Such talent is scarce, as asynchronous principles fly in the face of the way almost every university teaches its engineering students. Conventional chips can have values arrive at a register incorrectly and out of sequence; but in a clockless chip, the values that arrive in registers must be correct the first time. One way to achieve this goal is to pay close attention to such details as the lengths of the wires and the number of logic gates connected to a given register, thereby assuring that signals travel to the register in the proper logical sequence. But that means being far more meticulous about the physical design than synchronous designers have been trained to be.

An alternative is to open up a separate communication channel on the chip. Clocked chips represent ones and zeroes using low and high voltages on a single wire; "dual-rail" circuits, on the other hand, use two wires, giving the chip communications pathways not only to send bits but also to send "handshake" signals indicating when work has been completed. Fant additionally proposes replacing the conventional system of digital logic with what is known as "null convention logic," a scheme that identifies not only "yes" and "no" but also "no answer yet," a convenient way for clockless chips to recognize when an operation has not yet been completed. All of these ideas and approaches are different enough that executing them could confound the mind of an engineer trained to design to the beat of a clock. It is no surprise that the two newest asynchronous startups, Asynchronous Digital Devices and Self-Timed Solutions, are appearing now, when clockless-chip research has been going on the longest. For a chip to be successful, all three elements (design tools, manufacturing efficiency and experienced designers) need to come together. The asynchronous cadre has very promising ideas. There is no way one can obtain purely asynchronous circuits for the complete design of a system, and this is one of the major barriers to clockless implementation, but the circuits were successfully standardized so that they do not have to be in synchronous mode; handshakes were the solution for overcoming synchronization.
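The dual-rail idea with a "no answer yet" state fits in a few lines. The encoding below (two wires per bit, with (0, 0) as NULL) is the standard textbook convention; the helper names are illustrative, not from any real library.

```python
# Dual-rail encoding sketch: each bit travels on two wires (t, f).
# (1,0) = TRUE, (0,1) = FALSE, (0,0) = NULL ("no answer yet"), (1,1) illegal.

NULL = (0, 0)

def encode(bit):
    """Encode a Boolean as a dual-rail wire pair."""
    return (1, 0) if bit else (0, 1)

def decode(pair):
    """Return the bit, or None while the value is still NULL."""
    if pair == (1, 0):
        return True
    if pair == (0, 1):
        return False
    if pair == NULL:
        return None                      # no answer yet
    raise ValueError("illegal dual-rail code (1, 1)")

def complete(word):
    """Completion detection: the result is ready once every bit has left NULL."""
    return all(decode(pair) is not None for pair in word)
```

A receiver checks `complete` on the word's wire pairs instead of waiting for a clock edge: the data themselves announce their own validity.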

A component that needs to communicate with another uses handshake signals to establish the connection and then sets up the time at which it is going to send data; on the other side, the receiving component uses the same kind of handshake to confirm the connection and waits for that time to receive the data.

In circuits implemented with clockless chips, data do not have to move blindly on every cycle as in synchronous circuits, where the movement of data itself is not the essential thing. In asynchronous circuits data are treated as the central aspect, and they move only when they are required to move, such as during transmission between components. This technique offers low power consumption and low electromagnetic noise, and of course there will also be smooth data streaming.

Chapter-5 COMPUTERS WITHOUT CLOCKS

Asynchronous chips improve computer performance by letting each circuit run as fast as it can.

5.1 HOW FAST IS YOUR PERSONAL COMPUTER?

When people ask this question, they are typically referring to the frequency of a minuscule clock inside the computer, a crystal oscillator that sets the basic rhythm used throughout the machine. In a computer with a speed of one gigahertz, for example, the crystal ticks a billion times a second. Every action of the computer takes place in tiny steps; complex calculations may take many steps. All operations, however, must begin and end according to the clock's timing signals.

Since most modern computers use a single rhythm, we call them synchronous. Inside the computer's microprocessor chip, a clock distribution system delivers the timing signals from the crystal oscillator to the various circuits, just as sound in air delivers the beat of a drum to soldiers to set their marching pace. Because all parts of the chip share the same rhythm, the output of any circuit from one step can serve as the input to any other circuit for the next step. The synchronization provided by the clock helps chip designers plan sequences of actions for the computer.

The use of a central clock also creates problems. As speeds have increased, distributing the timing signals has become more and more difficult. Present-day transistors can process data so quickly that they can accomplish several steps in the time that it takes a wire to carry a signal from one side of the chip to the other. Keeping the rhythm identical in all parts of a large chip requires careful design and a great deal of electrical power. Each part of an asynchronous system, by contrast, may extend or shorten the timing of its steps when necessary, much as a hiker takes long or short steps when walking across rough terrain.

Some of the pioneers of the computer age, such as the mathematician Alan M. Turing, tried using asynchronous designs to build machines in the early 1950s. Engineers soon abandoned this approach in favour of synchronous computers because common timing made the design process so much easier. Now asynchronous computing is experiencing a renaissance. Researchers at the University of Manchester in England, the University of Tokyo and the California Institute of Technology have demonstrated asynchronous microprocessors. Some asynchronous chips are already in commercial mass production. In the late 1990s Sharp, the Japanese electronics company, used asynchronous design to build a data-driven media processor (a chip for editing graphics, video and audio), and Philips Electronics produced an asynchronous microcontroller for two of its pagers. Asynchronous parts of otherwise synchronous systems are also beginning to appear; the UltraSPARC IIIi processor recently introduced by Sun includes some asynchronous circuits developed by our group. We believe that asynchronous systems will become ever more popular as researchers learn how to exploit their benefits and develop methods for simplifying their design. Asynchronous chipmakers have achieved a good measure of technical success, but commercial success is still to come. We remain a long way from fulfilling the full promise of asynchrony.

5.2 BEAT THE CLOCK

What are the potential benefits of asynchronous systems? First, asynchrony may speed up computers. In a synchronous chip, the clock's rhythm must be slow enough to accommodate the slowest action in the chip's circuits. If it takes a billionth of a second for one circuit to complete its operation, the chip cannot run faster than one gigahertz. Even though many other circuits on that chip may be able to complete their operations in less time, these circuits must wait until the clock ticks again before proceeding to the next logical step. In contrast, each part of an asynchronous system takes as much or as little time for each action as it needs. Complex operations can take more time than average, and simple ones can take less. Actions can start as soon as the prerequisite actions are done, without waiting for the next tick of the clock. Thus the system's speed depends on the average action time rather than the slowest action time. Coordinating these actions, however, also takes time and chip area. If the effort required for local coordination is small, an asynchronous system may, on average, be faster than a clocked system. Asynchrony offers the most help to irregular chip designs in which slow actions occur infrequently.

Asynchronous design may also reduce a chip's power consumption. In the current generation of large, fast synchronous chips, the circuits that deliver the timing signals take up a good chunk of the chip's area. In addition, as much as 30% of the electrical power used by the chip must be devoted to the clock and its distribution system. Moreover, because the clock is always running, it generates heat whether or not the chip is doing anything useful. In asynchronous systems, idle parts of the chip consume negligible power. This feature is particularly valuable for battery-powered equipment, but it can also cut the cost of larger systems by reducing the need for cooling fans and air-conditioning to prevent them from overheating. The amount of power saved depends on the machine's pattern of activity. Systems with parts that act only occasionally benefit more than systems that act continuously. Most computers have components, such as the floating-point arithmetic unit, that often remain idle for long periods.

Furthermore, asynchronous systems produce less radio interference than synchronous machines do. Because a clocked system uses a fixed rhythm, it broadcasts a strong radio signal at its operating frequency and at the harmonics of that frequency. Such signals can interfere with cellular phones, televisions and aircraft navigation systems that operate at the same frequencies. Asynchronous systems lack a fixed rhythm, so they spread their radiated energy broadly across the radio spectrum, emitting less at any one frequency.
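The average-versus-worst-case argument above reduces to simple arithmetic. The per-action delays below are made-up numbers chosen only to illustrate it, and the calculation ignores the handshake overhead the text mentions.

```python
# Made-up per-action delays (in nanoseconds) for four circuits on a chip.
delays = [1.0, 0.4, 0.5, 0.3]

# A synchronous chip must clock at the pace of the slowest action.
sync_time_per_action = max(delays)                 # 1.0 ns -> 1 GHz ceiling

# An asynchronous chip, ignoring coordination overhead, averages over actions.
async_time_per_action = sum(delays) / len(delays)  # 0.55 ns

speedup = sync_time_per_action / async_time_per_action
print(round(speedup, 2))  # about 1.82x under these assumed delays
```

The conclusion matches the text: the fewer and rarer the slow actions, the more the average pulls ahead of the worst case.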

5.3 OVERVIEW / CLOCKLESS SYSTEM

Most modern computers are synchronous: all their operations are coordinated by the timing signals of tiny crystal oscillators within the machines. Now researchers are designing asynchronous systems that can process data without the need for a governing clock.

Asynchronous systems rely on local coordination circuits to ensure an orderly flow of data. The two most important coordination circuits are called the Rendezvous and the Arbiter.

The potential benefits of asynchronous systems include faster speeds, lower power consumption and less radio interference. As integrated circuits become more complex, chip designers will need to learn asynchronous techniques.

Yet another benefit of asynchronous design is that it can be used to build bridges between clocked computers running at different speeds. Many computing clusters, for instance, link fast PCs with slower machines. These clusters can tackle complex problems by dividing the computational tasks among the PCs. Such a system is inherently asynchronous: different parts march to different beats. Moving data controlled by one clock to the control of another clock requires asynchronous bridges, because data may be out of sync with the receiving clock.

Finally, although asynchronous design can be challenging, it can also be wonderfully flexible. Because the circuits of an asynchronous system need not share a common rhythm, designers have more freedom in choosing the system's parts and determining how they interact. Moreover, replacing any part with a faster version will improve the speed of the entire system.

5.4 LOCAL OPERATION

To describe how asynchronous systems work, we often use the metaphor of the bucket brigade. A clocked system is like a bucket brigade in which each person must pass and receive buckets according to the tick-tock rhythm of the clock. When the clock ticks, each person pushes a bucket forward to the next person down the line. When the clock tocks, each person grasps the bucket pushed forward by the preceding person. The rhythm of this brigade cannot go faster than the time it takes the slowest person to move the heaviest bucket. Even if most of the buckets are light, everyone in the line must wait for the clock to tick before passing the next bucket.

Local cooperation rather than the common clock governs an asynchronous bucket brigade. Each person who holds a bucket can pass it to the next person down the line as soon as the next person's hands are free. Before each action, one person may have to wait until the other is ready. When most of the buckets are light, however, they can move down the line very quickly. Moreover, when there's no water to move, everyone can rest between buckets. A slow person will still hinder the performance of the entire brigade, but replacing the slowpoke will return the system to its best speed.

Bucket brigade

Bucket brigades in computers are called pipelines. A common pipeline executes the computers instructions. Such a pipeline has half a dozen or so stages, each of which acts as a person in a bucket brigade.

For example, a processor executing the instruction ADD A B C must fetch the instruction from memory, decode the instruction, get the numbers from addresses A and B in memory, do the addition and store the sum in memory address C.

Fig: Bundled-data pipeline (processing-logic stages with matched Delay elements on the Req lines; Ack signals flow back for flow control)

Here a bundled-data self-timing scheme is used, where conventional data-processing logic is used along with a separate request (Req) line to indicate data validity. Requests may be delayed by at least the logic delay to ensure that they still indicate data validity at the receiving register. An acknowledge signal (Ack) provides flow control, so the receiving register can tell the transmitting register when to begin sending the next data. A clocked pipeline executes these actions in a rhythm independent of the operations performed or the size of the numbers. In an asynchronous pipeline, though, the duration of each action may depend on the operation performed, the size of the numbers and the location of the data in memory (just as in a bucket brigade the amount of water in a bucket may determine how long it takes to pass the bucket on).
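The Req/Ack exchange can be modelled as a small four-phase (return-to-zero) handshake. The class names and the step-driven simulation below are illustrative assumptions, not a description of any real circuit; they show the protocol's ordering, not its electrical behaviour.

```python
# Four-phase bundled-data handshake sketch: the sender bundles the data,
# raises Req, waits for Ack, then both sides return to zero before the
# next item.

class Sender:
    def __init__(self, items):
        self.items = list(items)
        self.req, self.data = 0, None

    def step(self, ack):
        if self.req == 0 and ack == 0 and self.items:
            self.data = self.items.pop(0)   # bundle the data first...
            self.req = 1                    # ...then signal that it is valid
        elif self.req == 1 and ack == 1:
            self.req = 0                    # acknowledged: return to zero

class Receiver:
    def __init__(self):
        self.ack, self.log = 0, []

    def step(self, req, data):
        if req == 1 and self.ack == 0:
            self.log.append(data)           # Req high, Ack low: latch the data
            self.ack = 1                    # tell the sender we have it
        elif req == 0 and self.ack == 1:
            self.ack = 0                    # sender dropped Req: reset Ack

def run(items):
    s, r = Sender(items), Receiver()
    for _ in range(4 * len(items) + 4):     # enough steps to drain the list
        s.step(r.ack)
        r.step(s.req, s.data)
    return r.log

print(run([3, 1, 4]))  # [3, 1, 4]: items arrive in order, one handshake each
```

Note that nothing in `run` resembles a clock period: each transfer completes as soon as both sides have reacted, which is the point of the scheme.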

Without a clock to govern its actions, an asynchronous system must rely on local coordination circuits instead. These circuits exchange completion signals to ensure that the actions at each stage begin only when the circuits have the data they need. The two most important coordination circuits are called the Rendezvous and the Arbiter circuits.

A Rendezvous element indicates when the last of two or more signals has arrived at a particular stage. Asynchronous systems use these elements to wait until all the concurrent actions finish before starting the next action.

One form of Rendezvous circuit is called the Muller C-element, named after David Muller, now retired from a professorship at the University of Illinois. A Muller C-element is a logic circuit with two inputs and one output. When both inputs of a Muller C-element are TRUE, its output becomes TRUE.

When both inputs are FALSE, its output becomes FALSE. Otherwise the output remains unchanged. Therefore, for a Muller C-element to act as a Rendezvous circuit, its inputs must not change again until its output responds. A chain of Muller C-elements can control the flow of data down an electronic bucket brigade.
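The behaviour just described fits in a few lines of simulation; the class name is chosen for illustration, and the code is a direct transcription of the rule above (follow the inputs when they agree, hold otherwise).

```python
# Muller C-element simulation: the output follows the inputs only when they
# agree; otherwise it holds its previous state (the "rendezvous" behaviour).

class MullerC:
    def __init__(self, out=False):
        self.out = out

    def update(self, a, b):
        if a == b:            # both TRUE or both FALSE: follow the inputs
            self.out = a
        return self.out       # inputs disagree: hold the stored state

c = MullerC()
print(c.update(True, False))  # False: only one signal has arrived, so it waits
print(c.update(True, True))   # True: both signals have arrived (rendezvous)
print(c.update(True, False))  # True: holds until both inputs return to FALSE
```

The "hold" branch is exactly why the element can wait for the last of two signals: its output only moves once both inputs have caught up.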

5.5 RENDEZVOUS CIRCUITS

A Rendezvous circuit can coordinate the actions of an asynchronous system, allowing data to flow in an orderly fashion without the need for a central clock. Shown here is an electronic pipeline controlled by a chain of Muller C-elements, each of which allows data to pass down the line only when the preceding stage is full (indicating that data are ready to move) and the following stage is empty.

Each Muller C-element has two input wires and one output wire. The output changes to FALSE when both inputs are FALSE and back to TRUE when both inputs are TRUE (in the diagram, TRUE signals are shown in blue and FALSE signals in red). The inverter makes the initial inputs to the Muller C-element differ, setting all stages empty at the start. Let's assume that the left input is initially TRUE and the right input FALSE (1). A change in signal at the left input from TRUE to FALSE (2) indicates that the stage to the left is full; that is, some data have arrived. Because the inputs to the Muller C-element are now the same, its output changes to FALSE. This change in signal does three things: it moves data down the pipeline by briefly making the data latch transparent, it sends a FALSE signal back to the preceding C-element to make the left stage empty, and it sends a FALSE signal ahead to the next Muller C-element to make the right stage full (3).
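The discipline this chain enforces can be sketched abstractly: an item advances only from a full stage into an empty neighbour. The list-based model below is an illustrative abstraction of that rule, not a circuit simulation:

```python
def step(stages):
    """One sweep of local moves, rightmost first so items do not leapfrog."""
    for i in range(len(stages) - 2, -1, -1):
        if stages[i] is not None and stages[i + 1] is None:
            stages[i + 1], stages[i] = stages[i], None
    return stages

pipe = ["A", "B", None, None]
print(step(pipe))  # [None, 'A', 'B', None]
print(step(pipe))  # [None, None, 'A', 'B']
```

Note that both items move in the same sweep: each stage acts on purely local full/empty information, which is exactly what the C-element handshakes provide.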

Research groups recently introduced a new kind of Rendezvous circuit called GasP. GasP evolved from an earlier family of circuits designed by Charles E. Molnar at Sun Microsystems. Molnar dubbed his creation asP*, which stands for asynchronous symmetric pulse protocol (the asterisk indicates the double P). The G was added to the name because GasP is what you are supposed to do when you see how fast the new circuits go. It is found that GasP modules are as fast and as energy-efficient as Muller C-elements, fit better with ordinary data latches, and offer much greater versatility in complex designs.

5.6 ARBITER CIRCUIT

An Arbiter circuit performs another task essential for asynchronous computers. An Arbiter is like a traffic officer at an intersection who decides which car may pass through next. Given only one request, an Arbiter promptly permits the corresponding action, delaying any subsequent request until the first action is completed. When an Arbiter gets two requests at once, it must decide which request to grant first.

For example, when two processors request access to a shared memory at approximately the same time, the Arbiter puts the request into a sequence, granting access to only one processor at a time. The Arbiter guarantees that there are never two actions under way at once, just as the traffic officer prevents accidents by ensuring that there are never two cars passing through the intersection on a collision course.

Although Arbiter circuits never grant more than one request at a time, there is no way to build an Arbiter that will always reach a decision within a fixed time limit. Present-day Arbiters reach decisions very quickly on average, usually within a few hundred picoseconds. When faced with close calls, however, the circuits may occasionally take twice as long, and in very rare cases the time needed to make a decision may be 10 times as long as normal.

The fundamental difficulty in making these decisions causes minor dilemmas that are familiar in everyday life. For example, two people approaching a doorway at the same time may pause before deciding who will go through first. They can go through in either order; all that is needed is a way to break the tie.

An Arbiter breaks ties. Like a flip-flop circuit, an Arbiter has two stable states corresponding to the two choices. One can think of these states as the Pacific Ocean and the Gulf of Mexico. Each request to an Arbiter pushes the circuit toward one stable state or the other, just as a hailstone that falls in the Rocky Mountains can roll downhill toward the Pacific or the Gulf. Between the two stable states, however, there must be a meta-stable line, which is equivalent to the Continental Divide. If a hailstone falls precisely on the Divide, it may balance momentarily on that sharp mountain ridge before tipping toward the Pacific or the Gulf. Similarly, if two requests arrive at an Arbiter within a few picoseconds of each other, the circuit may pause in its meta-stable state before reaching one of its stable states to break the tie.
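Behaviorally, an Arbiter is a mutual-exclusion element: every request is eventually granted, but never two at once. A real Arbiter resolves metastability in analog circuitry; in the sketch below a lock plays that role, and the `Arbiter` class and its names are illustrative assumptions:

```python
import threading

class Arbiter:
    """Grants concurrent requests one at a time, in some serial order."""
    def __init__(self):
        self._lock = threading.Lock()
        self.grants = []              # serial order in which requests won

    def request(self, client):
        with self._lock:              # ties are broken here: one winner at a time
            self.grants.append(client)

arb = Arbiter()
threads = [threading.Thread(target=arb.request, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(arb.grants))  # [0, 1, 2, 3]: every request granted exactly once
```

The order in `arb.grants` may differ from run to run, mirroring the point in the text: either order is acceptable, as long as there is never more than one grant at a time.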

5.7 THE NEED FOR SPEED

The research group at Sun Microsystems concentrates on designing fast asynchronous systems. We have found that speed often comes from simplicity. Our initial goal was to build a counterflow pipeline with two opposing data flows, like two parallel bucket brigades moving in opposite directions. We wanted the data from both flows to interact at each stage; the hard challenge was to ensure that every northbound data element would interact with every southbound data element. Arbitration turned out to be essential: at each joint between successive stages, an Arbiter circuit permitted only one element at a time to pass.

This project proved very useful as a research target; we learned a great deal about coordination and arbitration and built test chips to prove the reliability of our Arbiter circuits.

The experiments at Manchester, Caltech and Philips demonstrate that asynchronous microprocessors can be compatible with their clocked counterparts. The asynchronous processors can connect to peripheral machines without special programs or interface circuitry.

Chapter-6 SIMPLICITY IN DESIGN

A clockless chip need not be complex: the one fundamental change is to discard the central clock, after which standardized components can be used extensively. The integrated pipeline model plays an important role in total system design. Four factors matter for the pipeline:
1. Domino logic
2. Delay-insensitive circuits
3. Bundled data
4. Dual rail

Domino logic is a CMOS-based evolution of the dynamic logic techniques that were based on either PMOS or NMOS transistors alone. It allows a rail-to-rail logic swing and was developed to speed up circuits. In a cascade of several stages, the evaluation of each stage triggers the evaluation of the next, like dominoes falling one after the other; the structure is hence called domino CMOS logic. Important features include:
* Smaller area than conventional CMOS logic.
* Smaller parasitic capacitances, so higher operating speeds are possible.
* Glitch-free operation, as each gate can make only one transition.
* Only non-inverting structures are possible, because of the inverting buffer.
* Charge sharing may be a problem.

A delay-insensitive circuit is a type of asynchronous circuit that performs a logic operation, often within a computing processor chip. Instead of using clock signals or other global control signals, the sequencing of computation in a delay-insensitive circuit is determined by the data flow. Typically, handshake signals are used to indicate the readiness of such a circuit to accept new data (the previous computation is complete) and the delivery of such data by the requesting function. Similarly, there may be output handshake signals indicating the readiness of the result and its safe delivery to the next stage in a computational chain or pipeline. In a delay-insensitive circuit there is therefore no need for a clock signal to determine a starting time for a computation; instead, the arrival of data at the input of a sub-circuit triggers the computation to start. Consequently, the next computation can be initiated as soon as the result of the first is complete.

The main advantage of such circuits is their ability to optimize the processing of activities that can take arbitrary periods of time depending on the data or the requested function. An example of a process with a variable completion time is mathematical division, or the retrieval of data that may or may not be in a cache. The delay-insensitive (DI) class is the most robust of all asynchronous circuit delay models: it makes no assumptions about the delay of wires or gates. In this model all transitions on gates or wires must be acknowledged before transitioning again, which prevents unseen transitions from occurring. In DI circuits any transition on an input to a gate must be seen on the output of the gate before a subsequent transition on that input is allowed to happen. This forces some input states or sequences to become illegal. For example, an OR gate must never enter the state where both inputs are one, as the entry to and exit from this state will not be seen on the output of the gate. Although this model is very robust, no practical circuits are possible under such heavy restrictions. Instead, the quasi-delay-insensitive model is the smallest compromise model capable of generating useful computing circuits. For this reason, circuits are often incorrectly referred to as delay-insensitive when they are in fact quasi-delay-insensitive.

6.1 ASYNCHRONOUS FOR HIGHER PERFORMANCE

To increase the performance of the circuit, the following basics must be implemented:
* Data-dependent delays.
* All carry bits need to be computed.
The figures show first a circuit that is not asynchronous, and then a dual-rail circuit in which every bit is taken into the computation.

6.2 ASYNCHRONOUS FOR LOW POWER

Power consumption is a very important aspect of designing any mobile device, and of extending battery capacity and life for battery-driven devices. Asynchronous operation is therefore a natural route to low power dissipation: the circuit should consume power only when and where it is active, and the rest of the time it returns to a non-dissipating state until the next activation. The figure shows how power is reduced, first by halving the incoming frequency with a divide-by-two stage, and then by cascading such stages so that the frequency is divided further at each stage. This provides a crucial reduction in power consumption.

6.3 ASYNCHRONOUS FOR LOW NOISE

Any system with a clock has oscillations that create electromagnetic noise; this is the source of the actual noise one hears from conventional computers. With every clock cycle a spike is emitted, accompanied by emission of a random spectrum of noise. This problem is reduced to a considerable extent by discarding the central clock, as explained above, and the radiated spectra of asynchronous circuits are much smoother.

Chapter-7 ADVANTAGE OF ASYNCHRONOUS CHIP

A clocked chip can run no faster than its most slothful piece of logic; the answer isn't guaranteed until every part completes its work. By contrast, the transistors on an asynchronous chip can swap information independently, without needing to wait for everything else. The result? Instead of the entire chip running at the speed of its slowest components, it can run at the average speed of all components. At both Intel and Sun, this approach has led to prototype chips that run two to three times faster than comparable products using conventional circuitry.

Clockless chips draw power only when there is useful work to do, enabling a huge savings in battery-driven devices; an asynchronous-chip-based pager marketed by Philips Electronics, for example, runs almost twice as long as competitors' products, which use conventional clocked chips. Asynchronous chips use 10 percent to 50 percent less energy than synchronous chips, in which the clocks are constantly drawing power. That makes them ideal for mobile communications applications, which usually need low-power sources, and the chips' quiet nature also makes them more secure, as typical hacking techniques involve listening to clock ticks.

Another advantage of clockless chips is that they give off very low levels of electromagnetic noise. The faster the clock, the more difficult it is to prevent a device from interfering with other devices; dispensing with the clock all but eliminates this problem. The combination of low noise and low power consumption makes asynchronous chips a natural choice for mobile devices. "The low-hanging fruit for clockless chips will be in communications devices," starting with cell phones.

Asynchronous logic would offer better security than conventional chips: "The clock is like a big signal that says, Okay, look now," says Fant. "It's like looking for someone in a marching band. Asynchronous is more like a milling crowd. There's no clear signal to watch. Potential hackers don't know where to begin."

The encryption on existing smart cards can be cracked by analyzing the power consumption at each clock tick, allowing details of the chip's inner workings to be deduced. Such an attack would be far more difficult on a smart card based on asynchronous logic.

Asynchronous circuits can perform encryption in a way that is harder to identify and to crack. Improved encryption makes them an obvious choice for smart cards: the chip-endowed plastic cards beginning to be used for such security-sensitive applications as storage of medical records, electronic funds exchange and personal identification.

Ivan Sutherland of Sun Microsystems, who is regarded as the guru of the field, believes that such chips will have twice the power of conventional designs, which will make them ideal for use in high-performance computers. But Dr Furber suggests that the most promising application for asynchronous chips may be in mobile wireless devices and smart cards.

Different styles

There are several styles of asynchronous design. Conventional chips represent the zeroes and ones of binary digits (bits) using low and high voltages on a particular wire. One clockless approach, called dual rail, uses two wires for each bit: sudden voltage changes on one of the wires represent a zero, and on the other wire a one.

"Dual-rail" circuits use two wires giving the chip communications pathways, not only to send bits, but also to send "handshake" signals to indicate when work has been completed. Replacing the conventional system of digital logic with what he calls "null convention logic," a scheme that identifies not only "yes" and "no," but also "no answer yet"a convenient way for clockless chips to recognize when an operation has not yet been completed.

Another approach is called bundled data. Low and high voltages on 32 wires are used to represent 32 bits, and a change in voltage on a 33rd wire indicates when the values on the other 32 wires are to be used.

Chapter-8 APPLICATION OF ASYNCHRONOUS CHIP

1. High performance.
2. Low power dissipation.
3. Low noise and low electromagnetic emission.
4. A good match with heterogeneous system timing.

8.1 ASYNCHRONOUS FOR HIGH PERFORMANCE

In an asynchronous circuit the next computation step can start immediately after the previous step has completed: there is no need to wait for a transition of the clock signal. This leads, potentially, to a fundamental performance advantage for asynchronous circuits, an advantage that increases with the variability in delays associated with these computation steps. However, part of this advantage is canceled by the overhead required to detect the completion of a step. Furthermore, it may be difficult to translate local timing variability into a global system performance advantage.

Data-dependent delays. The delay of the combinational logic circuit shown in Figure 1 depends on the current state and the value of the primary inputs. The worst-case delay, plus some margin for flip-flop delays and clock skew, is a lower bound for the clock period of a synchronous circuit. Thus, the actual delay is always less (and sometimes much less) than the clock period.

A simple example is an N-bit ripple-carry adder (Figure 2). The worst-case delay occurs when 1 is added to 2^N - 1: the carry then ripples from FA1 to FAN. In the best case there is no carry ripple at all, as, for example, when adding 1 to 0. Assuming random inputs, the average length of the longest carry-propagation chain is bounded by log2(N). For a 32-bit ripple-carry adder the average length is therefore about 5, but the clock period must be more than 6 times longer! On the other hand, the average length determines the average-case delay of an asynchronous ripple-carry adder, which we consider next. In an asynchronous circuit this variation in delays can be exploited by detecting the actual completion of the addition. Most practical solutions use dual-rail encoding of the carry signal (Figure 2(b)); the addition has completed when all internal carry signals have been computed, that is, when each pair (cf_i, ct_i) has made a monotonic transition from (0, 0) to (0, 1) (carry = false) or to (1, 0) (carry = true). Dual-rail encoding of the carry signal has also been applied to a carry-bypass adder. When inputs and outputs are dual-rail encoded as well, completion can be observed from the outputs of the adder.
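The average-versus-worst-case gap can be checked numerically. The sketch below measures the longest run of "propagate" positions (bits where the operands differ), the standard approximation for ripple time; the metric and trial count are illustrative assumptions:

```python
import random

def longest_propagate_run(a, b, n):
    """Longest consecutive run of positions where a carry would propagate."""
    longest = run = 0
    for i in range(n):
        if ((a >> i) & 1) ^ ((b >> i) & 1):  # propagate position
            run += 1
            longest = max(longest, run)
        else:                                # generate or kill: run restarts
            run = 0
    return longest

random.seed(1)
N, TRIALS = 32, 2000
avg = sum(longest_propagate_run(random.getrandbits(N), random.getrandbits(N), N)
          for _ in range(TRIALS)) / TRIALS

print(longest_propagate_run(1, 2**N - 1, N))      # 31 (the worst case, 1 + (2^N - 1))
print(f"average over random inputs: {avg:.2f}")   # near log2(32) = 5
```

An asynchronous adder with completion detection finishes in time proportional to the average printed here, while a clocked adder must always budget for the worst case.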

The controller communicates exclusively with the controllers of the immediately preceding and succeeding stages by means of handshake signaling, and controls the state of the data latches (transparent or opaque). Between the request and the next acknowledge phase the corresponding data wires must be kept stable.

8.2 ASYNCHRONOUS FOR LOW POWER

Dissipating only when and where active: the classic example of a low-power asynchronous circuit is a frequency divider. A D flip-flop with its inverted output fed back to its input divides an incoming (clock) frequency by two (Figure 4(a)). A cascade of N such divide-by-two elements (Figure 4(b)) divides the incoming frequency by 2^N.
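The power behavior of such a cascade can be sketched numerically. Assuming dynamic power tracks toggle rate (a simplifying assumption for illustration), the whole chain dissipates less than twice the power of its first stage, whatever N is; the example uses a 32768 Hz crystal frequency, as in watches:

```python
def cascade_rates(f_in_hz, n_stages):
    """Output frequency of each divide-by-two stage in the cascade."""
    return [f_in_hz / 2 ** (i + 1) for i in range(n_stages)]

rates = cascade_rates(32768, 15)   # 15 stages take 32 kHz down to 1 Hz
print(rates[-1])                   # 1.0

# Power relative to the first stage: 1 + 1/2 + 1/4 + ... < 2
relative_power = sum(r / rates[0] for r in rates)
print(relative_power < 2)          # True: bounded, independent of chain length
```

A synchronous divider, by contrast, would clock all 15 stages at the full input rate, so its power would grow in proportion to N rather than staying bounded.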

The second element runs at only half the rate of the first one and hence dissipates only half the power; the third dissipates only a quarter, and so on. Hence, the entire asynchronous cascade consumes, over a given period of time, slightly less than twice the power of its head element, independent of N; that is, fixed power dissipation is obtained. In contrast, a similar synchronous divider would dissipate power in proportion to N. A cascade of 15 such divide-by-two elements is used in watches to convert a 32 kHz crystal clock down to a 1 Hz clock.

The potential of asynchronous circuits for low power depends on the application. For example, in a digital filter where the clock rate equals the data rate, all flip-flops and all combinational circuits are active during each clock cycle; little or nothing can then be gained by implementing the filter as an asynchronous circuit. However, in many digital signal-processing functions the clock rate exceeds the data (signal) rate by a large factor, sometimes by several orders of magnitude. In such circuits only a small fraction of the registers change state during a clock cycle, and this fraction may be highly data dependent. The clock frequency is chosen that high to accommodate sequential algorithms that share resources over subsequent computation steps. One benefit is vastly improved electrical efficiency, which leads directly to prolonged battery life.

One application for which asynchronous circuits can save power is a Reed-Solomon error corrector operating at audio rates, as demonstrated at Philips Research Laboratories. Two different asynchronous realizations of this decoder (single-rail and dual-rail) were compared with a synchronous (product) version. The single-rail version was clearly superior and consumed five times less power than the synchronous one.

A second example is the infrared communications receiver IC designed at Hewlett-Packard/Stanford.
The receiver IC draws only leakage current while waiting for incoming data, but can start up as soon as a signal arrives so that it loses no data. Also, most modules operate well below the maximum frequency of operation.

The filter bank for a digital hearing aid was the subject of another successful demonstration, this time by the Technical University of Denmark in cooperation with Oticon Inc. They re-implemented an existing filter bank as a fully asynchronous circuit; the result was a factor of five less power consumption.

A fourth application is a pager in which several power-hungry sub-circuits were redesigned as asynchronous circuits, as shown later in this issue.

8.3 ASYNCHRONOUS FOR LOW NOISE AND LOW EMISSION

Sub-circuits of a system may interact in unintended and often subtle ways. For example, a digital sub-circuit generates voltage noise on the power-supply lines or induces currents in the silicon substrate. This noise may affect the performance of an analog-to-digital converter that draws power from the same source or is integrated on the same substrate. Another example is a digital sub-circuit that emits electromagnetic radiation at its clock frequency (and the higher harmonic frequencies), and a radio receiver sub-circuit that mistakes this radiation for a radio signal. Due to the absence of a clock, asynchronous circuits may have better noise and EMC (Electro-Magnetic Compatibility) properties than synchronous circuits.

This advantage can be appreciated by analyzing the supply current of a clocked circuit in both the time and frequency domains. The activity of a clocked circuit is usually maximal shortly after the productive clock edge; it gradually fades away, and the circuit must become totally quiescent before the next productive clock edge. Viewed differently, the clock signal modulates the supply current, as depicted schematically in Figure 5(a).
Due to parasitic resistance and inductance in the on-chip and off-chip supply wiring, this causes noise on the on-chip power and ground lines.

8.4 HETEROGENEOUS TIMING

Two ongoing trends affect the timing of a system-on-a-chip: the relative increase of interconnect delays versus gate delays, and the rapid growth of design reuse. Their combined effect is an increasingly heterogeneous organization of system-on-a-chip timing. According to Figure 7, gate delays rapidly decrease with each technology generation. By contrast, the delay of a piece of interconnect of fixed modest length increases, soon leading to a dominance of interconnect delay over gate delay. The introduction of additional interconnect layers and new materials (copper and low-dielectric-constant insulators) may slow this trend somewhat. Nevertheless, new circuits and architectures are required to circumvent these parasitic limitations. For example, across-chip communication may no longer fit within a single clock period of a processor core.

Heterogeneous system timing will pose considerable design challenges for system-level interconnect, including buses, FIFOs, switch matrices, routers, and multi-port memories. Asynchrony makes it easier to interconnect a variety of different clock domains without worrying about synchronization problems, differences in clock phases and frequencies, and clock skew. Hence, new opportunities will arise for asynchronous interconnect structures and protocols. Once asynchronous on-chip interconnect structures are accepted, the threshold to introduce asynchronous clients to these interconnects is lowered as well. Mixed synchronous-asynchronous circuits also hold promise.

Chapter-9 A CHALLENGING TIME

Although the architectural freedom of asynchronous systems is a great benefit, it also poses a difficult challenge. Because each part sets its own pace, that pace may vary from time to time in any one system and may vary from system to system. If several actions are concurrent, they may finish in a large number of possible sequences. Enumerating all the possible sequences of actions in a complex asynchronous chip is as difficult as predicting the sequences of actions in a school yard full of children. This dilemma is called the state explosion problem.

Can chip designers create order out of the potential chaos of concurrent actions?

Fortunately, researchers are developing theories for tackling this problem. Designers need not worry about all the possible sequences of actions if they can set certain limitations on the communication behavior of each circuit. To continue the schoolyard metaphor, a teacher can promote safe play by teaching each child how to avoid danger.

Another difficulty is that we lack mature design tools, accepted testing methods and widespread education in asynchronous design. A growing research community is making good progress, but the present total investment in clock-free computing pales in comparison with the investment in clocked design. Nevertheless, we are confident that the relentless advances in the speed and complexity of integrated circuits will force designers to learn asynchronous techniques. We do not know yet whether asynchronous systems will flourish first within large computer and electronics companies or within start-up companies eager to develop new ideas. The technological trend, however, is inevitable: in the coming decades, asynchronous design will become prevalent.

Chapter-10 FUTURE SCOPE

The first place we'll see, and have already seen, clockless designs is in the lab. Many prototypes will be necessary to create reliable designs. Manufacturing techniques must also be improved so the chips can be mass-produced.

The second place we'll see these chips is in mobile electronics. This is an ideal place to implement a clockless chip because of the minimal power consumption. Also, low levels of electromagnetic noise create less interference, which is critical in designs with many components packed very tightly, as is the case with mobile electronics.

The third place is in personal computers (PCs). Clockless designs will arrive here last because of the competitive PC market.

It is essential in that market to create an efficient design that is reasonably priced. A manufacturing cost increase of a couple of cents per chip can cause an entire line of computers to fail because of the large cost increase passed on to the customer. Therefore, the manufacturing process must be improved to create a reasonably priced chip.

CONCLUSION

Clocks have served the electronics design industry very well for a long time, but there are significant difficulties looming for clocked design in the future. These difficulties are most obvious in complex SoC development, where electrical noise, power and design costs threaten to render the potential of future process technologies inaccessible to clocked design.

Self-timed design offers an alternative paradigm that addresses these problem areas, but until now VLSI designers have largely ignored it. Things are beginning to change, however: self-timed design is poised to emerge as a viable alternative to clocked design. The drawbacks, namely the lack of design tools and of designers capable of handling self-timed design, are beginning to be addressed, and a few companies (including a couple of start-ups, Theseus Logic, Inc. and Cogency Technology, Inc.) have made significant commitments to the technology.

Although full-scale commercial demonstrations of the value of self-timed design are still few in number, the examples available demonstrate that there are no showstoppers to threaten the ultimate viability of this strategy. Self-timed technology is poised to make an impact, and there are significant rewards on offer to those brave enough to take the lead in its exploitation.

REFERENCES

1. Scanning the Technology: Applications of Asynchronous Circuits, C. H. (Kees) van Berkel, Mark B. Josephs, and Steven M. Nowick, Proceedings of the IEEE, December 1998.
2. Computers without Clocks, Ivan E. Sutherland and Jo Ebergen, Scientific American, August 2002.
3. Is It Time for Clockless Chips?, David Geer, IEEE Computer Society, March 2005.
4. Guest Editors' Introduction: Clockless VLSI Systems, Soha Hassoun, Yong-Bin Kim and Fabrizio Lombardi, copublished by IEEE CS and IEEE, November-December 2005.
5. It's Time for Clockless Chips, Claire Tristram, MIT Technology Review, October 2001.
6. Old Tricks for New Chips, The Economist print edition, April 19th, 2001.