08-17-2017, 12:26 AM
INTRODUCTION
How fast is your personal computer? When people ask this question,
they are typically referring to the frequency of a minuscule clock
inside the computer, a crystal oscillator that sets the basic rhythm
used throughout the machine. In a computer with a speed of one
gigahertz, for example, the crystal "ticks" a billion times a second.
Every action of the computer takes place in tiny steps, each a
billionth of a second long. A simple transfer of data may take only
one step; complex calculations may take many steps. All operations,
however, must begin and end according to the clock's timing signals.
The use of a central clock also creates problems. As speeds have
increased, distributing the timing signals has become more and more
difficult. Present-day transistors can process data so quickly that
they can accomplish several steps in the time that it takes a wire to
carry a signal from one side of the chip to the other. Keeping the
rhythm identical in all parts of a large chip requires careful design
and a great deal of electrical power. Wouldn't it be nice to have an
alternative?
The clockless approach, which uses a technique known as asynchronous
logic, differs from conventional computer circuit design in that the
switching on and off of digital circuits is controlled individually
by specific pieces of data rather than by a tyrannical clock that
forces all of the millions of circuits on a chip to march in unison.
It addresses the main disadvantages of a clocked circuit, such as
limited speed, high power consumption and high electromagnetic noise.
For these reasons, clockless technology is considered a technology
that is going to drive the majority of electronic chips in the coming
years.
A BRIEF HISTORY
CONCEPT OF CLOCKS
The clock is a tiny crystal oscillator that resides in the heart of
every microprocessor chip. The clock sets the basic rhythm used
throughout the machine. It orchestrates the synchronous dance of
electrons that course through the hundreds of millions of wires and
transistors of a modern computer.
Such crystals, which tick up to 2 billion times each second in the
fastest of today's desktop personal computers, dictate the timing of
every circuit in every one of the chips that add, subtract, divide,
multiply and move the ones and zeros that are the basic stuff of the
information age.
Conventional (synchronous) chips operate under the control of a
central clock, which samples data in the registers at precisely timed
intervals. Computer chips of today are synchronous: they contain a
main clock that controls the timing of the entire chip. One advantage
of a clock is that it signals to the devices on the chip when to
input or output data. This property of synchronous design makes
designing the chip much easier. There are problems that go along with
the clock, however.
Clock speeds are now in the gigahertz range and there is not much
room for speedup before physical realities start to complicate
things. With a gigahertz clock driving a chip, signals barely have
enough time to make it across the chip before the next clock tick. At
this point, raising the clock frequency further could become
disastrous. This is when a chip that is not constrained by clock
speed could become very valuable.
WORKING OF A SYNCHRONOUS CIRCUIT
This is the working model of a particular synchronous circuit. A
synchronous circuit looks for a particular event on the clock signal.
In this case, the circuit is looking for the leading edge of the
clock pulse. As we see in the figure, all actions in this circuit
take place only on the leading edge of the clock. In particular, when
data is to be transferred into the registers, the computations settle
and then wait for the next leading edge of the clock. Only then is
the data transferred to the next register.
The figure gives a clear idea of how conventional chips operate under
the control of a central clock, which samples data in the registers
at precisely timed intervals. The main thing the designers have to
think about is how to complete one operation during a single tick of
the clock. It is extremely important to design the circuits so that
all the computations settle and are ready for the next logical
operation before the next clock tick.
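The settling requirement can be written down as a small check: the
clock period has to be at least as long as the settling time of the
slowest stage. The following is a minimal sketch in Python, not taken
from the original report; the stage names and delay values are
hypothetical and chosen only for illustration.

```python
def min_clock_period_ns(stage_delays_ns):
    """Smallest clock period that lets every stage settle before the next tick."""
    return max(stage_delays_ns.values())


def max_clock_frequency_ghz(stage_delays_ns):
    """Highest clock frequency (GHz) allowed by the slowest stage (period in ns)."""
    return 1.0 / min_clock_period_ns(stage_delays_ns)


def stages_violating(stage_delays_ns, clock_period_ns):
    """Names of stages that cannot settle within the given clock period."""
    return sorted(name for name, delay in stage_delays_ns.items()
                  if delay > clock_period_ns)


# Hypothetical settling times, in nanoseconds, for four stages of a chip.
stage_delays_ns = {"fetch": 0.6, "decode": 0.4, "execute": 1.0, "writeback": 0.5}
```

With these assumed numbers the slowest stage ("execute") needs 1.0 ns,
so the whole chip is limited to a 1 GHz clock even though the other
stages would be ready sooner.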
PROBLEMS OF SYNCHRONOUS CIRCUITS
One problem is speed. A chip can only work as fast as its slowest
component. Therefore, if one part of the chip is especially slow, the
other parts of the chip are forced to sit idle. This wasted computing
time is obviously detrimental to the speed of the chip.
New problems with speeding up a clocked chip are just around the
corner. Clock frequencies are getting so high that signals can barely
cross the chip in one clock cycle. When we get to the point where the
clock cannot drive the entire chip, we'll be forced to come up with a
solution. One possible solution is a second clock, but this incurs
overhead and extra power consumption, so it is a poor solution. It is
also important to note that doubling the frequency of the clock does
not double the chip speed, so blindly trying to increase chip speed
by increasing frequency without considering other options is foolish.
The other major problem with a clocked design is power consumption.
The clock consumes more power than any other component of the chip.
The most disturbing thing about this is that the clock serves no
direct computational use. A clock does not perform operations on
information; it simply orchestrates the computational parts of the
computer.
New problems with power consumption are arising. As the number of
transistors on a chip increases, so does the power used by the clock.
Therefore, as we design more complicated chips, power consumption
becomes an even more crucial topic. Mobile electronics are the target
for many chips. These chips need to be even more conservative with
power consumption in order to have a reasonable battery lifetime.
The natural solution to the above problems, as you may have guessed,
is to eliminate the source of these headaches: the clock.
CONCEPT OF CLOCKLESS CHIPS
The main concept behind a clockless design is evident from the name
itself: such chips don't have a global clock that synchronizes their
actions. So there must be some other control mechanism that
synchronizes the components inside a clockless chip to ensure that
the chip works correctly. Clockless chips rely on handshaking
signals, handoff signals and sometimes a local clock to synchronize
their actions.
By throwing out the clock, chip makers will be able to escape the
problems of synchronous circuits. Clockless chips draw power only
when there is useful work to do, enabling huge savings in
battery-driven devices; an asynchronous-chip-based pager marketed by
Philips Electronics, for example, runs almost twice as long as
competitors' products, which use conventional clocked chips.
Like a team of horses that can only run as fast as its slowest
member, a clocked chip can run no faster than its most slothful piece
of logic; the answer isn't guaranteed until every part completes its
work. By contrast, the transistors on an asynchronous chip can swap
information independently, without needing to wait for everything
else. The result? Instead of the entire chip running at the speed of
its slowest components, it can run at the average speed of all
components. At both Intel and Sun, this approach has led to prototype
chips that run two to three times faster than comparable products
using conventional circuitry.
Another advantage of clockless chips is that they give off very low
levels of electromagnetic noise. The faster the clock, the more
difficult it is to prevent a device from interfering with other
devices; dispensing with the clock all but eliminates this problem.
WORKING OF ASYNCHRONOUS CIRCUIT
Clockless (also called asynchronous, self-timed or event-driven)
chips dispense with the timepiece. The figure below gives an idea of
the working of an asynchronous circuit. In this particular scheme
(called a dual-rail circuit, which will be discussed later), data
moves instead under the control of local "handshake" signals (lines
below) that indicate when work has been completed and is ready for
the next logic operation.
As we can see above, there is the usual logic circuitry, but instead
of a clock signal controlling the circuit there are two lines, at the
top and bottom. The wires are used to transfer the data bits and the
control bits together, so there is no separate control signal going
across the circuit. The control signal is encoded within the data
that is being transferred. These control signals act as handshaking
and handoff signals, which indicate when a component is ready for the
next logical operation.
There are different ways to implement an asynchronous circuit. The
next part is about the various types of implementation.
TYPES OF IMPLEMENTATIONS
There are mainly three kinds of implementations of an asynchronous
circuit. They are the following.
1. BOUNDED DELAY METHOD
2. DELAY INSENSITIVE METHOD
3. NULL CONVENTION LOGIC (NCL)
The simplest implementation of asynchronous design is the
Bounded-Delay method. This design is very similar to synchronous
design; in Bounded-Delay design we assume that we know the largest
amount of time each component needs to perform its task. Knowing the
bounds of the delay time allows computations to be sped up.
The Delay-Insensitive method, which is quite the opposite of
Bounded-Delay, does not assume any bounds on time. As a result,
handshaking is needed between components.
Another way of implementing an asynchronous design is to use NULL
Convention Logic (NCL). This convention uses a NULL state when data
is in the reset phase, as opposed to DATA in the set phase. The
theory behind NCL is simple. If a gate has any inputs that are NULL,
then its output is NULL. Once all of the gate's inputs are DATA, the
output of the gate is DATA. In this way, the gates do not need to be
clocked, because they do their computation as soon as possible.
THE GENERAL MODEL
[Figure: general model. A logic circuit with Input and Output, and an
attached completion detection unit with Go and Done signals.]
The general model of an asynchronous design implementation is shown
above. In this circuit we can see a logic circuit that performs the
same operation as in the synchronous circuit. This is the actual
logic that does all the calculation.
Attached to this logic circuit is the completion detection unit,
which helps the circuit proceed in a controlled fashion, i.e. without
error. The completion detection unit indicates when the circuit has
completed its action and when it is ready for the next action.
The input signal to the combinational logic and the go signal to the
completion detection circuit arrive simultaneously. When the
combinational logic is done with the input signal, the completion
detection circuit produces a done signal. This signal indicates that
the result may pass to the next stage. In some cases the done signal
acts as the go signal for the completion detection circuit of the
next stage.
BOUNDED DELAY
[Figure: bounded delay circuit. Combinational logic with Input and
Output, and a prototype delay element with Go and Done signals; the
figure also carries the label "Early?".]
The above circuit shows the working model of a bounded delay circuit.
The bounded delay method is quite similar to the design of
synchronous circuits. In the bounded delay method we assume that we
know the maximum time a component takes to complete its work. This is
kept in mind while designing the asynchronous circuit, i.e. the
circuit is designed so that control is transferred to the next
circuit only when the previous component has completed its work. To
do this, the maximum time a circuit takes is introduced as the
prototype delay.
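The role of the prototype delay can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
function names are hypothetical, times are in arbitrary units, and
the model simply treats "done" as being raised a fixed prototype
delay after "go", whatever the logic actually needed.

```python
def bounded_delay_done_time(actual_delay, prototype_delay):
    """Time after 'go' at which 'done' is raised in a bounded-delay stage.

    The stage never looks at the logic itself; it only waits for the
    prototype delay, which must cover the slowest possible computation.
    """
    if actual_delay > prototype_delay:
        # The design assumption (a known upper bound) has been violated.
        raise ValueError("logic is slower than the assumed prototype delay")
    return prototype_delay


def idle_wait(actual_delay, prototype_delay):
    """Time the result sits finished but unused, waiting for 'done'."""
    return bounded_delay_done_time(actual_delay, prototype_delay) - actual_delay
```

For example, with an assumed prototype delay of 10 units, a
computation that really takes 3 units still reports completion after
10 units and spends 7 of them waiting.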
Comparing the circuit with the general model, we can see that the
circuit which introduces the prototype delay acts as the completion
detection circuit in the bounded delay method. That is, a component
is considered to have finished its work when the introduced delay is
over.
But this kind of implementation has a disadvantage. Here we assume
the maximum time taken and introduce it as the delay. So early
completion is not possible even if the circuit doesn't take the
maximum time; it is forced to wait until the delay is over.
DELAY-INSENSITIVE METHOD
Contrary to the bounded delay method, which assumes bounds on time,
the delay-insensitive method doesn't assume any bounds on time.
Therefore communication between independent components is essential.
This is done with the help of handshake and handoff signals. These
signals indicate when the job of a component is over.
There are many ways in which a delay-insensitive method can be
implemented. The most popular and efficient is the dual-rail encoding
method. In this method each signal is carried on two wires, and the
values on both wires together convey the data and the control
information.
In one method each signal X is encoded with two wires, XH and XL. The
encoding scheme is shown below.
XH=0 XL=0 -- Data not ready.
XH=0 XL=1 -- Logical 0.
XH=1 XL=0 -- Logical 1.
XH=1 XL=1 -- Not used.
As we see from the coding above, each wire in the logic circuit now
needs two wires in a dual-rail circuit. So a two-input gate has a
total of four input wires and two output wires. Thus special kinds of
gates are required to implement the logic. The AND, OR and NOT gates
are shown below.
[Figure: dual-rail AND, OR and NOT gates, with inputs AH, AL, BH, BL
(XH, XL for the NOT gate) and outputs CH, CL.]
AND gate: CH=AH*BH CL=AL+BL
OR gate:  CH=AH+BH CL=AL*BL
The NOT gate is simple to implement, as the only thing we need to do
is swap the wires, i.e. CH=XL and CL=XH.
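The encoding table and the three gate equations above can be tried
out with a short Python sketch. The class and function names are
hypothetical, introduced only for this illustration; the gate bodies
are a direct transcription of the equations CH=AH*BH, CL=AL+BL (AND),
CH=AH+BH, CL=AL*BL (OR) and CH=XL, CL=XH (NOT).

```python
from typing import NamedTuple, Optional


class DualRail(NamedTuple):
    """One logical signal X carried on two wires, XH and XL."""
    h: int  # XH wire
    l: int  # XL wire


NOT_READY = DualRail(0, 0)  # XH=0, XL=0: data not ready


def encode(bit: Optional[int]) -> DualRail:
    """Map None / 0 / 1 to 'not ready' / logical 0 / logical 1."""
    if bit is None:
        return NOT_READY
    return DualRail(1, 0) if bit else DualRail(0, 1)


def decode(x: DualRail) -> Optional[int]:
    """Inverse of encode; XH=1, XL=1 is the unused code word."""
    if x == DualRail(1, 1):
        raise ValueError("XH=1, XL=1 is not used in this encoding")
    if x == NOT_READY:
        return None
    return x.h


def and_gate(a: DualRail, b: DualRail) -> DualRail:
    return DualRail(a.h & b.h, a.l | b.l)  # CH=AH*BH, CL=AL+BL


def or_gate(a: DualRail, b: DualRail) -> DualRail:
    return DualRail(a.h | b.h, a.l & b.l)  # CH=AH+BH, CL=AL*BL


def not_gate(x: DualRail) -> DualRail:
    return DualRail(x.l, x.h)  # swap the two wires
```

Note that with these equations the AND gate reports logical 0 as soon
as one input is logical 0, even if the other input is still "not
ready"; the OR gate behaves the same way for logical 1.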
NULL CONVENTION LOGIC
NULL Convention Logic (NCL) is a logic that integrates data
transformation and control into a single expression, thus yielding
inherently clockless, delay-insensitive circuits and systems. NCL
enables solutions for digital designs facing critical power, noise or
system integration issues. The following NCL features enable the
designer to solve these problems:
NCL uses a combination of multiwire data representation and a
control/signaling protocol: NCL circuits switch between a
voltage-based data representation of DATA and a control
representation of NULL. This separation between control and data
representations provides self-synchronization throughout the design.
No clock is needed.
NCL uses threshold gates with hysteresis: Threshold gates provide the
basic building block of NCL designs. Threshold gate inputs and
outputs can be in one of two states, DATA or NULL. A threshold gate
starting with its output in the NULL state will remain in the NULL
state until the specified number of inputs is placed in the DATA
state. Once the gate reaches the DATA state, it remains in this state
until all of the inputs return to the NULL state. The hysteresis in
the threshold gate keeps it from switching during the intermediate
states, when the number of inputs in the DATA state is greater than
zero but less than the threshold. In addition, hysteresis provides
the storage needed to remain at DATA until all of the inputs have
returned to NULL. Since these gates use two values, as traditional
Boolean logic does, they can be constructed with traditional CMOS (or
bipolar) processes.
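The threshold-with-hysteresis behaviour described above can be
modelled with a small state machine. This is a behavioural sketch in
Python, not a circuit description; the class name, its parameters and
the string constants are assumptions made for the example.

```python
DATA, NULL = "DATA", "NULL"


class ThresholdGate:
    """Behavioural model of an m-of-n threshold gate with hysteresis."""

    def __init__(self, threshold, num_inputs):
        if not 1 <= threshold <= num_inputs:
            raise ValueError("threshold must be between 1 and num_inputs")
        self.threshold = threshold
        self.num_inputs = num_inputs
        self.output = NULL  # the gate starts in the NULL state

    def update(self, inputs):
        """Apply a new set of input states and return the gate output."""
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        data_count = sum(1 for state in inputs if state == DATA)
        if data_count >= self.threshold:
            self.output = DATA      # enough inputs are DATA: switch to DATA
        elif data_count == 0:
            self.output = NULL      # all inputs are NULL: reset to NULL
        # Otherwise (some DATA, but below threshold) the gate holds its
        # previous output; this is the hysteresis.
        return self.output
```

A 2-of-3 gate built this way stays at NULL with one DATA input,
switches to DATA with two, keeps DATA when one input drops back to
NULL, and returns to NULL only once all three inputs are NULL.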
MERITS OF ASYNCHRONOUS CIRCUITS
There are mainly three advantages of clockless design. They are
Increase in speed.
Reduced power consumption.
Less electromagnetic noise.
The first of these advantages is speed. A chip can run at the average
speed of all its components instead of the speed of the slowest
component, as is the case with a clocked design. The transistors on
an asynchronous chip can swap information independently, without
needing to wait for everything else. At both Intel and Sun
Microsystems, this approach has led to prototype chips that run two
to three times faster than comparable products using conventional
circuitry. The speed of an asynchronous design is therefore not
limited by the size of the chip. An example of how much an
asynchronous design can improve speed is the asynchronous Pentium
designed by Intel in 1997, which runs three times as fast as the
synchronous equivalent.
Another crucial advantage of clockless chips is the reduction in
power consumption. The reason for this is that asynchronous chips use
power only during computations, while a clocked circuit is always
running. The Intel Pentium referred to above ran three times faster
than the clocked equivalent with half the power.
The third advantage of the clockless design is that it produces less
electromagnetic noise, which interferes with the working frequencies
of other signals.
SPEED COMPARISON
The above figure of the bucket brigade can be used to describe the
flow of data in a computer. A synchronous computer system is like a
bucket brigade in which each person follows the tick-tock rhythm of a
clock. When the clock ticks, each person pushes a bucket forward to
the next person down the line (top). When the clock tocks, each
person grasps the bucket pushed forward by the preceding person
(middle). An asynchronous system, in contrast, is like an ordinary
bucket brigade: each person who holds a bucket can pass it down the
line as soon as the next person's hands are free.
It is quite evident from the above metaphor why the clockless chip is
faster and more effective. Clockless chips can run at the average
speed of all their components rather than adopting the speed of the
slowest member. This is because an action is not restricted by rules
like those in a clocked design.
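The difference between "slowest member" and "average" timing can be
made concrete with a toy calculation. The sketch below is a
simplified model under stated assumptions: operations run one after
another, the clocked design allots the worst-case delay to every
operation, the clockless design moves on as soon as each operation
finishes, and handshake overhead is ignored. The delay values are
hypothetical.

```python
def synchronous_time(delays):
    """Every operation is given one clock period sized for the slowest one."""
    if not delays:
        return 0
    return len(delays) * max(delays)


def asynchronous_time(delays):
    """Each operation hands off as soon as it is done."""
    return sum(delays)


# Hypothetical per-operation delays in nanoseconds: mostly fast, one slow.
delays = [1, 1, 4, 1, 1, 2, 1, 1]

speedup = synchronous_time(delays) / asynchronous_time(delays)
```

With these assumed delays the clocked sequence takes 32 ns and the
clockless one 12 ns. The gain comes entirely from the spread between
the worst-case and the typical delay; if every operation took the
same time, the two totals would be equal.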
POWER & NOISE CHARACTERISTICS
POWER
The above graphic was obtained from the official Philips website. It
illustrates the power-saving character of the clockless chip. It
shows an experiment to find out the heat emitted by a chip by placing
it under a special kind of light. The chip on the left is a
synchronous chip and the one on the right is its asynchronous
equivalent. The red spots on the chip indicate the positions where
heat is dissipated. It is clear from the figure that the synchronous
chip emits more heat, as it shows more red spots in the light
emission measurements.
A chip that produces more heat consumes more power, so the
synchronous chip clearly consumes more power than its asynchronous
equivalent.
The reason for this is that asynchronous chips use power only during
computations, while a clocked chip always consumes power because it
is always running. The clock together with its timing circuits not
only takes up a good part of the chip area, but also accounts for 30%
of the total power consumed by the chip. Removing it would therefore
increase battery life. It is also important to note that the idle
parts of a clockless chip consume a negligible amount of power.
This reasoning applies most directly to mobile electronics, where a
battery is used to drive the chip. One would think that it is not
much of an issue for a computer or other device that can be plugged
in. But in this case the chip can cut the cost of designing such
equipment by reducing the need for cooling fans, air-conditioning and
other cooling equipment to prevent overheating. The amount of power
saved will depend on the machine's pattern of activity. Systems with
parts that act occasionally benefit more than systems that act
continuously. Most computers have components, such as the
floating-point arithmetic unit, that often remain idle for long
periods.
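The dependence of the saving on the pattern of activity can be
illustrated with an idealized model. This sketch rests on loud
assumptions that are not measurements from the report: a clocked unit
is charged one unit of energy in every time step, a clockless unit
only in the steps where it is busy, and idle power, leakage and
handshake overhead are all treated as zero. The activity traces are
hypothetical.

```python
def clocked_energy(activity, energy_per_step=1):
    """A clocked unit is driven in every time step, busy or not."""
    return len(activity) * energy_per_step


def clockless_energy(activity, energy_per_step=1):
    """A clockless unit draws energy only in the steps where it is busy."""
    return sum(1 for busy in activity if busy) * energy_per_step


# Hypothetical activity traces over ten time steps (1 = busy, 0 = idle).
floating_point_unit = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]  # mostly idle
integer_unit = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]         # almost always busy
```

Under these assumptions the mostly idle floating-point unit drops
from 10 to 2 energy units, while the busy integer unit only drops
from 10 to 9, which matches the observation that occasionally active
parts benefit most.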
NOISE
Nowadays the demand for mobile devices is getting higher and higher.
Everything around us is becoming wireless. These devices work by
sending and receiving radio signals. When a clocked circuit is used
in these types of devices, the noise generated by the high frequency
of the clock interferes with the working frequency of the device. To
avoid errors caused by these noise signals, designers are not free to
use the scale of integration they wish.
Asynchronous systems produce less radio interference than synchronous
machines. Because a clocked system uses a fixed rhythm, it broadcasts
a strong radio signal at its operating frequency and at the harmonics
of that frequency. Such signals can interfere with cellular phones,
televisions and aircraft navigation systems that operate at the same
frequencies. Asynchronous systems lack a fixed rhythm, so they spread
their radiated energy broadly across the radio spectrum, emitting
less at any one frequency.
OTHER BENEFITS
Yet another benefit of asynchronous design is that it can be used to
build bridges between clocked computers running at different speeds.
Many computing clusters, for instance, link fast PCs with slower
machines. These clusters can tackle complex problems by dividing the
computational tasks among the PCs. Such a system is inherently
asynchronous: different parts march to different beats. Moving data
controlled by one clock to the control of another clock requires
asynchronous bridges, because the data may be "out of sync" with the
receiving clock.
Finally, although asynchronous design can be challenging, it can also
be wonderfully flexible. Because the circuits of an asynchronous
system need not share a common rhythm, designers have more freedom in
choosing the system's parts and determining how they interact.
Moreover, replacing any part with a faster version will improve the
speed of the entire system. In contrast, increasing the speed of a
clocked system usually requires upgrading every part.
A final advantage of the clockless chip is its ability to provide
superior encryption. This is because there is no way for a hacker to
track regularly timed signals, which are given away by the clock in a
synchronous design. The hackers do not know what to look at. This has
significant potential as security becomes an increasing priority. It
becomes even more critical as computers all over the world become
more closely connected and share confidential material. Improved
encryption makes asynchronous circuits an obvious choice for smart
cards, the chip-endowed plastic cards beginning to be used for such
security-sensitive applications as storage of medical records,
electronic funds exchange and personal identification.
APPLICATIONS
Clockless design is inevitable in the future of chip design because
of its two major advantages, speed and power consumption, but where
will we first see these designs in use?
The first place we'll see clockless designs is in the lab. Many
prototypes will be necessary to create reliable designs.
Manufacturing techniques must also be improved so that the chips can
be mass-produced.
The second place we'll see these chips is in mobile electronics. This
is an ideal place to implement a clockless chip because of its
minimal power consumption. Also, low levels of electromagnetic noise
create less interference; less interference is critical in designs
with many components packed very tightly, as is the case with mobile
electronics.
The third place is in personal computers (PCs). Clockless designs
will arrive here last because of the competitive PC market. It is
essential in that market to create an efficient design that is
reasonably priced. A manufacturing cost increase of a couple of cents
per chip can cause an entire line of computers to fail because of the
large cost increase passed on to the customer. Therefore, the
manufacturing process must be improved to create a reasonably priced
chip.
A final place where asynchronous design may be used is encryption
devices. The reason is that there are no regularly timed signals for
hackers to look for. This becomes even more critical as computers all
over the world become more closely connected and share confidential
material. These chips will also be used in smart cards, as they
provide excellent security.
CLOCKLESS PRODUCTS IN THE MARKET
Now of course, the question is why aren't Intel, AMD and all the
other chipmakers scrambling to put together research teams to design
asynchronous prototypes? Well, actually, some have. In 1997, Intel
developed a prototype of a Pentium-style chip that ran 3 times as
fast as a clocked equivalent and used half the power, but the lack of
a perceived market convinced Intel to abandon the project. (Intel's
approach to asynchronous design seems to be slow integration:
asynchronous circuitry is notoriously easy to integrate into clocked
chips, and Intel has done so, including a few clockless elements in
the Pentium IV series.)
Sharp Corp. built an asynchronous chip for embedded applications in
1997, and Philips has consistently given a hefty budget to its
asynchronous design research department for many years.
In the past few years, others have jumped on the bandwagon, including
Motorola, which has joined forces with Theseus Logic (one of the
first companies founded on the principle of asynchronous design) to
produce a 32-bit processor and an 8-bit microcontroller, and MIPS,
which has licensed its 32-bit architecture to Fulcrum Microsystems, a
competitor to Theseus.
The latest company to step into this field is a Manchester-based
company called Self-Timed Solutions, which has developed clockless
chips for communication devices.
LIMITATIONS OF ASYNCHRONOUS CIRCUITS
Design difficulties.
Lack of good tools.
Testing difficulties.
DESIGN DIFFICULTIES
The primary drawback of asynchronous design is that it is hard.
Control logic must operate in fundamental mode, or a close variant
(like burst mode), and the synthesis formalisms are unfamiliar.
Architectural design has all the same challenges that concurrent
software has; researchers have yet to make concurrent software design
a turnkey affair, despite decades of attention.
And of course, there is the basic obstacle that asynchronous design
techniques have been out of favor since the 1980s, and are therefore
not typically taught in universities. If a microprocessor design
company today wanted to use asynchronous logic, it would have to
begin by training its engineering staff in the basics.
LACK OF GOOD TOOLS
The predominance of CAD tools oriented towards synchronous design is
another chicken-and-egg problem. However, most circuit simulation
techniques are independent of synchrony, and existing tools can be
adapted for asynchronous use. Also, previous academic design efforts
have produced the first sprinkling of a dedicated tool base.
TESTING DIFFICULTIES
Testing asynchronous circuits presents several new challenges. For
example, a common technique in synchronous testing is to slow or stop
the clock, to allow the logic functions to be observed at human
speeds. An asynchronous circuit has no clock to slow down. However,
gating the request and/or acknowledge signals is a possibility, and
it is at least conceivable that dropping Vdd to near the threshold
could provide a useful slowing effect (and possibly a more useful
one, since some of the slow-transition effects are preserved, unlike
with clock dividing).
Additionally, asynchronous circuits have timing requirements that are
more constrained than those of synchronous circuits. Whereas the
latter simply have to compute a valid result before the clock edge,
asynchronous circuits may have minimum delays too; the prototype
delay in a bounded-delay design is such a circuit.
Finally, related to the design difficulties, there is the testing of
the possible interleaving scenarios, as in concurrent software.
Asynchronous control circuitry must be designed to handle a variety
of contingencies regarding timing, and the testing harness must be
able to cause at least most of these possibilities.
CONCLUSION
Why isn't it popular?
Why doesn't industry currently use asynchronous designs (with a
handful of exceptions)? The main cause is risk. Asynchronous design
techniques are sometimes seen as unproven, despite a number of
academic (and industry) successes. Further, any asynchronous design
will incur additional cost in training engineers to use techniques
they didn't learn in school. Finally, tool development is likely seen
as an obstacle.
Moreover, at least up to now, industry has been getting by without
asynchronous design. So far, clocked designs have been feasible (if
occasionally expensive), and low power does not yet dominate demand.
Should it be used?
My conclusion is an emphatic yes! Clocks are getting faster, while
chips are getting bigger, both of which make clock distribution
harder. Chips are also becoming more heterogeneous, with functions
like memory and network interfaces being considered, all of which
complicates the global timing analysis necessary for a synchronous
design. Finally, we are entering an age when processors will be just
about everywhere, and this will require very low power designs. It's
just not practical to expect a clean, skew-free clock for every (say)
piece of clothing with a processing element.
But this can only happen if more focus, especially at the university
level, is given to asynchronous design. Most of today's designers
don't understand it well enough to use it, and may even regard it
with suspicion. It is certainly a challenge, but just as the software
community is moving towards more concurrency, the hardware community
must move to incorporate asynchronous logic.