chipcflow

8/4/2019 chipcflow

1/4

CHIPCFLOW - A DYNAMIC DATAFLOW MACHINE USING DYNAMIC

RECONFIGURABLE HARDWARE

Jorge L. Silva, Joelmir J. Lopes

Department of Computer Systems

University of Sao Paulo

Av. Tabalhador Saocarlense, 400,

Sao Carlos, SP, Brazil

emails: [email protected],

[email protected]

Valentin O. Roda, Kelton P. Costa

Department of Electrical Engeneering

University of Sao Paulo

Av. Trabalhador Saocarlense, 400,

Sao Carlos, SP, Brazil

emails: [email protected],

[email protected]

ABSTRACTIn order to convert High Level Language (HLL) into hard-

ware, a Control Dataflow Graph (CDFG) is a fundamen-

tal element to be used. Otherwise, Dataflow Architecture,

can be obtained directly from the CDFG. In the 1970s and

late 1980s, the Dataflow Model was the focus of attention

that provided parallelism in a natural form. In particular,

dynamic dataflow architecture can be generated to produce

a high level of parallelism. In this paper, the ChipCflow

project is described as a system to convert HLL into a dy-

namic dataflow graph to be executed in a dynamic recon-

figurable hardware, exploring the dynamic reconfiguration.

The ChipCflow consists of various parts: the compiler to

convert the C program into a dataflow graph; the operators

and its instances; the tagged-token; and the matching data.

Some results are presented in order to show a proof of con-

cept for the project.

1. INTRODUCTION

A Dataflow Architecture is an architecture where a natural

parallelism is present. This kind of architecture was first

researched in the 1970s and was discontinued in the 1990s

[1, 4, 7]. With the advance of the technology of microelec-

tronics, the Field Programable Gate Array (FPGA) is been

used, mainly because of its flexibility, the facilities to imple-ment complex systems and the intrinsic parallelism. Thus,

the dataflow architecture is a topic once more [2, 5], spe-

cially because of the reconfigurable architecture, witch is

totally based on FPGAs. On the other hand, a great effort is

been realized to covert high level language as a C language

into a hardware, in order to help engineering to project his

systems using a high level of abstraction as well as a digital

logic level. In particular, the ChipCflow project is a system

where a C programs is initially converted into a Dynamic

Sponsor acknowledgments for CNPq

Dataflow graph, followed by its execution in a Reconfig-urable Hardware. Its flow diagram is shown in Figure 1.

As can be clearly seen in Figure 1, the ChipCflow system

begins in a host machine where a C program is edited, to

be converted into a control dataflow graph (CDFG) gener-

ating a CDFG object program. The CDFG object program

is converted into a VHDL where modules of CDFG are ac-

cessed from a data base of VHDL modules. After gener-

ating the complete VHDL program, an EDA tool to con-

vert the VHDL program into a bitstream and to download

it to a FPGA is used. A dynamic reconfiguration, present

in some FPGAs that provide dynamic dataflow execution is

used, witch is the purpose of this project.

Fig. 1. The Flow Diagram for ChipCflow tool.

A dataflow graph is composed by a set of operators in-

terconnected by arcs. Various items of data can be in an arc

coming to an operator. When all arcs of an operator are with

data partners, it fires and send a result to another operator.

There are two models of dataflow architecture: a static and a

dynamic dataflow model. In the static model, just one item

of data can be in an arc waiting for its partner. A protocol

for a static dataflow model should be used to maintain just

one item of data in an arc. Consequentially the parallelism

is limited to one item of data per arc. In a dynamic model,

978-1-4244-3846-4/09/$25.00 2009 IEEE 213

8/4/2019 chipcflow

2/4

more than one item of data can be in an arc. The protocol

for dynamic dataflow model also should control each item

of data in the arc and their partners, however a tagged-token

is used to control the data partners. In this case, the paral-

lelism is present when all items of data can be in the arcs,

rightly limited for the size of the hardware to receive thesedata.

In the section 2 the basic structure for ChipCflow: the

pre-compiler, its operators, and some examples of graphs is

presented. In the section 3 the instances model; the tagged-

token format and the iterative constructors are described witch

allow several instance of an operator to be executed in the

dynamic model of dataflow using iterative constructor re-

spectively. In the section 4 the Matching data that identify

items of data partners is described. In the section 5 the im-

plementation of the operator and its instances are described.

Finally in the section 6 the conclusion and future works are

described.

2. THE C PRE-COMPILER FOR DATAFLOW

GRAPHS

After lexical analyzing, semantic analyzing and code opti-

mization, the code generation, in the C pre-compiler, pro-

vide a file with various packets of bits that represent the

dataflow graph. The format of the packet is described in

the Figure 2. As can be clearly seen in the Figure 2, the first

4-bits of the packet is been used to identify the operator; the

second, the thirty and the fourth 5-bits is been used to iden-

tify the three inputs (a,b and c) of the operator; finally the

sixth and the seventh 5-bits to identify the outputs (s and z)of the operator. This is a generic template for the operator

with three inputs and two outputs signal, however there are

operators which less than three inputs signals and just one

output signal. After compiling a C program, its correspon-

dent dataflow graph and the packets of bits for this dataflow

graph are generated. In the Figure 3, one single example of a

While C statement is described. After compiling the While

statement, a dataflow graph are generated. It is shown in the

Fig. 2. The bit stream for dataflow graph.

Figure 3 that each operator has a set of bits to identify its

function, as well as each arc has a set of bits to identify its

interconnections. In particular, in the left top of the figure

has an operator with the code 0001 and its arcs 00000

(value 0), 00001 (value i), 00010 (a control sig-

nal) and 00011 (the output signal), corresponding to three

input signals and one output signal respectively. The packet

of bits for this particular operator can be clearly seen in the

first packet of bits in the Figure 4. The xxxxx in the packet

of bits represent an arc with no connection signal. Thus, a

file with these packets of bits, is a binary representation for

a dataflow graph extracted from a while C statement in the

C pre-compiler.

Fig. 3. A C program converted into a bit stream.

Fig. 4. The bit stream generated from a C program.

After generating a file of binary representation for a dy-

namic dataflow graph, the C pre-compiler converts the file

into a VHDL code. A library with all the operators imple-

mented in VHDL is used for this conversion. The informa-

tion in the packet of bits is used to construct a VHDL code

component for the operator. The set of components are then

used to generate the final VHDL code to be executed in the

hardware.

214

8/4/2019 chipcflow

3/4

3. THE INSTANCES OF THE OPERATORS

As the ChipCflow is based on a Dynamic Dataflow Archi-

tecture, an arc can be viewed as a buffer of data and various

items of data can be in the buffer waiting for the data part-

ner in an operator. However, a new instance of the operatorwill be generated for each item of data coming through the

arc. It is for this reason that there is no buffer of data in

the arcs. Thus, various instances of an operator can be gen-

erated, waiting for the items from their data partners. To

implement the model of instances, a process to insert and

remove the sub-graph in an original graph was proposed.

3.1. The tagged-token

Tagged-tokens are used in the item of data to instantiate sev-

eral execution of the same operator. An example of the ap-

plication of a tagged-token is the execution of a simple loop

in C described in the following algorithm:

z=0;

for (i=0; i

8/4/2019 chipcflow

4/4

Fig. 6. The Instances with its Matching Circuits and the

common variable.

system to generate the instances; and the control to execute

these instances. Initially, an ADD operator only for proof-of-concept was implemented. A Statechart diagram of the

operator is described in Figure 7. The process begins by re-

ceiving the astr signal, informing the current operator that

it has an item of data to receive from the previous operator

and also to test the bitz signal that defines if there is no data

waiting to be sent to the next operator. Thus, if the condi-

tions are true, the next step is to observe the incoming data,

with its partner bufa[47-32]=bufb[47-32], according to the

tag specified in Figure 5. If there is a partner, the operator

executes the operation, sends the result zstrb, and acknowl-

edges all the items of data aack,back. When there are no

partners, the data is buffered, and no acknowledgements are

sent back. The implementation was carried out in an EDA

tool from Xilinx ISE 9.2i for Virtex 5 (360 MHz). The oper-

ator spent 6ns to deal with all the conditions for the protocol,

matching data, and the execution. For this implementation,

static dataflow architecture was tested, without dynamic re-

configuration, which will be the focus in the future.

6. CONCLUSION

Research to convert High Level Language (HLL) into hard-

ware has put forward various possibilities mainly with the

flexibility and capacity of the reconfigurable architectures.A Control Dataflow Graph (CDFG) is a fundamental ele-

ment in this process. Otherwise, a Dataflow Architecture,

which was the focus in the 1980s, can be obtained directly

from the CDFG. In particular, dynamic dataflow architec-

ture can be generated in order to produce a high level of

parallelism. In this paper, the ChipCflow project was de-

scribed as a system to convert HLL into a dynamic dataflow

graph to be executed in dynamic reconfigurable hardware,

exploring the dynamic reconfiguration. The operator, which

is the main element in the dataflow graph, was implemented,

and spent 6ns to execute all the process. The simulation re-

Fig. 7. The Statechart of an Operator.

sults demonstrate the proof of concept for the operator. The

next steps of the ChipCflow project are to implement the

complete model of instances and generate an analysis with

benchmarks to verify the impact of this approach.

7. REFERENCES

[1] Arvind. Dataflow: Passing the token. ISCA Keynote, 2005.[2] A. Capelli. A dataflow control unit for c-to-configurable

pipelines compilation flow. IEEE Sumposium on Field-

Programmable Custom Computing Machines FCCM04,

2004.[3] A. DEHON. Reconfigurable architecture for general-purpose

computing. Ph.D. thesis, Massachusetts Institute of Technol-

ogy, 1996.[4] J. B. Dennis. A preliminary architecture for a basic dataflow

processor. Proceedings of the 2nd Annual Symposium on Com-

puter Architecture, 1975.[5] S. Swanson. Wavescalar. 36th Annual International Sympo-

sium on Microarchitecture, 2003.[6] S. e. a. Swanson. Wavescalar. 36th Annual International Sym-

posium on Microarchitecture, 2003.[7] A. H. Veen. Dataflow machine architecture. ACM Computing

Surveys, 18(4):365396, 1986.

216

chipcflow

Documents

Transcript of chipcflow