RC25508 (WAT1411-052) November 19, 2014 Computer Science Research Division Almaden – Austin – Beijing – Cambridge – Dublin - Haifa – India – Melbourne - T.J. Watson – Tokyo - Zurich IBM Research Report Graph Programming Interface: Rationale and Specification K. Ekanadham, Bill Horn, Joefon Jann, Manoj Kumar, José Moreira, Pratap Pattnaik, Mauricio Serrano, Gabi Tanase, Hao Yu IBM Research Division Thomas J. Watson Research Center P.O. Box 218 Yorktown Heights, NY 10598 USA


Graph Programming Interface:

Rationale and Specification

K. Ekanadham, Bill Horn, Joefon Jann, Manoj Kumar, José Moreira, Pratap Pattnaik, Mauricio Serrano, Gabi Tanase, Hao Yu

Generated on 2014/11/12 at 06:28:15 EDT

Abstract

Graph Programming Interface (GPI) is an interface for writing graph algorithms using a linear algebra formulation. The interface addresses the requirements of supporting both portability and high performance across a wide spectrum of computing platforms. The application developer composes his or her application using a collection of objects and methods defined by GPI. This application is then linked with a run-time library that implements the objects and methods efficiently on the target platform. This run-time library can be optimized to use the memory hierarchy characteristics and parallelism features of that platform. Concrete instances of GPI must follow a specific binding of the interface to a programming language. We first present a binding for the C programming language.


Contents

1 Introduction
2 Motivation
    2.1 The limitations of current graph analytics approaches
    2.2 The value of a graph programming interface
    2.3 The linear algebra approach
3 GPI scope: a usage scenario
4 The interface
    4.1 Objects
        4.1.1 GPI_Scalar
        4.1.2 GPI_Vector
        4.1.3 GPI_Matrix
        4.1.4 GPI_Function
    4.2 Methods
        4.2.1 Vector and matrix building methods
        4.2.2 Accessor methods
        4.2.3 Base methods
        4.2.4 Derived methods
5 Conclusions
A GPI C bindings
B Methods summary
C Example implementation of breadth first search
D Example implementation of Brandes's betweenness centrality
E Example implementation of Prim's minimum spanning tree
F Example implementation of strongly connected components
G Graph generation
    G.1 The preferential attachment algorithm
    G.2 Implementation of preferential attachment algorithm
    G.3 The RMat algorithm


1 Introduction

Graph Programming Interface (GPI) is an interface for writing graph algorithms using a linear algebra formulation [1]. The algorithms are independent of both the programming language and the implementation of data structures defined in the interface. GPI defines a set of objects and the semantics of methods that operate on those objects. In order to implement the objects and methods of GPI in a specific language, one needs to specify GPI binding rules specific to that language. In the appendix, we present the bindings for the C programming language. In the future, bindings for other languages can be suitably defined.

We start this report with a motivation for defining GPI: the desire to simultaneously achieve high performance and high portability for graph analytics applications. We proceed with an example of a usage scenario for GPI-based code in a larger graph analytics context. We then present the definition of the GPI interface in a programming language-independent form. We conclude with a summary of observations about GPI. The appendix provides a binding of GPI for the C programming language (C11 standard), various examples of graph algorithms coded in C using GPI, and an overview of random graph generation techniques that are useful for the study of graph analytics algorithms.

2 Motivation

Graph analytics is an important component of modern business computing. Much of the big data information, a subject commanding great attention these days, is graph structured. Graph analytics techniques require these large graphs to be sub-graphed (analogous to select and project operations in relational databases) and be analyzed for various properties.

2.1 The limitations of current graph analytics approaches

Since large graph analysis requires sizable computational resources, application developers are often faced with the task of selecting the hardware system to execute the applications. The potential systems one considers are large SMPs (i.e., systems based on multi-core/multi-threaded general purpose processors), distributed memory systems, and accelerated systems where the CPUs are augmented with GPUs and/or FPGAs. Traditionally, to achieve good performance, each of these systems requires the basic kernels of the graph algorithm to be specialized for that system.

The application programmer is then faced with the challenge of achieving high performance while maintaining code portability, either because systems evolve over time or because the user wishes to move from one type of system to another. Attaining optimum performance requires detailed knowledge of the design of the processors and the systems they comprise, including their cache and memory hierarchy. This knowledge is needed to adapt the analytics algorithms to the underlying system, in particular to take advantage of the parallelism or concurrency at the chip, node or system level. Exploiting concurrency and parallelism so far has been and still is a skilled and non-automated task.


The performance and portability challenge is not easily addressable by compilers because compilers do not have the ability to examine large instruction windows. Furthermore, in the conventional representation of graph algorithms the control flow is often data dependent, making the job of the compiler even more difficult. In particular, the number of vertices and edges in the graph being analyzed, the sparsity of the graph, and whether that sparsity has a regular pattern are not known to the compiler.

2.2 The value of a graph programming interface

Our approach is to define a programming interface that can be used to build machine-independent graph algorithms. This programming interface can be implemented through a platform-optimized run-time library that is linked with the application code. Such an approach bypasses the compiler difficulties mentioned above and is designed to provide the following values and capabilities:

1. Unburden the application developer from performance concerns at the processor level.

2. Unburden the application developer from the task of adapting his or her code to the sizes of the various caches and other characteristics of the memory hierarchy.

3. Unburden the application developer from the headaches of exploiting parallelism, factoring in the design of the memory subsystem as well as the nature of compute nodes.

4. As a consequence of the above, address the porting issue while delivering high performance.

The interface described in this paper creates the above values by providing a set of well defined objects and methods on those objects that support a linear algebra formulation of graph algorithms. That is, the algorithms are expressed as a combination of operations on vectors and matrices, in particular operations on the adjacency matrix of a graph. By optimizing the methods of the interface for a platform, we can ensure that the algorithms so described will run well on that platform. Furthermore, the algorithms are portable (and performance portable) across a spectrum of platforms that efficiently implement these methods.

Ideally, a run-time library implementing the GPI interface will use two run-time inputs to produce an optimized execution of the graph algorithm, as shown in Figure 1. The first input is information on the characteristics of the graph, including features such as the size of the graph and nature of its sparsity. The second input is information on attributes of the execution platform, such as the size of the main memory and cache hierarchies, SMT levels of cores and performance projections for such levels.

Our goal is to define GPI so that it can be implemented efficiently across a wide spectrum of platforms, and thus deliver on the performance with portability value proposition. We also want to develop initial reference implementations that demonstrate that value across selected platforms. Over time, the specification of GPI will evolve as new systems appear and new algorithms are covered. The implementations will also improve as more effort and experience are applied to the problem.


[Figure 1 (diagram not reproduced). Boxes shown: an analytics application from the developer, written as graph analytics algorithms using the linear algebra operators; attributes of graphs and vectors (user supplied and system discovered, e.g. sparsity structure); system capabilities (sockets, cores, accelerators; SMP / scale-out; cache and memory hierarchy); deployment configuration (e.g. number of software/hardware threads, page size); application configuration (e.g. capacity of leaf nodes in a quad tree); and an optimized library of primitive operations holding multiple optimized implementations of each graph operator (C/C++), each optimal for some set of graph attributes and system capabilities, from which the run-time selects.]

Figure 1: High level view of a graph analytics run-time.

2.3 The linear algebra approach

We chose to define GPI according to a linear algebra formulation of graph algorithms [1]. This formulation centers around operations on the adjacency matrix $A$ of a graph. For a graph of $n$ vertices, this is an $n \times n$ matrix where the elements represent the edges of the graph. For an unweighted graph, $A$ is a matrix of Boolean values: $A_{ij} = \text{true}$ means that there is an edge from vertex $i$ to vertex $j$. Correspondingly, $A_{ij} = \text{false}$ means no such edge. For weighted graphs, $A_{ij}$ takes numerical values. $A_{ij} = 0$ means no edge from vertex $i$ to vertex $j$. Correspondingly, a positive value means an edge of that weight from vertex $i$ to vertex $j$. For undirected graphs, the adjacency matrix is always symmetric.
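As an illustration of the definition above, the following sketch builds a dense Boolean adjacency matrix for a small undirected graph and checks that it is symmetric. It is a plain C model, not GPI code: the fixed size `N` and the helpers `add_undirected_edge` and `is_symmetric` are hypothetical names chosen for this example, and a real implementation would normally use a sparse representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N 4   /* number of vertices; an arbitrary value for illustration */

/* Hypothetical helper: record the undirected edge {i, j} in both directions. */
void add_undirected_edge(bool A[N][N], size_t i, size_t j)
{
    assert(i < N && j < N);
    A[i][j] = true;
    A[j][i] = true;
}

/* Returns true when A[i][j] == A[j][i] for every pair (i, j). */
bool is_symmetric(bool A[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = i + 1; j < N; j++)
            if (A[i][j] != A[j][i])
                return false;
    return true;
}
```

Setting a single entry such as `A[3][0]` without its mirror entry describes a directed edge, and the symmetry check then fails.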

In the linear algebra formulation, it is also common to represent sets of vertices by vectors. If there are $n$ vertices in the graph, then a vector $v$ of size $n$ is used, such that $v_i = 0$ (or false) indicates that vertex $i$ does not belong to the set. Correspondingly, $v_i > 0$ (or true) indicates that vertex $i$ does belong to the set. Vectors can also be used to represent vertex labels, number of paths to a vertex, reachability sets and other quantities.
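To show how a vertex set stored as a vector combines with the adjacency matrix, the sketch below computes the set of vertices reachable in one hop from a given set, which is a Boolean vector-matrix product. It is a plain C model under stated assumptions: dense storage, a fixed size `N`, and a hypothetical function name `expand_set`; none of these come from the GPI specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N 4   /* number of vertices; an arbitrary value for illustration */

/* next[j] = OR over i of (set[i] AND A[i][j]).
   A[i][j] == true means there is an edge from vertex i to vertex j. */
void expand_set(bool next[N], bool A[N][N], const bool set[N])
{
    assert(next != set);   /* the output must not alias the input */
    for (size_t j = 0; j < N; j++) {
        next[j] = false;
        for (size_t i = 0; i < N; i++)
            if (set[i] && A[i][j])
                next[j] = true;
    }
}
```

Applying the function repeatedly, starting from a set that holds a single vertex, visits the graph one level at a time.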

The linear algebra formulation for graph algorithms has several value propositions. First, by reducing the computation to linear algebra operations, we naturally get a framework to exploit parallelism and memory hierarchies of various computing platforms. The formulation also delivers the sought-after portability, since a library of linear algebra operations can be optimized for a specific platform while preserving a standard interface. Productivity is also enhanced through the definition of a well formalized set of building blocks that can be used to encode general algorithms. Furthermore, these building blocks are already familiar to people from the more traditional scientific and technical computing domains. Finally, the linear algebra formulation supports a perturbative approach to computation. If a computation $C$ is performed on an adjacency matrix $A$ producing a result $y = C(A)$, then the effect on the result of a small change $\delta A$ to the adjacency matrix can often be approximated (if not computed exactly) as $\delta y = C'(\delta A)$, where $C'$ is related to $C$ (sometimes identical). As a simple example, consider the matrix-vector product: $(A + \delta A)x = Ax + \delta A\,x$.
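The matrix-vector example can be checked numerically. The sketch below, a plain C model with hypothetical names (`matvec`, `apply_delta`) and integer weights chosen to keep the arithmetic exact, updates an existing result $y = Ax$ with the contribution of a change $\delta A$ instead of recomputing the full product.

```c
#include <assert.h>
#include <stddef.h>

#define N 3   /* matrix dimension; an arbitrary value for illustration */

/* y = A x for a dense N x N matrix. */
void matvec(long y[N], long A[N][N], const long x[N])
{
    assert(y != x);   /* the output must not alias the input */
    for (size_t i = 0; i < N; i++) {
        y[i] = 0;
        for (size_t j = 0; j < N; j++)
            y[i] += A[i][j] * x[j];
    }
}

/* y <- y + dA x. If y == A x on entry, then y == (A + dA) x on exit. */
void apply_delta(long y[N], long dA[N][N], const long x[N])
{
    long dy[N];
    matvec(dy, dA, x);
    for (size_t i = 0; i < N; i++)
        y[i] += dy[i];
}
```

When `dA` has only a few non-zero entries, a sparse implementation of the update would touch far fewer elements than a full recomputation; the dense loops here are only meant to show the identity.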

3 GPI scope: a usage scenario

Applications such as graph database, graph analysis and social network analysis software handle a large number of entities and a variety of relationships among those entities. The graphs underlying these applications typically carry a large number of attributes on the vertices and edges. These graphs are persistently stored in files or database management systems. Examples include Neo4J, Accumulo, and HBASE. The graph data may also be persisted using formats such as CSV, GraphML, Graph eXchange Language (GXL), and Resource Description Framework (RDF).

The design point for GPI is focused on the structural aspects of the graph representations, more specifically on the adjacency matrix of the graph. Figure 2 illustrates a typical usage scenario where GPI will be used in today's graph analytics applications. Starting with the persistent graph data, applications often perform filtering of a graph to create a subgraph. From that subgraph, the application extracts its structure, using various extractor functions, in the form of its adjacency matrix. That structure is then manipulated using GPI objects and methods. The extractor modules provide functionality to manage and populate the adjacency matrix, leaving the high level graph data parsing and the extraction of other graph data to other components.

To exercise the separation of GPI interface and implementation, we keep the definition of the adjacency matrix, as well as other matrices and vectors, encapsulated from the users. These matrices and vectors can only be manipulated by GPI methods. That way, the data representation for objects and the algorithms for methods can be chosen so as to optimize for a particular platform and data set combination. In the future, GPI will define a service interface to provide utilities for graph dumping, debugging, and monitoring.

4 The interface

This section of the report defines the interface that any GPI implementation must conform to. The interface specification consists of two parts. First, we introduce the objects defined by GPI. GPI defines scalar, vector and matrix objects as containers of data. GPI also defines function objects that perform transformations on the data in these containers. Second, GPI defines a set of methods to operate on its objects, implementing linear algebra operations on matrices and vectors. GPI objects are fully opaque, in that they can only be operated on by the GPI methods. We proceed to explain each of these parts of GPI.


[Figure 2 (diagram not reproduced). Work flow shown: Graph Data (normally stored in a graph database, e.g. Accumulo, Neo4j, Google Bigtables, HBASE) passes through Filter Methods (like DB SELECT methods) to give Subgraph Data (also stored in a graph database, mostly the same as the original graph database). Extractors turn the subgraph into Matrices & Vectors behind the GPI interface (a graph programming interface, including a service interface for debugging, monitoring, recovery etc.). A Structural Analyzer for Subgraph extracts structural features of the graph, such as importance of nodes (e.g. BC, Betweenness-Centrality), least delay service path (e.g. MST, Min Spanning Tree), and ranking of nodes (e.g. Eigen-Centrality).]

Figure 2: Usage scenario for GPI in graph analytics work flow.

4.1 Objects

A summary of GPI objects is shown in Figure 3. GPI_Scalar, GPI_Vector and GPI_Matrix are all containers of homogeneous elements. Each element is one data item, and the type of the container is the type of its elements. Supported element types are shown in Table 1. We use GPI_type to denote one of the element data types in the table. GPI_Function is the fourth type of object in GPI. All these objects are discussed in more detail below.

When a value of type T1 is assigned to an element (from a scalar, vector or matrix) of type T2, casting has to occur. Casting is the transformation of a value of one type to a value of another type. Some casting operations are lossless, in the sense that no information is lost. Other cast operations are lossy, because information in the original type cannot be preserved in the new type. For example, casting from GPI_int32 to GPI_fp64 is always lossless. Every 32-bit signed integer has an exact representation as a 64-bit floating-point number. The reverse casting is lossy, as most 64-bit floating-point numbers do not have an exact representation as a 32-bit integer. Casting is performed automatically by GPI inside its methods, whenever a computed value is of a type different than the container it is supposed to be stored in. Application programmer code can also perform casting explicitly. This can be done by using one of the copy methods described below.
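The lossless and lossy directions can be demonstrated with ordinary C conversions. The sketch below assumes that the 32-bit integer and 64-bit floating-point element types correspond to C's `int32_t` and `double`; that mapping, and the helper names, are assumptions made for this example rather than part of the interface definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when converting v to double and back yields v again.
   This holds for every int32_t, since a double has a 53-bit significand. */
bool int32_roundtrips(int32_t v)
{
    double d = (double)v;
    return (int32_t)d == v;
}

/* Explicit lossy cast from double to int32_t; the fraction is discarded. */
int32_t cast_fp64_to_int32(double d)
{
    /* Converting an out-of-range value (or NaN) is undefined behavior in C. */
    assert(d >= -2147483648.0 && d <= 2147483647.0);
    return (int32_t)d;   /* truncates toward zero */
}
```

For instance, 2.75 becomes 2, and converting that 2 back to floating point does not recover 2.75.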


Figure 3: Objects in GPI.

4.1.1 GPI_Scalar

GPI_Scalar is a zero-dimensional, one-element container. The element can be of any of the types in Table 1. GPI defines a number of scalars, shown in Table 2. Other scalars can be built by casting elements of one of the types in Table 1 to GPI_Scalar. (The casting mechanism is specific to each programming language binding. We discuss casting for the C programming language in the appendix.)

Table 1: Data types in GPI.

| GPI data type | Description |
|---|---|
| GPI_bool | Boolean (true or false) |
| GPI_int32 | 32-bit signed integer |
| GPI_uint32 | 32-bit unsigned integer |
| GPI_fp32 | 32-bit IEEE floating-point number |
| GPI_fp64 | 64-bit IEEE floating-point number |
| GPI_index | A matrix (row or column) or vector index, a vertex index |
| GPI_size | The size (number of elements) of a collection |
| GPI_addr | An address (pointer) |


Table 2: Built-in GPI scalars.

| Built-in scalar | Value |
|---|---|
| GPI_zero | 0 |
| GPI_one | 1 |
| GPI_minusone | -1 |
| GPI_null | nothing |

4.1.2 GPI_Vector

A vector is a one-dimensional homogeneous container defined by the type T of its elements and the number n of elements in the container (the size of the vector). For a vector v of size n, v[i] denotes the i-th element of that vector, i = 0, …, n − 1.

Vectors in GPI are represented by variables of type GPI_Vector. Let v be a vector of type T. We say that a GPI_Vector variable is associated with vector v if and only if element i of the variable is identical to v[i], for all i. Reading or writing an element of the variable has the effect of reading or writing the corresponding associated element v[i]. More than one GPI_Vector can be associated with the same vector.

4.1.3 GPI_Matrix

A matrix is a two-dimensional homogeneous container defined by the type T of its elements, the number m of rows and the number n of columns in the container (the shape of the matrix). For a matrix A of shape m × n, A[i, j] denotes the element at row i and column j of the matrix, i = 0, …, m − 1, j = 0, …, n − 1. A[i, :] denotes the i-th row of the matrix (an n-element vector), and A[:, j] denotes the j-th column of the matrix (an m-element vector).

Matrices in GPI are represented by variables of type GPI_Matrix. Let A be a matrix of type T. We say that a GPI_Matrix variable is associated with matrix A if and only if element (i, j) of the variable is identical to A[i, j], for all i, j. Reading or writing an element of the variable has the effect of reading or writing the corresponding associated element A[i, j]. More than one GPI_Matrix can be associated with the same matrix. Furthermore, a GPI_Vector can be associated with a row or column of a matrix.

4.1.4 GPI_Function

GPI supports both unary (one input argument) and binary (two input arguments) functions. Furthermore, functions can be either scalar (taking scalar arguments) or vector (taking vector arguments) functions. A unary scalar function f must have the signature f(GPI_Scalar) → GPI_Scalar. A binary scalar function g must have the signature g(GPI_Scalar × GPI_Scalar) → GPI_Scalar. Correspondingly, a unary vector function f must have the signature f(GPI_Vector) → GPI_Vector, whereas a binary vector function g must have the signature g(GPI_Vector × GPI_Vector) → GPI_Vector. The type of a function is defined by the types of its input and output arguments. GPI functions can be either intrinsic (built-in) to the interface definition or user defined. The set of GPI scalar built-in functions is listed in Table 3. When used as vector functions, these built-ins apply element-wise.


Function GPI_LOC is a bit of a special case. It is used where a binary function is expected (e.g., reduction operations on vectors, see GPI_reduce below). It returns the location in the vector of the first occurrence of the starting value.

Table 3: Built-in functions in GPI.

(a) binary functions

| Function | Meaning |
|---|---|
| GPI_MAX | maximum |
| GPI_MIN | minimum |
| GPI_SUM | sum |
| GPI_PROD | product |
| GPI_LAND | logical and |
| GPI_BAND | bit-wise and |
| GPI_LOR | logical or |
| GPI_BOR | bit-wise or |
| GPI_LXOR | logical xor |
| GPI_BXOR | bit-wise xor |
| GPI_EQ | equal |
| GPI_NE | not equal |
| GPI_GT | greater than |
| GPI_GE | greater than or equal |
| GPI_LT | less than |
| GPI_LE | less than or equal |
| GPI_LOC | location of a value |

(b) unary functions

| Function | Meaning |
|---|---|
| GPI_NEG | negation |
| GPI_LNOT | logical not |
| GPI_BNOT | bit-wise not |
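The text describes the location function only briefly, so the following is one possible reading rather than a definitive one: when used in a reduction, it yields the index of the first element equal to the starting value. The sketch is a plain C model; the name `loc_first` and the convention of returning the vector size when the value does not occur are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first element of u equal to x.
   Assumption: when x does not occur, n is returned as a "not found" marker. */
size_t loc_first(const int32_t *u, size_t n, int32_t x)
{
    assert(u != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (u[i] == x)
            return i;
    return n;
}
```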

4.2 Methods

GPI defines four groups of methods. Vector and matrix building methods (also called entity methods) are used to create and destroy vectors and matrices. Accessor methods are used to manipulate the internals of vectors and matrices. Base methods are the building blocks of graph algorithms in linear algebra formulation. Finally, derived methods can be constructed from the base methods but can be implemented more efficiently directly and are thus part of the GPI interface.

Several methods accept an optional mask parameter. The mask is a vector of elements that controls conditional execution inside the method. Element values are interpreted as either false (a value of 0) or true (any value not equal to 0). Calling a method without a mask parameter is equivalent to passing a mask with all elements true.

4.2.1 Vector and matrix building methods

GPI_Vector_new(v,type,size)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| v | GPI_Vector | OUT | a new vector |
| type | GPI_type | IN | type of the vector elements |
| size | GPI_size | IN | number of elements in vector |

Creates a new vector with size elements of type type and associates it with the GPI_Vector variable v. The value of each element is the corresponding zero value for that type.

GPI_Vector_copy(v,u[,m])

| Parameter | Type | Direction | Description |
|---|---|---|---|
| v | GPI_Vector | OUT | output vector |
| u | GPI_Vector | IN | input vector |
| m | GPI_Vector | IN | optional mask |

Let v be a vector of type T1 and size n and let u be a vector of type T2 and size n. This method copies the value of each element of u into the corresponding element of v, casting the value as necessary (v[i] ← (T1)u[i], ∀i | m[i] = true).
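The combination of masking and casting in the vector copy can be modeled with plain C arrays. The sketch below fixes T1 to `int32_t` and T2 to `double` and uses a hypothetical function name; it also assumes that elements whose mask entry is false are left unchanged, which the definition above does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* v[i] <- (int32_t)u[i] for every i whose mask entry is non-zero.
   A NULL mask stands for the omitted optional parameter (all true).
   Assumption: unselected elements of v keep their previous value. */
void masked_copy_f64_to_i32(int32_t *v, const double *u,
                            const int32_t *mask, size_t n)
{
    assert(v != NULL && u != NULL);
    for (size_t i = 0; i < n; i++)
        if (mask == NULL || mask[i] != 0)
            v[i] = (int32_t)u[i];   /* lossy cast: fraction is dropped */
}
```

Any non-zero mask value selects the element, matching the interpretation of masks given at the start of this section.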

GPI_Vector_delete(v)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| v | GPI_Vector | IN | a vector |

Destroys the vector associated with v. The GPI_Vector_delete method must be called at least once for each vector created with GPI_Vector_new. The interval between the creation and destruction of a vector is its lifetime. Once a vector is destroyed, it should not be subsequently referenced by any variable that was associated with it.
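The creation and destruction rules can be pictured with a small stand-in type. The sketch below is not the GPI C binding (which is given in the appendix): the struct `i32_vector`, its functions, and the integer return code are hypothetical, and the element type is fixed to 32-bit integers to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opaque vector of 32-bit integers. */
typedef struct {
    int32_t *data;
    size_t   size;
} i32_vector;

/* Starts the lifetime of a vector with `size` elements, all zero.
   Returns 0 on success and -1 when memory cannot be allocated. */
int i32_vector_new(i32_vector *v, size_t size)
{
    assert(v != NULL);
    v->data = calloc(size ? size : 1, sizeof *v->data);
    if (v->data == NULL)
        return -1;
    v->size = size;
    return 0;
}

/* Ends the lifetime of the vector; its elements must not be used afterwards. */
void i32_vector_delete(i32_vector *v)
{
    assert(v != NULL);
    free(v->data);
    v->data = NULL;
    v->size = 0;
}
```

Resetting the handle after `free` makes an accidental use after destruction easier to detect, and makes a repeated delete on the same handle harmless in this model.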

GPI_Matrix_new(A,type,rows,cols)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| A | GPI_Matrix | OUT | new matrix |
| type | GPI_type | IN | type of the matrix elements |
| rows | GPI_size | IN | number of rows in matrix |
| cols | GPI_size | IN | number of columns in matrix |

Creates a new matrix with elements of type type and shape rows × cols, and associates it with the GPI_Matrix variable A. The value of each element is the corresponding zero value for that type.

GPI_Matrix_copy(B,A)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| B | GPI_Matrix | OUT | output matrix |
| A | GPI_Matrix | IN | input matrix |

Let B be a matrix of type T1 and shape m × n. Let A be a matrix of type T2 and shape m × n. This method copies the value of each element of A into the corresponding element of B, performing casting as necessary (B[i, j] ← (T1)A[i, j], ∀i, j).

GPI_Matrix_copy(B,A,dim[,m])

| Parameter | Type | Direction | Description |
|---|---|---|---|
| B | GPI_Matrix | OUT | output matrix |
| A | GPI_Matrix | IN | input matrix |
| dim | GPI_index | IN | dimension |
| m | GPI_Vector | IN | optional mask |

Let B be a matrix of type T1 and shape m × n. Let A be a matrix of type T2 and shape m × n. If dim = 0, this method copies each row of A into the corresponding row of B (B[i, :] ← A[i, :], ∀i | m[i] = true). If dim = 1, this method copies each column of A into the corresponding column of B (B[:, j] ← A[:, j], ∀j | m[j] = true). In both conditions, m[·] refers to an element of the mask parameter m, not to the row count.
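The role of the dimension argument in the masked matrix copy can be sketched as follows. This is a plain C model with assumptions labeled: dense row-major storage, a single element type (so no casting), a hypothetical function name, and unselected rows or columns left unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Copies rows (dim == 0) or columns (dim == 1) of A into B.
   Both matrices are rows x cols, stored row-major.
   mask has one entry per row when dim == 0 and one per column when dim == 1;
   a NULL mask stands for the omitted optional parameter (all true). */
void matrix_copy_dim(double *B, const double *A, size_t rows, size_t cols,
                     int dim, const int *mask)
{
    assert(dim == 0 || dim == 1);
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++) {
            size_t k = (dim == 0) ? i : j;   /* index tested against the mask */
            if (mask == NULL || mask[k] != 0)
                B[i * cols + j] = A[i * cols + j];
        }
}
```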

GPI_Matrix_delete(A)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| A | GPI_Matrix | IN | matrix |

Destroys the matrix associated with A. The GPI_Matrix_delete method must be called at least once for each matrix created with GPI_Matrix_new. The interval between the creation and destruction of a matrix is its lifetime.

4.2.2 Accessor methods

GPI_Matrix_nrows(nrows,A)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| nrows | GPI_size | OUT | number of rows in matrix |
| A | GPI_Matrix | IN | matrix |

Returns in nrows the number of rows of the matrix associated with A.

GPI_Matrix_ncols(ncols,A)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| ncols | GPI_size | OUT | number of columns in matrix |
| A | GPI_Matrix | IN | matrix |

Returns in ncols the number of columns of the matrix associated with A.

GPI_Matrix_row(row,A,index)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| row | GPI_Vector | OUT | vector |
| A | GPI_Matrix | IN | matrix |
| index | GPI_index | IN | the index of a row of the matrix |

Associates row with the vector A[index, :].

GPI_Matrix_col(col,A,index)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| col | GPI_Vector | OUT | vector |
| A | GPI_Matrix | IN | matrix |
| index | GPI_index | IN | the index of a column of the matrix |

Associates col with the vector A[:, index].
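Because the row and column accessors associate a vector with part of an existing matrix rather than copying it, writes through the vector are visible in the matrix. One way to picture this is a strided view over dense row-major storage. The sketch below is only a model of that idea; the `vec_view` struct and its functions are hypothetical and say nothing about how a GPI implementation stores its data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strided view: element k lives at base[k * stride]. */
typedef struct {
    double *base;
    size_t  size;
    size_t  stride;
} vec_view;

/* View of row `index` of a rows x cols row-major matrix. */
vec_view row_view(double *A, size_t rows, size_t cols, size_t index)
{
    assert(index < rows);
    vec_view v = { A + index * cols, cols, 1 };
    return v;
}

/* View of column `index` of a rows x cols row-major matrix. */
vec_view col_view(double *A, size_t rows, size_t cols, size_t index)
{
    assert(index < cols);
    vec_view v = { A + index, rows, cols };
    return v;
}

/* Address of element k of the view. */
double *view_at(vec_view v, size_t k)
{
    assert(k < v.size);
    return v.base + k * v.stride;
}
```

A write through a column view changes the matrix element, and a row view that crosses the same element observes the new value.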

GPI_Matrix_getElement(element,A,row,col)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| element | GPI_Scalar | OUT | element |
| A | GPI_Matrix | IN | matrix |
| row | GPI_index | IN | the index of a row of the matrix |
| col | GPI_index | IN | the index of a column of the matrix |

Returns in element the value of A[row, col].

GPI_Matrix_setElement(A,row,col,element)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| A | GPI_Matrix | IN | matrix |
| row | GPI_index | IN | the index of a row of the matrix |
| col | GPI_index | IN | the index of a column of the matrix |
| element | GPI_Scalar | IN | element |

Sets the value of A[row, col] to the value of element, casting if necessary.

GPI_Vector_size(size,v)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| size | GPI_size | OUT | number of elements |
| v | GPI_Vector | IN | vector |

Returns in size the number of elements of v.

GPI_Vector_getElement(element,v,index)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| element | GPI_Scalar | OUT | element |
| v | GPI_Vector | IN | vector |
| index | GPI_index | IN | the index of an element of the vector |

Returns in element the value of v[index].

GPI_Vector_setElement(v,index,element)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| v | GPI_Vector | IN | vector |
| index | GPI_index | IN | the index of an element of the vector |
| element | GPI_Scalar | IN | element |

Sets the value of v[index] to the value of element, casting if necessary.

4.2.3 Base methods

Several base methods in GPI are polymorphic. They apply to different combinations of input types, typically vectors and matrices. In the following descriptions, methods are grouped by the particular kind of operation they perform. Within each group, we list the different forms of the methods.

4.2.3.1 Replication

GPI_replicate(v,x[,m])

| Parameter | Type | Direction | Description |
|---|---|---|---|
| v | GPI_Vector | OUT | output vector |
| x | GPI_Scalar | IN | input scalar |
| m | GPI_Vector | IN | optional mask |

Let x be a scalar of type T and v be a vector of type T and size n. This method sets each element of v to the value of x (v[i] ← x, ∀i | m[i] = true).

GPI_replicate(A,x)

| Parameter | Type | Direction | Description |
|---|---|---|---|
| A | GPI_Matrix | OUT | output matrix |
| x | GPI_Scalar | IN | input scalar |

Let x be a scalar of type T and A be a matrix of type T and shape m × n. This method sets each element of A to the value of x (A[i, j] ← x, ∀i, j).

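GPI_replicate(A,x,dim[,m])

| Parameter | Type | Direction | Description |
|---|---|---|---|
| A | GPI_Matrix | OUT | output matrix |
| x | GPI_Vector | IN | input vector |
| dim | GPI_index | IN | replicating dimension |
| m | GPI_Vector | IN | optional mask |

Let x be a vector of type T and size p and A be a matrix of type T and shape m × n. If dim = 0, this method sets each row of A to the value of x (A[i, :] ← x, ∀i | m[i] = true), which requires p = n. If dim = 1, this method sets each column of A to the value of x (A[:, j] ← x, ∀j | m[j] = true), which requires p = m. In both conditions, m[·] refers to an element of the mask parameter m, not to the row count.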
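The size requirement on the replicated vector depends on the dimension, which the following plain C model makes explicit. Dense row-major storage, the function name, and the treatment of unselected rows or columns as unchanged are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Sets each selected row (dim == 0) or column (dim == 1) of the
   rows x cols row-major matrix A to the vector x of size p.
   A NULL mask stands for the omitted optional parameter (all true). */
void replicate_dim(double *A, size_t rows, size_t cols,
                   const double *x, size_t p, int dim, const int *mask)
{
    assert(dim == 0 || dim == 1);
    assert(p == (dim == 0 ? cols : rows));   /* p = n for rows, p = m for columns */
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++) {
            size_t k = (dim == 0) ? i : j;   /* index tested against the mask */
            if (mask == NULL || mask[k] != 0)
                A[i * cols + j] = (dim == 0) ? x[j] : x[i];
        }
}
```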

4.2.3.2 Index generation

GPI_indices(v[,m])

| Parameter | Type | Direction | Description |
|---|---|---|---|
| v | GPI_Vector | OUT | output vector |
| m | GPI_Vector | IN | optional mask |

Let v be a vector of size n. This method sets the value of each element of v to its index (v[i] ← i, ∀i | m[i] = true).

4.2.3.3 Filtering

GPI_filter(w,x,u,v[,m])

| Parameter | Type | Direction | Description |
|---|---|---|---|
| w | GPI_Vector | OUT | output vector |
| x | GPI_Scalar | IN | test value |
| u | GPI_Vector | IN | input vector |
| v | GPI_Vector | IN | input vector |
| m | GPI_Vector | IN | optional mask |

Given three vectors, w, u and v of type T and size n, and a scalar x of type T, it computes vector w such that w[i] ← ((u[i] = x) ? v[i] : x), ∀i | m[i] = true.
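The element-wise rule of the vector filter translates directly into a loop. The sketch below is a plain C model with T fixed to `int32_t`; the function name is hypothetical, and leaving unselected elements of w unchanged is an assumption. The negated form described next differs only in testing `u[i] != x`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* w[i] <- (u[i] == x) ? v[i] : x, for every i whose mask entry is non-zero.
   A NULL mask stands for the omitted optional parameter (all true). */
void filter_i32(int32_t *w, int32_t x, const int32_t *u, const int32_t *v,
                const int32_t *mask, size_t n)
{
    assert(w != NULL && u != NULL && v != NULL);
    for (size_t i = 0; i < n; i++)
        if (mask == NULL || mask[i] != 0)
            w[i] = (u[i] == x) ? v[i] : x;
}
```

With x = 0, the result keeps v[i] at the positions where u is zero and is zero elsewhere.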

GPI_filter(C,x,A,B,dim[,m])

| Parameter | Type | Direction | Description |
|---|---|---|---|
| C | GPI_Matrix | OUT | output matrix |
| x | GPI_Vector | IN | test values |
| A | GPI_Matrix | IN | input matrix |
| B | GPI_Matrix | IN | input matrix |
| dim | GPI_index | IN | filtering dimension |
| m | GPI_Vector | IN | optional mask |

If dim = 0, this is equivalent to GPI_filter(C[i, :], x[i], A[i, :], B[i, :][, m]), ∀i. If dim = 1, this is equivalent to GPI_filter(C[:, j], x[j], A[:, j], B[:, j][, m]), ∀j.

GPI_filterneg(w,x,u,v[,m])

    w    GPI_Vector    OUT   output vector
    x    GPI_Scalar    IN    test value
    u    GPI_Vector    IN    input vector
    v    GPI_Vector    IN    input vector
    m    GPI_Vector    IN    optional mask

Given three vectors, w, u and v of type T and size n, and a scalar x of type T, it computes vector w such that w[i] ← ((u[i] ≠ x) ? v[i] : x), ∀i | m[i] = true.

GPI_filterneg(C,x,A,B,dim[,m])

    C    GPI_Matrix    OUT   output matrix
    x    GPI_Vector    IN    test values
    A    GPI_Matrix    IN    input matrix
    B    GPI_Matrix    IN    input matrix
    dim  GPI_index     IN    filtering dimension
    m    GPI_Vector    IN    optional mask

If dim = 0, this is equivalent to GPI_filterneg(C[i, :], x[i], A[i, :], B[i, :][,m]), ∀i. If dim = 1, this is equivalent to GPI_filterneg(C[:, j], x[j], A[:, j], B[:, j][,m]), ∀j.

4.2.3.4 Reduction

GPI_reduce(y,f,x,u[,m])

    y    GPI_Scalar    OUT   output scalar
    f    GPI_Function  IN    function
    x    GPI_Scalar    IN    starting value
    u    GPI_Vector    IN    input vector
    m    GPI_Vector    IN    optional mask

Given a scalar x of type T2, a vector u of type T1 and size n, and a scalar function f(T1, T2) → T2, compute a scalar y of type T2 using the recurrence y ← x; y ← f(u[i], y), i = 0, ..., n − 1 | m[i] = true.
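The recurrence above can be written out directly. This is a minimal sketch on int arrays, with T1 = T2 = int; the names sketch_binop, sketch_sum and sketch_reduce are hypothetical, and a NULL mask is assumed to mean that every element participates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical function type: f(element, accumulator) -> accumulator. */
typedef int (*sketch_binop)(int, int);

/* Example reduce function: plain addition. */
int sketch_sum(int a, int acc) { return a + acc; }

/* Sketch of the scalar recurrence y <- x; y <- f(u[i], y) for unmasked i. */
int sketch_reduce(sketch_binop f, int x, const int *u,
                  const bool *m, size_t n)
{
    assert(f != NULL);
    int y = x;                          /* starting value */
    for (size_t i = 0; i < n; i++) {
        if (m == NULL || m[i])
            y = f(u[i], y);             /* fold in element i */
    }
    return y;
}
```

Note that the elements are folded in index order here; a parallel implementation would presumably need f to be associative to reorder the reduction, which the recurrence itself does not state.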

GPI_reduce(y,f,x,A,dim[,m])

    y    GPI_Vector    OUT   output vector
    f    GPI_Function  IN    function
    x    GPI_Vector    IN    starting values
    A    GPI_Matrix    IN    input matrix
    dim  GPI_index     IN    reducing dimension
    m    GPI_Vector    IN    optional mask

If (dim = 0), then given a vector x of type T2 and size n, a matrix A of type T1 and shape m × n, and a vector function f(T1, T2) → T2, compute a vector y of type T2 and size n using the recurrence y ← x; y ← f(A[i, :], y), i = 0, ..., m − 1 | m[i] = true. If (dim = 1), then given a vector x of type T2 and size m, a matrix A of type T1 and shape m × n, and a vector function f(T1, T2) → T2, compute a vector y of type T2 and size m using the recurrence y ← x; y ← f(A[:, j], y), j = 0, ..., n − 1 | m[j] = true.


4.2.3.5 Mapping

GPI_map(v,f,u[,m])

    v    GPI_Vector    OUT   output vector
    f    GPI_Function  IN    function
    u    GPI_Vector    IN    input vector
    m    GPI_Vector    IN    optional mask

Given a scalar function f(T1) → T2, a vector u of type T1 and size n, and a vector v of size n and type T2, it computes v[i] ← f(u[i]), ∀i | m[i] = true.

GPI_map(B,f,A,dim[,m])

    B    GPI_Matrix    OUT   output matrix
    f    GPI_Function  IN    function
    A    GPI_Matrix    IN    input matrix
    dim  GPI_index     IN    mapping dimension
    m    GPI_Vector    IN    optional mask

If (dim = 0), then given a vector function f(T1) → T2, a matrix A of type T1 and shape m × n, and a matrix B of shape m × n and type T2, it computes B[i, :] ← f(A[i, :]), ∀i | m[i] = true. If (dim = 1), then given a vector function f(T1) → T2, a matrix A of type T1 and shape m × n, and a matrix B of shape m × n and type T2, it computes B[:, j] ← f(A[:, j]), ∀j | m[j] = true.

4.2.3.6 Zipping

GPI_zip(w,f,u,v[,m])

    w    GPI_Vector    OUT   output vector
    f    GPI_Function  IN    function
    u    GPI_Vector    IN    input vector
    v    GPI_Vector    IN    input vector
    m    GPI_Vector    IN    optional mask

Given a scalar function f(T1, T2) → T3, a vector u of type T1 and size n, a vector v of type T2 and size n, and a vector w of size n and type T3, it computes w[i] ← f(u[i], v[i]), ∀i | m[i] = true.
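As an illustration of the vector form, the sketch below applies a binary function element by element to two int arrays. All names (sketch_zipfn, sketch_zip, sketch_lor, sketch_add) are hypothetical, T1 = T2 = T3 = int is assumed, and a NULL mask is taken to mean "no mask".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical function type for the zipping function f(u[i], v[i]). */
typedef int (*sketch_zipfn)(int, int);

int sketch_lor(int a, int b) { return (a != 0) || (b != 0); }  /* logical OR */
int sketch_add(int a, int b) { return a + b; }                 /* addition   */

/* Sketch of w[i] <- f(u[i], v[i]) for every unmasked index i.
 * w may alias u or v, since each element is read before it is written. */
void sketch_zip(int *w, sketch_zipfn f, const int *u, const int *v,
                const bool *m, size_t n)
{
    assert(w != NULL && f != NULL && u != NULL && v != NULL);
    for (size_t i = 0; i < n; i++) {
        if (m == NULL || m[i])
            w[i] = f(u[i], v[i]);
    }
}
```

Calling sketch_zip(p, sketch_lor, p, q, NULL, n) accumulates the set q into the set p in place, mirroring the "accumulate nodes visited" step of the breadth first search example in the appendix.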

GPI_zip(C,f,A,B,dim[,m])

    C    GPI_Matrix    OUT   output matrix
    f    GPI_Function  IN    function
    A    GPI_Matrix    IN    input matrix
    B    GPI_Matrix    IN    input matrix
    dim  GPI_index     IN    zipping dimension
    m    GPI_Vector    IN    optional mask

If (dim = 0), then given a vector function f(T1, T2) → T3, a matrix A of type T1 and shape m × n, a matrix B of shape m × n and type T2, and a matrix C of shape m × n and type T3, it computes C[i, :] ← f(A[i, :], B[i, :]), ∀i | m[i] = true. If (dim = 1), then given a vector function f(T1, T2) → T3, a matrix A of type T1 and shape m × n, a matrix B of shape m × n and type T2, and a matrix C of shape m × n and type T3, it computes C[:, j] ← f(A[:, j], B[:, j]), ∀j | m[j] = true.

4.2.3.7 Function application

GPI_apply(v,f,u,k[,m])

    v    GPI_Vector    OUT   output vector
    f    GPI_Function  IN    function
    u    GPI_Vector    IN    initial values
    k    GPI_uint32    IN    number of applications
    m    GPI_Vector    IN    optional mask

Given a vector function f(T) → T, an initial value vector u of type T and size n, and a non-negative integer k, it computes an output vector v of size n and type T through ((k = 0) ? v[i] ← u[i] | m[i] = true : apply(v, f, f(u), k − 1)).

GPI_apply(B,f,A,k,dim[,m])

    B    GPI_Matrix    OUT   output matrix
    f    GPI_Function  IN    function
    A    GPI_Matrix    IN    initial values
    k    GPI_uint32    IN    number of applications
    dim  GPI_index     IN    applying dimension
    m    GPI_Vector    IN    optional mask

If (dim = 0), then this computes GPI_apply(B[i, :], f, A[i, :], k), ∀i | m[i] = true. If (dim = 1), then this computes GPI_apply(B[:, j], f, A[:, j], k), ∀j | m[j] = true.

GPI_fixpt(v,f,u[,m])

    v    GPI_Vector    OUT   output vector
    f    GPI_Function  IN    function
    u    GPI_Vector    IN    initial values
    m    GPI_Vector    IN    optional mask

Given a vector function f(T) → T, and an initial value vector u of type T and size n, it computes apply(v, f, u, k[,m]), where k is the smallest non-negative value such that apply(v, f, u, k[,m]) and apply(v, f, u, k + 1[,m]) return the same v[i], ∀i | m[i] = true. The value of v is undefined if there is no such k.
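The relationship between repeated application and the fixed point can be sketched on int arrays as follows. This is an illustration only: the names sketch_vecfn, sketch_halve, sketch_apply and sketch_fixpt are hypothetical, the optional mask is omitted for brevity, and the iteration cap max_iter is an assumption added here because the specification leaves v undefined when no fixed point exists. The return value 2 for allocation failure follows the "out of memory" code of the C bindings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical vector function type: writes f(in) into out (both size n). */
typedef void (*sketch_vecfn)(int *out, const int *in, size_t n);

/* Example vector function: halve every element (integer division). */
void sketch_halve(int *out, const int *in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] / 2;
}

/* v <- f applied k times to u. Returns 0 on success, 2 if out of memory. */
int sketch_apply(int *v, sketch_vecfn f, const int *u, unsigned k, size_t n)
{
    assert(v != NULL && f != NULL && u != NULL);
    if (n == 0)
        return 0;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return 2;
    memmove(v, u, n * sizeof *v);          /* k = 0 case: v is a copy of u */
    for (unsigned step = 0; step < k; step++) {
        f(tmp, v, n);                      /* one more application of f */
        memcpy(v, tmp, n * sizeof *v);
    }
    free(tmp);
    return 0;
}

/* Iterates f from u until the vector stops changing. Returns the smallest
 * k with f^k(u) == f^(k+1)(u), or -1 if none is found within max_iter
 * steps or memory runs out. On success v holds f^k(u). */
long sketch_fixpt(int *v, sketch_vecfn f, const int *u,
                  unsigned max_iter, size_t n)
{
    assert(v != NULL && f != NULL && u != NULL);
    if (n == 0)
        return 0;
    int *next = malloc(n * sizeof *next);
    if (next == NULL)
        return -1;
    memmove(v, u, n * sizeof *v);
    for (unsigned k = 0; k <= max_iter; k++) {
        f(next, v, n);
        if (memcmp(next, v, n * sizeof *v) == 0) {
            free(next);
            return (long)k;                /* v is a fixed point of f */
        }
        memcpy(v, next, n * sizeof *v);
    }
    free(next);
    return -1;
}
```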

GPI_fixpt(B,f,A,dim[,m])

    B    GPI_Matrix    OUT   output matrix
    f    GPI_Function  IN    function
    A    GPI_Matrix    IN    initial values
    dim  GPI_index     IN    applying dimension
    m    GPI_Vector    IN    optional mask

If (dim = 0), then this computes GPI_fixpt(B[i, :], f, A[i, :]), ∀i | m[i] = true. If (dim = 1), then this computes GPI_fixpt(B[:, j], f, A[:, j]), ∀j | m[j] = true.

4.2.3.8 Transposition

GPI_transpose(B,A)

    B    GPI_Matrix    OUT   output matrix
    A    GPI_Matrix    IN    input matrix

Let A be a matrix of type T and shape m × n. This function associates with B the transpose of A. That is, B[i, j] ≡ A[j, i], ∀i, j. Modifying B[i, j] has the effect of modifying A[j, i] and vice versa.

4.2.4 Derived methods

Derived methods can also be polymorphic. We again group the methods by kind of operation, and within each group we list the different forms of the methods.

4.2.4.1 Generalized inner product

GPI_innerp(y,f,g,x,u,v[,m])

    y    GPI_Scalar    OUT   result scalar
    f    GPI_Function  IN    reduce function
    g    GPI_Function  IN    map function
    x    GPI_Scalar    IN    initial scalar
    u    GPI_Vector    IN    input vector
    v    GPI_Vector    IN    input vector
    m    GPI_Vector    IN    optional mask

Let u and v be vectors of size n and types T1 and T2 respectively. Let f(T3 × T4) → T4 and g(T1 × T2) → T3 be functions, and let y be of type T4. This method first computes a vector w with GPI_zip(w, g, u, v[,m]). Then, it computes and returns y using GPI_reduce(y, f, x, w[,m]).

GPI_innerp(y,f,g,x,A,B,dim[,m])

    y    GPI_Vector    OUT   result vector
    f    GPI_Function  IN    reduce function
    g    GPI_Function  IN    map function
    x    GPI_Vector    IN    initial vector
    A    GPI_Matrix    IN    input matrix
    B    GPI_Matrix    IN    input matrix
    dim  GPI_index     IN    reducing dimension
    m    GPI_Vector    IN    optional mask

Let A and B be matrices of shape m × n and types T1 and T2 respectively. Let f(T3 × T4) → T4 and g(T1 × T2) → T3 be functions, and let y be of type T4. This method first computes a matrix C of type T3 and shape m × n with GPI_zip(C, g, A, B, dim[,m]). Then, it computes and returns y using GPI_reduce(y, f, x, C, dim[,m]).

4.2.4.2 Generalized matrix multiplication

GPI_mxv(y,A,x,f,g[,m])

    y    GPI_Vector    OUT   result vector
    A    GPI_Matrix    IN    input matrix
    x    GPI_Vector    IN    input vector
    f    GPI_Function  IN    reduce function
    g    GPI_Function  IN    map function
    m    GPI_Vector    IN    optional mask

Given a matrix A of type T1 and shape m × n, a vector x of type T2 and size n, functions f(T3 × T4) → T4 and g(T1 × T2) → T3, and a vector y of type T4 and size m (one element per row of A), this method computes y through GPI_innerp(y[i], f, g, x, A[i, :]), ∀i | m[i] = true. If f = GPI_SUM and g = GPI_PROD, this is a standard matrix-vector multiply.
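To make the role of the two functions concrete, here is a minimal sketch of a generalized matrix-vector product on a row-major int matrix. The names (sketch_binop, sketch_mxv, and the four small functions) are hypothetical, the mask is omitted, and the explicit init argument for the reduction's starting value is an assumption of this sketch, since the method signature above does not carry one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical binary function type used for both f (reduce) and g (map). */
typedef int (*sketch_binop)(int, int);

int sketch_add(int a, int acc)  { return a + acc; }               /* like a sum  */
int sketch_mul(int a, int b)    { return a * b; }                 /* like a prod */
int sketch_lor(int a, int acc)  { return (a != 0) || (acc != 0); }
int sketch_land(int a, int b)   { return (a != 0) && (b != 0); }

/* y[i] <- reduce over j of g(A[i,j], x[j]) with f, starting from init.
 * A is stored row-major with shape rows x cols; y has size rows. */
void sketch_mxv(int *y, const int *A, const int *x,
                sketch_binop f, sketch_binop g, int init,
                size_t rows, size_t cols)
{
    assert(y != NULL && A != NULL && x != NULL && f != NULL && g != NULL);
    for (size_t i = 0; i < rows; i++) {
        int acc = init;
        for (size_t j = 0; j < cols; j++)
            acc = f(g(A[i * cols + j], x[j]), acc);   /* inner product of row i */
        y[i] = acc;
    }
}
```

With sketch_add and sketch_mul this is the ordinary product. With sketch_lor and sketch_land applied to the transpose of a 0/1 adjacency matrix, the result marks every vertex that has an incoming edge from a vertex flagged in x, which corresponds to one frontier expansion step of breadth first search.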

GPI_vxm(y,x,A,f,g[,m])

    y    GPI_Vector    OUT   result vector
    x    GPI_Vector    IN    input vector
    A    GPI_Matrix    IN    input matrix
    f    GPI_Function  IN    reduce function
    g    GPI_Function  IN    map function
    m    GPI_Vector    IN    optional mask

Given a matrix A of type T2 and shape m × n, a vector x of type T1 and size m, functions f(T3 × T4) → T4 and g(T1 × T2) → T3, and a vector y of type T4 and size n, this method computes y through GPI_innerp(y[i], f, g, x, A[:, i]), ∀i | m[i] = true. If f = GPI_SUM and g = GPI_PROD, this is a standard vector-matrix multiply.

GPI_mxm(C,A,B,f,g)

    C    GPI_Matrix    OUT   result matrix
    A    GPI_Matrix    IN    input matrix
    B    GPI_Matrix    IN    input matrix
    f    GPI_Function  IN    reduce function
    g    GPI_Function  IN    map function

Given a matrix A of type T1 and shape m × n, a matrix B of type T2 and shape n × p, functions f(T3 × T4) → T4 and g(T1 × T2) → T3, and a matrix C of type T4 and shape m × p, this method computes C through GPI_mxv(C[:, j], A, B[:, j], f, g), ∀j. If f = GPI_SUM and g = GPI_PROD, this is a standard matrix-matrix multiply.


4.2.4.3 Graph operations

GPI_successors(y,A,s)

    y    GPI_Vector    OUT   result vector
    A    GPI_Matrix    IN    adjacency matrix
    s    GPI_index     IN    index of a source vertex

Given an adjacency matrix A and a source vertex s for a graph, computes a vector y such that y[i] = true if vertex i can be reached from vertex s in 0 or more steps, and y[i] = false otherwise. We note that y[s] is always true.

GPI_predecessors(y,A,t)

    y    GPI_Vector    OUT   result vector
    A    GPI_Matrix    IN    adjacency matrix
    t    GPI_index     IN    index of a target vertex

Given an adjacency matrix A and a target vertex t for a graph, computes a vector y such that y[i] = true if vertex t can be reached from vertex i in 0 or more steps, and y[i] = false otherwise. We note that y[t] is always true.

GPI_reachability(R,A)

    R    GPI_Matrix    OUT   reachability matrix
    A    GPI_Matrix    IN    adjacency matrix

Given an adjacency matrix A for a graph, computes the reachability matrix R for that graph. R[i, j] = true if vertex j can be reached from vertex i in 0 or more steps, and R[i, j] = false otherwise. We note that R[i, i] is always true.
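The definitions of the successor vector and the reachability matrix can be checked on a small graph with the sketch below. It is one straightforward way to obtain the stated results (growing the reached set until it stops changing), not necessarily how a GPI run-time would implement the methods; the names sketch_successors and sketch_reachability and the row-major bool representation are assumptions of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* y[i] <- true iff vertex i is reachable from s in 0 or more steps.
 * A is an n x n row-major adjacency matrix: A[i*n + j] means edge i -> j. */
void sketch_successors(bool *y, const bool *A, size_t s, size_t n)
{
    assert(y != NULL && A != NULL && s < n);
    for (size_t i = 0; i < n; i++)
        y[i] = false;
    y[s] = true;                          /* 0 steps: s reaches itself */
    bool changed = true;
    while (changed) {                     /* repeat until no vertex is added */
        changed = false;
        for (size_t i = 0; i < n; i++) {
            if (!y[i])
                continue;
            for (size_t j = 0; j < n; j++) {
                if (A[i * n + j] && !y[j]) {
                    y[j] = true;          /* j is one step from a reached vertex */
                    changed = true;
                }
            }
        }
    }
}

/* Row i of R is the successor vector of vertex i. */
void sketch_reachability(bool *R, const bool *A, size_t n)
{
    for (size_t i = 0; i < n; i++)
        sketch_successors(&R[i * n], A, i, n);
}
```

The predecessor vector of a vertex t is the same computation on the transposed adjacency matrix, that is, column t of R.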

5 Conclusions

Graph analytics is an important component of modern business computing. We propose a standard interface called Graph Programming Interface (GPI) to support the development of graph analytics applications that are both portable and high performing. GPI is intended to be used in linear algebra formulation of graph algorithms. For that purpose, it includes objects and methods that implement operations in that domain.

GPI supports scalar, vector and matrix objects. Those objects are completely opaque to the application programmer and can only be manipulated by GPI methods. This approach ensures maximum flexibility for implementations that want to exploit specific machine features and data organization. The same GPI application program can execute efficiently on a variety of machines, including multi-core and many-core processors, GPUs, FPGAs and message-passing systems.

GPI's specification is language independent. A concrete implementation of GPI must conform to a particular programming language binding. In the appendix we present a binding and examples for the C programming language.


References

[1] Jeremy Kepner and John Gilbert. Graph Algorithms in the Language of Linear Algebra. SIAM, 2011, ISBN 978-0-898719-90-1.

[2] Réka Albert and Albert-László Barabási. Topology of Evolving Networks: Local Events and Universality. Physical Review Letters, Dec. 11 2000, 85(24), pp. 5234-5237.

[3] Deepayan Chakrabarti and Christos Faloutsos. Graph Mining: Laws, Tools, and Case Studies. Morgan & Claypool Publishers, ISBN 978-1-608451-15-9, Chapter 11.

A GPI C bindings

In the C language bindings of GPI, every function returns a value of type GPI_int32. When that value is zero (0), the function completed execution without any detected errors. When that return value is any other number, an error was detected during execution of the function. The list of error codes for the C language bindings of GPI is shown in Table 4. The corresponding C types for GPI element types and objects are shown in Table 5.

Table 4: Error codes for the C language bindings of GPI.

    error code    error type
    < 0           GPI panic (unknown error)
    1             inconsistent/invalid parameters
    2             out of memory

Table 5: Element data types and objects for the C language bindings of GPI.

    GPI name        C name
    GPI_bool        GPI_bool
    GPI_int32       GPI_int32
    GPI_uint32      GPI_uint32
    GPI_fp32        GPI_fp32
    GPI_fp64        GPI_fp64
    GPI_index       GPI_index
    GPI_size        GPI_size
    GPI_addr        GPI_addr
    GPI_Scalar      GPI_Scalar
    GPI_Vector      GPI_Vector
    GPI_Matrix      GPI_Matrix
    GPI_Function    GPI_Function

C language prototypes for the matrix and vector building methods of GPI (so-called entity methods) are shown in Table 6. C language prototypes for the accessor methods of GPI are shown in Table 7. C language prototypes for the base methods of GPI are shown in Table 8. C language prototypes for the derived methods of GPI are shown in Table 9. In all cases we omit the return type, which is always GPI_int32.

Table 6: Entity methods prototypes for the C language bindings of GPI.

    GPI method           C language prototype
    GPI_Vector_new       GPI_Vector_new(GPI_Vector*,GPI_type,GPI_size)
    GPI_Vector_copy      GPI_Vector_copy(GPI_Vector*,GPI_Vector*,GPI_Vector*)
    GPI_Vector_delete    GPI_Vector_delete(GPI_Vector*)
    GPI_Matrix_new       GPI_Matrix_new(GPI_Matrix*,GPI_type,GPI_size,GPI_size)
    GPI_Matrix_copy      GPI_Matrix_copy(GPI_Matrix*,GPI_Matrix*)
    GPI_Matrix_copy      GPI_Matrix_copy(GPI_Matrix*,GPI_Matrix*,GPI_index[,GPI_Vector*])
    GPI_Matrix_delete    GPI_Matrix_delete(GPI_Matrix*)

Table 7: Accessor methods prototypes for the C language bindings of GPI.

    GPI method               C language prototype
    GPI_Matrix_nrows         GPI_Matrix_nrows(GPI_size*,GPI_Matrix*)
    GPI_Matrix_ncols         GPI_Matrix_ncols(GPI_size*,GPI_Matrix*)
    GPI_Matrix_row           GPI_Matrix_row(GPI_Vector*,GPI_Matrix*,GPI_index)
    GPI_Matrix_col           GPI_Matrix_col(GPI_Vector*,GPI_Matrix*,GPI_index)
    GPI_Matrix_getElement    GPI_Matrix_getElement(GPI_Scalar*,GPI_Matrix*,GPI_index,GPI_index)
    GPI_Matrix_setElement    GPI_Matrix_setElement(GPI_Matrix*,GPI_index,GPI_index,GPI_Scalar)
    GPI_Vector_size          GPI_Vector_size(GPI_size*,GPI_Vector*)
    GPI_Vector_getElement    GPI_Vector_getElement(GPI_Scalar*,GPI_Vector*,GPI_index)
    GPI_Vector_setElement    GPI_Vector_setElement(GPI_Vector*,GPI_index,GPI_Scalar)


Table 8: Base methods prototypes for the C language bindings of GPI.

    GPI method       C language prototype
    GPI_replicate    GPI_replicate(GPI_Matrix*,GPI_Scalar*)
    GPI_replicate    GPI_replicate(GPI_Matrix*,GPI_Vector*,GPI_index[,GPI_Vector*])
    GPI_replicate    GPI_replicate(GPI_Vector*,GPI_Scalar*[,GPI_Vector*])
    GPI_indices      GPI_indices(GPI_Vector*[,GPI_Vector*])
    GPI_filter       GPI_filter(GPI_Vector*,GPI_Scalar*,GPI_Vector*,GPI_Vector*[,GPI_Vector*])
    GPI_filter       GPI_filter(GPI_Matrix*,GPI_Vector*,GPI_Matrix*,GPI_Matrix*,GPI_index[,GPI_Vector*])
    GPI_filterneg    GPI_filterneg(GPI_Vector*,GPI_Scalar*,GPI_Vector*,GPI_Vector*[,GPI_Vector*])
    GPI_filterneg    GPI_filterneg(GPI_Matrix*,GPI_Vector*,GPI_Matrix*,GPI_Matrix*,GPI_index[,GPI_Vector*])
    GPI_reduce       GPI_reduce(GPI_Scalar*,GPI_Function*,GPI_Scalar*,GPI_Vector*[,GPI_Vector*])
    GPI_reduce       GPI_reduce(GPI_Vector*,GPI_Function*,GPI_Vector*,GPI_Matrix*,GPI_index[,GPI_Vector*])
    GPI_map          GPI_map(GPI_Vector*,GPI_Function*,GPI_Vector*[,GPI_Vector*])
    GPI_map          GPI_map(GPI_Matrix*,GPI_Function*,GPI_Matrix*,GPI_index[,GPI_Vector*])
    GPI_zip          GPI_zip(GPI_Vector*,GPI_Function*,GPI_Vector*,GPI_Vector*[,GPI_Vector*])
    GPI_zip          GPI_zip(GPI_Matrix*,GPI_Function*,GPI_Matrix*,GPI_Matrix*,GPI_index[,GPI_Vector*])
    GPI_apply        GPI_apply(GPI_Vector*,GPI_Function*,GPI_Vector*,GPI_uint32[,GPI_Vector*])
    GPI_apply        GPI_apply(GPI_Matrix*,GPI_Function*,GPI_Matrix*,GPI_uint32,GPI_index[,GPI_Vector*])
    GPI_fixpt        GPI_fixpt(GPI_Vector*,GPI_Function*,GPI_Vector*[,GPI_Vector*])
    GPI_fixpt        GPI_fixpt(GPI_Matrix*,GPI_Function*,GPI_Matrix*,GPI_index[,GPI_Vector*])
    GPI_transpose    GPI_transpose(GPI_Matrix*,GPI_Matrix*)

Table 9: Derived methods prototypes for the C language bindings of GPI.

    GPI method          C language prototype
    GPI_innerp          GPI_innerp(GPI_Scalar*,GPI_Function*,GPI_Function*,GPI_Scalar,GPI_Vector*,GPI_Vector*[,GPI_Vector*])
    GPI_innerp          GPI_innerp(GPI_Vector*,GPI_Function*,GPI_Function*,GPI_Vector*,GPI_Matrix*,GPI_Matrix*,GPI_index[,GPI_Vector*])
    GPI_mxv             GPI_mxv(GPI_Vector*,GPI_Matrix*,GPI_Vector*,GPI_Function*,GPI_Function*[,GPI_Vector*])
    GPI_vxm             GPI_vxm(GPI_Vector*,GPI_Vector*,GPI_Matrix*,GPI_Function*,GPI_Function*[,GPI_Vector*])
    GPI_mxm             GPI_mxm(GPI_Matrix*,GPI_Matrix*,GPI_Matrix*,GPI_Function*,GPI_Function*)
    GPI_successors      GPI_successors(GPI_Vector*,GPI_Matrix*,GPI_index)
    GPI_predecessors    GPI_predecessors(GPI_Vector*,GPI_Matrix*,GPI_index)
    GPI_reachability    GPI_reachability(GPI_Matrix*,GPI_Matrix*)


B Methods summary

All methods introduced in this report are listed alphabetically in Table 10.

Table 10: GPI Methods.

    Function                 Type        Page
    GPI_apply                base        17
    GPI_filter               base        14
    GPI_filterneg            base        14
    GPI_fixpt                base        17
    GPI_indices              base        14
    GPI_innerp               derived     18
    GPI_map                  base        16
    GPI_Matrix_col           accessor    12
    GPI_Matrix_copy          entity      11
    GPI_Matrix_delete        entity      12
    GPI_Matrix_getElement    accessor    12
    GPI_Matrix_ncols         accessor    12
    GPI_Matrix_new           entity      11
    GPI_Matrix_nrows         accessor    12
    GPI_Matrix_row           accessor    12
    GPI_Matrix_setElement    accessor    12
    GPI_mxm                  derived     19
    GPI_mxv                  derived     19
    GPI_predecessors         derived     20
    GPI_reachability         derived     20
    GPI_reduce               base        15
    GPI_replicate            base        13
    GPI_successors           derived     20
    GPI_transpose            base        18
    GPI_Vector_copy          entity      11
    GPI_Vector_delete        entity      11
    GPI_Vector_getElement    accessor    13
    GPI_Vector_new           entity      10
    GPI_Vector_setElement    accessor    13
    GPI_Vector_size          accessor    13
    GPI_vxm                  derived     19
    GPI_zip                  base        16


C Example implementation of breadth first search

    #include <stdio.h>
    #include <stdbool.h>
    #include "gpi.h"

    GPI_int32 BFS(GPI_Matrix *A, GPI_index s, GPI_Function1 *visit)
    /*
     * Given an adjacency matrix A and a source node s, performs a BFS traversal
     * and calls "visit" on the vertices visited (excluding source).
     */
    {
        GPI_uint32 n; GPI_Matrix_nrows(&n, A);       // n = number of vertices in graph

        GPI_Vector q, p, r, v;
        GPI_Vector_new(&q, GPI_int32, n);            // int32[n] q
        GPI_replicate(&q, GPI_zero);                 // q = 0
        GPI_Vector_setElement(&q, s, GPI_one);       // q[s] = 1 implies that node s is visited at current level
        GPI_Vector_new(&p, GPI_int32, n);            // int32[n] p
        GPI_Vector_copy(&p, &q);                     // p[s] = 1 implies that node s is already visited
        GPI_Matrix B; GPI_transpose(&B, A);          // prepare transpose of adjacency matrix
        GPI_Vector_new(&r, GPI_int32, n);            // int32[n] r
        GPI_Vector_new(&v, GPI_int32, n);            // v = 0..n-1
        GPI_indices(&v);

        /*
         * BFS traversal and visits the vertices.
         */
        bool done = false;                           // done == true when BFS phase is complete
        do {
            GPI_replicate(&r, GPI_zero);
            GPI_mxv(&r, &B, &q, GPI_LOR, GPI_LAND);  // r = (A^T)*q, finds all successors of nodes in q
            GPI_filter(&q, GPI_zero, &p, &r);        // filter out nodes already visited
            GPI_zip(&p, GPI_LOR, &p, &q);            // accumulate nodes visited
            GPI_map(&r, visit, &v, &q);              // visit nodes in this level
            GPI_Scalar sum;
            GPI_reduce(&sum, GPI_LOR, GPI_zero, &q); // find if there is any node in q.
            done = (sum == GPI_zero);
        } while (!done);                             // if there is no node in q, we are done.

        GPI_Vector_delete(&q);
        GPI_Vector_delete(&p);
        GPI_Vector_delete(&r);
        GPI_Vector_delete(&v);

        return 0;
    }


D Example implementation of Brandes's betweenness centrality

    #include <stdio.h>
    #include <stdbool.h>
    #include "gpi.h"

    GPI_int32 BC(GPI_Vector *delta, GPI_Matrix *A, GPI_index s)
    /*
     * Given an adjacency matrix A and a source node s, compute BC-metric vector delta
     */
    {
        GPI_uint32 n; GPI_Matrix_nrows(&n, A);                     // n = number of vertices in graph
        GPI_replicate(delta, GPI_zero);                            // delta = 0

        GPI_Matrix sigma; GPI_Matrix_new(&sigma, GPI_int32, n, n); // int32[n,n] sigma
        GPI_replicate(&sigma, GPI_zero);                           // sigma[d,k] = shortest path count to node k at level d

        GPI_Vector q; GPI_Vector_new(&q, GPI_int32, n);            // vector of path counts
        GPI_replicate(&q, GPI_zero);                               // q = 0
        GPI_Vector_setElement(&q, s, 1);                           // q[s] = 1

        GPI_Vector p; GPI_Vector_new(&p, GPI_int32, n);            // shortest path counts to all nodes so far
        GPI_Vector_copy(&p, &q);                                   // p = q

        /*
         * BFS phase
         */
        GPI_int32 d = 0;                                           // BFS level number
        bool done = false;                                         // done == true when BFS phase is complete
        do {
            GPI_Vector sigmad; GPI_Matrix_row(&sigmad, &sigma, d); // row sigma[d,:] = q
            GPI_Vector_copy(&sigmad, &q);
            GPI_vxm(&q, &q, A, GPI_SUM, GPI_PROD);                 // q = path counts to nodes reachable from current level
            GPI_filter(&q, GPI_zero, &p, &q);                      // filter out nodes already visited
            GPI_zip(&p, GPI_SUM, &p, &q);                          // accumulate path counts on this level
            GPI_Scalar sum;
            GPI_reduce(&sum, GPI_SUM, GPI_zero, &q);               // sum path counts at this level
            done = (sum == GPI_zero);                              // if sum is zero, we are done
            d++;
        } while (!done);

        /*
         * BC computation phase
         */
        GPI_Vector t1; GPI_Vector_new(&t1, GPI_fp32, n);           // temporary vectors for computation
        GPI_Vector t2; GPI_Vector_new(&t2, GPI_fp32, n);
        GPI_Vector t3; GPI_Vector_new(&t3, GPI_fp32, n);
        GPI_Vector t4; GPI_Vector_new(&t4, GPI_fp32, n);
        for (int i = d-1; i > 0; i--)
        {
            GPI_replicate(&t1, GPI_one);                           // add 1 to delta
            GPI_zip(&t1, GPI_SUM, &t1, delta);
            GPI_Vector sigmai; GPI_Matrix_row(&sigmai, &sigma, i); // divide it by sigma for this level
            GPI_zip(&t2, GPI_DIV, &t1, &sigmai);
            GPI_mxv(&t3, A, &t2, GPI_SUM, GPI_PROD);               // for each node, add contributions made by successors
            GPI_Vector sigmaim1; GPI_Matrix_row(&sigmaim1, &sigma, i-1); // multiply by corresponding sigma
            GPI_zip(&t4, GPI_PROD, &sigmaim1, &t3);
            GPI_zip(delta, GPI_SUM, delta, &t4);                   // accumulate into delta
        }

        GPI_Matrix_delete(&sigma);
        GPI_Vector_delete(&p);
        GPI_Vector_delete(&q);
        GPI_Vector_delete(&t1);
        GPI_Vector_delete(&t2);
        GPI_Vector_delete(&t3);
        GPI_Vector_delete(&t4);

        return 0;
    }


E Example implementation of Prim's minimum spanning tree

    #include <stdio.h>
    #include <stdlib.h>
    #include "gpi.h"

    GPI_Scalar minnz(GPI_Scalar x, GPI_Scalar y)
    /* minimum of x and y, treating zero as "no value" */
    {
        if (x == 0) return y;
        if (y == 0) return x;
        if (x > y) return y;
        return x;                                        // x <= y: the original listing fell off the end here
    }

    GPI_int32 MST(GPI_Vector *p, GPI_Matrix *A)
    /*
     * Given a weighted adjacency matrix A, compute the minimum spanning tree p
     */
    {
        GPI_size n; GPI_Matrix_nrows(&n, A);             // n = number of vertices in graph
        GPI_Vector s; GPI_Vector_new(&s, GPI_int32, n);  // vertex i is in the spanning tree iff s[i] = 1
        GPI_replicate(&s, 0);
        GPI_Vector_setElement(&s, 0, 1);                 // vertex 0 is the root of the spanning tree
        GPI_replicate(p, 0);                             // if (p[i] > 0) then arc p[i]->i is in the spanning tree

        GPI_Vector vi; GPI_Vector_new(&vi, GPI_int32, n);
        GPI_Vector vx; GPI_Vector_new(&vx, GPI_int32, n);
        GPI_Vector vv; GPI_Vector_new(&vv, GPI_int32, n);

        for (int i = 1; i < n; i++)
        {
            GPI_Matrix M; GPI_Matrix_new(&M, GPI_int32, n, n);
            for (int j = 0; j < n; j++)                  // multiply each column of A by s to remove
            {                                            // all weights of arcs not originating from s
                GPI_Vector Mj; GPI_Matrix_col(&Mj, &M, j);
                GPI_Vector Aj; GPI_Matrix_col(&Aj, A, j);
                GPI_zip(&Mj, GPI_PROD, &s, &Aj);
            }
            for (int k = 0; k < n; k++)                  // for each column k of M
            {
                GPI_Vector Mk; GPI_Matrix_col(&Mk, &M, k);
                GPI_Scalar vik; GPI_Scalar vxk;
                GPI_reduce(&vik, minnz, 0, &Mk);         // find the minimum non-zero value in that column
                GPI_reduce(&vxk, GPI_LOC, vik, &Mk);     // and its location (row index)
                if (vik == 0) vxk = 0;
                GPI_Vector_setElement(&vi, k, vik);      // vi[k] = minimum non-zero of column M[:,k]
                GPI_Vector_setElement(&vx, k, vxk);      // vx[k] = row index of that minimum
            }
            GPI_filter(&vv, 0, &s, &vi);                 // get the vector of values
            GPI_Scalar min, j;
            GPI_reduce(&min, minnz, 0, &vv);             // get the minimal value in vv
            GPI_reduce(&j, GPI_LOC, min, &vv);           // j is the destination of the selected arc
                                                         // (associated to the minimal value in vv)
            GPI_Scalar i; GPI_Vector_getElement(&i, &vx, j);
            GPI_Vector_setElement(&s, j, 1);
            GPI_Vector_setElement(p, j, i);              // i is the source of the selected arc
        }

        GPI_Vector_delete(&s);
        GPI_Vector_delete(&vv);
        GPI_Vector_delete(&vi);
        GPI_Vector_delete(&vx);

        return 0;
    }


F Example implementation of strongly connected components

    #include <stdio.h>
    #include "gpi.h"

    GPI_int32 SCC(GPI_Vector *q, GPI_Matrix *A, GPI_Vector *b, GPI_int32 prefix)
    /*
     * Given an adjacency matrix A, a vector b representing a subgraph (b[i] == true means vertex i is present
     * in the subset), and an integer prefix, returns a vector q of labels for the subgraph
     * so that all nodes belonging to a strongly connected component have a unique label
     */
    {
        GPI_replicate(q, 0);
        GPI_size n; GPI_Matrix_nrows(&n, A);                         // n = number of vertices in graph

        int k = -1; GPI_Scalar found;                                // find k such that b[k] == true
        do { k++; GPI_Vector_getElement(&found, b, k); } while ((!found) && (k < n-1));
        if (!found) return 0;

        GPI_Vector p0; GPI_Vector_new(&p0, GPI_bool, n);             // p0 = predecessors of k
        GPI_predecessors(&p0, A, k);
        GPI_Vector s0; GPI_Vector_new(&s0, GPI_bool, n);             // s0 = successors of k
        GPI_successors(&s0, A, k);
        GPI_Vector p; GPI_Vector_new(&p, GPI_bool, n);               // p = p0 & b
        GPI_Vector s; GPI_Vector_new(&s, GPI_bool, n);               // s = s0 & b
        GPI_filterneg(&p, 0, b, &p0);
        GPI_filterneg(&s, 0, b, &s0);

        GPI_Vector bset; GPI_Vector_new(&bset, GPI_bool, n);         // bset = vertices that are both
        GPI_zip(&bset, GPI_LAND, &p, &s);                            // predecessors and successors of k
        GPI_Vector pset; GPI_Vector_new(&pset, GPI_bool, n);         // pset = vertices that are only
        GPI_filter(&pset, 0, &bset, &p);                             // predecessors of k
        GPI_Vector sset; GPI_Vector_new(&sset, GPI_bool, n);         // sset = vertices that are only
        GPI_filter(&sset, 0, &bset, &s);                             // successors of k

        GPI_Vector blabels; GPI_Vector_new(&blabels, GPI_int32, n);  // label the nodes in common
        GPI_replicate(&blabels, 1+prefix*10);
        GPI_filterneg(&blabels, 0, &bset, &blabels);
        GPI_Vector plabels; GPI_Vector_new(&plabels, GPI_int32, n);  // components in predecessors subgraph
        SCC(&plabels, A, &pset, 2+prefix*10);
        GPI_Vector slabels; GPI_Vector_new(&slabels, GPI_int32, n);  // components in successors subgraph
        SCC(&slabels, A, &sset, 3+prefix*10);

        GPI_zip(q, GPI_SUM, &blabels, &plabels);                     // SCC labels of all nodes
        GPI_zip(q, GPI_SUM, q, &slabels);

        GPI_Vector_delete(&p);
        GPI_Vector_delete(&s);

        return 0;
    }


G Graph generation

When studying and developing graph algorithms, it is desirable to test them with graphs of controlled size and characteristics. This is usually done with synthetic graphs produced through well-defined processes. This section covers two commonly used graph generation algorithms: the preferential attachment algorithm, known to generate true power law distributions for vertex in-degree, and the RMat algorithm, known to offer control over the existence of communities in a graph.

G.1 The preferential attachment algorithm

Preferential attachment is a generative approach in which the end point of a new edge, or of an existing edge being moved, is selected from the existing vertices with a probability proportional to each vertex's in-degree. Intuitively, it makes the highly connected vertices even more highly connected, giving rise to the power law distribution for vertex in-degree and resulting in scale-free graphs.

As described in [2], the algorithm takes four parameters: m0, m, p, and q. Parameter m0 is the starting number of disconnected vertices. The algorithm then iterates on the following procedure until the desired number of vertices or edges has been reached. In each iteration, one of the following three actions is performed, with probabilities p, q, and 1 − p − q respectively.

1. A source vertex is chosen from the existing vertices randomly (uniform distribution over vertices), and an edge is added to a target vertex selected preferentially (the probability distribution function is the in-degree of the vertex divided by the total number of edges in the graph). This is repeated m times for each iteration.

2. A source vertex and one of its edges are chosen randomly, and the target vertex of that edge is changed to a new target vertex chosen preferentially. This is repeated m times for each iteration.

3. A new vertex is created and m edges from it are added to existing vertices selected preferentially.
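The choice among the three actions can be sketched in C as a partition of the unit interval. This is a minimal illustration, not the report's implementation: the enum labels and the function name `pa_choose_action` are hypothetical, and the uniform sample u in [0, 1) is assumed to come from whatever random number generator the caller uses.

```c
#include <assert.h>

/* Hypothetical labels for the three actions of one iteration. */
enum pa_action {
    PA_ADD_EDGES = 1,     /* action 1: add m edges between existing vertices */
    PA_REWIRE_EDGES = 2,  /* action 2: retarget m existing edges */
    PA_NEW_VERTEX = 3     /* action 3: add a new vertex with m edges */
};

/* Map a uniform sample u in [0, 1) to an action:
 * [0, p) selects action 1, [p, p + q) action 2, [p + q, 1) action 3. */
enum pa_action pa_choose_action(double u, double p, double q)
{
    assert(p >= 0.0 && q >= 0.0 && p + q <= 1.0);
    assert(u >= 0.0 && u < 1.0);
    if (u < p)
        return PA_ADD_EDGES;
    if (u < p + q)
        return PA_REWIRE_EDGES;
    return PA_NEW_VERTEX;
}
```

Each interval has a width equal to the probability of its action, so the three actions occur with probabilities p, q, and 1 − p − q.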

G.2 Implementation of preferential attachment algorithm

The key challenge in implementing the preferential attachment algorithm is the selection of a target vertex with probability proportional to its in-degree. Conceptually, it is accomplished by maintaining an array ci, 0 ≤ i ≤ n, where n is the number of vertices, defined as:

$$c_0 = 0, \tag{1}$$

$$c_{i>0} = \sum_{j=0}^{i-1} d_j. \tag{2}$$

where dj is the in-degree of vertex j for 0 ≤ j < n. A random number x is taken using the uniform distribution over the range [0, cn), where cn is the sum of all in-degrees. Vertex i is selected as the target of a new edge if ci ≤ x < ci+1. A naive implementation of this approach would have quadratic complexity, because each change to an in-degree requires updating all the ci that follow it. So instead of maintaining the ci explicitly, we maintain a binary tree of partial sums. The leaf nodes of the tree hold the in-degree values, and each internal node holds the sum of the in-degrees of all its leaf nodes or, equivalently, the sum of the values of its children. The tree is represented as a complete binary tree, so traversal towards the root or towards a leaf can be done with simple shift operations and increments.
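For reference, the conceptual (naive) scheme of equations (1) and (2) can be sketched as follows. The function names are hypothetical. The sample x is assumed to lie in [0, cn), where cn is the sum of all in-degrees. This is the approach that the partial sum tree replaces, shown only to make the selection rule ci ≤ x < ci+1 concrete.

```c
#include <assert.h>
#include <stddef.h>

/* Fill c[0..n] from in-degrees d[0..n-1]:
 * c[0] = 0 and c[i] = d[0] + ... + d[i-1], as in equations (1) and (2). */
void cumulative_degrees(const unsigned *d, size_t n, unsigned *c)
{
    c[0] = 0;
    for (size_t i = 0; i < n; i++)
        c[i + 1] = c[i] + d[i];
}

/* Return the vertex i with c[i] <= x < c[i+1]. Requires x < c[n]. */
size_t select_linear(const unsigned *c, size_t n, unsigned x)
{
    assert(x < c[n]);
    size_t i = 0;
    while (x >= c[i + 1])   /* terminates because x < c[n] */
        i++;
    return i;
}
```

With in-degrees 2, 3, 3, 2, 4 the array c is 0, 2, 5, 8, 10, 14, so x = 6 selects vertex 2 (counting from 0). Every edge insertion changes a suffix of c, which is the linear cost per update that leads to the quadratic total.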

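[Figure 4 graphic. Values recovered from the graphic: leaf values 2, 3, 3, 2, 4; internal node values 5, 5, 4, 4, 10, and 14; values of x along the search path 6, 6, 1, 1. Panels are labeled (a) and (b).]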

Figure 4: Probability for edge attachment (a), and partial sum tree (b).

Figure 4(a) illustrates a five-node graph and a new shaded node from which we need to insert an edge to an existing node. The binary tree of partial sums is illustrated in Figure 4(b). Leaf nodes of the tree, in rectangles, correspond to the nodes of the graph, with their degree written inside the rectangle. Internal nodes have the sum of the degrees of all their leaf nodes written inside the circle. The dashed lines in Figure 4(a) illustrate the probability of attaching to various nodes.

To find the vertex i such that ci ≤ x < ci+1, starting with an x taken from the uniform distribution over the range [0, cn), where cn is the sum of all in-degrees, we start at the root and repeatedly take the left or right edge until we reach a leaf. The left edge is taken if x is less than the value of the left child; otherwise the right edge is taken and the value of x is decremented by the value of the left child. Figure 4(b) illustrates the selection of vi when x = 6. The value of x is shown in italics outside the nodes, and the path taken from root to leaf is shown in bold edges.
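The descent can be sketched in C under an assumed array layout, which the report does not spell out. The root is stored at index 1, the children of node k are at 2k and 2k + 1, and the leaves occupy the last `leaves` slots, where `leaves` is a power of two and unused leaves hold 0. The function name `tree_select` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return the vertex whose cumulative in-degree interval contains x.
 * tree[1] is the root, tree[2k] and tree[2k+1] are the children of node k,
 * and the leaves are tree[leaves .. 2*leaves-1] (assumed layout). */
size_t tree_select(const unsigned *tree, size_t leaves, unsigned x)
{
    assert(x < tree[1]);          /* x must be below the total in-degree */
    size_t k = 1;
    while (k < leaves) {
        size_t left = k << 1;     /* a shift moves to the left child */
        if (x < tree[left]) {
            k = left;
        } else {
            x -= tree[left];      /* skip the weight of the left subtree */
            k = left + 1;         /* an increment moves to the right child */
        }
    }
    return k - leaves;            /* leaf position is the vertex index */
}
```

For leaf values 2, 3, 3, 2, 4 padded with zeros to eight leaves, x = 6 goes left at the root (6 < 10), right at the next node (6 ≥ 5, so x becomes 1), and left again (1 < 3), reaching the third leaf. The number of steps is the height of the tree, so a selection costs O(log n).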

When a new edge is added to an end-vertex, we increment the value in the leaf node of the end-vertex of the edge, and all internal nodes on the path from the leaf to the root, including the root, in the tree of partial sums. Similarly, when an edge is removed from an end-vertex, we decrement the value in the leaf node of the end-vertex of the edge, and all internal nodes on the path from the leaf to the root, including the root, in the tree of partial sums.
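A sketch of this update in C, assuming a heap-style array layout that the report does not specify: root at index 1, parent of node k at k / 2, and the leaf for vertex v at index leaves + v, with `leaves` a power of two. The function name `tree_update` and its `added` flag are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Adjust the partial sums after an edge is attached to (added != 0)
 * or detached from (added == 0) end-vertex v. */
void tree_update(unsigned *tree, size_t leaves, size_t v, int added)
{
    assert(v < leaves);
    /* Walk from the leaf to the root; a right shift moves to the parent. */
    for (size_t k = leaves + v; k >= 1; k >>= 1) {
        if (added) {
            tree[k]++;
        } else {
            assert(tree[k] > 0);  /* cannot remove an edge that is not there */
            tree[k]--;
        }
    }
}
```

Only the nodes on one leaf-to-root path change, so an update costs O(log n), the same as a selection.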

G.3 The RMat algorithm

The RMat algorithm [3] offers better control over the existence of communities in a graph. The number of nodes is assumed to be an integral power of 2. The algorithm takes three parameters: A0, B0, and C0. A derived parameter D0 = 1 − A0 − B0 − C0 is also used. For most use cases B0 = C0. The algorithm starts with an empty adjacency matrix and adds edges (entries in the adjacency matrix) until the desired number of edges has been added. The row and column indices for the edge to be inserted into the adjacency matrix are computed by recursively choosing the north-west, north-east, south-west, and south-east quadrants with probabilities Ai, Bi, Ci, and Di, respectively, where i is the depth of recursion. The probabilities are computed as follows. Let X stand for one of A, B, C, or D. We first compute

$$X_{i+1} = X_i \cdot (0.95 + 0.1 \cdot \mathrm{rand}(0, 1)), \quad X \in \{A, B, C, D\}, \tag{3}$$

where rand(x, y) produces a uniformly distributed random number in the range [x, y). We then normalize the intermediate results so that the probabilities add to 1:

$$X_{i+1} = \frac{X_{i+1}}{A_{i+1} + B_{i+1} + C_{i+1} + D_{i+1}}, \quad X \in \{A, B, C, D\}. \tag{4}$$
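One recursion step of equations (3) and (4), followed by a quadrant choice, might look as follows in C. This is a sketch under assumptions: the function names are hypothetical, the probabilities are held in an array ordered A, B, C, D, and the C library `rand()` stands in for rand(0, 1); a real generator would likely use a better random source.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for rand(0, 1): uniform in [0, 1), built on the C library rand(). */
static double rand01(void)
{
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Perturb prob[0..3] = {A, B, C, D} as in equation (3), then
 * normalize as in equation (4) so the four values add to 1. */
void rmat_next_probs(double prob[4])
{
    double sum = 0.0;
    for (int k = 0; k < 4; k++) {
        prob[k] *= 0.95 + 0.1 * rand01();
        sum += prob[k];
    }
    assert(sum > 0.0);
    for (int k = 0; k < 4; k++)
        prob[k] /= sum;
}

/* Pick a quadrant index 0..3 (for A, B, C, D) from a uniform sample
 * u in [0, 1) by walking the cumulative probabilities. */
int rmat_choose_quadrant(const double prob[4], double u)
{
    double acc = 0.0;
    for (int k = 0; k < 3; k++) {
        acc += prob[k];
        if (u < acc)
            return k;
    }
    return 3;
}
```

Each perturbation factor lies in [0.95, 1.05), so the probabilities drift slightly from one recursion level to the next while still summing to 1 after normalization.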

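[Figure 5 graphic: adjacency matrix divided into quadrants labeled A0, B0, C0, and D0, with nested labels for the selections C0, B1, and C2.]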

Figure 5: RMat edge insertion example.

Figure 5 illustrates the insertion of edge (5,2) as a result of quadrant selection outcomes associated with probabilities C0, B1, C2.

The sequence of Xi generated in the above procedure can be viewed as the columns of a 2 × log2 N Boolean matrix. The rows then correspond to the source and destination addresses of the edge to be inserted in an N-node graph. Therefore the source and destination addresses can be computed by encoding each quadrant selection Xi as a value in the range [0, 3] and left shifting its upper and lower bits into the source and destination addresses respectively, the two addresses having been initially set to 0.
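The bit-shifting step can be sketched in C as follows. The encoding of the quadrants as 0 (north-west, A), 1 (north-east, B), 2 (south-west, C), and 3 (south-east, D) is an assumption of this sketch, chosen so that the upper bit is the row (source) bit and the lower bit is the column (destination) bit. The function name `rmat_addresses` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Build the source and destination addresses from the quadrant draws
 * xs[0..levels-1], each in 0..3, most significant level first.
 * The upper bit of each draw is shifted into the source address and
 * the lower bit into the destination address. */
void rmat_addresses(const unsigned *xs, size_t levels,
                    unsigned long *src, unsigned long *dst)
{
    *src = 0;
    *dst = 0;
    for (size_t i = 0; i < levels; i++) {
        assert(xs[i] <= 3);
        *src = (*src << 1) | ((xs[i] >> 1) & 1u);
        *dst = (*dst << 1) | (xs[i] & 1u);
    }
}
```

Under this encoding the selections C0, B1, C2 become the draws 2, 1, 2. The source bits are 1, 0, 1 and the destination bits are 0, 1, 0, which gives the edge (5, 2).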

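[Figure 6 graphic: the source address bits Sn-1, Sn-2, ..., Si, ..., S1, S0 and the destination address bits Dn-1, Dn-2, ..., Di, ..., D1, D0, where Xi is the pair of bits defined in the ith iteration. Example shown: source address bits 1 0 1, destination address bits 0 1 0.]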

Figure 6: RMat selection of a node pair.

The selection of C0, B1, C2 in Figure 5 corresponds to the draws for Xi being 2, 1, and 2, as illustrated in Figure 6 (source bits 1 0 1, destination bits 0 1 0), and this sequence of Xi corresponds to the source and destination addresses being 5 and 2 respectively.
