Galois Performance Mario Mendez-Lojo Donald Nguyen.

Galois Performance

Mario Mendez-Lojo, Donald Nguyen

Overview

• Galois system is a test bed to explore optimizations
  – Safe but not fast out of the box

• Important optimizations
  – Select least transactional overhead
  – Select right scheduling
  – Select appropriate data structure

• Quantify optimizations on applications

Algorithms

[Figure: classification of irregular algorithms along three axes: topology (general graph), operator (local computation, reader), and ordering (unordered, ordered)]

1. Barnes-Hut

2. Delaunay Mesh Refinement

3. Preflow-push

Methodology

[Figure: per-thread (Th) timeline of a run, split into Idle, Serial, GC, and Compute time]

• Abort ratio: aborted iterations / total iterations

• GC options
  – UseParallelGC
  – UseParallelOldGC
  – NewRatio=1

• Base
  – Default scheduling, default graph

• Serial
  – Galois classes replaced by classes with no concurrency control

• Speedup
  – Relative to the best mean performance of a serial variant

• Throughput
  – Number of serial iterations / time
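The metric definitions above can be written down as a small sketch. The class and method names are hypothetical, chosen only for illustration; they are not part of Galois. The speedup example uses the Barnes-Hut serial and parallel times reported in this deck, and the abort counts are made up to exercise the formula.

```java
// Hypothetical helper; names are illustrative and not part of Galois.
final class RunMetrics {
    private RunMetrics() {}

    /** Abort ratio = aborted iterations / total iterations. */
    static double abortRatio(long aborted, long total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return (double) aborted / total;
    }

    /** Speedup = best serial time / parallel time. */
    static double speedup(double bestSerialMs, double parallelMs) {
        return bestSerialMs / parallelMs;
    }

    /** Throughput = number of serial iterations / elapsed time. */
    static double throughput(long serialIterations, double elapsedMs) {
        return serialIterations / elapsedMs;
    }

    public static void main(String[] args) {
        // Barnes-Hut times from this deck: 10271 ms serial, 1553 ms parallel.
        System.out.println("speedup = " + speedup(10271, 1553));
        // Made-up counts, only to show the abort-ratio formula.
        System.out.println("abort ratio = " + abortRatio(25, 100));
    }
}
```

Dividing by the best serial variant, rather than by the parallel code run on one thread, keeps the speedup from being inflated by the overhead of the parallel runtime.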

Numbers

• Runtime
  – Last of 5 runs in same VM
  – Ignore time to read and construct initial graph

• Other statistics
  – Last of 5 runs

Test Environment

• 2 x Xeon X5570 (4 core, 2.93 GHz)
• Java 1.6.0_0-b11
• Linux 2.6.24-27 x86_64
• 20GB heap size

BARNES-HUT

Most Distant Galaxy Candidates in the Hubble Ultra Deep Field

Barnes-Hut

• N-body algorithm
  – Oct-tree acceleration structure
  – Serial
    • Tree build, center of mass, particle update
  – Parallel
    • Force computation

• Structure
  – Reader on tree

• Variants
  – Splash2, Reader Galois

Reader Optimization

child = octree.getNeighbor(nn, 1);

child = octree.getNeighbor(nn, 1, MethodFlag.NONE);

ParaMeter Profile

Barnes-Hut Results

100,000 points, 1 time step

Best serial: base
Serial time: 10271 ms
Best parallel time: 1553 ms
Best speedup: 6.6X

Barnes-Hut Scalability

DELAUNAY MESH REFINEMENT

Delaunay Mesh Refinement

• Refine “bad” triangles
  – Maintained in worklist

• Structure
  – Cautious operator on graph

• Variants
  – Flag optimized, locallifo

base: Priority.defaultOrder()

local lifo: Priority.first(ChunkedFIFO.class).thenLocally(LIFO.class)

Cautious Optimization

mesh.contains(item);
...
mesh.remove(preNodes.get(i));
...
mesh.add(node);

mesh.contains(item, MethodFlag.CHECK_CONFLICT);
...
mesh.remove(preNodes.get(i), MethodFlag.NONE);
...
mesh.add(node, MethodFlag.NONE);

• No need to save undo info
• Only check conflicts up to first write
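A toy model can make the saving concrete. A cautious operator reads its whole neighborhood before its first write, so once that write is reached it can no longer conflict or roll back. The sketch below is not the Galois implementation: ToyFlag, ToyMesh and CautiousDemo are hypothetical stand-ins that only count how much bookkeeping each flag choice would trigger. It uses Java 8 lambdas for brevity.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical flags: ALL = check conflicts and save undo info,
// CHECK_CONFLICT = check conflicts only, NONE = no bookkeeping.
enum ToyFlag { ALL, CHECK_CONFLICT, NONE }

// Hypothetical flag-aware set that counts the bookkeeping it performs.
class ToyMesh {
    private final Set<String> items = new HashSet<>();
    private final Deque<Runnable> undoLog = new ArrayDeque<>();
    int conflictChecks = 0;

    private void check(ToyFlag flag) {
        if (flag != ToyFlag.NONE) {
            conflictChecks++; // stands in for acquiring an abstract lock
        }
    }

    boolean contains(String item, ToyFlag flag) {
        check(flag);
        return items.contains(item);
    }

    void add(String item, ToyFlag flag) {
        check(flag);
        if (items.add(item) && flag == ToyFlag.ALL) {
            undoLog.push(() -> items.remove(item)); // inverse action for rollback
        }
    }

    void remove(String item, ToyFlag flag) {
        check(flag);
        if (items.remove(item) && flag == ToyFlag.ALL) {
            undoLog.push(() -> items.add(item)); // inverse action for rollback
        }
    }

    int undoEntries() {
        return undoLog.size();
    }
}

class CautiousDemo {
    // Returns {conflict checks, undo entries} for one contains/remove/add sequence.
    static int[] run(boolean optimized) {
        ToyFlag read = optimized ? ToyFlag.CHECK_CONFLICT : ToyFlag.ALL;
        ToyFlag write = optimized ? ToyFlag.NONE : ToyFlag.ALL;
        ToyMesh mesh = new ToyMesh();
        mesh.add("t0", ToyFlag.NONE); // setup, not counted
        mesh.contains("t0", read);
        mesh.remove("t0", write);
        mesh.add("t1", write);
        return new int[] { mesh.conflictChecks, mesh.undoEntries() };
    }

    public static void main(String[] args) {
        int[] base = run(false);
        int[] opt = run(true);
        System.out.println("default:   checks=" + base[0] + " undo=" + base[1]);
        System.out.println("optimized: checks=" + opt[0] + " undo=" + opt[1]);
    }
}
```

In this toy run the default flags perform three conflict checks and log two undo actions, while the optimized flags perform one check and log nothing.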

LIFO Optimization

GaloisRuntime.foreach(..., Priority.defaultOrder());

GaloisRuntime.foreach(..., Priority.first(ChunkedFIFO.class).thenLocally(LIFO.class));
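The effect of a "FIFO first, then locally LIFO" order can be sketched with a single-threaded simulation. This is an assumption-laden illustration, not the Galois scheduler: it ignores chunking and concurrency, and LocalLifoSketch and its run method are hypothetical names. Initial items are taken in FIFO order, but any work the operator creates goes onto a local stack and is processed before the next initial item.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;

// Hypothetical single-threaded sketch of "global FIFO, then locally LIFO".
class LocalLifoSketch {
    static <T> List<T> run(List<T> initial, Function<T, List<T>> operator) {
        Deque<T> global = new ArrayDeque<>(initial); // initial work, FIFO
        Deque<T> local = new ArrayDeque<>();         // newly created work, LIFO
        List<T> order = new ArrayList<>();
        while (!global.isEmpty() || !local.isEmpty()) {
            // Prefer local work; fall back to the global queue when it is empty.
            T item = local.isEmpty() ? global.pollFirst() : local.pop();
            order.add(item);
            for (T created : operator.apply(item)) {
                local.push(created);
            }
        }
        return order;
    }

    // Toy operator: item 1 creates 10 and 11, item 2 creates 20.
    static List<Integer> toyOperator(Integer x) {
        if (x == 1) return Arrays.asList(10, 11);
        if (x == 2) return Arrays.asList(20);
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(1, 2), LocalLifoSketch::toyOperator));
    }
}
```

With the toy operator, the processing order is 1, 11, 10, 2, 20: work created by item 1 is finished before item 2 is started. For mesh refinement, a plausible reading is that this keeps a thread working on triangles near the cavity it just modified.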

ParaMeter Profile

DMR Results

0.5M triangles, 0.25M bad triangles

Best serial: locallifo.flagopt
Serial time: 17002 ms
Best parallel time: 3745 ms
Best speedup: 4.5X

PREFLOW-PUSH

Preflow-push

• Max-flow algorithm
  – Nodes push flow downhill

• Structure
  – Cautious, local computation

• Variants
  – Flag optimized, local computation graph

base (discharge): Priority.first(Bucketed.class, numHeight+1, false, indexer).then(FIFO.class)

base (relabel): Priority.first(ChunkedFIFO.class, 8)

Local Computation Optimization

graph = ...

graph = ...
b = new LocalComputationGraph.ObjectGraphBuilder();
graph = b.from(graph).create();

ParaMeter Profile

Preflow-push Results

From challenge problem (genmf-wide)
14 linearly connected grids (194x194), 526,904 nodes, 2,586,020 edges
http://avglab.com/andrew/CATS/maxflow_synthetic.htm

C: 11450 ms
Java: 30234 ms

Best serial: lc.flagopt
Serial time: 57121 ms
Best parallel time: 18242 ms
Best speedup: 3.1X

Preflow-push Scalability

What performance did we expect?

[Figure: per-thread (Th) timeline split into Idle, Serial, GC, parallel (//) Compute, and Miss-Speculation time. Annotation: "Measured Indirectly: Synchronization, …"]

What performance did we expect?

• Naïve: r(x) = t_1 / x

• Amdahl: r(x) = t_p / x + t_s
  t_1 = t_p + t_s
  t_s = t_{idle} + t_{gc} + t_{serial}

• Simple: r(x) = (t_p (i_x / i_1)) / x + t_s
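The three models can be compared side by side with a short sketch. RuntimeModels is a hypothetical helper, not part of Galois. The slide does not define i_x and i_1; the sketch assumes they are the iteration counts executed with x threads and with one thread, so that i_x / i_1 scales the parallel time by the extra work caused by aborted iterations. The t_p / t_s split in main is invented for illustration; only the 10271 ms Barnes-Hut serial time comes from this deck.

```java
// Hypothetical helper mirroring the three runtime models; not part of Galois.
final class RuntimeModels {
    private RuntimeModels() {}

    /** Naive model: all of the one-thread time t1 scales perfectly. */
    static double naive(double t1, int x) {
        return t1 / x;
    }

    /** Amdahl model: only the parallel part tp scales; ts stays fixed. */
    static double amdahl(double tp, double ts, int x) {
        return tp / x + ts;
    }

    /**
     * Simple model: like Amdahl, but the parallel part grows by ix / i1
     * (assumed here to be iterations at x threads over iterations at 1 thread).
     */
    static double simple(double tp, double ts, double ix, double i1, int x) {
        return (tp * (ix / i1)) / x + ts;
    }

    public static void main(String[] args) {
        int x = 8; // assumes the 8 cores of the test machine
        // Barnes-Hut serial time from this deck.
        System.out.println("naive  = " + naive(10271, x) + " ms");
        // Invented split and iteration counts, for illustration only.
        System.out.println("amdahl = " + amdahl(8000, 2000, x) + " ms");
        System.out.println("simple = " + simple(8000, 2000, 1200, 1000, x) + " ms");
    }
}
```

With i_x = i_1 (no extra iterations) the simple model reduces to Amdahl, and with t_s = 0 Amdahl reduces to the naive model, so each model only ever adds predicted time to the previous one.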

Barnes-Hut

Delaunay Mesh Refinement

Preflow-push

Summary

• Many profitable optimizations
  – Selecting among method flags, worklists, graph variants

• Open topics
  – Automation
  – Static, dynamic and performance analysis
  – Efficient ordered algorithms
