Operated by Los Alamos National Security, LLC for DOE/NNSA

DC Reviewed by Kei Davis

SKA – Static Kernel Analysis using LLVM IR

Kartik Ramkrishnan and Ben Bergen

Applied Computer Science (CCS-7)

Los Alamos National Laboratory

SKA – Static Kernel Analyzer

SKA is a tool for improving the development process.

Performs static, architecture-aware analysis of kernels.

Outputs code metrics during the development process.

Visualizes the code execution on the specified pipeline.

What is SKA

Slide 2

SKA-Enhanced Development Cycle

Slide 3

define i32 @main(i32 %argc, i8** nocapture %argv) nounwind uwtable readnone {
entry:
  %a1 = alloca [32 x float], align 4
  %b2 = alloca [32 x float], align 4
  %c3 = alloca [32 x float], align 4
  br label %"3"

"3":                                              ; preds = %"3", %entry
  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %"3" ]
  %0 = getelementptr [32 x float]* %a1, i64 0, i64 %indvars.iv
  %1 = load float* %0, align 4
  %2 = getelementptr [32 x float]* %b2, i64 0, i64 %indvars.iv
  %3 = load float* %2, align 4
  %4 = getelementptr [32 x float]* %c3, i64 0, i64 %indvars.iv
  %5 = load float* %4, align 4
  %6 = fmul float %3, %5
  %7 = fadd float %1, %6
  store float %7, float* %4, align 4
  %indvars.iv.next = add i64 %indvars.iv, 1
  %lftr.wideiv = trunc i64 %indvars.iv.next to i32
  %exitcond = icmp eq i32 %lftr.wideiv, 32
  br i1 %exitcond, label %"5", label %"3"

"5":                                              ; preds = %"3"
  ret i32 0
}

Example kernel – saxpy.ll

Slide 4

LLVM IR is in SSA (static single assignment) form, which assumes an unlimited number of virtual registers.

ISAs (instruction set architectures) have a limited number of registers.

We improve SKA’s fidelity by allocating registers to the IR based on the target ISA.

Register allocation support for SKA

Slide 5

Simple register allocation algorithm.

Register Allocation algorithm

Slide 6

Build Liveness Tables

Slide 7

SKA takes an LLVM IR module as input and builds a liveness table.

Build Liveness Tables

Slide 8

Partial liveness table for saxpy.ll

Build Liveness Tables

Slide 9

Top level loop

Single BB liveness calculation

Populate liveness table
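
The single-basic-block liveness calculation named on this slide can be sketched as a backward scan over the block. The following is a minimal Python sketch, not SKA's implementation: the `(dest, uses)` instruction encoding and the name `block_liveness` are assumptions made for illustration, and constants are left out of the operand lists.

```python
def block_liveness(instrs, live_out=frozenset()):
    """Return, for each instruction, the set of values live just before it.

    instrs: list of (dest, uses) pairs; dest is None for instructions
    that define no value (for example store or br).
    """
    live = set(live_out)
    table = [None] * len(instrs)
    # Scan backward: a value is live before an instruction if it is used
    # there, or if it is live afterwards and not defined there.
    for i in range(len(instrs) - 1, -1, -1):
        dest, uses = instrs[i]
        live.discard(dest)
        live.update(uses)
        table[i] = frozenset(live)
    return table

# Part of the saxpy.ll loop body, in the hypothetical (dest, uses) encoding.
body = [
    ("%1", ["%0"]),          # %1 = load float* %0
    ("%3", ["%2"]),          # %3 = load float* %2
    ("%5", ["%4"]),          # %5 = load float* %4
    ("%6", ["%3", "%5"]),    # %6 = fmul float %3, %5
    ("%7", ["%1", "%6"]),    # %7 = fadd float %1, %6
    (None, ["%7", "%4"]),    # store float %7, float* %4
]
live_in = block_liveness(body)
```

Each entry of `live_in` corresponds to one row of a liveness table. A top-level loop over all basic blocks would call this once per block and pass in the values that are live on exit from that block.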

Build Interference Graph

Slide 10

Traverse the liveness table to create the interference graph.

Build Interference Graph

Slide 11

Partial igraph for saxpy.ll

Build Interference Graph

Slide 12

Top level loop

Populate igraph
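
Traversing a liveness table to build an interference graph can be sketched as follows. This is a simplified Python illustration under stated assumptions: the table is taken to be a list of live sets (one per program point), two values interfere when they appear in the same set, and `build_interference_graph` is a hypothetical name, not a function from SKA.

```python
from itertools import combinations

def build_interference_graph(live_sets):
    """Map each value to the set of values it is simultaneously live with."""
    graph = {}
    for live in live_sets:
        # Make sure every live value has a node, even if it has no neighbours.
        for v in live:
            graph.setdefault(v, set())
        # Any two values live at the same point need different registers.
        for a, b in combinations(sorted(live), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

# Live sets for part of the saxpy.ll loop body (assumed, for illustration).
live_sets = [
    {"%0", "%2", "%4"},
    {"%1", "%2", "%4"},
    {"%1", "%3", "%4"},
    {"%1", "%3", "%4", "%5"},
    {"%1", "%4", "%6"},
    {"%4", "%7"},
]
igraph = build_interference_graph(live_sets)
```

The address `%4` stays live across the whole body, so it ends up adjacent to every other value, while `%7` only interferes with `%4`.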

Simplify Interference Graph

Slide 13

Populate a stack that records, for each register (node), whether it is simple or not.

Simplify Interference Graph

Slide 14

Partial node stack for saxpy.ll

Simplify Interference Graph

Slide 15

Populate simple node stack
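
The slides do not define "simple". In a classic Chaitin-style allocator, a node is simple when it has fewer than k neighbours, where k is the number of available registers; the sketch below assumes that definition. The function name `simplify`, the `(node, is_simple)` stack entries, and the highest-degree choice for non-simple nodes are all assumptions for illustration.

```python
def simplify(graph, k):
    """Return a stack of (node, is_simple) pairs for a k-register target."""
    # Work on a copy so the caller's graph is left intact.
    work = {n: set(adj) for n, adj in graph.items()}
    stack = []
    while work:
        # A node is "simple" if it has fewer than k remaining neighbours.
        simple = [n for n in sorted(work) if len(work[n]) < k]
        if simple:
            node, is_simple = simple[0], True
        else:
            # No simple node left: push the highest-degree node as a
            # potential spill and keep going.
            node = max(sorted(work), key=lambda n: len(work[n]))
            is_simple = False
        # Remove the node and its edges from the working graph.
        for nb in work.pop(node):
            work[nb].discard(node)
        stack.append((node, is_simple))
    return stack

# Triangle a-b-c plus a pendant node d attached to a.
g = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
stack = simplify(g, k=2)
```

With two registers the triangle cannot be fully simplified, so one of its nodes is pushed as not simple. With three registers every node is simple.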

Assign ISA Registers to IR

Slide 16

Assign ISA Registers to IR

Slide 17

Assign ISA registers to IR values when there is no true spill. Each value gets a register from one of three classes: integer, floating-point, or vector.

Partial register allocation for saxpy.ll

Assign ISA registers to IR

Slide 18

Assign register if no true spill
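
The assignment step can be sketched as popping nodes off the stack and giving each one a register that none of its already-assigned neighbours holds. This is an illustrative Python sketch: the register names, the `pools` and `reg_class` dictionaries, and the function `assign_registers` are assumptions, and only the integer and float classes are shown.

```python
def assign_registers(graph, stack, reg_class, pools):
    """Pop nodes off the stack and give each a register from its class pool.

    Returns (assignment, spilled). A node is a true spill when every
    register in its pool is already held by an interfering neighbour.
    """
    assignment, spilled = {}, []
    for node, _is_simple in reversed(stack):
        taken = {assignment[nb] for nb in graph[node] if nb in assignment}
        free = [r for r in pools[reg_class[node]] if r not in taken]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)
    return assignment, spilled

# Hypothetical register pools per class.
pools = {"int": ["r0", "r1"], "float": ["f0", "f1"]}
graph = {"%i": {"%p"}, "%p": {"%i"}, "%x": {"%y"}, "%y": {"%x"}}
reg_class = {"%i": "int", "%p": "int", "%x": "float", "%y": "float"}
stack = [("%i", True), ("%p", True), ("%x", True), ("%y", True)]
assignment, spilled = assign_registers(graph, stack, reg_class, pools)
```

A node pushed as not simple may still receive a register here. It becomes a true spill only if no register in its class is free at the moment it is popped.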

Rewrite IR

Slide 19

The live range of %a1 is shown in red. It shrinks after the IR is rewritten.

Rewrite IR

Slide 20

Rewrite IR

Slide 21

Store instruction into stack

Load, use and store
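
One common way to rewrite code for a spilled value is to store it to a stack slot right after its definition and reload it into a fresh temporary before each use. The sketch below illustrates that pattern in Python. The `(dest, op, operands)` encoding, the `.slot` and `.r1` naming, and the function `rewrite_spill` are hypothetical and are not taken from SKA.

```python
def rewrite_spill(instrs, spilled):
    """Rewrite (dest, op, operands) instructions so `spilled` lives in memory."""
    slot = spilled + ".slot"          # hypothetical stack-slot name
    out, counter = [], 0
    for dest, op, operands in instrs:
        if spilled in operands:
            # Reload into a fresh short-lived temporary right before the use.
            counter += 1
            tmp = f"{spilled}.r{counter}"
            out.append((tmp, "load", [slot]))
            operands = [tmp if o == spilled else o for o in operands]
        out.append((dest, op, operands))
        if dest == spilled:
            # Store the value to its stack slot right after it is defined.
            out.append((None, "store", [spilled, slot]))
    return out

code = [
    ("%1", "load", ["%0"]),
    ("%6", "fmul", ["%3", "%5"]),
    ("%7", "fadd", ["%1", "%6"]),
]
rewritten = rewrite_spill(code, "%1")
```

After the rewrite, `%1` is live only between its definition and the following store, and each reload is live only until the instruction that uses it. This is the live-range shrinking effect that the rewrite step relies on.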

Register allocation done!

Slide 22

Specified in an XML file.

Specifies logical units, the instructions they process, latencies, issue width …

Virtual architecture specification

Slide 23

Partial architecture example
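
The architecture example from this slide did not survive in the transcript. To give a feel for what such a specification might contain, here is a purely hypothetical XML layout with a short Python loader. The element and attribute names (`architecture`, `unit`, `instruction`, `issue_width`, `latency`) are invented for illustration and should not be read as SKA's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema, for illustration only; not SKA's actual file format.
SPEC = """
<architecture name="toy" issue_width="2">
  <unit name="alu">
    <instruction opcode="add" latency="1"/>
    <instruction opcode="icmp" latency="1"/>
  </unit>
  <unit name="fpu">
    <instruction opcode="fadd" latency="3"/>
    <instruction opcode="fmul" latency="5"/>
  </unit>
</architecture>
"""

def load_spec(text):
    """Return (issue_width, {opcode: (unit_name, latency)})."""
    root = ET.fromstring(text)
    table = {}
    for unit in root.findall("unit"):
        for ins in unit.findall("instruction"):
            table[ins.get("opcode")] = (unit.get("name"), int(ins.get("latency")))
    return int(root.get("issue_width")), table

issue_width, op_table = load_spec(SPEC)
```

The loader flattens the file into a lookup from opcode to the logical unit that executes it and its latency, which is the kind of information a pipeline model needs.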

Pipeline simulation

Slide 24

Pipeline simulation of saxpy.ll
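
The slides do not describe the simulator's internals. As a rough idea of what a static pipeline model computes, the sketch below implements a very simple in-order issue model in Python: an instruction issues once its operands are ready and an issue slot is free. The latencies, the instruction encoding, and the function `simulate` are assumptions, not SKA's model.

```python
def simulate(instrs, latency, issue_width=1):
    """In-order issue model: returns (issue cycle per instruction, total cycles)."""
    ready = {}            # value name -> cycle at which its result is available
    issue_cycles = []
    cycle, issued_this_cycle = 0, 0
    for dest, op, operands in instrs:
        # Earliest cycle at which every operand has been produced.
        earliest = max([ready.get(o, 0) for o in operands], default=0)
        if earliest > cycle:
            cycle, issued_this_cycle = earliest, 0      # stall on a dependence
        elif issued_this_cycle == issue_width:
            cycle, issued_this_cycle = cycle + 1, 0     # issue slots exhausted
        issue_cycles.append(cycle)
        issued_this_cycle += 1
        if dest is not None:
            ready[dest] = cycle + latency[op]
    finish = max(c + latency[op] for c, (_, op, _) in zip(issue_cycles, instrs))
    return issue_cycles, finish

# Assumed latencies in cycles, for illustration only.
latency = {"load": 2, "fmul": 4, "fadd": 3, "store": 1}
body = [
    ("%1", "load", ["%0"]),
    ("%3", "load", ["%2"]),
    ("%5", "load", ["%4"]),
    ("%6", "fmul", ["%3", "%5"]),
    ("%7", "fadd", ["%1", "%6"]),
    (None, "store", ["%7", "%4"]),
]
issue_cycles, total_cycles = simulate(body, latency, issue_width=1)
```

The gaps between consecutive issue cycles are the stall cycles caused by data dependences, which is the information a pipeline visualization would display.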

Skaview

Slide 25

Graphical visualization of saxpy.ll

SKA outputs useful metrics about the code.

Primitive statistics include basic performance counters, such as instructions, cycles and stalls.

Derived statistics are obtained from primitive statistics.

Code metrics

Slide 26
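
As an illustration of how derived statistics follow from primitive ones, the sketch below computes cycles per instruction (CPI) together with two other ratios. CPI is the metric used in the result slides. The instructions-per-cycle and stall-fraction ratios, and the function name `derived_stats`, are assumptions added here for illustration.

```python
def derived_stats(instructions, cycles, stalls):
    """Compute ratios from the primitive counters."""
    return {
        "cpi": cycles / instructions,        # cycles per instruction
        "ipc": instructions / cycles,        # instructions per cycle
        "stall_fraction": stalls / cycles,   # share of cycles spent stalled
    }

# Made-up counter values, not measurements from the slides.
stats = derived_stats(instructions=6, cycles=12, stalls=6)
```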

CPI prediction is more accurate after register allocation.

Results for residual.ll

Slide 27

No change in CPI prediction. Why?

Results for ef_operator.ll

Slide 28

SKA predicts CPI > 1.0 on KNC for single-threaded workloads.

Results for KNC (Knights Corner)

Slide 29

SKA now supports register allocation.

Register allocation improves SKA’s fidelity by 5-10% across three architectures for a compute-intensive benchmark.

Dynamic scheduling and cache models can further improve SKA fidelity.

Conclusion

Slide 30

Questions?

Thank you!

Slide 31