Transcript of libperf

libperf
libperf provides a tracing interface into the Linux Kernel Performance Counters (LKPC) subsystem recently introduced into the Linux kernel mainline. The interface offers a unified API that abstracts hardware performance counters, kernel trace points, and software-defined trace points. The kernel maintains the counters, keeping statistics per thread and per core. Every counter is a “virtual” 64-bit integer accessed through a special file descriptor that libperf obtains from the kernel.
Features and Highlights
• System Call Wrapper Library
• First API for LKPC
• First User Space Library Interfacing with LKPC
• Simple C API – 2 Calls Required by Default
• Efficient Kernel Implementation
• Low Overhead
• Feasible for Dynamic Feedback
• Preparing for Open Source GPLv2 Release
Code Example
…
/* start of tracing */
struct perf_data* pd = libperf_initialize(-1, -1);
…
/* do work */
libperf_finalize(pd, UUID);
…
/* end of tracing */
Performance Overhead
• Evaluated Using sysbench
• 10 Runs Averaged on an Intel Centrino 2
• Overhead Significant for Threading (Context Switching)
• Worst Case: 3.63 %
• Average Case: 3.25 %
• Best Case: 2.87 %
LightSpeed: Thread Scheduling for Multiple Cores
Karl Naden ([email protected]), Wolfgang Richter ([email protected]), Ekaterina Taralova ([email protected])
Introduction
Parallel applications have a hard time exploiting the specifics of the underlying hardware. Operating systems know the hardware better, but lack application-specific knowledge. Solutions that cut across the stack, from software down to hardware, may offer compelling paths in the future.
Approach
Give the application layer more control over scheduling tasks, and deliver detailed information about hardware performance so that applications can make informed decisions based on their own knowledge.
Questions:
1. How could statistics about the underlying architecture’s performance be delivered efficiently to applications?
2. How could applications take advantage of this additional information?
Target Workload
Overview
• Parallel algorithm framework for Machine Learning
• Tailored to iterative algorithms on graph data structures
GraphLab Key Components and Inputs
Why GraphLab?
• Existing parallel scheduling problem
• Specific problem formulation (graphs)
• Significant variation in algorithms gives potential for generality
References
• Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin. GraphLab: A New Framework for Parallel Machine Learning. Conference on Uncertainty in Artificial Intelligence (UAI), 2010
[Figure: GraphLab key components: Data Graph, Update Functions, Scheduling, Consistency Model, Shared Data Table]