Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems

Student: Hunter Ingle

Original Paper

Jonathan Mace, Ryan Roelke, and Rodrigo Fonseca. “Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems.” In ACM Transactions on Computer Systems, Vol. 35, Issue 4, pp. 11:1-11:28. December 2018.

Distributed Systems

• Several machines (nodes) working together to perform a certain task

• Great for large-scale data processing and parallel processing
• However, some problems and issues exist
§ Data encryption and transmission
§ Fault detection
o Monitoring and troubleshooting functionality

Main Problem

• Monitoring and troubleshooting distributed systems is both hard and time-consuming
§ Hardware and software failures
§ Misconfigurations across the system
§ Unrealistic expectations
• Current tools:
§ Logs, counters, and metrics
• Limitations:
§ What gets recorded is fixed a priori, at deployment
o May not always contain necessary information
§ Captured per component or per machine
o Difficult to correlate between them

Solution: Pivot Tracing

• Combines dynamic instrumentation with causal tracing
• Provides metrics at any one point of a system
• Selects, filters, and groups those metrics by events at other points of the system
• Allows crossing component and machine boundaries, as mentioned before

Four Contributions

• “Happened-before join”
• Query optimization for the join at runtime by combining dynamic instrumentation and causal tracing
• Prototype implementation of Pivot Tracing
§ Applied to the Hadoop distributed system framework
• Evaluation based on diagnosing problems at runtime

Overview

• Requirements in a system:
§ Dynamic code injection
§ Causal metadata propagation
• Based on tracepoints, where Pivot Tracing (PT) can insert instrumentation
§ Instructions based on the location and methods needed for changing the system

Overview Cont.

• Queries (1) are issued against the distributed system, whose events are treated as a dataset (2), and use a vocabulary defined by tracepoints

• Queries are compiled into advice (3), an instruction set that processes queries

• Advice is mapped to code that PT injects at tracepoints (4). Each time execution reaches a tracepoint, the installed advice is invoked as well.

Overview Cont.

• The happened-before join relies on advice at one tracepoint passing information along the execution path to advice at other tracepoints. This uses a causal metadata propagation mechanism known as baggage (5).

• Advice can also emit tuples (6), which are aggregated and sent to the client via a message bus (7) and (8)

Design

• Tracepoints
§ Act as an entry point to the system for PT queries
§ Based on some event
o Requests, I/O operation completion, etc.
§ Only reference the locations for entry points, so not defined or limited by a priori modifications
§ Instrumentation is only compiled and installed at runtime, whenever a query is sent
§ When a request reaches the tracepoint, instrumentation at that point is called, exporting some variables needed for a tuple
o Host, timestamp, process ID, process name, and tracepoint definition
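As a rough illustration of the exported variables listed above, the following Python sketch builds a tuple from the default exports plus any local variables visible at the tracepoint. The field names, the `default_exports` and `observe` helpers, and the `DataTransferProtocol.readBlock` label are assumptions made for this example, not the API of Pivot Tracing.

```python
import os
import socket
import time
from multiprocessing import current_process

def default_exports(definition: str) -> dict:
    """Variables every tracepoint exports in this sketch (field names are assumed)."""
    return {
        "host": socket.gethostname(),
        "timestamp": time.time(),
        "procid": os.getpid(),
        "procname": current_process().name,
        "tracepoint": definition,
    }

def observe(definition: str, **local_vars) -> dict:
    """Build one tuple: the default exports plus variables local to the tracepoint."""
    event = default_exports(definition)
    event.update(local_vars)
    return event

# Hypothetical tracepoint that also exports the number of bytes read.
event = observe("DataTransferProtocol.readBlock", delta=4096)
```

A query could then select, filter, or group on any of these fields, for example summing `delta` per `host`.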

Tracepoint Example

Design – Happened-Before Joins

• Allows tuples from different PT queries to be joined based on the “happened before” relation.
§ If a and b are events on the same process and a happens first, then a -> b (and b -> a if b happens first)
o Same if the event a was the cause of event b
§ Based on Lamport’s study on timing in systems (see references slide)
• The join is based on queries and their effects on events
• The happened-before join of two queries Q1 and Q2 yields the tuples t1 and t2, with t1 in Q1 and t2 in Q2, for which t1 happened before t2 (t1->t2)

• Provides insight into the relationships between events being monitored in a system
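The join semantics above can be sketched in a few lines of Python. This is a naive evaluation that assumes the direct causal edges between events are already known and stored in a dictionary; the event names, tuple fields, and the `happened_before`/`hb_join` helpers are hypothetical and serve only to show which tuple pairs the join keeps.

```python
from itertools import product

def happened_before(edges: dict, a: str, b: str) -> bool:
    """True if b is reachable from a over one or more direct-causality edges."""
    stack, seen = [a], set()
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt == b:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def hb_join(q1: list, q2: list, edges: dict) -> list:
    """Keep every pair (t1, t2) with t1 from Q1, t2 from Q2, and t1 -> t2."""
    return [(t1, t2) for t1, t2 in product(q1, q2)
            if happened_before(edges, t1["event"], t2["event"])]

# Two concurrent requests: event c1 causes d1, event c2 causes d2.
edges = {"c1": ["d1"], "c2": ["d2"]}
q1 = [{"event": "c1", "client": "A"}, {"event": "c2", "client": "B"}]
q2 = [{"event": "d1", "bytes": 100}, {"event": "d2", "bytes": 250}]

joined = hb_join(q1, q2, edges)
```

The result pairs client A with the 100-byte read and client B with the 250-byte read. The cross pairs are dropped because neither event happened before the other, which is what lets a query attribute downstream work to the request that caused it.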

Design – Advice

• Intermediate representation of PT queries
• Determines the operations to be performed at tracepoints and provides monitoring code to be installed at those points
• Based on an advice API that can

Design – Advice Cont.

Optimization - Baggage

• Optimizes happened-before joins during request execution
• Container for an instance of a tuple

§ Propagated with a request through thread, application, and machine boundaries

§ Also observes the happened-before events of the request
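A minimal sketch of the baggage idea, assuming tuples are serialized as JSON and carried in a request header: the `Baggage` class, its `pack`/`unpack` methods, and the two functions standing in for a client and a DataNode are hypothetical names for illustration, not the Pivot Tracing API.

```python
import json

class Baggage:
    """Per-request container of tuples that travels with the request (sketch)."""

    def __init__(self, tuples=None):
        self._tuples = list(tuples or [])

    def pack(self, **fields) -> None:
        """Store a tuple so that advice later in the request can see it."""
        self._tuples.append(fields)

    def unpack(self) -> list:
        """Return the tuples packed earlier on this request's execution path."""
        return list(self._tuples)

    def serialize(self) -> bytes:
        """Encode for crossing a process or machine boundary."""
        return json.dumps(self._tuples).encode("utf-8")

    @classmethod
    def deserialize(cls, data: bytes) -> "Baggage":
        return cls(json.loads(data.decode("utf-8")))

def client_send(procname: str) -> bytes:
    # Advice at a client-side tracepoint packs the client's process name.
    bag = Baggage()
    bag.pack(procname=procname)
    return bag.serialize()            # carried in the request header

def datanode_receive(header: bytes, bytes_read: int) -> list:
    # Advice at a DataNode-side tracepoint unpacks and joins with a local value.
    bag = Baggage.deserialize(header)
    return [dict(t, delta=bytes_read) for t in bag.unpack()]

rows = datanode_receive(client_send("client-A"), 4096)
```

Because the baggage only ever contains tuples from events earlier on the same request path, every unpacked tuple already satisfies the happened-before relation, so the join needs no global lookup.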

Attached to Hadoop

• The previously mentioned design was applied to multiple parts of the Hadoop framework
§ HBase – non-relational database that runs atop HDFS
• Extended Hadoop functionality and protocols to allow baggage and tracepoints
§ Tracepoints were implemented in
o DataNode’s DataTransferProtocol
o NameNode’s ClientProtocol

Evaluation

• Experiment led to finding a bug in HDFS that caused uneven load distribution for replicated data
§ Cluster of 8 DataNodes and 1 NameNode on HDFS
§ Used 96 stress test clients; with a replication factor of 3, two hosts showed high load levels and the rest almost zero load
§ Replica Selection Bug (HDFS-6268) has since been fixed in a subsequent Hadoop release

Evaluation Cont.

• Based on experiments with the PT-infused Hadoop framework, results show:
§ It is dynamic and extensible
§ It is scalable with low overhead
§ It allows cross-boundary analysis
§ It uses event causality for diagnosis of errors
§ It provides analysis even with minimal tracepoints
• However, it is not meant to replace all functionality of logs
§ Security auditing
§ Forensics
§ Debugging

Conclusion

• Pivot Tracing is the first monitoring system of its kind
§ i.e. combining dynamic instrumentation and causal tracing
§ Provided a happened-before join to boost efficiency of both
• Low overhead for cross-boundary analysis
§ Extremely effective method for error diagnosis within distributed systems

References

• Leslie Lamport. 1978. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (1978), 558–565.
