Scheduling in Staged-DB Systems

Page 1: Scheduling in Staged-DB Systems

Staged-DB IC-65 Advances in Data Management Systems 1

Scheduling in Staged-DB Systems

Nicolas Bonvin, Rammohan Narendula, and Surender Reddy Yerva

Page 2: Scheduling in Staged-DB Systems

Organization

• What is Staged-DB?

• Scheduling in Staged-DB

• Our Contribution

– Scheduling in Execution Phase

– System Modeling

• System Design Details

• Performance Study

• Future Work

Page 3: Scheduling in Staged-DB Systems

Motivation

Response time: time needed to produce the first page as output

Big advantage for the overlapping case ('1')

Page 4: Scheduling in Staged-DB Systems

Query Lifetime in DBMS

[Figure: Query → PARSER → (query tree) → OPTIMIZER → (query plan) → EXECUTION → Answer. Other labels: data, catalogs and statistics, operators.]

EXECUTION (disk I/O): 90% OF TIME

Page 5: Scheduling in Staged-DB Systems

DB Paradigm So Far..

• Query → Query Execution Plan (tree of operators)

• Multiple queries

– Each query is handled by a DIFFERENT THREAD

• No cross-communication or sharing across threads

• Sharing opportunity is missed

[Figure: conventional DBMS with a thread pool and no coordination between threads. Caption: one query, multiple operators.]

Page 6: Scheduling in Staged-DB Systems

Staged-DB Paradigm

• DB is remodeled as various stages

• Stage

– “Common execution logic” grouped into a stage

– Each operator in QEP can be seen as a stage

• A query passes through all the stages it needs to produce its output

• Common data needs are detected by the stage

[Figure: StagedDB with a thread pool. Caption: one operator, multiple queries.]

Page 7: Scheduling in Staged-DB Systems

Staged Database Systems

• DB → stages; execution stage → microEngines

• Each stage has a queue, and each microEngine also has a request queue.

[Figure: a conventional DBMS (queries) next to StagedDB, where queries pass through Stage 1, Stage 2, and Stage 3. Labels: high concurrency; locality across requests.]
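The "one operator, multiple queries" idea can be illustrated with a small Python sketch. It is not taken from the slides: the class `MicroEngine` and its methods `submit` and `run_batch` are hypothetical names, and grouping requests by page is only one assumed way a microEngine could exploit locality across requests.

```python
from collections import deque


class MicroEngine:
    """Hypothetical stand-in for one execution-stage microEngine (e.g. SCAN)."""

    def __init__(self, name):
        self.name = name
        self.requests = deque()  # the microEngine's own request queue

    def submit(self, query_id, page_id):
        """Queue a request from a query that needs this operator."""
        self.requests.append((query_id, page_id))

    def run_batch(self):
        """Drain the queue, grouping requests that touch the same page."""
        by_page = {}
        while self.requests:
            query_id, page_id = self.requests.popleft()
            by_page.setdefault(page_id, []).append(query_id)
        return by_page


scan = MicroEngine("SCAN")
scan.submit("q1", 7)
scan.submit("q2", 7)  # same page as q1: one fetch could serve both queries
scan.submit("q3", 9)
served = scan.run_batch()
```

Because all requests for an operator sit in one queue, the engine can see that `q1` and `q2` need the same page, which a thread-per-query design would not notice.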

Page 8: Scheduling in Staged-DB Systems

Scheduling In Staged-DB

• Scheduling at different levels

– Stages (Parser, Optimizer, Execution)

– Across microEngines (the execution engine has SCAN, JOIN, etc. microEngines)

– Within a microEngine

• We consider only scheduling “across microEngines”

• Scheduling policies:

– Round-Robin

– Heavy Load First

– Light Load First
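The three policies named above could look roughly like the following Python sketch. The slides do not define "load", so this sketch assumes it means the number of packets waiting in each microEngine's queue; the function names and the `queue_lengths` dictionary are hypothetical.

```python
from itertools import cycle


def make_round_robin(engine_names):
    """Return a picker that visits engines in a fixed cyclic order."""
    order = cycle(engine_names)

    def pick(queue_lengths):
        # Skip engines with empty queues; give up after one full lap.
        for _ in range(len(engine_names)):
            name = next(order)
            if queue_lengths.get(name, 0) > 0:
                return name
        return None

    return pick


def heavy_load_first(queue_lengths):
    """Pick the non-idle engine with the longest queue (assumed meaning of load)."""
    busy = {name: n for name, n in queue_lengths.items() if n > 0}
    return max(busy, key=busy.get) if busy else None


def light_load_first(queue_lengths):
    """Pick the non-idle engine with the shortest queue."""
    busy = {name: n for name, n in queue_lengths.items() if n > 0}
    return min(busy, key=busy.get) if busy else None


loads = {"SCAN": 3, "JOIN": 1, "SORT": 0}
round_robin = make_round_robin(["SCAN", "JOIN", "SORT"])
rr_picks = [round_robin(loads) for _ in range(3)]
heavy_pick = heavy_load_first(loads)
light_pick = light_load_first(loads)
```

Each picker takes the same input, so a global scheduler could swap policies without changing the engines.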

Page 9: Scheduling in Staged-DB Systems

Detailed System Design

• Based on the discrete event simulation technique

• All computation, data needs, and dependencies are modeled using events

• System components

– Global System Queue

– Dispatcher

– Operator (or) Engine

– Global Scheduler

– Main Memory

– Overlap Detector

Page 10: Scheduling in Staged-DB Systems

Global System Queue

[Figure: the global system queue and the components and events around it: QueryArrival, Dispatcher, Scheduler, Disk-Fetch, EngineInsert, EngineExec-Begin, EngineExec-End, Memory.]

Event fields: eventId, componentId, functionId, firingTime, packet
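A discrete event simulation of this kind is commonly built around a priority queue ordered by firing time. The Python sketch below is an illustration under that assumption, not the authors' implementation: the event field names come from the slide, while `GlobalSystemQueue.schedule`, `run`, the handler table, and the `"System"` and `"dispatch"` identifiers are hypothetical.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count
from typing import Any

_ids = count(1)


@dataclass(order=True)
class Event:
    firingTime: float  # events are ordered by firing time, then by eventId
    eventId: int = field(default_factory=lambda: next(_ids))
    componentId: str = field(default="", compare=False)
    functionId: str = field(default="", compare=False)
    packet: Any = field(default=None, compare=False)


class GlobalSystemQueue:
    def __init__(self):
        self._heap = []
        self.now = 0.0  # current simulated time

    def schedule(self, firing_time, component_id, function_id, packet=None):
        event = Event(firing_time, componentId=component_id,
                      functionId=function_id, packet=packet)
        heapq.heappush(self._heap, event)
        return event

    def run(self, handlers):
        """Pop events in firing-time order and call the matching handler."""
        trace = []
        while self._heap:
            event = heapq.heappop(self._heap)
            self.now = event.firingTime
            trace.append((event.firingTime, event.componentId, event.functionId))
            handler = handlers.get((event.componentId, event.functionId))
            if handler:
                handler(self, event)
        return trace


def on_query_arrival(queue, event):
    # The one-time-unit dispatch delay is an arbitrary choice for this sketch.
    queue.schedule(queue.now + 1.0, "Dispatcher", "dispatch", event.packet)


sim = GlobalSystemQueue()
sim.schedule(5.0, "Engine", "EngineInsert", {"queryId": 2})
sim.schedule(0.0, "System", "QueryArrival", {"queryId": 1})
trace = sim.run({("System", "QueryArrival"): on_query_arrival})
```

Handlers may schedule further events, which is how dependencies between components can be expressed purely through the queue.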

Page 11: Scheduling in Staged-DB Systems

Engine

Events: EngineInsert, EngineExecution Begin, EngineExecution End

Input Packet Queue

Packet format: queryId list, queryPlans, pageId, contextInfo

Processing steps shown in the figure:

• Request packet arrives from parent node/dispatcher

• Call overlap detector

• Insert packet

• Pick packet from queue

• Send packet to child OR execute and produce output

• Insert event into event queue for the scheduler
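The insert and pick steps of an engine might be sketched in Python as follows. The slides do not say how overlap is decided, so this sketch assumes the simplest rule: two packets overlap when they ask for the same `pageId`, and the new packet's queries are merged into the queued one. The class and method names are hypothetical; only the packet fields follow the slide.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    queryIds: list  # the "queryId list" field
    pageId: int
    queryPlans: dict = field(default_factory=dict)
    contextInfo: dict = field(default_factory=dict)


class Engine:
    def __init__(self):
        self.input_queue = deque()  # input packet queue

    def insert(self, packet):
        """Merge with a queued packet for the same page, else enqueue."""
        for queued in self.input_queue:
            if queued.pageId == packet.pageId:  # assumed overlap test
                queued.queryIds.extend(
                    q for q in packet.queryIds if q not in queued.queryIds
                )
                queued.queryPlans.update(packet.queryPlans)
                return queued
        self.input_queue.append(packet)
        return packet

    def pick(self):
        """Pick the first packet in the queue (FIFO), or None if it is empty."""
        return self.input_queue.popleft() if self.input_queue else None


engine = Engine()
engine.insert(Packet(["q1"], 7))
engine.insert(Packet(["q2"], 7))  # overlaps with q1's packet on page 7
engine.insert(Packet(["q3"], 9))
first = engine.pick()
```

After the merge, a single packet carries both `q1` and `q2`, so the work for page 7 would be done once for both queries.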

Page 12: Scheduling in Staged-DB Systems

Engines

• Join

• Sort

• Aggregation

• Scan

• Wait and Scan

• Index Scan

Page 13: Scheduling in Staged-DB Systems

Overlap detection

• With memory

• With input queue

• Two types

– Linear

– Spike

Page 14: Scheduling in Staged-DB Systems

Memory Manager

• Pinning and unpinning

• Put()

• pageExists()

• consumePage()
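One possible reading of this interface is sketched below in Python. The slide gives only the operation names, so the semantics here are assumptions: `put` (written lowercase here) pins a page once per query that still has to read it, `consumePage` removes one pin and drops the page when none remain, and `peak_pages` is a hypothetical counter for the largest number of pages held at once.

```python
class MemoryManager:
    def __init__(self):
        self._pins = {}      # pageId -> remaining pin count
        self.peak_pages = 0  # largest number of resident pages seen so far

    def put(self, page_id, consumers=1):
        """Store a page, pinned once per query that still has to consume it."""
        self._pins[page_id] = self._pins.get(page_id, 0) + consumers
        self.peak_pages = max(self.peak_pages, len(self._pins))

    def pageExists(self, page_id):
        """Report whether the page is currently held in memory."""
        return page_id in self._pins

    def consumePage(self, page_id):
        """Unpin once; drop the page when no consumer is left."""
        if page_id not in self._pins:
            raise KeyError(f"page {page_id} is not in memory")
        self._pins[page_id] -= 1
        if self._pins[page_id] == 0:
            del self._pins[page_id]


mem = MemoryManager()
mem.put(7, consumers=2)  # two overlapping queries will read page 7
mem.put(9)
mem.consumePage(7)
still_there = mem.pageExists(7)  # one consumer has not read the page yet
mem.consumePage(7)
gone = not mem.pageExists(7)
```

Under these assumptions a shared page stays in memory until its last consumer has read it, so pages can be held longer when queries overlap.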

Page 15: Scheduling in Staged-DB Systems

Performance study

• 5 queries

• 5 runs

• Uniform arrival rate

Page 16: Scheduling in Staged-DB Systems

Effect of Overlapping

Response time: time needed to produce the first page as output

Big advantage for the overlapping case ('1')

Page 17: Scheduling in Staged-DB Systems

Effect of Overlapping

Memory consumption: maximum number of pages consumed in memory during the lifetime of the query

Higher memory consumption with overlapping!

Page 18: Scheduling in Staged-DB Systems

Effect of Overlapping

Throughput: number of queries completed per unit of time

Clear advantage with overlap detection!

Page 19: Scheduling in Staged-DB Systems

Comparing scheduling policies

Mean response time

Round Robin seems to perform a little better

Page 20: Scheduling in Staged-DB Systems

Comparing scheduling policies

Memory consumption

No differences!

Page 21: Scheduling in Staged-DB Systems

Future Work

• A few more interesting global scheduling policies are possible.

• The system does not yet have a local scheduling policy for choosing which packet in the input packet queue to process next. At the moment it picks the first packet in the queue.

• Regarding implementation, experiments should be run with more engines and benchmark-style input queries.