Scientific Data Management Research Group National Energy Research Scientific Computing Center, L B...



Scientific Data Management Research Group
National Energy Research Scientific Computing Center, LBNL

Henrik Nordberg, June 1998

Query Estimator

Henrik Nordberg

[email protected]

Lawrence Berkeley National Laboratory


Purpose of the Query Estimator

Primary purpose:

• Provide estimates of how “big” a query is
• Execute a query

Also needs to handle:

• Indexing of large amounts of data
• Simultaneous access


Query Estimator Functions

• Build bit-sliced index, and tag index

• Accept multiple asynchronous query requests

• Quick query estimate - use bitmap in-memory index

• Full query estimate & execute - use tag index

• Invoke Query Monitor for execution

• Act on “query abort” and “query done”


Estimation

• Quick Estimate (uses bit-sliced index; index in memory)
    • no_of_events: (min, max) -- nearest bin boundaries
    • no_of_files to be cached
    • %_of_events_in_files that qualify for a query (max)

• Full Estimate - Execute (uses tag index; index on disk)
    • precise list_of_events that qualify
    • set of (file: event_list)
    • total_MBs_to_be_moved
    • no_of_events_in_cache (also files_in_cache)
    • time_to_process_query
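The (min, max) event counts of the quick estimate can be illustrated with a small sketch. It assumes a binned bitmap index: one bitmap per bin of each property, with bit i set when event i falls in that bin. That layout is an assumption for illustration, not a detail given on the slides, and the names `build_binned_index`, `range_bitmaps` and `quick_estimate`, the sample properties, and the bin boundaries are all hypothetical. Bins lying entirely inside a query range contribute to the lower bound, while bins that merely overlap it contribute to the upper bound, so the bounds can only change at bin boundaries.

```python
from bisect import bisect_right


def build_binned_index(values, boundaries):
    """Return one bitmap (a Python int) per bin [boundaries[k], boundaries[k+1])."""
    bitmaps = [0] * (len(boundaries) - 1)
    for event_id, value in enumerate(values):
        b = bisect_right(boundaries, value) - 1
        b = min(max(b, 0), len(bitmaps) - 1)  # clamp out-of-range values to the edge bins
        bitmaps[b] |= 1 << event_id
    return bitmaps


def range_bitmaps(bitmaps, boundaries, lo, hi):
    """Bitmaps of events surely inside (inner) and possibly inside (outer) lo <= v < hi."""
    inner = outer = 0
    for k, bitmap in enumerate(bitmaps):
        left, right = boundaries[k], boundaries[k + 1]
        if left < hi and right > lo:          # bin overlaps the query range
            outer |= bitmap
            if lo <= left and right <= hi:    # bin lies entirely inside the range
                inner |= bitmap
    return inner, outer


def quick_estimate(index, query, n_events):
    """Return (min, max) number of events matching a conjunction of range conditions."""
    inner_all = outer_all = (1 << n_events) - 1
    for prop, (lo, hi) in query.items():
        bitmaps, boundaries = index[prop]
        inner, outer = range_bitmaps(bitmaps, boundaries, lo, hi)
        inner_all &= inner
        outer_all &= outer
    return bin(inner_all).count("1"), bin(outer_all).count("1")


# Hypothetical sample data: two properties for six events.
energy = [1.0, 2.5, 3.5, 4.2, 7.9, 9.0]
multiplicity = [10, 20, 30, 40, 50, 60]
energy_bounds = [0, 2, 4, 6, 8, 10]
mult_bounds = [0, 25, 50, 75]

index = {
    "energy": (build_binned_index(energy, energy_bounds), energy_bounds),
    "multiplicity": (build_binned_index(multiplicity, mult_bounds), mult_bounds),
}

estimate = quick_estimate(
    index, {"energy": (3, 8), "multiplicity": (25, 75)}, n_events=len(energy)
)
print(estimate)  # prints (2, 3)
```

In this example three events actually qualify, which falls inside the estimated range of 2 to 3. Resolving the events in the partially covered bins is the kind of work the slides assign to the full estimate.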


MDC-1 RAM Requirements

Assumptions:

• 2 × 100,000 events
• 20 properties (tags)
• 2 bytes per property

8 MB per index = 16 MB

• 2 × 10,000 average number of hits
• 100 concurrent queries
• 8 bytes per query

16 MB for queries

Total = 16 + 16 + 10 (code) = 42 MB
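The arithmetic behind these figures can be checked with a short sketch. It rests on three assumptions that the slide does not state outright: the leading 2 in the event and hit counts is a multiplier, megabytes are decimal (1 MB = 1,000,000 bytes), and "8 MB per index = 16 MB" means two indexes are held in memory. The function name `mdc1_ram_budget_mb` is hypothetical.

```python
MB = 1_000_000  # assumption: decimal megabytes, which reproduces the slide's round numbers


def mdc1_ram_budget_mb():
    """Recompute the slide's RAM budget, in MB, from its stated assumptions."""
    # Index: (2 x 100,000 events) x 20 properties x 2 bytes per property.
    per_index = (2 * 100_000) * 20 * 2 // MB
    # Assumption: two indexes, as implied by "8 MB per index = 16 MB".
    indexes = 2 * per_index
    # Queries: (2 x 10,000 hits) x 100 concurrent queries x 8 bytes.
    queries = (2 * 10_000) * 100 * 8 // MB
    code = 10  # figure taken directly from the slide
    return {
        "per_index": per_index,
        "indexes": indexes,
        "queries": queries,
        "code": code,
        "total": indexes + queries + code,
    }


budget = mdc1_ram_budget_mb()
print(budget["total"])  # prints 42
```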