Service-Oriented Local And Global Visualization with Sorting On-demand for Climate Data
Xusheng [email protected]
Huge amounts of climate simulation data are collected from different areas (e.g., cities, countries). Climate scientists keep trying to predict trends in climate variation both locally and globally. Visualization of data mining results (e.g., histograms) is used more and more frequently to get a general view before making predictions. Climate experts would like to analyze data by navigating among levels of data ranging from the most summarized (drill-up) to the most detailed (drill-down) (e.g., the drill-down shown in Figure 1).
Table 1 [1]
Solution 1: Service-Oriented Histogram [2]
Solution 2: On-demand Sorting [3]
http://csc.ncsu.edu/ NCSU Computer Science
Cache data and parameters (min, max, count) locally
Index data with break number (e.g., 0.5 is in the break [0, 1])
Check whether the data in the requested breaks are sorted or not
If sorted, transfer data directly
If data is not sorted, sort only the data in the corresponding break and mark the break as sorted
Transfer local histogram data (min, max, count) for global computation
Merge data from different sources
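The on-demand sorting steps above can be sketched roughly as follows. This is a minimal illustration, not the poster's implementation: the class name `OnDemandSorter`, its methods, and the `sort_calls` counter are hypothetical, equal-width breaks are assumed, and Python's built-in `list.sort` stands in for the Quicksort cited in [3].

```python
from typing import List, Sequence


class OnDemandSorter:
    """Hypothetical sketch: values are indexed by break number and a break
    is sorted only the first time it is requested."""

    def __init__(self, values: Sequence[float], lo: float, hi: float, breaks: int):
        # Cache the parameters (min, max, count) and the data locally.
        self.lo, self.hi, self.count = lo, hi, len(values)
        self.breaks = breaks
        # Assumption: equal-width breaks over [lo, hi].
        self.width = (hi - lo) / breaks if hi > lo else 1.0
        self.buckets: List[List[float]] = [[] for _ in range(breaks)]
        self.sorted_flags = [False] * breaks
        self.sort_calls = 0  # counts how many breaks were actually sorted
        for v in values:
            self.buckets[self.break_of(v)].append(v)

    def break_of(self, v: float) -> int:
        """Index a value with its break number (e.g., 0.5 falls in break [0, 1])."""
        idx = int((v - self.lo) / self.width)
        return min(max(idx, 0), self.breaks - 1)

    def fetch(self, first: int, last: int) -> List[float]:
        """Return the data in breaks first..last (inclusive) in sorted order."""
        out: List[float] = []
        for b in range(first, last + 1):
            if not self.sorted_flags[b]:
                # Sort only the requested break, then mark it as sorted.
                self.buckets[b].sort()
                self.sorted_flags[b] = True
                self.sort_calls += 1
            out.extend(self.buckets[b])
        return out


sorter = OnDemandSorter([0.5, 3.2, 0.1, 1.7, 1.2, 3.9], lo=0.0, hi=4.0, breaks=4)
first_request = sorter.fetch(0, 1)   # sorts breaks 0 and 1 only
second_request = sorter.fetch(0, 1)  # already sorted, transferred directly
```

Because breaks are ordered by value range, concatenating individually sorted breaks yields sorted output, so a repeated drill-down into the same interval does not pay the sorting cost again.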
Table 2
Result
Challenge
References
1. http://www.esrl.noaa.gov/psd/psd3/cruises/
2. Felix Halim, Panagiotis Karras, and Roland H.C. Yap. 2009. Fast and effective histogram construction. ACM, New York, NY, USA, 1167-1176.
3. C. A. R. Hoare. Quicksort. The Computer Journal, 5(1):10–16, January 1962.
Transferring data globally causes problems:
Time-consuming (see Table 1)
Packet loss during data transfer (see Table 1)
Frequent drill-up and drill-down navigation of data consumes computation resources (e.g., scanning the same data set multiple times; see Table 2).
Motivation
Local And Global Visualization
Compute min, max, and count locally
Transmit the local min, max, and count to compute the global min, max, and count
Each data source computes its histogram based on the global min, max, and count
Transfer only the computed histogram data, which is much smaller than the full climate data
Merge the transmitted histograms to show the global histogram
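The local-then-global steps above can be sketched as below. This is a minimal sketch under stated assumptions rather than the poster's service code: the names `LocalSummary`, `summarize`, `merge_summaries`, `local_histogram`, and `merge_histograms` are hypothetical, equal-width bins are assumed, and the two in-memory lists stand in for remote data sources.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class LocalSummary:
    minimum: float
    maximum: float
    count: int


def summarize(values: Sequence[float]) -> LocalSummary:
    """Each source computes its own min, max, and count."""
    return LocalSummary(min(values), max(values), len(values))


def merge_summaries(summaries: Sequence[LocalSummary]) -> LocalSummary:
    """Combine the transmitted local summaries into the global min, max, and count."""
    return LocalSummary(
        min(s.minimum for s in summaries),
        max(s.maximum for s in summaries),
        sum(s.count for s in summaries),
    )


def local_histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> List[int]:
    """Each source bins its own data using the global range (equal-width bins assumed)."""
    counts = [0] * bins
    width = (hi - lo) / bins if hi > lo else 1.0
    for v in values:
        # Clamp so that the global maximum lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def merge_histograms(histograms: Sequence[Sequence[int]]) -> List[int]:
    """Add the per-source bin counts to obtain the global histogram."""
    return [sum(column) for column in zip(*histograms)]


# Two hypothetical data sources; only summaries and bin counts cross the network.
sources = [[-1.0, 0.0, 0.5], [2.0, 3.0, 1.0]]
global_summary = merge_summaries([summarize(s) for s in sources])
per_source = [
    local_histogram(s, global_summary.minimum, global_summary.maximum, bins=4)
    for s in sources
]
global_hist = merge_histograms(per_source)
```

Only a three-number summary and a short list of bin counts leave each source, which is what makes the transferred data small compared with the raw data sets.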
Figure 1: Drill-down to interval [-1,1]
Here are the raw data that have already been collected in multiple domains; we can see that the latest data sets are all for year 2008.
Data Domain     Single data set size   Number of data sets   Total Size   Collecting Time In Best Case
VOCALS 2008     ~70000 KB              56                    ~3920 MB     ~10 Hrs
ASCOS 2008      ~140000 KB             25                    ~3500 MB     ~10 Hrs
AEROSE 2008     ~80000 KB              36                    ~2880 MB     ~7 Hrs
STRATUS 2007    ~70000 KB              21                    ~1470 MB     ~5 Hrs
Data Size   Run Once Histogram   Discovery Histogram Run log(n) Times   User specified 30 Times
~1500 MB    2 Mins               ~17 * 2 = 34 Mins                      60 Mins
~3000 MB    4 Mins               ~18 * 2 = 36 Mins                      120 Mins
~4500 MB    6 Mins               ~19 * 2 = 38 Mins                      180 Mins
Given the total time needed to discover meaningful or user-specified parameters for the visualization results, we need to speed up those visualization algorithms.
[Histogram panels: y-counts plotted against x-value]
Figure 2: System Framework