050405 Epa Info System


http://capitawiki.wustl.edu/index.php/20050421_Data_Flow_and_Flow_Control_in_AQ_Management

Transcript of 050405 Epa Info System

Page 1: 050405 Epa Info System

Data Flow and Flow Control in AQ Management

Provider Push User Pull

Data are supplied by the provider and exposed on the ‘smorgasbord’

However, the choice of data and processes is made by the user. Thus, the autonomous data consumers, providers and mediators form the info system

Flow of Data

Flow of Control

AQ DATA

METEOROLOGY

EMISSIONS DATA

Informing Public

AQ Compliance

Status and Trends

Network Assess.

Tracking Progress

Data to Knowledge ‘Refinery’

The data ‘refining’ process is not a chain but a network connecting processing nodes. Like on the Internet, new nodes and connections are added continuously. Thus, the infosystem needs to support the dynamic addition of new nodes and connections

Hence – there is a need for loosely coupled ‘plug-and-play’ architecture
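
A minimal sketch of what "plug-and-play" could mean in code, assuming each processing node is simply a function from one dictionary to another. The `Refinery` class, the node names and the threshold value 35.0 are hypothetical illustrations, not part of the original slides. The point shown is that a new consumer node can be attached to a running network without touching the existing nodes.

```python
from typing import Callable, Dict, List

# A processing node takes a dict of data and returns a new dict (assumed shape).
Node = Callable[[dict], dict]


class Refinery:
    """Hypothetical registry of loosely coupled nodes and their connections."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.links: Dict[str, List[str]] = {}

    def add_node(self, name: str, fn: Node) -> None:
        # Nodes can be added at any time; existing nodes are not modified.
        self.nodes[name] = fn
        self.links.setdefault(name, [])

    def connect(self, src: str, dst: str) -> None:
        self.links[src].append(dst)

    def run(self, start: str, data: dict) -> Dict[str, dict]:
        # Push data from `start` through all downstream nodes.
        # Assumes the network has no cycles.
        out = self.nodes[start](data)
        results = {start: out}
        for nxt in self.links[start]:
            results.update(self.run(nxt, out))
        return results


refinery = Refinery()
refinery.add_node("aq_data", lambda d: {**d, "pm25": [12.0, 40.0, 8.0]})
# 35.0 is an illustrative threshold only, not a statement about any standard.
refinery.add_node("compliance",
                  lambda d: {"exceedances": sum(v > 35.0 for v in d["pm25"])})
refinery.connect("aq_data", "compliance")
first_run = refinery.run("aq_data", {})

# Later, a new consumer is plugged in without changing the nodes above.
refinery.add_node("public_info", lambda d: {"max_pm25": max(d["pm25"])})
refinery.connect("aq_data", "public_info")
second_run = refinery.run("aq_data", {})
```

The provider node ("aq_data") does not know which consumers exist, which mirrors the provider-push, user-pull split described on this slide.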

Page 2: 050405 Epa Info System

Next Process

Next Process

Why? How?

When? Where?

CATT: A Community Tool!

Part of an Analysis Value Chain

Aerosol Data

Collection IMP. EPA

Aerosol Sensors

Integration VIEWS

Integrated AerData

AEROSOL

Weather Data

Assimilate NWS

Gridded Meteor.

Trajectory ARL

Traject.Data

TRANSPORT

TrajData Cube

Aggreg. Traject.

AerData Cube

CATT

Aggreg.Aerosol

CATT-In CAPITA

CATT-In CAPITA

There!

Not There! Further Analysis

GIS

Grid Processing

Emission

Comparison

Page 3: 050405 Epa Info System

Weather Serv.

Upper Air Data

NOAA ARL

ATAD ATAD Traject

Gebhart (2002)

NPS-CIRA

IMPROVE Data

PMF Tool

Pareto (2001) PMF “Sources”

Coutant (2002)

CATT Tool

Husar (2003)

Aggregation

Poirot (2003)

Direction of Dust Origin at 5 IMPROVE Sites

Ad hoc Data Processing Value Chain

High ‘dust’ concentrations at 5 sites indicate the same airmass pathway from the tropical Atlantic

Page 4: 050405 Epa Info System

Background

• Atmospheric aerosol system has three extra dimensions (red), compared to gases (blue):

– Spatial dimensions (X, Y, Z)
– Temporal dimension (T)
– Particle size (D)
– Particle composition (C)
– Particle shape (S)

• Bad news: The mere characterization of the 7D aerosol system is a challenge

– Spatially dense network: X, Y, Z(??)
– Continuous monitoring (T)
– Size-segregated sampling (D)
– Speciated analysis (C)
– Shape (??)

• Good news: The aerosol system is self-describing.

– Once the aerosol is characterized (speciated monitoring) and multidimensional aerosol data are organized (see RPO VIEWS effort), unique opportunities exist for extracting information about the aerosol system (sources, transformations) from the data directly.

• Analyst's challenge: Deciphering the handwriting contained in the data

– Chemical fingerprinting/source apportionment
– Meteorological back-trajectory analysis
– Dynamic modeling

Page 5: 050405 Epa Info System

SeaWiFS Satellite

SeaWiFS Satellite

Aerosol Chemical

Air Trajectory

Map Border

VIEW by Web Service Composition

Page 6: 050405 Epa Info System

The Researcher’s Challenge

“The researcher cannot get access to the data; if he can, he cannot read them; if he can read them, he does not know how good they are; and if he finds them good he cannot merge them with other data.”

Information Technology and the Conduct of Research: The User's View, National Academy Press, 1989

These resistances can be overcome through

• A catalog of distributed data resources for easy data ‘discovery’

• Uniform data coding and formatting for easy access, transfer and merging

• Rich and flexible metadata structure to encode the knowledge about data

• Powerful shared tools to access, merge and analyze the data
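
The first and third remedies above (a catalog for discovery, and metadata that encodes knowledge about the data) can be sketched as follows. This is only an illustration under assumptions: the `DatasetRecord` fields, the dataset names, the year ranges and the `example.org` URLs are all hypothetical placeholders, not real resources.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DatasetRecord:
    """Uniform metadata for one distributed data resource (fields are assumed)."""
    name: str
    provider: str
    parameter: str
    first_year: int
    last_year: int
    access_url: str


class Catalog:
    """Hypothetical catalog supporting simple data 'discovery' queries."""

    def __init__(self) -> None:
        self._records: List[DatasetRecord] = []

    def register(self, record: DatasetRecord) -> None:
        self._records.append(record)

    def find(self, parameter: Optional[str] = None,
             year: Optional[int] = None) -> List[DatasetRecord]:
        # Each filter is applied only if the caller supplied it.
        hits = self._records
        if parameter is not None:
            hits = [r for r in hits if r.parameter == parameter]
        if year is not None:
            hits = [r for r in hits if r.first_year <= year <= r.last_year]
        return hits


catalog = Catalog()
# All entries below are made-up placeholders.
catalog.register(DatasetRecord("surface_pm", "provider_a", "PM2.5",
                               1999, 2004, "http://example.org/surface_pm"))
catalog.register(DatasetRecord("satellite_aot", "provider_b", "AOT",
                               2000, 2004, "http://example.org/satellite_aot"))
catalog.register(DatasetRecord("old_pm", "provider_c", "PM2.5",
                               1990, 1998, "http://example.org/old_pm"))

pm_2003 = [r.name for r in catalog.find(parameter="PM2.5", year=2003)]
all_pm = [r.name for r in catalog.find(parameter="PM2.5")]
```

Because every record uses the same fields, one query works across providers, which is the practical payoff of uniform coding.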

Page 7: 050405 Epa Info System

Petabytes 10^15, Terabytes 10^12, Gigabytes 10^9, Megabytes 10^6

Calibration, Transformation To Characterized

Geophysical Parameters

Filtering, Aggregation, Fusion, Modeling,

Trends, Forecasting

Interactive Dissemination

ACCESS

Multi-platform/parameter, high space/time resolution,

remote & in-situ sensing

Sensing Analysis & Synthesis

Earth Science Data to Knowledge Transformation: Value-Adding Processes

Data Acquisition Value Chain (Network)

InfoSystem Goal: Add as much value to the data as possible to benefit all users

Data Usage Value Network

Flexible data selection and processing to deliver the right knowledge to the right place at the right time

Data - L1 Information – L2 Knowledge – L3-6? Usable Knowledge

Query

Data

Distributed, Dynamic

More Local, DAAC

Processing Knowledge Use

Page 8: 050405 Epa Info System

Assertions on Web Services Technology

• Currently Web Services are the leading (and only?) technologies for building software applications in autonomous, networked, dynamic environments

• The future is promising since businesses are driving the WS technologies and the community is benefiting from the increasingly ‘semantic web’

• A growing resource pool is exposed as ‘services’ and WS-based ES applications development frameworks are being developed/evaluated (e.g. SciFlo, DataFed)

WS Adaptation Issues

• Catalogs for finding and using services are grossly inadequate

• The semantic layers of the interoperability stack are not yet available

• General ‘fallacies of distributed computing’:

– Network is reliable

– Latency is zero

– Bandwidth infinite

– Network is secure

– Topology stable

– One administrator

– No transport costs

– Network uniform
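
The first fallacy in the list above ("network is reliable") is the easiest to illustrate. A minimal sketch, assuming a service call that may raise `ConnectionError`: the client retries with exponentially growing delays instead of assuming success. The helper `call_with_retry` and the `flaky_service` stand-in are hypothetical and replace any real network call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], attempts: int = 3,
                    base_delay: float = 0.01,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on ConnectionError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for i in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if i == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** i)  # wait longer after each failure
    raise AssertionError("unreachable")


# Stand-in for a remote service that fails twice before answering.
calls = {"count": 0}


def flaky_service() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


delays = []
result = call_with_retry(flaky_service, attempts=5, sleep=delays.append)
```

Passing `sleep=delays.append` records the waits instead of pausing, which keeps the example fast; a real client would leave the default `time.sleep`.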

Page 9: 050405 Epa Info System

Interoperability Stack

Layer | Description | Standards
Semantics | Meaning | WSDL ext., Policy, RDF
Data | Types | Schema, WSDL
Protocol | Communication behavior | SOAP, WS-* ext.
Syntax | Data format | XML
Transport | Addressing, Data flow | HTTP, SMTP

Kickoff Questions

• What is a Web Service?

– e.g. 'A programming module with a well-defined, web-based I/O interface' (operating on well structured data??)
– Examples of what is/is not a WS

• WS Classification by Interoperability Layer

– Transport
– Interface Syntax: strongly typed interface (e.g. SOAP, WSDL) or weakly typed interface (e.g. arbitrary CGI? URL interface)
– Protocol/Data
– Semantics

• WS Classification by Architecture

– Services for Tightly Coupled applications (e.g. URL service called from IDL)
– Services for Loosely Coupled applications (e.g. application composed from SOAP services)
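
The strongly typed versus weakly typed distinction above can be illustrated without any SOAP tooling. The sketch below is an analogy only, not SOAP or WSDL: `weak_request` accepts arbitrary parameters, as an ad hoc URL interface would, while `PointQuery` declares and checks its inputs before building the same kind of URL. The endpoint `http://example.org/service` and the parameter names are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from urllib.parse import parse_qs, urlencode, urlsplit


def weak_request(base_url: str, **params) -> str:
    """Weakly typed: any key and value is accepted and passed through."""
    return f"{base_url}?{urlencode(params)}"


@dataclass(frozen=True)
class PointQuery:
    """Declared interface: fixed fields, checked before a URL is built."""
    parameter: str
    lat: float
    lon: float

    def __post_init__(self) -> None:
        if not -90 <= self.lat <= 90:
            raise ValueError("lat must be within [-90, 90]")
        if not -180 <= self.lon <= 180:
            raise ValueError("lon must be within [-180, 180]")

    def to_url(self, base_url: str) -> str:
        return f"{base_url}?{urlencode(asdict(self))}"


BASE = "http://example.org/service"  # placeholder endpoint

# The weak interface happily builds a request with a misspelled key
# and an impossible latitude.
weak_url = weak_request(BASE, paramter="PM25", lat=938.6)

# The declared interface rejects the bad value up front.
typed_url = PointQuery("PM25", 38.6, -90.2).to_url(BASE)
typed_params = parse_qs(urlsplit(typed_url).query)
```

With the weak interface, errors surface only when the remote service responds (if at all); with a declared interface they can be caught by the caller or by generated client code.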

Page 10: 050405 Epa Info System

Data Flow and Flow Control in AQ Management

Relationship between different information activities

States

Regions

AIRS AQS

EPA Air Portal

EPA Science Portal

VIEWS

AIRNOW

Page 11: 050405 Epa Info System

Information Technology Vision Scenario: Smoke Impact

REASoN Project: Application of NASA ESE Data and Tools to Particulate Air Quality Management

• Scenario: Smoke from Mexico causes record PM over the Eastern US.

• Goal: Detect smoke emission and predict PM and ozone concentration; support air quality management and transportation safety

• Impacts: PM and ozone air quality episodes, AQ standard exceedance; transportation safety risks due to reduced visibility

• Timeline: Routine satellite monitoring of fire and smoke; the smoke event triggers intensified sensing and analysis; the event is documented for science and management use

• Science/Air Quality Information Needs: Quantitative real-time fire & smoke emission monitoring; PM, ozone forecast (3-5 days) based on smoke emissions data

• Information Technology Needs: Real-time access to routine and ad-hoc data and models; analysis tools: browsing, fusion, data/model integration; delivery of science-based event summary/forecast to air quality and aviation safety managers and to the public

Record Smoke Impact on PM Concentrations


Smoke Event

Page 12: 050405 Epa Info System

Smoke Scenario: IT needs and Capabilities

Table columns: IT need vision; Current state; New capabilities; How to get there

Need 1

– IT need vision: Real-time access to routine and ad-hoc fire, smoke, transport data and models
– Current state: Human analysts access a fraction of a subset of qualitative satellite images and some surface monitoring data. Limited real-time datasets are downloaded from providers, extracted, geo-time-param-coded, etc. by each analyst
– New capabilities: Agents (services) to seamlessly access distributed data and provide uniformly presented views of the smoke
– How to get there: Web services for data registration, geo-time-parameter referencing, non-intrusive addition of ad hoc data; communal tools for data finding, extracting

Need 2

– IT need vision: Analysis tools for data browsing, fusion and data/model integration
– Current state: Most tools are personal, dataset specific and ‘hand made’
– New capabilities: Tools for navigating spatio-temporal data; user-defined views of the smoke; conceptual framework for merging satellite, surface and modeling data
– How to get there: Services linking tools; service chaining languages for building web applications; data browsers, data processing chains

Need 3

– IT need vision: Smoke event summary and forecast for managers (air quality, aviation safety) and the public
– Current state: Uncoordinated event monitoring, serendipitous and limited analysis. Event summary by qualitative description and illustration
– New capabilities: Smoke event summary and forecast suitably packaged and delivered for agency and public decision makers
– How to get there: Community interaction during events through virtual workgroup sites; quantitative now-casting and observation-augmented forecasting

Page 13: 050405 Epa Info System

Data Analysis and Decision Support

Table columns: Retrospective Analysis (months-years); Now Analysis (days); Predictive Analysis (days-years)

Data Sources & Types

– Retrospective: All the real-time data, plus NPS IMPROVE Aer. Chem.; EPA Speciation; EPA PM10/PM2.5; EPA CMAQ Full Chem. Model
– Now: EPA PM2.5 Mass; NWS ASOS Visibility, WEBCAMs; NASA MODIS, GOES, TOMS, MPL; NOAA Fire, Weather & Wind; NAAPS MODEL Simulation
– Predictive: NAAPS MODEL Forecast; NOAA/EPA CMAQ?

Data Analysis Tools & Methods

– Retrospective: Full chemical model simulation; Diagnostic & inverse modeling; Chemical source apportionment; Multiple event statistics
– Now: Spatio-temporal overlays; Multi-sensory data integration; Back & forward trajectories, CATT; Pattern analysis
– Predictive: Emission and met. forecasts; Full chemical model; Data assimilation; Parcel tagging, tracking

Communication, Collab. & Coord. Methods

– Retrospective: Tech Reports for reg. support; Peer reviewed scientific papers; Science-AQ mgmt. interaction; Reconciliation of perspectives
– Now: Analyst and managers consoles; Open, inclusive communication; Data assimilation methods; Community data & idea sharing
– Predictive: Open, public forecasts; Model-data comparison; Modeler-data analyst comm.

Analysis Products

– Retrospective: Quantitative natural aer. conc.; Natural source attribution; Comparison to manmade aer.
– Now: Current Aerosol Pattern; Evolving Event Summary; Causality (dust, smoke, sulfate)
– Predictive: Future natural emissions; Simulated conc. pattern; Future location of high conc.

Decision Support

– Retrospective: Jurisdiction: nat./manmade; State Implementation Plans (SIP); PM/Haze Crit. Documents, Regs
– Now: Jurisdiction: nat./manmade; Triggers for management action; Public information & decisions
– Predictive: Statutory & policy changes; Management action triggers; Progress tracking

Page 14: 050405 Epa Info System

Data Acquisition and Usage Value Chain

Monitor StoreData 1

Monitor StoreData 2

Monitor StoreData n

Monitor StoreData m

IntData1

IntDatan

IntData2 Virtual Int. Data
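
One way to read "Virtual Int. Data" in the diagram above is as a view that leaves each store in place and merges records only when queried. The sketch below assumes that reading. The `VirtualIntegratedData` class, the store names and the values are hypothetical, and each store is represented by a plain function standing in for a remote data service.

```python
from typing import Callable, Dict

# A store is modeled as a function: site name -> {time step: value}.
Store = Callable[[str], Dict[int, float]]


class VirtualIntegratedData:
    """Hypothetical view that merges several stores at query time."""

    def __init__(self, stores: Dict[str, Store]) -> None:
        self.stores = stores

    def add_store(self, name: str, fetch: Store) -> None:
        # New monitors/stores can be attached without copying any data.
        self.stores[name] = fetch

    def query(self, site: str) -> Dict[int, Dict[str, float]]:
        merged: Dict[int, Dict[str, float]] = {}
        for name, fetch in self.stores.items():
            for t, value in fetch(site).items():
                merged.setdefault(t, {})[name] = value
        # Return records ordered by time step.
        return dict(sorted(merged.items()))


# Made-up values, for illustration only.
view = VirtualIntegratedData({
    "aerosol": lambda site: {1: 10.0, 2: 12.5},
    "wind": lambda site: {2: 3.0, 3: 4.0},
})
records = view.query("site_x")
```

Nothing is stored in the view itself, so each provider keeps control of its own data while users still see one merged record per time step.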

Page 15: 050405 Epa Info System

Information ‘Refinery’ Value Chain (Taylor, 1985)

Data → Information → Informing Knowledge → Productive Knowledge → Action

Organizing

Grouping Classifying Formatting Displaying

Analyzing

Separating, Evaluating, Interpreting

Synthesizing

Judging

Options Quality

Advantages Disadvantages

Deciding

Matching goals, Compromising

Bargaining Deciding

e.g. CIRA VIEWS

e.g. Langley IDEA

FASTNET Summary Rpt

e.g. RPO Manager