Organizing the Data Chaos of Scientists
PyCon UK 2008 > Andreas Schreiber > DataFinder > 12.09.2008
Slide 1
DataFinder: Organizing the Data Chaos of Scientists
PyCon UK 2008 (September 12th, 2008, Birmingham)
Andreas Schreiber
German Aerospace Center (DLR), Cologne
http://www.dlr.de/sc
Slide 2
The DLR
German Aerospace Research Center
Space Agency of the Federal Republic of Germany
Slide 3
Sites and employees
5,700 employees working in 29 research institutes and facilities at 13 sites.
Offices in Brussels, Paris and Washington.
Sites shown on the map:
Köln
Lampoldshausen
Stuttgart
Oberpfaffenhofen
Braunschweig
Göttingen
Berlin
Bonn
Trauen
Hamburg
Neustrelitz
Weilheim
Bremen
Slide 4
Short Overview
DataFinder is software for the efficient management of scientific and technical data
Focus on huge data sets
Development by DLR
Primary functionality
Structuring of data through assignment of meta information and self-defined data models
Flexible usage of heterogeneous storage resources
Integration in the working environment
Slide 5
Introduction
DataFinder founded by DLR
National Grid project AeroGrid
Slide 6
Introduction
Background
Large-scale simulations
aerodynamics
material science
climate
…
Tons of measured data
wind-tunnel experiments
earth observations
traffic data
…
Slide 7
Introduction
Data Management Problem
Typical organizational situations
No central data management policy
Every employee organizes his/her data individually
Researchers spend about 30% of their time searching for data
Problem with data left behind by temporary staff
Increase of data size and regulations
Rapidly growing volume of simulation and experimental data
Legal requirements for long-term availability of data (up to 50 years!)
Situation similar at many organizations
All ~30 DLR institutes
Other research labs and agencies
Industry
Slide 8
DataFinder History
Search for a solution for scientific data management
Definition of “standard problem” (helicopter simulation)
Test case for evaluation of software
Evaluation of commercial product data management (PDM) systems
PDM systems could manage the data, but only at very high cost
PDM systems have many unneeded functionalities
PDM systems have self-defined or unreadable scripting languages for extension and customization (Tcl etc.)
Development of DataFinder
Lightweight data management client and existing server solution
Just enough functionality for our problems (no paid but unused features!)
Slide 9
DataFinder Development
From Java Prototype to Python Product…
Development of prototype in Java
Data could be managed successfully with the prototype
Drawbacks: Java problems on important platforms (e.g., SGI IRIX)
The embedded Jython interpreter was a great feature for users
User: “The Java GUI is like shit, but the Python scripting is great. We want a pure Python solution!”
Development of DataFinder product from scratch in Python
Slide 10
Python for Scientists and Engineers
Reasons for Python in Research and Industry
Observations:
Scientists and engineers don’t want to write software; they just want to solve their problems
If they have to write code, it must be as easy as possible
Why is Python perfect?
Very easy to learn and easy to use
( = steep learning curve)
Allows rapid development
( = short development time)
Inherent great maintainability
“I want to design planes, not software!”
Slide 11
“Python has the cleanest, most scientist- or engineer-friendly syntax and semantics.”
Paul F. Dubois
Paul F. Dubois. Ten good practices in scientific programming. Computing in Science & Engineering, Jan/Feb 1999, pp. 7-11
Slide 12
DataFinder Overview
Basic Concept
Client-Server solution
Based on open and stable standards, such as XML and WebDAV
Extensive use of standard software components (open source / commercial), limited own development at client side
Slide 13
WebDAV
Web-based Distributed Authoring & Versioning
Extension of HTTP
Allows files on remote servers to be managed collaboratively
WebDAV supports
Resources (“files”)
Collections (“directories”)
Properties (“meta data”, in XML format)
Locking
WebDAV extensions
Versioning (DeltaV)
Access control (ACP)
Search (DASL)
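As an illustration of the property model listed above, the following minimal sketch builds the XML body of a WebDAV PROPFIND request and reads the properties out of a Multi-Status answer. It uses only the Python standard library. The helper names and the sample response are assumptions made for this example; they are not taken from DataFinder.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
ET.register_namespace("D", DAV_NS)


def build_propfind_body(property_names):
    """Return the XML body of a PROPFIND request for the given DAV: properties."""
    root = ET.Element("{%s}propfind" % DAV_NS)
    prop = ET.SubElement(root, "{%s}prop" % DAV_NS)
    for name in property_names:
        ET.SubElement(prop, "{%s}%s" % (DAV_NS, name))
    return ET.tostring(root, encoding="unicode")


def parse_multistatus(xml_text):
    """Map each resource href to a dict of the DAV: properties returned for it."""
    result = {}
    root = ET.fromstring(xml_text)
    for response in root.findall("{%s}response" % DAV_NS):
        href = response.findtext("{%s}href" % DAV_NS)
        properties = {}
        for prop in response.iter("{%s}prop" % DAV_NS):
            for child in prop:
                # Strip the "{namespace}" prefix that ElementTree puts on tags.
                local_name = child.tag.split("}", 1)[-1]
                properties[local_name] = (child.text or "").strip()
        result[href] = properties
    return result


# Hypothetical server answer for a single file resource.
SAMPLE_RESPONSE = """<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/projects/run1/result.dat</D:href>
    <D:propstat>
      <D:prop>
        <D:displayname>result.dat</D:displayname>
        <D:getcontentlength>2048</D:getcontentlength>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>"""

body = build_propfind_body(["displayname", "getcontentlength"])
properties = parse_multistatus(SAMPLE_RESPONSE)
print(properties["/projects/run1/result.dat"]["displayname"])
```

To talk to a real server, the body could be sent with `http.client.HTTPConnection.request`, which accepts `"PROPFIND"` as the method string; a `Depth` header selects whether only the resource or also its children are listed.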
Slide 14
DataFinder Overview
Client and Server
Client
User client
Administrator client
Implementation: Python with Qt
Server
WebDAV server for meta data and data structure
Data Store concept
Abstracts access to managed data
Flexible usage of heterogeneous storage resources
Implementation: Various existing server solutions (third-party)
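The Data Store concept can be pictured as a small interface that client code uses without knowing the storage technology behind it. The sketch below is only an illustration of that idea: the class and method names are hypothetical and do not reproduce DataFinder's own classes, and only a directory-backed store is shown.

```python
import os
import shutil
import tempfile
from abc import ABC, abstractmethod


class DataStore(ABC):
    """Common interface that hides where and how a file is physically stored."""

    @abstractmethod
    def upload(self, local_path, remote_name):
        """Copy a local file into the store under the given name."""

    @abstractmethod
    def download(self, remote_name, local_path):
        """Copy a stored file to a local path."""


class FileSystemStore(DataStore):
    """Data store backed by a plain directory."""

    def __init__(self, base_directory):
        self.base_directory = base_directory

    def upload(self, local_path, remote_name):
        shutil.copyfile(local_path, os.path.join(self.base_directory, remote_name))

    def download(self, remote_name, local_path):
        shutil.copyfile(os.path.join(self.base_directory, remote_name), local_path)


def archive(store, local_path, remote_name):
    """Client code depends only on the DataStore interface."""
    store.upload(local_path, remote_name)


# Usage: round-trip a small file through a directory-backed store.
with tempfile.TemporaryDirectory() as work_dir:
    store_dir = os.path.join(work_dir, "store")
    os.mkdir(store_dir)
    source = os.path.join(work_dir, "result.dat")
    with open(source, "w") as handle:
        handle.write("pressure=1.0\n")

    store = FileSystemStore(store_dir)
    archive(store, source, "result.dat")
    copy = os.path.join(work_dir, "copy.dat")
    store.download("result.dat", copy)
    with open(copy) as handle:
        round_trip = handle.read()
```

A store for FTP or another back end would implement the same two methods, so `archive` would not need to change.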
Slide 15
DataFinder Client
Graphical User Interfaces
User Client and Administrator Client
Implementation in Python with Qt/PyQt
Slide 16
DataFinder Server
Supported WebDAV servers
Commercial Server Solution
Tamino XML database (Software AG)
Open Source Server Solutions
Apache HTTP Web server and module mod_dav
Default storage: file system (mod_dav_fs)
Module Catacomb (mod_dav_repos) + Relational database
(http://catacomb.tigris.org)
Slide 20
Architecture diagram (figure labels only):
User Client
Logical View: Department, Employee, Simulation, Geometry, Grid Generation, Flow Solution, Visualisation
Meta Data Server
Data Access
Data Stores / Storage Locations (Mass Data Storage): WebDAV Server, FTP/GridFTP Server, Tivoli Storage Manager, Storage Resource Broker, File System, Amazon S3, External Media (CD, DVD, …)
Slide 21
DataFinder Technical Aspects
Access privilege management
Authentication using WebDAV and LDAP
Authorization for users and groups based on WebDAV (ACP)
Client available on many platforms
Linux, Windows, …
Restricted by availability of Python 2.5 and Qt 3 + PyQt
Extensible through Python scripts
Python application programming interface (API)
Accessing data and meta data
Slide 22
Python API
User Client Extension with GUI
import threading
from datafinder.application import search_support
from datafinder.gui.user import facade

def searchAndDisplayResult():
    """Searches and displays the result in the search result logging window."""
    query = "displayname contains 'test' OR displayname == 'ab'"
    result = search_support.performSearch(query)
    resultLogger = facade.getSearchResultLogger()
    for path in result.keys():
        resultLogger.info("Found item %s." % path)

# Run the search in a background thread
thread = threading.Thread(target=searchAndDisplayResult)
thread.start()
Slide 23
Python API
Command Line Example (without GUI)
# Get API
from datafinder.application import ExternalFacade
externalFacade = ExternalFacade.getInstance()

# Connect to a repository
# (username, password, startUrl and baseDirectory must be defined by the caller)
externalFacade.performBasicDatafinderSetup(username, password, startUrl)

# Download the whole content
rootItem = externalFacade.getRootWebdavServerItem()
items = externalFacade.getCollectionContents(rootItem)
for item in items:
    externalFacade.downloadFile(item, baseDirectory)
Slide 24
Additional “Batteries”…
Used Libraries beyond the Python Standard Library (1)
PyQt (http://www.riverbankcomputing.co.uk/software/pyqt)
Interface to the Qt GUI framework (currently Qt 3)
Used for DataFinder UI layer
Pyparsing (http://pyparsing.wikispaces.com/)
Creating and executing simple grammars
Used for highlighting search expressions
python-ldap (http://python-ldap.sourceforge.net/)
Object-oriented API to access LDAP servers
Authentication against LDAP / ActiveDirectory server
paramiko (http://www.lag.net/paramiko)
SSH2 protocol implementation
Slide 25
Additional “Batteries”…
Used Libraries beyond the Python Standard Library (2)
PyGlobus (http://www-itg.lbl.gov/gtg/projects/pyGlobus)
Interface to The Globus Toolkit
Used for GridFTP Data Store
Boto (http://code.google.com/p/boto)
Interfaces to Amazon Web Services
Used for S3 (Simple Storage Service) Data Store
davlib (http://www.webdav.org/mod_dav/davlib.py)
WebDAV client library
Used for core WebDAV functions
Slide 26
WebDAV Client Library
Support for DAV Extensions
Provides an object-oriented interface for accessing WebDAV servers
Extracted from DataFinder source
WebDAV client-side library supports
Core WebDAV specification
Access Control Protocol
Basic Versioning (experimental)
DAV Searching and Locating
Secure HTTP connections
Implementation based on davlib and standard httplib
Apache License Version 2
Project Site: http://sourceforge.net/projects/pythonwebdavlib
Slide 27
Simple Use Case: File Upload and Search
Slide 40
Working with DataFinder…
Slide 41
Configuration and Customization
Preparing DataFinder for certain “use cases”
Requirements Analysis
Analyze data, working environment, and users’ workflows
Configuration
Define and configure data model
Configure distributed storage resources (Data Stores)
Customization
Write functional extensions with Python scripts
Slide 42
DataFinder Configuration
Data Model and Data Stores
Logical view to data
Definition of data structuring and meta data (“data model”)
Separated storage of data structure / meta data and actual data files
Flexible use of (distributed) storage resources
File system, WebDAV, FTP, GridFTP
Amazon S3 (Simple Storage Service)
Tivoli Storage Manager (TSM)
Storage Resource Broker (SRB)
Complex search mechanism to find data
Example data model elements (figure): Department, Employee, Simulation, Geometry, Grid Generation, Flow Solution, Visualisation
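A data model of this kind can be thought of as a set of rules saying which element type may be nested inside which other type. The sketch below is a guess at how such rules might look for the element names on this slide. The nesting order is an assumption, since the slide only lists the names, and the dictionary format is not DataFinder's actual configuration format.

```python
# Assumed parent -> allowed children relations; the nesting is a guess based
# on the element names shown on the slide, not DataFinder's actual format.
DATA_MODEL = {
    "Department": ["Employee"],
    "Employee": ["Simulation"],
    "Simulation": ["Geometry", "Grid Generation", "Flow Solution", "Visualisation"],
}


def is_allowed(parent_type, child_type):
    """Return True if child_type may be created directly below parent_type."""
    return child_type in DATA_MODEL.get(parent_type, [])


def validate_path(type_path):
    """Check that each element type in the path may be nested in its predecessor."""
    return all(is_allowed(p, c) for p, c in zip(type_path, type_path[1:]))


valid = validate_path(["Department", "Employee", "Simulation", "Geometry"])
invalid = validate_path(["Department", "Simulation"])
print(valid, invalid)
```

With rules like these, a client can refuse to create a "Simulation" directly under a "Department" before anything is sent to the server.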
Slide 43
Data Structure
Mapping of Organizational Data Structures
Figure: a tree of objects linked by relations.
Objects (collections): User; Project A, Project B, Project C; Simulation I, Experiment, Simulation II
Objects (files): File 1, File 2
Attributes (meta data): each object carries a key/value table, e.g. Project = MegaCode Ultra, User = Eddie
Slide 44
Meta Data
Describe and annotate data (“files”) and collections (“directories”)
Different levels of meta data
Required attributes defined by administrator
User is free to choose additional ones
Different types of meta data
String
Numbers (float, double, …)
Lists
Pictures
Links
Stored in XML format
User can search in meta data
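The combination of required attributes, typed values and XML storage can be sketched in a few lines. This is a minimal illustration under stated assumptions: the attribute names, the `REQUIRED_ATTRIBUTES` set and the XML layout are invented for the example and are not the format DataFinder stores on the server.

```python
import xml.etree.ElementTree as ET

# Hypothetical required attribute names; a real data model would define its own.
REQUIRED_ATTRIBUTES = {"project", "user"}


def missing_attributes(meta_data):
    """Return the required attribute names that are absent, sorted by name."""
    return sorted(REQUIRED_ATTRIBUTES - set(meta_data))


def to_xml(meta_data):
    """Serialize a flat dict of attributes into a small XML document."""
    root = ET.Element("properties")
    for key in sorted(meta_data):
        value = meta_data[key]
        element = ET.SubElement(root, "property", name=key, type=type(value).__name__)
        if isinstance(value, (list, tuple)):
            # Lists become one <item> child per entry.
            for item in value:
                ET.SubElement(element, "item").text = str(item)
        else:
            element.text = str(value)
    return ET.tostring(root, encoding="unicode")


meta_data = {
    "project": "MegaCode Ultra",  # required
    "user": "Eddie",              # required
    "mach": 0.8,                  # additional attribute chosen by the user
    "tags": ["cfd", "turbine"],   # list-valued attribute
}
incomplete = missing_attributes({"user": "Eddie"})
xml_text = to_xml(meta_data)
print(incomplete)
```

A client could call `missing_attributes` before an upload and reject the file while the returned list is not empty.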
Slide 45
Impact for Users
“Damn! I’m a great scientist! I want freedom to have my own directory layout…”
DataFinder restricts the rights of users!
Enforcement of “good behavior”
Users must comply with organizational standards
Data is stored in a defined (directory) hierarchy on the data server
Required meta data must be set prior to upload
Users have certain access rights within the hierarchy
Slide 46
Customization
Python Scripting for Extension and Automation
Integration of DataFinder with environment
User, infrastructure, software, …
Extension of DataFinder by Python scripts
Actions for resources (i.e., files, directories)
User interface extensions
Typical automations and customizations
Data migration and data import
Start of external application (with downloaded data files)
Extraction of meta data from result files
Automation of recurring tasks (“workflows”)
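One of the automations listed above, extracting meta data from result files, might look roughly like the following. The "key = value" file format and the sample content are purely hypothetical; real solver output would need its own parsing rules.

```python
import re

# Hypothetical "key = value" result-file format, used only for illustration.
SAMPLE_RESULT = """\
# solver summary
iterations = 1200
residual = 3.5e-07
converged = yes
"""

LINE_PATTERN = re.compile(r"^\s*(\w+)\s*=\s*(\S+)\s*$")


def extract_meta_data(text):
    """Collect key/value pairs from lines of the form 'key = value'."""
    meta_data = {}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if not match:
            continue  # skip comments and free text
        key, raw = match.groups()
        # Prefer int, then float, and fall back to the raw string.
        try:
            value = int(raw)
        except ValueError:
            try:
                value = float(raw)
            except ValueError:
                value = raw
        meta_data[key] = value
    return meta_data


meta_data = extract_meta_data(SAMPLE_RESULT)
print(meta_data)
```

The resulting dictionary could then be attached to the uploaded file as attributes, so that results become searchable without opening them.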
Slide 47
DataFinder Scripting
Downloading File and Starting Application
# Download the selected file and try to execute it.
import os
from tempfile import mktemp
from win32api import ShellExecute
from datafinder.application import ExternalFacade
from guitools.easygui import msgbox

# Get instance of ExternalFacade to access DataFinder API
facade = ExternalFacade.getInstance()

# Get currently selected resource in DataFinder Server-View
resource = facade.getSelectedResource()
if resource is not None:
    # Use the resource name as suffix of the temporary file name
    tmpFile = mktemp(resource.name)
    facade.downloadFile(resource, tmpFile)
    # Open the downloaded file with its associated Windows application
    if os.path.exists(tmpFile):
        ShellExecute(0, None, tmpFile, "", "", 1)
else:
    msgbox("No file selected to execute.")
Slide 48
Examples…
Slide 49
Example 1: Turbine Simulation
Slide 50
Example 1: Fluid Dynamics Simulation
Turbine Simulation
Design of new turbine engines
High-resolution simulation of flow
Computational Fluid Dynamics (CFD)
Use of high-performance computing resources (Cluster / Grid)
Huge amounts of data (>100 GByte)
DataFinder used for
Management of results
Automation of simulation runs
Starting pre-/post processing
Used for CFD-code TRACE (DLR)
See http://www.aero-grid.de
Slide 51
Turbine Simulation
Data Model
Simulation steps (example):
1. splitCGNS: Preparing data for TRACE
2. TRACE (CFD solver): Main computation
3. fillCGNS: Conflating results
4. Post Processing: Data reduction and visualization
Automation with customized DataFinder
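The four simulation steps above form a simple linear pipeline, and automating them amounts to running each step on the output of the previous one. The sketch below shows only that control flow. The step functions are placeholders invented for the example; they do not invoke splitCGNS, TRACE or fillCGNS, whose command lines are not given in the slides.

```python
def run_pipeline(steps, data):
    """Run the steps in order, feeding each result into the next step."""
    log = []
    for name, step in steps:
        data = step(data)
        log.append(name)
    return data, log


# Placeholder callables standing in for the real tools; each one only records
# what the corresponding step would have produced.
steps = [
    ("splitCGNS", lambda d: d + ["split"]),
    ("TRACE", lambda d: d + ["solved"]),
    ("fillCGNS", lambda d: d + ["merged"]),
    ("post-processing", lambda d: d + ["plotted"]),
]

result, log = run_pipeline(steps, [])
print(log)
```

In a real extension script each placeholder would be replaced by a call that starts the external program and waits for it to finish.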
Slide 52
Turbine Simulation: Graphical User Interface
Slide 53
Turbine Simulation: Customized GUI Extensions
1. Create new simulation
2. Start a simulation
3. Query status
4. Cancel simulation
5. Project overview
Slide 54
Turbine Simulation
Starting External Applications
1. CGNS Infos / ADFview / CGNS Plot
2. TRACE GUI
3. Gnuplot
Slide 55
Example 2: Automobile Supplier
Slide 56
Example 2: Automobile Supplier
DataFinder for Simulation and Data Management
Tasks
• Automation and management of customers’ simulations
• Mapping of specific work sequence
• High flexibility regarding customers’ requirements
Slide 57
Automobile Supplier
Data Model
Slide 58
Automobile Supplier
Configuration of Customer Parameters
Slide 59
Automobile Supplier
Management of Simulations
Status overview
Create, change, and delete data sets
Manage versions of data files
Parameter overview
Slide 60
Automobile Supplier
Upload, Download, and Versioning of Files
Upload/download of results
Versioning of results
Scripts store results in DataFinder data structures
Slide 61
Example 3: Air Traffic Management
Slide 62
Example 3: Air Traffic Monitoring
Database for Air Traffic Monitoring
Air traffic monitoring is important for research
Predictions of air traffic
New traffic management approaches
Usage of DataFinder
Database for traffic data and reports
Project oriented view
Slide 63
Database for Air Traffic Monitoring
Data Model and Data Migration
Slide 64
Database for Air Traffic Monitoring
Data Import Wizard
Import of all data sources (PDF/Word/text files, Excel, Access, …)
Classification into multiple categories
Prevention of duplicated data and enforcement of consistent naming
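Duplicate prevention and consistent naming during an import can be illustrated with a content hash and a simple naming rule. The following is a sketch under assumptions: the naming rule (lower case, underscores) and the use of SHA-256 are choices made for this example, and the slides do not say how the import wizard actually works.

```python
import hashlib
import re


def normalized_name(name):
    """Lower-case a file name and replace runs of other characters with '_'."""
    stem, dot, extension = name.rpartition(".")
    if not dot:
        stem, extension = name, ""
    stem = re.sub(r"[^a-z0-9]+", "_", stem.lower()).strip("_")
    return stem + ("." + extension.lower() if extension else "")


def import_files(files):
    """Return {normalized name: content}, skipping byte-identical duplicates."""
    seen_digests = set()
    imported = {}
    for name, content in files:
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen_digests:
            continue  # the same content was already imported
        seen_digests.add(digest)
        imported[normalized_name(name)] = content
    return imported


# Hypothetical input: the second file has the same content as the first.
files = [
    ("Traffic Report 2007.PDF", b"report"),
    ("copy of report.pdf", b"report"),
    ("Flight-Data.xls", b"data"),
]
imported = import_files(files)
print(sorted(imported))
```

Hashing the content rather than comparing names catches copies that were renamed before the import.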
Slide 65
Database for Air Traffic Monitoring
Search Results
Slide 66
Current Work and Future Plans
Current work
Migration to Qt 4
Improved usage (e.g., search dialogs)
Integration with Shibboleth
Future
Web interfaces
Jython
Embedding in Java/Eclipse applications
Reuse of custom GUI dialogs
Migration to Py3k
Slide 68
Availability
DataFinder core available as Open Source
BSD License
http://sourceforge.net/projects/datafinder
Extended versions / extensions are proprietary
Slide 69
Links
DataFinder Web site: http://www.dlr.de/datafinder
DataFinder Open Source: http://sourceforge.net/projects/datafinder
Python WebDAV library: http://sourceforge.net/projects/pythonwebdavlib
Catacomb: http://catacomb.tigris.org
AeroGrid Project: http://www.aero-grid.de
Slide 70
Questions?