Open Source Spatial ETL - talendforge.org
![Page 1: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/1.jpg)
camptocamp SA / GeoNetwork Workshop, 6 November 2007 / www.camptocamp.com / [email protected]
Spatial Data Integrator (SDI) powered by Open Source Spatial ETL
![Page 2: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/2.jpg)
Agenda
- Camptocamp and Talend presentation
- Why data integration in the geospatial domain?
- Talend overview
- Spatial Data Integrator (SDI) powered by Talend
- Tutorial & sample jobs
- Conclusion & questions
![Page 3: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/3.jpg)
Camptocamp, an Open Source Base Camp!
- 35 employees
- Switzerland & France
- About 50 to 70% growth per year since 2002
- 3 activity domains: Spatial solutions, Business solutions, Infrastructure solutions
- 4 service poles: Consulting, Engineering, Supporting, Training
[Diagram labels]
- Geo-spatial Solutions: Webmapping, GIS / Metadata, Spatial Data Infrastructures, Web Services
- Business Solutions: ERP, Business Intelligence, ETL
- Infrastructure Solutions: Security, Linux Server, VoIP
- Services: CONSULTING, ENGINEERING, SUPPORT, TRAINING
![Page 4: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/4.jpg)
Talend overview
- Talend is the first provider of open source data integration software
- Located in France, USA, Germany, China
- VC-funded
- 50 employees
- First product release: 2006
- Leader in open source data integration
- Rivals large established proprietary players
![Page 5: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/5.jpg)
What is ETL? Extract / Transform / Load
ETL is a process used in data warehousing. It answers the question « How do we get data in? ».
- Extract: pull data out of the source system where it originates.
- Transform: apply a series of rules or functions to the extracted data (selecting, translating, encoding, deriving, joining, summarizing, splitting, ...; more at http://en.wikipedia.org/wiki/Extract,_transform,_load).
- Load: once the data is transformed and cleaned, load it into a data warehouse.
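As a minimal sketch of the three steps (not how Talend implements them), the Python example below extracts rows from an in-memory delimited text, derives a new value, and loads the result into a dictionary standing in for a data warehouse. The parcel data, column names and function names are invented for illustration.

```python
import csv
import io

# Made-up source data standing in for an export from a source system.
SOURCE = "parcel_id,owner,area_m2\nP1, alice ,12000\nP2,bob,8500\n"

def extract(text):
    """Extract: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: clean text fields and derive the area in hectares."""
    return [
        {
            "parcel_id": row["parcel_id"].strip(),
            "owner": row["owner"].strip().title(),
            "area_ha": float(row["area_m2"]) / 10000.0,
        }
        for row in rows
    ]

def load(rows, warehouse):
    """Load: write the cleaned records into the target store, keyed by id."""
    for row in rows:
        warehouse[row["parcel_id"]] = row
    return warehouse

# A plain dict plays the role of the data warehouse in this sketch.
warehouse = load(transform(extract(SOURCE)), {})
```

Keeping the three steps as separate functions mirrors the way an ETL job chains independent components.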
![Page 6: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/6.jpg)
Why Spatial Data Integration
Data integration is a key process:
- Data volumes are growing exponentially
- Data sources are diverse and heterogeneous
- Data processing plays a major role in implementing GIS projects
- Consolidating and aggregating spatial data with data from other sources is often required
Current situation of GIS data integration:
- Commands or hand-made scripts built on various tools and libraries (gdal/ogr commands, fwtools, postgis commands, ...)
- Proprietary Spatial ETL such as FME
- Lack of an Open Source global geo-spatial data integrator
Spatial Data Integrator, powered by Talend, is now available!
- Prototyped in summer 2007, presented at FOSS4G 2007
![Page 7: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/7.jpg)
Democratize Data Integration
Talend makes data integration solutions available to organizations of all sizes, and for all data integration needs.
[Positioning diagram labels: SMBs; Large Organizations; Operational; Analytics; On-demand internal developments; IBM/Ascential; Informatica; Oracle; Sunopsis; Data Mirror; Pervasive; Business Objects, Cognos; Ab Initio]
![Page 8: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/8.jpg)
Talend Data integration
- Synchronize and check the integrity of your applications' data
- Migrate legacy applications
- Extract, Transform and Load data
- Replicate a subset of data into subject-matter datamarts (DM)
- Exchange / share data with customers or suppliers
[Diagram labels: External Data Files; Sales; Accounting; Finance; Production; Budgeting; ERP/CRM; EDWH; Datamart (x2); eCommerce; eExchange]
![Page 9: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/9.jpg)
Spatial Data integration
- Synchronize and check the integrity of your applications' data
- Migrate legacy applications
- Extract, Transform and Load data
- Replicate a subset of data into subject-matter datamarts (DM)
- Exchange / share data with customers or suppliers
[Diagram labels: External Data Files; Parcel; Roads Network; Production; SoE; Geospatial Database; Central Geodata warehouse; Datamart (x2); eCommerce; Govt agency]
![Page 10: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/10.jpg)
Spatial Data Integrator
Spatial Data Integrator is one component of a spatial data infrastructure, useful for:
- Data manipulation (extraction, quality checking, conversion, projection)
- Data & metadata production (vector and raster analysis)
- Data & metadata management (network files and database manipulation, archiving)
- Data dissemination (WWW publication, deploying jobs as web services)
- Data reporting (indicators, analysis, ...)
- ...
It is an end-user tool to define common tasks (i.e. jobs, processes, scripts) that are usually done by hand or by scripting in a desktop GIS.
![Page 11: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/11.jpg)
Spatial Data Integrator in SDI
![Page 12: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/12.jpg)
The Talend offering
- Talend Integration Suite - Enterprise Edition: Grid Conductor, CPU Balancer
- Talend Integration Suite - Professional Edition: Distant Run, Job Conductor Advanced, Activity Monitoring Dashboard
- Talend Integration Suite - Team Edition: Shared Repository, Job Conductor, Activity Monitoring Console
- Talend On Demand: Hosted Repository
- Talend Open Studio: Business Modeler, Job Designer, Metadata Manager
- Spatial Data Integrator powered by Talend: input/output spatial data, complex and simple spatial components
[Other diagram labels: Subscription; GPL; SDI Advanced Suite]
![Page 13: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/13.jpg)
Talend Open Studio
Key features:
- Business-oriented process modeling
- Graphical development
- Robust and scalable execution
- Broadest connectivity to support all systems
- Project repository for design and execution
- Real-time debugging
A high adoption rate:
- 100,000 product downloads
- 20% register as users
Active community:
- 1,000 beta testers
- 500 forum contributors
![Page 14: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/14.jpg)
Productivity & Ease of Use
Graphical development:
- Dramatically increased productivity & ramp up
- Combined graphical & technical views
- Drag-and-drop mapping interface
- Large library of components & connectors
Leverage industry-standard languages:
- Java, Perl, SQL
![Page 15: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/15.jpg)
Performance and robustness
Highest performance, robust and scalable execution:
- Grid-distributed processing
- Industry-standard code generated (Java or Perl)
- Leverage both ETL and ELT architectures
- Process data closest to the source
[Embedded slide "Job Designer: best practices". A job: components connected together. Labels: Job; Sub-Job]
![Page 16: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/16.jpg)
Versatility through Connectivity
Broadest connectivity to support all systems: 100+ connectors available out of the box
- RDBMS: Oracle, PostgreSQL, MySQL, DB2, SQL Server, Sybase, Ingres, …
- Web: Web Services, FTP, HTTP, POP, SMTP…
- Files: Delimited, positional, XML, Excel…
- Business Applications: SugarCRM, SalesForce.com, LDAP…
![Page 17: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/17.jpg)
[Screenshot labels: Job panel; Components palette; Job start/stop; Component properties tab]
![Page 18: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/18.jpg)
Talend Project repository
[Embedded slide "The views: the Repository (continued)"]
The Repository contains:
- non-technical graphical representation of a business requirement
- graphical representation of the technical process
- shared code
- Metadata (stream definitions)
- Context variables
- documentation
- recycle bin
![Page 19: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/19.jpg)
Job designer
A job: components connected together.
[Embedded slide "Job Designer: best practices". Labels: Job; Sub-Job]
![Page 20: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/20.jpg)
Spatial Data Integrator - SDI
- Talend Open Studio with geo-spatial extensions
- SDI integrates a new family of vector and raster geo components
- Based on reliable open source tools: Java Topology Suite (JTS), GeoTools, GRASS
![Page 21: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/21.jpg)
Spatial Data Integrator Flow Architecture
Uses the GeoTools / Java Topology Suite (JTS) libraries.
[Diagram: INPUT (1..n), PROCESSING, OUTPUT (1..n); SDI Input Component, SDI Transform Component, SDI Output Component]
Files:
- Text files
- GIS (ESRI, MapInfo)
- Raster (GDAL)
Databases:
- All (JDBC) databases
- GIS: PostGIS
Talend RowGenerator (builds input using user criteria)
![Page 22: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/22.jpg)
Spatial Data Integrator Architecture
SDI components architecture
[Diagram labels: Vector datasets (files, databases); SDI Input/Output Component (GeoTools lib); SDI Transformation Component (GeoTools lib); Talend flow; sdi.Geometry; jts.Geometry; ref (Object)]
![Page 23: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/23.jpg)
Geospatial components
[Component families shown: Feature manipulation; Vector format; Raster processing; Metadata management (experimental)]
![Page 24: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/24.jpg)
Getting started with Spatial Data Integrator
Open Source Spatial ETL
![Page 25: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/25.jpg)
Data used in the tutorial
Monitoring stations and rivers in the French part of the Alps (mainly the Rhône river basin)
![Page 26: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/26.jpg)
Convert Textfile to common GIS format
- Start Talend
- Create a workspace named SDI
- Copy the tutorial datasets to TALEND_HOME/workspace/sdi/data
![Page 27: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/27.jpg)
Getting started with Spatial Data Integrator
Tutorial n°1: Convert Textfile to common GIS format
![Page 28: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/28.jpg)
Convert Textfile to common GIS format
The objective of this job is to produce an ESRI Shapefile and a MapInfo file from a text file describing monitoring stations and their geographic location.
Input:
- CSV file with x, y and attribute columns for the id and name of monitoring stations
Output:
- Shapefile and MapInfo file
- (optional) PostGIS table
Process:
- Create a point geometry using the x and y columns of the text file
![Page 29: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/29.jpg)
Convert Textfile to common GIS format
The first step is to create a new job.
1. Start Talend SDI
2. On the Repository tab, in Job design, click on create a new Job
This will create a new pane where the job will be drawn.
![Page 30: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/30.jpg)
Convert Textfile to common GIS format
Create metadata about the current job. Talend can record metadata and versioning information about jobs.
The Name is mandatory.
![Page 31: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/31.jpg)
Convert Textfile to common GIS format
- Open the « Palette » tab
- Open the « File/Input » family section
- Add a tFileInputCSV component to the job
In a component name, the first letter « t » stands for Talend's initial components, « s » for spatial ones and « u » for user ones.
If a panel cannot be found in the Talend workspace (e.g. « Palette »), click on the menu « Window > Show view » and then search for « Palette ».
![Page 32: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/32.jpg)
Convert Textfile to common GIS format
Select the « Properties » tab, and select the file « TALEND_HOME/workspace/sdi/data/stations.txt ».
tFileInputCSV component properties:
- filename
- separator
- ...
- schema
![Page 33: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/33.jpg)
Convert Textfile to common GIS format
The stations text file is composed of 4 columns:
- Id: text
- Name: text
- X: double
- Y: double
... where coordinates are in WGS84,
... the field separator is ',' and the decimal separator is '.'.
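To make the schema concrete, here is a hedged Python sketch of reading such a file outside Talend. It assumes the file has no header row and that the columns appear in the order listed above; the sample record is invented and is not taken from the tutorial's stations.txt.

```python
import csv
import io

# Column names and types as described on the slide: Id, Name, X, Y.
SCHEMA = [("Id", str), ("Name", str), ("X", float), ("Y", float)]

def read_stations(stream, delimiter=","):
    """Yield one typed dict per record of a delimited stations file."""
    for record in csv.reader(stream, delimiter=delimiter):
        if not record:
            continue  # skip blank lines
        yield {
            name: cast(value.strip())
            for (name, cast), value in zip(SCHEMA, record)
        }

# Invented sample record; float() expects '.' as the decimal separator.
sample = io.StringIO("S0001,Example station,5.5,45.25\n")
stations = list(read_stations(sample))
```

Declaring the schema once, as a list of name and type pairs, plays the same role as the « edit schema » dialog of the input component.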
![Page 34: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/34.jpg)
Convert Textfile to common GIS format
On the properties tab of the tFileInputCSV component, click on « edit schema ».
Then add 4 fields, and change the name and type of each column.
Schemas can be imported and exported when they are used frequently.
![Page 35: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/35.jpg)
Convert Textfile to common GIS format
(Optional) Add a tLogRow component (in the log & error family).
Connect the tFileInputCSV to the tLogRow (right-click the component, select « row > main » and connect to the output component).
Run the job (F6) ... tLogRow is useful for debugging!
![Page 36: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/36.jpg)
Convert Textfile to common GIS format
Objective: create a point from the X and Y columns.
1. Add a s2DPointReplacer component to the job.
2. Connect the tFileInputCSV component to it.
3. Open the properties of s2DPointReplacer.
4. Select the X and Y columns to use to create the point geometry.
Try tLogRow.
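The idea behind this step can be sketched in a few lines of Python. This is only an illustration of replacing two coordinate columns with one point geometry, written here as WKT text; it is not the s2DPointReplacer implementation. The function name, the `the_geom` column name and the range check (X as longitude, Y as latitude in WGS84) are assumptions.

```python
def replace_xy_with_point(row, x_col="X", y_col="Y", geom_col="the_geom"):
    """Return a copy of row where the X and Y columns become one WKT point."""
    x = float(row[x_col])
    y = float(row[y_col])
    # Assumption: X is a longitude and Y a latitude in WGS84 degrees.
    if not (-180.0 <= x <= 180.0 and -90.0 <= y <= 90.0):
        raise ValueError("coordinates outside the WGS84 range: %r, %r" % (x, y))
    out = {key: value for key, value in row.items() if key not in (x_col, y_col)}
    out[geom_col] = "POINT (%s %s)" % (x, y)
    return out

# Invented sample record.
station = replace_xy_with_point({"Id": "S0001", "Name": "Example station", "X": 5.5, "Y": 45.25})
```

The attribute columns pass through unchanged, which is what lets a downstream output component write them next to the geometry.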
![Page 37: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/37.jpg)
Convert Textfile to common GIS format
Objective: add output components.
- Add a sShapefileOutput to the job.
- Connect the s2DPointReplacer component to it.
- Display the properties of sShapefileOutput.
- Define the file name.
- Run the job (F6).
- ... Add a MapInfo output.
![Page 38: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/38.jpg)
Convert Textfile to common GIS format
Run the job (F6).
Test some options:
- Turn statistics on
- Turn traces on
Try to open the layers produced in a GIS.
![Page 39: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/39.jpg)
Convert Textfile to common GIS format
In tutorial n°1, the user learned how to:
- Create a new job
- Add components to a job
- Link the main flow between components
- Run a job (using statistics, traces and tLogRow for debugging)
![Page 40: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/40.jpg)
Getting started with Spatial Data Integrator
Open Source Spatial ETL
Tutorial n°2: Publish GeoRSS feeds to the web
![Page 41: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/41.jpg)
Publish GeoRSS feeds to the web
The objective of this job is to define a mapping between elements coming from a GIS layer and a GeoRSS output to be published to the web.
Input:
- Use the previous job flow
Output:
- GeoRSS output
Process:
- Create a new attribute named link: http://hydro.eaufrance.fr/ + CODE
- Rename the attribute name to title
- (optional) Filter stations whose id starts with 06
![Page 42: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/42.jpg)
Publish GeoRSS feeds to the web
Output a GeoRSS feed. The geometry can be written as a simple GeoRSS point or as a GML point.
Attributes are output under the attribute's name (i.e. to set the "title" element of an item, name that attribute "title"). To do so, use a tMap component.
This GeoRSS output can be used in OpenLayers (for those of you who are going to attend the OpenLayers lab! ;)
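For readers unfamiliar with the format, the sketch below builds a small RSS 2.0 feed carrying GeoRSS Simple points with Python's standard library. It only illustrates what such an output can look like and is not the code behind sGeoRssOutput. The function name, channel values and station record are invented; the attribute names `title` and `link` follow the convention described above.

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS_NS)

def build_georss(title, link, description, items):
    """Return an RSS 2.0 document with one GeoRSS Simple point per item."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        # GeoRSS Simple writes a point as "latitude longitude".
        point = ET.SubElement(node, "{%s}point" % GEORSS_NS)
        point.text = "%s %s" % (item["lat"], item["lon"])
    return ET.tostring(rss, encoding="unicode")

# Invented channel and item values.
feed = build_georss(
    "Monitoring stations",
    "http://hydro.eaufrance.fr/",
    "Example channel description",
    [{"title": "Example station", "link": "http://hydro.eaufrance.fr/",
      "lat": 45.25, "lon": 5.5}],
)
```

Note the coordinate order: GeoRSS Simple puts latitude first, the reverse of the X, Y order used in the stations file.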
![Page 43: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/43.jpg)
Publish GeoRSS feeds to the web
In the previous job:
- Add a tMap and a sGeoRssOutput component
- Link the components
- Define the properties of the output GeoRSS file (file name, channel description)
- Then open the tMap interface ...
![Page 44: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/44.jpg)
Publish GeoRSS feeds to the web
tMap interface:
- Input (one or more)
- Output (one or more)
- Filter, e.g. id starting with « W »
- Expression, e.g. create an attribute named link composed of 2 strings: « http://hydro.eaufrance.fr/stations/ » concatenated with id
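The filter and the expression above amount to a few lines of logic. The Python sketch below is a stand-in for what a tMap-style mapping does, not Talend's generated code; the function name and the sample rows are invented. The prefix and base URL are parameters because the slides show two variants of the same mapping (« 06 » and « W » as prefixes).

```python
def map_stations(rows, id_prefix, base_url):
    """Filter rows by id prefix, rename name to title and build a link."""
    for row in rows:
        if not row["id"].startswith(id_prefix):
            continue  # filter: drop rows whose id does not match
        yield {
            "title": row["name"],           # rename: name becomes title
            "link": base_url + row["id"],   # expression: concatenate two strings
        }

# Invented sample rows.
rows = [
    {"id": "W100", "name": "Station A"},
    {"id": "X200", "name": "Station B"},
]
mapped = list(map_stations(rows, "W", "http://hydro.eaufrance.fr/stations/"))
```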
![Page 45: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/45.jpg)
Publish GeoRSS feeds to the web
![Page 46: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/46.jpg)
Publish GeoRSS feeds to the web
- Run the job
- Open the GeoRSS feed ... try this feed later with OpenLayers
- ... add a tFtpPut component to publish the file to a web server
![Page 47: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/47.jpg)
Publish GeoRSS feeds to the web
In tutorial n°2, the user learned how to:
- Define a mapping between input/output columns
- Filter data using a tMap component
- Create a new field using an expression
![Page 48: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/48.jpg)
Getting started with Spatial Data Integrator
Open Source Spatial ETL
Sample jobs:
- Nearest Neighbour
- Dissolve geometry
- Metadata: convert an existing XML file to/from ISO 19115, ISO 19139, ArcCatalogue
![Page 49: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/49.jpg)
Find nearest river for each station
The objective of this job is to find the nearest river for each monitoring station.
Input:
- Monitoring stations (Shapefile format)
- Rivers (Shapefile format)
Output:
- Updated monitoring stations
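One simple way to express this job's logic is a brute-force search over point-to-polyline distances, sketched below in Python. This is an assumption about the technique, not a description of the SDI component. Distances are planar, so the coordinates should be in a projected system for meaningful results. The river names and coordinates are invented.

```python
import math

def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Position of the projection of p on the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_line(p, line):
    """Planar distance from p to a polyline with at least two vertices."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))

def nearest_river(station_xy, rivers):
    """Return the name of the river closest to the station."""
    return min(rivers, key=lambda name: distance_to_line(station_xy, rivers[name]))

# Invented rivers, each given as a list of (x, y) vertices.
rivers = {"river A": [(0, 0), (4, 0)], "river B": [(0, 5), (4, 5)]}
closest = nearest_river((2, 1), rivers)
```

The returned name is the value that would be written back to the station record to produce the updated layer. For large layers a spatial index would replace the brute-force loop.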
![Page 50: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/50.jpg)
Dissolver
Dissolve geometries based on an attribute.
Input: Catchments
Output: Main catchments
![Page 51: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/51.jpg)
Dissolver
Dissolve geometries based on an attribute.
Input: Catchments
Output: Main catchments
![Page 52: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/52.jpg)
BoundingBox and ConvexHull aggregator
Compute the bounding box and the convex hull polygon for a layer.
Input: Monitoring stations
Output: BoundingBox, ConvexHull polygon
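Both aggregates can be computed from the station coordinates alone. The Python sketch below uses Andrew's monotone chain algorithm for the hull, which is one standard choice and not necessarily what the SDI component (or JTS) uses internally. The points are invented.

```python
def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for a non-empty list of points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b turns counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(sequence):
        chain = []
        for p in sequence:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()  # drop points that do not make a left turn
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    # The last point of each half is the first point of the other.
    return lower[:-1] + upper[:-1]

# Invented points: the corners of a square plus one interior point.
points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
bbox = bounding_box(points)
hull = convex_hull(points)
```

The interior point (1, 1) is dropped from the hull, while the bounding box depends only on the extreme coordinates.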
![Page 53: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/53.jpg)
Metadata batch conversion
Convert metadata files from one format to another (ISO19115, ISO19139, ArcCatalogue).
Input: One or more files (tFileList is used to iterate over files in a specific directory)
Output: XML files
Thanks to the GeoNetwork & GeoSource projects for the XSL stylesheets!
![Page 54: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/54.jpg)
... and more using Geospatial components!
[Component families shown: Feature transformation; Vector format; Raster processing; Metadata management (experimental)]
... and all the other components! In the community, user components are also available (e.g. geolocalize).
![Page 55: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/55.jpg)
What's up for the future?
Open Source Spatial ETL
![Page 56: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/56.jpg)
What's up for the future? Raster components
Raster components use GRASS tools.
GRASS components:
![Page 57: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/57.jpg)
What's up for the future? Metadata
Objective: establish a closer link between data & metadata during the production step.
- Quick metadata entry (title and abstract + automatic fields)
- Do not create metadata after the data has been created; it is better to start early and improve the metadata later
- New component to compute metadata during the job:
  - User editor: title / abstract / purpose / category
  - SDI generates: bbox / number of objects / geometry type
- Support for ISO & DCLITE4G standards
- Component to publish the metadata into an existing catalogue (only the GeoNetwork catalogue is supported)
Status: Beta version
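The split between user-entered and generated fields can be illustrated with a short Python sketch. It assumes GeoJSON-like geometry dictionaries and only handles points and lines; the function names, field names and sample values are hypothetical and do not describe the beta component.

```python
def iter_coords(geometry):
    """Yield the (x, y) vertices of a Point or LineString geometry."""
    kind = geometry["type"]
    if kind == "Point":
        yield tuple(geometry["coordinates"])
    elif kind == "LineString":
        for xy in geometry["coordinates"]:
            yield tuple(xy)
    else:
        raise ValueError("geometry type not handled in this sketch: %s" % kind)

def describe_layer(title, abstract, geometries):
    """Combine user-entered fields with fields computed from the data."""
    coords = [xy for geom in geometries for xy in iter_coords(geom)]
    bbox = None
    if coords:
        xs = [x for x, _ in coords]
        ys = [y for _, y in coords]
        bbox = (min(xs), min(ys), max(xs), max(ys))
    kinds = sorted({geom["type"] for geom in geometries})
    if not kinds:
        geometry_type = None
    elif len(kinds) == 1:
        geometry_type = kinds[0]
    else:
        geometry_type = "Mixed"
    return {
        "title": title,                       # entered by the user
        "abstract": abstract,                 # entered by the user
        "bbox": bbox,                         # computed
        "feature_count": len(geometries),     # computed
        "geometry_type": geometry_type,       # computed
    }

# Invented layer with two point features.
metadata = describe_layer(
    "Monitoring stations",
    "Example abstract",
    [{"type": "Point", "coordinates": (5.5, 45.25)},
     {"type": "Point", "coordinates": (4.0, 46.0)}],
)
```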
![Page 58: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/58.jpg)
What's up for the future? Metadata
Start working on metadata during the creation of a new dataset.
A GIS layer is described at least by a title, an abstract and a bounding box. This is, in most cases, enough to enable search in a catalogue and to know whether that layer matches users' needs.
- In all SDI output components, a form to create metadata is available
- Using metadata templates as in GeoNetwork
- Analysing the use of tags (like DATE) to keep metadata consistent
![Page 59: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/59.jpg)
What's up for the future? Metadata
Publish metadata produced in a job to a catalogue (e.g. GeoNetwork).
Metadata publication steps:
- Log in to the GeoNetwork node
- Select group and category
- Publish the metadata
![Page 60: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/60.jpg)
What's up for the future? Metadata
![Page 61: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/61.jpg)
Spatial Data Integrator strengths
- Fast and efficient
- User-friendly interface
- Easily customizable jobs (code generation)
- Benefits of « classical » ETL features
- Fully Open Source (GPL licence)
- Scalable
- High level of support by Camptocamp and Talend
![Page 62: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/62.jpg)
Perspectives
- Development of new components:
  - Simple and complex components
  - New input and output formats
  - Community contributions very welcome
- Spatial data viewer (uDig)
- Raster components optimization (JGrass)
- Metadata components
- Integration of high-end Talend features:
  - Load balancing, Job conductor, Grid conductor
- Integration in Enterprise Service Bus (ESB) systems (PEtALS)
![Page 63: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/63.jpg)
Spatial Data Integrator project
Community infrastructure is being set up (mailing list, forum, wiki, download area, tutorial, ...).
Register your interest to be informed: http://www.camptocamp.com/sdi
![Page 64: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/64.jpg)
Contacts
Camptocamp:
- François-Xavier Prunayre, [email protected]
- David Jonglez, [email protected]
- Claude Philipona, claude.philipona@camptocamp
http://www.camptocamp.com/sdi
![Page 65: Open Source Spatial ETL - talendforge.org](https://reader030.fdocuments.in/reader030/viewer/2022012710/61aa423c581fc012ae099c33/html5/thumbnails/65.jpg)